├── .gitignore ├── .vscode └── settings.json ├── LICENSE.md ├── README.md ├── detection_3d ├── __init__.py ├── create_dataset_lists.py ├── data_preprocessing │ └── pandaset_tools │ │ ├── helpers.py │ │ ├── preprocess_data.py │ │ ├── transform.py │ │ └── visualize_data.py ├── detection_dataset.py ├── losses.py ├── metrics.py ├── model.py ├── parameters.py ├── tools │ ├── augmentation_tools.py │ ├── detection_helpers.py │ ├── file_io.py │ ├── statics.py │ ├── summary_helpers.py │ ├── training_helpers.py │ └── visualization_tools.py ├── train.py └── validation_inferece.py ├── pictures ├── box_parametrization.png ├── result.png └── topview.png └── setup.py /.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | *~ 3 | *.txt 4 | dataset 5 | *-INFO 6 | log 7 | inference 8 | *.bin 9 | 10 | -------------------------------------------------------------------------------- /.vscode/settings.json: -------------------------------------------------------------------------------- 1 | { 2 | "python.formatting.provider": "black" 3 | } -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | Copyright 2020 Denis Tananaev 2 | 3 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: 4 | 5 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. 6 | 7 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 8 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Dynamic object detection in LiDAR 2 | 3 | [![MIT License](https://img.shields.io/badge/License-MIT-green.svg)](https://github.com/Dtananaev/lidar_dynamic_objects_detection/blob/master/LICENSE.md) 4 | 5 | ## The result of the network (click on the image below) 6 | 7 | [![result](https://github.com/Dtananaev/lidar_dynamic_objects_detection/blob/master/pictures/result.png)](https://youtu.be/f_HZg9Cq-h4) 8 | The trained network weights can be downloaded here: [weights](https://drive.google.com/file/d/1m8N5m2WXATgFNw88BRqEbUieiyV7p3S0/view?usp=sharing). 
9 | ## Installation 10 | For Ubuntu 18.04 install the necessary dependencies: 11 | ``` 12 | sudo apt update 13 | sudo apt install python3-dev python3-pip python3-venv 14 | ``` 15 | Create a virtual environment and activate it: 16 | ``` 17 | python3 -m venv --system-site-packages ./venv 18 | source ./venv/bin/activate 19 | ``` 20 | Upgrade pip tools: 21 | ``` 22 | pip install --upgrade pip 23 | ``` 24 | Install TensorFlow 2.0 (for more details check the TensorFlow install tutorial: [tensorflow](https://www.tensorflow.org/install/pip)) 25 | ``` 26 | pip install --upgrade tensorflow-gpu 27 | ``` 28 | Clone this repository and then install it: 29 | ``` 30 | cd lidar_dynamic_objects_detection 31 | pip install -r requirements.txt 32 | pip install -e . 33 | ``` 34 | This should install all the necessary packages into your environment. 35 | 36 | ## The method 37 | 38 | The LiDAR point cloud is represented as a top view image where each pixel of the image corresponds to a 12.5x12.5 cm cell. For each grid cell 39 | we project a random point and take its height and intensity (a minimal sketch of this projection follows the image below). 40 | 

41 | ![Top view projection](pictures/topview.png) 42 | 
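A minimal numpy sketch of this projection, for illustration only: it is not the repository's `make_top_view_image` (whose exact channel layout may differ), and the function name `top_view_sketch` is hypothetical. Grid and voxel sizes are taken from `parameters.py`.
```python
import numpy as np

def top_view_sketch(lidar, grid_meters=(52.0, 104.0), voxel_size=(0.125, 0.125)):
    """lidar: (N, 4) array of [x, y, z, intensity], already shifted into the
    positive quadrant with the lidar_offset from parameters.py."""
    rows = int(grid_meters[0] / voxel_size[0])  # 416 cells of 12.5 cm
    cols = int(grid_meters[1] / voxel_size[1])  # 832 cells of 12.5 cm
    image = np.zeros((rows, cols, 2), np.float32)  # channels: height, intensity
    cells = np.floor(lidar[:, :2] / np.asarray(voxel_size)).astype(np.int32)
    # keep only the points that fall inside the grid
    inside = (cells[:, 0] >= 0) & (cells[:, 0] < rows) & (cells[:, 1] >= 0) & (cells[:, 1] < cols)
    lidar, cells = lidar[inside], cells[inside]
    # with shuffled points, the last write per cell wins, i.e. a random point per cell
    image[cells[:, 0], cells[:, 1], 0] = lidar[:, 2]  # height
    image[cells[:, 0], cells[:, 1], 1] = lidar[:, 3]  # intensity
    return image
```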

43 | We do direct regression of the 3D boxes: for each pixel of the image we regress a confidence between 0 and 1, 7 box parameters (dx_centroid, dy_centroid, z_centroid, width, height, dx_front, dy_front), and the class scores. 44 | 

45 | ![Box parametrization](pictures/box_parametrization.png) 46 | 
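To make the parametrization concrete, here is a small numpy sketch of how one cell's regressed values can be decoded back into a yawed 3D box. It mirrors the logic of `get_boxes_from_box_grid` in `detection_3d/tools/detection_helpers.py` (the cell size is `bbox_voxel_size` from `parameters.py`); the helper name `decode_cell` is hypothetical.
```python
import numpy as np

def decode_cell(cell_xy, regressed, bbox_voxel_size=(0.25, 0.25)):
    """Recover [x, y, z, length, width, height, yaw] from one grid cell.
    cell_xy: integer (row, col) of the cell; regressed: the 7 regressed values."""
    dx_centroid, dy_centroid, z_centroid, width, height, dx_front, dy_front = regressed
    corner = np.asarray(cell_xy, np.float32) * np.asarray(bbox_voxel_size, np.float32)
    center_xy = corner + (dx_centroid, dy_centroid)  # box centroid in meters
    front_xy = corner + (dx_front, dy_front)         # midpoint of the front face
    delta = front_xy - center_xy
    length = 2.0 * np.linalg.norm(delta)  # the front point sits half a length from the centroid
    yaw = np.arctan2(delta[1], delta[0])  # heading from the centroid->front vector
    return np.array([center_xy[0], center_xy[1], z_centroid, length, width, height, yaw])
```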

47 | We apply a binary cross-entropy loss for the confidence, an L1 loss for all box parameter regression, and a softmax loss for the class prediction. 48 | The confidence map is computed from the ground truth boxes. We assign confidence 1.0 to the cell closest to the box centroid (green on the image above) 49 | and 0 otherwise. The confidence loss is applied to all pixels; the other losses are applied only to those pixels whose ground truth confidence is 1.0 (a toy sketch of this masking is given at the end of this README). 50 | 51 | 52 | ## The dataset preparation 53 | We work with the Pandaset dataset, which can be downloaded from here: [Pandaset](https://pandaset.org/) 54 | Download and unpack all the data into a dataset folder (e.g. ~/dataset). 55 | The dataset should have the following folder structure: 56 | ``` bash 57 | dataset 58 | ├── 001 # The sequence number 59 | │ ├── annotations # Bounding boxes and semseg annotations 60 | | | ├──cuboids 61 | | | | ├──00.pkl.gz 62 | | | | └── ... 63 | | | ├──semseg 64 | | | ├──00.pkl.gz 65 | | | └── ... 66 | │ ├── camera # camera images 67 | | | ├──back_camera 68 | | | | ├──00.jpg 69 | | | | └── .. 70 | | | ├──front_camera 71 | | | └── ... 72 | │ ├── lidar # lidar data 73 | │ | ├── 00.pkl.gz 74 | │ | └── ... 75 | | ├── meta 76 | | | ├── gps.json 77 | | | ├── timestamps.json 78 | ├── 002 79 | └── ... 80 | ``` 81 | Preprocess the dataset with the following command: 82 | ``` 83 | cd lidar_dynamic_objects_detection/detection_3d/data_preprocessing/pandaset_tools 84 | python preprocess_data.py --dataset_dir <path_to_dataset> 85 | ``` 86 | Create the dataset lists: 87 | ``` 88 | cd lidar_dynamic_objects_detection/detection_3d/ 89 | python create_dataset_lists.py --dataset_dir <path_to_dataset> 90 | ``` 91 | This should create ```train.datatxt``` and ```val.datatxt``` in your dataset folder. 92 | Finally, set the dataset directory in ```parameters.py```. 93 | ## Train 94 | To train the network: 95 | ``` 96 | python train.py 97 | ``` 98 | To resume training: 99 | ``` 100 | python train.py --resume 101 | ``` 102 | The training can be monitored in TensorBoard: 103 | ``` 104 | tensorboard --logdir=log 105 | ``` 106 | ## Inference on the validation dataset 107 | To run inference on the validation dataset: 108 | ``` 109 | python validation_inference.py --dataset_file <path_to_dataset>/val.datatxt --output_dir <output_dir> --model_dir <model_dir> 110 | ``` 111 | The inference outputs 3D boxes and also visualizes them on the top view image. The visualized top view image (top) is concatenated with the ground truth top view image (bottom). 
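The toy sketch promised in the method section: a minimal numpy version of the confidence masking used in `detection_3d/losses.py`. The real code is TensorFlow and splits nine channels; only the objectness/L1 part is illustrated here, and `masked_loss_sketch` is a hypothetical name.
```python
import numpy as np

def masked_loss_sketch(gt_grid, pred_grid):
    """gt_grid, pred_grid: (H, W, 1 + K) arrays; channel 0 is the confidence
    (predictions assumed already in (0, 1)), the remaining K channels are box
    parameters."""
    gt_conf, gt_params = gt_grid[..., 0], gt_grid[..., 1:]
    p_conf, p_params = np.clip(pred_grid[..., 0], 1e-7, 1 - 1e-7), pred_grid[..., 1:]
    # binary cross-entropy on every pixel
    conf_loss = -(gt_conf * np.log(p_conf) + (1.0 - gt_conf) * np.log(1.0 - p_conf))
    # L1 regression loss only where the ground truth confidence is 1
    reg_loss = gt_conf * np.abs(gt_params - p_params).sum(axis=-1)  # gt_conf is the 0/1 mask
    return conf_loss.sum(), reg_loss.sum()
```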
112 | -------------------------------------------------------------------------------- /detection_3d/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Dtananaev/lidar_dynamic_objects_detection/3b8f3d5fcce0fb914bb83e5d43a3ca652739139e/detection_3d/__init__.py -------------------------------------------------------------------------------- /detection_3d/create_dataset_lists.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import os 20 | import glob 21 | import numpy as np 22 | import argparse 23 | from detection_3d.tools.file_io import save_dataset_list 24 | 25 | 26 | class PandaDetectionDataset: 27 | def __init__(self, dataset_dir): 28 | self.dataset_dir = dataset_dir 29 | 30 | def get_data(self): 31 | search_string = os.path.join(self.dataset_dir, "*", "lidar_processed", "*.bin") 32 | lidar_list = np.asarray(sorted(glob.glob(search_string))) 33 | search_string = os.path.join(self.dataset_dir, "*", "bbox_processed", "*.txt") 34 | box_list = np.asarray(sorted(glob.glob(search_string))) 35 | data = np.concatenate((lidar_list[:, None], box_list[:, None],), axis=1,) 36 | data = [";".join(x) for x in data] 37 | return data 38 | 39 | def create_datasets_file(self): 40 | """ 41 | Creates the train.datatxt and val.datatxt files 42 | """ 43 | data_list = self.get_data() 44 | 45 | split_num = 80 * int(103 * 0.75)  # 75% of the 103 sequences for train, 80 frames per sequence 46 | print(f"split_num {split_num}") 47 | # Save train and validation dataset 48 | filename = os.path.join(self.dataset_dir, "train.datatxt") 49 | save_dataset_list(filename, data_list[:split_num]) 50 | print( 51 | f"The dataset of size {len(data_list[:split_num])} saved in {filename}." 52 | ) 53 | filename = os.path.join(self.dataset_dir, "val.datatxt") 54 | save_dataset_list(filename, data_list[split_num:]) 55 | print( 56 | f"The dataset of size {len(data_list[split_num:])} saved in {filename}." 
57 | ) 58 | 59 | 60 | if __name__ == "__main__": 61 | parser = argparse.ArgumentParser(description="Create Pandaset dataset list files.") 62 | parser.add_argument("--dataset_dir", default="dataset") 63 | args = parser.parse_args() 64 | dataset_creator = PandaDetectionDataset(args.dataset_dir) 65 | dataset_creator.create_datasets_file() 66 | -------------------------------------------------------------------------------- /detection_3d/data_preprocessing/pandaset_tools/helpers.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import numpy as np 20 | 21 | labels = { 22 | "Cones": 0, 23 | "Towed Object": 1, 24 | "Semi-truck": 2, 25 | "Train": 3, 26 | "Temporary Construction Barriers": 4, 27 | "Rolling Containers": 5, 28 | "Animals - Other": 6, 29 | "Pylons": 7, 30 | "Emergency Vehicle": 8, 31 | "Motorcycle": 9, 32 | "Construction Signs": 10, 33 | "Medium-sized Truck": 11, 34 | "Other Vehicle - Uncommon": 12, 35 | "Tram / Subway": 13, 36 | "Road Barriers": 14, 37 | "Bus": 15, 38 | "Pedestrian with Object": 16, 39 | "Personal Mobility Device": 17, 40 | "Signs": 18, 41 | "Other Vehicle - Pedicab": 19, 42 | "Pedestrian": 20, 43 | "Car": 21, 44 | "Other Vehicle - Construction Vehicle": 22, 45 | "Bicycle": 23, 46 | "Motorized Scooter": 24, 47 | "Pickup Truck": 25, 48 | } 49 | 50 | 51 | def get_color(label): 52 | # Returns the RGB color for a Pandaset label id (see the labels dict above) 53 | color = np.asarray( 54 | [ 55 | [255, 229, 204], # "Cones": 0, 56 | [255, 255, 204], # "Towed Object": 1, 57 | [204, 204, 255], # "Semi-truck": 2, 58 | [255, 204, 204], # "Train": 3, 59 | [255, 204, 153], # "Temporary Construction Barriers": 4, 60 | [204, 255, 204], # "Rolling Containers": 5, 61 | [255, 204, 229], # "Animals - Other": 6, 62 | [153, 255, 153], # "Pylons": 7, 63 | [128, 128, 128], # "Emergency Vehicle": 8, 64 | [255, 255, 102], # "Motorcycle": 9, 65 | [255, 153, 51], # "Construction Signs": 10, 66 | [153, 153, 255], # "Medium-sized Truck": 11, 67 | [255, 255, 255], # "Other Vehicle - Uncommon": 12, 68 | [255, 102, 102], # "Tram / Subway": 13, 69 | [204, 102, 0], # "Road Barriers": 14, 70 | [0, 0, 255], # "Bus": 15, 71 | [255, 51, 153], # "Pedestrian with Object": 16, 72 | [153, 153, 0], # "Personal Mobility Device": 17, 73 | [255, 153, 51], # "Signs": 18, 74 | [128, 128, 128], # "Other Vehicle - Pedicab": 19, 75 | [204, 0, 102], # "Pedestrian": 20, 76 | [0, 255, 0], # "Car": 21, 77 | [0, 0, 102], # "Other Vehicle - Construction Vehicle": 22, 78 | [255, 255, 0], # "Bicycle": 23, 79 | [255, 255, 153], # "Motorized Scooter": 24, 80 | [51, 255, 255], # "Pickup Truck": 25, 81 | ] 82 | ) 83 | return color[int(label)] 84 | 85 | 86 | def make_xzyhwly(bboxes): 87 | """ 88 | Extracts labels and [c_x, c_y, c_z, length, width, height, yaw] rows from the raw Pandaset cuboid records 89 | """ 90 | label = bboxes[:, 1] 91 | yaw = bboxes[:, 2] 92 | c_x = bboxes[:, 5] 93 | c_y = bboxes[:, 6] 94 | c_z = bboxes[:, 7] 95 | length = bboxes[:, 8] 96 | width = bboxes[:, 9] 97 | height = bboxes[:, 10] 98 | new_boxes = np.asarray([c_x, c_y, c_z, length, width, height, yaw], dtype=np.float) 99 | return label, np.transpose(new_boxes) 100 | 101 | 102 | def filter_boxes(labels, bboxes_3d, orient_3d, lidar, threshold=20): 103 | labels_res = [] 104 | box_res = [] 105 | orient_res = [] 106 | for idx, box in enumerate(bboxes_3d): 107 | min_x = np.min(box[:, 0]) 108 | max_x = np.max(box[:, 0]) 109 | min_y = np.min(box[:, 1]) 110 | max_y = np.max(box[:, 1]) 111 | min_z = np.min(box[:, 2]) 112 | max_z = np.max(box[:, 2]) 113 | mask_x = (lidar[:, 0] >= min_x) & (lidar[:, 0] <= max_x) 114 | mask_y = (lidar[:, 1] >= min_y) & (lidar[:, 1] <= max_y) 115 | mask_z = (lidar[:, 2] >= min_z) & (lidar[:, 2] <= max_z) 116 | mask = mask_x & mask_y & mask_z 117 | result = np.sum(mask.astype(float)) 118 | if result > threshold: 119 | box_res.append(box) 120 | orient_res.append(orient_3d[idx]) 121 | labels_res.append(labels[idx]) 122 | return np.asarray(labels_res), np.asarray(box_res), np.asarray(orient_res) 123 | -------------------------------------------------------------------------------- /detection_3d/data_preprocessing/pandaset_tools/preprocess_data.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 
18 | """ 19 | import argparse 20 | import numpy as np 21 | import os 22 | import glob 23 | import pandas as pd 24 | from detection_3d.data_preprocessing.pandaset_tools.helpers import ( 25 | make_xzyhwly, 26 | filter_boxes, 27 | ) 28 | from detection_3d.tools.detection_helpers import ( 29 | make_eight_points_boxes, 30 | get_bboxes_parameters_from_points, 31 | ) 32 | import mayavi.mlab as mlab 33 | from tqdm import tqdm 34 | from detection_3d.tools.file_io import read_json, save_bboxes_to_file, save_lidar 35 | from detection_3d.data_preprocessing.pandaset_tools.transform import ( 36 | quaternion_to_euler, 37 | to_transform_matrix, 38 | transform_lidar_box_3d, 39 | ) 40 | from detection_3d.tools.visualization_tools import visualize_lidar, visualize_bboxes_3d 41 | 42 | 43 | def preprocess_data(dataset_dir): 44 | """ 45 | The function prepares training data from Pandaset. 46 | Arguments: 47 | dataset_dir: directory with Pandaset data 48 | """ 49 | 50 | # Get list of data samples 51 | search_string = os.path.join(dataset_dir, "*") 52 | seq_list = sorted(glob.glob(search_string)) 53 | for seq in tqdm(seq_list, desc="Process sequences", total=len(seq_list)): 54 | # Make output dirs for data 55 | lidar_out_dir = os.path.join(seq, "lidar_processed") 56 | bbox_out_dir = os.path.join(seq, "bbox_processed") 57 | os.makedirs(lidar_out_dir, exist_ok=True) 58 | os.makedirs(bbox_out_dir, exist_ok=True) 59 | search_string = os.path.join(seq, "lidar", "*.pkl.gz") 60 | lidar_list = sorted(glob.glob(search_string)) 61 | lidar_pose_path = os.path.join(seq, "lidar", "poses.json") 62 | lidar_pose = read_json(lidar_pose_path) 63 | for idx, lidar_path in enumerate(lidar_list): 64 | sample_idx = os.path.splitext(os.path.basename(lidar_path))[0].split(".")[0] 65 | # Get pose of the lidar 66 | translation = lidar_pose[idx]["position"] 67 | translation = np.asarray([translation[key] for key in translation]) 68 | rotation = lidar_pose[idx]["heading"] 69 | rotation = np.asarray([rotation[key] for key in rotation]) 70 | rotation = quaternion_to_euler(*rotation) 71 | Rt = to_transform_matrix(translation, rotation) 72 | 73 | # Get respective bboxes 74 | bbox_path = lidar_path.split("/") 75 | bbox_path[-2] = "annotations/cuboids" 76 | bbox_path = os.path.join(*bbox_path) 77 | 78 | # Load data 79 | lidar = np.asarray(pd.read_pickle(lidar_path)) 80 | # Get only lidar 0 (there is also lidar 1) 81 | lidar = lidar[lidar[:, -1] == 0] 82 | intensity = lidar[:, 3] 83 | lidar = transform_lidar_box_3d(lidar, Rt) 84 | # add intensity 85 | lidar = np.concatenate((lidar, intensity[:, None]), axis=-1) 86 | 87 | # Load bboxes 88 | bboxes = np.asarray(pd.read_pickle(bbox_path)) 89 | labels, bboxes = make_xzyhwly(bboxes) 90 | corners_3d, orientation_3d = make_eight_points_boxes(bboxes) 91 | corners_3d = np.asarray( 92 | [transform_lidar_box_3d(box, Rt) for box in corners_3d] 93 | ) 94 | orientation_3d = np.asarray( 95 | [transform_lidar_box_3d(box, Rt) for box in orientation_3d] 96 | ) 97 | # filter out boxes containing fewer than 20 lidar points 98 | labels, corners_3d, orientation_3d = filter_boxes( 99 | labels, corners_3d, orientation_3d, lidar 100 | ) 101 | centroid, width, length, height, yaw = get_bboxes_parameters_from_points( 102 | corners_3d 103 | ) 104 | 105 | # Save data 106 | lidar_filename = os.path.join(lidar_out_dir, sample_idx + ".bin") 107 | save_lidar(lidar_filename, lidar.astype(np.float32)) 108 | box_filename = os.path.join(bbox_out_dir, sample_idx + ".txt") 109 | save_bboxes_to_file( 110 | box_filename, centroid, 
width, length, height, yaw, labels 111 | ) 112 | 113 | 114 | if __name__ == "__main__": 115 | parser = argparse.ArgumentParser(description="Preprocess 3D pandaset.") 116 | parser.add_argument("--dataset_dir", default="../../dataset") 117 | args = parser.parse_args() 118 | preprocess_data(args.dataset_dir) 119 | -------------------------------------------------------------------------------- /detection_3d/data_preprocessing/pandaset_tools/transform.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | 20 | import numpy as np 21 | 22 | 23 | def quaternion_to_euler(w, x, y, z): 24 | """ 25 | Converts quaternions with components w, x, y, z into a tuple (roll, pitch, yaw) 26 | 27 | """ 28 | sinr_cosp = 2 * (w * x + y * z) 29 | cosr_cosp = 1 - 2 * (x ** 2 + y ** 2) 30 | roll = np.arctan2(sinr_cosp, cosr_cosp) 31 | 32 | sinp = 2 * (w * y - z * x) 33 | pitch = np.where(np.abs(sinp) >= 1, np.sign(sinp) * np.pi / 2, np.arcsin(sinp)) 34 | 35 | siny_cosp = 2 * (w * z + x * y) 36 | cosy_cosp = 1 - 2 * (y ** 2 + z ** 2) 37 | yaw = np.arctan2(siny_cosp, cosy_cosp) 38 | 39 | return roll, pitch, yaw 40 | 41 | 42 | # Calculates Rotation Matrix given euler angles. 
43 | def eulerAnglesToRotationMatrix(theta): 44 | 45 | R_x = np.array( 46 | [ 47 | [1, 0, 0], 48 | [0, np.cos(theta[0]), -np.sin(theta[0])], 49 | [0, np.sin(theta[0]), np.cos(theta[0])], 50 | ] 51 | ) 52 | 53 | R_y = np.array( 54 | [ 55 | [np.cos(theta[1]), 0, np.sin(theta[1])], 56 | [0, 1, 0], 57 | [-np.sin(theta[1]), 0, np.cos(theta[1])], 58 | ] 59 | ) 60 | 61 | R_z = np.array( 62 | [ 63 | [np.cos(theta[2]), -np.sin(theta[2]), 0], 64 | [np.sin(theta[2]), np.cos(theta[2]), 0], 65 | [0, 0, 1], 66 | ] 67 | ) 68 | 69 | R = np.dot(R_z, np.dot(R_y, R_x)) 70 | 71 | return R 72 | 73 | 74 | def to_transform_matrix(translation, rotation): 75 | Rt = np.eye(4) 76 | Rt[:3, :3] = eulerAnglesToRotationMatrix(rotation) 77 | Rt[:3, 3] = translation 78 | return Rt 79 | 80 | 81 | def transform_lidar_box_3d(lidar, Rt): 82 | rt_inv = np.linalg.inv(Rt) 83 | 84 | lidar_3d = lidar[:, :3] 85 | lidar_3d = np.transpose(lidar_3d) 86 | 87 | ones = np.ones_like(lidar_3d[0])[None, :] 88 | hom_coord = np.concatenate((lidar_3d, ones), axis=0) 89 | lidar_3d = np.dot(rt_inv, hom_coord) 90 | lidar_3d = np.transpose(lidar_3d)[:, :3] 91 | 92 | return lidar_3d 93 | -------------------------------------------------------------------------------- /detection_3d/data_preprocessing/pandaset_tools/visualize_data.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import argparse 20 | import numpy as np 21 | import os 22 | import glob 23 | import pandas as pd 24 | from detection_3d.data_preprocessing.pandaset_tools.helpers import ( 25 | make_xzyhwly, 26 | filter_boxes, 27 | ) 28 | from detection_3d.tools.detection_helpers import ( 29 | make_eight_points_boxes, 30 | get_bboxes_parameters_from_points, 31 | ) 32 | import mayavi.mlab as mlab 33 | from tqdm import tqdm 34 | from detection_3d.tools.file_io import read_json 35 | from detection_3d.data_preprocessing.pandaset_tools.transform import ( 36 | quaternion_to_euler, 37 | to_transform_matrix, 38 | transform_lidar_box_3d, 39 | ) 40 | from detection_3d.tools.visualization_tools import visualize_lidar, visualize_bboxes_3d 41 | 42 | 43 | def visualize_data(dataset_dir): 44 | """ 45 | The function visualizes data from Pandaset. 
46 | Arguments: 47 | dataset_dir: directory with Pandaset data 48 | """ 49 | shift_lidar = [ 50 | 25, 51 | 50, 52 | 2.5, 53 | ] # The lidar origin is in the middle of the point cloud; we shift it to the top left corner of the top view image 54 | # the top view image covers the area of 50x100 meters around the car, where the lidar point cloud is most dense 55 | # Get list of data samples 56 | search_string = os.path.join(dataset_dir, "*") 57 | seq_list = sorted(glob.glob(search_string)) 58 | for seq in tqdm(seq_list, desc="Process sequences", total=len(seq_list)): 59 | search_string = os.path.join(seq, "lidar", "*.pkl.gz") 60 | lidar_list = sorted(glob.glob(search_string)) 61 | lidar_pose_path = os.path.join(seq, "lidar", "poses.json") 62 | lidar_pose = read_json(lidar_pose_path) 63 | for idx, lidar_path in enumerate(lidar_list): 64 | # Get pose of the lidar 65 | translation = lidar_pose[idx]["position"] 66 | translation = np.asarray([translation[key] for key in translation]) 67 | rotation = lidar_pose[idx]["heading"] 68 | rotation = np.asarray([rotation[key] for key in rotation]) 69 | rotation = quaternion_to_euler(*rotation) 70 | Rt = to_transform_matrix(translation, rotation) 71 | 72 | # Get respective bboxes 73 | bbox_path = lidar_path.split("/") 74 | bbox_path[-2] = "annotations/cuboids" 75 | bbox_path = os.path.join(*bbox_path) 76 | 77 | # Load data 78 | lidar = np.asarray(pd.read_pickle(lidar_path)) 79 | # Get only lidar 0 (there is also lidar 1) 80 | lidar = lidar[lidar[:, -1] == 0] 81 | intensity = lidar[:, 3] 82 | lidar = transform_lidar_box_3d(lidar, Rt) 83 | # add intensity 84 | lidar = np.concatenate((lidar, intensity[:, None]), axis=-1) 85 | 86 | # Load bboxes 87 | bboxes = np.asarray(pd.read_pickle(bbox_path)) 88 | labels, bboxes = make_xzyhwly(bboxes) 89 | corners_3d, orientation_3d = make_eight_points_boxes(bboxes) 90 | corners_3d = np.asarray( 91 | [transform_lidar_box_3d(box, Rt) for box in corners_3d] 92 | ) 93 | orientation_3d = np.asarray( 94 | [transform_lidar_box_3d(box, Rt) for box in orientation_3d] 95 | ) 96 | labels, corners_3d, orientation_3d = filter_boxes( 97 | labels, corners_3d, orientation_3d, lidar 98 | ) 99 | centroid, width, length, height, yaw = get_bboxes_parameters_from_points( 100 | corners_3d 101 | ) 102 | 103 | boxes_new = np.concatenate( 104 | ( 105 | centroid, 106 | length[:, None], 107 | width[:, None], 108 | height[:, None], 109 | yaw[:, None], 110 | ), 111 | axis=-1, 112 | ) 113 | lidar[:, :3] = lidar[:, :3] + shift_lidar 114 | 115 | corners_3d, orientation_3d = make_eight_points_boxes(boxes_new) 116 | corners_3d = corners_3d + shift_lidar 117 | orientation_3d = orientation_3d + shift_lidar 118 | figure = visualize_bboxes_3d(corners_3d, None, orientation_3d) 119 | figure = visualize_lidar(lidar, figure) 120 | mlab.show(1) 121 | input() 122 | mlab.close(figure) 123 | 124 | 125 | if __name__ == "__main__": 126 | parser = argparse.ArgumentParser(description="Visualize 3D Pandaset data.") 127 | parser.add_argument("--dataset_dir", default="../../dataset") 128 | args = parser.parse_args() 129 | visualize_data(args.dataset_dir) 130 | -------------------------------------------------------------------------------- /detection_3d/detection_dataset.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files 
(the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import tensorflow as tf 20 | import numpy as np 21 | import argparse 22 | from tqdm import tqdm 23 | from detection_3d.parameters import Parameters 24 | from detection_3d.tools.file_io import load_dataset_list, load_lidar, load_bboxes 25 | from detection_3d.tools.detection_helpers import ( 26 | make_top_view_image, 27 | make_eight_points_boxes, 28 | get_bboxes_grid, 29 | ) 30 | from detection_3d.tools.visualization_tools import visualize_2d_boxes_on_top_image 31 | from detection_3d.tools.augmentation_tools import ( 32 | random_rotate_lidar_boxes, 33 | random_flip_x_lidar_boxes, 34 | random_flip_y_lidar_boxes, 35 | ) 36 | from PIL import Image 37 | 38 | 39 | class DetectionDataset: 40 | """ 41 | The dataset layer for the 3D detection experiment 42 | Arguments: 43 | param_settings: parameters of the experiment 44 | dataset_file: name of the .datatxt dataset file 45 | augmentation, shuffle: enable augmentation / shuffling of the data (True/False) 46 | """ 47 | 48 | def __init__(self, param_settings, dataset_file, augmentation=False, shuffle=False): 49 | # Fix the random seed for reproducibility 50 | self.seed = param_settings["seed"] 51 | np.random.seed(self.seed) 52 | 53 | self.augmentation = augmentation 54 | 55 | self.param_settings = param_settings 56 | self.dataset_file = dataset_file 57 | self.inputs_list = load_dataset_list( 58 | self.param_settings["dataset_dir"], dataset_file 59 | ) 60 | self.num_samples = len(self.inputs_list) 61 | self.num_it_per_epoch = int( 62 | self.num_samples / self.param_settings["batch_size"] 63 | ) 64 | self.output_types = [tf.float32, tf.float32, tf.string] 65 | 66 | ds = tf.data.Dataset.from_tensor_slices(self.inputs_list) 67 | 68 | if shuffle: 69 | ds = ds.shuffle(self.num_samples) 70 | ds = ds.map( 71 | map_func=lambda x: tf.py_function( 72 | self.load_data, [x], Tout=self.output_types 73 | ), 74 | num_parallel_calls=12, 75 | ) 76 | ds = ds.batch(self.param_settings["batch_size"]) 77 | ds = ds.prefetch(buffer_size=tf.data.experimental.AUTOTUNE) 78 | self.dataset = ds 79 | 80 | def load_data(self, data_input): 81 | """ 82 | Loads lidar points and bounding boxes and converts them to the top view training tensors 83 | Note: This is a numpy function (wrapped with tf.py_function). 
84 | """ 85 | lidar_file, bboxes_file = np.asarray(data_input).astype("U") 86 | 87 | lidar = load_lidar(lidar_file) 88 | bboxes = load_bboxes(bboxes_file) 89 | labels = bboxes[:, -1] 90 | lidar_corners_3d, _ = make_eight_points_boxes(bboxes[:, :-1]) 91 | if self.augmentation: 92 | np.random.shuffle(lidar) 93 | if np.random.uniform(0, 1) < 0.50: # 50% probability to flip over x axis 94 | lidar, lidar_corners_3d = random_flip_x_lidar_boxes( 95 | lidar, lidar_corners_3d 96 | ) 97 | if np.random.uniform(0, 1) < 0.50: # 50% probability to flip over y axis 98 | lidar, lidar_corners_3d = random_flip_y_lidar_boxes( 99 | lidar, lidar_corners_3d 100 | ) 101 | if np.random.uniform(0, 1) < 0.80: # 80% probability to rotate 102 | lidar, lidar_corners_3d = random_rotate_lidar_boxes( 103 | lidar, lidar_corners_3d 104 | ) 105 | 106 | # Shift lidar coordinates to the positive quadrant 107 | lidar_coord = np.asarray(self.param_settings["lidar_offset"], dtype=np.float32) 108 | lidar = lidar + lidar_coord 109 | lidar_corners_3d = lidar_corners_3d + lidar_coord[:3] 110 | # Process data 111 | top_view = make_top_view_image( 112 | lidar, self.param_settings["grid_meters"], self.param_settings["voxel_size"] 113 | ) 114 | box_grid = get_bboxes_grid( 115 | labels, 116 | lidar_corners_3d, 117 | self.param_settings["grid_meters"], 118 | self.param_settings["bbox_voxel_size"], 119 | ) 120 | return top_view, box_grid, lidar_file 121 | 122 | 123 | if __name__ == "__main__": 124 | parser = argparse.ArgumentParser(description="DatasetLayer.") 125 | parser.add_argument( 126 | "--dataset_file", 127 | type=str, 128 | help="the .datatxt dataset file to load", 129 | default="train.datatxt", 130 | ) 131 | args = parser.parse_args() 132 | 133 | param_settings = Parameters().settings 134 | train_dataset = DetectionDataset(param_settings, args.dataset_file) 135 | 136 | bbox_voxel_size = np.asarray(param_settings["bbox_voxel_size"], dtype=np.float32) 137 | grid_meters = np.array(param_settings["grid_meters"], dtype=np.float32) 138 | 139 | lidar_coord = np.array(param_settings["lidar_offset"], dtype=np.float32) 140 | 141 | for samples in tqdm(train_dataset.dataset, total=train_dataset.num_it_per_epoch): 142 | top_images, boxes_grid, lidar_file = samples 143 | print( 144 | f"lidar {top_images.shape}, boxes {boxes_grid.shape}, lidar_file {lidar_file}" 145 | ) 146 | 147 | top_view = ( 148 | visualize_2d_boxes_on_top_image( 149 | boxes_grid, top_images, grid_meters, bbox_voxel_size 150 | ) 151 | * 255 152 | ) 153 | img = Image.fromarray(top_view[0].astype("uint8")) 154 | img.save("result.png") 155 | input() 156 | -------------------------------------------------------------------------------- /detection_3d/losses.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | 20 | import tensorflow as tf 21 | import numpy as np 22 | from tensorflow.keras.losses import binary_crossentropy, sparse_categorical_crossentropy 23 | 24 | 25 | def detection_loss(gt_bboxes, pred_bboxes, num_classes=26): 26 | 27 | # gt_bboxes: [batch, x_grid, y_grid, 9] 28 | # channels: (objectness, delta_x, delta_y, orient_x, orient_y, z, width, height, label) 29 | ( 30 | gt_objectness, 31 | gt_delta_xy, 32 | gt_orient_xy, 33 | gt_z_coord, 34 | gt_width, 35 | gt_height, 36 | gt_label, 37 | ) = tf.split(gt_bboxes, (1, 2, 2, 1, 1, 1, 1), axis=-1) 38 | 39 | ( 40 | p_objectness, 41 | p_delta_xy, 42 | p_orient_xy, 43 | p_z_coord, 44 | p_width, 45 | p_height, 46 | p_label, 47 | ) = tf.split(pred_bboxes, (1, 2, 2, 1, 1, 1, num_classes), axis=-1) 48 | 49 | # Objectness 50 | p_objectness = tf.sigmoid(p_objectness) 51 | obj_loss = binary_crossentropy(gt_objectness, p_objectness) 52 | 53 | # Evaluate regression only for non-zero ground truth objects 54 | obj_mask = tf.squeeze(gt_objectness, -1) 55 | 56 | # Evaluate the class and the remaining box parameters 57 | label_loss = obj_mask * sparse_categorical_crossentropy( 58 | gt_label, p_label, from_logits=True 59 | ) 60 | 61 | delta_xy_loss = obj_mask * tf.reduce_sum(tf.abs(gt_delta_xy - p_delta_xy), axis=-1) 62 | delta_orient_loss = obj_mask * tf.reduce_sum( 63 | tf.abs(gt_orient_xy - p_orient_xy), axis=-1 64 | ) 65 | 66 | z_loss = obj_mask * tf.squeeze(tf.abs(gt_z_coord - p_z_coord), -1) 67 | width_loss = obj_mask * tf.squeeze(tf.abs(gt_width - p_width), -1) 68 | height_loss = obj_mask * tf.squeeze(tf.abs(gt_height - p_height), -1) 69 | 70 | obj_loss = tf.reduce_sum(obj_loss) 71 | label_loss = tf.reduce_sum(label_loss) 72 | z_loss = tf.reduce_sum(z_loss) 73 | delta_xy_loss = tf.reduce_sum(delta_xy_loss) 74 | width_loss = tf.reduce_sum(width_loss) 75 | height_loss = tf.reduce_sum(height_loss) 76 | delta_orient_loss = tf.reduce_sum(delta_orient_loss) 77 | 78 | return ( 79 | obj_loss, 80 | label_loss, 81 | z_loss, 82 | delta_xy_loss, 83 | width_loss, 84 | height_loss, 85 | delta_orient_loss, 86 | ) 87 | 88 | -------------------------------------------------------------------------------- /detection_3d/metrics.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | 20 | 21 | import tensorflow as tf 22 | from detection_3d.tools.file_io import save_to_json 23 | import numpy as np 24 | import os 25 | 26 | 27 | class EpochMetrics: 28 | """ 29 | The class accumulates the mean loss 30 | for the train and validation steps 31 | """ 32 | 33 | def __init__(self): 34 | self.train_loss = tf.keras.metrics.Mean(name="train_loss") 35 | self.val_loss = tf.keras.metrics.Mean(name="val_loss") 36 | 37 | def reset(self): 38 | """ 39 | Reset all metrics to zero (needs to be done each epoch) 40 | """ 41 | self.train_loss.reset_states() 42 | self.val_loss.reset_states() 43 | 44 | def save_to_json(self, dir_to_save): 45 | """ 46 | Save all metrics to the json file 47 | """ 48 | 49 | # Check that the folder exists or create it 50 | os.makedirs(dir_to_save, exist_ok=True) 51 | json_filename = os.path.join(dir_to_save, "epoch_metrics.json") 52 | # fill the dict 53 | metrics_dict = { 54 | "train_loss": str(self.train_loss.result().numpy()), 55 | "val_loss": str(self.val_loss.result().numpy()), 56 | } 57 | save_to_json(json_filename, metrics_dict) 58 | 59 | def print_metrics(self): 60 | """ 61 | Print all metrics 62 | """ 63 | train_loss = np.around(self.train_loss.result().numpy(), decimals=2) 64 | val_loss = np.around(self.val_loss.result().numpy(), decimals=2) 65 | 66 | template = "train_loss {}, val_loss {}".format(train_loss, val_loss) 67 | print(template) 68 | -------------------------------------------------------------------------------- /detection_3d/model.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 
18 | """ 19 | 20 | 21 | import tensorflow as tf 22 | from tensorflow.keras.layers import ( 23 | Conv2D, 24 | Layer, 25 | UpSampling2D, 26 | BatchNormalization, 27 | LeakyReLU, 28 | ) 29 | from tensorflow.keras import Model 30 | from tensorflow.keras.regularizers import l2 31 | 32 | 33 | class DarkNetConv2D(Layer): 34 | """ 35 | The darknet conv layer yolo_v3 36 | """ 37 | 38 | def __init__( 39 | self, 40 | filters, 41 | kernel, 42 | strides, 43 | padding, 44 | weight_decay, 45 | batch_norm=True, 46 | activation_funct=True, 47 | data_format="channels_last", 48 | ): 49 | super(DarkNetConv2D, self).__init__() 50 | self.batch_norm = batch_norm 51 | self.activation_funct = activation_funct 52 | self.conv = Conv2D( 53 | filters, 54 | kernel, 55 | strides=strides, 56 | activation=None, 57 | kernel_regularizer=l2(weight_decay), 58 | padding=padding, 59 | data_format=data_format, 60 | ) 61 | self.bn = BatchNormalization() 62 | self.activation = LeakyReLU(alpha=0.1) 63 | 64 | def call(self, x, training=False): 65 | x = self.conv(x) 66 | if self.batch_norm: 67 | x = self.bn(x, training=training) 68 | if self.activation_funct: 69 | x = self.activation(x) 70 | return x 71 | 72 | 73 | class DarkNetBlock(Layer): 74 | """ 75 | The darknet block layer 76 | """ 77 | 78 | def __init__( 79 | self, filters, weight_decay, batch_norm=True, data_format="channels_last" 80 | ): 81 | super(DarkNetBlock, self).__init__() 82 | 83 | self.conv1 = DarkNetConv2D( 84 | filters // 2, 85 | 1, 86 | strides=1, 87 | padding="same", 88 | weight_decay=weight_decay, 89 | batch_norm=batch_norm, 90 | data_format=data_format, 91 | ) 92 | self.conv2 = DarkNetConv2D( 93 | filters, 94 | 3, 95 | strides=1, 96 | padding="same", 97 | weight_decay=weight_decay, 98 | batch_norm=batch_norm, 99 | data_format=data_format, 100 | ) 101 | 102 | def call(self, x, training=False): 103 | prev = x 104 | x = self.conv1(x, training=training) 105 | x = self.conv2(x, training=training) 106 | return prev + x 107 | 108 | 109 | class DarkNetDecoderBlock(Layer): 110 | """ 111 | The yolo v3 decoder layer 112 | """ 113 | 114 | def __init__(self, filters, weight_decay, data_format="channels_last"): 115 | super(DarkNetDecoderBlock, self).__init__() 116 | 117 | self.conv1_1 = DarkNetConv2D( 118 | filters, 119 | (1, 1), 120 | strides=(1, 1), 121 | weight_decay=weight_decay, 122 | padding="same", 123 | data_format=data_format, 124 | ) 125 | self.conv1_2 = DarkNetConv2D( 126 | filters * 2, 127 | (3, 3), 128 | strides=(1, 1), 129 | weight_decay=weight_decay, 130 | padding="same", 131 | data_format=data_format, 132 | ) 133 | self.conv2_1 = DarkNetConv2D( 134 | filters, 135 | (1, 1), 136 | strides=(1, 1), 137 | weight_decay=weight_decay, 138 | padding="same", 139 | data_format=data_format, 140 | ) 141 | self.conv2_2 = DarkNetConv2D( 142 | filters * 2, 143 | (3, 3), 144 | strides=(1, 1), 145 | weight_decay=weight_decay, 146 | padding="same", 147 | data_format=data_format, 148 | ) 149 | self.conv3 = DarkNetConv2D( 150 | filters, 151 | (1, 1), 152 | strides=(1, 1), 153 | weight_decay=weight_decay, 154 | padding="same", 155 | data_format=data_format, 156 | ) 157 | 158 | def call(self, x, training=False): 159 | x = self.conv1_1(x, training=training) 160 | x = self.conv1_2(x, training=training) 161 | x = self.conv2_1(x, training=training) 162 | x = self.conv2_2(x, training=training) 163 | x = self.conv3(x, training=training) 164 | return x 165 | 166 | 167 | class DarkNetEncoder(Layer): 168 | """ 169 | The darknet 53 encoder from yolo_v3 170 | See: 
https://arxiv.org/abs/1804.02767 171 | """ 172 | 173 | def __init__(self, name, weight_decay, data_format="channels_last"): 174 | super(DarkNetEncoder, self).__init__(name=name) 175 | # Input 176 | self.conv1 = DarkNetConv2D( 177 | 32, 178 | (3, 3), 179 | strides=(1, 1), 180 | weight_decay=weight_decay, 181 | padding="same", 182 | data_format=data_format, 183 | ) 184 | # Conv with stride 2 185 | self.conv2 = DarkNetConv2D( 186 | 64, 187 | (3, 3), 188 | strides=(2, 2), 189 | weight_decay=weight_decay, 190 | padding="same", 191 | data_format=data_format, 192 | ) 193 | # Residual block 194 | self.block_1 = DarkNetBlock( 195 | 64, weight_decay=weight_decay, data_format=data_format 196 | ) 197 | # Conv with stride 2 198 | self.conv3 = DarkNetConv2D( 199 | 128, 200 | (3, 3), 201 | strides=(2, 2), 202 | weight_decay=weight_decay, 203 | padding="same", 204 | data_format=data_format, 205 | ) 206 | # Residual blocks 2x 207 | self.block_2 = [] 208 | for _ in range(2): 209 | self.block_2.append( 210 | DarkNetBlock(128, weight_decay=weight_decay, data_format=data_format) 211 | ) 212 | # Conv with stride 2 213 | self.conv4 = DarkNetConv2D( 214 | 256, 215 | (3, 3), 216 | strides=(2, 2), 217 | weight_decay=weight_decay, 218 | padding="same", 219 | data_format=data_format, 220 | ) 221 | # Residual blocks 8x 222 | self.block_3 = [] 223 | for _ in range(8): 224 | self.block_3.append( 225 | DarkNetBlock(256, weight_decay=weight_decay, data_format=data_format) 226 | ) 227 | # Conv with stride 2 228 | self.conv5 = DarkNetConv2D( 229 | 512, 230 | (3, 3), 231 | strides=(2, 2), 232 | weight_decay=weight_decay, 233 | padding="same", 234 | data_format=data_format, 235 | ) 236 | # Residual blocks 8x 237 | self.block_4 = [] 238 | for _ in range(8): 239 | self.block_4.append( 240 | DarkNetBlock(512, weight_decay=weight_decay, data_format=data_format) 241 | ) 242 | # Conv with stride 2 243 | self.conv6 = DarkNetConv2D( 244 | 1024, 245 | (3, 3), 246 | strides=(2, 2), 247 | weight_decay=weight_decay, 248 | padding="same", 249 | data_format=data_format, 250 | ) 251 | # Residual blocks 4x 252 | self.block_5 = [] 253 | for _ in range(4): 254 | self.block_5.append( 255 | DarkNetBlock(1024, weight_decay=weight_decay, data_format=data_format) 256 | ) 257 | 258 | def call(self, x, training=False): 259 | x = self.conv1(x, training=training) 260 | x = self.conv2(x, training=training) 261 | x = x_b1 = self.block_1(x, training=training) 262 | x = self.conv3(x, training=training) 263 | for i in range(len(self.block_2)): 264 | x = x_b2 = self.block_2[i](x, training=training) 265 | x = self.conv4(x, training=training) 266 | for i in range(len(self.block_3)): 267 | x = x_b3 = self.block_3[i](x, training=training) 268 | x = self.conv5(x, training=training) 269 | for i in range(len(self.block_4)): 270 | x = x_b4 = self.block_4[i](x, training=training) 271 | x = self.conv6(x, training=training) 272 | for i in range(len(self.block_5)): 273 | x = x_b5 = self.block_5[i](x, training=training) 274 | 275 | return x_b5, x_b4, x_b3, x_b2, x_b1 276 | 277 | 278 | class DarkNetDecoder(Layer): 279 | """ 280 | The yolo v3 decoder 281 | """ 282 | 283 | def __init__(self, name, weight_decay, data_format="channels_last"): 284 | super(DarkNetDecoder, self).__init__(name=name) 285 | 286 | self.decoder_block_1 = DarkNetDecoderBlock( 287 | filters=512, weight_decay=weight_decay, data_format=data_format 288 | ) 289 | 290 | self.conv1 = DarkNetConv2D( 291 | 256, 292 | (1, 1), 293 | strides=(1, 1), 294 | weight_decay=weight_decay, 295 | padding="same", 296 | 
data_format=data_format, 297 | ) 298 | self.up1 = UpSampling2D(size=(2, 2), data_format=data_format) 299 | self.decoder_block_2 = DarkNetDecoderBlock( 300 | filters=256, weight_decay=weight_decay, data_format=data_format 301 | ) 302 | 303 | self.conv2 = DarkNetConv2D( 304 | 128, 305 | (1, 1), 306 | strides=(1, 1), 307 | weight_decay=weight_decay, 308 | padding="same", 309 | data_format=data_format, 310 | ) 311 | self.up2 = UpSampling2D(size=(2, 2), data_format=data_format) 312 | self.decoder_block_3 = DarkNetDecoderBlock( 313 | filters=128, weight_decay=weight_decay, data_format=data_format 314 | ) 315 | self.up3 = UpSampling2D(size=(2, 2), data_format=data_format) 316 | self.decoder_block_4 = DarkNetDecoderBlock( 317 | filters=64, weight_decay=weight_decay, data_format=data_format 318 | ) 319 | self.up4 = UpSampling2D(size=(2, 2), data_format=data_format) 320 | self.decoder_block_5 = DarkNetDecoderBlock( 321 | filters=64, weight_decay=weight_decay, data_format=data_format 322 | ) 323 | 324 | def call(self, x_in, training=False): 325 | # First lvl 326 | x_b5, x_b4, x_b3, x_b2, x_b1 = x_in 327 | x = self.decoder_block_1(x_b5, training=training) 328 | # Second lvl 329 | x = self.conv1(x, training=training) 330 | x = self.up1(x) 331 | x = tf.concat([x, x_b4], axis=-1) 332 | x = self.decoder_block_2(x, training=training) 333 | # Third lvl 334 | x = self.conv2(x, training=training) 335 | x = self.up2(x) 336 | x = tf.concat([x, x_b3], axis=-1) 337 | x = self.decoder_block_3(x, training=training) 338 | x = self.up3(x) 339 | x = tf.concat([x, x_b2], axis=-1) 340 | x = self.decoder_block_4(x, training=training) 341 | x = self.up4(x) 342 | x = tf.concat([x, x_b1], axis=-1) 343 | x = self.decoder_block_5(x, training=training) 344 | 345 | return x 346 | 347 | 348 | class YoloV3_Lidar(Model): 349 | def __init__(self, weight_decay, num_classes=26, data_format="channels_last"): 350 | super(YoloV3_Lidar, self).__init__(name="YoloV3_Lidar") 351 | self.encoder = DarkNetEncoder( 352 | name="DarkNetEncoder", weight_decay=weight_decay, data_format=data_format 353 | ) 354 | self.decoder = DarkNetDecoder( 355 | name="DarkNetDecoder", weight_decay=weight_decay, data_format=data_format 356 | ) 357 | self.final_layer = Conv2D( 358 | 8 + num_classes, 359 | (1, 1), 360 | activation=None, 361 | padding="same", 362 | data_format=data_format, 363 | ) 364 | 365 | def call(self, x, training): 366 | x = self.encoder(x, training=training) 367 | x = self.decoder(x, training=training) 368 | x = self.final_layer(x) 369 | return x 370 | -------------------------------------------------------------------------------- /detection_3d/parameters.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import os 20 | from detection_3d.tools.file_io import save_to_json 21 | from detection_3d.tools.statics import NO_SCHEDULER, RESTARTS_SCHEDULER, ADAM 22 | 23 | 24 | class Parameters(object): 25 | """ 26 | The class contains experiment parameters. 27 | """ 28 | 29 | def __init__(self): 30 | 31 | self.settings = { 32 | # The directory of the dataset 33 | "dataset_dir": "dataset", 34 | "batch_size": 4, 35 | # The checkpoint related 36 | "checkpoints_dir": "log/checkpoints", 37 | "train_summaries": "log/summaries/train", 38 | "eval_summaries": "log/summaries/val", 39 | # Update tensorboard train images each step_summaries iterations 40 | "step_summaries": 100, # set to None to turn off 41 | # General settings 42 | "seed": 2020, 43 | "max_epochs": 1000, 44 | "weight_decay": 1, 45 | } 46 | 47 | # Set special parameters 48 | self.settings["optimizer"] = ADAM 49 | self.settings["scheduler"] = SchedulerSettings.no_scheduler() 50 | self.settings["augmentation"] = True 51 | # Detection related 52 | self.settings["grid_meters"] = [52.0, 104.0, 8.0] # [x, y, z] in meters 53 | # [x,y,z, intensity] offset to shift all lidar points in positive coordinate quadrant 54 | # (all x,y,z coords >=0) 55 | self.settings["lidar_offset"] = [26.0, 52.0, 2.5, 0.0] 56 | # [x,y,z] top view voxel size in meters 57 | self.settings["voxel_size"] = [0.125, 0.125, 8.0] 58 | # [x,y,z] bbox grid voxel size in meters 59 | self.settings["bbox_voxel_size"] = [0.25, 0.25, 1.0] 60 | 61 | # Parameters defined automatically during training 62 | self.settings["train_size"] = None # the size of train set 63 | self.settings["val_size"] = None # the size of val set 64 | 65 | def save_to_json(self, dir_to_save): 66 | """ 67 | Save parameters to .json 68 | """ 69 | # Check that the folder exists or create it 70 | os.makedirs(dir_to_save, exist_ok=True) 71 | json_filename = os.path.join(dir_to_save, "parameters.json") 72 | save_to_json(json_filename, self.settings) 73 | 74 | 75 | class SchedulerSettings: 76 | """ 77 | The class contains parameters for different schedulers. 78 | """ 79 | 80 | def __init__(self): 81 | pass 82 | 83 | # Supported schedulers 84 | @staticmethod 85 | def no_scheduler(): 86 | """ 87 | Constant learning rate scheduler. 88 | """ 89 | scheduler = { 90 | "name": NO_SCHEDULER, 91 | "initial_learning_rate": 1e-5, 92 | } 93 | return scheduler 94 | 95 | @staticmethod 96 | def restarts_scheduler(): 97 | """ 98 | The warm restarts scheduler. 
99 | See: https://arxiv.org/abs/1608.03983 100 | """ 101 | scheduler = { 102 | "name": RESTARTS_SCHEDULER, 103 | "initial_learning_rate": 1e-4, # 2e-3 104 | "first_decay_steps": 80, # Important: converted from epochs to iterations 105 | "t_mul": 2.0, 106 | "m_mul": 1.0, 107 | "alpha": 1e-6, 108 | } 109 | return scheduler 110 | -------------------------------------------------------------------------------- /detection_3d/tools/augmentation_tools.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | 20 | import numpy as np 21 | from detection_3d.data_preprocessing.pandaset_tools.transform import ( 22 | eulerAnglesToRotationMatrix, 23 | ) 24 | 25 | 26 | def random_rotate_lidar_boxes( 27 | lidar, lidar_corners_3d, min_angle=-np.pi / 4, max_angle=np.pi / 4 28 | ): 29 | yaw = np.random.uniform(min_angle, max_angle) 30 | R = eulerAnglesToRotationMatrix([0, 0, yaw]) 31 | lidar = np.transpose(lidar) 32 | lidar_corners_3d = np.transpose(lidar_corners_3d, (0, 2, 1)) 33 | 34 | lidar[:3] = np.matmul(R, lidar[:3]) 35 | lidar_corners_3d = np.matmul(R, lidar_corners_3d) 36 | 37 | lidar_corners_3d = np.transpose(lidar_corners_3d, (0, 2, 1)) 38 | lidar = np.transpose(lidar) 39 | return lidar, lidar_corners_3d 40 | 41 | 42 | def random_flip_x_lidar_boxes(lidar, lidar_corners_3d): 43 | lidar[:, 0] = -lidar[:, 0] 44 | lidar_corners_3d[:, :, 0] = -lidar_corners_3d[:, :, 0] 45 | return lidar, lidar_corners_3d 46 | 47 | 48 | def random_flip_y_lidar_boxes(lidar, lidar_corners_3d): 49 | lidar[:, 1] = -lidar[:, 1] 50 | lidar_corners_3d[:, :, 1] = -lidar_corners_3d[:, :, 1] 51 | return lidar, lidar_corners_3d 52 | -------------------------------------------------------------------------------- /detection_3d/tools/detection_helpers.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | 
-------------------------------------------------------------------------------- /detection_3d/tools/detection_helpers.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import numpy as np 20 | import tensorflow as tf 21 | 22 | 23 | def rot_z(t): 24 | """ Rotation about the z-axis. """ 25 | c = np.cos(t) 26 | s = np.sin(t) 27 | ones = np.ones_like(c) 28 | zeros = np.zeros_like(c) 29 | return np.asarray([[c, -s, zeros], [s, c, zeros], [zeros, zeros, ones]]) 30 | 31 | 32 | def make_eight_points_boxes(bboxes_xyzlwhy): 33 | bboxes_xyzlwhy = np.asarray(bboxes_xyzlwhy) 34 | l = bboxes_xyzlwhy[:, 3] / 2.0 35 | w = bboxes_xyzlwhy[:, 4] / 2.0 36 | h = bboxes_xyzlwhy[:, 5] / 2.0 37 | # 3d bounding box corners 38 | x_corners = np.asarray([l, l, -l, -l, l, l, -l, -l]) 39 | y_corners = np.asarray([w, -w, -w, w, w, -w, -w, w]) 40 | z_corners = np.asarray([-h, -h, -h, -h, h, h, h, h]) 41 | corners_3d = np.concatenate(([x_corners], [y_corners], [z_corners]), axis=0) 42 | yaw = np.asarray(bboxes_xyzlwhy[:, -1], dtype=float)  # np.float is deprecated; use builtin float 43 | corners_3d = np.transpose(corners_3d, (2, 0, 1)) 44 | R = np.transpose(rot_z(yaw), (2, 0, 1)) 45 | 46 | corners_3d = np.matmul(R, corners_3d) 47 | 48 | centroid = bboxes_xyzlwhy[:, :3] 49 | corners_3d += centroid[:, :, None] 50 | orient_p = (corners_3d[:, :, 0] + corners_3d[:, :, 7]) / 2.0 51 | orientation_3d = np.concatenate( 52 | (centroid[:, :, None], orient_p[:, :, None]), axis=-1 53 | ) 54 | corners_3d = np.transpose(corners_3d, (0, 2, 1)) 55 | orientation_3d = np.transpose(orientation_3d, (0, 2, 1)) 56 | 57 | return corners_3d, orientation_3d 58 | 59 | 60 | def get_bboxes_parameters_from_points(lidar_corners_3d): 61 | """ 62 | Recovers box parameters (centroid [x, y, z], width, length, height, yaw) from corner points 63 | 64 | Arguments: 65 | lidar_corners_3d: [num_boxes, 8, 3] 66 | """ 67 | centroid = (lidar_corners_3d[:, -2, :] + lidar_corners_3d[:, 0, :]) / 2.0 68 | delta_l = lidar_corners_3d[:, 0, :2] - lidar_corners_3d[:, 1, :2] 69 | delta_w = lidar_corners_3d[:, 1, :2] - lidar_corners_3d[:, 2, :2] 70 | width = np.linalg.norm(delta_w, axis=-1) 71 | length = np.linalg.norm(delta_l, axis=-1) 72 | 73 | height = lidar_corners_3d[:, -1, -1] - lidar_corners_3d[:, 0, -1] 74 | yaw = np.arctan2(delta_l[:, 1], delta_l[:, 0]) 75 | 76 | return centroid, width, length, height, yaw 77 | 78 | 79 | def get_voxels_grid(voxel_size, grid_meters): 80 | voxel_size = np.asarray(voxel_size, np.float32) 81 | grid_size_meters = np.asarray(grid_meters, np.float32) 82 | voxels_grid = np.asarray(grid_size_meters / voxel_size, np.int32) 83 | return voxels_grid 84 | 85 | 
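A short sketch of the two helpers above: make_eight_points_boxes expands a parametrized box [x, y, z, l, w, h, yaw] into its 8 corners plus an orientation segment, and get_voxels_grid derives the grid resolution. The metric extents here are illustrative assumptions; only the 12.5 cm cell size is taken from the README:

```
import numpy as np
from detection_3d.tools.detection_helpers import (
    make_eight_points_boxes,
    get_voxels_grid,
)

# One box: x, y, z, l, w, h, yaw -- a 4 x 2 x 1.5 m box at (10, 5, 0.75), yawed 30 degrees.
boxes_xyzlwhy = np.array([[10.0, 5.0, 0.75, 4.0, 2.0, 1.5, np.deg2rad(30.0)]])
corners, orientation = make_eight_points_boxes(boxes_xyzlwhy)
print(corners.shape)      # (1, 8, 3) -- eight 3D corners per box
print(orientation.shape)  # (1, 2, 3) -- centroid plus a point marking the heading

# An (assumed) 80 x 80 x 4 m grid at 12.5 cm resolution:
print(get_voxels_grid([0.125, 0.125, 0.125], [80.0, 80.0, 4.0]))  # [640 640  32]
```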
90 | """ 91 | voxels_grid = get_voxels_grid(bbox_voxel_size, grid_meters) 92 | # Find box parameters 93 | centroid, width, length, height, _ = get_bboxes_parameters_from_points( 94 | lidar_corners_3d 95 | ) 96 | # find the vector of orientation [centroid, orient_point] 97 | orient_point = (lidar_corners_3d[:, 1] + lidar_corners_3d[:, 2]) / 2.0 98 | 99 | voxel_coordinates = np.asarray( 100 | np.floor(centroid[:, :2] / bbox_voxel_size[:2]), np.int32 101 | ) 102 | # Filter bboxes not fall in the grid 103 | bound_x = (voxel_coordinates[:, 0] >= 0) & ( 104 | voxel_coordinates[:, 0] < voxels_grid[0] 105 | ) 106 | bound_y = (voxel_coordinates[:, 1] >= 0) & ( 107 | voxel_coordinates[:, 1] < voxels_grid[1] 108 | ) 109 | mask = bound_x & bound_y 110 | # Filter all non related bboxes 111 | centroid = centroid[mask] 112 | orient_point = orient_point[mask] 113 | width = width[mask] 114 | length = length[mask] 115 | height = height[mask] 116 | bbox_labels = bbox_labels[mask] 117 | voxel_coordinates = voxel_coordinates[mask] 118 | # Confidence 119 | confidence = np.ones_like(width) 120 | 121 | # Voxels close corners to the coordinate system origin (0,0,0) 122 | voxels_close_corners = ( 123 | np.asarray(voxel_coordinates, np.float32) * bbox_voxel_size[:2] 124 | ) 125 | # Get x,y, coordinate 126 | delta_xy = centroid[:, :2] - voxels_close_corners 127 | orient_xy = orient_point[:, :2] - voxels_close_corners 128 | z_coord = centroid[:, -1] 129 | 130 | # print( 131 | # f"confidence {confidence.shape}, delta_xy {delta_xy.shape}, orient_xy {orient_xy.shape}, z_coord {z_coord.shape}, width {width.shape}, height {height.shape}, bbox_labels {bbox_labels.shape}" 132 | # ) 133 | # (x_grid, y_grid, (objectness, min_delta_x, min_delta_y, max_delta_x, max_delta_y, z, label)) 134 | # objectness means 1 if box exists for this grid cell else 0 135 | output_tensor = np.zeros((voxels_grid[0], voxels_grid[1], 9), np.float32) 136 | if len(bbox_labels) > 0: 137 | data = np.concatenate( 138 | ( 139 | confidence[:, None], 140 | delta_xy, 141 | orient_xy, 142 | z_coord[:, None], 143 | width[:, None], 144 | height[:, None], 145 | bbox_labels[:, None], 146 | ), 147 | axis=-1, 148 | ) 149 | output_tensor[voxel_coordinates[:, 0], voxel_coordinates[:, 1]] = data 150 | return output_tensor 151 | 152 | 153 | def get_boxes_from_box_grid(box_grid, bbox_voxel_size, conf_trhld=0.0): 154 | 155 | # Get non-zero voxels 156 | objectness, delta_xy, orient_xy, z_coord, width, height, label = tf.split( 157 | box_grid, (1, 2, 2, 1, 1, 1, -1), axis=-1 158 | ) 159 | 160 | mask = box_grid[:, :, 0] > conf_trhld 161 | valid_idx = tf.where(mask) 162 | 163 | z_coord = tf.gather_nd(z_coord, valid_idx) 164 | width = tf.gather_nd(width, valid_idx) 165 | height = tf.gather_nd(height, valid_idx) 166 | objectness = tf.gather_nd(objectness, valid_idx) 167 | label = tf.gather_nd(label, valid_idx) 168 | delta_xy = tf.gather_nd(delta_xy, valid_idx) 169 | orient_xy = tf.gather_nd(orient_xy, valid_idx) 170 | voxels_close_corners = tf.cast(valid_idx, tf.float32) * bbox_voxel_size[None, :2] 171 | xy_coord = delta_xy + voxels_close_corners 172 | xy_orient = orient_xy + voxels_close_corners 173 | 174 | delta = xy_orient[:, :2] - xy_coord[:, :2] 175 | length = 2 * tf.norm(delta, axis=-1, keepdims=True) 176 | yaw = tf.expand_dims(tf.atan2(delta[:, 1], delta[:, 0]), axis=-1) 177 | 178 | bbox = tf.concat([xy_coord, z_coord, length, width, height, yaw], axis=-1,) 179 | return bbox, label, objectness 180 | 181 | 182 | def make_top_view_image(lidar, grid_meters, voxels_size, 
channels=3): 183 | """ 184 | The function makes a top view image from the lidar scan 185 | Arguments: 186 | lidar: lidar array of the shape [num_points, 4] with (x, y, z, intensity) columns 187 | grid_meters: metric extent of the grid [x, y, z] 188 | voxels_size: metric size of a single voxel [x, y, z] 189 | channels: number of channels of the top view image (unused, the output has 2) 190 | """ 191 | mask_x = (lidar[:, 0] >= 0) & (lidar[:, 0] < grid_meters[0]) 192 | mask_y = (lidar[:, 1] >= 0) & (lidar[:, 1] < grid_meters[1]) 193 | mask_z = (lidar[:, 2] >= 0) & (lidar[:, 2] < grid_meters[2]) 194 | mask = mask_x & mask_y & mask_z 195 | lidar = lidar[mask] 196 | voxel_grid = get_voxels_grid(voxels_size, grid_meters) 197 | voxels = np.asarray(np.floor(lidar[:, :3] / voxels_size), np.int32) 198 | top_view = np.zeros((voxel_grid[0], voxel_grid[1], 2), np.float32) 199 | top_view[voxels[:, 0], voxels[:, 1], 0] = lidar[:, 2]  # z values (last point per cell wins) 200 | top_view[voxels[:, 0], voxels[:, 1], 1] = lidar[:, 3]  # intensity values 201 | 202 | return top_view 203 | -------------------------------------------------------------------------------- /detection_3d/tools/file_io.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 
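make_top_view_image, defined just above, is essentially the rasterizer behind the top-view input: points are binned into a 2D grid, and each occupied cell stores the height and intensity of the last point that fell into it. A toy sketch with assumed grid extents (only the 12.5 cm cell size comes from the README):

```
import numpy as np
from detection_3d.tools.detection_helpers import make_top_view_image

grid_meters = [80.0, 80.0, 4.0]      # assumed metric extent of the grid
voxels_size = [0.125, 0.125, 0.125]  # 12.5 cm cells, as described in the README

# Synthetic scan: 1000 points with (x, y, z, intensity); out-of-grid points are dropped.
lidar = (np.random.rand(1000, 4) * [80.0, 80.0, 4.0, 1.0]).astype(np.float32)
top_view = make_top_view_image(lidar, grid_meters, voxels_size)
print(top_view.shape)  # (640, 640, 2): channel 0 = height (z), channel 1 = intensity
```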
18 | """ 19 | import tensorflow as tf 20 | import matplotlib.pyplot as plt 21 | import numpy as np 22 | import os 23 | import json 24 | from detection_3d.data_preprocessing.pandaset_tools.helpers import labels 25 | 26 | 27 | def load_and_resize_image(image_filename, resize=None, data_type=tf.float32): 28 | """ 29 | Load png image to tensor and resize if necessary 30 | Arguments: 31 | image_filename: image file to load 32 | resize: tensor [new_width, new_height] or None 33 | Return: 34 | img: tensor of the size [1, H, W, 3] 35 | """ 36 | 37 | img = tf.io.read_file(image_filename) 38 | img = tf.image.decode_png(img) 39 | # Add batch dim 40 | img = tf.expand_dims(img, axis=0) 41 | 42 | if resize is not None: 43 | img = tf.compat.v1.image.resize_nearest_neighbor(img, resize) 44 | 45 | img = tf.cast(img, data_type) 46 | return img 47 | 48 | 49 | def save_plot_to_image(file_to_save, figure): 50 | """ 51 | Save matplotlib figure to image and close 52 | """ 53 | plt.savefig(file_to_save) 54 | plt.close(figure) 55 | 56 | 57 | def read_json(json_filename): 58 | with open(json_filename) as json_file: 59 | data = json.load(json_file) 60 | return data 61 | 62 | 63 | def save_bboxes_to_file( 64 | filename, centroid, width, length, height, alpha, label, delim=";" 65 | ): 66 | 67 | if centroid is not None: 68 | with open(filename, "w") as the_file: 69 | for c, w, l, h, a, lbl in zip( 70 | centroid, width, length, height, alpha, label 71 | ): 72 | data = ( 73 | delim.join( 74 | ( 75 | str(c[0]), 76 | str(c[1]), 77 | str(c[2]), 78 | str(l), 79 | str(w), 80 | str(h), 81 | str(a), 82 | str(lbl), 83 | ) 84 | ) 85 | + "\n" 86 | ) 87 | # data = "{};{};{};{};{};{};{};{}\n".format( 88 | # c[0], c[1], c[2], l, w, h, a, lbl 89 | # ) 90 | the_file.write(data) 91 | 92 | 93 | def load_bboxes(label_filename, label_string=True): 94 | # returns the array with [num_boxes, (bbox_parm)] 95 | with open(label_filename) as f: 96 | bboxes = np.asarray([line.rstrip().split(";") for line in f]) 97 | # Convert labels to numbers 98 | if label_string: 99 | bboxes[:, -1] = [labels[label] for label in bboxes[:, -1]] 100 | bboxes = np.asarray(bboxes, dtype=float) 101 | return bboxes 102 | 103 | 104 | def load_lidar(lidar_filename, dtype=np.float32, n_vec=4): 105 | scan = np.fromfile(lidar_filename, dtype=dtype) 106 | scan = scan.reshape((-1, n_vec)) 107 | return scan 108 | 109 | 110 | def save_lidar(lidar_filename, scan): 111 | scan = scan.reshape((-1)) 112 | scan.tofile(lidar_filename) 113 | 114 | 115 | def save_to_json(json_filename, dict_to_save): 116 | """ 117 | Save to json file 118 | """ 119 | with open(json_filename, "w") as f: 120 | json.dump(dict_to_save, f, indent=2) 121 | 122 | 123 | def save_dataset_list(dataset_file, data_list): 124 | """ 125 | Saves dataset list to file. 126 | """ 127 | with open(dataset_file, "w") as f: 128 | for item in data_list: 129 | f.write("%s\n" % item) 130 | 131 | 132 | def load_dataset_list(dataset_dir, dataset_file, delimiter=";"): 133 | """ 134 | The function loads list of data from dataset 135 | file. 136 | Args: 137 | dataset_file: path to the .dataset file. 138 | Returns: 139 | dataset_list: list of data. 
140 | """ 141 | 142 | file_path = os.path.join(dataset_dir, dataset_file) 143 | dataset_list = [] 144 | with open(file_path) as f: 145 | dataset_list = f.readlines() 146 | dataset_list = [x.strip().split(delimiter) for x in dataset_list] 147 | return dataset_list 148 | -------------------------------------------------------------------------------- /detection_3d/tools/statics.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | 20 | NO_SCHEDULER = "no_scheduler" 21 | RESTARTS_SCHEDULER = "restarts" 22 | 23 | ADAM = "adam" 24 | -------------------------------------------------------------------------------- /detection_3d/tools/summary_helpers.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 
18 | """ 19 | import tensorflow as tf 20 | import numpy as np 21 | from detection_3d.tools.visualization_tools import visualize_2d_boxes_on_top_image 22 | 23 | 24 | def train_summaries(train_out, optimizer, param_settings, learning_rate): 25 | """ 26 | Visualizes the train outputs in tensorboards 27 | """ 28 | 29 | writer = tf.summary.create_file_writer(param_settings["train_summaries"]) 30 | with writer.as_default(): 31 | # Losses 32 | ( 33 | obj_loss, 34 | label_loss, 35 | z_loss, 36 | delta_xy_loss, 37 | width_loss, 38 | height_loss, 39 | delta_orient_loss, 40 | ) = train_out["losses"] 41 | 42 | # Show learning rate given scheduler 43 | if param_settings["scheduler"]["name"] != "no_scheduler": 44 | with tf.name_scope("Optimizer info"): 45 | step = float( 46 | optimizer.iterations.numpy() 47 | ) # triangular_scheduler learning rate needs float dtype 48 | tf.summary.scalar( 49 | "learning_rate", learning_rate(step), step=optimizer.iterations 50 | ) 51 | with tf.name_scope("Training losses"): 52 | tf.summary.scalar( 53 | "1.Total loss", train_out["total_loss"], step=optimizer.iterations 54 | ) 55 | tf.summary.scalar("2.obj loss", obj_loss, step=optimizer.iterations) 56 | tf.summary.scalar("3.label_loss", label_loss, step=optimizer.iterations) 57 | tf.summary.scalar("4. z_loss", z_loss, step=optimizer.iterations) 58 | tf.summary.scalar( 59 | "5. delta_xy_loss", delta_xy_loss, step=optimizer.iterations 60 | ) 61 | tf.summary.scalar("6. width_loss", width_loss, step=optimizer.iterations) 62 | tf.summary.scalar("8. height_loss", height_loss, step=optimizer.iterations) 63 | tf.summary.scalar( 64 | "9. delta_orient_loss", delta_orient_loss, step=optimizer.iterations 65 | ) 66 | 67 | if ( 68 | param_settings["step_summaries"] is not None 69 | and optimizer.iterations % param_settings["step_summaries"] == 0 70 | ): 71 | bbox_voxel_size = np.asarray( 72 | param_settings["bbox_voxel_size"], dtype=np.float32 73 | ) 74 | lidar_coord = np.array(param_settings["lidar_offset"], dtype=np.float32) 75 | gt_bboxes = train_out["box_grid"] 76 | p_bboxes = train_out["predictions"] 77 | grid_meters = param_settings["grid_meters"] 78 | top_view = train_out["top_view"] 79 | gt_top_view = visualize_2d_boxes_on_top_image( 80 | gt_bboxes, top_view, grid_meters, bbox_voxel_size, 81 | ) 82 | 83 | p_top_view = visualize_2d_boxes_on_top_image( 84 | p_bboxes, top_view, grid_meters, bbox_voxel_size, prediction=True, 85 | ) 86 | 87 | # Show GT 88 | with tf.name_scope("1-Ground truth bounding boxes"): 89 | tf.summary.image("Top view", gt_top_view, step=optimizer.iterations) 90 | 91 | with tf.name_scope("2-Predicted bounding boxes"): 92 | tf.summary.image( 93 | "Predicted top view", p_top_view, step=optimizer.iterations 94 | ) 95 | 96 | 97 | def epoch_metrics_summaries(param_settings, epoch_metrics, epoch): 98 | """ 99 | Visualizes epoch metrics 100 | """ 101 | # Train results 102 | writer = tf.summary.create_file_writer(param_settings["train_summaries"]) 103 | with writer.as_default(): 104 | # Show epoch metrics for train 105 | with tf.name_scope("Epoch metrics"): 106 | tf.summary.scalar( 107 | "1. Loss", epoch_metrics.train_loss.result().numpy(), step=epoch 108 | ) 109 | 110 | # Val results 111 | writer = tf.summary.create_file_writer(param_settings["eval_summaries"]) 112 | with writer.as_default(): 113 | # Show epoch metrics for train 114 | with tf.name_scope("Epoch metrics"): 115 | tf.summary.scalar( 116 | "1. 
Loss", epoch_metrics.val_loss.result().numpy(), step=epoch 117 | ) 118 | -------------------------------------------------------------------------------- /detection_3d/tools/training_helpers.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import os 20 | import glob 21 | import tensorflow as tf 22 | import copy 23 | from detection_3d.tools.statics import NO_SCHEDULER, RESTARTS_SCHEDULER, ADAM 24 | 25 | 26 | def setup_gpu(): 27 | physical_devices = tf.config.experimental.list_physical_devices("GPU") 28 | if len(physical_devices) > 0: 29 | # Will not allocate all memory but only necessary amount 30 | tf.config.experimental.set_memory_growth(physical_devices[0], True) 31 | 32 | 33 | def initialize_model(model, input_shape): 34 | """ 35 | Helper tf2 specific model initialization (need for saving mechanism) 36 | """ 37 | sample = tf.zeros(input_shape, tf.float32) 38 | model.predict(sample) 39 | 40 | 41 | def load_model(checkpoints_dir, model, resume): 42 | """ 43 | Resume model from given checkpoint 44 | """ 45 | start_epoch = 0 46 | if resume: 47 | search_string = os.path.join(checkpoints_dir, "*") 48 | checkpoints_list = sorted(glob.glob(search_string)) 49 | if len(checkpoints_list) > 0: 50 | current_epoch = int(os.path.split(checkpoints_list[-1])[-1].split("-")[-1]) 51 | model = tf.keras.models.load_model(checkpoints_list[-1]) 52 | start_epoch = current_epoch + 1 # we should continue from the next epoch 53 | print(f"RESUME TRAINING FROM CHECKPOINT: {checkpoints_list[-1]}.") 54 | else: 55 | print(f"CAN'T RESUME TRAINING! NO CHECKPOINT FOUND! 
START NEW TRAINING!") 56 | return start_epoch, model 57 | 58 | 59 | def get_optimizer(optimizer_name, scheduler, num_iter_per_epoch): 60 | if scheduler["name"] == NO_SCHEDULER: 61 | learning_rate = scheduler["initial_learning_rate"] 62 | elif scheduler["name"] == RESTARTS_SCHEDULER: 63 | tmp_scheduler = copy.deepcopy(scheduler) 64 | tmp_scheduler["first_decay_steps"] = ( 65 | scheduler["first_decay_steps"] * num_iter_per_epoch 66 | ) 67 | learning_rate = tf.keras.experimental.CosineDecayRestarts(**tmp_scheduler) 68 | 69 | if optimizer_name == ADAM: 70 | optimizer_type = tf.keras.optimizers.Adam(learning_rate) 71 | else: 72 | ValueError("Error: Unknow optimizer {}".format(optimizer_name)) 73 | 74 | return learning_rate, optimizer_type 75 | -------------------------------------------------------------------------------- /detection_3d/tools/visualization_tools.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 
18 | """ 19 | import mayavi.mlab as mlab 20 | import numpy as np 21 | import cv2 22 | import matplotlib.pyplot as plt 23 | import copy 24 | from tqdm import tqdm 25 | from detection_3d.tools.detection_helpers import ( 26 | get_boxes_from_box_grid, 27 | make_eight_points_boxes, 28 | ) 29 | from detection_3d.data_preprocessing.pandaset_tools.helpers import get_color 30 | 31 | 32 | def visualize_lidar(lidar, figure=None): 33 | """ 34 | Draw lidar points 35 | Args: 36 | lidar: numpy array (n,3) of XYZ 37 | figure: mayavi figure handler, if None create new one otherwise will use it 38 | Returns: 39 | fig: created or used fig 40 | """ 41 | 42 | if figure is None: 43 | figure = mlab.figure( 44 | figure=None, bgcolor=(0, 0, 0), fgcolor=None, engine=None, size=(1600, 1000) 45 | ) 46 | 47 | color = lidar[:, 2] 48 | mlab.points3d( 49 | lidar[:, 0], 50 | lidar[:, 1], 51 | lidar[:, 2], 52 | color, 53 | mode="point", 54 | scale_factor=0.3, 55 | figure=figure, 56 | ) 57 | 58 | # draw origin 59 | mlab.points3d( 60 | 0, 0, 0, color=(1, 1, 1), mode="sphere", scale_factor=0.2, figure=figure 61 | ) 62 | # draw axis 63 | mlab.plot3d( 64 | [0, 2], [0, 0], [0, 0], color=(1, 0, 0), tube_radius=None, figure=figure 65 | ) 66 | mlab.plot3d( 67 | [0, 0], [0, 2], [0, 0], color=(0, 1, 0), tube_radius=None, figure=figure 68 | ) 69 | mlab.plot3d( 70 | [0, 0], [0, 0], [0, 2], color=(0, 0, 1), tube_radius=None, figure=figure 71 | ) 72 | return figure 73 | 74 | 75 | def visualize_bboxes_3d(lidar_corners_3d, figure=None, orientation=None): 76 | if figure is None: 77 | figure = mlab.figure( 78 | figure=None, bgcolor=(0, 0, 0), fgcolor=None, engine=None, size=(1600, 1000) 79 | ) 80 | 81 | for b in tqdm(lidar_corners_3d, desc=f"Add bboxes", total=len(lidar_corners_3d)): 82 | for k in range(0, 4): 83 | i, j = k, (k + 1) % 4 84 | mlab.plot3d( 85 | [b[i, 0], b[j, 0]], 86 | [b[i, 1], b[j, 1]], 87 | [b[i, 2], b[j, 2]], 88 | color=(1, 1, 1), 89 | tube_radius=None, 90 | line_width=1, 91 | figure=figure, 92 | ) 93 | 94 | i, j = k + 4, (k + 1) % 4 + 4 95 | mlab.plot3d( 96 | [b[i, 0], b[j, 0]], 97 | [b[i, 1], b[j, 1]], 98 | [b[i, 2], b[j, 2]], 99 | color=(1, 1, 1), 100 | tube_radius=None, 101 | line_width=1, 102 | figure=figure, 103 | ) 104 | 105 | i, j = k, k + 4 106 | mlab.plot3d( 107 | [b[i, 0], b[j, 0]], 108 | [b[i, 1], b[j, 1]], 109 | [b[i, 2], b[j, 2]], 110 | color=(1, 1, 1), 111 | tube_radius=None, 112 | line_width=1, 113 | figure=figure, 114 | ) 115 | if orientation is not None: 116 | for o in orientation: 117 | mlab.plot3d( 118 | [o[0, 0], o[1, 0]], 119 | [o[0, 1], o[1, 1]], 120 | [o[0, 2], o[1, 2]], 121 | color=(1, 1, 1), 122 | tube_radius=None, 123 | line_width=1, 124 | figure=figure, 125 | ) 126 | print(f"Done") 127 | return figure 128 | 129 | 130 | def draw_boxes_top_view( 131 | top_view_image, boxes_3d, grid_meters, labels, orientation_3d=None 132 | ): 133 | height, width, channels = top_view_image.shape 134 | delimiter_x = grid_meters[0] / height 135 | delimiter_y = grid_meters[1] / width 136 | thickness = 2 137 | for idx, b in enumerate(boxes_3d): 138 | color = get_color(labels[idx]) / 255 139 | b = b[:4] 140 | x = np.floor(b[:, 0] / delimiter_x).astype(int) 141 | 142 | y = np.floor(b[:, 1] / delimiter_y).astype(int) 143 | 144 | cv2.line(top_view_image, (y[0], x[0]), (y[1], x[1]), color, thickness) 145 | cv2.line(top_view_image, (y[1], x[1]), (y[2], x[2]), color, thickness) 146 | cv2.line(top_view_image, (y[2], x[2]), (y[3], x[3]), color, thickness) 147 | cv2.line(top_view_image, (y[3], x[3]), (y[0], x[0]), color, 
thickness) 148 | 149 | if orientation_3d is not None: 150 | for o in orientation_3d: 151 | x = np.floor(o[:, 0] / delimiter_x).astype(int) 152 | y = np.floor(o[:, 1] / delimiter_y).astype(int) 153 | cv2.arrowedLine( 154 | top_view_image, (y[0], x[0]), (y[1], x[1]), (1, 0, 0), thickness 155 | ) 156 | return top_view_image 157 | 158 | 159 | def visualize_2d_boxes_on_top_image( 160 | bboxes_grid, top_view, grid_meters, bbox_voxel_size, prediction=False 161 | ): 162 | top_image_vis = [] 163 | for boxes, top_image in zip(bboxes_grid, top_view): # iterate over batch 164 | top_image = top_image.numpy() 165 | shape = top_image.shape 166 | rgb_image = np.zeros((shape[0], shape[1], 3)) 167 | rgb_image[top_image[:, :, 0] > 0] = 1 168 | 169 | box, labels, _ = get_boxes_from_box_grid(boxes, bbox_voxel_size) 170 | box = box.numpy() 171 | box, orientation_3d = make_eight_points_boxes(box) 172 | 173 | if prediction: 174 | labels = np.argmax(labels, axis=-1) 175 | if len(box) > 0: 176 | rgb_image = draw_boxes_top_view( 177 | rgb_image, box, grid_meters, labels, orientation_3d 178 | ) 179 | 180 | # rgb_image = np.rot90(rgb_image) 181 | top_image_vis.append(rgb_image) 182 | return np.asarray(top_image_vis) 183 | 184 | 185 | def visualize_bboxes_on_image(image, bboxes_2d, labels, orientation_2d=None): 186 | """ 187 | The function visualize the reprojected 3d bounding boxes 188 | on 2d image 189 | Arguments: 190 | images: the tensor of the shape [height, width, 3] 191 | bboxes: the reprojected bboxes of the shape [num_boxes, 8, 2] 192 | Returns: 193 | resulted_images: the tensor with bboxes of the shape [height, width, 3] 194 | """ 195 | 196 | height, width, _ = image.shape 197 | thickness = 2 198 | boundaries = np.asarray([width, height]) 199 | for idx, b in enumerate(bboxes_2d): 200 | color = get_color(labels[idx]) / 255 201 | 202 | b = b.astype(np.int32) 203 | first_square = False 204 | second_square = False 205 | if ( 206 | (b[0] >= 0).all() & (b[0] < boundaries).all() 207 | or (b[1] >= 0).all() & (b[1] < boundaries).all() 208 | or (b[4] >= 0).all() & (b[4] < boundaries).all() 209 | or (b[5] >= 0).all() & (b[5] < boundaries).all() 210 | ): 211 | first_square = True 212 | cv2.line(image, (b[0, 0], b[0, 1]), (b[1, 0], b[1, 1]), color, thickness) 213 | cv2.line(image, (b[4, 0], b[4, 1]), (b[0, 0], b[0, 1]), color, thickness) 214 | cv2.line(image, (b[5, 0], b[5, 1]), (b[1, 0], b[1, 1]), color, thickness) 215 | cv2.line(image, (b[4, 0], b[4, 1]), (b[5, 0], b[5, 1]), color, thickness) 216 | if ( 217 | (b[2] >= 0).all() & (b[2] < boundaries).all() 218 | or (b[3] >= 0).all() & (b[3] < boundaries).all() 219 | or (b[6] >= 0).all() & (b[6] < boundaries).all() 220 | or (b[7] >= 0).all() & (b[7] < boundaries).all() 221 | ): 222 | second_square = True 223 | cv2.line(image, (b[2, 0], b[2, 1]), (b[3, 0], b[3, 1]), color, thickness) 224 | cv2.line(image, (b[6, 0], b[6, 1]), (b[2, 0], b[2, 1]), color, thickness) 225 | cv2.line(image, (b[7, 0], b[7, 1]), (b[3, 0], b[3, 1]), color, thickness) 226 | cv2.line(image, (b[7, 0], b[7, 1]), (b[6, 0], b[6, 1]), color, thickness) 227 | 228 | if first_square and second_square: 229 | cv2.line(image, (b[0, 0], b[0, 1]), (b[3, 0], b[3, 1]), color, thickness) 230 | cv2.line(image, (b[1, 0], b[1, 1]), (b[2, 0], b[2, 1]), color, thickness) 231 | cv2.line(image, (b[4, 0], b[4, 1]), (b[7, 0], b[7, 1]), color, thickness) 232 | cv2.line(image, (b[5, 0], b[5, 1]), (b[6, 0], b[6, 1]), color, thickness) 233 | 234 | if orientation_2d is not None: 235 | for o in orientation_2d: 236 | o = 
o.astype(np.int32) 237 | if ( 238 | (o[0] >= 0).all() 239 | & (o[0] < boundaries).all() 240 | & (o[1] >= 0).all() 241 | & (o[1] < boundaries).all() 242 | ): 243 | cv2.arrowedLine( 244 | image, (o[0, 0], o[0, 1]), (o[1, 0], o[1, 1]), (1, 0, 0), thickness 245 | ) 246 | 247 | return image 248 | -------------------------------------------------------------------------------- /detection_3d/train.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | import argparse 20 | import os 21 | import tensorflow as tf 22 | from detection_3d.parameters import Parameters 23 | from detection_3d.detection_dataset import DetectionDataset 24 | from detection_3d.tools.detection_helpers import get_voxels_grid 25 | from detection_3d.model import YoloV3_Lidar 26 | from detection_3d.tools.training_helpers import ( 27 | setup_gpu, 28 | initialize_model, 29 | load_model, 30 | get_optimizer, 31 | ) 32 | from detection_3d.losses import detection_loss 33 | from detection_3d.tools.summary_helpers import train_summaries, epoch_metrics_summaries 34 | from detection_3d.metrics import EpochMetrics 35 | from tqdm import tqdm 36 | 37 | 38 | @tf.function 39 | def train_step(param_settings, train_samples, model, optimizer, epoch_metrics=None): 40 | 41 | with tf.GradientTape() as tape: 42 | top_view, box_grid, _ = train_samples 43 | predictions = model(top_view, training=True) 44 | ( 45 | obj_loss, 46 | label_loss, 47 | z_loss, 48 | delta_xy_loss, 49 | width_loss, 50 | height_loss, 51 | delta_orient_loss, 52 | ) = detection_loss(box_grid, predictions) 53 | losses = [ 54 | obj_loss, 55 | label_loss, 56 | z_loss, 57 | delta_xy_loss, 58 | width_loss, 59 | height_loss, 60 | delta_orient_loss, 61 | ] 62 | total_detection_loss = tf.reduce_sum(losses) 63 | # Get L2 losses for weight decay 64 | total_loss = total_detection_loss + tf.add_n(model.losses) 65 | 66 | gradients = tape.gradient(total_loss, model.trainable_variables) 67 | optimizer.apply_gradients(zip(gradients, model.trainable_variables)) 68 | if epoch_metrics is not None: 69 | epoch_metrics.train_loss(total_detection_loss) 70 | 71 | train_outputs = { 72 | "total_loss": total_loss, 73 | "losses": losses, 74 | "box_grid": box_grid, 75 | "predictions": predictions, 76 | "top_view": top_view, 77 | } 78 | 79 | return train_outputs 80 | 81 | 82 | @tf.function 83 | def val_step(samples, 
model, epoch_metrics=None): 84 | 85 | top_view, box_grid, _ = samples 86 | predictions = model(top_view, training=False) 87 | ( 88 | obj_loss, 89 | label_loss, 90 | z_loss, 91 | delta_xy_loss, 92 | width_loss, 93 | height_loss, 94 | delta_orient_loss, 95 | ) = detection_loss(box_grid, predictions) 96 | losses = [ 97 | obj_loss, 98 | label_loss, 99 | z_loss, 100 | delta_xy_loss, 101 | width_loss, 102 | height_loss, 103 | delta_orient_loss, 104 | ] 105 | total_detection_loss = tf.reduce_sum(losses) 106 | 107 | if epoch_metrics is not None: 108 | epoch_metrics.val_loss(total_detection_loss) 109 | 110 | 111 | def train(resume=False): 112 | setup_gpu() 113 | # General parameters 114 | param = Parameters() 115 | 116 | # Fix the random seed for reproducibility 117 | tf.random.set_seed(param.settings["seed"]) 118 | 119 | train_dataset = DetectionDataset( 120 | param.settings, 121 | "train.datatxt", 122 | augmentation=param.settings["augmentation"], 123 | shuffle=True, 124 | ) 125 | 126 | param.settings["train_size"] = train_dataset.num_samples 127 | val_dataset = DetectionDataset(param.settings, "val.datatxt", shuffle=False) 128 | param.settings["val_size"] = val_dataset.num_samples 129 | 130 | model = YoloV3_Lidar(weight_decay=param.settings["weight_decay"]) 131 | voxels_grid = get_voxels_grid( 132 | param.settings["voxel_size"], param.settings["grid_meters"] 133 | ) 134 | input_shape = [1, voxels_grid[0], voxels_grid[1], 2] 135 | initialize_model(model, input_shape) 136 | model.summary() 137 | start_epoch, model = load_model(param.settings["checkpoints_dir"], model, resume) 138 | model_path = os.path.join(param.settings["checkpoints_dir"], "{model}-{epoch:04d}") 139 | 140 | learning_rate, optimizer = get_optimizer( 141 | param.settings["optimizer"], 142 | param.settings["scheduler"], 143 | train_dataset.num_it_per_epoch, 144 | ) 145 | epoch_metrics = EpochMetrics() 146 | 147 | for epoch in range(start_epoch, param.settings["max_epochs"]): 148 | save_dir = model_path.format(model=model.name, epoch=epoch) 149 | epoch_metrics.reset() 150 | for train_samples in tqdm( 151 | train_dataset.dataset, 152 | desc=f"Epoch {epoch}", 153 | total=train_dataset.num_it_per_epoch, 154 | ): 155 | train_outputs = train_step( 156 | param.settings, train_samples, model, optimizer, epoch_metrics 157 | ) 158 | train_summaries(train_outputs, optimizer, param.settings, learning_rate) 159 | for val_samples in tqdm( 160 | val_dataset.dataset, desc="Validation", total=val_dataset.num_it_per_epoch 161 | ): 162 | val_step(val_samples, model, epoch_metrics) 163 | epoch_metrics_summaries(param.settings, epoch_metrics, epoch) 164 | epoch_metrics.print_metrics() 165 | # Save all 166 | param.save_to_json(save_dir) 167 | epoch_metrics.save_to_json(save_dir) 168 | model.save(save_dir) 169 | 170 | 171 | if __name__ == "__main__": 172 | parser = argparse.ArgumentParser(description="Train CNN.") 173 | parser.add_argument( 174 | "--resume", 175 | type=lambda x: x, 176 | nargs="?", 177 | const=True, 178 | default=False, 179 | help="Resume training from the latest checkpoint.", 180 | ) 181 | args = parser.parse_args() 182 | train(resume=args.resume) 183 | -------------------------------------------------------------------------------- /detection_3d/validation_inferece.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | __copyright__ = """ 3 | Copyright (c) 2020 Tananaev Denis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated 
documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies 9 | of the Software, and to permit persons to whom the Software is furnished to do so, 10 | subject to the following conditions: The above copyright notice and this permission 11 | notice shall be included in all copies or substantial portions of the Software. 12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, 13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR 14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE 15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | """ 19 | 20 | import argparse 21 | import os 22 | import numpy as np 23 | import tensorflow as tf 24 | from detection_3d.parameters import Parameters 25 | from detection_3d.tools.training_helpers import setup_gpu 26 | from detection_3d.detection_dataset import DetectionDataset 27 | from detection_3d.tools.visualization_tools import visualize_2d_boxes_on_top_image 28 | from detection_3d.tools.file_io import save_bboxes_to_file 29 | from detection_3d.tools.detection_helpers import ( 30 | make_eight_points_boxes, 31 | get_boxes_from_box_grid, 32 | get_bboxes_parameters_from_points, 33 | ) 34 | from PIL import Image 35 | from tqdm import tqdm 36 | import timeit 37 | 38 | 39 | def validation_inference(param_settings, dataset_file, model_dir, output_dir): 40 | setup_gpu() 41 | 42 | # Load model 43 | model = tf.keras.models.load_model(model_dir) 44 | bbox_voxel_size = np.asarray(param_settings["bbox_voxel_size"], dtype=np.float32) 45 | lidar_coord = np.array(param_settings["lidar_offset"], dtype=np.float32) 46 | grid_meters = param_settings["grid_meters"] 47 | 48 | val_dataset = DetectionDataset(param_settings, dataset_file, shuffle=False) 49 | param_settings["val_size"] = val_dataset.num_samples 50 | for val_samples in tqdm( 51 | val_dataset.dataset, desc=f"val_inference", total=val_dataset.num_it_per_epoch, 52 | ): 53 | top_view, gt_boxes, lidar_filenames = val_samples 54 | predictions = model(top_view, training=False) 55 | for image, predict, gt, filename in zip( 56 | top_view, predictions, gt_boxes, lidar_filenames 57 | ): 58 | filename = str(filename.numpy()) 59 | seq_folder = filename.split("/")[-3] 60 | name = os.path.splitext(os.path.basename(filename))[0] 61 | # Ensure that output dir exists or create it 62 | top_view_dir = os.path.join(output_dir, "top_view", seq_folder) 63 | bboxes_dir = os.path.join(output_dir, "bboxes", seq_folder) 64 | os.makedirs(top_view_dir, exist_ok=True) 65 | os.makedirs(bboxes_dir, exist_ok=True) 66 | p_top_view = ( 67 | visualize_2d_boxes_on_top_image( 68 | [predict], [image], grid_meters, bbox_voxel_size, prediction=True, 69 | ) 70 | * 255 71 | ) 72 | gt_top_view = ( 73 | visualize_2d_boxes_on_top_image( 74 | [gt], [image], grid_meters, bbox_voxel_size, prediction=False, 75 | ) 76 | * 255 77 | ) 78 | result = np.vstack((p_top_view[0], gt_top_view[0])) 79 | file_to_save = os.path.join(top_view_dir, name + ".png") 80 | img = Image.fromarray(result.astype("uint8")) 81 | img.save(file_to_save) 82 | 83 | box, labels, _ = get_boxes_from_box_grid(predict, bbox_voxel_size) 84 | box = box.numpy() 85 | box, _ = 
make_eight_points_boxes(box) 86 | if len(box) > 0: 87 | box = box - lidar_coord[:3] 88 | labels = np.argmax(labels, axis=-1) 89 | ( 90 | centroid, 91 | width, 92 | length, 93 | height, 94 | yaw, 95 | ) = get_bboxes_parameters_from_points(box) 96 | bboxes_name = os.path.join(bboxes_dir, name + ".txt") 97 | save_bboxes_to_file( 98 | bboxes_name, centroid, width, length, height, yaw, labels 99 | ) 100 | 101 | 102 | if __name__ == "__main__": 103 | parser = argparse.ArgumentParser(description="Inference validation set.") 104 | parser.add_argument( 105 | "--dataset_file", default="val.datatxt", 106 | ) 107 | 108 | parser.add_argument("--output_dir", default="inference") 109 | 110 | parser.add_argument( 111 | "--model_dir", default="YoloV3_Lidar-0085", 112 | ) 113 | args = parser.parse_args() 114 | 115 | param_settings = Parameters().settings 116 | validation_inference( 117 | param_settings, args.dataset_file, args.model_dir, args.output_dir 118 | ) 119 | -------------------------------------------------------------------------------- /pictures/box_parametrization.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Dtananaev/lidar_dynamic_objects_detection/3b8f3d5fcce0fb914bb83e5d43a3ca652739139e/pictures/box_parametrization.png -------------------------------------------------------------------------------- /pictures/result.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Dtananaev/lidar_dynamic_objects_detection/3b8f3d5fcce0fb914bb83e5d43a3ca652739139e/pictures/result.png -------------------------------------------------------------------------------- /pictures/topview.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Dtananaev/lidar_dynamic_objects_detection/3b8f3d5fcce0fb914bb83e5d43a3ca652739139e/pictures/topview.png -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | import setuptools 2 | 3 | with open("README.md", "r") as fh: 4 | long_description = fh.read() 5 | 6 | setuptools.setup( 7 | name="detection_3d-Denis-Tananaev", 8 | version="0.0.1", 9 | author="Denis Tananaev", 10 | author_email="d.d.tananaev@gmail.com", 11 | description="3D bbox detection with Lidar", 12 | long_description=long_description, 13 | long_description_content_type="text/markdown", 14 | url="https://github.com/Dtananaev/lidar_dynamic_objects_detection", 15 | packages=setuptools.find_packages(), 16 | classifiers=[ 17 | "Programming Language :: Python :: 3", 18 | "License :: OSI Approved :: MIT License", 19 | "Operating System :: OS Independent", 20 | ], 21 | python_requires='>=3.6', 22 | ) 23 | --------------------------------------------------------------------------------
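Typical end-to-end usage once the package is installed, assuming the PandaSet preprocessing and create_dataset_lists.py have already produced train.datatxt and val.datatxt (the checkpoint directory name below is a placeholder matching the argparse default):

```
# Train from scratch, then resume from the newest checkpoint:
python detection_3d/train.py
python detection_3d/train.py --resume

# Run inference and top-view visualization on the validation split:
python detection_3d/validation_inferece.py --dataset_file val.datatxt \
    --model_dir YoloV3_Lidar-0085 --output_dir inference
```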