├── .gitignore
├── .vscode
│   └── settings.json
├── LICENSE.md
├── README.md
├── detection_3d
│   ├── __init__.py
│   ├── create_dataset_lists.py
│   ├── data_preprocessing
│   │   └── pandaset_tools
│   │       ├── helpers.py
│   │       ├── preprocess_data.py
│   │       ├── transform.py
│   │       └── visualize_data.py
│   ├── detection_dataset.py
│   ├── losses.py
│   ├── metrics.py
│   ├── model.py
│   ├── parameters.py
│   ├── tools
│   │   ├── augmentation_tools.py
│   │   ├── detection_helpers.py
│   │   ├── file_io.py
│   │   ├── statics.py
│   │   ├── summary_helpers.py
│   │   ├── training_helpers.py
│   │   └── visualization_tools.py
│   ├── train.py
│   └── validation_inferece.py
├── pictures
│   ├── box_parametrization.png
│   ├── result.png
│   └── topview.png
└── setup.py
/.gitignore:
--------------------------------------------------------------------------------
1 | *.pyc
2 | *~
3 | *.txt
4 | dataset
5 | *-INFO
6 | log
7 | inference
8 | *.bin
9 |
10 |
--------------------------------------------------------------------------------
/.vscode/settings.json:
--------------------------------------------------------------------------------
1 | {
2 | "python.formatting.provider": "black"
3 | }
--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
1 | Copyright 2020 Denis Tananaev
2 |
3 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4 |
5 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
6 |
7 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
8 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Dynamic objects detection in LiDAR
2 |
3 | [](https://github.com/Dtananaev/lidar_dynamic_objects_detection/blob/master/LICENSE.md)
4 |
5 | ## The result of the network (click the image below)
6 |
7 | [](https://youtu.be/f_HZg9Cq-h4)
8 | The trained network weights can be downloaded here: [weights](https://drive.google.com/file/d/1m8N5m2WXATgFNw88BRqEbUieiyV7p3S0/view?usp=sharing).
9 | ## Installation
10 | For Ubuntu 18.04, install the necessary dependencies:
11 | ```
12 | sudo apt update
13 | sudo apt install python3-dev python3-pip python3-venv
14 | ```
15 | Create a virtual environment and activate it:
16 | ```
17 | python3 -m venv --system-site-packages ./venv
18 | source ./venv/bin/activate
19 | ```
20 | Upgrade pip tools:
21 | ```
22 | pip install --upgrade pip
23 | ```
24 | Install TensorFlow 2 (for more details, check the TensorFlow install tutorial: [tensorflow](https://www.tensorflow.org/install/pip))
25 | ```
26 | pip install --upgrade tensorflow-gpu
27 | ```
28 | Clone this repository and then install it:
29 | ```
30 | cd lidar_dynamic_objects_detection
31 | pip install -r requirements.txt
32 | pip install -e .
33 | ```
34 | This should install all the necessary packages into your environment.
35 |
36 | ## The method
37 |
38 | The lidar point cloud is represented as a top view image where each pixel corresponds to a 12.5x12.5 cm cell. For each grid cell
39 | we project a random point from the cell and store its height and intensity.
40 |
41 |
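A minimal numpy sketch of this projection (an illustration, not the repository's `make_top_view_image`; the grid and cell sizes follow `parameters.py`, while the two-channel layout and the random tie-break are assumptions here):
```python
import numpy as np

def top_view_sketch(lidar, grid_m=(52.0, 104.0), cell=0.125):
    """lidar: (N, 4) array of (x, y, z, intensity), already shifted so
    that all x, y coordinates are >= 0 (see lidar_offset in parameters.py)."""
    h, w = int(grid_m[0] / cell), int(grid_m[1] / cell)
    image = np.zeros((h, w, 2), dtype=np.float32)  # channels: height, intensity
    xi = (lidar[:, 0] / cell).astype(np.int32)
    yi = (lidar[:, 1] / cell).astype(np.int32)
    keep = (xi >= 0) & (xi < h) & (yi >= 0) & (yi < w)
    xi, yi, pts = xi[keep], yi[keep], lidar[keep]
    order = np.random.permutation(len(pts))  # a random point per cell wins
    image[xi[order], yi[order], 0] = pts[order, 2]  # height
    image[xi[order], yi[order], 1] = pts[order, 3]  # intensity
    return image
```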
42 |
43 | We do direct regression of the 3D boxes: for each pixel of the image we regress a confidence between 0 and 1, seven box parameters (dx_centroid, dy_centroid, z_centroid, width, height, dx_front, dy_front), and the class scores.
44 |
45 |
46 |
47 | We apply binary cross-entropy for the confidence loss, an L1 loss for the regression of all box parameters, and a softmax cross-entropy loss for the class prediction.
48 | The confidence map is computed from the ground truth boxes: the cell closest to the box centroid is assigned confidence 1.0 (green in the image above)
49 | and all other cells 0. The confidence loss is applied to all pixels; the other losses are applied only to pixels whose ground truth confidence is 1.0.
50 |
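A small sketch of building such a ground truth confidence map (a hypothetical helper; the repository computes this inside `get_bboxes_grid` together with the regression targets, and the 0.25 m cell size follows `bbox_voxel_size` in `parameters.py`):
```python
import numpy as np

def confidence_map_sketch(centroids_xy, grid_m=(52.0, 104.0), cell=0.25):
    """centroids_xy: (M, 2) box centroids in meters, positive quadrant.
    Returns an (H, W) map with 1.0 in the cell closest to each centroid."""
    h, w = int(grid_m[0] / cell), int(grid_m[1] / cell)
    conf = np.zeros((h, w), dtype=np.float32)
    # The cell containing the centroid is the closest one.
    xi = np.clip((centroids_xy[:, 0] / cell).astype(int), 0, h - 1)
    yi = np.clip((centroids_xy[:, 1] / cell).astype(int), 0, w - 1)
    conf[xi, yi] = 1.0
    return conf
```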
51 |
52 | ## The dataset preparation
53 | We work with the Pandaset dataset, which can be downloaded here: [Pandaset](https://pandaset.org/).
54 | Download and unpack all the data into a dataset folder (e.g. ~/dataset).
55 | The dataset should have the following folder structure:
56 | ``` bash
57 | dataset
58 | ├── 001                      # The sequence number
59 | │   ├── annotations          # Bounding boxes and semseg annotations
60 | │   │   ├── cuboids
61 | │   │   │   ├── 00.pkl.gz
62 | │   │   │   └── ...
63 | │   │   └── semseg
64 | │   │       ├── 00.pkl.gz
65 | │   │       └── ...
66 | │   ├── camera               # Camera images
67 | │   │   ├── back_camera
68 | │   │   │   ├── 00.jpg
69 | │   │   │   └── ...
70 | │   │   ├── front_camera
71 | │   │   └── ...
72 | │   ├── lidar                # Lidar data
73 | │   │   ├── 00.pkl.gz
74 | │   │   └── ...
75 | │   └── meta
76 | │       ├── gps.json
77 | │       └── timestamps.json
78 | ├── 002
79 | └── ...
80 | ```
81 | Preprocess the dataset with the following commands:
82 | ```
83 | cd lidar_dynamic_objects_detection/detection_3d/data_preprocessing/pandaset_tools
84 | python preprocess_data.py --dataset_dir <path_to_dataset>
85 | ```
86 | Create dataset lists:
87 | ```
88 | cd lidar_dynamic_objects_detection/detection_3d/
89 | python create_dataset_lists.py --dataset_dir <path_to_dataset>
90 | ```
91 | This should create ```train.datatxt``` and ```val.datatxt``` in your dataset folder.
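Each line of these files pairs a processed lidar scan with its box file, separated by a semicolon (see `create_dataset_lists.py`), e.g.:
```
<dataset_dir>/001/lidar_processed/00.bin;<dataset_dir>/001/bbox_processed/00.txt
```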
92 | Finally, set the dataset directory in ```parameters.py```.
93 | ## Train
94 | In order to train the network:
95 | ```
96 | python train.py
97 | ```
98 | In order to resume training:
99 | ```
100 | python train.py --resume
101 | ```
102 | The training can be monitored in tensorboard:
103 | ```
104 | tensorboard --logdir=log
105 | ```
106 | ## Inference on validation dataset
107 | To run inference on the validation dataset:
108 | ```
109 | python validation_inference.py --dataset_file <path_to_dataset>/val.datatxt --output_dir <output_dir> --model_dir <model_dir>
110 | ```
111 | The inference outputs the 3D boxes and also renders them on the top view image. The predicted top view image (top) is concatenated with the ground truth top view image (bottom).
112 |
--------------------------------------------------------------------------------
/detection_3d/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Dtananaev/lidar_dynamic_objects_detection/3b8f3d5fcce0fb914bb83e5d43a3ca652739139e/detection_3d/__init__.py
--------------------------------------------------------------------------------
/detection_3d/create_dataset_lists.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 | import os
20 | import glob
21 | import numpy as np
22 | import argparse
23 | from detection_3d.tools.file_io import save_dataset_list
24 |
25 |
26 | class PandaDetectionDataset:
27 | def __init__(self, dataset_dir):
28 | self.dataset_dir = dataset_dir
29 |
30 | def get_data(self):
31 | search_string = os.path.join(self.dataset_dir, "*", "lidar_processed", "*.bin")
32 | lidar_list = np.asarray(sorted(glob.glob(search_string)))
33 | search_string = os.path.join(self.dataset_dir, "*", "bbox_processed", "*.txt")
34 | box_list = np.asarray(sorted(glob.glob(search_string)))
35 | data = np.concatenate((lidar_list[:, None], box_list[:, None],), axis=1,)
36 | data = [";".join(x) for x in data]
37 | return data
38 |
39 | def create_datasets_file(self):
40 | """
41 |         Creates the train.datatxt and val.datatxt files
42 | """
43 | data_list = self.get_data()
44 |
45 |         split_num = 80 * int(103 * 0.75)  # 103 sequences x 80 frames each; 75% of the sequences for training
46 | print(f"split_num {split_num}")
47 | # Save train and validation dataset
48 | filename = os.path.join(self.dataset_dir, "train.datatxt")
49 | save_dataset_list(filename, data_list[:split_num])
50 | print(
51 | f"The dataset of the size {len(data_list[:split_num])} saved in {filename}."
52 | )
53 | filename = os.path.join(self.dataset_dir, "val.datatxt")
54 | save_dataset_list(filename, data_list[split_num:])
55 | print(
56 | f"The dataset of the size {len(data_list[split_num:])} saved in {filename}."
57 | )
58 |
59 |
60 | if __name__ == "__main__":
61 |     parser = argparse.ArgumentParser(description="Create Pandaset dataset lists.")
62 | parser.add_argument("--dataset_dir", default="dataset")
63 | args = parser.parse_args()
64 | dataset_creator = PandaDetectionDataset(args.dataset_dir)
65 | dataset_creator.create_datasets_file()
66 |
--------------------------------------------------------------------------------
/detection_3d/data_preprocessing/pandaset_tools/helpers.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 | import numpy as np
20 |
21 | labels = {
22 | "Cones": 0,
23 | "Towed Object": 1,
24 | "Semi-truck": 2,
25 | "Train": 3,
26 | "Temporary Construction Barriers": 4,
27 | "Rolling Containers": 5,
28 | "Animals - Other": 6,
29 | "Pylons": 7,
30 | "Emergency Vehicle": 8,
31 | "Motorcycle": 9,
32 | "Construction Signs": 10,
33 | "Medium-sized Truck": 11,
34 | "Other Vehicle - Uncommon": 12,
35 | "Tram / Subway": 13,
36 | "Road Barriers": 14,
37 | "Bus": 15,
38 | "Pedestrian with Object": 16,
39 | "Personal Mobility Device": 17,
40 | "Signs": 18,
41 | "Other Vehicle - Pedicab": 19,
42 | "Pedestrian": 20,
43 | "Car": 21,
44 | "Other Vehicle - Construction Vehicle": 22,
45 | "Bicycle": 23,
46 | "Motorized Scooter": 24,
47 | "Pickup Truck": 25,
48 | }
49 |
50 |
51 | def get_color(label):
52 |     # Color per class, indexed by the label ids defined in `labels` above
53 | color = np.asarray(
54 | [
55 | [255, 229, 204], # "Cones": 0,
56 | [255, 255, 204], # "Towed Object": 1,
57 | [204, 204, 255], # "Semi-truck": 2,
58 | [255, 204, 204], # "Train": 3,
59 | [255, 204, 153], # "Temporary Construction Barriers": 4,
60 | [204, 255, 204], # "Rolling Containers": 5,
61 | [255, 204, 229], # "Animals - Other": 6,
62 | [153, 255, 153], # "Pylons": 7,
63 | [128, 128, 128], # "Emergency Vehicle": 8,
64 | [255, 255, 102], # "Motorcycle": 9,
65 | [255, 153, 51], # "Construction Signs": 10,
66 | [153, 153, 255], # "Medium-sized Truck": 11,
67 | [255, 255, 255], # "Other Vehicle - Uncommon": 12,
68 | [255, 102, 102], # "Tram / Subway": 13,
69 | [204, 102, 0], # "Road Barriers": 14,
70 | [0, 0, 255], # "Bus": 15,
71 | [255, 51, 153], # "Pedestrian with Object": 16,
72 |             [153, 153, 0], # "Personal Mobility Device": 17,
73 |             [255, 153, 51], # "Signs": 18,
74 |             [128, 128, 128], # "Other Vehicle - Pedicab": 19,
75 |             [204, 0, 102], # "Pedestrian": 20,
76 |             [0, 255, 0], # "Car": 21,
77 |             [0, 0, 102], # "Other Vehicle - Construction Vehicle": 22,
78 |             [255, 255, 0], # "Bicycle": 23,
79 |             [255, 255, 153], # "Motorized Scooter": 24,
80 |             [51, 255, 255], # "Pickup Truck": 25,
81 | ]
82 | )
83 | return color[int(label)]
84 |
85 |
86 | def make_xzyhwly(bboxes):
87 | """
88 |     Get raw data from bboxes and return label and boxes as (x, y, z, l, w, h, yaw)
89 | """
90 | label = bboxes[:, 1]
91 | yaw = bboxes[:, 2]
92 | c_x = bboxes[:, 5]
93 | c_y = bboxes[:, 6]
94 | c_z = bboxes[:, 7]
95 | length = bboxes[:, 8]
96 | width = bboxes[:, 9]
97 | height = bboxes[:, 10]
98 |     new_boxes = np.asarray([c_x, c_y, c_z, length, width, height, yaw], dtype=np.float64)
99 | return label, np.transpose(new_boxes)
100 |
101 |
102 | def filter_boxes(labels, bboxes_3d, orient_3d, lidar, threshold=20):
103 | labels_res = []
104 | box_res = []
105 | orient_res = []
106 | for idx, box in enumerate(bboxes_3d):
107 | min_x = np.min(box[:, 0])
108 | max_x = np.max(box[:, 0])
109 | min_y = np.min(box[:, 1])
110 | max_y = np.max(box[:, 1])
111 | min_z = np.min(box[:, 2])
112 | max_z = np.max(box[:, 2])
113 | mask_x = (lidar[:, 0] >= min_x) & (lidar[:, 0] <= max_x)
114 | mask_y = (lidar[:, 1] >= min_y) & (lidar[:, 1] <= max_y)
115 | mask_z = (lidar[:, 2] >= min_z) & (lidar[:, 2] <= max_z)
116 | mask = mask_x & mask_y & mask_z
117 | result = np.sum(mask.astype(float))
118 |         if result > threshold:
119 | box_res.append(box)
120 | orient_res.append(orient_3d[idx])
121 | labels_res.append(labels[idx])
122 | return np.asarray(labels_res), np.asarray(box_res), np.asarray(orient_res)
123 |
--------------------------------------------------------------------------------
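A quick, self-contained check of `filter_boxes` on synthetic data (all numbers are made up; the dummy orientation array only has to be indexable per box):

```python
import numpy as np
from detection_3d.data_preprocessing.pandaset_tools.helpers import filter_boxes

# One unit cube around the origin given by its 8 corners, shape (1, 8, 3).
corners = np.array(
    [[[x, y, z] for x in (-0.5, 0.5) for y in (-0.5, 0.5) for z in (-0.5, 0.5)]]
)
orient = np.zeros((1, 2, 3))  # dummy orientation segment per box
labels = np.array([21])  # "Car"

# 30 points inside the cube -> the box survives the 20-point threshold.
lidar = np.random.uniform(-0.4, 0.4, size=(30, 3))
kept_labels, kept_boxes, kept_orient = filter_boxes(labels, corners, orient, lidar)
print(kept_boxes.shape)  # (1, 8, 3)
```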
/detection_3d/data_preprocessing/pandaset_tools/preprocess_data.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 | import argparse
20 | import numpy as np
21 | import os
22 | import glob
23 | import pandas as pd
24 | from detection_3d.data_preprocessing.pandaset_tools.helpers import (
25 | make_xzyhwly,
26 | filter_boxes,
27 | )
28 | from detection_3d.tools.detection_helpers import (
29 | make_eight_points_boxes,
30 | get_bboxes_parameters_from_points,
31 | )
32 |
33 | from tqdm import tqdm
34 | from detection_3d.tools.file_io import read_json, save_bboxes_to_file, save_lidar
35 | from detection_3d.data_preprocessing.pandaset_tools.transform import (
36 | quaternion_to_euler,
37 | to_transform_matrix,
38 | transform_lidar_box_3d,
39 | )
40 |
41 |
42 |
43 | def preprocess_data(dataset_dir):
44 | """
45 | The function prepares data for training from pandaset.
46 | Arguments:
47 | dataset_dir: directory with Pandaset data
48 | """
49 |
50 | # Get list of data samples
51 | search_string = os.path.join(dataset_dir, "*")
52 | seq_list = sorted(glob.glob(search_string))
53 | for seq in tqdm(seq_list, desc="Process sequences", total=len(seq_list)):
54 | # Make output dirs for data
55 | lidar_out_dir = os.path.join(seq, "lidar_processed")
56 | bbox_out_dir = os.path.join(seq, "bbox_processed")
57 | os.makedirs(lidar_out_dir, exist_ok=True)
58 | os.makedirs(bbox_out_dir, exist_ok=True)
59 | search_string = os.path.join(seq, "lidar", "*.pkl.gz")
60 | lidar_list = sorted(glob.glob(search_string))
61 | lidar_pose_path = os.path.join(seq, "lidar", "poses.json")
62 | lidar_pose = read_json(lidar_pose_path)
63 | for idx, lidar_path in enumerate(lidar_list):
64 | sample_idx = os.path.splitext(os.path.basename(lidar_path))[0].split(".")[0]
65 | # Get pose of the lidar
66 | translation = lidar_pose[idx]["position"]
67 | translation = np.asarray([translation[key] for key in translation])
68 | rotation = lidar_pose[idx]["heading"]
69 | rotation = np.asarray([rotation[key] for key in rotation])
70 | rotation = quaternion_to_euler(*rotation)
71 | Rt = to_transform_matrix(translation, rotation)
72 |
73 |             # Get the respective bboxes (keeps absolute dataset paths intact)
74 |             bbox_path = os.path.join(
75 |                 seq, "annotations", "cuboids", os.path.basename(lidar_path)
76 |             )
77 |
78 | # Load data
79 | lidar = np.asarray(pd.read_pickle(lidar_path))
80 | # Get only lidar 0 (there is also lidar 1)
81 | lidar = lidar[lidar[:, -1] == 0]
82 | intensity = lidar[:, 3]
83 | lidar = transform_lidar_box_3d(lidar, Rt)
84 | # add intensity
85 | lidar = np.concatenate((lidar, intensity[:, None]), axis=-1)
86 |
87 | # Load bboxes
88 | bboxes = np.asarray(pd.read_pickle(bbox_path))
89 | labels, bboxes = make_xzyhwly(bboxes)
90 | corners_3d, orientation_3d = make_eight_points_boxes(bboxes)
91 | corners_3d = np.asarray(
92 | [transform_lidar_box_3d(box, Rt) for box in corners_3d]
93 | )
94 | orientation_3d = np.asarray(
95 | [transform_lidar_box_3d(box, Rt) for box in orientation_3d]
96 | )
97 |             # filter out boxes containing fewer than 20 lidar points
98 | labels, corners_3d, orientation_3d = filter_boxes(
99 | labels, corners_3d, orientation_3d, lidar
100 | )
101 | centroid, width, length, height, yaw = get_bboxes_parameters_from_points(
102 | corners_3d
103 | )
104 |
105 | # Save data
106 | lidar_filename = os.path.join(lidar_out_dir, sample_idx + ".bin")
107 | save_lidar(lidar_filename, lidar.astype(np.float32))
108 | box_filename = os.path.join(bbox_out_dir, sample_idx + ".txt")
109 | save_bboxes_to_file(
110 | box_filename, centroid, width, length, height, yaw, labels
111 | )
112 |
113 |
114 | if __name__ == "__main__":
115 | parser = argparse.ArgumentParser(description="Preprocess 3D pandaset.")
116 | parser.add_argument("--dataset_dir", default="../../dataset")
117 | args = parser.parse_args()
118 | preprocess_data(args.dataset_dir)
119 |
--------------------------------------------------------------------------------
/detection_3d/data_preprocessing/pandaset_tools/transform.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 |
20 | import numpy as np
21 |
22 |
23 | def quaternion_to_euler(w, x, y, z):
24 | """
25 |     Converts a quaternion with components w, x, y, z into a tuple (roll, pitch, yaw)
26 |
27 | """
28 | sinr_cosp = 2 * (w * x + y * z)
29 | cosr_cosp = 1 - 2 * (x ** 2 + y ** 2)
30 | roll = np.arctan2(sinr_cosp, cosr_cosp)
31 |
32 | sinp = 2 * (w * y - z * x)
33 | pitch = np.where(np.abs(sinp) >= 1, np.sign(sinp) * np.pi / 2, np.arcsin(sinp))
34 |
35 | siny_cosp = 2 * (w * z + x * y)
36 | cosy_cosp = 1 - 2 * (y ** 2 + z ** 2)
37 | yaw = np.arctan2(siny_cosp, cosy_cosp)
38 |
39 | return roll, pitch, yaw
40 |
41 |
42 | # Calculates Rotation Matrix given euler angles.
43 | def eulerAnglesToRotationMatrix(theta):
44 |
45 | R_x = np.array(
46 | [
47 | [1, 0, 0],
48 | [0, np.cos(theta[0]), -np.sin(theta[0])],
49 | [0, np.sin(theta[0]), np.cos(theta[0])],
50 | ]
51 | )
52 |
53 | R_y = np.array(
54 | [
55 | [np.cos(theta[1]), 0, np.sin(theta[1])],
56 | [0, 1, 0],
57 | [-np.sin(theta[1]), 0, np.cos(theta[1])],
58 | ]
59 | )
60 |
61 | R_z = np.array(
62 | [
63 | [np.cos(theta[2]), -np.sin(theta[2]), 0],
64 | [np.sin(theta[2]), np.cos(theta[2]), 0],
65 | [0, 0, 1],
66 | ]
67 | )
68 |
69 | R = np.dot(R_z, np.dot(R_y, R_x))
70 |
71 | return R
72 |
73 |
74 | def to_transform_matrix(translation, rotation):
75 | Rt = np.eye(4)
76 | Rt[:3, :3] = eulerAnglesToRotationMatrix(rotation)
77 | Rt[:3, 3] = translation
78 | return Rt
79 |
80 |
81 | def transform_lidar_box_3d(lidar, Rt):
82 | rt_inv = np.linalg.inv(Rt)
83 |
84 | lidar_3d = lidar[:, :3]
85 | lidar_3d = np.transpose(lidar_3d)
86 |
87 | ones = np.ones_like(lidar_3d[0])[None, :]
88 | hom_coord = np.concatenate((lidar_3d, ones), axis=0)
89 | lidar_3d = np.dot(rt_inv, hom_coord)
90 | lidar_3d = np.transpose(lidar_3d)[:, :3]
91 |
92 | return lidar_3d
93 |
--------------------------------------------------------------------------------
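A short round-trip check of these helpers (a pure yaw pose; the values are picked for illustration):

```python
import numpy as np
from detection_3d.data_preprocessing.pandaset_tools.transform import (
    quaternion_to_euler,
    to_transform_matrix,
    transform_lidar_box_3d,
)

# Quaternion for a 90 degree yaw about z: w = cos(45 deg), z = sin(45 deg).
roll, pitch, yaw = quaternion_to_euler(np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4))
print(np.degrees(yaw))  # ~90.0

Rt = to_transform_matrix(np.array([1.0, 2.0, 0.0]), (roll, pitch, yaw))

# transform_lidar_box_3d applies the inverse pose, mapping world coordinates
# back into the sensor frame: the sensor position lands at the origin.
print(transform_lidar_box_3d(np.array([[1.0, 2.0, 0.0]]), Rt))  # ~[[0. 0. 0.]]
```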
/detection_3d/data_preprocessing/pandaset_tools/visualize_data.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 | import argparse
20 | import numpy as np
21 | import os
22 | import glob
23 | import pandas as pd
24 | from detection_3d.data_preprocessing.pandaset_tools.helpers import (
25 | make_xzyhwly,
26 | filter_boxes,
27 | )
28 | from detection_3d.tools.detection_helpers import (
29 | make_eight_points_boxes,
30 | get_bboxes_parameters_from_points,
31 | )
32 | import mayavi.mlab as mlab
33 | from tqdm import tqdm
34 | from detection_3d.tools.file_io import read_json
35 | from detection_3d.data_preprocessing.pandaset_tools.transform import (
36 | quaternion_to_euler,
37 | to_transform_matrix,
38 | transform_lidar_box_3d,
39 | )
40 | from detection_3d.tools.visualization_tools import visualize_lidar, visualize_bboxes_3d
41 |
42 |
43 | def visualize_data(dataset_dir):
44 | """
45 | The function visualizes data from pandaset.
46 | Arguments:
47 | dataset_dir: directory with Pandaset data
48 | """
49 |     # The lidar origin is in the middle of the point cloud. We shift the points
50 |     # to the top left corner of the top view image, which covers the
51 |     # 50x100 meter area around the car where the lidar point cloud
52 |     # is most dense.
53 |     shift_lidar = [25, 50, 2.5]
54 |
55 | # Get list of data samples
56 | search_string = os.path.join(dataset_dir, "*")
57 | seq_list = sorted(glob.glob(search_string))
58 | for seq in tqdm(seq_list, desc="Process sequences", total=len(seq_list)):
59 | search_string = os.path.join(seq, "lidar", "*.pkl.gz")
60 | lidar_list = sorted(glob.glob(search_string))
61 | lidar_pose_path = os.path.join(seq, "lidar", "poses.json")
62 | lidar_pose = read_json(lidar_pose_path)
63 | for idx, lidar_path in enumerate(lidar_list):
64 | # Get pose of the lidar
65 | translation = lidar_pose[idx]["position"]
66 | translation = np.asarray([translation[key] for key in translation])
67 | rotation = lidar_pose[idx]["heading"]
68 | rotation = np.asarray([rotation[key] for key in rotation])
69 | rotation = quaternion_to_euler(*rotation)
70 | Rt = to_transform_matrix(translation, rotation)
71 |
72 |             # Get the respective bboxes (keeps absolute dataset paths intact)
73 |             bbox_path = os.path.join(
74 |                 seq, "annotations", "cuboids", os.path.basename(lidar_path)
75 |             )
76 |
77 | # Load data
78 | lidar = np.asarray(pd.read_pickle(lidar_path))
79 | # Get only lidar 0 (there is also lidar 1)
80 | lidar = lidar[lidar[:, -1] == 0]
81 | intensity = lidar[:, 3]
82 | lidar = transform_lidar_box_3d(lidar, Rt)
83 | # add intensity
84 | lidar = np.concatenate((lidar, intensity[:, None]), axis=-1)
85 |
86 | # Load bboxes
87 | bboxes = np.asarray(pd.read_pickle(bbox_path))
88 | labels, bboxes = make_xzyhwly(bboxes)
89 | corners_3d, orientation_3d = make_eight_points_boxes(bboxes)
90 | corners_3d = np.asarray(
91 | [transform_lidar_box_3d(box, Rt) for box in corners_3d]
92 | )
93 | orientation_3d = np.asarray(
94 | [transform_lidar_box_3d(box, Rt) for box in orientation_3d]
95 | )
96 | labels, corners_3d, orientation_3d = filter_boxes(
97 | labels, corners_3d, orientation_3d, lidar
98 | )
99 | centroid, width, length, height, yaw = get_bboxes_parameters_from_points(
100 | corners_3d
101 | )
102 |
103 | boxes_new = np.concatenate(
104 | (
105 | centroid,
106 | length[:, None],
107 | width[:, None],
108 | height[:, None],
109 | yaw[:, None],
110 | ),
111 | axis=-1,
112 | )
113 | lidar[:, :3] = lidar[:, :3] + shift_lidar
114 |
115 | corners_3d, orientation_3d = make_eight_points_boxes(boxes_new)
116 | corners_3d = corners_3d + shift_lidar
117 | orientation_3d = orientation_3d + shift_lidar
118 | figure = visualize_bboxes_3d(corners_3d, None, orientation_3d)
119 | figure = visualize_lidar(lidar, figure)
120 | mlab.show(1)
121 | input()
122 | mlab.close(figure)
123 |
124 |
125 | if __name__ == "__main__":
126 |     parser = argparse.ArgumentParser(description="Visualize 3D pandaset data.")
127 | parser.add_argument("--dataset_dir", default="../../dataset")
128 | args = parser.parse_args()
129 |     visualize_data(args.dataset_dir)
130 |
--------------------------------------------------------------------------------
/detection_3d/detection_dataset.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 | import tensorflow as tf
20 | import numpy as np
21 | import argparse
22 | from tqdm import tqdm
23 | from detection_3d.parameters import Parameters
24 | from detection_3d.tools.file_io import load_dataset_list, load_lidar, load_bboxes
25 | from detection_3d.tools.detection_helpers import (
26 | make_top_view_image,
27 | make_eight_points_boxes,
28 | get_bboxes_grid,
29 | )
30 | from detection_3d.tools.visualization_tools import visualize_2d_boxes_on_top_image
31 | from detection_3d.tools.augmentation_tools import (
32 | random_rotate_lidar_boxes,
33 | random_flip_x_lidar_boxes,
34 | random_flip_y_lidar_boxes,
35 | )
36 | from PIL import Image
37 |
38 |
39 | class DetectionDataset:
40 | """
41 |     The dataset layer for the 3d detection experiment.
42 |     Arguments:
43 |         param_settings: parameters of the experiment
44 |         dataset_file: name of the .datatxt dataset list file
45 |         shuffle: shuffle the data True/False
46 | """
47 |
48 | def __init__(self, param_settings, dataset_file, augmentation=False, shuffle=False):
49 | # Private methods
50 | self.seed = param_settings["seed"]
51 | np.random.seed(self.seed)
52 |
53 | self.augmentation = augmentation
54 |
55 | self.param_settings = param_settings
56 | self.dataset_file = dataset_file
57 | self.inputs_list = load_dataset_list(
58 | self.param_settings["dataset_dir"], dataset_file
59 | )
60 | self.num_samples = len(self.inputs_list)
61 | self.num_it_per_epoch = int(
62 | self.num_samples / self.param_settings["batch_size"]
63 | )
64 | self.output_types = [tf.float32, tf.float32, tf.string]
65 |
66 | ds = tf.data.Dataset.from_tensor_slices(self.inputs_list)
67 |
68 | if shuffle:
69 | ds = ds.shuffle(self.num_samples)
70 | ds = ds.map(
71 | map_func=lambda x: tf.py_function(
72 | self.load_data, [x], Tout=self.output_types
73 | ),
74 | num_parallel_calls=12,
75 | )
76 | ds = ds.batch(self.param_settings["batch_size"])
77 | ds = ds.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)
78 | self.dataset = ds
79 |
80 | def load_data(self, data_input):
81 | """
82 |         Loads the lidar scan and bounding boxes and converts them to the top view image and box grid.
83 |         Note: This is a numpy function.
84 | """
85 | lidar_file, bboxes_file = np.asarray(data_input).astype("U")
86 |
87 | lidar = load_lidar(lidar_file)
88 | bboxes = load_bboxes(bboxes_file)
89 | labels = bboxes[:, -1]
90 | lidar_corners_3d, _ = make_eight_points_boxes(bboxes[:, :-1])
91 | if self.augmentation:
92 | np.random.shuffle(lidar)
93 | if np.random.uniform(0, 1) < 0.50: # 50% probability to flip over x axis
94 | lidar, lidar_corners_3d = random_flip_x_lidar_boxes(
95 | lidar, lidar_corners_3d
96 | )
97 | if np.random.uniform(0, 1) < 0.50: # 50% probability to flip over y axis
98 | lidar, lidar_corners_3d = random_flip_y_lidar_boxes(
99 | lidar, lidar_corners_3d
100 | )
101 | if np.random.uniform(0, 1) < 0.80: # 80% probability to rotate
102 | lidar, lidar_corners_3d = random_rotate_lidar_boxes(
103 | lidar, lidar_corners_3d
104 | )
105 |
106 |         # Shift lidar coordinates to the positive quadrant
107 | lidar_coord = np.asarray(self.param_settings["lidar_offset"], dtype=np.float32)
108 | lidar = lidar + lidar_coord
109 | lidar_corners_3d = lidar_corners_3d + lidar_coord[:3]
110 | # Process data
111 | top_view = make_top_view_image(
112 | lidar, self.param_settings["grid_meters"], self.param_settings["voxel_size"]
113 | )
114 | box_grid = get_bboxes_grid(
115 | labels,
116 | lidar_corners_3d,
117 | self.param_settings["grid_meters"],
118 | self.param_settings["bbox_voxel_size"],
119 | )
120 | return top_view, box_grid, lidar_file
121 |
122 |
123 | if __name__ == "__main__":
124 | parser = argparse.ArgumentParser(description="DatasetLayer.")
125 | parser.add_argument(
126 | "--dataset_file",
127 | type=str,
128 |         help="path to the .datatxt dataset list file",
129 | default="train.datatxt",
130 | )
131 | args = parser.parse_args()
132 |
133 | param_settings = Parameters().settings
134 | train_dataset = DetectionDataset(param_settings, args.dataset_file)
135 |
136 | bbox_voxel_size = np.asarray(param_settings["bbox_voxel_size"], dtype=np.float32)
137 | grid_meters = np.array(param_settings["grid_meters"], dtype=np.float32)
138 |
139 | lidar_coord = np.array(param_settings["lidar_offset"], dtype=np.float32)
140 |
141 | for samples in tqdm(train_dataset.dataset, total=train_dataset.num_it_per_epoch):
142 | top_images, boxes_grid, lidar_file = samples
143 | print(
144 | f"lidar {top_images.shape}, boxes {boxes_grid.shape}, lidar_file {lidar_file}"
145 | )
146 |
147 | top_view = (
148 | visualize_2d_boxes_on_top_image(
149 | boxes_grid, top_images, grid_meters, bbox_voxel_size
150 | )
151 | * 255
152 | )
153 | img = Image.fromarray(top_view[0].astype("uint8"))
154 | img.save("result.png")
155 | input()
156 |
--------------------------------------------------------------------------------
/detection_3d/losses.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 |
20 | import tensorflow as tf
21 | import numpy as np
22 | from tensorflow.keras.losses import binary_crossentropy, sparse_categorical_crossentropy
23 |
24 |
25 | def detection_loss(gt_bboxes, pred_bboxes, num_classes=26):
26 |
27 |     # gt_bboxes: [batch, h, w, 9] with channels
28 |     # (objectness, delta_x, delta_y, orient_x, orient_y, z_centroid, width, height, label)
29 | (
30 | gt_objectness,
31 | gt_delta_xy,
32 | gt_orient_xy,
33 | gt_z_coord,
34 | gt_width,
35 | gt_height,
36 | gt_label,
37 | ) = tf.split(gt_bboxes, (1, 2, 2, 1, 1, 1, 1), axis=-1)
38 |
39 | (
40 | p_objectness,
41 | p_delta_xy,
42 | p_orient_xy,
43 | p_z_coord,
44 | p_width,
45 | p_height,
46 | p_label,
47 | ) = tf.split(pred_bboxes, (1, 2, 2, 1, 1, 1, num_classes), axis=-1)
48 |
49 | # Objectness
50 | p_objectness = tf.sigmoid(p_objectness)
51 | obj_loss = binary_crossentropy(gt_objectness, p_objectness)
52 |
53 | # Evaluate regression only for non-zero ground truth objects
54 | obj_mask = tf.squeeze(gt_objectness, -1)
55 |
56 |     # Evaluate the remaining box parameters and the class only at object cells
57 | label_loss = obj_mask * sparse_categorical_crossentropy(
58 | gt_label, p_label, from_logits=True
59 | )
60 |
61 | delta_xy_loss = obj_mask * tf.reduce_sum(tf.abs(gt_delta_xy - p_delta_xy), axis=-1)
62 | delta_orient_loss = obj_mask * tf.reduce_sum(
63 | tf.abs(gt_orient_xy - p_orient_xy), axis=-1
64 | )
65 |
66 | z_loss = obj_mask * tf.squeeze(tf.abs(gt_z_coord - p_z_coord), -1)
67 | width_loss = obj_mask * tf.squeeze(tf.abs(gt_width - p_width), -1)
68 | height_loss = obj_mask * tf.squeeze(tf.abs(gt_height - p_height), -1)
69 |
70 | obj_loss = tf.reduce_sum(obj_loss)
71 | label_loss = tf.reduce_sum(label_loss)
72 | z_loss = tf.reduce_sum(z_loss)
73 | delta_xy_loss = tf.reduce_sum(delta_xy_loss)
74 | width_loss = tf.reduce_sum(width_loss)
75 | height_loss = tf.reduce_sum(height_loss)
76 | delta_orient_loss = tf.reduce_sum(delta_orient_loss)
77 |
78 | return (
79 | obj_loss,
80 | label_loss,
81 | z_loss,
82 | delta_xy_loss,
83 | width_loss,
84 | height_loss,
85 | delta_orient_loss,
86 | )
87 |
88 |
--------------------------------------------------------------------------------
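Given the splits above, `gt_bboxes` carries 9 channels per cell and `pred_bboxes` 8 + num_classes. A shape-level smoke test with random tensors (the spatial sizes are illustrative):

```python
import tensorflow as tf
from detection_3d.losses import detection_loss

batch, height, width, num_classes = 2, 280, 160, 26
gt = tf.random.uniform([batch, height, width, 9])
pred = tf.random.uniform([batch, height, width, 8 + num_classes])

names = ["obj", "label", "z", "delta_xy", "width", "height", "delta_orient"]
for name, value in zip(names, detection_loss(gt, pred, num_classes=num_classes)):
    print(name, float(value))
```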
/detection_3d/metrics.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 |
20 |
21 | import tensorflow as tf
22 | from detection_3d.tools.file_io import save_to_json
23 | import numpy as np
24 | import os
25 |
26 |
27 | class EpochMetrics:
28 | """
29 |     The class accumulates the train and validation losses
30 |     over an epoch.
31 | """
32 |
33 | def __init__(self):
34 | self.train_loss = tf.keras.metrics.Mean(name="train_loss")
35 | self.val_loss = tf.keras.metrics.Mean(name="val_loss")
36 |
37 | def reset(self):
38 | """
39 | Reset all metrics to zero (need to do each epoch)
40 | """
41 | self.train_loss.reset_states()
42 | self.val_loss.reset_states()
43 |
44 | def save_to_json(self, dir_to_save):
45 | """
46 | Save all metrics to the json file
47 | """
48 |
49 |         # Create the folder if it does not exist
50 | os.makedirs(dir_to_save, exist_ok=True)
51 | json_filename = os.path.join(dir_to_save, "epoch_metrics.json")
52 | # fill the dict
53 | metrics_dict = {
54 | "train_loss": str(self.train_loss.result().numpy()),
55 | "val_loss": str(self.val_loss.result().numpy()),
56 | }
57 | save_to_json(json_filename, metrics_dict)
58 |
59 | def print_metrics(self):
60 | """
61 | Print all metrics
62 | """
63 | train_loss = np.around(self.train_loss.result().numpy(), decimals=2)
64 | val_loss = np.around(self.val_loss.result().numpy(), decimals=2)
65 |
66 | template = "train_loss {}, val_loss {}".format(train_loss, val_loss)
67 | print(template)
68 |
--------------------------------------------------------------------------------
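A tiny usage sketch of the metric accumulator (values are made up):

```python
from detection_3d.metrics import EpochMetrics

metrics = EpochMetrics()
metrics.train_loss(1.5)  # accumulate per-batch losses
metrics.train_loss(0.5)
metrics.val_loss(1.0)
metrics.print_metrics()  # train_loss 1.0, val_loss 1.0
metrics.reset()          # call once per epoch
```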
/detection_3d/model.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 |
20 |
21 | import tensorflow as tf
22 | from tensorflow.keras.layers import (
23 | Conv2D,
24 | Layer,
25 | UpSampling2D,
26 | BatchNormalization,
27 | LeakyReLU,
28 | )
29 | from tensorflow.keras import Model
30 | from tensorflow.keras.regularizers import l2
31 |
32 |
33 | class DarkNetConv2D(Layer):
34 | """
35 | The darknet conv layer yolo_v3
36 | """
37 |
38 | def __init__(
39 | self,
40 | filters,
41 | kernel,
42 | strides,
43 | padding,
44 | weight_decay,
45 | batch_norm=True,
46 | activation_funct=True,
47 | data_format="channels_last",
48 | ):
49 | super(DarkNetConv2D, self).__init__()
50 | self.batch_norm = batch_norm
51 | self.activation_funct = activation_funct
52 | self.conv = Conv2D(
53 | filters,
54 | kernel,
55 | strides=strides,
56 | activation=None,
57 | kernel_regularizer=l2(weight_decay),
58 | padding=padding,
59 | data_format=data_format,
60 | )
61 | self.bn = BatchNormalization()
62 | self.activation = LeakyReLU(alpha=0.1)
63 |
64 | def call(self, x, training=False):
65 | x = self.conv(x)
66 | if self.batch_norm:
67 | x = self.bn(x, training=training)
68 | if self.activation_funct:
69 | x = self.activation(x)
70 | return x
71 |
72 |
73 | class DarkNetBlock(Layer):
74 | """
75 | The darknet block layer
76 | """
77 |
78 | def __init__(
79 | self, filters, weight_decay, batch_norm=True, data_format="channels_last"
80 | ):
81 | super(DarkNetBlock, self).__init__()
82 |
83 | self.conv1 = DarkNetConv2D(
84 | filters // 2,
85 | 1,
86 | strides=1,
87 | padding="same",
88 | weight_decay=weight_decay,
89 | batch_norm=batch_norm,
90 | data_format=data_format,
91 | )
92 | self.conv2 = DarkNetConv2D(
93 | filters,
94 | 3,
95 | strides=1,
96 | padding="same",
97 | weight_decay=weight_decay,
98 | batch_norm=batch_norm,
99 | data_format=data_format,
100 | )
101 |
102 | def call(self, x, training=False):
103 | prev = x
104 | x = self.conv1(x, training=training)
105 | x = self.conv2(x, training=training)
106 | return prev + x
107 |
108 |
109 | class DarkNetDecoderBlock(Layer):
110 | """
111 | The yolo v3 decoder layer
112 | """
113 |
114 | def __init__(self, filters, weight_decay, data_format="channels_last"):
115 | super(DarkNetDecoderBlock, self).__init__()
116 |
117 | self.conv1_1 = DarkNetConv2D(
118 | filters,
119 | (1, 1),
120 | strides=(1, 1),
121 | weight_decay=weight_decay,
122 | padding="same",
123 | data_format=data_format,
124 | )
125 | self.conv1_2 = DarkNetConv2D(
126 | filters * 2,
127 | (3, 3),
128 | strides=(1, 1),
129 | weight_decay=weight_decay,
130 | padding="same",
131 | data_format=data_format,
132 | )
133 | self.conv2_1 = DarkNetConv2D(
134 | filters,
135 | (1, 1),
136 | strides=(1, 1),
137 | weight_decay=weight_decay,
138 | padding="same",
139 | data_format=data_format,
140 | )
141 | self.conv2_2 = DarkNetConv2D(
142 | filters * 2,
143 | (3, 3),
144 | strides=(1, 1),
145 | weight_decay=weight_decay,
146 | padding="same",
147 | data_format=data_format,
148 | )
149 | self.conv3 = DarkNetConv2D(
150 | filters,
151 | (1, 1),
152 | strides=(1, 1),
153 | weight_decay=weight_decay,
154 | padding="same",
155 | data_format=data_format,
156 | )
157 |
158 | def call(self, x, training=False):
159 | x = self.conv1_1(x, training=training)
160 | x = self.conv1_2(x, training=training)
161 | x = self.conv2_1(x, training=training)
162 | x = self.conv2_2(x, training=training)
163 | x = self.conv3(x, training=training)
164 | return x
165 |
166 |
167 | class DarkNetEncoder(Layer):
168 | """
169 | The darknet 53 encoder from yolo_v3
170 | See: https://arxiv.org/abs/1804.02767
171 | """
172 |
173 | def __init__(self, name, weight_decay, data_format="channels_last"):
174 | super(DarkNetEncoder, self).__init__(name=name)
175 | # Input
176 | self.conv1 = DarkNetConv2D(
177 | 32,
178 | (3, 3),
179 | strides=(1, 1),
180 | weight_decay=weight_decay,
181 | padding="same",
182 | data_format=data_format,
183 | )
184 | # Conv with stride 2
185 | self.conv2 = DarkNetConv2D(
186 | 64,
187 | (3, 3),
188 | strides=(2, 2),
189 | weight_decay=weight_decay,
190 | padding="same",
191 | data_format=data_format,
192 | )
193 | # Residual block
194 | self.block_1 = DarkNetBlock(
195 | 64, weight_decay=weight_decay, data_format=data_format
196 | )
197 | # Conv with stride 2
198 | self.conv3 = DarkNetConv2D(
199 | 128,
200 | (3, 3),
201 | strides=(2, 2),
202 | weight_decay=weight_decay,
203 | padding="same",
204 | data_format=data_format,
205 | )
206 | # Residual blocks 2x
207 | self.block_2 = []
208 | for _ in range(2):
209 | self.block_2.append(
210 | DarkNetBlock(128, weight_decay=weight_decay, data_format=data_format)
211 | )
212 | # Conv with stride 2
213 | self.conv4 = DarkNetConv2D(
214 | 256,
215 | (3, 3),
216 | strides=(2, 2),
217 | weight_decay=weight_decay,
218 | padding="same",
219 | data_format=data_format,
220 | )
221 | # Residual blocks 8x
222 | self.block_3 = []
223 | for _ in range(8):
224 | self.block_3.append(
225 | DarkNetBlock(256, weight_decay=weight_decay, data_format=data_format)
226 | )
227 | # Conv with stride 2
228 | self.conv5 = DarkNetConv2D(
229 | 512,
230 | (3, 3),
231 | strides=(2, 2),
232 | weight_decay=weight_decay,
233 | padding="same",
234 | data_format=data_format,
235 | )
236 | # Residual blocks 8x
237 | self.block_4 = []
238 | for _ in range(8):
239 | self.block_4.append(
240 | DarkNetBlock(512, weight_decay=weight_decay, data_format=data_format)
241 | )
242 | # Conv with stride 2
243 | self.conv6 = DarkNetConv2D(
244 | 1024,
245 | (3, 3),
246 | strides=(2, 2),
247 | weight_decay=weight_decay,
248 | padding="same",
249 | data_format=data_format,
250 | )
251 | # Residual blocks 4x
252 | self.block_5 = []
253 | for _ in range(4):
254 | self.block_5.append(
255 | DarkNetBlock(1024, weight_decay=weight_decay, data_format=data_format)
256 | )
257 |
258 | def call(self, x, training=False):
259 | x = self.conv1(x, training=training)
260 | x = self.conv2(x, training=training)
261 | x = x_b1 = self.block_1(x, training=training)
262 | x = self.conv3(x, training=training)
263 | for i in range(len(self.block_2)):
264 | x = x_b2 = self.block_2[i](x, training=training)
265 |         x = self.conv4(x, training=training)
266 | for i in range(len(self.block_3)):
267 | x = x_b3 = self.block_3[i](x, training=training)
268 | x = self.conv5(x, training=training)
269 | for i in range(len(self.block_4)):
270 | x = x_b4 = self.block_4[i](x, training=training)
271 | x = self.conv6(x, training=training)
272 | for i in range(len(self.block_5)):
273 | x = x_b5 = self.block_5[i](x, training=training)
274 |
275 | return x_b5, x_b4, x_b3, x_b2, x_b1
276 |
277 |
278 | class DarkNetDecoder(Layer):
279 | """
280 | The yolo v3 decoder
281 | """
282 |
283 | def __init__(self, name, weight_decay, data_format="channels_last"):
284 | super(DarkNetDecoder, self).__init__(name=name)
285 |
286 | self.decoder_block_1 = DarkNetDecoderBlock(
287 | filters=512, weight_decay=weight_decay, data_format=data_format
288 | )
289 |
290 | self.conv1 = DarkNetConv2D(
291 | 256,
292 | (1, 1),
293 | strides=(1, 1),
294 | weight_decay=weight_decay,
295 | padding="same",
296 | data_format=data_format,
297 | )
298 | self.up1 = UpSampling2D(size=(2, 2), data_format=data_format)
299 | self.decoder_block_2 = DarkNetDecoderBlock(
300 | filters=256, weight_decay=weight_decay, data_format=data_format
301 | )
302 |
303 | self.conv2 = DarkNetConv2D(
304 | 128,
305 | (1, 1),
306 | strides=(1, 1),
307 | weight_decay=weight_decay,
308 | padding="same",
309 | data_format=data_format,
310 | )
311 | self.up2 = UpSampling2D(size=(2, 2), data_format=data_format)
312 | self.decoder_block_3 = DarkNetDecoderBlock(
313 | filters=128, weight_decay=weight_decay, data_format=data_format
314 | )
315 | self.up3 = UpSampling2D(size=(2, 2), data_format=data_format)
316 | self.decoder_block_4 = DarkNetDecoderBlock(
317 | filters=64, weight_decay=weight_decay, data_format=data_format
318 | )
319 | self.up4 = UpSampling2D(size=(2, 2), data_format=data_format)
320 | self.decoder_block_5 = DarkNetDecoderBlock(
321 | filters=64, weight_decay=weight_decay, data_format=data_format
322 | )
323 |
324 | def call(self, x_in, training=False):
325 | # First lvl
326 | x_b5, x_b4, x_b3, x_b2, x_b1 = x_in
327 | x = self.decoder_block_1(x_b5, training=training)
328 | # Second lvl
329 | x = self.conv1(x, training=training)
330 | x = self.up1(x)
331 | x = tf.concat([x, x_b4], axis=-1)
332 | x = self.decoder_block_2(x, training=training)
333 | # Third lvl
334 | x = self.conv2(x, training=training)
335 | x = self.up2(x)
336 | x = tf.concat([x, x_b3], axis=-1)
337 | x = self.decoder_block_3(x, training=training)
338 | x = self.up3(x)
339 | x = tf.concat([x, x_b2], axis=-1)
340 | x = self.decoder_block_4(x, training=training)
341 | x = self.up4(x)
342 | x = tf.concat([x, x_b1], axis=-1)
343 | x = self.decoder_block_5(x, training=training)
344 |
345 | return x
346 |
347 |
348 | class YoloV3_Lidar(Model):
349 | def __init__(self, weight_decay, num_classes=26, data_format="channels_last"):
350 | super(YoloV3_Lidar, self).__init__(name="YoloV3_Lidar")
351 | self.encoder = DarkNetEncoder(
352 | name="DarkNetEncoder", weight_decay=weight_decay, data_format=data_format
353 | )
354 | self.decoder = DarkNetDecoder(
355 | name="DarkNetDecoder", weight_decay=weight_decay, data_format=data_format
356 | )
357 | self.final_layer = Conv2D(
358 | 8 + num_classes,
359 | (1, 1),
360 | activation=None,
361 | padding="same",
362 | data_format=data_format,
363 | )
364 |
365 | def call(self, x, training):
366 | x = self.encoder(x, training=training)
367 | x = self.decoder(x, training=training)
368 | x = self.final_layer(x)
369 | return x
370 |
--------------------------------------------------------------------------------
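A minimal forward-pass check of the model (the encoder downsamples by 32 and the decoder upsamples by 16, so the output grid is half the input resolution; the input channel count here is arbitrary):

```python
import tensorflow as tf
from detection_3d.model import YoloV3_Lidar

model = YoloV3_Lidar(weight_decay=1e-4, num_classes=26)
# Top view of the 52x104 m grid at 0.125 m cells -> 416x832 pixels.
x = tf.zeros([1, 416, 832, 3])
y = model(x, training=False)
print(y.shape)  # (1, 208, 416, 34): 8 box channels + 26 class logits
```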
/detection_3d/parameters.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 | import os
20 | from detection_3d.tools.file_io import save_to_json
21 | from detection_3d.tools.statics import NO_SCHEDULER, RESTARTS_SCHEDULER, ADAM
22 |
23 |
24 | class Parameters(object):
25 | """
26 | The class contains experiment parameters.
27 | """
28 |
29 | def __init__(self):
30 |
31 | self.settings = {
32 | # The directory of the dataset
33 | "dataset_dir": "dataset",
34 | "batch_size": 4,
35 | # The checkpoint related
36 | "checkpoints_dir": "log/checkpoints",
37 | "train_summaries": "log/summaries/train",
38 | "eval_summaries": "log/summaries/val",
39 | # Update tensorboard train images each step_summaries iterations
40 |             "step_summaries": 100,  # set to None to turn off
41 | # General settings
42 | "seed": 2020,
43 | "max_epochs": 1000,
44 | "weight_decay": 1,
45 | }
46 |
47 | # Set special parameters
48 | self.settings["optimizer"] = ADAM
49 | self.settings["scheduler"] = SchedulerSettings.no_scheduler()
50 | self.settings["augmentation"] = True
51 | # Detection related
52 |         self.settings["grid_meters"] = [52.0, 104.0, 8.0]  # [x, y, z] in meters
53 | # [x,y,z, intensity] offset to shift all lidar points in positive coordinate quadrant
54 | # (all x,y,z coords >=0)
55 | self.settings["lidar_offset"] = [26.0, 52.0, 2.5, 0.0]
56 | # [x,y,z] voxel size in meters
57 | self.settings["voxel_size"] = [0.125, 0.125, 8.0]
58 | # [x,y,z] voxel size in meters
59 | self.settings["bbox_voxel_size"] = [0.25, 0.25, 1.0]
60 |
61 | # Automatically defined during training parameters
62 | self.settings["train_size"] = None # the size of train set
63 | self.settings["val_size"] = None # the size of val set
64 |
65 | def save_to_json(self, dir_to_save):
66 | """
67 | Save parameters to .json
68 | """
69 |         # Create the folder if it does not exist
70 | os.makedirs(dir_to_save, exist_ok=True)
71 | json_filename = os.path.join(dir_to_save, "parameters.json")
72 | save_to_json(json_filename, self.settings)
73 |
74 |
75 | class SchedulerSettings:
76 | """
77 | The class contains parameters for different schedulers.
78 | """
79 |
80 | def __init__(self):
81 | pass
82 |
83 | # Supported schedulers
84 | @staticmethod
85 | def no_scheduler():
86 | """
87 | Constant learning rate scheduler.
88 | """
89 | scheduler = {
90 | "name": NO_SCHEDULER,
91 | "initial_learning_rate": 1e-5,
92 | }
93 | return scheduler
94 |
95 | @staticmethod
96 | def restarts_scheduler():
97 | """
98 | The warm restarts scheduler.
99 | See: https://arxiv.org/abs/1608.03983
100 | """
101 | scheduler = {
102 | "name": RESTARTS_SCHEDULER,
103 | "initial_learning_rate": 1e-4, # 2e-3
104 |             "first_decay_steps": 80,  # Important: given in epochs; converted to iterations during training
105 | "t_mul": 2.0,
106 | "m_mul": 1.0,
107 | "alpha": 1e-6,
108 | }
109 | return scheduler
110 |
--------------------------------------------------------------------------------
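For example, to switch from the constant learning rate to the warm-restarts schedule (a usage sketch; the dataset path is hypothetical):

```python
from detection_3d.parameters import Parameters, SchedulerSettings

params = Parameters()
params.settings["dataset_dir"] = "/path/to/dataset"  # hypothetical path
params.settings["scheduler"] = SchedulerSettings.restarts_scheduler()
params.save_to_json("log/checkpoints")
```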
/detection_3d/tools/augmentation_tools.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 |
20 | import numpy as np
21 | from detection_3d.data_preprocessing.pandaset_tools.transform import (
22 | eulerAnglesToRotationMatrix,
23 | )
24 |
25 |
26 | def random_rotate_lidar_boxes(
27 | lidar, lidar_corners_3d, min_angle=-np.pi / 4, max_angle=np.pi / 4
28 | ):
29 | yaw = np.random.uniform(min_angle, max_angle)
30 | R = eulerAnglesToRotationMatrix([0, 0, yaw])
31 | lidar = np.transpose(lidar)
32 | lidar_corners_3d = np.transpose(lidar_corners_3d, (0, 2, 1))
33 |
34 | lidar[:3] = np.matmul(R, lidar[:3])
35 | lidar_corners_3d = np.matmul(R, lidar_corners_3d)
36 |
37 | lidar_corners_3d = np.transpose(lidar_corners_3d, (0, 2, 1))
38 | lidar = np.transpose(lidar)
39 | return lidar, lidar_corners_3d
40 |
41 |
42 | def random_flip_x_lidar_boxes(lidar, lidar_corners_3d):
43 | lidar[:, 0] = -lidar[:, 0]
44 | lidar_corners_3d[:, :, 0] = -lidar_corners_3d[:, :, 0]
45 | return lidar, lidar_corners_3d
46 |
47 |
48 | def random_flip_y_lidar_boxes(lidar, lidar_corners_3d):
49 | lidar[:, 1] = -lidar[:, 1]
50 | lidar_corners_3d[:, :, 1] = -lidar_corners_3d[:, :, 1]
51 | return lidar, lidar_corners_3d
52 |
--------------------------------------------------------------------------------
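A short sketch (assumed, with dummy data) of how the augmentation helpers above compose: the same random yaw is applied to the points and the box corners, so the labels stay consistent with the cloud.

```
import numpy as np
from detection_3d.tools.augmentation_tools import (
    random_rotate_lidar_boxes,
    random_flip_x_lidar_boxes,
)

# 100 lidar points (x, y, z, intensity) and two boxes as 8 corners each.
lidar = np.random.uniform(-10, 10, size=(100, 4)).astype(np.float32)
corners = np.random.uniform(-10, 10, size=(2, 8, 3)).astype(np.float32)

# Rotate points and corners by one random yaw in [-pi/4, pi/4].
lidar, corners = random_rotate_lidar_boxes(lidar, corners)
# Mirror the scene across x = 0.
lidar, corners = random_flip_x_lidar_boxes(lidar, corners)
print(lidar.shape, corners.shape)  # (100, 4) (2, 8, 3)
```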
/detection_3d/tools/detection_helpers.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 | import numpy as np
20 | import tensorflow as tf
21 |
22 |
23 | def rot_z(t):
24 | """ Rotation about the z-axis. """
25 | c = np.cos(t)
26 | s = np.sin(t)
27 | ones = np.ones_like(c)
28 | zeros = np.zeros_like(c)
29 | return np.asarray([[c, -s, zeros], [s, c, zeros], [zeros, zeros, ones]])
30 |
31 |
32 | def make_eight_points_boxes(bboxes_xyzlwhy):
33 | bboxes_xyzlwhy = np.asarray(bboxes_xyzlwhy)
34 | l = bboxes_xyzlwhy[:, 3] / 2.0
35 | w = bboxes_xyzlwhy[:, 4] / 2.0
36 | h = bboxes_xyzlwhy[:, 5] / 2.0
37 | # 3d bounding box corners
38 | x_corners = np.asarray([l, l, -l, -l, l, l, -l, -l])
39 | y_corners = np.asarray([w, -w, -w, w, w, -w, -w, w])
40 | z_corners = np.asarray([-h, -h, -h, -h, h, h, h, h])
41 | corners_3d = np.concatenate(([x_corners], [y_corners], [z_corners]), axis=0)
42 | yaw = np.asarray(bboxes_xyzlwhy[:, -1], dtype=np.float64)  # np.float is deprecated
43 | corners_3d = np.transpose(corners_3d, (2, 0, 1))
44 | R = np.transpose(rot_z(yaw), (2, 0, 1))
45 |
46 | corners_3d = np.matmul(R, corners_3d)
47 |
48 | centroid = bboxes_xyzlwhy[:, :3]
49 | corners_3d += centroid[:, :, None]
50 | orient_p = (corners_3d[:, :, 0] + corners_3d[:, :, 7]) / 2.0
51 | orientation_3d = np.concatenate(
52 | (centroid[:, :, None], orient_p[:, :, None]), axis=-1
53 | )
54 | corners_3d = np.transpose(corners_3d, (0, 2, 1))
55 | orientation_3d = np.transpose(orientation_3d, (0, 2, 1))
56 |
57 | return corners_3d, orientation_3d
58 |
59 |
60 | def get_bboxes_parameters_from_points(lidar_corners_3d):
61 | """
62 | The function returns the 7 box parameters [x, y, z, w, l, h, yaw]
63 | as (centroid, width, length, height, yaw).
64 | Arguments:
65 | lidar_corners_3d: [num_points, 8, 3]
66 | """
67 | centroid = (lidar_corners_3d[:, -2, :] + lidar_corners_3d[:, 0, :]) / 2.0
68 | delta_l = lidar_corners_3d[:, 0, :2] - lidar_corners_3d[:, 1, :2]
69 | delta_w = lidar_corners_3d[:, 1, :2] - lidar_corners_3d[:, 2, :2]
70 | width = np.linalg.norm(delta_w, axis=-1)
71 | length = np.linalg.norm(delta_l, axis=-1)
72 |
73 | height = lidar_corners_3d[:, -1, -1] - lidar_corners_3d[:, 0, -1]
74 | yaw = np.arctan2(delta_l[:, 1], delta_l[:, 0])
75 |
76 | return centroid, width, length, height, yaw
77 |
78 |
79 | def get_voxels_grid(voxel_size, grid_meters):
80 | voxel_size = np.asarray(voxel_size, np.float32)
81 | grid_size_meters = np.asarray(grid_meters, np.float32)
82 | voxels_grid = np.asarray(grid_size_meters / voxel_size, np.int32)
83 | return voxels_grid
84 |
85 |
86 | def get_bboxes_grid(bbox_labels, lidar_corners_3d, grid_meters, bbox_voxel_size):
87 | """
88 | The function transforms lidar_corners_3d (the 8 corner points of bboxes)
89 | into the parametrized grid version of the bboxes.
90 | """
91 | voxels_grid = get_voxels_grid(bbox_voxel_size, grid_meters)
92 | # Find box parameters
93 | centroid, width, length, height, _ = get_bboxes_parameters_from_points(
94 | lidar_corners_3d
95 | )
96 | # find the vector of orientation [centroid, orient_point]
97 | orient_point = (lidar_corners_3d[:, 1] + lidar_corners_3d[:, 2]) / 2.0
98 |
99 | voxel_coordinates = np.asarray(
100 | np.floor(centroid[:, :2] / bbox_voxel_size[:2]), np.int32
101 | )
102 | # Filter bboxes that do not fall into the grid
103 | bound_x = (voxel_coordinates[:, 0] >= 0) & (
104 | voxel_coordinates[:, 0] < voxels_grid[0]
105 | )
106 | bound_y = (voxel_coordinates[:, 1] >= 0) & (
107 | voxel_coordinates[:, 1] < voxels_grid[1]
108 | )
109 | mask = bound_x & bound_y
110 | # Filter out all bboxes outside the grid
111 | centroid = centroid[mask]
112 | orient_point = orient_point[mask]
113 | width = width[mask]
114 | length = length[mask]
115 | height = height[mask]
116 | bbox_labels = bbox_labels[mask]
117 | voxel_coordinates = voxel_coordinates[mask]
118 | # Confidence
119 | confidence = np.ones_like(width)
120 |
121 | # Voxel corners closest to the coordinate system origin (0,0,0)
122 | voxels_close_corners = (
123 | np.asarray(voxel_coordinates, np.float32) * bbox_voxel_size[:2]
124 | )
125 | # Get x,y, coordinate
126 | delta_xy = centroid[:, :2] - voxels_close_corners
127 | orient_xy = orient_point[:, :2] - voxels_close_corners
128 | z_coord = centroid[:, -1]
129 |
133 | # (x_grid, y_grid, (objectness, delta_x, delta_y, orient_x, orient_y, z, width, height, label))
134 | # objectness is 1 if a box exists for this grid cell, else 0
135 | output_tensor = np.zeros((voxels_grid[0], voxels_grid[1], 9), np.float32)
136 | if len(bbox_labels) > 0:
137 | data = np.concatenate(
138 | (
139 | confidence[:, None],
140 | delta_xy,
141 | orient_xy,
142 | z_coord[:, None],
143 | width[:, None],
144 | height[:, None],
145 | bbox_labels[:, None],
146 | ),
147 | axis=-1,
148 | )
149 | output_tensor[voxel_coordinates[:, 0], voxel_coordinates[:, 1]] = data
150 | return output_tensor
151 |
152 |
153 | def get_boxes_from_box_grid(box_grid, bbox_voxel_size, conf_threshold=0.0):
154 |
155 | # Get non-zero voxels
156 | objectness, delta_xy, orient_xy, z_coord, width, height, label = tf.split(
157 | box_grid, (1, 2, 2, 1, 1, 1, -1), axis=-1
158 | )
159 |
160 | mask = box_grid[:, :, 0] > conf_threshold
161 | valid_idx = tf.where(mask)
162 |
163 | z_coord = tf.gather_nd(z_coord, valid_idx)
164 | width = tf.gather_nd(width, valid_idx)
165 | height = tf.gather_nd(height, valid_idx)
166 | objectness = tf.gather_nd(objectness, valid_idx)
167 | label = tf.gather_nd(label, valid_idx)
168 | delta_xy = tf.gather_nd(delta_xy, valid_idx)
169 | orient_xy = tf.gather_nd(orient_xy, valid_idx)
170 | voxels_close_corners = tf.cast(valid_idx, tf.float32) * bbox_voxel_size[None, :2]
171 | xy_coord = delta_xy + voxels_close_corners
172 | xy_orient = orient_xy + voxels_close_corners
173 |
174 | delta = xy_orient[:, :2] - xy_coord[:, :2]
175 | length = 2 * tf.norm(delta, axis=-1, keepdims=True)
176 | yaw = tf.expand_dims(tf.atan2(delta[:, 1], delta[:, 0]), axis=-1)
177 |
178 | bbox = tf.concat([xy_coord, z_coord, length, width, height, yaw], axis=-1,)
179 | return bbox, label, objectness
180 |
181 |
182 | def make_top_view_image(lidar, grid_meters, voxels_size, channels=3):
183 | """
184 | The function makes a top-view image from lidar points.
185 | Arguments:
186 | lidar: lidar array of the shape [num_points, 4] (x, y, z, intensity)
187 | grid_meters: [x, y, z] grid size in meters
188 | voxels_size: [x, y, z] voxel size in meters
189 | channels: unused; the output image has 2 channels (z, intensity)
190 | """
191 | mask_x = (lidar[:, 0] >= 0) & (lidar[:, 0] < grid_meters[0])
192 | mask_y = (lidar[:, 1] >= 0) & (lidar[:, 1] < grid_meters[1])
193 | mask_z = (lidar[:, 2] >= 0) & (lidar[:, 2] < grid_meters[2])
194 | mask = mask_x & mask_y & mask_z
195 | lidar = lidar[mask]
196 | voxel_grid = get_voxels_grid(voxels_size, grid_meters)
197 | voxels = np.asarray(np.floor(lidar[:, :3] / voxels_size), np.int32)
198 | top_view = np.zeros((voxel_grid[0], voxel_grid[1], 2), np.float32)
199 | top_view[voxels[:, 0], voxels[:, 1], 0] = lidar[:, 2] # z values
200 | top_view[voxels[:, 0], voxels[:, 1], 1] = lidar[:, 3] # intensity values
201 |
202 | return top_view
203 |
--------------------------------------------------------------------------------
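A round-trip sketch (not from the repository) of the two box parametrizations above. Note that `get_bboxes_parameters_from_points` may return length and width swapped together with a 90-degree yaw offset, which describes the same box geometrically.

```
import numpy as np
from detection_3d.tools.detection_helpers import (
    make_eight_points_boxes,
    get_bboxes_parameters_from_points,
)

# One box as [x, y, z, length, width, height, yaw].
boxes_xyzlwhy = np.array([[10.0, 5.0, 1.0, 4.0, 2.0, 1.5, np.pi / 6]])
corners, orientation = make_eight_points_boxes(boxes_xyzlwhy)
print(corners.shape, orientation.shape)  # (1, 8, 3) (1, 2, 3)

centroid, width, length, height, yaw = get_bboxes_parameters_from_points(corners)
# Same geometry as the input box (up to a length/width swap + 90 deg yaw).
print(centroid, width, length, height, yaw)
```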
/detection_3d/tools/file_io.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 | import tensorflow as tf
20 | import matplotlib.pyplot as plt
21 | import numpy as np
22 | import os
23 | import json
24 | from detection_3d.data_preprocessing.pandaset_tools.helpers import labels
25 |
26 |
27 | def load_and_resize_image(image_filename, resize=None, data_type=tf.float32):
28 | """
29 | Load a png image into a tensor and resize it if necessary
30 | Arguments:
31 | image_filename: image file to load
32 | resize: tensor [new_width, new_height] or None
33 | Return:
34 | img: tensor of the size [1, H, W, 3]
35 | """
36 |
37 | img = tf.io.read_file(image_filename)
38 | img = tf.image.decode_png(img)
39 | # Add batch dim
40 | img = tf.expand_dims(img, axis=0)
41 |
42 | if resize is not None:
43 | img = tf.compat.v1.image.resize_nearest_neighbor(img, resize)
44 |
45 | img = tf.cast(img, data_type)
46 | return img
47 |
48 |
49 | def save_plot_to_image(file_to_save, figure):
50 | """
51 | Save matplotlib figure to image and close
52 | """
53 | plt.savefig(file_to_save)
54 | plt.close(figure)
55 |
56 |
57 | def read_json(json_filename):
58 | with open(json_filename) as json_file:
59 | data = json.load(json_file)
60 | return data
61 |
62 |
63 | def save_bboxes_to_file(
64 | filename, centroid, width, length, height, alpha, label, delim=";"
65 | ):
66 |
67 | if centroid is not None:
68 | with open(filename, "w") as the_file:
69 | for c, w, l, h, a, lbl in zip(
70 | centroid, width, length, height, alpha, label
71 | ):
72 | data = (
73 | delim.join(
74 | (
75 | str(c[0]),
76 | str(c[1]),
77 | str(c[2]),
78 | str(l),
79 | str(w),
80 | str(h),
81 | str(a),
82 | str(lbl),
83 | )
84 | )
85 | + "\n"
86 | )
87 | # data = "{};{};{};{};{};{};{};{}\n".format(
88 | # c[0], c[1], c[2], l, w, h, a, lbl
89 | # )
90 | the_file.write(data)
91 |
92 |
93 | def load_bboxes(label_filename, label_string=True):
94 | # Returns an array of shape [num_boxes, bbox_params]
95 | with open(label_filename) as f:
96 | bboxes = np.asarray([line.rstrip().split(";") for line in f])
97 | # Convert labels to numbers
98 | if label_string:
99 | bboxes[:, -1] = [labels[label] for label in bboxes[:, -1]]
100 | bboxes = np.asarray(bboxes, dtype=float)
101 | return bboxes
102 |
103 |
104 | def load_lidar(lidar_filename, dtype=np.float32, n_vec=4):
105 | scan = np.fromfile(lidar_filename, dtype=dtype)
106 | scan = scan.reshape((-1, n_vec))
107 | return scan
108 |
109 |
110 | def save_lidar(lidar_filename, scan):
111 | scan = scan.reshape((-1))
112 | scan.tofile(lidar_filename)
113 |
114 |
115 | def save_to_json(json_filename, dict_to_save):
116 | """
117 | Save to json file
118 | """
119 | with open(json_filename, "w") as f:
120 | json.dump(dict_to_save, f, indent=2)
121 |
122 |
123 | def save_dataset_list(dataset_file, data_list):
124 | """
125 | Saves dataset list to file.
126 | """
127 | with open(dataset_file, "w") as f:
128 | for item in data_list:
129 | f.write("%s\n" % item)
130 |
131 |
132 | def load_dataset_list(dataset_dir, dataset_file, delimiter=";"):
133 | """
134 | The function loads a list of data samples from a dataset file.
135 | Args:
136 | dataset_dir: directory of the dataset file.
137 | dataset_file: name of the dataset file.
138 | Returns:
139 | dataset_list: list of data samples.
140 | """
141 |
142 | file_path = os.path.join(dataset_dir, dataset_file)
143 | dataset_list = []
144 | with open(file_path) as f:
145 | dataset_list = f.readlines()
146 | dataset_list = [x.strip().split(delimiter) for x in dataset_list]
147 | return dataset_list
148 |
--------------------------------------------------------------------------------
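A small sketch (the file name is hypothetical) of the semicolon-separated bbox format that `save_bboxes_to_file` writes and `load_bboxes` reads back; with numeric labels, pass `label_string=False` so no label-name lookup is attempted.

```
import numpy as np
from detection_3d.tools.file_io import save_bboxes_to_file, load_bboxes

save_bboxes_to_file(
    "example_boxes.txt",  # hypothetical output file
    centroid=np.array([[10.0, 5.0, 1.0]]),
    width=np.array([2.0]),
    length=np.array([4.0]),
    height=np.array([1.5]),
    alpha=np.array([0.3]),  # yaw in radians
    label=np.array([1]),  # numeric label
)
# Each line is x;y;z;length;width;height;yaw;label.
boxes = load_bboxes("example_boxes.txt", label_string=False)
print(boxes)  # -> [[10. 5. 1. 4. 2. 1.5 0.3 1.]]
```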
/detection_3d/tools/statics.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 |
20 | NO_SCHEDULER = "no_scheduler"
21 | RESTARTS_SCHEDULER = "restarts"
22 |
23 | ADAM = "adam"
24 |
--------------------------------------------------------------------------------
/detection_3d/tools/summary_helpers.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 | import tensorflow as tf
20 | import numpy as np
21 | from detection_3d.tools.visualization_tools import visualize_2d_boxes_on_top_image
22 |
23 |
24 | def train_summaries(train_out, optimizer, param_settings, learning_rate):
25 | """
26 | Visualizes the train outputs in TensorBoard
27 | """
28 |
29 | writer = tf.summary.create_file_writer(param_settings["train_summaries"])
30 | with writer.as_default():
31 | # Losses
32 | (
33 | obj_loss,
34 | label_loss,
35 | z_loss,
36 | delta_xy_loss,
37 | width_loss,
38 | height_loss,
39 | delta_orient_loss,
40 | ) = train_out["losses"]
41 |
42 | # Show the learning rate for the given scheduler
43 | if param_settings["scheduler"]["name"] != "no_scheduler":
44 | with tf.name_scope("Optimizer info"):
45 | step = float(
46 | optimizer.iterations.numpy()
47 | )  # the learning rate schedule needs a float step
48 | tf.summary.scalar(
49 | "learning_rate", learning_rate(step), step=optimizer.iterations
50 | )
51 | with tf.name_scope("Training losses"):
52 | tf.summary.scalar(
53 | "1.Total loss", train_out["total_loss"], step=optimizer.iterations
54 | )
55 | tf.summary.scalar("2.obj loss", obj_loss, step=optimizer.iterations)
56 | tf.summary.scalar("3.label_loss", label_loss, step=optimizer.iterations)
57 | tf.summary.scalar("4. z_loss", z_loss, step=optimizer.iterations)
58 | tf.summary.scalar(
59 | "5. delta_xy_loss", delta_xy_loss, step=optimizer.iterations
60 | )
61 | tf.summary.scalar("6. width_loss", width_loss, step=optimizer.iterations)
62 | tf.summary.scalar("8. height_loss", height_loss, step=optimizer.iterations)
63 | tf.summary.scalar(
64 | "9. delta_orient_loss", delta_orient_loss, step=optimizer.iterations
65 | )
66 |
67 | if (
68 | param_settings["step_summaries"] is not None
69 | and optimizer.iterations % param_settings["step_summaries"] == 0
70 | ):
71 | bbox_voxel_size = np.asarray(
72 | param_settings["bbox_voxel_size"], dtype=np.float32
73 | )
75 | gt_bboxes = train_out["box_grid"]
76 | p_bboxes = train_out["predictions"]
77 | grid_meters = param_settings["grid_meters"]
78 | top_view = train_out["top_view"]
79 | gt_top_view = visualize_2d_boxes_on_top_image(
80 | gt_bboxes, top_view, grid_meters, bbox_voxel_size,
81 | )
82 |
83 | p_top_view = visualize_2d_boxes_on_top_image(
84 | p_bboxes, top_view, grid_meters, bbox_voxel_size, prediction=True,
85 | )
86 |
87 | # Show GT
88 | with tf.name_scope("1-Ground truth bounding boxes"):
89 | tf.summary.image("Top view", gt_top_view, step=optimizer.iterations)
90 |
91 | with tf.name_scope("2-Predicted bounding boxes"):
92 | tf.summary.image(
93 | "Predicted top view", p_top_view, step=optimizer.iterations
94 | )
95 |
96 |
97 | def epoch_metrics_summaries(param_settings, epoch_metrics, epoch):
98 | """
99 | Visualizes epoch metrics
100 | """
101 | # Train results
102 | writer = tf.summary.create_file_writer(param_settings["train_summaries"])
103 | with writer.as_default():
104 | # Show epoch metrics for train
105 | with tf.name_scope("Epoch metrics"):
106 | tf.summary.scalar(
107 | "1. Loss", epoch_metrics.train_loss.result().numpy(), step=epoch
108 | )
109 |
110 | # Val results
111 | writer = tf.summary.create_file_writer(param_settings["eval_summaries"])
112 | with writer.as_default():
113 | # Show epoch metrics for val
114 | with tf.name_scope("Epoch metrics"):
115 | tf.summary.scalar(
116 | "1. Loss", epoch_metrics.val_loss.result().numpy(), step=epoch
117 | )
118 |
--------------------------------------------------------------------------------
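The helpers above all follow the same TF2 summary pattern; a minimal standalone sketch (the scalar value and step are placeholders):

```
import tensorflow as tf

# Scalars are grouped by name scope and keyed by the training step.
writer = tf.summary.create_file_writer("log/summaries/train")
with writer.as_default():
    with tf.name_scope("Training losses"):
        tf.summary.scalar("1.Total loss", 0.42, step=0)
```

The resulting logs can be inspected with `tensorboard --logdir log/summaries`.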
/detection_3d/tools/training_helpers.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 | import os
20 | import glob
21 | import tensorflow as tf
22 | import copy
23 | from detection_3d.tools.statics import NO_SCHEDULER, RESTARTS_SCHEDULER, ADAM
24 |
25 |
26 | def setup_gpu():
27 | physical_devices = tf.config.experimental.list_physical_devices("GPU")
28 | if len(physical_devices) > 0:
29 | # Will not allocate all memory but only necessary amount
30 | tf.config.experimental.set_memory_growth(physical_devices[0], True)
31 |
32 |
33 | def initialize_model(model, input_shape):
34 | """
35 | Helper for TF2-specific model initialization (needed for the saving mechanism)
36 | """
37 | sample = tf.zeros(input_shape, tf.float32)
38 | model.predict(sample)
39 |
40 |
41 | def load_model(checkpoints_dir, model, resume):
42 | """
43 | Resume the model from the latest checkpoint in checkpoints_dir
44 | """
45 | start_epoch = 0
46 | if resume:
47 | search_string = os.path.join(checkpoints_dir, "*")
48 | checkpoints_list = sorted(glob.glob(search_string))
49 | if len(checkpoints_list) > 0:
50 | current_epoch = int(os.path.split(checkpoints_list[-1])[-1].split("-")[-1])
51 | model = tf.keras.models.load_model(checkpoints_list[-1])
52 | start_epoch = current_epoch + 1 # we should continue from the next epoch
53 | print(f"RESUME TRAINING FROM CHECKPOINT: {checkpoints_list[-1]}.")
54 | else:
55 | print(f"CAN'T RESUME TRAINING! NO CHECKPOINT FOUND! START NEW TRAINING!")
56 | return start_epoch, model
57 |
58 |
59 | def get_optimizer(optimizer_name, scheduler, num_iter_per_epoch):
60 | if scheduler["name"] == NO_SCHEDULER:
61 | learning_rate = scheduler["initial_learning_rate"]
62 | elif scheduler["name"] == RESTARTS_SCHEDULER:
63 | tmp_scheduler = copy.deepcopy(scheduler)
64 | tmp_scheduler["first_decay_steps"] = (
65 | scheduler["first_decay_steps"] * num_iter_per_epoch
66 | )
67 | learning_rate = tf.keras.experimental.CosineDecayRestarts(**tmp_scheduler)
68 |
69 | if optimizer_name == ADAM:
70 | optimizer_type = tf.keras.optimizers.Adam(learning_rate)
71 | else:
72 | raise ValueError("Unknown optimizer: {}".format(optimizer_name))
73 |
74 | return learning_rate, optimizer_type
75 |
--------------------------------------------------------------------------------
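A sketch of `get_optimizer` with the warm-restarts schedule; `first_decay_steps` is specified in epochs and converted to iterations internally. The iteration count per epoch below is an arbitrary example value.

```
from detection_3d.parameters import SchedulerSettings
from detection_3d.tools.statics import ADAM
from detection_3d.tools.training_helpers import get_optimizer

scheduler = SchedulerSettings.restarts_scheduler()
# Assume 500 iterations per epoch for illustration.
learning_rate, optimizer = get_optimizer(ADAM, scheduler, num_iter_per_epoch=500)
print(float(learning_rate(0.0)))  # initial learning rate of the cosine schedule
```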
/detection_3d/tools/visualization_tools.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 | import mayavi.mlab as mlab
20 | import numpy as np
21 | import cv2
22 | import matplotlib.pyplot as plt
23 | import copy
24 | from tqdm import tqdm
25 | from detection_3d.tools.detection_helpers import (
26 | get_boxes_from_box_grid,
27 | make_eight_points_boxes,
28 | )
29 | from detection_3d.data_preprocessing.pandaset_tools.helpers import get_color
30 |
31 |
32 | def visualize_lidar(lidar, figure=None):
33 | """
34 | Draw lidar points
35 | Args:
36 | lidar: numpy array (n,3) of XYZ
37 | figure: mayavi figure handle; if None a new one is created, otherwise it is reused
38 | Returns:
39 | figure: the created or reused figure
40 | """
41 |
42 | if figure is None:
43 | figure = mlab.figure(
44 | figure=None, bgcolor=(0, 0, 0), fgcolor=None, engine=None, size=(1600, 1000)
45 | )
46 |
47 | color = lidar[:, 2]
48 | mlab.points3d(
49 | lidar[:, 0],
50 | lidar[:, 1],
51 | lidar[:, 2],
52 | color,
53 | mode="point",
54 | scale_factor=0.3,
55 | figure=figure,
56 | )
57 |
58 | # draw origin
59 | mlab.points3d(
60 | 0, 0, 0, color=(1, 1, 1), mode="sphere", scale_factor=0.2, figure=figure
61 | )
62 | # draw axis
63 | mlab.plot3d(
64 | [0, 2], [0, 0], [0, 0], color=(1, 0, 0), tube_radius=None, figure=figure
65 | )
66 | mlab.plot3d(
67 | [0, 0], [0, 2], [0, 0], color=(0, 1, 0), tube_radius=None, figure=figure
68 | )
69 | mlab.plot3d(
70 | [0, 0], [0, 0], [0, 2], color=(0, 0, 1), tube_radius=None, figure=figure
71 | )
72 | return figure
73 |
74 |
75 | def visualize_bboxes_3d(lidar_corners_3d, figure=None, orientation=None):
76 | if figure is None:
77 | figure = mlab.figure(
78 | figure=None, bgcolor=(0, 0, 0), fgcolor=None, engine=None, size=(1600, 1000)
79 | )
80 |
81 | for b in tqdm(lidar_corners_3d, desc="Add bboxes", total=len(lidar_corners_3d)):
82 | for k in range(0, 4):
83 | i, j = k, (k + 1) % 4
84 | mlab.plot3d(
85 | [b[i, 0], b[j, 0]],
86 | [b[i, 1], b[j, 1]],
87 | [b[i, 2], b[j, 2]],
88 | color=(1, 1, 1),
89 | tube_radius=None,
90 | line_width=1,
91 | figure=figure,
92 | )
93 |
94 | i, j = k + 4, (k + 1) % 4 + 4
95 | mlab.plot3d(
96 | [b[i, 0], b[j, 0]],
97 | [b[i, 1], b[j, 1]],
98 | [b[i, 2], b[j, 2]],
99 | color=(1, 1, 1),
100 | tube_radius=None,
101 | line_width=1,
102 | figure=figure,
103 | )
104 |
105 | i, j = k, k + 4
106 | mlab.plot3d(
107 | [b[i, 0], b[j, 0]],
108 | [b[i, 1], b[j, 1]],
109 | [b[i, 2], b[j, 2]],
110 | color=(1, 1, 1),
111 | tube_radius=None,
112 | line_width=1,
113 | figure=figure,
114 | )
115 | if orientation is not None:
116 | for o in orientation:
117 | mlab.plot3d(
118 | [o[0, 0], o[1, 0]],
119 | [o[0, 1], o[1, 1]],
120 | [o[0, 2], o[1, 2]],
121 | color=(1, 1, 1),
122 | tube_radius=None,
123 | line_width=1,
124 | figure=figure,
125 | )
126 | print(f"Done")
127 | return figure
128 |
129 |
130 | def draw_boxes_top_view(
131 | top_view_image, boxes_3d, grid_meters, labels, orientation_3d=None
132 | ):
133 | height, width, channels = top_view_image.shape
134 | delimiter_x = grid_meters[0] / height
135 | delimiter_y = grid_meters[1] / width
136 | thickness = 2
137 | for idx, b in enumerate(boxes_3d):
138 | color = get_color(labels[idx]) / 255
139 | b = b[:4]
140 | x = np.floor(b[:, 0] / delimiter_x).astype(int)
141 |
142 | y = np.floor(b[:, 1] / delimiter_y).astype(int)
143 |
144 | cv2.line(top_view_image, (y[0], x[0]), (y[1], x[1]), color, thickness)
145 | cv2.line(top_view_image, (y[1], x[1]), (y[2], x[2]), color, thickness)
146 | cv2.line(top_view_image, (y[2], x[2]), (y[3], x[3]), color, thickness)
147 | cv2.line(top_view_image, (y[3], x[3]), (y[0], x[0]), color, thickness)
148 |
149 | if orientation_3d is not None:
150 | for o in orientation_3d:
151 | x = np.floor(o[:, 0] / delimiter_x).astype(int)
152 | y = np.floor(o[:, 1] / delimiter_y).astype(int)
153 | cv2.arrowedLine(
154 | top_view_image, (y[0], x[0]), (y[1], x[1]), (1, 0, 0), thickness
155 | )
156 | return top_view_image
157 |
158 |
159 | def visualize_2d_boxes_on_top_image(
160 | bboxes_grid, top_view, grid_meters, bbox_voxel_size, prediction=False
161 | ):
162 | top_image_vis = []
163 | for boxes, top_image in zip(bboxes_grid, top_view): # iterate over batch
164 | top_image = top_image.numpy()
165 | shape = top_image.shape
166 | rgb_image = np.zeros((shape[0], shape[1], 3))
167 | rgb_image[top_image[:, :, 0] > 0] = 1
168 |
169 | box, labels, _ = get_boxes_from_box_grid(boxes, bbox_voxel_size)
170 | box = box.numpy()
171 | box, orientation_3d = make_eight_points_boxes(box)
172 |
173 | if prediction:
174 | labels = np.argmax(labels, axis=-1)
175 | if len(box) > 0:
176 | rgb_image = draw_boxes_top_view(
177 | rgb_image, box, grid_meters, labels, orientation_3d
178 | )
179 |
180 | # rgb_image = np.rot90(rgb_image)
181 | top_image_vis.append(rgb_image)
182 | return np.asarray(top_image_vis)
183 |
184 |
185 | def visualize_bboxes_on_image(image, bboxes_2d, labels, orientation_2d=None):
186 | """
187 | The function visualizes the reprojected 3d bounding boxes
188 | on a 2d image.
189 | Arguments:
190 | image: the tensor of the shape [height, width, 3]
191 | bboxes_2d: the reprojected bboxes of the shape [num_boxes, 8, 2]
192 | Returns:
193 | image: the tensor with drawn bboxes of the shape [height, width, 3]
194 | """
195 |
196 | height, width, _ = image.shape
197 | thickness = 2
198 | boundaries = np.asarray([width, height])
199 | for idx, b in enumerate(bboxes_2d):
200 | color = get_color(labels[idx]) / 255
201 |
202 | b = b.astype(np.int32)
203 | first_square = False
204 | second_square = False
205 | if (
206 | (b[0] >= 0).all() & (b[0] < boundaries).all()
207 | or (b[1] >= 0).all() & (b[1] < boundaries).all()
208 | or (b[4] >= 0).all() & (b[4] < boundaries).all()
209 | or (b[5] >= 0).all() & (b[5] < boundaries).all()
210 | ):
211 | first_square = True
212 | cv2.line(image, (b[0, 0], b[0, 1]), (b[1, 0], b[1, 1]), color, thickness)
213 | cv2.line(image, (b[4, 0], b[4, 1]), (b[0, 0], b[0, 1]), color, thickness)
214 | cv2.line(image, (b[5, 0], b[5, 1]), (b[1, 0], b[1, 1]), color, thickness)
215 | cv2.line(image, (b[4, 0], b[4, 1]), (b[5, 0], b[5, 1]), color, thickness)
216 | if (
217 | (b[2] >= 0).all() & (b[2] < boundaries).all()
218 | or (b[3] >= 0).all() & (b[3] < boundaries).all()
219 | or (b[6] >= 0).all() & (b[6] < boundaries).all()
220 | or (b[7] >= 0).all() & (b[7] < boundaries).all()
221 | ):
222 | second_square = True
223 | cv2.line(image, (b[2, 0], b[2, 1]), (b[3, 0], b[3, 1]), color, thickness)
224 | cv2.line(image, (b[6, 0], b[6, 1]), (b[2, 0], b[2, 1]), color, thickness)
225 | cv2.line(image, (b[7, 0], b[7, 1]), (b[3, 0], b[3, 1]), color, thickness)
226 | cv2.line(image, (b[7, 0], b[7, 1]), (b[6, 0], b[6, 1]), color, thickness)
227 |
228 | if first_square and second_square:
229 | cv2.line(image, (b[0, 0], b[0, 1]), (b[3, 0], b[3, 1]), color, thickness)
230 | cv2.line(image, (b[1, 0], b[1, 1]), (b[2, 0], b[2, 1]), color, thickness)
231 | cv2.line(image, (b[4, 0], b[4, 1]), (b[7, 0], b[7, 1]), color, thickness)
232 | cv2.line(image, (b[5, 0], b[5, 1]), (b[6, 0], b[6, 1]), color, thickness)
233 |
234 | if orientation_2d is not None:
235 | for o in orientation_2d:
236 | o = o.astype(np.int32)
237 | if (
238 | (o[0] >= 0).all()
239 | & (o[0] < boundaries).all()
240 | & (o[1] >= 0).all()
241 | & (o[1] < boundaries).all()
242 | ):
243 | cv2.arrowedLine(
244 | image, (o[0, 0], o[0, 1]), (o[1, 0], o[1, 1]), (1, 0, 0), thickness
245 | )
246 |
247 | return image
248 |
--------------------------------------------------------------------------------
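A sketch of the 3d viewer above on a point cloud loaded from disk. It requires a working mayavi install; the `.bin` path and the dummy box are illustrative only.

```
import numpy as np
import mayavi.mlab as mlab
from detection_3d.tools.file_io import load_lidar
from detection_3d.tools.visualization_tools import (
    visualize_lidar,
    visualize_bboxes_3d,
)

lidar = load_lidar("dataset/sequence/lidar/00.bin")  # hypothetical path, [N, 4]
figure = visualize_lidar(lidar)  # points colored by height, plus origin axes
corners = np.random.uniform(0, 10, size=(1, 8, 3))  # one dummy box
visualize_bboxes_3d(corners, figure=figure)
mlab.show()
```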
/detection_3d/train.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 | import argparse
20 | import os
21 | import tensorflow as tf
22 | from detection_3d.parameters import Parameters
23 | from detection_3d.detection_dataset import DetectionDataset
24 | from detection_3d.tools.detection_helpers import get_voxels_grid
25 | from detection_3d.model import YoloV3_Lidar
26 | from detection_3d.tools.training_helpers import (
27 | setup_gpu,
28 | initialize_model,
29 | load_model,
30 | get_optimizer,
31 | )
32 | from detection_3d.losses import detection_loss
33 | from detection_3d.tools.summary_helpers import train_summaries, epoch_metrics_summaries
34 | from detection_3d.metrics import EpochMetrics
35 | from tqdm import tqdm
36 |
37 |
38 | @tf.function
39 | def train_step(param_settings, train_samples, model, optimizer, epoch_metrics=None):
40 |
41 | with tf.GradientTape() as tape:
42 | top_view, box_grid, _ = train_samples
43 | predictions = model(top_view, training=True)
44 | (
45 | obj_loss,
46 | label_loss,
47 | z_loss,
48 | delta_xy_loss,
49 | width_loss,
50 | height_loss,
51 | delta_orient_loss,
52 | ) = detection_loss(box_grid, predictions)
53 | losses = [
54 | obj_loss,
55 | label_loss,
56 | z_loss,
57 | delta_xy_loss,
58 | width_loss,
59 | height_loss,
60 | delta_orient_loss,
61 | ]
62 | total_detection_loss = tf.reduce_sum(losses)
63 | # Get L2 losses for weight decay
64 | total_loss = total_detection_loss + tf.add_n(model.losses)
65 |
66 | gradients = tape.gradient(total_loss, model.trainable_variables)
67 | optimizer.apply_gradients(zip(gradients, model.trainable_variables))
68 | if epoch_metrics is not None:
69 | epoch_metrics.train_loss(total_detection_loss)
70 |
71 | train_outputs = {
72 | "total_loss": total_loss,
73 | "losses": losses,
74 | "box_grid": box_grid,
75 | "predictions": predictions,
76 | "top_view": top_view,
77 | }
78 |
79 | return train_outputs
80 |
81 |
82 | @tf.function
83 | def val_step(samples, model, epoch_metrics=None):
84 |
85 | top_view, box_grid, _ = samples
86 | predictions = model(top_view, training=False)
87 | (
88 | obj_loss,
89 | label_loss,
90 | z_loss,
91 | delta_xy_loss,
92 | width_loss,
93 | height_loss,
94 | delta_orient_loss,
95 | ) = detection_loss(box_grid, predictions)
96 | losses = [
97 | obj_loss,
98 | label_loss,
99 | z_loss,
100 | delta_xy_loss,
101 | width_loss,
102 | height_loss,
103 | delta_orient_loss,
104 | ]
105 | total_detection_loss = tf.reduce_sum(losses)
106 |
107 | if epoch_metrics is not None:
108 | epoch_metrics.val_loss(total_detection_loss)
109 |
110 |
111 | def train(resume=False):
112 | setup_gpu()
113 | # General parameters
114 | param = Parameters()
115 |
116 | # Fix the random seed for reproducibility
117 | tf.random.set_seed(param.settings["seed"])
118 |
119 | train_dataset = DetectionDataset(
120 | param.settings,
121 | "train.datatxt",
122 | augmentation=param.settings["augmentation"],
123 | shuffle=True,
124 | )
125 |
126 | param.settings["train_size"] = train_dataset.num_samples
127 | val_dataset = DetectionDataset(param.settings, "val.datatxt", shuffle=False)
128 | param.settings["val_size"] = val_dataset.num_samples
129 |
130 | model = YoloV3_Lidar(weight_decay=param.settings["weight_decay"])
131 | voxels_grid = get_voxels_grid(
132 | param.settings["voxel_size"], param.settings["grid_meters"]
133 | )
134 | input_shape = [1, voxels_grid[0], voxels_grid[1], 2]
135 | initialize_model(model, input_shape)
136 | model.summary()
137 | start_epoch, model = load_model(param.settings["checkpoints_dir"], model, resume)
138 | model_path = os.path.join(param.settings["checkpoints_dir"], "{model}-{epoch:04d}")
139 |
140 | learning_rate, optimizer = get_optimizer(
141 | param.settings["optimizer"],
142 | param.settings["scheduler"],
143 | train_dataset.num_it_per_epoch,
144 | )
145 | epoch_metrics = EpochMetrics()
146 |
147 | for epoch in range(start_epoch, param.settings["max_epochs"]):
148 | save_dir = model_path.format(model=model.name, epoch=epoch)
149 | epoch_metrics.reset()
150 | for train_samples in tqdm(
151 | train_dataset.dataset,
152 | desc=f"Epoch {epoch}",
153 | total=train_dataset.num_it_per_epoch,
154 | ):
155 | train_outputs = train_step(
156 | param.settings, train_samples, model, optimizer, epoch_metrics
157 | )
158 | train_summaries(train_outputs, optimizer, param.settings, learning_rate)
159 | for val_samples in tqdm(
160 | val_dataset.dataset, desc="Validation", total=val_dataset.num_it_per_epoch
161 | ):
162 | val_step(val_samples, model, epoch_metrics)
163 | epoch_metrics_summaries(param.settings, epoch_metrics, epoch)
164 | epoch_metrics.print_metrics()
165 | # Save all
166 | param.save_to_json(save_dir)
167 | epoch_metrics.save_to_json(save_dir)
168 | model.save(save_dir)
169 |
170 |
171 | if __name__ == "__main__":
172 | parser = argparse.ArgumentParser(description="Train CNN.")
173 | parser.add_argument(
174 | "--resume",
175 | type=lambda x: x,
176 | nargs="?",
177 | const=True,
178 | default=False,
179 | help="Activate nice mode.",
180 | )
181 | args = parser.parse_args()
182 | train(resume=args.resume)
183 |
--------------------------------------------------------------------------------
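Assuming `train.datatxt` and `val.datatxt` have already been generated in the dataset directory (see `detection_3d/create_dataset_lists.py`), training starts with `python detection_3d/train.py` and resumes from the latest checkpoint with `python detection_3d/train.py --resume True`.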
/detection_3d/validation_inferece.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | __copyright__ = """
3 | Copyright (c) 2020 Tananaev Denis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies
9 | of the Software, and to permit persons to whom the Software is furnished to do so,
10 | subject to the following conditions: The above copyright notice and this permission
11 | notice shall be included in all copies or substantial portions of the Software.
12 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
13 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
14 | PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE
15 | FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
16 | OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
17 | DEALINGS IN THE SOFTWARE.
18 | """
19 |
20 | import argparse
21 | import os
22 | import numpy as np
23 | import tensorflow as tf
24 | from detection_3d.parameters import Parameters
25 | from detection_3d.tools.training_helpers import setup_gpu
26 | from detection_3d.detection_dataset import DetectionDataset
27 | from detection_3d.tools.visualization_tools import visualize_2d_boxes_on_top_image
28 | from detection_3d.tools.file_io import save_bboxes_to_file
29 | from detection_3d.tools.detection_helpers import (
30 | make_eight_points_boxes,
31 | get_boxes_from_box_grid,
32 | get_bboxes_parameters_from_points,
33 | )
34 | from PIL import Image
35 | from tqdm import tqdm
37 |
38 |
39 | def validation_inference(param_settings, dataset_file, model_dir, output_dir):
40 | setup_gpu()
41 |
42 | # Load model
43 | model = tf.keras.models.load_model(model_dir)
44 | bbox_voxel_size = np.asarray(param_settings["bbox_voxel_size"], dtype=np.float32)
45 | lidar_coord = np.array(param_settings["lidar_offset"], dtype=np.float32)
46 | grid_meters = param_settings["grid_meters"]
47 |
48 | val_dataset = DetectionDataset(param_settings, dataset_file, shuffle=False)
49 | param_settings["val_size"] = val_dataset.num_samples
50 | for val_samples in tqdm(
51 | val_dataset.dataset, desc="val_inference", total=val_dataset.num_it_per_epoch,
52 | ):
53 | top_view, gt_boxes, lidar_filenames = val_samples
54 | predictions = model(top_view, training=False)
55 | for image, predict, gt, filename in zip(
56 | top_view, predictions, gt_boxes, lidar_filenames
57 | ):
58 | filename = filename.numpy().decode("utf-8")
59 | seq_folder = filename.split("/")[-3]
60 | name = os.path.splitext(os.path.basename(filename))[0]
61 | # Ensure that output dir exists or create it
62 | top_view_dir = os.path.join(output_dir, "top_view", seq_folder)
63 | bboxes_dir = os.path.join(output_dir, "bboxes", seq_folder)
64 | os.makedirs(top_view_dir, exist_ok=True)
65 | os.makedirs(bboxes_dir, exist_ok=True)
66 | p_top_view = (
67 | visualize_2d_boxes_on_top_image(
68 | [predict], [image], grid_meters, bbox_voxel_size, prediction=True,
69 | )
70 | * 255
71 | )
72 | gt_top_view = (
73 | visualize_2d_boxes_on_top_image(
74 | [gt], [image], grid_meters, bbox_voxel_size, prediction=False,
75 | )
76 | * 255
77 | )
78 | result = np.vstack((p_top_view[0], gt_top_view[0]))
79 | file_to_save = os.path.join(top_view_dir, name + ".png")
80 | img = Image.fromarray(result.astype("uint8"))
81 | img.save(file_to_save)
82 |
83 | box, labels, _ = get_boxes_from_box_grid(predict, bbox_voxel_size)
84 | box = box.numpy()
85 | box, _ = make_eight_points_boxes(box)
86 | if len(box) > 0:
87 | box = box - lidar_coord[:3]
88 | labels = np.argmax(labels, axis=-1)
89 | (
90 | centroid,
91 | width,
92 | length,
93 | height,
94 | yaw,
95 | ) = get_bboxes_parameters_from_points(box)
96 | bboxes_name = os.path.join(bboxes_dir, name + ".txt")
97 | save_bboxes_to_file(
98 | bboxes_name, centroid, width, length, height, yaw, labels
99 | )
100 |
101 |
102 | if __name__ == "__main__":
103 | parser = argparse.ArgumentParser(description="Inference validation set.")
104 | parser.add_argument(
105 | "--dataset_file", default="val.datatxt",
106 | )
107 |
108 | parser.add_argument("--output_dir", default="inference")
109 |
110 | parser.add_argument(
111 | "--model_dir", default="YoloV3_Lidar-0085",
112 | )
113 | args = parser.parse_args()
114 |
115 | param_settings = Parameters().settings
116 | validation_inference(
117 | param_settings, args.dataset_file, args.model_dir, args.output_dir
118 | )
119 |
--------------------------------------------------------------------------------
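With the defaults above, `python detection_3d/validation_inferece.py --model_dir <saved_model_dir>` runs the model over the validation list and writes stacked predicted/ground-truth top-view images under `inference/top_view` and the predicted boxes as `.txt` files under `inference/bboxes`.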
/pictures/box_parametrization.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Dtananaev/lidar_dynamic_objects_detection/3b8f3d5fcce0fb914bb83e5d43a3ca652739139e/pictures/box_parametrization.png
--------------------------------------------------------------------------------
/pictures/result.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Dtananaev/lidar_dynamic_objects_detection/3b8f3d5fcce0fb914bb83e5d43a3ca652739139e/pictures/result.png
--------------------------------------------------------------------------------
/pictures/topview.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Dtananaev/lidar_dynamic_objects_detection/3b8f3d5fcce0fb914bb83e5d43a3ca652739139e/pictures/topview.png
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | import setuptools
2 |
3 | with open("README.md", "r") as fh:
4 | long_description = fh.read()
5 |
6 | setuptools.setup(
7 | name="detection_3d-Denis-Tananaev",
8 | version="0.0.1",
9 | author="Denis Tananaev",
10 | author_email="d.d.tananaev@gmail.com",
11 | description="3D bbox detection with Lidar",
12 | long_description=long_description,
13 | long_description_content_type="text/markdown",
14 | url="https://github.com/Dtananaev/lidar_dynamic_objects_detection",
15 | packages=setuptools.find_packages(),
16 | classifiers=[
17 | "Programming Language :: Python :: 3",
18 | "License :: OSI Approved :: MIT License",
19 | "Operating System :: OS Independent",
20 | ],
21 | python_requires='>=3.6',
22 | )
23 |
--------------------------------------------------------------------------------