├── README.md
├── datasets
│   ├── __pycache__
│   │   ├── base_loader.cpython-39.pyc
│   │   ├── infobatch.cpython-39.pyc
│   │   └── lidarloc.cpython-39.pyc
│   ├── base_loader.py
│   ├── lidarloc.py
│   ├── robotcar_sdk
│   │   ├── extrinsics
│   │   │   ├── ins.txt
│   │   │   ├── ldmrs.txt
│   │   │   ├── lms_front.txt
│   │   │   ├── lms_rear.txt
│   │   │   ├── mono_left.txt
│   │   │   ├── mono_rear.txt
│   │   │   ├── mono_right.txt
│   │   │   ├── nlct_velodyne.txt
│   │   │   ├── radar.txt
│   │   │   ├── stereo.txt
│   │   │   ├── velodyne_left.txt
│   │   │   └── velodyne_right.txt
│   │   └── python
│   │       ├── README.md
│   │       ├── __init__.py
│   │       ├── __pycache__
│   │       │   ├── __init__.cpython-310.pyc
│   │       │   ├── __init__.cpython-36.pyc
│   │       │   ├── __init__.cpython-37.pyc
│   │       │   ├── __init__.cpython-38.pyc
│   │       │   ├── __init__.cpython-39.pyc
│   │       │   ├── interpolate_poses.cpython-36.pyc
│   │       │   ├── interpolate_poses.cpython-37.pyc
│   │       │   ├── interpolate_poses.cpython-38.pyc
│   │       │   ├── transform.cpython-310.pyc
│   │       │   ├── transform.cpython-36.pyc
│   │       │   ├── transform.cpython-37.pyc
│   │       │   ├── transform.cpython-38.pyc
│   │       │   ├── transform.cpython-39.pyc
│   │       │   ├── velodyne.cpython-310.pyc
│   │       │   ├── velodyne.cpython-36.pyc
│   │       │   ├── velodyne.cpython-37.pyc
│   │       │   └── velodyne.cpython-38.pyc
│   │       ├── build_pointcloud.py
│   │       ├── camera_model.py
│   │       ├── image.py
│   │       ├── interpolate_poses.py
│   │       ├── play_images.py
│   │       ├── play_radar.py
│   │       ├── play_velodyne.py
│   │       ├── project_laser_into_camera.py
│   │       ├── radar.py
│   │       ├── requirements.txt
│   │       ├── transform.py
│   │       └── velodyne.py
│   └── rsd.py
├── img
│   ├── Pipline.png
│   └── Teaser.png
├── install.sh
├── models
│   ├── __pycache__
│   │   ├── ace.cpython-310.pyc
│   │   ├── ace.cpython-39.pyc
│   │   ├── loss.cpython-39.pyc
│   │   └── sc2pcr.cpython-39.pyc
│   ├── lightloc.py
│   ├── loss.py
│   └── sc2pcr.py
├── nclt_process.py
├── test.py
├── train.py
└── utils
    ├── __init__.py
    ├── __pycache__
    │   ├── __init__.cpython-39.pyc
    │   ├── augmentor.cpython-39.pyc
    │   ├── pose_util.cpython-39.pyc
    │   └── train_util.cpython-39.pyc
    ├── augmentor.py
    ├── pose_util.py
    └── train_util.py
/README.md:
--------------------------------------------------------------------------------
1 | # LightLoc: Learning Outdoor LiDAR Localization at Light Speed
2 |
3 | **[Paper](https://arxiv.org/pdf/2503.17814)** | **[Poster](https://drive.google.com/file/d/1Q5B6Ec4NN-NArkpaAjupIMamdCnwTknp/view?usp=sharing)**
4 |
5 |
6 |
7 | **LightLoc** is a fast and efficient framework for outdoor LiDAR-based localization. This repository provides the implementation of the LightLoc method, as described in our paper.
8 |
9 | ---
10 | ## Environment
11 |
12 | ### Requirements
13 | - Python 3.8
14 |
15 | - PyTorch 1.11
16 |
17 | - CUDA 11.3
18 | ### Installation
19 | To set up the environment, run:
20 | ```
21 | source install.sh
22 | ```
23 |
24 | ## Supported Datasets
25 |
26 | LightLoc currently supports the following datasets:
27 | - [Oxford Radar RobotCar](https://oxford-robotics-institute.github.io/radar-robotcar-dataset/datasets)
28 | - [NCLT](https://robots.engin.umich.edu/nclt/)
29 |
30 | ### Dataset Structure
31 | Organize the dataset directories as follows:
32 |
33 | - Oxford
34 | ```
35 | data_root
36 | ├── 2019-01-11-14-02-26-radar-oxford-10k
37 | │ ├── velodyne_left
38 | │ │ ├── xxx.bin
39 | │ │ ├── xxx.bin
40 | │ │ ├── …
41 | │ ├── velodyne_left_calibrateFalse.h5
42 | │ ├── velodyne_left_False.h5
43 | ├── …
44 | ```
45 | - NCLT
46 | ```
47 | data_root
48 | ├── 2012-01-22
49 | │ ├── velodyne_left
50 | │ │ ├── xxx.bin
51 | │ │ ├── xxx.bin
52 | │ │ ├── …
53 | │ ├── velodyne_sync
54 | │ │ ├── xxx.bin
55 | │ │ ├── xxx.bin
56 | │ │ ├── …
57 | │ ├── velodyne_left_False.h5
58 | ├── …
59 | ```
60 | **Note:**
61 | - The h5 ground-truth files are provided in this [groundtruth folder](https://drive.google.com/drive/folders/1IAPbppgy88fr3KEgcKHJHUvdC0q1TJTo?usp=sharing).
62 | - We use [nclt_process.py](nclt_process.py) to preprocess the `velodyne_sync` scans and speed up data reading:
63 | ```
64 | python nclt_process.py
65 | ```
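
For reference, here is a minimal sketch of what the ground-truth h5 files contain, using the dataset names read in [datasets/lidarloc.py](datasets/lidarloc.py) (`data_root` and the sequence name are placeholders for your local layout):

```python
import h5py
import os.path as osp

data_root = '/path/to/data_root'  # placeholder
seq = '2019-01-11-14-02-26-radar-oxford-10k'

# Each file stores one LiDAR timestamp and one pose row per scan;
# the pose layout differs between the Oxford and QEOxford variants.
with h5py.File(osp.join(data_root, seq, 'velodyne_left_False.h5'), 'r') as f:
    ts = f['valid_timestamps'][:]
    poses = f['poses'][:]

# Scans are stored as <timestamp>.bin files inside the velodyne_left folder.
print(len(ts), poses.shape)
```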
66 |
67 | ## Train
68 |
69 | #### (QE)Oxford
70 |
71 | - Train classifier
72 | ```
73 | python train.py --sample_cls=True --generate_clusters=True --batch_size=512 --epochs=50 --level_cluster=25
74 | ```
75 | - Train regressor
76 | ```
77 | python train.py --sample_cls=False --generate_clusters=False --batch_size=256 --epochs=25 --rsd=True --level_cluster=25
78 | ```
79 | **Note:** To use the QEOxford dataset, uncomment line 81 in [lidarloc.py](datasets/lidarloc.py).
80 | #### NCLT
81 |
82 | - Train classifier
83 | ```
84 | python train.py --sample_cls=True --generate_clusters=True --batch_size=512 --epochs=50 --level_cluster=100 --training_buffer_size=120000 --voxel_size=0.3
85 | ```
86 | - Train regressor
87 | ```
88 | python train.py --sample_cls=False --generate_clusters=False --batch_size=256 --epochs=30 --rsd=True --prune_ratio=0.15 --level_cluster=100 --voxel_size=0.3
89 | ```
90 | ## Test
91 | #### (QE)Oxford
92 | ```
93 | python test.py
94 | ```
95 | #### NCLT
96 | ```
97 | python test.py --voxel_size=0.3
98 | ```
99 | **Note:** For results matching Table 3 in the paper, exclude lines 4300–4500 from the 2012-05-26 trajectory.
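
A minimal sketch of how that exclusion could be applied when computing error statistics (assuming the range refers to scan indices of the 2012-05-26 test sequence; the array names are placeholders):

```python
import numpy as np

def drop_excluded_frames(est_poses: np.ndarray, gt_poses: np.ndarray,
                         start: int = 4300, stop: int = 4500):
    # Keep everything except the excluded index range; adjust `stop` to 4501
    # if the range is meant to be inclusive.
    keep = np.ones(len(gt_poses), dtype=bool)
    keep[start:stop] = False
    return est_poses[keep], gt_poses[keep]
```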
100 |
101 | ## Model Zoo
102 | Pretrained models (backbone and scene-specific prediction heads) are available for download [here](https://drive.google.com/drive/folders/1ZuEs7NbVGO8afqTZM0xCYJgeLtNzPbt0?usp=sharing).
103 |
104 | ## Citation
105 | If you find this codebase useful in your research, please cite our work:
106 | ```
107 | @article{li2025lightloc,
108 | title={LightLoc: Learning Outdoor LiDAR Localization at Light Speed},
109 | author={Li, Wen and Liu, Chen and Yu, Shangshu and Liu, Dunqiang and Zhou, Yin and Shen, Siqi and Wen, Chenglu and Wang, Cheng},
110 | journal={arXiv preprint arXiv:2503.17814},
111 | year={2025}
112 | }
113 | ```
114 | This code builds on previous LiDAR localization pipelines, namely SGLoc and DiffLoc. Please consider citing:
115 | ```
116 | @inproceedings{li2024diffloc,
117 | title={DiffLoc: Diffusion Model for Outdoor LiDAR Localization},
118 | author={Li, Wen and Yang, Yuyang and Yu, Shangshu and Hu, Guosheng and Wen, Chenglu and Cheng, Ming and Wang, Cheng},
119 | booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
120 | pages={15045--15054},
121 | year={2024}
122 | }
123 |
124 | @inproceedings{li2023sgloc,
125 | title={SGLoc: Scene Geometry Encoding for Outdoor LiDAR Localization},
126 | author={Li, Wen and Yu, Shangshu and Wang, Cheng and Hu, Guosheng and Shen, Siqi and Wen, Chenglu},
127 | booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
128 | pages={9286--9295},
129 | year={2023}
130 | }
131 | ```
132 |
--------------------------------------------------------------------------------
/datasets/__pycache__/base_loader.cpython-39.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/__pycache__/base_loader.cpython-39.pyc
--------------------------------------------------------------------------------
/datasets/__pycache__/infobatch.cpython-39.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/__pycache__/infobatch.cpython-39.pyc
--------------------------------------------------------------------------------
/datasets/__pycache__/lidarloc.cpython-39.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/__pycache__/lidarloc.cpython-39.pyc
--------------------------------------------------------------------------------
/datasets/base_loader.py:
--------------------------------------------------------------------------------
1 | import logging
2 | import torch
3 | import numpy as np
4 | import MinkowskiEngine as ME
5 |
6 |
7 | class CollationFunctionFactory:
8 | def __init__(self, collation_type='default'):
9 | if collation_type == 'default':
10 | self.collation_fn = self.collate_default
11 | elif collation_type == 'collate_pair_cls':
12 | self.collation_fn = self.collate_pair_cls_fn
13 | elif collation_type == 'collate_pair_reg':
14 | self.collation_fn = self.collate_pair_reg_fn
15 |
16 | def __call__(self, list_data):
17 | return self.collation_fn(list_data)
18 |
19 | def collate_default(self, list_data):
20 | return list_data
21 |
22 | def collate_pair_cls_fn(self, list_data):
23 | list_data = [list_data]
24 | N = len(list_data)
25 | list_data = [data for data in list_data if data is not None]
26 | if N != len(list_data):
27 | logging.info(f"Retain {len(list_data)} from {N} data.")
28 |
29 | if len(list_data) == 0:
30 | raise ValueError('No data in the batch')
31 |
32 | coords, feats, lbl, pose, scene = list(zip(*list_data))
33 | lbl_batch = torch.from_numpy(np.stack(lbl)).to(torch.int64)
34 | pose_batch = torch.from_numpy(np.stack(pose)).float()
35 | self.num_scenes = pose_batch.size(1)
36 | coords_batch = self.batched_seq_coordinates(coords, scene)
37 | feats_batch = torch.from_numpy(np.concatenate(feats, 0)).float()
38 |
39 | return {
40 | 'sinput_C': coords_batch,
41 | 'sinput_F': feats_batch,
42 | 'lbl': lbl_batch,
43 | 'pose': pose_batch
44 | }
45 |
46 | def collate_pair_reg_fn(self, list_data):
47 | N = len(list_data)
48 | list_data = [data for data in list_data if data is not None]
49 | if N != len(list_data):
50 | logging.info(f"Retain {len(list_data)} from {N} data.")
51 |
52 | if len(list_data) == 0:
53 | raise ValueError('No data in the batch')
54 |
55 | coords, feats, lbl, idx, pose = list(zip(*list_data))
56 | lbl_batch = torch.from_numpy(np.stack(lbl)).to(torch.int64)
57 | idx_batch = torch.from_numpy(np.stack(idx)).to(torch.int64)
58 | pose_batch = torch.from_numpy(np.stack(pose)).float()
59 | coords_batch = ME.utils.batched_coordinates(coords)
60 | feats_batch = torch.from_numpy(np.concatenate(feats, 0)).float()
61 |
62 | return {
63 | 'sinput_C': coords_batch,
64 | 'sinput_F': feats_batch,
65 | 'lbl': lbl_batch,
66 | 'idx': idx_batch,
67 | 'pose': pose_batch
68 | }
69 |
70 | def batched_seq_coordinates(self, coords, scenes, dtype=torch.int32, device=None):
71 | D = np.unique(np.array([cs.shape[1] for cs in coords]))
72 | D = D[0]
73 | if device is None:
74 | if isinstance(coords, torch.Tensor):
75 | device = coords[0].device
76 | else:
77 | device = "cpu"
78 | N = np.array([len(cs) for cs in coords]).sum()
79 | bcoords = torch.zeros((N, D + 1), dtype=dtype, device=device) # uninitialized
80 | s = 0
81 | scenes = list(scenes)
82 | for i in range(len(scenes)):
83 | scenes[i] = list(scenes[i])
84 | if i > 0:
85 | scenes[i] = [x + scenes[i-1][-1] for x in scenes[i]]
86 | for b, cs in enumerate(coords):
87 | if dtype == torch.int32:
88 | if isinstance(cs, np.ndarray):
89 | cs = torch.from_numpy(np.floor(cs))
90 | elif not (
91 | isinstance(cs, torch.IntTensor) or isinstance(cs, torch.LongTensor)
92 | ):
93 | cs = cs.floor()
94 |
95 | cs = cs.int()
96 | else:
97 | if isinstance(cs, np.ndarray):
98 | cs = torch.from_numpy(cs)
99 | if b > 0:
100 | for i in range(self.num_scenes):
101 | bcoords[scenes[b][i]: scenes[b][i + 1], 1:] = cs[scenes[b][i] - scenes[b - 1][-1]:
102 | scenes[b][i + 1] - scenes[b - 1][-1]]
103 | bcoords[scenes[b][i]: scenes[b][i + 1], 0] = s
104 | s += 1
105 | else:
106 | for i in range(self.num_scenes):
107 | bcoords[scenes[b][i]: scenes[b][i + 1], 1:] = cs[scenes[b][i]: scenes[b][i + 1]]
108 | bcoords[scenes[b][i]: scenes[b][i + 1], 0] = s
109 | s += 1
110 | return bcoords
--------------------------------------------------------------------------------
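A minimal sketch of how the dict returned by `collate_pair_reg_fn` above might be consumed with MinkowskiEngine; only the batch keys (`sinput_C`, `sinput_F`, `lbl`, `pose`) come from the code above, the rest is illustrative:

```python
import MinkowskiEngine as ME

def batch_to_sparse_tensor(batch: dict, device: str = 'cuda'):
    """Turn one collated regression batch into an ME.SparseTensor plus the
    tensors used for supervision (illustrative helper, not part of the repo)."""
    sinput = ME.SparseTensor(features=batch['sinput_F'].to(device),
                             coordinates=batch['sinput_C'].to(device))
    pose = batch['pose'].to(device)  # (B, 6) ground-truth poses
    lbl = batch['lbl'].to(device)    # (B, 1) cluster labels
    return sinput, pose, lbl
```
--------------------------------------------------------------------------------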
/datasets/lidarloc.py:
--------------------------------------------------------------------------------
1 | import h5py
2 | import logging
3 | import numpy as np
4 | import os.path as osp
5 | import concurrent.futures
6 | import MinkowskiEngine as ME
7 | from torch.utils.data import Dataset
8 | from sklearn.cluster import KMeans
9 |
10 | from utils.pose_util import process_poses, calibrate_process_poses
11 | from utils.augmentor import Augmentor, AugmentParams
12 |
13 | _logger = logging.getLogger(__name__)
14 | BASE_DIR = osp.dirname(osp.abspath(__file__))
15 |
16 |
17 | def compute_distances_chunk(xyz, centers):
18 | """
19 |     Compute the (x, y) distance from each point in xyz to every center and return the index of the nearest center for each point
20 | """
21 | dist_mat = np.linalg.norm(xyz[:, np.newaxis, :2] - centers[np.newaxis, :, :2], axis=-1)
22 | lbl_chunk = np.argmin(dist_mat, axis=1)
23 |
24 | return lbl_chunk
25 |
26 |
27 | def parallel_kmeans(xyz, centers, n_jobs=8):
28 | """
29 |     Use concurrent.futures to compute the distances in parallel and return the index of the nearest center for each point.
30 | """
31 | chunk_size = len(xyz) // n_jobs
32 | xyz_chunks = [xyz[i:i + chunk_size] for i in range(0, len(xyz), chunk_size)]
33 |
34 | with concurrent.futures.ProcessPoolExecutor(max_workers=n_jobs) as executor:
35 | results = list(executor.map(compute_distances_chunk, xyz_chunks, [centers] * len(xyz_chunks)))
36 |
37 | lbl = np.concatenate(results, axis=0)
38 |
39 | return lbl
40 |
41 |
42 | class LiDARLocDataset(Dataset):
43 | """LiDAR localization dataset.
44 |
45 | Access to point clouds, calibration and ground truth data given a dataset directory
46 | """
47 |
48 | def __init__(self,
49 | root_dir,
50 | train=True,
51 | sample_cls=False,
52 | augment=False,
53 | aug_rotation=10,
54 | aug_translation=1,
55 | generate_clusters=False,
56 | level_clusters=25,
57 | voxel_size=0.25):
58 |
59 | # Only support Oxford and NCLT
60 | self.root_dir = root_dir
61 |
62 | self.train = train
63 | self.augment = augment
64 | self.aug_rotation = aug_rotation
65 | self.aug_translation = aug_translation
66 | self.voxel_size = voxel_size
67 |
68 | # which dataset?
69 | self.scene = osp.split(root_dir)[-1]
70 | if self.scene == 'Oxford':
71 | if self.train:
72 | seqs = ['2019-01-11-14-02-26-radar-oxford-10k', '2019-01-14-12-05-52-radar-oxford-10k',
73 | '2019-01-14-14-48-55-radar-oxford-10k', '2019-01-18-15-20-12-radar-oxford-10k']
74 | else:
75 | # change test seqs
76 | seqs = ['2019-01-15-13-06-37-radar-oxford-10k']
77 | # seqs = ['2019-01-17-13-26-39-radar-oxford-10k']
78 | # seqs = ['2019-01-17-14-03-00-radar-oxford-10k']
79 | # seqs = ['2019-01-18-14-14-42-radar-oxford-10k']
80 |
81 | self.scene = 'QEOxford' # If you use the QEOxford dataset, please uncomment this line
82 |
83 | elif self.scene == 'NCLT':
84 | if self.train:
85 | seqs = ['2012-01-22', '2012-02-02', '2012-02-18', '2012-05-11']
86 | else:
87 | # seqs = ['2012-02-12']
88 | # seqs = ['2012-02-19']
89 | seqs = ['2012-03-31']
90 | # seqs = ['2012-05-26']
91 | else:
92 | raise RuntimeError('Only support Oxford and NCLT!')
93 |
94 | ps = {}
95 | ts = {}
96 | vo_stats = {}
97 | self.pcs = []
98 |
99 | for seq in seqs:
100 | seq_dir = osp.join(self.root_dir, seq)
101 | if self.scene == 'QEOxford':
102 | h5_path = osp.join(self.root_dir, seq, 'velodyne_left_' + 'calibrateFalse.h5')
103 | else:
104 | h5_path = osp.join(self.root_dir, seq, 'velodyne_left_' + 'False.h5')
105 |
106 | # load h5 file, save pose interpolating time
107 | print("load " + seq + ' pose from ' + h5_path)
108 | h5_file = h5py.File(h5_path, 'r')
109 | ts[seq] = h5_file['valid_timestamps'][5:-5]
110 | ps[seq] = h5_file['poses'][5:-5]
111 | self.pcs.extend(
112 | [osp.join(seq_dir, 'velodyne_left', '{:d}.bin'.format(t)) for t in ts[seq]])
113 |
114 | vo_stats[seq] = {'R': np.eye(3), 't': np.zeros(3), 's': 1}
115 |
116 | poses = np.empty((0, 12))
117 | for p in ps.values():
118 | poses = np.vstack((poses, p))
119 |
120 | kmeans_pose_stats_filename = osp.join(self.root_dir, self.scene + '_cls_pose_stats.txt')
121 | mean_pose_stats_filename = osp.join(self.root_dir, self.scene + '_pose_stats.txt')
122 |
123 | self.poses = np.empty((0, 6))
124 | self.rots = np.empty((0, 3, 3))
125 |
126 | if self.train:
127 | if self.scene == 'QEOxford':
128 | # Calculate the mean value of translations in training seqs
129 | self.mean_t = np.mean(poses[:, 9:], axis=0) # (3,)
130 | # Save
131 | np.savetxt(mean_pose_stats_filename, self.mean_t, fmt='%8.7f')
132 | else:
133 | # Calculate the mean value of translations in training seqs
134 | self.mean_t = np.mean(poses[:, [3, 7, 11]], axis=0) # (3,)
135 | # Save
136 | np.savetxt(mean_pose_stats_filename, self.mean_t, fmt='%8.7f')
137 | else:
138 | self.mean_t = np.loadtxt(mean_pose_stats_filename)
139 |
140 | for seq in seqs:
141 | if self.scene == 'QEOxford':
142 | pss, rotation = calibrate_process_poses(poses_in=ps[seq], mean_t=self.mean_t, align_R=vo_stats[seq]['R'],
143 | align_t=vo_stats[seq]['t'], align_s=vo_stats[seq]['s'])
144 | else:
145 | pss, rotation = process_poses(poses_in=ps[seq], mean_t=self.mean_t, align_R=vo_stats[seq]['R'],
146 | align_t=vo_stats[seq]['t'], align_s=vo_stats[seq]['s'])
147 | self.poses = np.vstack((self.poses, pss))
148 | self.rots = np.vstack((self.rots, rotation))
149 |
150 | xyz = self.poses[:, :3]
151 |
152 | if self.train:
153 | if sample_cls and generate_clusters:
154 | print("generate clusters")
155 | centers = []
156 | level = KMeans(n_clusters=level_clusters, random_state=0).fit(xyz)
157 | for l_cid in range(level_clusters):
158 | centers.append(level.cluster_centers_[l_cid])
159 | centers = np.array(centers)
160 | np.savetxt(kmeans_pose_stats_filename, centers, fmt='%8.7f')
161 | self.lbl = parallel_kmeans(xyz[:, :2], centers[:, :2]).reshape(-1, 1)
162 | else:
163 | centers = np.loadtxt(kmeans_pose_stats_filename)
164 | self.lbl = parallel_kmeans(xyz[:, :2], centers[:, :2]).reshape(-1, 1)
165 | else:
166 | if sample_cls:
167 | centers = np.loadtxt(kmeans_pose_stats_filename)
168 | self.lbl = parallel_kmeans(xyz[:, :2], centers[:, :2]).reshape(-1, 1)
169 | else:
170 | centers = np.zeros((level_clusters, 3))
171 | self.lbl = np.zeros((len(xyz), 1))
172 |
173 | self.centers = centers
174 |
175 | # data augment
176 | augment_params = AugmentParams()
177 | if self.augment:
178 | augment_params.setTranslationParams(
179 | p_transx=0.5, trans_xmin=-1 * self.aug_translation, trans_xmax=self.aug_translation,
180 | p_transy=0.5, trans_ymin=-1 * self.aug_translation, trans_ymax=self.aug_translation,
181 | p_transz=0, trans_zmin=-1 * self.aug_translation, trans_zmax=self.aug_translation)
182 | augment_params.setRotationParams(
183 | p_rot_roll=0, rot_rollmin=-1 * self.aug_rotation, rot_rollmax=self.aug_rotation,
184 | p_rot_pitch=0, rot_pitchmin=-1 * self.aug_rotation, rot_pitchmax=self.aug_rotation,
185 | p_rot_yaw=0.5, rot_yawmin=-1 * self.aug_rotation, rot_yawmax=self.aug_rotation)
186 | self.augmentor = Augmentor(augment_params)
187 | else:
188 | self.augmentor = None
189 |
190 | if self.train:
191 | print("train data num:" + str(len(self.poses)))
192 | else:
193 | print("valid data num:" + str(len(self.poses)))
194 |
195 | def __len__(self):
196 | return len(self.poses)
197 |
198 | def __getitem__(self, idx):
199 | # For Classification training
200 | if type(idx) == list:
201 | scenes = [0]
202 | num = 0
203 | for i in idx:
204 | scan_path = self.pcs[i]
205 | if self.scene == 'NCLT':
206 | ptcld = np.fromfile(scan_path, dtype=np.float32).reshape(-1, 4)[:, :3]
207 | else:
208 | ptcld = np.fromfile(scan_path, dtype=np.float32).reshape(4, -1).transpose()[:, :3]
209 | # just for z-axis up
210 | ptcld[:, 2] = -1 * ptcld[:, 2]
211 |
212 | scan = ptcld
213 |
214 | scan = np.ascontiguousarray(scan)
215 |
216 | lbl = self.lbl[i]
217 | pose = self.poses[i]
218 | rot = self.rots[i]
219 | scan_gt = (rot @ scan.transpose(1, 0)).transpose(1, 0) + pose[:3].reshape(1, 3)
220 |
221 | if self.train & self.augment:
222 | scan = self.augmentor.doAugmentation(scan) # n, 5
223 |
224 | scan_gt_s8 = np.concatenate((scan, scan_gt), axis=1)
225 |
226 | coord, feat = ME.utils.sparse_quantize(
227 | coordinates=scan,
228 | features=scan_gt_s8,
229 | quantization_size=self.voxel_size)
230 |
231 | if num == 0:
232 | coords = coord
233 | feats = feat
234 | lbls = lbl.reshape(-1, 1)
235 | poses = pose.reshape(1, 6)
236 | scenes.append(len(coord))
237 | else:
238 | coords = np.concatenate((coords, coord))
239 | feats = np.concatenate((feats, feat))
240 | lbls = np.concatenate((lbls, lbl.reshape(-1, 1)))
241 | poses = np.concatenate((poses, pose.reshape(1, 6)))
242 | scenes.append(scenes[num] + len(coord))
243 | num += 1
244 |
245 | return (coords, feats, lbls, poses, scenes)
246 |
247 | else:
248 | # For Regression training
249 | scan_path = self.pcs[idx]
250 | if self.scene == 'NCLT':
251 | ptcld = np.fromfile(scan_path, dtype=np.float32).reshape(-1, 4)[:, :3]
252 | else:
253 | ptcld = np.fromfile(scan_path, dtype=np.float32).reshape(4, -1).transpose()[:, :3]
254 | # just for z-axis up
255 | ptcld[:, 2] = -1 * ptcld[:, 2]
256 | scan = ptcld
257 |
258 | scan = np.ascontiguousarray(scan)
259 |
260 | lbl = self.lbl[idx]
261 | pose = self.poses[idx] # (6,)
262 | rot = self.rots[idx] # [3, 3]
263 |
264 | scan_gt = (rot @ scan.transpose(1, 0)).transpose(1, 0) + pose[:3].reshape(1, 3)
265 |
266 | if self.train & self.augment:
267 | scan = self.augmentor.doAugmentation(scan) # n, 5
268 |
269 | scan_gt_s8 = np.concatenate((scan, scan_gt), axis=1)
270 |
271 | coords, feats = ME.utils.sparse_quantize(
272 | coordinates=scan,
273 | features=scan_gt_s8,
274 | quantization_size=self.voxel_size)
275 |
276 | return (coords, feats, lbl, idx, pose)
277 |
--------------------------------------------------------------------------------
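A minimal usage sketch for the dataset above, wired into a PyTorch DataLoader with the collate factory from datasets/base_loader.py (the path and loader parameters are placeholders, and the cluster-centre stats file produced by the classification stage is assumed to exist):

```python
from torch.utils.data import DataLoader
from datasets.lidarloc import LiDARLocDataset
from datasets.base_loader import CollationFunctionFactory

# Illustrative only: regression-stage loader for the Oxford split.
dataset = LiDARLocDataset(root_dir='/path/to/data_root/Oxford',  # placeholder
                          train=True, augment=True, voxel_size=0.25)
collate_fn = CollationFunctionFactory(collation_type='collate_pair_reg')
loader = DataLoader(dataset, batch_size=4, shuffle=True, collate_fn=collate_fn)

batch = next(iter(loader))
print(batch['sinput_C'].shape, batch['pose'].shape)
```
--------------------------------------------------------------------------------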
/datasets/robotcar_sdk/extrinsics/ins.txt:
--------------------------------------------------------------------------------
1 | -1.7132 0.1181 1.1948 -0.0125 0.0400 0.0050
2 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/extrinsics/ldmrs.txt:
--------------------------------------------------------------------------------
1 | 1.5349 0.0090 1.3777 0.0205 0.0767 -0.0299
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/extrinsics/lms_front.txt:
--------------------------------------------------------------------------------
1 | 1.7589 0.2268 1.0411 -0.0437 -1.4572 0.0356
2 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/extrinsics/lms_rear.txt:
--------------------------------------------------------------------------------
1 | -2.5850 0.2852 1.0885 -2.8405 -1.5090 -0.3614
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/extrinsics/mono_left.txt:
--------------------------------------------------------------------------------
1 | -0.0905 1.6375 0.2803 0.2079 -0.2339 1.2321
2 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/extrinsics/mono_rear.txt:
--------------------------------------------------------------------------------
1 | -2.0582 0.0894 0.3675 -0.0119 -0.2498 3.1283
2 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/extrinsics/mono_right.txt:
--------------------------------------------------------------------------------
1 | -0.2587 -1.6810 0.3226 -0.1961 -0.2469 -1.2675
2 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/extrinsics/nlct_velodyne.txt:
--------------------------------------------------------------------------------
1 | 0.002 -0.004 -0.957 0.807 0.166 -90.703
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/extrinsics/radar.txt:
--------------------------------------------------------------------------------
1 | -0.71813 0.12 -0.54479 0 0.05 0
2 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/extrinsics/stereo.txt:
--------------------------------------------------------------------------------
1 | 0 0 0 0 0 0
2 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/extrinsics/velodyne_left.txt:
--------------------------------------------------------------------------------
1 | -0.60072 -0.34077 -0.26837 -0.0053948 -0.041998 -3.1337
2 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/extrinsics/velodyne_right.txt:
--------------------------------------------------------------------------------
1 | -0.61153 0.55676 -0.27023 0.0027052 -0.041999 -3.1357
2 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/README.md:
--------------------------------------------------------------------------------
1 | RobotCar Dataset Python Tools
2 | =============================
3 |
4 | This directory contains sample python code for viewing and manipulating data from the [Oxford Robotcar Dataset](http://robotcar-dataset.robots.ox.ac.uk) and [Oxford Radar Robotcar Dataset](http://ori.ox.ac.uk/datasets/radar-robotcar-dataset).
5 |
6 | Requirements
7 | ------------
8 | The python tools have been tested on Python 2.7.
9 | Python 3.* compatibility has not been verified.
10 |
11 | The following packages are required:
12 | * numpy
13 | * matplotlib
14 | * colour_demosaicing
15 | * pillow
16 | * opencv-python
17 | * open3d-python
18 |
19 | These can be installed with pip:
20 |
21 | ```
22 | pip install numpy matplotlib colour_demosaicing pillow opencv-python open3d-python
23 | ```
24 |
25 | Command Line Tools
26 | ------------------
27 |
28 | ### Viewing Images
29 | The `play_images.py` script can be used to view images from the dataset.
30 |
31 | ```bash
32 | python play_images.py /path/to/data/yyyy-mm-dd-hh-mm-ss/stereo/centre
33 | ```
34 |
35 | If you wish to undistort the images before viewing them, pass the camera model directory as a second argument:
36 |
37 | ```bash
38 | python play_images.py /path/to/data/yyyy-mm-dd-hh-mm-ss/stereo/centre --models_dir /path/to/camera/models
39 | ```
40 |
41 | ### Viewing Radar Scans
42 | The `play_radar.py` script can be used to view radar scans.
43 |
44 | ```bash
45 | python play_radar.py /path/to/data/yyyy-mm-dd-hh-mm-ss/radar
46 | ```
47 |
48 | ### Viewing Velodyne Scans
49 | The `play_velodyne.py` script can be used to view velodyne scans from raw or binary form.
50 |
51 | ```bash
52 | python play_velodyne.py /path/to/data/yyyy-mm-dd-hh-mm-ss/velodyne_left
53 | ```
54 |
55 | ### Building Pointclouds
56 | The `build_pointcloud.py` script builds and displays a 3D pointcloud by combining multiple LIDAR scans with a pose source.
57 | The pose source can be either INS data or the supplied visual odometry data. For example:
58 |
59 | ```bash
60 | python build_pointcloud.py --laser_dir /path/to/data/yyyy-mm-dd-hh-mm-ss/lms_front --extrinsics_dir ../extrinsics --poses_file /path/to/data/yyyy-mm-dd-hh-mm-ss/vo/vo.csv
61 | ```
62 |
63 | ### Projecting pointclouds into images
64 | The `project_laser_into_camera.py` script first builds a pointcloud, then projects it into a camera image using a pinhole camera model.
65 | For example:
66 |
67 | ```bash
68 | python project_laser_into_camera.py --image_dir /path/to/data/yyyy-mm-dd-hh-mm-ss/stereo/centre --laser_dir /path/to/data/yyyy-mm-dd-hh-mm-ss/ldmrs --poses_file /path/to/data/yyyy-mm-dd-hh-mm-ss/vo/vo.csv --models_dir /path/to/models --extrinsics_dir ../extrinsics --image_idx 200
69 | ```
70 |
71 | Usage from Python
72 | -----------------
73 | The scripts here are also designed to be used in your own scripts.
74 |
75 | * `build_pointcloud.py`: function for building a pointcloud from LIDAR and odometry data
76 | * `camera_model.py`: loads camera models from disk, and provides undistortion of images and projection of pointclouds
77 | * `interpolate_poses.py`: functions for interpolating VO or INS data to obtain pose estimates at arbitrary timestamps
78 | * `transform.py`: functions for converting between various transform representations
79 | * `image.py`: function for loading, Bayer demosaicing and undistorting images
80 | * `velodyne.py`: functions for loading Velodyne scan data and converting a raw scan representation to pointcloud
81 | * `radar.py`: functions for loading radar scan data and converting a polar scan representation to Cartesian
82 |
83 | For examples of how to use these functions, see the command line tools above.
84 |
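As a small example of the Python API, here is a minimal sketch that loads, demosaics and undistorts one image using `camera_model.py` and `image.py` (the paths and timestamp are placeholders; run it with this directory on your `PYTHONPATH`):

```python
from camera_model import CameraModel
from image import load_image

# Placeholder paths for your local copy of the dataset
models_dir = '/path/to/camera/models'
images_dir = '/path/to/data/yyyy-mm-dd-hh-mm-ss/stereo/centre'
image_path = images_dir + '/1418381798459173.png'   # placeholder timestamp

model = CameraModel(models_dir, images_dir)   # picks the model matching the camera in the path
img = load_image(image_path, model)           # demosaiced and undistorted RGB image
print(img.shape, img.dtype)
```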
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__init__.py:
--------------------------------------------------------------------------------
1 | from . import transform
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/__init__.cpython-310.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/__init__.cpython-310.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/__init__.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/__init__.cpython-36.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/__init__.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/__init__.cpython-37.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/__init__.cpython-38.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/__init__.cpython-38.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/__init__.cpython-39.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/__init__.cpython-39.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/interpolate_poses.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/interpolate_poses.cpython-36.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/interpolate_poses.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/interpolate_poses.cpython-37.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/interpolate_poses.cpython-38.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/interpolate_poses.cpython-38.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/transform.cpython-310.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/transform.cpython-310.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/transform.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/transform.cpython-36.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/transform.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/transform.cpython-37.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/transform.cpython-38.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/transform.cpython-38.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/transform.cpython-39.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/transform.cpython-39.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-310.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-310.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-36.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-36.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-37.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-38.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-38.pyc
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/build_pointcloud.py:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | #
3 | # Copyright (c) 2017 University of Oxford
4 | # Authors:
5 | # Geoff Pascoe (gmp@robots.ox.ac.uk)
6 | #
7 | # This work is licensed under the Creative Commons
8 | # Attribution-NonCommercial-ShareAlike 4.0 International License.
9 | # To view a copy of this license, visit
10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to
11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
12 | #
13 | ################################################################################
14 |
15 | import os
16 | import re
17 | import numpy as np
18 |
19 | from transform import build_se3_transform
20 | from interpolate_poses import interpolate_vo_poses, interpolate_ins_poses
21 | from velodyne import load_velodyne_raw, load_velodyne_binary, velodyne_raw_to_pointcloud
22 |
23 |
24 | def build_pointcloud(lidar_dir, poses_file, extrinsics_dir, start_time, end_time, origin_time=-1):
25 | """Builds a pointcloud by combining multiple LIDAR scans with odometry information.
26 |
27 | Args:
28 | lidar_dir (str): Directory containing LIDAR scans.
29 | poses_file (str): Path to a file containing pose information. Can be VO or INS data.
30 | extrinsics_dir (str): Directory containing extrinsic calibrations.
31 | start_time (int): UNIX timestamp of the start of the window over which to build the pointcloud.
32 | end_time (int): UNIX timestamp of the end of the window over which to build the pointcloud.
33 | origin_time (int): UNIX timestamp of origin frame. Pointcloud coordinates are relative to this frame.
34 |
35 | Returns:
36 | numpy.ndarray: 3xn array of (x, y, z) coordinates of pointcloud
37 | numpy.array: array of n reflectance values or None if no reflectance values are recorded (LDMRS)
38 |
39 | Raises:
40 | ValueError: if specified window doesn't contain any laser scans.
41 | IOError: if scan files are not found.
42 |
43 | """
44 | if origin_time < 0:
45 | origin_time = start_time
46 |
47 | lidar = re.search('(lms_front|lms_rear|ldmrs|velodyne_left|velodyne_right)', lidar_dir).group(0)
48 | timestamps_path = os.path.join(lidar_dir, os.pardir, lidar + '.timestamps')
49 |
50 | timestamps = []
51 | with open(timestamps_path) as timestamps_file:
52 | for line in timestamps_file:
53 | timestamp = int(line.split(' ')[0])
54 | if start_time <= timestamp <= end_time:
55 | timestamps.append(timestamp)
56 |
57 | if len(timestamps) == 0:
58 | raise ValueError("No LIDAR data in the given time bracket.")
59 |
60 | with open(os.path.join(extrinsics_dir, lidar + '.txt')) as extrinsics_file:
61 | extrinsics = next(extrinsics_file)
62 | G_posesource_laser = build_se3_transform([float(x) for x in extrinsics.split(' ')])
63 |
64 | poses_type = re.search('(vo|ins|rtk)\.csv', poses_file).group(1)
65 |
66 | if poses_type in ['ins', 'rtk']:
67 | with open(os.path.join(extrinsics_dir, 'ins.txt')) as extrinsics_file:
68 | extrinsics = next(extrinsics_file)
69 | G_posesource_laser = np.linalg.solve(build_se3_transform([float(x) for x in extrinsics.split(' ')]),
70 | G_posesource_laser)
71 |
72 | poses = interpolate_ins_poses(poses_file, timestamps, origin_time, use_rtk=(poses_type == 'rtk'))
73 | else:
74 | # sensor is VO, which is located at the main vehicle frame
75 | poses = interpolate_vo_poses(poses_file, timestamps, origin_time)
76 |
77 | pointcloud = np.array([[0], [0], [0], [0]])
78 | if lidar == 'ldmrs':
79 | reflectance = None
80 | else:
81 | reflectance = np.empty((0))
82 |
83 | for i in range(0, len(poses)):
84 | scan_path = os.path.join(lidar_dir, str(timestamps[i]) + '.bin')
85 | if "velodyne" not in lidar:
86 | if not os.path.isfile(scan_path):
87 | continue
88 |
89 | scan_file = open(scan_path)
90 | scan = np.fromfile(scan_file, np.double)
91 | scan_file.close()
92 |
93 | scan = scan.reshape((len(scan) // 3, 3)).transpose()
94 |
95 | if lidar != 'ldmrs':
96 | # LMS scans are tuples of (x, y, reflectance)
97 | reflectance = np.concatenate((reflectance, np.ravel(scan[2, :])))
98 | scan[2, :] = np.zeros((1, scan.shape[1]))
99 | else:
100 | if os.path.isfile(scan_path):
101 | ptcld = load_velodyne_binary(scan_path)
102 | else:
103 | scan_path = os.path.join(lidar_dir, str(timestamps[i]) + '.png')
104 | if not os.path.isfile(scan_path):
105 | continue
106 | ranges, intensities, angles, approximate_timestamps = load_velodyne_raw(scan_path)
107 | ptcld = velodyne_raw_to_pointcloud(ranges, intensities, angles)
108 |
109 | reflectance = np.concatenate((reflectance, ptcld[3]))
110 | scan = ptcld[:3]
111 |
112 | scan = np.dot(np.dot(poses[i], G_posesource_laser), np.vstack([scan, np.ones((1, scan.shape[1]))]))
113 | pointcloud = np.hstack([pointcloud, scan])
114 |
115 | pointcloud = pointcloud[:, 1:]
116 | if pointcloud.shape[1] == 0:
117 | raise IOError("Could not find scan files for given time range in directory " + lidar_dir)
118 |
119 | return pointcloud, reflectance
120 |
121 |
122 | if __name__ == "__main__":
123 | import argparse
124 | import open3d
125 |
126 | parser = argparse.ArgumentParser(description='Build and display a pointcloud')
127 | parser.add_argument('--poses_file', type=str, default=None, help='File containing relative or absolute poses')
128 | parser.add_argument('--extrinsics_dir', type=str, default=None,
129 | help='Directory containing extrinsic calibrations')
130 | parser.add_argument('--laser_dir', type=str, default=None, help='Directory containing LIDAR data')
131 |
132 | args = parser.parse_args()
133 |
134 | lidar = re.search('(lms_front|lms_rear|ldmrs|velodyne_left|velodyne_right)', args.laser_dir).group(0)
135 | timestamps_path = os.path.join(args.laser_dir, os.pardir, lidar + '.timestamps')
136 | with open(timestamps_path) as timestamps_file:
137 | start_time = int(next(timestamps_file).split(' ')[0])
138 |
139 | end_time = start_time + 2e7
140 |
141 | pointcloud, reflectance = build_pointcloud(args.laser_dir, args.poses_file,
142 | args.extrinsics_dir, start_time, end_time)
143 |
144 | if reflectance is not None:
145 | colours = (reflectance - reflectance.min()) / (reflectance.max() - reflectance.min())
146 | colours = 1 / (1 + np.exp(-10 * (colours - colours.mean())))
147 | else:
148 | colours = 'gray'
149 |
150 | # Pointcloud Visualisation using Open3D
151 | vis = open3d.Visualizer()
152 | vis.create_window(window_name=os.path.basename(__file__))
153 | render_option = vis.get_render_option()
154 | render_option.background_color = np.array([0.1529, 0.1569, 0.1333], np.float32)
155 | render_option.point_color_option = open3d.PointColorOption.ZCoordinate
156 | coordinate_frame = open3d.geometry.create_mesh_coordinate_frame()
157 | vis.add_geometry(coordinate_frame)
158 | pcd = open3d.geometry.PointCloud()
159 | pcd.points = open3d.utility.Vector3dVector(
160 | -np.ascontiguousarray(pointcloud[[1, 0, 2]].transpose().astype(np.float64)))
161 | pcd.colors = open3d.utility.Vector3dVector(np.tile(colours[:, np.newaxis], (1, 3)).astype(np.float64))
162 | # Rotate pointcloud to align displayed coordinate frame colouring
163 | pcd.transform(build_se3_transform([0, 0, 0, np.pi, 0, -np.pi / 2]))
164 | vis.add_geometry(pcd)
165 | view_control = vis.get_view_control()
166 | params = view_control.convert_to_pinhole_camera_parameters()
167 | params.extrinsic = build_se3_transform([0, 3, 10, 0, -np.pi * 0.42, -np.pi / 2])
168 | view_control.convert_from_pinhole_camera_parameters(params)
169 | vis.run()
170 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/camera_model.py:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | #
3 | # Copyright (c) 2017 University of Oxford
4 | # Authors:
5 | # Geoff Pascoe (gmp@robots.ox.ac.uk)
6 | #
7 | # This work is licensed under the Creative Commons
8 | # Attribution-NonCommercial-ShareAlike 4.0 International License.
9 | # To view a copy of this license, visit
10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to
11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
12 | #
13 | ################################################################################
14 |
15 | import re
16 | import os
17 | import numpy as np
18 | import scipy.interpolate as interp
19 | from scipy.ndimage import map_coordinates
20 |
21 |
22 | class CameraModel:
23 | """Provides intrinsic parameters and undistortion LUT for a camera.
24 |
25 | Attributes:
26 | camera (str): Name of the camera.
27 | camera sensor (str): Name of the sensor on the camera for multi-sensor cameras.
28 | focal_length (tuple[float]): Focal length of the camera in horizontal and vertical axis, in pixels.
29 | principal_point (tuple[float]): Principal point of camera for pinhole projection model, in pixels.
30 | G_camera_image (:obj: `numpy.matrixlib.defmatrix.matrix`): Transform from image frame to camera frame.
31 | bilinear_lut (:obj: `numpy.ndarray`): Look-up table for undistortion of images, mapping pixels in an undistorted
32 | image to pixels in the distorted image
33 |
34 | """
35 |
36 | def __init__(self, models_dir, images_dir):
37 | """Loads a camera model from disk.
38 |
39 | Args:
40 | models_dir (str): directory containing camera model files.
41 | images_dir (str): directory containing images for which to read camera model.
42 |
43 | """
44 | self.camera = None
45 | self.camera_sensor = None
46 | self.focal_length = None
47 | self.principal_point = None
48 | self.G_camera_image = None
49 | self.bilinear_lut = None
50 |
51 | self.__load_intrinsics(models_dir, images_dir)
52 | self.__load_lut(models_dir, images_dir)
53 |
54 | def project(self, xyz, image_size):
55 | """Projects a pointcloud into the camera using a pinhole camera model.
56 |
57 | Args:
58 | xyz (:obj: `numpy.ndarray`): 3xn array, where each column is (x, y, z) point relative to camera frame.
59 | image_size (tuple[int]): dimensions of image in pixels
60 |
61 | Returns:
62 | numpy.ndarray: 2xm array of points, where each column is the (u, v) pixel coordinates of a point in pixels.
63 | numpy.array: array of depth values for points in image.
64 |
65 | Note:
66 | Number of output points m will be less than or equal to number of input points n, as points that do not
67 | project into the image are discarded.
68 |
69 | """
70 | if xyz.shape[0] == 3:
71 | xyz = np.stack((xyz, np.ones((1, xyz.shape[1]))))
72 | xyzw = np.linalg.solve(self.G_camera_image, xyz)
73 |
74 | # Find which points lie in front of the camera
75 | in_front = [i for i in range(0, xyzw.shape[1]) if xyzw[2, i] >= 0]
76 | xyzw = xyzw[:, in_front]
77 |
78 | uv = np.vstack((self.focal_length[0] * xyzw[0, :] / xyzw[2, :] + self.principal_point[0],
79 | self.focal_length[1] * xyzw[1, :] / xyzw[2, :] + self.principal_point[1]))
80 |
81 | in_img = [i for i in range(0, uv.shape[1])
82 | if 0.5 <= uv[0, i] <= image_size[1] and 0.5 <= uv[1, i] <= image_size[0]]
83 |
84 | return uv[:, in_img], np.ravel(xyzw[2, in_img])
85 |
86 | def undistort(self, image):
87 | """Undistorts an image.
88 |
89 | Args:
90 | image (:obj: `numpy.ndarray`): A distorted image. Must be demosaiced - ie. must be a 3-channel RGB image.
91 |
92 | Returns:
93 | numpy.ndarray: Undistorted version of image.
94 |
95 | Raises:
96 | ValueError: if image size does not match camera model.
97 | ValueError: if image only has a single channel.
98 |
99 | """
100 | if image.shape[0] * image.shape[1] != self.bilinear_lut.shape[0]:
101 | raise ValueError('Incorrect image size for camera model')
102 |
103 | lut = self.bilinear_lut[:, 1::-1].T.reshape((2, image.shape[0], image.shape[1]))
104 |
105 | if len(image.shape) == 1:
106 | raise ValueError('Undistortion function only works with multi-channel images')
107 |
108 | undistorted = np.rollaxis(np.array([map_coordinates(image[:, :, channel], lut, order=1)
109 | for channel in range(0, image.shape[2])]), 0, 3)
110 |
111 | return undistorted.astype(image.dtype)
112 |
113 | def __get_model_name(self, images_dir):
114 | self.camera = re.search('(stereo|mono_(left|right|rear))', images_dir).group(0)
115 | if self.camera == 'stereo':
116 | self.camera_sensor = re.search('(left|centre|right)', images_dir).group(0)
117 | if self.camera_sensor == 'left':
118 | return 'stereo_wide_left'
119 | elif self.camera_sensor == 'right':
120 | return 'stereo_wide_right'
121 | elif self.camera_sensor == 'centre':
122 | return 'stereo_narrow_left'
123 | else:
124 | raise RuntimeError('Unknown camera model for given directory: ' + images_dir)
125 | else:
126 | return self.camera
127 |
128 | def __load_intrinsics(self, models_dir, images_dir):
129 | model_name = self.__get_model_name(images_dir)
130 | intrinsics_path = os.path.join(models_dir, model_name + '.txt')
131 |
132 | with open(intrinsics_path) as intrinsics_file:
133 | vals = [float(x) for x in next(intrinsics_file).split()]
134 | self.focal_length = (vals[0], vals[1])
135 | self.principal_point = (vals[2], vals[3])
136 |
137 | G_camera_image = []
138 | for line in intrinsics_file:
139 | G_camera_image.append([float(x) for x in line.split()])
140 | self.G_camera_image = np.array(G_camera_image)
141 |
142 | def __load_lut(self, models_dir, images_dir):
143 | model_name = self.__get_model_name(images_dir)
144 | lut_path = os.path.join(models_dir, model_name + '_distortion_lut.bin')
145 |
146 | lut = np.fromfile(lut_path, np.double)
147 | lut = lut.reshape([2, lut.size // 2])
148 | self.bilinear_lut = lut.transpose()
149 |
150 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/image.py:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | #
3 | # Copyright (c) 2017 University of Oxford
4 | # Authors:
5 | # Geoff Pascoe (gmp@robots.ox.ac.uk)
6 | #
7 | # This work is licensed under the Creative Commons
8 | # Attribution-NonCommercial-ShareAlike 4.0 International License.
9 | # To view a copy of this license, visit
10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to
11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
12 | #
13 | ###############################################################################
14 |
15 | import re
16 | from PIL import Image
17 | from colour_demosaicing import demosaicing_CFA_Bayer_bilinear as demosaic
18 | import numpy as np
19 |
20 | BAYER_STEREO = 'gbrg'
21 | BAYER_MONO = 'rggb'
22 |
23 |
24 | def load_image(image_path, model=None):
25 | """Loads and rectifies an image from file.
26 |
27 | Args:
28 | image_path (str): path to an image from the dataset.
29 | model (camera_model.CameraModel): if supplied, model will be used to undistort image.
30 |
31 | Returns:
32 | numpy.ndarray: demosaiced and optionally undistorted image
33 |
34 | """
35 | if model:
36 | camera = model.camera
37 | else:
38 | camera = re.search('(stereo|mono_(left|right|rear))', image_path).group(0)
39 | if camera == 'stereo':
40 | pattern = BAYER_STEREO
41 | else:
42 | pattern = BAYER_MONO
43 |
44 | img = Image.open(image_path)
45 | img = demosaic(img, pattern)
46 | if model:
47 | img = model.undistort(img)
48 |
49 | return np.array(img).astype(np.uint8)
50 |
51 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/interpolate_poses.py:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | #
3 | # Copyright (c) 2017 University of Oxford
4 | # Authors:
5 | # Geoff Pascoe (gmp@robots.ox.ac.uk)
6 | #
7 | # This work is licensed under the Creative Commons
8 | # Attribution-NonCommercial-ShareAlike 4.0 International License.
9 | # To view a copy of this license, visit
10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to
11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
12 | #
13 | ################################################################################
14 |
15 | import bisect
16 | import csv
17 | import numpy as np
18 | import numpy.matlib as ml
19 | import sys
20 | sys.path.append('.')
21 | from .transform import *
22 |
23 | def NCLT_interpolate_vo_poses(vo_path, pose_timestamps, origin_timestamp):
24 | """Interpolate poses from visual odometry.
25 |
26 | Args:
27 | vo_path (str): path to file containing relative poses from visual odometry.
28 | pose_timestamps (list[int]): UNIX timestamps at which interpolated poses are required.
29 | origin_timestamp (int): UNIX timestamp of origin frame. Poses will be reported relative to this frame.
30 |
31 | Returns:
32 | list[numpy.matrixlib.defmatrix.matrix]: SE3 matrix representing interpolated pose for each requested timestamp.
33 |
34 | """
35 | with open(vo_path) as vo_file:
36 | vo_reader = csv.reader(vo_file)
37 | headers = next(vo_file)
38 |
39 | vo_timestamps = [0]
40 | abs_poses = [ml.identity(4)]
41 |
42 | lower_timestamp = min(min(pose_timestamps), origin_timestamp)
43 | upper_timestamp = max(max(pose_timestamps), origin_timestamp)
44 |
45 | for row in vo_reader:
46 | timestamp = int(row[0])
47 | if timestamp < lower_timestamp:
48 | vo_timestamps[0] = timestamp
49 | continue
50 |
51 | vo_timestamps.append(timestamp)
52 |
53 | xyzrpy = [float(v) for v in row[1:7]]
54 | rel_pose = build_se3_transform(xyzrpy)
55 | abs_pose = abs_poses[-1] * rel_pose
56 | abs_poses.append(abs_pose)
57 |
58 | if timestamp >= upper_timestamp:
59 | break
60 |
61 | return interpolate_poses(vo_timestamps, abs_poses, pose_timestamps, origin_timestamp)
62 |
63 | def interpolate_vo_poses(vo_path, pose_timestamps, origin_timestamp):
64 | """Interpolate poses from visual odometry.
65 |
66 | Args:
67 | vo_path (str): path to file containing relative poses from visual odometry.
68 | pose_timestamps (list[int]): UNIX timestamps at which interpolated poses are required.
69 | origin_timestamp (int): UNIX timestamp of origin frame. Poses will be reported relative to this frame.
70 |
71 | Returns:
72 | list[numpy.matrixlib.defmatrix.matrix]: SE3 matrix representing interpolated pose for each requested timestamp.
73 |
74 | """
75 | with open(vo_path) as vo_file:
76 | vo_reader = csv.reader(vo_file)
77 | headers = next(vo_file)
78 |
79 | vo_timestamps = [0]
80 | abs_poses = [ml.identity(4)]
81 |
82 | lower_timestamp = min(min(pose_timestamps), origin_timestamp)
83 | upper_timestamp = max(max(pose_timestamps), origin_timestamp)
84 |
85 | for row in vo_reader:
86 | timestamp = int(row[0])
87 | if timestamp < lower_timestamp:
88 | vo_timestamps[0] = timestamp
89 | continue
90 |
91 | vo_timestamps.append(timestamp)
92 |
93 | xyzrpy = [float(v) for v in row[2:8]]
94 | rel_pose = build_se3_transform(xyzrpy)
95 | abs_pose = abs_poses[-1] * rel_pose
96 | abs_poses.append(abs_pose)
97 |
98 | if timestamp >= upper_timestamp:
99 | break
100 |
101 | return interpolate_poses(vo_timestamps, abs_poses, pose_timestamps, origin_timestamp)
102 |
103 |
104 | def interpolate_ins_poses(ins_path, pose_timestamps, origin_timestamp, use_rtk=False):
105 | """Interpolate poses from INS.
106 |
107 | Args:
108 | ins_path (str): path to file containing poses from INS.
109 | pose_timestamps (list[int]): UNIX timestamps at which interpolated poses are required.
110 | origin_timestamp (int): UNIX timestamp of origin frame. Poses will be reported relative to this frame.
111 |
112 | Returns:
113 | list[numpy.matrixlib.defmatrix.matrix]: SE3 matrix representing interpolated pose for each requested timestamp.
114 |
115 | """
116 | with open(ins_path) as ins_file:
117 | ins_reader = csv.reader(ins_file)
118 | headers = next(ins_file)
119 |
120 | ins_timestamps = [0]
121 | abs_poses = [ml.identity(4)]
122 |
123 | upper_timestamp = max(max(pose_timestamps), origin_timestamp)
124 |
125 | for row in ins_reader:
126 | timestamp = int(row[0])
127 | ins_timestamps.append(timestamp)
128 |
129 | utm = row[5:8] if not use_rtk else row[4:7]
130 | rpy = row[-3:] if not use_rtk else row[11:14]
131 | xyzrpy = [float(v) for v in utm] + [float(v) for v in rpy]
132 | abs_pose = build_se3_transform(xyzrpy)
133 | abs_poses.append(abs_pose)
134 |
135 | if timestamp >= upper_timestamp:
136 | break
137 |
138 | ins_timestamps = ins_timestamps[1:]
139 | abs_poses = abs_poses[1:]
140 |
141 | return interpolate_poses(ins_timestamps, abs_poses, pose_timestamps, origin_timestamp)
142 |
143 |
144 | def interpolate_ins_poses_lw(ins_pose, pose_timestamps, origin_timestamp, use_rtk=False):
145 | """Interpolate poses from INS.
146 |
147 | Args:
148 | ins_path (str): path to file containing poses from INS.
149 | pose_timestamps (list[int]): UNIX timestamps at which interpolated poses are required.
150 | origin_timestamp (int): UNIX timestamp of origin frame. Poses will be reported relative to this frame.
151 |
152 | Returns:
153 | list[numpy.matrixlib.defmatrix.matrix]: SE3 matrix representing interpolated pose for each requested timestamp.
154 |
155 | """
156 |
157 | ins_timestamps = [0]
158 | abs_poses = [ml.identity(4)]
159 |
160 | upper_timestamp = max(max(pose_timestamps), origin_timestamp)
161 | abs_pose = np.eye(4)
162 | for row in ins_pose:
163 | timestamp = int(row[0])
164 | ins_timestamps.append(timestamp)
165 |
166 | # utm = row[5:8] if not use_rtk else row[4:7]
167 | # rpy = row[-3:] if not use_rtk else row[11:14]
168 | # xyzrpy = [float(v) for v in utm] + [float(v) for v in rpy]
169 | # abs_pose = build_se3_transform(xyzrpy)
170 | abs_pose[:3, :3] = row[1:10].reshape(3, 3)
171 | abs_pose[:3, 3] = row[10:]
172 | abs_poses.append(abs_pose.copy())  # copy: abs_pose is reused and mutated on every iteration
173 |
174 | if timestamp >= upper_timestamp:
175 | break
176 |
177 | ins_timestamps = ins_timestamps[1:]
178 | abs_poses = abs_poses[1:]
179 |
180 | return interpolate_poses(ins_timestamps, abs_poses, pose_timestamps, origin_timestamp)
181 |
182 |
183 | def interpolate_poses(pose_timestamps, abs_poses, requested_timestamps, origin_timestamp):
184 | """Interpolate between absolute poses.
185 |
186 | Args:
187 | pose_timestamps (list[int]): Timestamps of supplied poses. Must be in ascending order.
188 | abs_poses (list[numpy.matrixlib.defmatrix.matrix]): SE3 matrices representing poses at the timestamps specified.
189 | requested_timestamps (list[int]): Timestamps for which interpolated poses are required.
190 | origin_timestamp (int): UNIX timestamp of origin frame. Poses will be reported relative to this frame.
191 |
192 | Returns:
193 | list[numpy.matrixlib.defmatrix.matrix]: SE3 matrix representing interpolated pose for each requested timestamp.
194 |
195 | Raises:
196 | ValueError: if pose_timestamps and abs_poses are not the same length
197 | ValueError: if pose_timestamps is not in ascending order
198 |
199 | """
200 | requested_timestamps.insert(0, origin_timestamp)
201 | requested_timestamps = np.array(requested_timestamps)
202 | pose_timestamps = np.array(pose_timestamps)
203 |
204 | if len(pose_timestamps) != len(abs_poses):
205 | raise ValueError('Must supply same number of timestamps as poses')
206 |
207 | abs_quaternions = np.zeros((4, len(abs_poses)))
208 | abs_positions = np.zeros((3, len(abs_poses)))
209 | for i, pose in enumerate(abs_poses):
210 | if i > 0 and pose_timestamps[i-1] >= pose_timestamps[i]:
211 | raise ValueError('Pose timestamps must be in ascending order')
212 |
213 | abs_quaternions[:, i] = so3_to_quaternion(pose[0:3, 0:3])
214 | abs_positions[:, i] = np.ravel(pose[0:3, 3])
215 |
216 | upper_indices = [bisect.bisect(pose_timestamps, pt) for pt in requested_timestamps]
217 | lower_indices = [u - 1 for u in upper_indices]
218 |
219 | if max(upper_indices) >= len(pose_timestamps):
220 | upper_indices = [min(i, len(pose_timestamps) - 1) for i in upper_indices]
221 |
222 | fractions = (requested_timestamps - pose_timestamps[lower_indices]) / \
223 | (pose_timestamps[upper_indices] - pose_timestamps[lower_indices])
224 | # import warnings
225 | # warnings.filterwarnings('error')
226 | # try:
227 | # fractions = (requested_timestamps - pose_timestamps[lower_indices]) / \
228 | # (pose_timestamps[upper_indices] - pose_timestamps[lower_indices])
229 | # except Warning:
230 | # print('Warning was raised as an exception!')
231 | # print(pose_timestamps[upper_indices])
232 | # print(pose_timestamps[lower_indices])
233 |
234 | quaternions_lower = abs_quaternions[:, lower_indices]
235 | quaternions_upper = abs_quaternions[:, upper_indices]
236 |
237 | d_array = (quaternions_lower * quaternions_upper).sum(0)
238 |
239 | linear_interp_indices = np.nonzero(d_array >= 1)
240 | sin_interp_indices = np.nonzero(d_array < 1)
241 |
242 | scale0_array = np.zeros(d_array.shape)
243 | scale1_array = np.zeros(d_array.shape)
244 |
245 | scale0_array[linear_interp_indices] = 1 - fractions[linear_interp_indices]
246 | scale1_array[linear_interp_indices] = fractions[linear_interp_indices]
247 |
248 | theta_array = np.arccos(np.abs(d_array[sin_interp_indices]))
249 |
250 | scale0_array[sin_interp_indices] = \
251 | np.sin((1 - fractions[sin_interp_indices]) * theta_array) / np.sin(theta_array)
252 | scale1_array[sin_interp_indices] = \
253 | np.sin(fractions[sin_interp_indices] * theta_array) / np.sin(theta_array)
254 |
255 | negative_d_indices = np.nonzero(d_array < 0)
256 | scale1_array[negative_d_indices] = -scale1_array[negative_d_indices]
257 |
258 | quaternions_interp = np.tile(scale0_array, (4, 1)) * quaternions_lower \
259 | + np.tile(scale1_array, (4, 1)) * quaternions_upper
260 |
261 | positions_lower = abs_positions[:, lower_indices]
262 | positions_upper = abs_positions[:, upper_indices]
263 |
264 | positions_interp = np.multiply(np.tile((1 - fractions), (3, 1)), positions_lower) \
265 | + np.multiply(np.tile(fractions, (3, 1)), positions_upper)
266 |
267 | poses_mat = ml.zeros((4, 4 * len(requested_timestamps)))
268 |
269 | poses_mat[0, 0::4] = 1 - 2 * np.square(quaternions_interp[2, :]) - \
270 | 2 * np.square(quaternions_interp[3, :])
271 | poses_mat[0, 1::4] = 2 * np.multiply(quaternions_interp[1, :], quaternions_interp[2, :]) - \
272 | 2 * np.multiply(quaternions_interp[3, :], quaternions_interp[0, :])
273 | poses_mat[0, 2::4] = 2 * np.multiply(quaternions_interp[1, :], quaternions_interp[3, :]) + \
274 | 2 * np.multiply(quaternions_interp[2, :], quaternions_interp[0, :])
275 |
276 | poses_mat[1, 0::4] = 2 * np.multiply(quaternions_interp[1, :], quaternions_interp[2, :]) \
277 | + 2 * np.multiply(quaternions_interp[3, :], quaternions_interp[0, :])
278 | poses_mat[1, 1::4] = 1 - 2 * np.square(quaternions_interp[1, :]) \
279 | - 2 * np.square(quaternions_interp[3, :])
280 | poses_mat[1, 2::4] = 2 * np.multiply(quaternions_interp[2, :], quaternions_interp[3, :]) - \
281 | 2 * np.multiply(quaternions_interp[1, :], quaternions_interp[0, :])
282 |
283 | poses_mat[2, 0::4] = 2 * np.multiply(quaternions_interp[1, :], quaternions_interp[3, :]) - \
284 | 2 * np.multiply(quaternions_interp[2, :], quaternions_interp[0, :])
285 | poses_mat[2, 1::4] = 2 * np.multiply(quaternions_interp[2, :], quaternions_interp[3, :]) + \
286 | 2 * np.multiply(quaternions_interp[1, :], quaternions_interp[0, :])
287 | poses_mat[2, 2::4] = 1 - 2 * np.square(quaternions_interp[1, :]) - \
288 | 2 * np.square(quaternions_interp[2, :])
289 |
290 | poses_mat[0:3, 3::4] = positions_interp
291 | poses_mat[3, 3::4] = 1
292 |
293 | # don't use relative pose
294 | # poses_mat = np.linalg.solve(poses_mat[0:4, 0:4], poses_mat)
295 |
296 | poses_out = [0] * (len(requested_timestamps) - 1)
297 | for i in range(1, len(requested_timestamps)):
298 | poses_out[i - 1] = poses_mat[0:4, i * 4:(i + 1) * 4]
299 |
300 | return poses_out
301 |
--------------------------------------------------------------------------------
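
A minimal usage sketch for the interpolation helpers above (the INS csv path and LiDAR timestamps below are placeholders, not files shipped with this repository). Positions are blended linearly and orientations via quaternion slerp; note that in this modified copy the final relative-pose step in `interpolate_poses` is commented out, so the returned 4x4 matrices remain in the absolute INS/VO frame rather than being re-based onto `origin_timestamp`.

```
# Hedged sketch: 'gps/ins.csv' and the timestamps are placeholders.
from interpolate_poses import interpolate_ins_poses

lidar_stamps = [1547120800000000, 1547120800100000, 1547120800200000]  # UNIX microseconds
origin = lidar_stamps[0]

# One 4x4 SE3 matrix per requested timestamp.
poses = interpolate_ins_poses('gps/ins.csv', list(lidar_stamps), origin, use_rtk=False)
print(len(poses), poses[0].shape)
```
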
/datasets/robotcar_sdk/python/play_images.py:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | #
3 | # Copyright (c) 2017 University of Oxford
4 | # Authors:
5 | # Geoff Pascoe (gmp@robots.ox.ac.uk)
6 | #
7 | # This work is licensed under the Creative Commons
8 | # Attribution-NonCommercial-ShareAlike 4.0 International License.
9 | # To view a copy of this license, visit
10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to
11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
12 | #
13 | ################################################################################
14 |
15 | import argparse
16 | import os
17 | import re
18 | import matplotlib.pyplot as plt
19 | from datetime import datetime as dt
20 | from image import load_image
21 | from camera_model import CameraModel
22 |
23 | parser = argparse.ArgumentParser(description='Play back images from a given directory')
24 |
25 | parser.add_argument('dir', type=str, help='Directory containing images.')
26 | parser.add_argument('--models_dir', type=str, default=None, help='(optional) Directory containing camera model. If supplied, images will be undistorted before display')
27 | parser.add_argument('--scale', type=float, default=1.0, help='(optional) factor by which to scale images before display')
28 |
29 | args = parser.parse_args()
30 |
31 | camera = re.search('(stereo|mono_(left|right|rear))', args.dir).group(0)
32 |
33 | timestamps_path = os.path.join(args.dir, os.pardir, camera + '.timestamps')
34 | if not os.path.isfile(timestamps_path):
35 | timestamps_path = os.path.join(args.dir, os.pardir, os.pardir, camera + '.timestamps')
36 | if not os.path.isfile(timestamps_path):
37 | raise IOError("Could not find timestamps file")
38 |
39 | model = None
40 | if args.models_dir:
41 | model = CameraModel(args.models_dir, args.dir)
42 |
43 | current_chunk = 0
44 | timestamps_file = open(timestamps_path)
45 | for line in timestamps_file:
46 | tokens = line.split()
47 | datetime = dt.utcfromtimestamp(int(tokens[0])/1000000)
48 | chunk = int(tokens[1])
49 |
50 | filename = os.path.join(args.dir, tokens[0] + '.png')
51 | if not os.path.isfile(filename):
52 | if chunk != current_chunk:
53 | print("Chunk " + str(chunk) + " not found")
54 | current_chunk = chunk
55 | continue
56 |
57 | current_chunk = chunk
58 |
59 | img = load_image(filename, model)
60 | plt.imshow(img)
61 | plt.xlabel(datetime)
62 | plt.xticks([])
63 | plt.yticks([])
64 | plt.pause(0.01)
65 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/play_radar.py:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | #
3 | # Copyright (c) 2017 University of Oxford
4 | # Authors:
5 | # Dan Barnes (dbarnes@robots.ox.ac.uk)
6 | #
7 | # This work is licensed under the Creative Commons
8 | # Attribution-NonCommercial-ShareAlike 4.0 International License.
9 | # To view a copy of this license, visit
10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to
11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
12 | #
13 | ################################################################################
14 |
15 | import argparse
16 | import os
17 | from radar import load_radar, radar_polar_to_cartesian
18 | import numpy as np
19 | import cv2
20 |
21 | parser = argparse.ArgumentParser(description='Play back radar data from a given directory')
22 |
23 | parser.add_argument('dir', type=str, help='Directory containing radar data.')
24 |
25 | args = parser.parse_args()
26 |
27 | timestamps_path = os.path.join(args.dir, os.pardir, 'radar.timestamps')
28 | if not os.path.isfile(timestamps_path):
29 | raise IOError("Could not find timestamps file")
30 |
31 | # Cartesian Visualisation Setup
32 | # Resolution of the cartesian form of the radar scan in metres per pixel
33 | cart_resolution = .25
34 | # Cartesian visualisation size (used for both height and width)
35 | cart_pixel_width = 501 # pixels
36 | interpolate_crossover = True
37 |
38 | title = "Radar Visualisation Example"
39 |
40 | radar_timestamps = np.loadtxt(timestamps_path, delimiter=' ', usecols=[0], dtype=np.int64)
41 | for radar_timestamp in radar_timestamps:
42 | filename = os.path.join(args.dir, str(radar_timestamp) + '.png')
43 |
44 | if not os.path.isfile(filename):
45 | raise FileNotFoundError("Could not find radar example: {}".format(filename))
46 |
47 | timestamps, azimuths, valid, fft_data, radar_resolution = load_radar(filename)
48 | cart_img = radar_polar_to_cartesian(azimuths, fft_data, radar_resolution, cart_resolution, cart_pixel_width,
49 | interpolate_crossover)
50 |
51 | # Combine polar and cartesian for visualisation
52 | # The raw polar data is resized to the height of the cartesian representation
53 | downsample_rate = 4
54 | fft_data_vis = fft_data[:, ::downsample_rate]
55 | resize_factor = float(cart_img.shape[0]) / float(fft_data_vis.shape[0])
56 | fft_data_vis = cv2.resize(fft_data_vis, (0, 0), None, resize_factor, resize_factor)
57 | vis = cv2.hconcat((fft_data_vis, fft_data_vis[:, :10] * 0 + 1, cart_img))
58 |
59 | cv2.imshow(title, vis * 2.) # The data is doubled to improve visualisation
60 | cv2.waitKey(1)
61 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/play_velodyne.py:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | #
3 | # Copyright (c) 2017 University of Oxford
4 | # Authors:
5 | # Dan Barnes (dbarnes@robots.ox.ac.uk)
6 | #
7 | # This work is licensed under the Creative Commons
8 | # Attribution-NonCommercial-ShareAlike 4.0 International License.
9 | # To view a copy of this license, visit
10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to
11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
12 | #
13 | ################################################################################
14 |
15 |
16 | import argparse
17 | from argparse import RawTextHelpFormatter
18 | import os
19 | from velodyne import load_velodyne_raw, load_velodyne_binary, velodyne_raw_to_pointcloud
20 | import numpy as np
21 | import cv2
22 | from matplotlib.cm import get_cmap
23 | from scipy import interpolate
24 | import open3d
25 | from transform import build_se3_transform
26 |
27 | mode_flag_help = """Mode to run in, one of: (raw|raw_interp|raw_ptcld|bin_ptcld)
28 | - 'raw' - visualise the raw velodyne data (intensities and ranges)
29 | - 'raw_interp' - visualise the raw velodyne data (intensities and ranges) interpolated to consistent azimuth angles
30 | between scans.
31 | - 'raw_ptcld' - visualise the raw velodyne data converted to a pointcloud (converts files of the form
32 | .png to pointcloud)
33 | - 'bin_ptcld' - visualise the precomputed velodyne pointclouds (files of the form .bin). This is
34 | approximately 2x faster than running the conversion from raw data `raw_ptcld` at the cost of
35 | approximately 8x the storage space.
36 | """
37 |
38 | parser = argparse.ArgumentParser(description='Play back velodyne data from a given directory',
39 | formatter_class=RawTextHelpFormatter)
40 | parser.add_argument('--mode', default="raw_interp", type=str, help=mode_flag_help)
41 | parser.add_argument('--scale', default=2., type=float, help="Scale visualisations by this amount")
42 | parser.add_argument('dir', type=str, help='Directory containing velodyne data.')
43 |
44 | args = parser.parse_args()
45 |
46 |
47 | def main():
48 | velodyne_dir = args.dir
49 | if velodyne_dir[-1] == '/':
50 | velodyne_dir = velodyne_dir[:-1]
51 |
52 | velodyne_sensor = os.path.basename(velodyne_dir)
53 | if velodyne_sensor not in ["velodyne_left", "velodyne_right"]:
54 | raise ValueError("Velodyne directory not valid: {}".format(velodyne_dir))
55 |
56 | timestamps_path = velodyne_dir + '.timestamps'
57 | if not os.path.isfile(timestamps_path):
58 | raise IOError("Could not find timestamps file: {}".format(timestamps_path))
59 |
60 | title = "Velodyne Visualisation Example"
61 | extension = ".bin" if args.mode == "bin_ptcld" else ".png"
62 | velodyne_timestamps = np.loadtxt(timestamps_path, delimiter=' ', usecols=[0], dtype=np.int64)
63 | colourmap = (get_cmap("viridis")(np.linspace(0, 1, 255))[:, :3] * 255).astype(np.uint8)[:, ::-1]
64 | interp_angles = np.mod(np.linspace(np.pi, 3 * np.pi, 720), 2 * np.pi)
65 | vis = None
66 |
67 | for velodyne_timestamp in velodyne_timestamps:
68 |
69 | filename = os.path.join(args.dir, str(velodyne_timestamp) + extension)
70 |
71 | if args.mode == "bin_ptcld":
72 | ptcld = load_velodyne_binary(filename)
73 | else:
74 | ranges, intensities, angles, approximate_timestamps = load_velodyne_raw(filename)
75 |
76 | if args.mode == "raw_ptcld":
77 | ptcld = velodyne_raw_to_pointcloud(ranges, intensities, angles)
78 | elif args.mode == "raw_interp":
79 | intensities = interpolate.interp1d(angles[0], intensities, bounds_error=False)(interp_angles)
80 | ranges = interpolate.interp1d(angles[0], ranges, bounds_error=False)(interp_angles)
81 | intensities[np.isnan(intensities)] = 0
82 | ranges[np.isnan(ranges)] = 0
83 |
84 | if '_ptcld' in args.mode:
85 | # Pointcloud Visualisation using Open3D
86 | if vis is None:
87 | vis = open3d.Visualizer()
88 | vis.create_window(window_name=title)
89 | pcd = open3d.geometry.PointCloud()
90 | # initialise the geometry pre loop
91 | pcd.points = open3d.utility.Vector3dVector(ptcld[:3].transpose().astype(np.float64))
92 | pcd.colors = open3d.utility.Vector3dVector(np.tile(ptcld[3:].transpose(), (1, 3)).astype(np.float64))
93 | # Rotate pointcloud to align displayed coordinate frame colouring
94 | pcd.transform(build_se3_transform([0, 0, 0, np.pi, 0, -np.pi / 2]))
95 | vis.add_geometry(pcd)
96 | render_option = vis.get_render_option()
97 | render_option.background_color = np.array([0.1529, 0.1569, 0.1333], np.float32)
98 | render_option.point_color_option = open3d.PointColorOption.ZCoordinate
99 | coordinate_frame = open3d.geometry.create_mesh_coordinate_frame()
100 | vis.add_geometry(coordinate_frame)
101 | view_control = vis.get_view_control()
102 | params = view_control.convert_to_pinhole_camera_parameters()
103 | params.extrinsic = build_se3_transform([0, 3, 10, 0, -np.pi * 0.42, -np.pi / 2])
104 | view_control.convert_from_pinhole_camera_parameters(params)
105 |
106 | pcd.points = open3d.utility.Vector3dVector(ptcld[:3].transpose().astype(np.float64))
107 | pcd.colors = open3d.utility.Vector3dVector(
108 | np.tile(ptcld[3:].transpose(), (1, 3)).astype(np.float64) / 40)
109 | vis.update_geometry()
110 | vis.poll_events()
111 | vis.update_renderer()
112 |
113 | else:
114 | # Ranges and Intensities visualisation using OpenCV
115 | intensities_vis = colourmap[np.clip((intensities * 4).astype(int), 0, colourmap.shape[0] - 1)]
116 | ranges_vis = colourmap[np.clip((ranges * 4).astype(int), 0, colourmap.shape[0] - 1)]
117 | visualisation = np.concatenate((intensities_vis, ranges_vis), 0)
118 | visualisation = cv2.resize(visualisation, None, fy=6 * args.scale, fx=args.scale,
119 | interpolation=cv2.INTER_NEAREST)
120 | cv2.imshow(title, visualisation)
121 | cv2.waitKey(1)
122 |
123 |
124 | if __name__ == "__main__":
125 | main()
126 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/project_laser_into_camera.py:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | #
3 | # Copyright (c) 2017 University of Oxford
4 | # Authors:
5 | # Geoff Pascoe (gmp@robots.ox.ac.uk)
6 | #
7 | # This work is licensed under the Creative Commons
8 | # Attribution-NonCommercial-ShareAlike 4.0 International License.
9 | # To view a copy of this license, visit
10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to
11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
12 | #
13 | ################################################################################
14 |
15 | import os
16 | import re
17 | import numpy as np
18 | import matplotlib.pyplot as plt
19 | import argparse
20 |
21 | from build_pointcloud import build_pointcloud
22 | from transform import build_se3_transform
23 | from image import load_image
24 | from camera_model import CameraModel
25 |
26 | parser = argparse.ArgumentParser(description='Project LIDAR data into camera image')
27 | parser.add_argument('--image_dir', type=str, help='Directory containing images')
28 | parser.add_argument('--laser_dir', type=str, help='Directory containing LIDAR scans')
29 | parser.add_argument('--poses_file', type=str, help='File containing either INS or VO poses')
30 | parser.add_argument('--models_dir', type=str, help='Directory containing camera models')
31 | parser.add_argument('--extrinsics_dir', type=str, help='Directory containing sensor extrinsics')
32 | parser.add_argument('--image_idx', type=int, help='Index of image to display')
33 |
34 | args = parser.parse_args()
35 |
36 | model = CameraModel(args.models_dir, args.image_dir)
37 |
38 | extrinsics_path = os.path.join(args.extrinsics_dir, model.camera + '.txt')
39 | with open(extrinsics_path) as extrinsics_file:
40 | extrinsics = [float(x) for x in next(extrinsics_file).split(' ')]
41 |
42 | G_camera_vehicle = build_se3_transform(extrinsics)
43 | G_camera_posesource = None
44 |
45 | poses_type = re.search('(vo|ins|rtk)\.csv', args.poses_file).group(1)
46 | if poses_type in ['ins', 'rtk']:
47 | with open(os.path.join(args.extrinsics_dir, 'ins.txt')) as extrinsics_file:
48 | extrinsics = next(extrinsics_file)
49 | G_camera_posesource = G_camera_vehicle * build_se3_transform([float(x) for x in extrinsics.split(' ')])
50 | else:
51 | # VO frame and vehicle frame are the same
52 | G_camera_posesource = G_camera_vehicle
53 |
54 |
55 | timestamps_path = os.path.join(args.image_dir, os.pardir, model.camera + '.timestamps')
56 | if not os.path.isfile(timestamps_path):
57 | timestamps_path = os.path.join(args.image_dir, os.pardir, os.pardir, model.camera + '.timestamps')
58 |
59 | timestamp = 0
60 | with open(timestamps_path) as timestamps_file:
61 | for i, line in enumerate(timestamps_file):
62 | if i == args.image_idx:
63 | timestamp = int(line.split(' ')[0])
64 |
65 | pointcloud, reflectance = build_pointcloud(args.laser_dir, args.poses_file, args.extrinsics_dir,
66 | timestamp - 1e7, timestamp + 1e7, timestamp)
67 |
68 | pointcloud = np.dot(G_camera_posesource, pointcloud)
69 |
70 | image_path = os.path.join(args.image_dir, str(timestamp) + '.png')
71 | image = load_image(image_path, model)
72 |
73 | uv, depth = model.project(pointcloud, image.shape)
74 |
75 | plt.imshow(image)
76 | plt.scatter(np.ravel(uv[0, :]), np.ravel(uv[1, :]), s=2, c=depth, edgecolors='none', cmap='jet')
77 | plt.xlim(0, image.shape[1])
78 | plt.ylim(image.shape[0], 0)
79 | plt.xticks([])
80 | plt.yticks([])
81 | plt.show()
82 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/radar.py:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | #
3 | # Copyright (c) 2017 University of Oxford
4 | # Authors:
5 | # Dan Barnes (dbarnes@robots.ox.ac.uk)
6 | #
7 | # This work is licensed under the Creative Commons
8 | # Attribution-NonCommercial-ShareAlike 4.0 International License.
9 | # To view a copy of this license, visit
10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to
11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
12 | #
13 | ###############################################################################
14 |
15 | from typing import AnyStr, Tuple
16 | import numpy as np
17 | import cv2
18 |
19 |
20 | def load_radar(example_path: AnyStr) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray, float]:
21 | """Decode a single Oxford Radar RobotCar Dataset radar example
22 | Args:
23 | example_path (AnyStr): Oxford Radar RobotCar Dataset Example png
24 | Returns:
25 | timestamps (np.ndarray): Timestamp for each azimuth in int64 (UNIX time)
26 | azimuths (np.ndarray): Rotation for each polar radar azimuth (radians)
27 | valid (np.ndarray): Mask of whether azimuth data is an original sensor reading or interpolated from adjacent
28 | azimuths
29 | fft_data (np.ndarray): Radar power readings along each azimuth
30 | radar_resolution (float): Resolution of the polar radar data (metres per pixel)
31 | """
32 | # Hard coded configuration to simplify parsing code
33 | radar_resolution = np.array([0.0432], np.float32)
34 | encoder_size = 5600
35 |
36 | raw_example_data = cv2.imread(example_path, cv2.IMREAD_GRAYSCALE)
37 | timestamps = raw_example_data[:, :8].copy().view(np.int64)
38 | azimuths = (raw_example_data[:, 8:10].copy().view(np.uint16) / float(encoder_size) * 2 * np.pi).astype(np.float32)
39 | valid = raw_example_data[:, 10:11] == 255
40 | fft_data = raw_example_data[:, 11:].astype(np.float32)[:, :, np.newaxis] / 255.
41 |
42 | return timestamps, azimuths, valid, fft_data, radar_resolution
43 |
44 |
45 | def radar_polar_to_cartesian(azimuths: np.ndarray, fft_data: np.ndarray, radar_resolution: float,
46 | cart_resolution: float, cart_pixel_width: int, interpolate_crossover=True) -> np.ndarray:
47 | """Convert a polar radar scan to cartesian.
48 | Args:
49 | azimuths (np.ndarray): Rotation for each polar radar azimuth (radians)
50 | fft_data (np.ndarray): Polar radar power readings
51 | radar_resolution (float): Resolution of the polar radar data (metres per pixel)
52 | cart_resolution (float): Cartesian resolution (metres per pixel)
53 | cart_pixel_width (int): Width and height of the returned square cartesian output (pixels). Please see the Notes
54 | below for a full explanation of how this is used.
55 | interpolate_crossover (bool, optional): If true interpolates between the end and start azimuth of the scan. In
56 | practice a scan before / after should be used but this prevents nan regions in the return cartesian form.
57 |
58 | Returns:
59 | np.ndarray: Cartesian radar power readings
60 | Notes:
61 | After using the warping grid, the output Cartesian radar image is defined as follows, where
62 | X and Y are the `real` world locations of the pixels in metres:
63 | If 'cart_pixel_width' is odd:
64 | +------ Y = -1 * cart_resolution (m)
65 | |+----- Y = 0 (m) at centre pixel
66 | ||+---- Y = 1 * cart_resolution (m)
67 | |||+--- Y = 2 * cart_resolution (m)
68 | |||| +- Y = cart_pixel_width // 2 * cart_resolution (m) (at last pixel)
69 | |||| +-----------+
70 | vvvv v
71 | +---------------+---------------+
72 | | | |
73 | | | |
74 | | | |
75 | | | |
76 | | | |
77 | | | |
78 | | | |
79 | +---------------+---------------+ <-- X = 0 (m) at centre pixel
80 | | | |
81 | | | |
82 | | | |
83 | | | |
84 | | | |
85 | | | |
86 | | | |
87 | +---------------+---------------+
88 | <------------------------------->
89 | cart_pixel_width (pixels)
90 | If 'cart_pixel_width' is even:
91 | +------ Y = -0.5 * cart_resolution (m)
92 | |+----- Y = 0.5 * cart_resolution (m)
93 | ||+---- Y = 1.5 * cart_resolution (m)
94 | |||+--- Y = 2.5 * cart_resolution (m)
95 | |||| +- Y = (cart_pixel_width / 2 - 0.5) * cart_resolution (m) (at last pixel)
96 | |||| +----------+
97 | vvvv v
98 | +------------------------------+
99 | | |
100 | | |
101 | | |
102 | | |
103 | | |
104 | | |
105 | | |
106 | | |
107 | | |
108 | | |
109 | | |
110 | | |
111 | | |
112 | | |
113 | | |
114 | +------------------------------+
115 | <------------------------------>
116 | cart_pixel_width (pixels)
117 | """
118 | if (cart_pixel_width % 2) == 0:
119 | cart_min_range = (cart_pixel_width / 2 - 0.5) * cart_resolution
120 | else:
121 | cart_min_range = cart_pixel_width // 2 * cart_resolution
122 | coords = np.linspace(-cart_min_range, cart_min_range, cart_pixel_width, dtype=np.float32)
123 | Y, X = np.meshgrid(coords, -coords)
124 | sample_range = np.sqrt(Y * Y + X * X)
125 | sample_angle = np.arctan2(Y, X)
126 | sample_angle += (sample_angle < 0).astype(np.float32) * 2. * np.pi
127 |
128 | # Interpolate Radar Data Coordinates
129 | azimuth_step = azimuths[1] - azimuths[0]
130 | sample_u = (sample_range - radar_resolution / 2) / radar_resolution
131 | sample_v = (sample_angle - azimuths[0]) / azimuth_step
132 |
133 | # We clip the sample points to the minimum sensor reading range so that we
134 | # do not have undefined results in the centre of the image. In practice
135 | # this region is simply undefined.
136 | sample_u[sample_u < 0] = 0
137 |
138 | if interpolate_crossover:
139 | fft_data = np.concatenate((fft_data[-1:], fft_data, fft_data[:1]), 0)
140 | sample_v = sample_v + 1
141 |
142 | polar_to_cart_warp = np.stack((sample_u, sample_v), -1)
143 | cart_img = np.expand_dims(cv2.remap(fft_data, polar_to_cart_warp, None, cv2.INTER_LINEAR), -1)
144 | return cart_img
145 |
--------------------------------------------------------------------------------
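
For reference, a minimal decoding sketch built on the two functions above, using the same Cartesian settings as `play_radar.py`; the scan path is a placeholder.

```
from radar import load_radar, radar_polar_to_cartesian

example_png = 'radar/1547120800000000.png'  # placeholder path to one raw radar scan

timestamps, azimuths, valid, fft_data, radar_resolution = load_radar(example_png)

# 0.25 m/pixel, 501x501 px Cartesian image (same parameters as play_radar.py).
cart_img = radar_polar_to_cartesian(azimuths, fft_data, radar_resolution,
                                    cart_resolution=0.25, cart_pixel_width=501,
                                    interpolate_crossover=True)
print(cart_img.shape)  # (501, 501, 1)
```
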
/datasets/robotcar_sdk/python/requirements.txt:
--------------------------------------------------------------------------------
1 | numpy
2 | matplotlib
3 | colour_demosaicing
4 | pillow
5 | opencv-python
6 | open3d-python
7 |
--------------------------------------------------------------------------------
/datasets/robotcar_sdk/python/transform.py:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | #
3 | # Copyright (c) 2017 University of Oxford
4 | # Authors:
5 | # Geoff Pascoe (gmp@robots.ox.ac.uk)
6 | #
7 | # This work is licensed under the Creative Commons
8 | # Attribution-NonCommercial-ShareAlike 4.0 International License.
9 | # To view a copy of this license, visit
10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to
11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
12 | #
13 | ################################################################################
14 |
15 | import numpy as np
16 | import numpy.matlib as matlib
17 | from math import sin, cos, atan2, sqrt
18 |
19 | MATRIX_MATCH_TOLERANCE = 1e-4
20 |
21 |
22 | def build_se3_transform(xyzrpy):
23 | """Creates an SE3 transform from translation and Euler angles.
24 |
25 | Args:
26 | xyzrpy (list[float]): translation and Euler angles for transform. Must have six components.
27 |
28 | Returns:
29 | numpy.matrixlib.defmatrix.matrix: SE3 homogeneous transformation matrix
30 |
31 | Raises:
32 | ValueError: if `len(xyzrpy) != 6`
33 |
34 | """
35 | if len(xyzrpy) != 6:
36 | raise ValueError("Must supply 6 values to build transform")
37 |
38 | se3 = matlib.identity(4)
39 | se3[0:3, 0:3] = euler_to_so3(xyzrpy[3:6])
40 | se3[0:3, 3] = np.matrix(xyzrpy[0:3]).transpose()
41 | return se3
42 |
43 |
44 | def euler_to_so3(rpy):
45 | """Converts Euler angles to an SO3 rotation matrix.
46 |
47 | Args:
48 | rpy (list[float]): Euler angles (in radians). Must have three components.
49 |
50 | Returns:
51 | numpy.matrixlib.defmatrix.matrix: 3x3 SO3 rotation matrix
52 |
53 | Raises:
54 | ValueError: if `len(rpy) != 3`.
55 |
56 | """
57 | if len(rpy) != 3:
58 | raise ValueError("Euler angles must have three components")
59 |
60 | R_x = np.matrix([[1, 0, 0],
61 | [0, cos(rpy[0]), -sin(rpy[0])],
62 | [0, sin(rpy[0]), cos(rpy[0])]])
63 | R_y = np.matrix([[cos(rpy[1]), 0, sin(rpy[1])],
64 | [0, 1, 0],
65 | [-sin(rpy[1]), 0, cos(rpy[1])]])
66 | R_z = np.matrix([[cos(rpy[2]), -sin(rpy[2]), 0],
67 | [sin(rpy[2]), cos(rpy[2]), 0],
68 | [0, 0, 1]])
69 | R_zyx = R_z * R_y * R_x
70 | return R_zyx
71 |
72 |
73 | def so3_to_euler(so3):
74 | """Converts an SO3 rotation matrix to Euler angles
75 |
76 | Args:
77 | so3: 3x3 rotation matrix
78 |
79 | Returns:
80 | numpy.matrixlib.defmatrix.matrix: list of Euler angles (size 3)
81 |
82 | Raises:
83 | ValueError: if so3 is not 3x3
84 | ValueError: if a valid Euler parametrisation cannot be found
85 |
86 | """
87 | if so3.shape != (3, 3):
88 | raise ValueError("SO3 matrix must be 3x3")
89 | roll = atan2(so3[2, 1], so3[2, 2])
90 | yaw = atan2(so3[1, 0], so3[0, 0])
91 | denom = sqrt(so3[0, 0] ** 2 + so3[1, 0] ** 2)
92 | pitch_poss = [atan2(-so3[2, 0], denom), atan2(-so3[2, 0], -denom)]
93 |
94 | R = euler_to_so3((roll, pitch_poss[0], yaw))
95 |
96 | if (so3 - R).sum() < MATRIX_MATCH_TOLERANCE:
97 | return np.matrix([roll, pitch_poss[0], yaw])
98 | else:
99 | R = euler_to_so3((roll, pitch_poss[1], yaw))
100 | if (so3 - R).sum() > MATRIX_MATCH_TOLERANCE:
101 | raise ValueError("Could not find valid pitch angle")
102 | return np.matrix([roll, pitch_poss[1], yaw])
103 |
104 |
105 | def so3_to_quaternion(so3):
106 | """Converts an SO3 rotation matrix to a quaternion
107 |
108 | Args:
109 | so3: 3x3 rotation matrix
110 |
111 | Returns:
112 | numpy.ndarray: quaternion [w, x, y, z]
113 |
114 | Raises:
115 | ValueError: if so3 is not 3x3
116 | """
117 | if so3.shape != (3, 3):
118 | raise ValueError("SO3 matrix must be 3x3")
119 |
120 | R_xx = so3[0, 0]
121 | R_xy = so3[0, 1]
122 | R_xz = so3[0, 2]
123 | R_yx = so3[1, 0]
124 | R_yy = so3[1, 1]
125 | R_yz = so3[1, 2]
126 | R_zx = so3[2, 0]
127 | R_zy = so3[2, 1]
128 | R_zz = so3[2, 2]
129 |
130 | try:
131 | w = sqrt(so3.trace() + 1) / 2
132 | except ValueError:
133 | # w is non-real
134 | w = 0
135 |
136 | # Due to numerical precision the value passed to `sqrt` may be a negative of the order 1e-15.
137 | # To avoid this error we clip these values to a minimum value of 0.
138 | x = sqrt(max(1 + R_xx - R_yy - R_zz, 0)) / 2
139 | y = sqrt(max(1 + R_yy - R_xx - R_zz, 0)) / 2
140 | z = sqrt(max(1 + R_zz - R_yy - R_xx, 0)) / 2
141 |
142 | max_index = max(range(4), key=[w, x, y, z].__getitem__)
143 |
144 | if max_index == 0:
145 | x = (R_zy - R_yz) / (4 * w)
146 | y = (R_xz - R_zx) / (4 * w)
147 | z = (R_yx - R_xy) / (4 * w)
148 | elif max_index == 1:
149 | w = (R_zy - R_yz) / (4 * x)
150 | y = (R_xy + R_yx) / (4 * x)
151 | z = (R_zx + R_xz) / (4 * x)
152 | elif max_index == 2:
153 | w = (R_xz - R_zx) / (4 * y)
154 | x = (R_xy + R_yx) / (4 * y)
155 | z = (R_yz + R_zy) / (4 * y)
156 | elif max_index == 3:
157 | w = (R_yx - R_xy) / (4 * z)
158 | x = (R_zx + R_xz) / (4 * z)
159 | y = (R_yz + R_zy) / (4 * z)
160 |
161 | return np.array([w, x, y, z])
162 |
163 |
164 | def se3_to_components(se3):
165 | """Converts an SE3 rotation matrix to linear translation and Euler angles
166 |
167 | Args:
168 | se3: 4x4 transformation matrix
169 |
170 | Returns:
171 | numpy.ndarray: [x, y, z, roll, pitch, yaw]
172 |
173 | Raises:
174 | ValueError: if se3 is not 4x4
175 | ValueError: if a valid Euler parametrisation cannot be found
176 |
177 | """
178 | if se3.shape != (4, 4):
179 | raise ValueError("SE3 transform must be a 4x4 matrix")
180 | xyzrpy = np.empty(6)
181 | xyzrpy[0:3] = se3[0:3, 3].transpose()
182 | xyzrpy[3:6] = so3_to_euler(se3[0:3, 0:3])
183 | return xyzrpy
184 |
--------------------------------------------------------------------------------
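
A short round-trip sketch for the helpers above (the pose values are arbitrary): build an SE3 matrix from a translation plus roll/pitch/yaw, then recover the components and the quaternion.

```
import numpy as np
from transform import build_se3_transform, se3_to_components, so3_to_quaternion

xyzrpy = [1.0, 2.0, 0.5, 0.0, 0.1, np.pi / 4]  # x, y, z (m) and roll, pitch, yaw (rad)

se3 = build_se3_transform(xyzrpy)              # 4x4 homogeneous transform (numpy matrix)
components = se3_to_components(se3)            # back to [x, y, z, roll, pitch, yaw]
quaternion = so3_to_quaternion(se3[0:3, 0:3])  # rotation as [w, x, y, z]

print(components)   # ~= xyzrpy for non-degenerate angles
print(quaternion)
```
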
/datasets/robotcar_sdk/python/velodyne.py:
--------------------------------------------------------------------------------
1 | ################################################################################
2 | #
3 | # Copyright (c) 2017 University of Oxford
4 | # Authors:
5 | # Dan Barnes (dbarnes@robots.ox.ac.uk)
6 | #
7 | # This work is licensed under the Creative Commons
8 | # Attribution-NonCommercial-ShareAlike 4.0 International License.
9 | # To view a copy of this license, visit
10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to
11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
12 | #
13 | ###############################################################################
14 |
15 | from typing import AnyStr
16 | import numpy as np
17 | import os
18 | import cv2
19 |
20 | # Hard coded configuration to simplify parsing code
21 | hdl32e_range_resolution = 0.002 # m / pixel
22 | hdl32e_minimum_range = 1.0
23 | hdl32e_elevations = np.array([-0.1862, -0.1628, -0.1396, -0.1164, -0.0930,
24 | -0.0698, -0.0466, -0.0232, 0., 0.0232, 0.0466, 0.0698,
25 | 0.0930, 0.1164, 0.1396, 0.1628, 0.1862, 0.2094, 0.2327,
26 | 0.2560, 0.2793, 0.3025, 0.3259, 0.3491, 0.3723, 0.3957,
27 | 0.4189, 0.4421, 0.4655, 0.4887, 0.5119, 0.5353])[:, np.newaxis]
28 | hdl32e_base_to_fire_height = 0.090805
29 | hdl32e_cos_elevations = np.cos(hdl32e_elevations)
30 | hdl32e_sin_elevations = np.sin(hdl32e_elevations)
31 |
32 |
33 | def load_velodyne_binary(velodyne_bin_path: AnyStr):
34 | """Decode a binary Velodyne example (of the form '.bin')
35 | Args:
36 | example_path (AnyStr): Oxford Radar RobotCar Dataset binary Velodyne pointcloud example path
37 | Returns:
38 | ptcld (np.ndarray): XYZI pointcloud from the binary Velodyne data, 4 x N
39 | Notes:
40 | - The pre computed points are *NOT* motion compensated.
41 | - Converting a raw velodyne scan to pointcloud can be done using the
42 | `velodyne_ranges_intensities_angles_to_pointcloud` function.
43 | """
44 | ext = os.path.splitext(velodyne_bin_path)[1]
45 | if ext != ".bin":
46 | raise RuntimeError("Velodyne binary pointcloud file should have `.bin` extension but had: {}".format(ext))
47 | if not os.path.isfile(velodyne_bin_path):
48 | raise FileNotFoundError("Could not find velodyne bin example: {}".format(velodyne_bin_path))
49 | data = np.fromfile(velodyne_bin_path, dtype=np.float32)
50 | ptcld = data.reshape((4, -1))
51 | return ptcld
52 |
53 | def load_velodyne_binary_seg(velodyne_bin_path: AnyStr):
54 | """Decode a binary Velodyne example (of the form '.bin')
55 | Args:
56 | example_path (AnyStr): Oxford Radar RobotCar Dataset binary Velodyne pointcloud example path
57 | Returns:
58 | ptcld (np.ndarray): XYZI pointcloud from the binary Velodyne data Nx4
59 | Notes:
60 | - The pre computed points are *NOT* motion compensated.
61 | - Converting a raw velodyne scan to pointcloud can be done using the
62 | `velodyne_ranges_intensities_angles_to_pointcloud` function.
63 | """
64 | ext = os.path.splitext(velodyne_bin_path)[1]
65 | if ext != ".bin":
66 | raise RuntimeError("Velodyne binary pointcloud file should have `.bin` extension but had: {}".format(ext))
67 | if not os.path.isfile(velodyne_bin_path):
68 | raise FileNotFoundError("Could not find velodyne bin example: {}".format(velodyne_bin_path))
69 | data = np.fromfile(velodyne_bin_path, dtype=np.float32)
70 | ptcld = data.reshape((-1, 4))
71 | return ptcld
72 |
73 | def load_velodyne_raw(velodyne_raw_path: AnyStr):
74 | """Decode a raw Velodyne example. (of the form '.png')
75 | Args:
76 | example_path (AnyStr): Oxford Radar RobotCar Dataset raw Velodyne example path
77 | Returns:
78 | ranges (np.ndarray): Range of each measurement in meters where 0 == invalid, (32 x N)
79 | intensities (np.ndarray): Intensity of each measurement where 0 == invalid, (32 x N)
80 | angles (np.ndarray): Angle of each measurement in radians (1 x N)
81 | approximate_timestamps (np.ndarray): Approximate linearly interpolated timestamps of each measurement (1 x N).
82 | Approximate as we only receive timestamps for each packet. The timestamp of the next frame was used to
83 | interpolate the last packet timestamps. If there was no next frame, the last packet timestamps were
84 | extrapolated. The original packet timestamps can be recovered with:
85 | approximate_timestamps(:, 1:12:end) (12 is the number of azimuth returns in each packet)
86 | Notes:
87 | Reference: https://velodynelidar.com/lidar/products/manual/63-9113%20HDL-32E%20manual_Rev%20E_NOV2012.pdf
88 | """
89 | ext = os.path.splitext(velodyne_raw_path)[1]
90 | if ext != ".png":
91 | raise RuntimeError("Velodyne raw file should have `.png` extension but had: {}".format(ext))
92 | if not os.path.isfile(velodyne_raw_path):
93 | raise FileNotFoundError("Could not find velodyne raw example: {}".format(velodyne_raw_path))
94 | example = cv2.imread(velodyne_raw_path, cv2.IMREAD_GRAYSCALE)
95 | intensities, ranges_raw, angles_raw, timestamps_raw = np.array_split(example, [32, 96, 98], 0)
96 | ranges = np.ascontiguousarray(ranges_raw.transpose()).view(np.uint16).transpose()
97 | ranges = ranges * hdl32e_range_resolution
98 | angles = np.ascontiguousarray(angles_raw.transpose()).view(np.uint16).transpose()
99 | angles = angles * (2. * np.pi) / 36000
100 | approximate_timestamps = np.ascontiguousarray(timestamps_raw.transpose()).view(np.int64).transpose()
101 | return ranges, intensities, angles, approximate_timestamps
102 |
103 |
104 | def velodyne_raw_to_pointcloud(ranges: np.ndarray, intensities: np.ndarray, angles: np.ndarray):
105 | """ Convert raw Velodyne data (from load_velodyne_raw) into a pointcloud
106 | Args:
107 | ranges (np.ndarray): Raw Velodyne range readings
108 | intensities (np.ndarray): Raw Velodyne intensity readings
109 | angles (np.ndarray): Raw Velodyne angles
110 | Returns:
111 | pointcloud (np.ndarray): XYZI pointcloud generated from the raw Velodyne data Nx4
112 |
113 | Notes:
114 | - This implementation does *NOT* perform motion compensation on the generated pointcloud.
115 | - Accessing the pointclouds in binary form via `load_velodyne_binary` is approximately 2x faster at the cost
116 | of 8x the storage space
117 | """
118 | valid = ranges > hdl32e_minimum_range
119 | z = hdl32e_sin_elevations * ranges - hdl32e_base_to_fire_height
120 | xy = hdl32e_cos_elevations * ranges
121 | x = np.sin(angles) * xy
122 | y = -np.cos(angles) * xy
123 |
124 | xf = x[valid].reshape(-1)
125 | yf = y[valid].reshape(-1)
126 | zf = z[valid].reshape(-1)
127 | intensityf = intensities[valid].reshape(-1).astype(np.float32)
128 | ptcld = np.stack((xf, yf, zf, intensityf), 0)
129 | return ptcld
130 |
--------------------------------------------------------------------------------
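
A minimal sketch contrasting the two loading routes described in the docstrings above; the scan paths are placeholders.

```
from velodyne import load_velodyne_raw, load_velodyne_binary, velodyne_raw_to_pointcloud

raw_path = 'velodyne_left/1547120800000000.png'  # placeholder raw scan
bin_path = 'velodyne_left/1547120800000000.bin'  # placeholder precomputed pointcloud

# Raw route: decode ranges/intensities/angles, then project to an XYZI pointcloud.
ranges, intensities, angles, approx_timestamps = load_velodyne_raw(raw_path)
ptcld = velodyne_raw_to_pointcloud(ranges, intensities, angles)

# Precomputed route: roughly 2x faster to load, at roughly 8x the storage cost.
ptcld_bin = load_velodyne_binary(bin_path)

print(ptcld.shape, ptcld_bin.shape)  # both 4 x N (N differs per scan)
```
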
/datasets/rsd.py:
--------------------------------------------------------------------------------
1 | """
2 | This code is designed to implement the following:
3 | 1. After reaching a certain epoch, it clips the data based on the median loss.
4 | 2. After reaching a later epoch, it restores the full dataset.
5 |
6 | Modified from InfoBatch
7 | """
8 | import math
9 | import torch
10 | import numpy as np
11 | import MinkowskiEngine as ME
12 | from torch.utils.data import Dataset
13 |
14 |
15 | class RSD(Dataset):
16 | """
17 | Args:
18 | dataset: Dataset used for training.
19 | num_epochs (int): The number of epochs for pruning.
20 | prune_ratio (float, optional): The proportion of samples pruned at each downsampling step. Defaults to 0.25.
21 | delta_start / delta_stop (float, optional): Fractions of num_epochs at which the downsampling schedule starts and stops. Defaults to 0.25 and 0.85. windows_radio (float, optional): length of the recording window as a fraction of num_epochs. Defaults to 0.1.
22 | """
23 |
24 | def __init__(self, dataset: Dataset, num_epochs: int,
25 | prune_ratio: float = 0.25, delta_start: float = 0.25, windows_radio: float = 0.1,
26 | delta_stop: float = 0.85):
27 | self.dataset = dataset
28 | self.keep_ratio = min(1.0, max(1e-1, 1.0 - prune_ratio))
29 | self.prune_ratio = prune_ratio
30 | self.num_epochs = num_epochs
31 | self.delta_start = delta_start
32 | self.delta_stop = delta_stop
33 | self.windows = windows_radio # 0.1
34 | self.keep_windows_ratio = ((delta_stop - delta_start) - 2 * windows_radio) / 2 # 0.2
35 | self.weights = torch.ones(len(self.dataset), 1)
36 | self.select_idx = torch.zeros(len(dataset), dtype=torch.bool)
37 | self.num_pruned_samples = 0
38 | self.cur_batch_index = None
39 |
40 | def set_active_indices(self, cur_batch_indices: torch.Tensor):
41 | self.cur_batch_index = cur_batch_indices
42 |
43 | def update(self, loss, idx, batch_num):
44 | """
45 | Scaling loss
46 | """
47 | device = loss.device
48 | weights = self.weights.to(device)[idx][batch_num.long()]
49 | loss.mul_(weights)
50 |
51 | return loss.mean()
52 |
53 | def __len__(self):
54 | return len(self.dataset)
55 |
56 | def __getitem__(self, idx):
57 | scan_path = self.dataset.pcs[idx]
58 | if self.dataset.scene == 'NCLT':
59 | ptcld = np.fromfile(scan_path, dtype=np.float32).reshape(-1, 4)[:, :3]
60 | else:
61 | ptcld = np.fromfile(scan_path, dtype=np.float32).reshape(4, -1).transpose()[:, :3]
62 | ptcld[:, 2] = -1 * ptcld[:, 2]
63 |
64 | scan = ptcld
65 |
66 | scan = np.ascontiguousarray(scan)
67 |
68 | lbl = self.dataset.lbl[idx]
69 | pose = self.dataset.poses[idx] # (6,)
70 | rot = self.dataset.rots[idx] # [3, 3]
71 | scan_gt = (rot @ scan.transpose(1, 0)).transpose(1, 0) + pose[:3].reshape(1, 3)
72 |
73 | if self.dataset.train and self.dataset.augment:
74 | scan = self.dataset.augmentor.doAugmentation(scan) # n, 5
75 |
76 | scan_gt_s8 = np.concatenate((scan, scan_gt), axis=1)
77 |
78 | coords, feats = ME.utils.sparse_quantize(
79 | coordinates=scan,
80 | features=scan_gt_s8,
81 | quantization_size=self.dataset.voxel_size)
82 |
83 | return (coords, feats, lbl, idx, pose)
84 |
85 | # First Downsampling
86 | def cal_std(self, values, labels, label_num):
87 | # 1. Calculate the variance of each row
88 | variances = values.var(dim=1)
89 | # 2. Calculate separately for each cluster
90 | for i in range(label_num):
91 | # Obtain the sample indices for this cluster
92 | label = i
93 | label_indices = torch.where(labels == label)[0]
94 | # Sort by variance in descending order and select the top keep_ratio percentage of samples
95 | label_variances = variances[label_indices]
96 | num_top_samples = int(self.keep_ratio * len(label_variances))
97 | # Obtain the indices of the samples with the largest num_top_samples variances
98 | _, top_indices = torch.topk(label_variances, num_top_samples)
99 | selected_indices = label_indices[top_indices]
100 | # Set the mask of these samples to true
101 | self.select_idx[selected_indices] = True
102 |
103 | remain_idx = torch.where(self.select_idx)[0]
104 | self.weights[remain_idx] = self.weights[remain_idx] * (1 / self.keep_ratio)
105 |
106 | return len(remain_idx), remain_idx
107 |
108 | # Second Downsampling, update select_idx
109 | def sec_cal_std(self, values, labels, selected_indices, label_num):
110 | values = values[selected_indices]
111 | labels = labels[selected_indices]
112 | variances = values.var(dim=1)
113 | self.select_idx[:] = False
114 | for i in range(label_num):
115 | label = i
116 | label_indices = torch.where(labels == label)[0]
117 | label_variances = variances[label_indices]
118 | num_top_samples = int(self.keep_ratio * len(label_variances))
119 | _, top_indices = torch.topk(label_variances, num_top_samples)
120 | second_selected_indices = label_indices[top_indices]
121 | self.select_idx[second_selected_indices] = True
122 |
123 | remain_idx = torch.where(self.select_idx)[0]
124 | self.weights[remain_idx] = self.weights[remain_idx] * (1 / self.keep_ratio)
125 |
126 | return torch.sum(self.select_idx)
127 |
128 | def prune(self):
129 | # Prune samples that are well learned, rebalance the weight by scaling up remaining
130 | # well learned samples' learning rate to keep estimation about the same
131 | # for the next version, also consider new class balance
132 | remained_mask = self.select_idx.numpy()
133 | remained_indices = np.where(remained_mask)[0].tolist()
134 | np.random.shuffle(remained_indices)
135 |
136 | return remained_indices
137 |
138 | @property
139 | def sampler(self):
140 | sampler = IBSampler(self)
141 | return sampler
142 |
143 | def no_prune(self):
144 | samples_indices = list(range(len(self)))
145 | np.random.shuffle(samples_indices)
146 | return samples_indices
147 |
148 | def reset_weights(self):
149 | self.weights[:] = 1
150 |
151 | @property
152 | def first_prune(self):
153 | # Round down
154 | return math.floor(self.num_epochs * self.delta_start)
155 |
156 | @property
157 | def stop_prune(self):
158 | # Round down
159 | return math.floor(self.num_epochs * self.delta_stop)
160 |
161 | @property
162 | def second_prune(self):
163 | # Round down
164 | # e.g. floor(25 * (0.25 + 0.1 + 0.2)) = floor(13.75) = 13
165 | return math.floor(self.num_epochs * (self.delta_start + self.windows + self.keep_windows_ratio))
166 |
167 | @property
168 | def win_std(self):
169 | # Round up
170 | return math.ceil(self.num_epochs * self.windows)
171 |
172 |
173 | class IBSampler(object):
174 | def __init__(self, dataset: RSD):
175 | self.dataset = dataset
176 | self.first_prune = dataset.first_prune
177 | self.stop_prune = dataset.stop_prune
178 | self.windows = dataset.win_std
179 | self.iterations = -1
180 | self.sample_indices = None
181 | self.iter_obj = None
182 | self.reset()
183 |
184 | def reset(self):
185 | # np.random.seed(self.iterations)
186 | print("iterations: %d" % self.iterations)
187 | if self.iterations >= self.stop_prune or self.iterations < (self.first_prune + self.windows):
188 | print("no pruning")
189 | if self.iterations == self.stop_prune:
190 | self.dataset.reset_weights()
191 | self.sample_indices = self.dataset.no_prune()
192 | else:
193 | print("pruning")
194 | self.sample_indices = self.dataset.prune()
195 | print(len(self.sample_indices))
196 | self.iter_obj = iter(self.sample_indices)
197 | self.iterations += 1
198 |
199 | def __next__(self):
200 | nxt = next(self.iter_obj)
201 | return nxt
202 |
203 | def __len__(self):
204 | return len(self.sample_indices)
205 |
206 | def __iter__(self):
207 | self.reset()
208 | return self
209 |
--------------------------------------------------------------------------------
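
Since the downsampling schedule above is driven entirely by `num_epochs` and the delta/window ratios, a small sketch makes the epoch boundaries concrete. The `TensorDataset` stand-in and the import path are assumptions for illustration; in the actual training pipeline RSD wraps the LightLoc dataset object and its `sampler` property is designed to drive a DataLoader.

```
import torch
from torch.utils.data import TensorDataset
from datasets.rsd import RSD  # assumes the repository root is on PYTHONPATH

# Stand-in dataset: RSD only needs len(dataset) to set up its schedule and weights.
dummy = TensorDataset(torch.zeros(1000, 3))
rsd_set = RSD(dummy, num_epochs=25, prune_ratio=0.25)

# Schedule derived from delta_start / windows_radio / delta_stop:
print(rsd_set.first_prune)   # floor(25 * 0.25)               = 6
print(rsd_set.win_std)       # ceil(25 * 0.10)                = 3
print(rsd_set.second_prune)  # floor(25 * (0.25 + 0.1 + 0.2)) = 13
print(rsd_set.stop_prune)    # floor(25 * 0.85)               = 21

# Per IBSampler.reset(), pruned index lists are used only for epochs in
# [first_prune + win_std, stop_prune); weights are reset once pruning stops.
sampler = rsd_set.sampler
print(len(sampler))          # 1000: pruning has not started at epoch 0
```
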
/img/Pipline.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/img/Pipline.png
--------------------------------------------------------------------------------
/img/Teaser.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/img/Teaser.png
--------------------------------------------------------------------------------
/install.sh:
--------------------------------------------------------------------------------
1 | conda deactivate
2 |
3 | # Set environment variables
4 | export ENV_NAME=LightLoc
5 | export PYTHON_VERSION=3.8
6 | export PYTORCH_VERSION=1.11.0
7 | export CUDA_VERSION=11.3
8 |
9 | # Create a new conda environment and activate it
10 | conda create -n $ENV_NAME python=$PYTHON_VERSION -y
11 | conda activate $ENV_NAME
12 |
13 | # Install PyTorch
14 | conda install pytorch==$PYTORCH_VERSION torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=$CUDA_VERSION -c pytorch -y
15 | # Install MinkowskiEngine
16 | conda install openblas-devel -c anaconda -y
17 | pip install pip==22.2.1
18 | # Downgrade the version of setuptools
19 | pip uninstall setuptools -y
20 | pip install setuptools==69.5.1
21 | # gcc g++ 9.5
22 | pip install -U git+https://github.com/NVIDIA/MinkowskiEngine -v --no-deps --install-option="--blas_include_dirs=${CONDA_PREFIX}/include" --install-option="--blas=openblas"
23 |
24 | # Install pip packages
25 | pip install matplotlib
26 | pip install tqdm
27 | pip install h5py
28 | pip install pandas
29 | pip install transforms3d
30 | pip install open3d
31 |
--------------------------------------------------------------------------------
/models/__pycache__/ace.cpython-310.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/models/__pycache__/ace.cpython-310.pyc
--------------------------------------------------------------------------------
/models/__pycache__/ace.cpython-39.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/models/__pycache__/ace.cpython-39.pyc
--------------------------------------------------------------------------------
/models/__pycache__/loss.cpython-39.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/models/__pycache__/loss.cpython-39.pyc
--------------------------------------------------------------------------------
/models/__pycache__/sc2pcr.cpython-39.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/models/__pycache__/sc2pcr.cpython-39.pyc
--------------------------------------------------------------------------------
/models/lightloc.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # Copyright © Niantic, Inc. 2022.
3 | import re
4 | import torch
5 | import logging
6 | import torch.nn as nn
7 | import MinkowskiEngine as ME
8 | import MinkowskiEngine.MinkowskiFunctional as MEF
9 | import torch.nn.functional as F
10 |
11 |
12 | _logger = logging.getLogger(__name__)
13 |
14 |
15 | def Norm(norm_type, num_feats, bn_momentum=0.1, D=-1):
16 | if norm_type == 'BN':
17 | return ME.MinkowskiBatchNorm(num_feats, momentum=bn_momentum)
18 | elif norm_type == 'IN':
19 | return ME.MinkowskiInstanceNorm(num_feats, dimension=D)
20 | else:
21 | raise ValueError(f'Norm type {norm_type} is not defined')
22 |
23 |
24 | class Conv(nn.Module):
25 | def __init__(self,
26 | inplanes,
27 | planes,
28 | kernel_size=3,
29 | stride=1,
30 | dilation=1,
31 | bias=False,
32 | dimension=3):
33 | super(Conv, self).__init__()
34 |
35 | self.net = nn.Sequential(ME.MinkowskiConvolution(inplanes,
36 | planes,
37 | kernel_size=kernel_size,
38 | stride=stride,
39 | dilation=dilation,
40 | bias=bias,
41 | dimension=dimension),)
42 |
43 | def forward(self, x):
44 | return self.net(x)
45 |
46 |
47 | class Encoder(ME.MinkowskiNetwork):
48 | """
49 | FCN encoder, used to extract features from the input point clouds.
50 |
51 | The number of output channels is configurable; the default used in the paper is 512.
52 | """
53 |
54 | def __init__(self, out_channels, norm_type, D=3):
55 | super(Encoder, self).__init__(D)
56 |
57 | self.in_channels = 3
58 | self.out_channels = out_channels
59 | self.norm_type = norm_type
60 | self.conv_planes = [32, 64, 128, 256, 256, 256, 256, 512, 512]
61 |
62 | # in_channels, conv_planes, kernel_size, stride, dilation, bias
63 | self.conv1 = Conv(self.in_channels, self.conv_planes[0], 3, 1, 1, True)
64 | self.conv2 = Conv(self.conv_planes[0], self.conv_planes[1], 3, 2, bias=True)
65 | self.conv3 = Conv(self.conv_planes[1], self.conv_planes[2], 3, 2, bias=True)
66 | self.conv4 = Conv(self.conv_planes[2], self.conv_planes[3], 3, 2, bias=True)
67 |
68 | self.res1_conv1 = Conv(self.conv_planes[3], self.conv_planes[4], 3, 1, bias=True)
69 | # 1
70 | self.res1_conv2 = Conv(self.conv_planes[4], self.conv_planes[5], 1, 1, bias=True)
71 | self.res1_conv3 = Conv(self.conv_planes[5], self.conv_planes[6], 3, 1, bias=True)
72 |
73 | self.res2_conv1 = Conv(self.conv_planes[6], self.conv_planes[7], 3, 1, bias=True)
74 | # 2
75 | self.res2_conv2 = Conv(self.conv_planes[7], self.conv_planes[8], 1, 1, bias=True)
76 | self.res2_conv3 = Conv(self.conv_planes[8], self.out_channels, 3, 1, bias=True)
77 |
78 | self.res2_skip = Conv(self.conv_planes[6], self.out_channels, 1, 1, bias=True)
79 |
80 | def forward(self, x):
81 |
82 | x = MEF.relu(self.conv1(x))
83 | x = MEF.relu(self.conv2(x))
84 | x = MEF.relu(self.conv3(x))
85 | res = MEF.relu(self.conv4(x))
86 |
87 | x = MEF.relu(self.res1_conv1(res))
88 | x = MEF.relu(self.res1_conv2(x))
89 | x._F = x.F.to(torch.float32)
90 | x = MEF.relu(self.res1_conv3(x))
91 |
92 | res = res + x
93 |
94 | x = MEF.relu(self.res2_conv1(res))
95 | x = MEF.relu(self.res2_conv2(x))
96 | x._F = x.F.to(torch.float32)
97 | x = MEF.relu(self.res2_conv3(x))
98 |
99 | x = self.res2_skip(res) + x
100 |
101 | return x
102 |
103 |
104 | def one_hot(x, N):
105 | one_hot = torch.FloatTensor(x.size(0), N, x.size(1), x.size(2)).zero_().to(x.device)
106 | one_hot = one_hot.scatter_(1, x.unsqueeze(1), 1)
107 | return one_hot
108 |
109 |
110 | class CondLayer(nn.Module):
111 | """
112 | pixel-wise feature modulation.
113 | """
114 | def __init__(self, in_channels):
115 | super(CondLayer, self).__init__()
116 | self.bn = nn.BatchNorm1d(in_channels)
117 |
118 | def forward(self, x, gammas, betas):
119 | return F.relu(self.bn((gammas * x) + betas))
120 |
121 |
122 | class Cls_Head(nn.Module):
123 | """
124 | Classification Head
125 | """
126 | def __init__(self, in_channels=512, level_cluster=25):
127 | super(Cls_Head, self).__init__()
128 |
129 | channels_c = [512, 256, level_cluster]
130 |
131 | # level network.
132 | self.conv1_l1 = nn.Linear(in_channels, channels_c[0])
133 | self.norm1_l1 = nn.BatchNorm1d(channels_c[0])
134 | self.conv2_l1 = nn.Linear(channels_c[0], channels_c[1])
135 | self.norm2_l1 = nn.BatchNorm1d(channels_c[1])
136 | self.conv3_l1 = nn.Linear(channels_c[1], channels_c[2])
137 | self.dp1_l1 = nn.Dropout(0.5)
138 | self.dp2_l1 = nn.Dropout(0.5)
139 |
140 |
141 | def forward(self, res):
142 |
143 | x1 = self.dp1_l1(F.relu(self.norm1_l1(self.conv1_l1(res))))
144 | x1 = self.dp2_l1(F.relu(self.norm2_l1(self.conv2_l1(x1))))
145 | # output the classification probability.
146 | out_lbl_1 = self.conv3_l1(x1)
147 |
148 | return out_lbl_1
149 |
150 |
151 | class Reg_Head(nn.Module):
152 | """
153 | nn.Linear version of the regression head (MLP).
154 | """
155 | def __init__(self, num_head_blocks, in_channels=512, mlp_ratio=1.0):
156 | super(Reg_Head, self).__init__()
157 | self.in_channels = in_channels # Number of encoder features.
158 | self.head_channels = in_channels # Hardcoded.
159 |
160 | # We may need a skip layer if the number of features output by the encoder is different.
161 | self.head_skip = nn.Identity() if self.in_channels == self.head_channels \
162 | else nn.Linear(self.in_channels, self.head_channels)
163 |
164 | block_channels = int(self.head_channels * mlp_ratio)
165 | self.res3_conv1 = nn.Linear(self.in_channels, self.head_channels)
166 | self.res3_conv2 = nn.Linear(self.head_channels, block_channels)
167 | self.res3_conv3 = nn.Linear(block_channels, self.head_channels)
168 |
169 | self.res_blocks = []
170 | self.norm_blocks = []
171 |
172 | for block in range(num_head_blocks):
173 | self.res_blocks.append((
174 | nn.Linear(self.head_channels, self.head_channels),
175 | nn.Linear(self.head_channels, block_channels),
176 | nn.Linear(block_channels, self.head_channels),
177 | ))
178 |
179 | super(Reg_Head, self).add_module(str(block) + 'c0', self.res_blocks[block][0])
180 | super(Reg_Head, self).add_module(str(block) + 'c1', self.res_blocks[block][1])
181 | super(Reg_Head, self).add_module(str(block) + 'c2', self.res_blocks[block][2])
182 |
183 | self.fc1 = nn.Linear(self.head_channels, self.head_channels)
184 | self.fc2 = nn.Linear(self.head_channels, block_channels)
185 | self.fc3 = nn.Linear(block_channels, 3)
186 |
187 | def forward(self, res):
188 | x = F.relu(self.res3_conv1(res))
189 | x = F.relu(self.res3_conv2(x))
190 | x = F.relu(self.res3_conv3(x))
191 |
192 | res = self.head_skip(res) + x
193 |
194 | for res_block in self.res_blocks:
195 |
196 | x = F.relu(res_block[0](res))
197 | x = F.relu(res_block[1](x))
198 | x = F.relu(res_block[2](x))
199 | res = res + x
200 |
201 | sc = F.relu(self.fc1(res))
202 | sc = F.relu(self.fc2(sc))
203 | sc = self.fc3(sc)
204 |
205 | return sc
206 |
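# Usage sketch for Reg_Head (illustrative; per the Regressor below, the regression head
# consumes encoder features concatenated with the level_clusters-dim classification output):
#   head = Reg_Head(num_head_blocks=1, in_channels=512 + 25)
#   sc = head(torch.rand(4096, 512 + 25))   # -> [4096, 3] scene coordinates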
207 |
208 | class Regressor(ME.MinkowskiNetwork):
209 | """
210 | FCN architecture for scene coordinate regression.
211 |
212 | The network predicts 3D scene coordinates; the output is subsampled by a factor of 8 compared to the input.
213 | """
214 |
215 | OUTPUT_SUBSAMPLE = 8
216 |
217 | def __init__(self, num_head_blocks, num_encoder_features, level_clusters=25,
218 | mlp_ratio=1.0, sample_cls=False, D=3):
219 | """
220 | Constructor.
221 |
222 | num_head_blocks: How many extra residual blocks to use in the head (one is always used).
223 | num_encoder_features: Number of channels output by the encoder network.
224 | level_clusters: Number of classification categories. mlp_ratio: Channel expansion ratio.
225 | sample_cls: If True, only the classification head is created (classifier training stage).
226 | """
227 | super(Regressor, self).__init__(D)
228 |
229 | self.feature_dim = num_encoder_features
230 |
231 | self.encoder = Encoder(out_channels=self.feature_dim, norm_type='BN')
232 | self.cls_heads = Cls_Head(in_channels=self.feature_dim, level_cluster=level_clusters)
233 | if not sample_cls:
234 | self.reg_heads = Reg_Head(num_head_blocks=num_head_blocks, in_channels=self.feature_dim + level_clusters,
235 | mlp_ratio=mlp_ratio)
236 |
237 | @classmethod
238 | def create_from_encoder(cls, encoder_state_dict, classifier_state_dict=None,
239 | num_head_blocks=None, level_clusters=25, mlp_ratio=1.0, sample_cls=False):
240 | """
241 | Create a regressor using a pretrained encoder, loading encoder-specific parameters from the state dict.
242 |
243 | encoder_state_dict: pretrained encoder state dictionary.
244 | classifier_state_dict: trained classifier state dictionary
245 | num_head_blocks: How many extra residual blocks to use in the head.
246 | level_clusters: How many classification categories.
247 | mlp_ratio: Channel expansion ratio.
248 | sample_cls: If True, only the classification head is created (classifier training stage).
249 | """
250 |
251 | num_encoder_features = encoder_state_dict['res2_conv3.net.0.bias'].shape[1]
252 |
253 | # Create a regressor.
254 | _logger.info(f"Creating Regressor using pretrained encoder with {num_encoder_features} feature size.")
255 | regressor = cls(num_head_blocks, num_encoder_features, level_clusters, mlp_ratio, sample_cls)
256 |
257 | # Load encoder weights.
258 | regressor.encoder.load_state_dict(encoder_state_dict)
259 |
260 | if classifier_state_dict is not None:
261 | regressor.cls_heads.load_state_dict(classifier_state_dict)
262 |
263 | # Done.
264 | return regressor
265 |
266 | @classmethod
267 | def create_from_state_dict(cls, state_dict):
268 | """
269 | Instantiate a regressor from a pretrained state dictionary.
270 |
271 | state_dict: pretrained state dictionary.
272 | """
273 | # Count how many head blocks are in the dictionary.
274 | pattern = re.compile(r"^reg_heads\.\d+c0\.weight$")
275 | num_head_blocks = sum(1 for k in state_dict.keys() if pattern.match(k))
276 |
277 | # Number of output channels of the last encoder layer.
278 | num_encoder_features = state_dict['encoder.res2_conv3.net.0.bias'].shape[1]
279 | num_decoder_features = state_dict['cls_heads.conv1_l1.weight'].shape[1]
280 | head_channels = state_dict['cls_heads.conv1_l1.weight'].shape[0]
281 | level_clusters = state_dict['cls_heads.conv3_l1.weight'].shape[0]
282 | reg = any(key.startswith("reg_heads") for key in state_dict)
283 | if reg:
284 | mlp_ratio = state_dict['reg_heads.res3_conv2.weight'].shape[0] / \
285 | state_dict['reg_heads.res3_conv2.weight'].shape[1]
286 | else:
287 | mlp_ratio = 1
288 |
289 | # Create a regressor.
290 | _logger.info(f"Creating regressor from pretrained state_dict:"
291 | f"\n\tNum head blocks: {num_head_blocks}"
292 | f"\n\tEncoder feature size: {num_encoder_features}"
293 | f"\n\tDecoder feature size: {num_decoder_features}"
294 | f"\n\tHead channels: {head_channels}"
295 | f"\n\tMLP ratio: {mlp_ratio}")
296 | regressor = cls(num_head_blocks, num_encoder_features, mlp_ratio=mlp_ratio, level_clusters=level_clusters)
297 |
298 | # Load all weights.
299 | regressor.load_state_dict(state_dict)
300 |
301 | # Done.
302 | return regressor
303 |
304 | @classmethod
305 | def create_from_split_state_dict(cls, encoder_state_dict, cls_head_state_dict, reg_head_state_dict=None):
306 | """
307 | Instantiate a regressor from a pretrained encoder (scene-agnostic) and a scene-specific head.
308 |
309 | encoder_state_dict: encoder state dictionary
310 | cls_head_state_dict / reg_head_state_dict: scene-specific head state dictionaries
311 | """
312 | # We simply merge the dictionaries and call the other constructor.
313 | merged_state_dict = {}
314 |
315 | for k, v in encoder_state_dict.items():
316 | merged_state_dict[f"encoder.{k}"] = v
317 |
318 | for k, v in cls_head_state_dict.items():
319 | merged_state_dict[f"cls_heads.{k}"] = v.squeeze(-1).squeeze(-1)
320 |
321 | if reg_head_state_dict is not None:
322 | for k, v in reg_head_state_dict.items():
323 | merged_state_dict[f"reg_heads.{k}"] = v.squeeze(-1).squeeze(-1)
324 |
325 | return cls.create_from_state_dict(merged_state_dict)
326 |
327 |
328 | def load_encoder(self, encoder_dict_file):
329 | """
330 | Load weights into the encoder network.
331 | """
332 | self.encoder.load_state_dict(torch.load(encoder_dict_file))
333 |
334 | def get_features(self, inputs):
335 | return self.encoder(inputs)
336 |
337 | def get_scene_coordinates(self, features):
338 | out = self.reg_heads(features)
339 | return out
340 |
341 | def get_scene_classification(self, features):
342 | out = self.cls_heads(features)
343 |
344 | return out
345 |
346 | def forward(self, inputs):
347 | """
348 | Forward pass.
349 | """
350 | features = self.encoder(inputs)
351 | out = self.get_scene_coordinates(features.F)
352 | out = ME.SparseTensor(
353 | features=out,
354 | coordinates=features.C,
355 | )
356 |
357 | return {'pred': out}
--------------------------------------------------------------------------------
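A minimal usage sketch for the regressor defined above (illustrative only: the module path follows the repository tree, and the checkpoint file names and hyper-parameter values are placeholders, not the repository's defaults):

```
import torch
from models.lightloc import Regressor

# hypothetical checkpoint paths
encoder_sd = torch.load("encoder_pretrained.pth", map_location="cpu")
classifier_sd = torch.load("classifier.pth", map_location="cpu")

# build the full model (classification + regression heads) from a pretrained encoder
model = Regressor.create_from_encoder(
    encoder_sd,
    classifier_state_dict=classifier_sd,
    num_head_blocks=1,   # assumed value
    level_clusters=25,
    mlp_ratio=1.0,
)
model.eval()
```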
/models/loss.py:
--------------------------------------------------------------------------------
1 | import torch
2 | from torch import nn
3 |
4 |
5 | class RSD_Criterion(nn.Module):
6 | def __init__(self, first_prune_epoch, second_prune_epoch, windows):
7 | super(RSD_Criterion, self).__init__()
8 | self.first_prune_epoch = first_prune_epoch
9 | self.second_prune_epoch = second_prune_epoch
10 | self.windows = windows
11 |
12 | def forward(self, pred_point, gt_point, batch_size, epoch_nums, idx, values):
13 | loss_map = torch.sum(torch.abs(pred_point - gt_point), axis=-1) # [B*N]
14 | loss = loss_map.detach().clone().view(batch_size, -1) # [B, N]
15 | if self.first_prune_epoch <= epoch_nums < (self.first_prune_epoch + self.windows):
16 | current_epoch = epoch_nums - self.first_prune_epoch
17 | values[idx, current_epoch] = torch.median(loss, dim=1).values
18 | elif self.second_prune_epoch <= epoch_nums < (self.second_prune_epoch + self.windows):
19 | current_epoch = epoch_nums - self.second_prune_epoch
20 | values[idx, current_epoch] = torch.median(loss, dim=1).values
21 | loss_map = torch.mean(loss_map)
22 |
23 | # loss_map = torch.sum(torch.abs(pred_point - gt_point), axis=-1, keepdims=True)
24 | # if self.first_prune_epoch <= epoch_nums < (self.first_prune_epoch + self.windows):
25 | # loss_lw = loss_map.detach().clone()
26 | # current_epoch = epoch_nums - self.first_prune_epoch
27 | # # print(batch_nums.shape)
28 | # # print(loss_map.shape)
29 | # for i in range(batch_size):
30 | # # select the points belonging to the i-th sample of the current batch
31 | # mask = batch_nums == i
32 | # values[idx[i], current_epoch] = torch.median(loss_lw[mask]) # take the median to suppress the influence of noisy points
33 | # elif self.second_prune_epoch <= epoch_nums < (self.second_prune_epoch + self.windows):
34 | # loss_lw = loss_map.detach().clone()
35 | # current_epoch = epoch_nums - self.second_prune_epoch
36 | # for i in range(batch_size):
37 | # # select the points belonging to the i-th sample of the current batch
38 | # mask = batch_nums == i
39 | # values[idx[i], current_epoch] = torch.median(loss_lw[mask]) # take the median to suppress the influence of noisy points
40 |
41 | # loss_map = torch.mean(loss_map)
42 |
43 | return loss_map, values
44 |
45 |
46 | class REG_Criterion(nn.Module):
47 | def __init__(self):
48 | super(REG_Criterion, self).__init__()
49 |
50 | def forward(self, pred_point, gt_point):
51 | loss_map = torch.sum(torch.abs(pred_point - gt_point), axis=-1, keepdims=True)
52 | loss_map = torch.mean(loss_map)
53 |
54 | return loss_map
55 |
56 |
57 | class CLS_Criterion(nn.Module):
58 | def __init__(self):
59 | super(CLS_Criterion, self).__init__()
60 | self.loc_loss = nn.CrossEntropyLoss(label_smoothing=0.1)
61 |
62 | def forward(self, pred_loc, gt_loc):
63 | loss = self.loc_loss(pred_loc, gt_loc.long())
64 |
65 | return loss
--------------------------------------------------------------------------------
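A small sketch of how the criteria above consume tensors (shapes are inferred from the indexing in RSD_Criterion; the per-sample score buffer `values` and the index tensor `idx` are assumptions about how the training loop tracks sample losses):

```
import torch
from models.loss import RSD_Criterion, CLS_Criterion

B, N, num_train_samples, windows = 2, 50, 1000, 3
criterion = RSD_Criterion(first_prune_epoch=10, second_prune_epoch=30, windows=windows)

pred = torch.rand(B * N, 3)                        # predicted scene coordinates
gt = torch.rand(B * N, 3)                          # ground-truth scene coordinates
idx = torch.tensor([5, 7])                         # dataset indices of the B samples in this batch
values = torch.zeros(num_train_samples, windows)   # per-sample median-loss buffer

loss, values = criterion(pred, gt, batch_size=B, epoch_nums=11, idx=idx, values=values)

cls_criterion = CLS_Criterion()
cls_loss = cls_criterion(torch.randn(B, 25), torch.randint(0, 25, (B,)))
```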
/models/sc2pcr.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import transforms3d.quaternions as txq
3 | import numpy as np
4 | from utils.pose_util import rigid_transform_3d, transform
5 |
6 |
7 | class Matcher():
8 | def __init__(self,
9 | inlier_threshold=1.4,
10 | num_node='all',
11 | use_mutual=True,
12 | d_thre=2,
13 | num_iterations=10,
14 | ratio=0.1, # 0.2
15 | nms_radius=0.1,
16 | max_points=2000,
17 | k1=40,
18 | k2=20,
19 | select_scene=None,
20 | ):
21 | self.inlier_threshold = inlier_threshold
22 | self.num_node = num_node
23 | self.use_mutual = use_mutual
24 | self.d_thre = d_thre
25 | self.num_iterations = num_iterations # maximum iteration of power iteration algorithm
26 | self.ratio = ratio # the maximum ratio of seeds.
27 | self.max_points = max_points
28 | self.nms_radius = nms_radius
29 | self.k1 = k1
30 | self.k2 = k2
31 |
32 | def pick_seeds(self, dists, scores, R, max_num):
33 | """
34 | Select seeding points using Non Maximum Suppression. (here we only support bs=1)
35 | Input:
36 | - dists: [bs, num_corr, num_corr] src keypoints distance matrix
37 | - scores: [bs, num_corr] initial confidence of each correspondence
38 | - R: float radius of nms
39 | - max_num: int maximum number of returned seeds
40 | Output:
41 | - picked_seeds: [bs, num_seeds] the index to the seeding correspondences
42 | """
43 | assert scores.shape[0] == 1
44 |
45 | # parallel Non Maximum Suppression (more efficient)
46 | score_relation = scores.T >= scores # [num_corr, num_corr], save the relation of leading_eig
47 | # score_relation[dists[0] >= R] = 1 # mask out the non-neighborhood node
48 | score_relation = score_relation.bool() | (dists[0] >= R).bool()
49 | is_local_max = score_relation.min(-1)[0].float()
50 |
51 | score_local_max = scores * is_local_max
52 | sorted_score = torch.argsort(score_local_max, dim=1, descending=True)
53 |
54 | # max_num = scores.shape[1]
55 |
56 | return_idx = sorted_score[:, 0: max_num].detach()
57 |
58 | return return_idx
59 |
60 | def cal_seed_trans(self, seeds, SC2_measure, src_keypts, tgt_keypts):
61 | """
62 | Calculate the transformation for each seeding correspondences.
63 | Input:
64 | - seeds: [bs, num_seeds] the index to the seeding correspondence
65 | - SC2_measure: [bs, num_corr, num_channels]
66 | - src_keypts: [bs, num_corr, 3]
67 | - tgt_keypts: [bs, num_corr, 3]
68 | Output: leading eigenvector
69 | - final_trans: [bs, 4, 4] best transformation matrix (after post refinement) for each batch.
70 | """
71 | bs, num_corr, num_channels = SC2_measure.shape[0], SC2_measure.shape[1], SC2_measure.shape[2]
72 | k1 = self.k1
73 | k2 = self.k2
74 |
75 | if k1 > num_channels:
76 | k1 = 4
77 | k2 = 4
78 |
79 | #################################
80 | # The first stage consensus set sampling
81 | # Finding the k1 nearest neighbors around each seed
82 | #################################
83 | sorted_score = torch.argsort(SC2_measure, dim=2, descending=True)
84 | knn_idx = sorted_score[:, :, 0: k1]
85 | sorted_value, _ = torch.sort(SC2_measure, dim=2, descending=True)
86 | idx_tmp = knn_idx.contiguous().view([bs, -1])
87 | idx_tmp = idx_tmp[:, :, None]
88 | idx_tmp = idx_tmp.expand(-1, -1, 3)
89 |
90 | #################################
91 | # construct the local SC2 measure of each consensus subset obtained in the first stage.
92 | #################################
93 | src_knn = src_keypts.gather(dim=1, index=idx_tmp).view([bs, -1, k1, 3]) # [bs, num_seeds, k, 3]
94 | tgt_knn = tgt_keypts.gather(dim=1, index=idx_tmp).view([bs, -1, k1, 3])
95 | src_dist = ((src_knn[:, :, :, None, :] - src_knn[:, :, None, :, :]) ** 2).sum(-1) ** 0.5
96 | tgt_dist = ((tgt_knn[:, :, :, None, :] - tgt_knn[:, :, None, :, :]) ** 2).sum(-1) ** 0.5
97 | cross_dist = torch.abs(src_dist - tgt_dist)
98 | local_hard_SC_measure = (cross_dist < self.d_thre).float()
99 | local_SC2_measure = torch.matmul(local_hard_SC_measure[:, :, :1, :], local_hard_SC_measure)
100 |
101 | #################################
102 | # perform second stage consensus set sampling
103 | #################################
104 | sorted_score = torch.argsort(local_SC2_measure, dim=3, descending=True)
105 | knn_idx_fine = sorted_score[:, :, :, 0: k2]
106 |
107 | #################################
108 | # construct the soft SC2 matrix of the consensus set
109 | #################################
110 | num = knn_idx_fine.shape[1]
111 | knn_idx_fine = knn_idx_fine.contiguous().view([bs, num, -1])[:, :, :, None]
112 | knn_idx_fine = knn_idx_fine.expand(-1, -1, -1, 3)
113 | src_knn_fine = src_knn.gather(dim=2, index=knn_idx_fine).view([bs, -1, k2, 3]) # [bs, num_seeds, k, 3]
114 | tgt_knn_fine = tgt_knn.gather(dim=2, index=knn_idx_fine).view([bs, -1, k2, 3])
115 |
116 | src_dist = ((src_knn_fine[:, :, :, None, :] - src_knn_fine[:, :, None, :, :]) ** 2).sum(-1) ** 0.5
117 | tgt_dist = ((tgt_knn_fine[:, :, :, None, :] - tgt_knn_fine[:, :, None, :, :]) ** 2).sum(-1) ** 0.5
118 | cross_dist = torch.abs(src_dist - tgt_dist)
119 | local_hard_measure = (cross_dist < self.d_thre * 2).float()
120 | local_SC2_measure = torch.matmul(local_hard_measure, local_hard_measure) / k2
121 | local_SC_measure = torch.clamp(1 - cross_dist ** 2 / self.d_thre ** 2, min=0)
122 | # local_SC2_measure = local_SC_measure * local_SC2_measure
123 | local_SC2_measure = local_SC_measure
124 | local_SC2_measure = local_SC2_measure.view([-1, k2, k2])
125 |
126 |
127 | #################################
128 | # Power iteration to get the inlier probability
129 | #################################
130 | local_SC2_measure[:, torch.arange(local_SC2_measure.shape[1]), torch.arange(local_SC2_measure.shape[1])] = 0
131 | total_weight = self.cal_leading_eigenvector(local_SC2_measure, method='power')
132 | total_weight = total_weight.view([bs, -1, k2])
133 | total_weight = total_weight / (torch.sum(total_weight, dim=-1, keepdim=True) + 1e-6)
134 |
135 | #################################
136 | # calculate the transformation by weighted least-squares for each subsets in parallel
137 | #################################
138 | total_weight = total_weight.view([-1, k2])
139 | src_knn = src_knn_fine
140 | tgt_knn = tgt_knn_fine
141 | src_knn, tgt_knn = src_knn.view([-1, k2, 3]), tgt_knn.view([-1, k2, 3])
142 |
143 | #################################
144 | # compute the rigid transformation for each seed by the weighted SVD
145 | #################################
146 | seedwise_trans = rigid_transform_3d(src_knn, tgt_knn, total_weight)
147 | seedwise_trans = seedwise_trans.view([bs, -1, 4, 4])
148 |
149 | #################################
150 | # calculate the inlier number for each hypothesis, and find the best transformation for each point cloud pair
151 | #################################
152 | pred_position = torch.einsum('bsnm,bmk->bsnk', seedwise_trans[:, :, :3, :3],
153 | src_keypts.permute(0, 2, 1)) + seedwise_trans[:, :, :3,
154 | 3:4] # [bs, num_seeds, num_corr, 3]
155 | #################################
156 | # calculate the inlier number for each hypothesis, and find the best transformation for each point cloud pair
157 | #################################
158 | pred_position = pred_position.permute(0, 1, 3, 2)
159 | L2_dis = torch.norm(pred_position - tgt_keypts[:, None, :, :], dim=-1) # [bs, num_seeds, num_corr]
160 | seedwise_fitness = torch.sum((L2_dis < self.inlier_threshold).float(), dim=-1) # [bs, num_seeds]
161 | batch_best_guess = seedwise_fitness.argmax(dim=1)
162 | best_guess_ratio = seedwise_fitness[0, batch_best_guess]
163 | final_trans = seedwise_trans.gather(dim=1,index=batch_best_guess[:, None, None, None].expand(-1, -1, 4, 4)).squeeze(1)
164 |
165 | return final_trans
166 |
167 | def cal_leading_eigenvector(self, M, method='power'):
168 | """
169 | Calculate the leading eigenvector using power iteration algorithm or torch.symeig
170 | Input:
171 | - M: [bs, num_corr, num_corr] the compatibility matrix
172 | - method: select the method for calculating the leading eigenvector.
173 | Output:
174 | - solution: [bs, num_corr] leading eigenvector
175 | """
176 | if method == 'power':
177 | # power iteration algorithm
178 | leading_eig = torch.ones_like(M[:, :, 0:1])
179 | leading_eig_last = leading_eig
180 | for i in range(self.num_iterations):
181 | leading_eig = torch.bmm(M, leading_eig)
182 | leading_eig = leading_eig / (torch.norm(leading_eig, dim=1, keepdim=True) + 1e-6)
183 | if torch.allclose(leading_eig, leading_eig_last):
184 | break
185 | leading_eig_last = leading_eig
186 | leading_eig = leading_eig.squeeze(-1)
187 | return leading_eig
188 | elif method == 'eig': # cause NaN during back-prop
189 | e, v = torch.symeig(M, eigenvectors=True)
190 | leading_eig = v[:, :, -1]
191 | return leading_eig
192 | else:
193 | exit(-1)
194 |
195 | def cal_confidence(self, M, leading_eig, method='eig_value'):
196 | """
197 | Calculate the confidence of the spectral matching solution based on spectral analysis.
198 | Input:
199 | - M: [bs, num_corr, num_corr] the compatibility matrix
200 | - leading_eig [bs, num_corr] the leading eigenvector of matrix M
201 | Output:
202 | - confidence
203 | """
204 | if method == 'eig_value':
205 | # max eigenvalue as the confidence (Rayleigh quotient)
206 | max_eig_value = (leading_eig[:, None, :] @ M @ leading_eig[:, :, None]) / (
207 | leading_eig[:, None, :] @ leading_eig[:, :, None])
208 | confidence = max_eig_value.squeeze(-1)
209 | return confidence
210 | elif method == 'eig_value_ratio':
211 | # max eigenvalue / second max eigenvalue as the confidence
212 | max_eig_value = (leading_eig[:, None, :] @ M @ leading_eig[:, :, None]) / (
213 | leading_eig[:, None, :] @ leading_eig[:, :, None])
214 | # compute the second largest eigen-value
215 | B = M - max_eig_value * leading_eig[:, :, None] @ leading_eig[:, None, :]
216 | solution = torch.ones_like(B[:, :, 0:1])
217 | for i in range(self.num_iterations):
218 | solution = torch.bmm(B, solution)
219 | solution = solution / (torch.norm(solution, dim=1, keepdim=True) + 1e-6)
220 | solution = solution.squeeze(-1)
221 | second_eig = solution
222 | second_eig_value = (second_eig[:, None, :] @ B @ second_eig[:, :, None]) / (
223 | second_eig[:, None, :] @ second_eig[:, :, None])
224 | confidence = max_eig_value / second_eig_value
225 | return confidence
226 | elif method == 'xMx':
227 | # max xMx as the confidence (x is the binary solution)
228 | # rank = torch.argsort(leading_eig, dim=1, descending=True)[:, 0:int(M.shape[1]*self.ratio)]
229 | # binary_sol = torch.zeros_like(leading_eig)
230 | # binary_sol[0, rank[0]] = 1
231 | confidence = leading_eig[:, None, :] @ M @ leading_eig[:, :, None]
232 | confidence = confidence.squeeze(-1) / M.shape[1]
233 | return confidence
234 |
235 | def post_refinement(self, initial_trans, src_keypts, tgt_keypts, it_num, weights=None):
236 | """
237 | Perform post refinement using the initial transformation matrix, only adopted during testing.
238 | Input
239 | - initial_trans: [bs, 4, 4]
240 | - src_keypts: [bs, num_corr, 3]
241 | - tgt_keypts: [bs, num_corr, 3]
242 | - weights: [bs, num_corr]
243 | Output:
244 | - final_trans: [bs, 4, 4]
245 | """
246 | assert initial_trans.shape[0] == 1
247 |
248 | if self.inlier_threshold == 0.10: # for 3DMatch
249 | inlier_threshold_list = [0.10] * it_num
250 | else: # for KITTI
251 | inlier_threshold_list = [0.6] * it_num
252 |
253 | previous_inlier_num = 0
254 | for inlier_threshold in inlier_threshold_list:
255 | warped_src_keypts = transform(src_keypts, initial_trans)
256 |
257 | L2_dis = torch.norm(warped_src_keypts - tgt_keypts, dim=-1)
258 | pred_inlier = (L2_dis < inlier_threshold)[0] # assume bs = 1
259 | inlier_num = torch.sum(pred_inlier)
260 | if abs(int(inlier_num - previous_inlier_num)) < 1:
261 | break
262 | else:
263 | previous_inlier_num = inlier_num
264 | initial_trans = rigid_transform_3d(
265 | A=src_keypts[:, pred_inlier, :],
266 | B=tgt_keypts[:, pred_inlier, :],
267 | ## https://link.springer.com/article/10.1007/s10589-014-9643-2
268 | # weights=None,
269 | weights=1 / (1 + (L2_dis / inlier_threshold) ** 2)[:, pred_inlier],
270 | # weights=((1-L2_dis/inlier_threshold)**2)[:, pred_inlier]
271 | )
272 | return initial_trans
273 |
274 | def match_pair(self, src_keypts, tgt_keypts, src_features, tgt_features):
275 | N_src = src_features.shape[1]
276 | N_tgt = tgt_features.shape[1]
277 | # use all point or sample points.
278 | if self.num_node == 'all':
279 | src_sel_ind = np.arange(N_src)
280 | tgt_sel_ind = np.arange(N_tgt)
281 | else:
282 | src_sel_ind = np.random.choice(N_src, self.num_node)
283 | tgt_sel_ind = np.random.choice(N_tgt, self.num_node)
284 | src_desc = src_features[:, src_sel_ind, :]
285 | tgt_desc = tgt_features[:, tgt_sel_ind, :]
286 | src_keypts = src_keypts[:, src_sel_ind, :]
287 | tgt_keypts = tgt_keypts[:, tgt_sel_ind, :]
288 |
289 | # match points in feature space.
290 | distance = torch.sqrt(2 - 2 * (src_desc[0] @ tgt_desc[0].T) + 1e-6)
291 | distance = distance.unsqueeze(0)
292 | source_idx = torch.argmin(distance[0], dim=1)
293 | corr = torch.cat([torch.arange(source_idx.shape[0])[:, None].cuda(), source_idx[:, None]], dim=-1)
294 |
295 | # generate correspondences
296 | src_keypts_corr = src_keypts[:, corr[:, 0]]
297 | tgt_keypts_corr = tgt_keypts[:, corr[:, 1]]
298 |
299 | return src_keypts_corr, tgt_keypts_corr
300 |
301 | def SC2_PCR(self, src_keypts, tgt_keypts):
302 | # print(src_keypts.shape)
303 | # print(tgt_keypts.shape)
304 | """
305 | Input:
306 | - src_keypts: [bs, num_corr, 3]
307 | - tgt_keypts: [bs, num_corr, 3]
308 | Output:
309 | - final_trans: [bs, 4, 4], the predicted transformation matrix.
310 | (inlier/outlier labels are not computed by this function.)
311 | """
312 | bs, num_corr = src_keypts.shape[0], tgt_keypts.shape[1]
313 |
314 | #################################
315 | # downsample points
316 | #################################
317 | if num_corr > self.max_points:
318 | src_keypts = src_keypts[:, :self.max_points, :]
319 | tgt_keypts = tgt_keypts[:, :self.max_points, :]
320 | num_corr = self.max_points
321 |
322 | #################################
323 | # compute cross dist
324 | #################################
325 | src_dist = torch.norm((src_keypts[:, :, None, :] - src_keypts[:, None, :, :]), dim=-1)
326 | target_dist = torch.norm((tgt_keypts[:, :, None, :] - tgt_keypts[:, None, :, :]), dim=-1)
327 | cross_dist = torch.abs(src_dist - target_dist)
328 |
329 | #################################
330 | # compute first order measure
331 | #################################
332 | SC_dist_thre = self.d_thre
333 | SC_measure = torch.clamp(1.0 - cross_dist ** 2 / SC_dist_thre ** 2, min=0)
334 | hard_SC_measure = (cross_dist < SC_dist_thre).float()
335 |
336 | #################################
337 | # select reliable seed correspondences
338 | #################################
339 | confidence = self.cal_leading_eigenvector(SC_measure, method='power')
340 | seeds = self.pick_seeds(src_dist, confidence, R=self.nms_radius, max_num=int(num_corr * self.ratio))
341 |
342 | #################################
343 | # compute second order measure
344 | #################################
345 | SC2_dist_thre = self.d_thre / 2
346 | hard_SC_measure_tight = (cross_dist < SC2_dist_thre).float()
347 | seed_hard_SC_measure = hard_SC_measure.gather(dim=1,
348 | index=seeds[:, :, None].expand(-1, -1, num_corr))
349 | seed_hard_SC_measure_tight = hard_SC_measure_tight.gather(dim=1,
350 | index=seeds[:, :, None].expand(-1, -1, num_corr))
351 | SC2_measure = torch.matmul(seed_hard_SC_measure_tight, hard_SC_measure_tight) * seed_hard_SC_measure
352 |
353 | #################################
354 | # compute the seed-wise transformations and select the best one
355 | #################################
356 | final_trans = self.cal_seed_trans(seeds, SC2_measure, src_keypts, tgt_keypts)
357 |
358 | #################################
359 | # refine the result by recomputing the transformation over the whole set
360 | #################################
361 | # post refinement is disabled here
362 | # final_trans = self.post_refinement(final_trans, src_keypts, tgt_keypts, 500)
363 |
364 | return final_trans
365 |
366 | def estimator(self, src_keypts_corr, tgt_keypts_corr):
367 | """
368 | Input:
369 | - src_keypts_corr: [bs, num_corr, 3]
370 | - tgt_keypts_corr: [bs, num_corr, 3]
371 | Output:
372 | - pred_t: [1, 3], the predicted translation
373 | - pred_q: [1, 4], the predicted rotation as a quaternion
374 | - pred_T: [4, 4], the predicted transformation matrix
375 |
376 | """
377 | #################################
378 | # generate coarse correspondences
379 | #################################
380 | # src_keypts_corr, tgt_keypts_corr = self.match_pair(src_keypts, tgt_keypts, src_features, tgt_features)
381 |
382 | #################################
383 | # use the proposed SC2-PCR to estimate the rigid transformation
384 | #################################
385 | pred_t = np.zeros((1, 3))
386 | pred_q = np.zeros((1, 4))
387 |
388 | pred_trans = self.SC2_PCR(src_keypts_corr, tgt_keypts_corr)
389 | # print(pred_trans)
390 | pred_t[0, :] = pred_trans[0, :3, 3:].detach().squeeze().cpu().numpy()
391 | pred_q[0, :] = txq.mat2quat(pred_trans[0, :3, :3].detach().cpu().numpy())
392 | pred_T = pred_trans.detach().squeeze().cpu().numpy()
393 |
394 | return pred_t, pred_q, pred_T
--------------------------------------------------------------------------------
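A self-contained sanity check for the SC2-PCR estimator above (synthetic correspondences under a known rigid transform; the point count and transform are arbitrary, and this is not the repository's evaluation pipeline):

```
import torch
from models.sc2pcr import Matcher

matcher = Matcher()

src = torch.rand(1, 500, 3) * 50.0                    # e.g. predicted scene coordinates
T = torch.eye(4).unsqueeze(0)
T[:, :3, 3] = torch.tensor([2.0, -1.0, 0.5])          # known translation, identity rotation
tgt = (T[:, :3, :3] @ src.permute(0, 2, 1) + T[:, :3, 3:4]).permute(0, 2, 1)

pred_t, pred_q, pred_T = matcher.estimator(src, tgt)  # translation, quaternion, 4x4 matrix
# pred_T should match T up to numerical error, since every correspondence is an inlier
```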
/nclt_process.py:
--------------------------------------------------------------------------------
1 | import os
2 | import glob
3 | import struct
4 | import numpy as np
5 |
6 |
7 | def convert_nclt(x_s, y_s, z_s):
8 | scaling = 0.005 # 5 mm
9 | offset = -100.0
10 |
11 | x = x_s * scaling + offset
12 | y = y_s * scaling + offset
13 | z = z_s * scaling + offset
14 |
15 | return x, y, z
16 |
17 |
18 | def load_velodyne_binary_nclt(filename):
19 | f_bin = open(filename, "rb")
20 | hits = []
21 | while True:
22 | x_str = f_bin.read(2)
23 | if x_str == b'': #eof
24 | break
25 | x = struct.unpack('<H', x_str)[0]
--------------------------------------------------------------------------------
/utils/pose_util.py:
--------------------------------------------------------------------------------
367 | # if d > 1:
368 | # d = 1
369 | # error = 2 * np.arccos(d) * 180 / np.pi0
370 | # d = abs(np.dot(groundtruth, predicted))
371 | # d = min(1.0, max(-1.0, d))
372 |
373 | d = np.abs(np.dot(groundtruth, predicted))
374 | d = np.minimum(1.0, np.maximum(-1.0, d))
375 | error = 2 * np.arccos(d) * 180 / np.pi
376 |
377 | return error
378 |
379 |
380 | def poses2mats(poses_in):
381 | poses_out = np.zeros((len(poses_in), 3, 3)) # (B, 3, 3)
382 | poses_qua = np.asarray([qexp(q) for q in poses_in.cpu().detach().numpy()])
383 |
384 | # align
385 | for i in range(len(poses_out)):
386 | R = txq.quat2mat(poses_qua[i])
387 | poses_out[i, ...] = R
388 |
389 | return poses_out
390 |
391 |
392 | def transform(pts, trans):
393 | """
394 | Applies an SE3 transformation; supports torch.Tensor and np.ndarray. Equation: trans_pts = R @ pts + t
395 | Input
396 | - pts: [num_pts, 3] or [bs, num_pts, 3], pts to be transformed
397 | - trans: [4, 4] or [bs, 4, 4], SE3 transformation matrix
398 | Output
399 | - pts: [num_pts, 3] or [bs, num_pts, 3] transformed pts
400 | """
401 | if len(pts.shape) == 3:
402 | trans_pts = trans[:, :3, :3] @ pts.permute(0, 2, 1) + trans[:, :3, 3:4]
403 | return trans_pts.permute(0, 2, 1)
404 | else:
405 | trans_pts = trans[:3, :3] @ pts.T + trans[:3, 3:4]
406 | return trans_pts.T
407 |
408 |
409 | def rigid_transform_3d(A, B, weights=None, weight_threshold=0):
410 | """
411 | Input:
412 | - A: [bs, num_corr, 3], source point cloud
413 | - B: [bs, num_corr, 3], target point cloud
414 | - weights: [bs, num_corr] weight for each correspondence
415 | - weight_threshold: float, clips points with weight below threshold
416 | Output:
417 | - trans: [bs, 4, 4], SE3 transformation built from the estimated R and t
418 | """
419 | bs = A.shape[0]
420 | if weights is None:
421 | weights = torch.ones_like(A[:, :, 0])
422 | weights[weights < weight_threshold] = 0
423 | # weights = weights / (torch.sum(weights, dim=-1, keepdim=True) + 1e-6)
424 |
425 | # find mean of point cloud
426 | centroid_A = torch.sum(A * weights[:, :, None], dim=1, keepdim=True) / (
427 | torch.sum(weights, dim=1, keepdim=True)[:, :, None] + 1e-6)
428 | centroid_B = torch.sum(B * weights[:, :, None], dim=1, keepdim=True) / (
429 | torch.sum(weights, dim=1, keepdim=True)[:, :, None] + 1e-6)
430 |
431 | # subtract mean
432 | Am = A - centroid_A
433 | Bm = B - centroid_B
434 |
435 | # construct weight covariance matrix
436 | Weight = torch.diag_embed(weights)
437 | H = Am.permute(0, 2, 1) @ Weight @ Bm
438 |
439 | # find rotation
440 | U, S, Vt = torch.svd(H.cpu())
441 | U, S, Vt = U.to(weights.device), S.to(weights.device), Vt.to(weights.device)
442 | delta_UV = torch.det(Vt @ U.permute(0, 2, 1))
443 | eye = torch.eye(3)[None, :, :].repeat(bs, 1, 1).to(A.device)
444 | eye[:, -1, -1] = delta_UV
445 | R = Vt @ eye @ U.permute(0, 2, 1)
446 | t = centroid_B.permute(0, 2, 1) - R @ centroid_A.permute(0, 2, 1)
447 | # warp_A = transform(A, integrate_trans(R,t))
448 | # RMSE = torch.sum( (warp_A - B) ** 2, dim=-1).mean()
449 | return integrate_trans(R, t)
450 |
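# Sanity-check sketch for rigid_transform_3d (illustrative, not part of the pipeline;
# it only uses functions defined in this file):
#   A = torch.rand(1, 100, 3)
#   T = torch.eye(4).unsqueeze(0); T[:, 0, 3] = 2.0   # known transform: +2 along x
#   B = transform(A, T)
#   T_est = rigid_transform_3d(A, B)                  # should recover T up to numerical error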
451 |
452 | def estimate_pose(gt_pc, pred_pc, threshold=1.4, device='cuda'):
453 | # print(source_pc.shape)
454 | source_pc = gt_pc.cpu().numpy().reshape(-1, 3)
455 | target_pc = pred_pc[:, :3].cpu().numpy().reshape(-1, 3)
456 | num_points = source_pc.shape[0]
457 | pred_t = np.zeros((1, 3))
458 | pred_q = np.zeros((1, 4))
459 | index1 = np.arange(0, num_points)
460 | index2 = np.arange(0, num_points)
461 | # np.random.shuffle(index1)
462 | index1 = np.expand_dims(index1, axis=1)
463 | index2 = np.expand_dims(index2, axis=1)
464 | corr = np.concatenate((index1, index2), axis=1)
465 |
466 | source_xyz = source_pc
467 | target_xyz = target_pc
468 | source = open3d.geometry.PointCloud()
469 | target = open3d.geometry.PointCloud()
470 | source.points = open3d.utility.Vector3dVector(source_xyz)
471 | target.points = open3d.utility.Vector3dVector(target_xyz)
472 | corres = open3d.utility.Vector2iVector(corr)
473 |
474 | M = open3d.pipelines.registration.registration_ransac_based_on_correspondence(
475 | source,
476 | target,
477 | corres,
478 | threshold,
479 | open3d.pipelines.registration.TransformationEstimationPointToPoint(False),
480 | 3,
481 | [
482 | open3d.pipelines.registration.
483 | CorrespondenceCheckerBasedOnEdgeLength(0.9),
484 | open3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(threshold)
485 | ],
486 | open3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))
487 |
488 | pred_t[0, :] = M.transformation[:3, 3:].squeeze()
489 | pred_q[0, :] = txq.mat2quat(M.transformation[:3, :3])
490 |
491 | return pred_t, pred_q
492 |
493 |
494 | def integrate_trans(R, t):
495 | """
496 | Integrate an SE3 transformation from R and t; supports torch.Tensor and np.ndarray.
497 | Input
498 | - R: [3, 3] or [bs, 3, 3], rotation matrix
499 | - t: [3, 1] or [bs, 3, 1], translation vector
500 | Output
501 | - trans: [4, 4] or [bs, 4, 4], SE3 transformation matrix
502 | """
503 | if len(R.shape) == 3:
504 | if isinstance(R, torch.Tensor):
505 | trans = torch.eye(4)[None].repeat(R.shape[0], 1, 1).to(R.device)
506 | else:
507 | trans = np.eye(4)[None]
508 | trans[:, :3, :3] = R
509 | trans[:, :3, 3:4] = t.view([-1, 3, 1])
510 | else:
511 | if isinstance(R, torch.Tensor):
512 | trans = torch.eye(4).to(R.device)
513 | else:
514 | trans = np.eye(4)
515 | trans[:3, :3] = R
516 | trans[:3, 3:4] = t
517 | return trans
518 |
519 |
520 | def calc_vos_simple(poses):
521 | """
522 | calculate the VOs, from a list of consecutive poses
523 | :param poses: N x T x 7
524 | :return: N x (T-1) x 7
525 | """
526 | vos = []
527 | for p in poses:
528 | pvos = [p[i + 1].unsqueeze(0) - p[i].unsqueeze(0) for i in range(len(p) - 1)]
529 | vos.append(torch.cat(pvos, dim=0))
530 | vos = torch.stack(vos, dim=0)
531 |
532 | return vos
533 |
534 |
535 | def filter_overflow_ts(filename, ts_raw):
536 | #
537 | file_data = pd.read_csv(filename)
538 | base_name = osp.basename(filename)
539 |
540 | if base_name.find('vo') > -1:
541 | ts_key = 'source_timestamp'
542 | else:
543 | ts_key = 'timestamp'
544 |
545 | pose_timestamps = file_data[ts_key].values
546 | min_pose_timestamps = min(pose_timestamps)
547 | max_pose_timestamps = max(pose_timestamps)
548 | ts_filted = [t for t in ts_raw if min_pose_timestamps < t < max_pose_timestamps]
549 | abandon_num = len(ts_raw) - len(ts_filted)
550 | print('abandon %d pointclouds that exceed the range of %s' % (abandon_num, filename))
551 |
552 | return ts_filted
553 |
554 |
555 | def filter_overflow_nclt(gt_filename, ts_raw): # filter out-of-range timestamps
556 | # gt_filename: filename of the corresponding ground-truth file
557 | # ts_raw: raw point-cloud timestamps provided by the dataset
558 | ground_truth = np.loadtxt(gt_filename, delimiter=",")[1:, 0]
559 | min_pose_timestamps = min(ground_truth)
560 | max_pose_timestamps = max(ground_truth)
561 | ts_filted = [t for t in ts_raw if min_pose_timestamps < t < max_pose_timestamps]
562 | abandon_num = len(ts_raw) - len(ts_filted)
563 | print('abandon %d pointclouds that exceed the range of %s' % (abandon_num, gt_filename))
564 |
565 | return ts_filted
566 |
567 |
568 | def interpolate_pose_nclt(gt_filename, ts_raw): # interpolate poses at point-cloud timestamps
569 | # gt_filename: filename of the corresponding ground-truth file
570 | # ts_raw: filtered point-cloud timestamps
571 | ground_truth = np.loadtxt(gt_filename, delimiter=",")
572 | ground_truth = ground_truth[np.logical_not(np.any(np.isnan(ground_truth), 1))]
573 | interp = scipy.interpolate.interp1d(ground_truth[:, 0], ground_truth[:, 1:], kind='nearest', axis=0)
574 | pose_gt = interp(ts_raw)
575 |
576 | return pose_gt
577 |
578 |
579 | def so3_to_euler_nclt(poses_in):
580 | N = len(poses_in)
581 | poses_out = np.zeros((N, 4, 4))
582 | for i in range(N):
583 | poses_out[i, :, :] = build_se3_transform([poses_in[i, 0], poses_in[i, 1], poses_in[i, 2],
584 | poses_in[i, 3], poses_in[i, 4], poses_in[i, 5]])
585 |
586 | return poses_out
587 |
588 |
589 | def convert_nclt(x_s, y_s, z_s): # convert raw point-cloud coordinates
590 | # conversion function provided in the dataset documentation
591 | # the original documentation returns x, y, z, but z is negated when plotting visualizations, so it is negated here
592 | scaling = 0.005 # 5 mm
593 | offset = -100.0
594 |
595 | x = x_s * scaling + offset
596 | y = y_s * scaling + offset
597 | z = z_s * scaling + offset
598 |
599 | return x, y, -z
600 |
601 |
602 | def load_velodyne_binary_nclt(filename): # read a binary velodyne point cloud
603 | f_bin = open(filename, "rb")
604 | hits = []
605 | while True:
606 | x_str = f_bin.read(2)
607 | if x_str == b'': # eof
608 | break
609 | x = struct.unpack('<H', x_str)[0]
--------------------------------------------------------------------------------
/utils/train_util.py:
--------------------------------------------------------------------------------
29 | def seed_all_random_engines(seed: int) -> None:
30 | np.random.seed(seed)
31 | torch.manual_seed(seed)
32 | random.seed(seed)
33 |
34 |
35 | class VizStats(Stats):
36 | def plot_stats(
37 | self, viz=None, visdom_env=None, plot_file=None, visdom_server=None, visdom_port=None
38 | ):
39 | # use the cached visdom env if none supplied
40 | if visdom_env is None:
41 | visdom_env = self.visdom_env
42 | if visdom_server is None:
43 | visdom_server = self.visdom_server
44 | if visdom_port is None:
45 | visdom_port = self.visdom_port
46 | if plot_file is None:
47 | plot_file = self.plot_file
48 |
49 | stat_sets = list(self.stats.keys())
50 |
51 | logger.debug(
52 | f"printing charts to visdom env '{visdom_env}' ({visdom_server}:{visdom_port})"
53 | )
54 |
55 | novisdom = False
56 |
57 | if viz is None:
58 | viz = get_visdom_connection(server=visdom_server, port=visdom_port)
59 |
60 | if viz is None or not viz.check_connection():
61 | logger.info("no visdom server! -> skipping visdom plots")
62 | novisdom = True
63 |
64 | lines = []
65 |
66 | # plot metrics
67 | if not novisdom:
68 | viz.close(env=visdom_env, win=None)
69 |
70 | for stat in self.log_vars:
71 | vals = []
72 | stat_sets_now = []
73 | for stat_set in stat_sets:
74 | val = self.stats[stat_set][stat].get_epoch_averages()
75 | if val is None:
76 | continue
77 | else:
78 | val = np.array(val).reshape(-1)
79 | stat_sets_now.append(stat_set)
80 | vals.append(val)
81 |
82 | if len(vals) == 0:
83 | continue
84 |
85 | lines.append((stat_sets_now, stat, vals))
86 |
87 | if not novisdom:
88 | for tmodes, stat, vals in lines:
89 | title = "%s" % stat
90 | opts = {"title": title, "legend": list(tmodes)}
91 | for i, (tmode, val) in enumerate(zip(tmodes, vals)):
92 | update = "append" if i > 0 else None
93 | valid = np.where(np.isfinite(val))[0]
94 | if len(valid) == 0:
95 | continue
96 | x = np.arange(len(val))
97 | viz.line(
98 | Y=val[valid],
99 | X=x[valid],
100 | env=visdom_env,
101 | opts=opts,
102 | win=f"stat_plot_{title}",
103 | name=tmode,
104 | update=update,
105 | )
106 |
107 | if plot_file:
108 | logger.info(f"plotting stats to {plot_file}")
109 | ncol = 3
110 | nrow = int(np.ceil(float(len(lines)) / ncol))
111 | matplotlib.rcParams.update({"font.size": 5})
112 | color = cycle(plt.cm.tab10(np.linspace(0, 1, 10)))
113 | fig = plt.figure(1)
114 | plt.clf()
115 | for idx, (tmodes, stat, vals) in enumerate(lines):
116 | c = next(color)
117 | plt.subplot(nrow, ncol, idx + 1)
118 | plt.gca()
119 | for vali, vals_ in enumerate(vals):
120 | c_ = c * (1.0 - float(vali) * 0.3)
121 | valid = np.where(np.isfinite(vals_))[0]
122 | if len(valid) == 0:
123 | continue
124 | x = np.arange(len(vals_))
125 | plt.plot(x[valid], vals_[valid], c=c_, linewidth=1)
126 | plt.ylabel(stat)
127 | plt.xlabel("epoch")
128 | plt.gca().yaxis.label.set_color(c[0:3] * 0.75)
129 | plt.legend(tmodes)
130 | gcolor = np.array(mcolors.to_rgba("lightgray"))
131 | grid_params = {"visible": True, "color": gcolor}
132 | plt.grid(**grid_params, which="major", linestyle="-", linewidth=0.4)
133 | plt.grid(**grid_params, which="minor", linestyle="--", linewidth=0.2)
134 | plt.minorticks_on()
135 |
136 | plt.tight_layout()
137 | plt.show()
138 | try:
139 | fig.savefig(plot_file)
140 | except PermissionError:
141 | warnings.warn("Can't dump stats due to insufficient permissions!")
142 |
143 |
144 | def rotation_angle(rot_gt, rot_pred, batch_size=None):
145 | # rot_gt, rot_pred (B, 3, 3)
146 | # masks_flat: B, 1
147 | rel_angle_cos = so3_relative_angle(rot_gt, rot_pred, eps=1e-4)
148 | rel_rangle_deg = rel_angle_cos * 180 / np.pi
149 |
150 | if batch_size is not None:
151 | rel_rangle_deg = rel_rangle_deg.reshape(batch_size, -1)
152 |
153 | return rel_rangle_deg
154 |
155 |
156 | def translation_angle(tvec_gt, tvec_pred, batch_size=None):
157 | rel_tangle_deg = evaluate_translation_batch(tvec_gt, tvec_pred)
158 | rel_tangle_deg = rel_tangle_deg * 180.0 / np.pi
159 |
160 | if batch_size is not None:
161 | rel_tangle_deg = rel_tangle_deg.reshape(batch_size, -1)
162 |
163 | return rel_tangle_deg
164 |
165 |
166 | def evaluate_translation_batch(t_gt, t, eps=1e-15, default_err=1e6):
167 | """Normalize the translation vectors and compute the angle between them."""
168 | t_norm = torch.norm(t, dim=1, keepdim=True)
169 | t = t / (t_norm + eps)
170 |
171 | t_gt_norm = torch.norm(t_gt, dim=1, keepdim=True)
172 | t_gt = t_gt / (t_gt_norm + eps)
173 |
174 | loss_t = torch.clamp_min(1.0 - torch.sum(t * t_gt, dim=1) ** 2, eps)
175 | err_t = torch.acos(torch.sqrt(1 - loss_t))
176 |
177 | err_t[torch.isnan(err_t) | torch.isinf(err_t)] = default_err
178 | return err_t
179 |
180 |
181 | def batched_all_pairs(B, N):
182 | # B, N = se3.shape[:2]
183 | i1_, i2_ = torch.combinations(
184 | torch.arange(N), 2, with_replacement=False
185 | ).unbind(-1)
186 | i1, i2 = [
187 | (i[None] + torch.arange(B)[:, None] * N).reshape(-1)
188 | for i in [i1_, i2_]
189 | ]
190 |
191 | return i1, i2
192 |
193 |
194 | class WarmupCosineRestarts(torch.optim.lr_scheduler._LRScheduler):
195 | def __init__(self, optimizer, T_0, iters_per_epoch, T_mult=1, eta_min=0, warmup_ratio=0.1, warmup_lr_init=1e-7,
196 | last_epoch=-1):
197 | self.T_0 = T_0 * iters_per_epoch # 50 * 156988
198 | self.T_mult = T_mult # 1
199 | self.eta_min = eta_min # 0
200 | self.warmup_iters = int(T_0 * warmup_ratio * iters_per_epoch) # e.g. int(50 * 0.1 * 156988)
201 | self.warmup_lr_init = warmup_lr_init # 1e-7
202 | super(WarmupCosineRestarts, self).__init__(optimizer, last_epoch)
203 |
204 | def get_lr(self):
205 | if self.T_mult == 1:
206 | i_restart = self.last_epoch // self.T_0 # number of completed restart cycles
207 | T_cur = self.last_epoch - i_restart * self.T_0 # iterations since the last restart
208 | else:
209 | n = int(math.log((self.last_epoch / self.T_0 * (self.T_mult - 1) + 1), self.T_mult))
210 | T_cur = self.last_epoch - self.T_0 * (self.T_mult ** n - 1) // (self.T_mult - 1)
211 |
212 | if T_cur < self.warmup_iters:
213 | warmup_ratio = T_cur / self.warmup_iters
214 | return [self.warmup_lr_init + (base_lr - self.warmup_lr_init) * warmup_ratio for base_lr in self.base_lrs]
215 | else:
216 | T_cur_adjusted = T_cur - self.warmup_iters
217 | T_i = self.T_0 - self.warmup_iters
218 | return [self.eta_min + (base_lr - self.eta_min) * (1 + math.cos(math.pi * T_cur_adjusted / T_i)) / 2
219 | for base_lr in self.base_lrs]
220 |
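# Usage sketch (illustrative; the argument values are assumptions, not the repo's training
# configuration). The scheduler is stepped once per iteration:
#   scheduler = WarmupCosineRestarts(optimizer, T_0=num_epochs, iters_per_epoch=len(train_loader))
#   for batch in train_loader:
#       ...
#       optimizer.step()
#       scheduler.step()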
221 |
222 | class WarmupCosineLR(toptim._LRScheduler):
223 | ''' Warmup learning rate scheduler.
224 | Initially increases the learning rate from 0 to the final value over a
225 | given number of warmup steps. After the warmup, the LR decays following
226 | a cosine annealing schedule.
227 | '''
228 |
229 | def __init__(self, optimizer, lr, warmup_steps, momentum, max_steps):
230 | # cyclic params
231 | self.optimizer = optimizer
232 | self.lr = lr
233 | self.warmup_steps = warmup_steps
234 | self.momentum = momentum
235 |
236 | # cap to one
237 | if self.warmup_steps < 1:
238 | self.warmup_steps = 1
239 |
240 | # cyclic lr
241 | self.cosine_scheduler = toptim.CosineAnnealingLR(
242 | self.optimizer, T_max=max_steps)
243 |
244 | self.initial_scheduler = toptim.CyclicLR(self.optimizer,
245 | base_lr=0,
246 | max_lr=self.lr,
247 | step_size_up=self.warmup_steps,
248 | step_size_down=self.warmup_steps,
249 | cycle_momentum=False,
250 | base_momentum=self.momentum,
251 | max_momentum=self.momentum)
252 |
253 | self.last_epoch = -1
254 | self.finished = False
255 | super().__init__(optimizer)
256 |
257 | def step(self, epoch=None):
258 | if self.finished or self.initial_scheduler.last_epoch >= self.warmup_steps:
259 | if not self.finished:
260 | self.base_lrs = [self.lr for lr in self.base_lrs]
261 | self.finished = True
262 | return self.cosine_scheduler.step(epoch)
263 | else:
264 | return self.initial_scheduler.step(epoch)
265 |
266 |
267 | class CosineLR(toptim._LRScheduler):
268 | ''' Learning rate scheduler.
269 | The LR decays following a cosine annealing schedule.
270 | '''
271 |
272 | def __init__(self, optimizer, max_steps):
273 | # cyclic params
274 | self.optimizer = optimizer
275 | # cyclic lr
276 | self.cosine_scheduler = toptim.CosineAnnealingLR(
277 | self.optimizer, T_max=max_steps)
278 |
279 | super().__init__(optimizer)
280 |
281 | def step(self, epoch=None):
282 | return self.cosine_scheduler.step(epoch)
283 |
284 |
285 | # training schedule helpers (poly decay with linear warmup)
286 | POWER = 0.9
287 |
288 |
289 | def lr_poly(base_lr, iter, max_iter, power):
290 | return base_lr * ((1 - float(iter) / max_iter) ** power)
291 |
292 |
293 | def lr_warmup(base_lr, iter, max_iter, warmup_iter):
294 | return base_lr * (float(iter) / warmup_iter)
295 |
296 |
297 | def adjust_learning_rate(lr, i_iter, max_iter, PREHEAT_STEPS):
298 | if i_iter < PREHEAT_STEPS:
299 | lr = lr_warmup(lr, i_iter, max_iter, PREHEAT_STEPS)
300 | else:
301 | lr = lr_poly(lr, i_iter, max_iter, POWER)
302 |
303 | return lr
304 |
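# Usage sketch (illustrative): recompute the LR every iteration and write it into the optimizer:
#   lr = adjust_learning_rate(base_lr, i_iter, max_iter, PREHEAT_STEPS)
#   for g in optimizer.param_groups:
#       g['lr'] = lr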
305 |
306 | class DynamicBatchSampler(BatchSampler):
307 | def __init__(self, num_sequences, dataset_len=1024, max_images=128, images_per_seq=(3, 20)):
308 | # len(dataset): 32; cfg.train.len_train: 16384; max_image: 512; images_per_seq: [3, 51]
309 | # self.dataset = dataset
310 | self.max_images = max_images
311 | self.images_per_seq = list(range(images_per_seq[0], images_per_seq[1]))
312 | self.num_sequences = num_sequences
313 | self.dataset_len = dataset_len
314 |
315 | def _capped_random_choice(self, x, size, replace: bool = True):
316 | len_x = x if isinstance(x, int) else len(x)
317 | if replace:
318 | return np.random.choice(x, size=size, replace=len_x < size)
319 | else:
320 | return np.random.choice(x, size=min(size, len_x), replace=False)
321 |
322 | def __iter__(self):
323 | for batch_idx in range(self.dataset_len):
324 | # NOTE batch_idx is never used later
325 | # print(f"process {batch_idx}")
326 | n_per_seq = np.random.choice(self.images_per_seq)
327 | n_seqs = (self.max_images // n_per_seq)
328 |
329 | chosen_seq = self._capped_random_choice(self.num_sequences, n_seqs)
330 | # print(f"get the chosen_seq for {batch_idx}")
331 |
332 | batches = [(bidx, n_per_seq) for bidx in chosen_seq]
333 | # print(f"yield the batches for {batch_idx}")
334 | yield batches
335 |
336 | def __len__(self):
337 | return self.dataset_len
338 |
339 |
340 | class FixBatchSampler(torch.utils.data.Sampler):
341 | def __init__(self, dataset, dataset_len=1024, batch_size=64, max_images=128, images_per_seq=(3, 20)):
342 | # dataset; len_train: 16384; max_images: 512; images_per_seq: [3, 51]
343 | self.dataset = dataset
344 | self.max_images = max_images # 32
345 | self.images_per_seq = list(range(images_per_seq[0], images_per_seq[1])) # [3, ..., 21]
346 | self.num_sequences = len(self.dataset) # 4
347 | self.dataset_len = dataset_len # 156988
348 | self.batch_size = 48 # batch size is hardcoded here; the batch_size argument is ignored
349 | self.fix_images_per_seq = True
350 |
351 | def _capped_random_choice(self, x, size, replace: bool = True):
352 | len_x = x if isinstance(x, int) else len(x)
353 | if replace:
354 | return np.random.choice(x, size=size, replace=len_x < size)
355 | else:
356 | return np.random.choice(x, size=min(size, len_x), replace=False)
357 |
358 | def __iter__(self):
359 | for batch_idx in range(self.dataset_len):
360 | # NOTE batch_idx is never used later
361 | # print(f"process {batch_idx}")
362 | if self.fix_images_per_seq:
363 | # n_per_seq = 12
364 | n_per_seq = 1
365 | else:
366 | n_per_seq = np.random.choice(self.images_per_seq)
367 |
368 | n_seqs = self.batch_size
369 |
370 | chosen_seq = self._capped_random_choice(self.num_sequences, n_seqs)
371 | # print(f"get the chosen_seq for {batch_idx}")
372 |
373 | batches = [(bidx, n_per_seq) for bidx in chosen_seq]
374 | # print(f"yield the batches for {batch_idx}")
375 | yield batches
376 |
377 | def __len__(self):
378 | return self.dataset_len
379 |
380 |
381 | def find_last_checkpoint(
382 | exp_dir, any_path: bool = False, all_checkpoints: bool = False
383 | ):
384 | if any_path:
385 | exts = [".pth", "_stats.jgz", "_opt.pth"]
386 | else:
387 | exts = [".pth"]
388 |
389 | for ext in exts:
390 | fls = sorted(
391 | glob.glob(
392 | os.path.join(glob.escape(exp_dir), "model_epoch_" + "[0-9]" * 8 + ext)
393 | )
394 | )
395 | if len(fls) > 0:
396 | break
397 | # pyre-fixme[61]: `fls` is undefined, or not always defined.
398 | if len(fls) == 0:
399 | fl = None
400 | else:
401 | if all_checkpoints:
402 | # pyre-fixme[61]: `fls` is undefined, or not always defined.
403 | fl = [f[0: -len(ext)] + ".pth" for f in fls]
404 | else:
405 | # pyre-fixme[61]: `ext` is undefined, or not always defined.
406 | fl = fls[-1][0: -len(ext)] + ".pth"
407 |
408 | return fl
409 |
410 |
411 | def get_equal_size_feat(feat_3d, sup_point, cls_label, B, pcs_num, max_select):
412 | # sample or duplicate points so that every sample in the batch has a fixed number of points
413 | bs = B
414 | feat_3d_copy = feat_3d.F
415 | sup_point_copy = sup_point
416 | feat_3d_new = []
417 | sup_point_new = []
418 | for bs_i in range(bs):
419 | # 1. get the indices belonging to sample bs_i
420 | temp_bs_i_ind = feat_3d.C[:, 0] == bs_i
421 | # 2. get the features of sample bs_i
422 | temp_bs_i_ft = feat_3d_copy[temp_bs_i_ind]
423 | temp_bs_i_sp = sup_point_copy[temp_bs_i_ind]
424 | # 3. pad each sample with pcs_num - len(temp_bs_i_ft) extra features
425 | len_rand_inds = pcs_num - len(temp_bs_i_ft)
426 | if len_rand_inds > 0:
427 | # padding needed: randomly duplicate some existing points
428 | rand_inds = np.random.choice(len(temp_bs_i_ft), len_rand_inds, replace=True)
429 | # for feat
430 | temp_rand_bs_i_ft = temp_bs_i_ft[rand_inds]
431 | new_bs_i_ft = torch.cat((temp_bs_i_ft, temp_rand_bs_i_ft), 0)
432 | # for sup
433 | temp_rand_bs_i_sp = temp_bs_i_sp[rand_inds]
434 | new_bs_i_sp = torch.cat((temp_bs_i_sp, temp_rand_bs_i_sp), 0)
435 | else:
436 | # more than pcs_num points here, so randomly sample pcs_num of them
437 | rand_inds = np.random.choice(len(temp_bs_i_ft), pcs_num, replace=True)
438 | new_bs_i_ft = temp_bs_i_ft[rand_inds]
439 | new_bs_i_sp = temp_bs_i_sp[rand_inds]
440 |
441 | # for feat
442 | feat_3d_new.append(new_bs_i_ft)
443 | # for sp
444 | sup_point_new.append(new_bs_i_sp)
445 |
446 | # flatten to [B*N, C]
447 | feat_3d_new = normalize_shape(torch.stack(feat_3d_new))[:max_select]
448 | sup_point_new = normalize_shape(torch.stack(sup_point_new))[:max_select]
449 | label_new = normalize_shape(cls_label.unsqueeze(1).repeat(1, pcs_num, 1))[:max_select] # [B, 3]
450 |
451 | # without flattening, the shapes would be [B, N, C]
452 | # feat_3d_new = torch.stack(feat_3d_new)
453 | # sup_point_new = torch.stack(sup_point_new)
454 | # label_new = cls_label.unsqueeze(1).repeat(1, pcs_num, 1) # [B, N, 3]
455 |
456 | return feat_3d_new, sup_point_new, label_new
457 |
458 |
459 | def normalize_shape(tensor_in):
460 | """Bring tensor from shape BxNxC to BNxC"""
461 | return tensor_in.transpose(0, 2).flatten(1).transpose(0, 1)
--------------------------------------------------------------------------------
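A tiny shape check for the `normalize_shape` helper above (illustrative; it assumes the helper lives in utils/train_util.py as the repository tree suggests):

```
import torch
from utils.train_util import normalize_shape

x = torch.rand(4, 100, 32)         # [B, N, C]
print(normalize_shape(x).shape)    # torch.Size([400, 32]), i.e. [B*N, C]
```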