├── README.md
├── datasets
│   ├── __pycache__
│   │   ├── base_loader.cpython-39.pyc
│   │   ├── infobatch.cpython-39.pyc
│   │   └── lidarloc.cpython-39.pyc
│   ├── base_loader.py
│   ├── lidarloc.py
│   ├── robotcar_sdk
│   │   ├── extrinsics
│   │   │   ├── ins.txt
│   │   │   ├── ldmrs.txt
│   │   │   ├── lms_front.txt
│   │   │   ├── lms_rear.txt
│   │   │   ├── mono_left.txt
│   │   │   ├── mono_rear.txt
│   │   │   ├── mono_right.txt
│   │   │   ├── nlct_velodyne.txt
│   │   │   ├── radar.txt
│   │   │   ├── stereo.txt
│   │   │   ├── velodyne_left.txt
│   │   │   └── velodyne_right.txt
│   │   └── python
│   │       ├── README.md
│   │       ├── __init__.py
│   │       ├── __pycache__
│   │       │   ├── __init__.cpython-310.pyc
│   │       │   ├── __init__.cpython-36.pyc
│   │       │   ├── __init__.cpython-37.pyc
│   │       │   ├── __init__.cpython-38.pyc
│   │       │   ├── __init__.cpython-39.pyc
│   │       │   ├── interpolate_poses.cpython-36.pyc
│   │       │   ├── interpolate_poses.cpython-37.pyc
│   │       │   ├── interpolate_poses.cpython-38.pyc
│   │       │   ├── transform.cpython-310.pyc
│   │       │   ├── transform.cpython-36.pyc
│   │       │   ├── transform.cpython-37.pyc
│   │       │   ├── transform.cpython-38.pyc
│   │       │   ├── transform.cpython-39.pyc
│   │       │   ├── velodyne.cpython-310.pyc
│   │       │   ├── velodyne.cpython-36.pyc
│   │       │   ├── velodyne.cpython-37.pyc
│   │       │   └── velodyne.cpython-38.pyc
│   │       ├── build_pointcloud.py
│   │       ├── camera_model.py
│   │       ├── image.py
│   │       ├── interpolate_poses.py
│   │       ├── play_images.py
│   │       ├── play_radar.py
│   │       ├── play_velodyne.py
│   │       ├── project_laser_into_camera.py
│   │       ├── radar.py
│   │       ├── requirements.txt
│   │       ├── transform.py
│   │       └── velodyne.py
│   └── rsd.py
├── img
│   ├── Pipline.png
│   └── Teaser.png
├── install.sh
├── models
│   ├── __pycache__
│   │   ├── ace.cpython-310.pyc
│   │   ├── ace.cpython-39.pyc
│   │   ├── loss.cpython-39.pyc
│   │   └── sc2pcr.cpython-39.pyc
│   ├── lightloc.py
│   ├── loss.py
│   └── sc2pcr.py
├── nclt_process.py
├── test.py
├── train.py
└── utils
    ├── __init__.py
    ├── __pycache__
    │   ├── __init__.cpython-39.pyc
    │   ├── augmentor.cpython-39.pyc
    │   ├── pose_util.cpython-39.pyc
    │   └── train_util.cpython-39.pyc
    ├── augmentor.py
    ├── pose_util.py
    └── train_util.py
/README.md:
--------------------------------------------------------------------------------
1 | # LightLoc: Learning Outdoor LiDAR Localization at Light Speed
2 | 
3 | **[Paper](https://arxiv.org/pdf/2503.17814)** | **[Poster](https://drive.google.com/file/d/1Q5B6Ec4NN-NArkpaAjupIMamdCnwTknp/view?usp=sharing)**
4 | 
5 | 
6 | 
7 | **LightLoc** is a fast and efficient framework for outdoor LiDAR-based localization. This repository provides the implementation of the LightLoc method, as described in our paper.
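For orientation, here is a minimal usage sketch (not part of the original repository) showing how the dataset and collation classes defined in `datasets/lidarloc.py` and `datasets/base_loader.py` below assemble a batch. The data path is a placeholder, and the sketch assumes the environment from the Environment section has been installed and that the ground-truth h5 and pose-stats files described under Dataset Structure are already in place.

```python
# Minimal sketch (not part of the repo): load one test-split batch through
# LiDARLocDataset and the regression collation function.
# '/path/to/Oxford' is a placeholder; the last path component must be 'Oxford'
# or 'NCLT', since datasets/lidarloc.py infers the scene from it.
from torch.utils.data import DataLoader

from datasets.lidarloc import LiDARLocDataset
from datasets.base_loader import CollationFunctionFactory

dataset = LiDARLocDataset(root_dir='/path/to/Oxford', train=False, voxel_size=0.25)
collate_fn = CollationFunctionFactory(collation_type='collate_pair_reg')
loader = DataLoader(dataset, batch_size=2, shuffle=False, collate_fn=collate_fn)

batch = next(iter(loader))
# Sparse coordinates/features for MinkowskiEngine plus ground-truth poses
print(batch['sinput_C'].shape, batch['sinput_F'].shape, batch['pose'].shape)
```

The training and evaluation scripts (`train.py`, `test.py`) drive these same classes; the snippet is only meant to show how a batch is put together.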
8 | 
9 | --- 
10 | ## Environment 
11 | 
12 | ### Requirements 
13 | - Python 3.8 
14 | 
15 | - PyTorch 1.11 
16 | 
17 | - CUDA 11.3 
18 | ### Installation 
19 | To set up the environment, run: 
20 | ``` 
21 | source install.sh 
22 | ``` 
23 | 
24 | ## Supported Datasets 
25 | 
26 | LightLoc currently supports the following datasets: 
27 | - [Oxford Radar RobotCar](https://oxford-robotics-institute.github.io/radar-robotcar-dataset/datasets) 
28 | - [NCLT](https://robots.engin.umich.edu/nclt/) 
29 | 
30 | ### Dataset Structure 
31 | Organize the dataset directories as follows: 
32 | 
33 | - Oxford 
34 | ``` 
35 | data_root 
36 | ├── 2019-01-11-14-02-26-radar-oxford-10k 
37 | │   ├── velodyne_left 
38 | │   │   ├── xxx.bin 
39 | │   │   ├── xxx.bin 
40 | │   │   ├── … 
41 | │   ├── velodyne_left_calibrateFalse.h5 
42 | │   ├── velodyne_left_False.h5 
43 | ├── … 
44 | ``` 
45 | - NCLT 
46 | ``` 
47 | data_root 
48 | ├── 2012-01-22 
49 | │   ├── velodyne_left 
50 | │   │   ├── xxx.bin 
51 | │   │   ├── xxx.bin 
52 | │   │   ├── … 
53 | │   ├── velodyne_sync 
54 | │   │   ├── xxx.bin 
55 | │   │   ├── xxx.bin 
56 | │   │   ├── … 
57 | │   ├── velodyne_left_False.h5 
58 | ├── … 
59 | ``` 
60 | **Note:** 
61 | - The h5 ground-truth files are provided in the [groundtruth](https://drive.google.com/drive/folders/1IAPbppgy88fr3KEgcKHJHUvdC0q1TJTo?usp=sharing) folder. 
62 | - We use [nclt_process.py](nclt_process.py) to preprocess velodyne_sync and speed up data reading: 
63 | ``` 
64 | python nclt_process.py 
65 | ``` 
66 | 
67 | ## Train 
68 | ![image](img/Pipline.png) 
69 | #### (QE)Oxford 
70 | 
71 | - Train classifier 
72 | ``` 
73 | python train.py --sample_cls=True --generate_clusters=True --batch_size=512 --epochs=50 --level_cluster=25 
74 | ``` 
75 | - Train regressor 
76 | ``` 
77 | python train.py --sample_cls=False --generate_clusters=False --batch_size=256 --epochs=25 --rsd=True --level_cluster=25 
78 | ``` 
79 | **Note:** To use the QEOxford dataset, uncomment line 81 in [lidarloc.py](datasets/lidarloc.py). 
80 | #### NCLT 
81 | 
82 | - Train classifier 
83 | ``` 
84 | python train.py --sample_cls=True --generate_clusters=True --batch_size=512 --epochs=50 --level_cluster=100 --training_buffer_size=120000 --voxel_size=0.3 
85 | ``` 
86 | - Train regressor 
87 | ``` 
88 | python train.py --sample_cls=False --generate_clusters=False --batch_size=256 --epochs=30 --rsd=True --prune_ratio=0.15 --level_cluster=100 --voxel_size=0.3 
89 | ``` 
90 | ## Test 
91 | #### (QE)Oxford 
92 | ``` 
93 | python test.py 
94 | ``` 
95 | #### NCLT 
96 | ``` 
97 | python test.py --voxel_size=0.3 
98 | ``` 
99 | **Note:** For results matching Table 3 in the paper, exclude lines 4300–4500 from the 2012-05-26 trajectory. 
100 | 
101 | ## Model Zoo 
102 | Pretrained models (backbone and scene-specific prediction heads) are available for download [here](https://drive.google.com/drive/folders/1ZuEs7NbVGO8afqTZM0xCYJgeLtNzPbt0?usp=sharing). 
103 | 
104 | ## Citation 
105 | If you find this codebase useful in your research, please cite our work: 
106 | ``` 
107 | @article{li2025lightloc, 
108 |   title={LightLoc: Learning Outdoor LiDAR Localization at Light Speed}, 
109 |   author={Li, Wen and Liu, Chen and Yu, Shangshu and Liu, Dunqiang and Zhou, Yin and Shen, Siqi and Wen, Chenglu and Wang, Cheng}, 
110 |   journal={arXiv preprint arXiv:2503.17814}, 
111 |   year={2025} 
112 | } 
113 | ``` 
114 | This code builds on previous LiDAR localization pipelines, namely SGLoc and DiffLoc.
Please consider citing: 115 | ``` 116 | @inproceedings{li2024diffloc, 117 | title={DiffLoc: Diffusion Model for Outdoor LiDAR Localization}, 118 | author={Li, Wen and Yang, Yuyang and Yu, Shangshu and Hu, Guosheng and Wen, Chenglu and Cheng, Ming and Wang, Cheng}, 119 | booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition}, 120 | pages={15045--15054}, 121 | year={2024} 122 | } 123 | 124 | @inproceedings{li2023sgloc, 125 | title={SGLoc: Scene Geometry Encoding for Outdoor LiDAR Localization}, 126 | author={Li, Wen and Yu, Shangshu and Wang, Cheng and Hu, Guosheng and Shen, Siqi and Wen, Chenglu}, 127 | booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition}, 128 | pages={9286--9295}, 129 | year={2023} 130 | } 131 | ``` 132 | -------------------------------------------------------------------------------- /datasets/__pycache__/base_loader.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/__pycache__/base_loader.cpython-39.pyc -------------------------------------------------------------------------------- /datasets/__pycache__/infobatch.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/__pycache__/infobatch.cpython-39.pyc -------------------------------------------------------------------------------- /datasets/__pycache__/lidarloc.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/__pycache__/lidarloc.cpython-39.pyc -------------------------------------------------------------------------------- /datasets/base_loader.py: -------------------------------------------------------------------------------- 1 | import logging 2 | import torch 3 | import numpy as np 4 | import MinkowskiEngine as ME 5 | 6 | 7 | class CollationFunctionFactory: 8 | def __init__(self, collation_type='default'): 9 | if collation_type == 'default': 10 | self.collation_fn = self.collate_default 11 | elif collation_type == 'collate_pair_cls': 12 | self.collation_fn = self.collate_pair_cls_fn 13 | elif collation_type == 'collate_pair_reg': 14 | self.collation_fn = self.collate_pair_reg_fn 15 | 16 | def __call__(self, list_data): 17 | return self.collation_fn(list_data) 18 | 19 | def collate_default(self, list_data): 20 | return list_data 21 | 22 | def collate_pair_cls_fn(self, list_data): 23 | list_data = [list_data] 24 | N = len(list_data) 25 | list_data = [data for data in list_data if data is not None] 26 | if N != len(list_data): 27 | logging.info(f"Retain {len(list_data)} from {N} data.") 28 | 29 | if len(list_data) == 0: 30 | raise ValueError('No data in the batch') 31 | 32 | coords, feats, lbl, pose, scene = list(zip(*list_data)) 33 | lbl_batch = torch.from_numpy(np.stack(lbl)).to(torch.int64) 34 | pose_batch = torch.from_numpy(np.stack(pose)).float() 35 | self.num_scenes = pose_batch.size(1) 36 | coords_batch = self.batched_seq_coordinates(coords, scene) 37 | feats_batch = torch.from_numpy(np.concatenate(feats, 0)).float() 38 | 39 | return { 40 | 'sinput_C': coords_batch, 41 | 'sinput_F': feats_batch, 42 | 'lbl': lbl_batch, 43 | 'pose': pose_batch 44 | } 45 | 46 | def collate_pair_reg_fn(self, list_data): 47 | 
N = len(list_data) 48 | list_data = [data for data in list_data if data is not None] 49 | if N != len(list_data): 50 | logging.info(f"Retain {len(list_data)} from {N} data.") 51 | 52 | if len(list_data) == 0: 53 | raise ValueError('No data in the batch') 54 | 55 | coords, feats, lbl, idx, pose = list(zip(*list_data)) 56 | lbl_batch = torch.from_numpy(np.stack(lbl)).to(torch.int64) 57 | idx_batch = torch.from_numpy(np.stack(idx)).to(torch.int64) 58 | pose_batch = torch.from_numpy(np.stack(pose)).float() 59 | coords_batch = ME.utils.batched_coordinates(coords) 60 | feats_batch = torch.from_numpy(np.concatenate(feats, 0)).float() 61 | 62 | return { 63 | 'sinput_C': coords_batch, 64 | 'sinput_F': feats_batch, 65 | 'lbl': lbl_batch, 66 | 'idx': idx_batch, 67 | 'pose': pose_batch 68 | } 69 | 70 | def batched_seq_coordinates(self, coords, scenes, dtype=torch.int32, device=None): 71 | D = np.unique(np.array([cs.shape[1] for cs in coords])) 72 | D = D[0] 73 | if device is None: 74 | if isinstance(coords, torch.Tensor): 75 | device = coords[0].device 76 | else: 77 | device = "cpu" 78 | N = np.array([len(cs) for cs in coords]).sum() 79 | bcoords = torch.zeros((N, D + 1), dtype=dtype, device=device) # uninitialized 80 | s = 0 81 | scenes = list(scenes) 82 | for i in range(len(scenes)): 83 | scenes[i] = list(scenes[i]) 84 | if i > 0: 85 | scenes[i] = [x + scenes[i-1][-1] for x in scenes[i]] 86 | for b, cs in enumerate(coords): 87 | if dtype == torch.int32: 88 | if isinstance(cs, np.ndarray): 89 | cs = torch.from_numpy(np.floor(cs)) 90 | elif not ( 91 | isinstance(cs, torch.IntTensor) or isinstance(cs, torch.LongTensor) 92 | ): 93 | cs = cs.floor() 94 | 95 | cs = cs.int() 96 | else: 97 | if isinstance(cs, np.ndarray): 98 | cs = torch.from_numpy(cs) 99 | if b > 0: 100 | for i in range(self.num_scenes): 101 | bcoords[scenes[b][i]: scenes[b][i + 1], 1:] = cs[scenes[b][i] - scenes[b - 1][-1]: 102 | scenes[b][i + 1] - scenes[b - 1][-1]] 103 | bcoords[scenes[b][i]: scenes[b][i + 1], 0] = s 104 | s += 1 105 | else: 106 | for i in range(self.num_scenes): 107 | bcoords[scenes[b][i]: scenes[b][i + 1], 1:] = cs[scenes[b][i]: scenes[b][i + 1]] 108 | bcoords[scenes[b][i]: scenes[b][i + 1], 0] = s 109 | s += 1 110 | return bcoords -------------------------------------------------------------------------------- /datasets/lidarloc.py: -------------------------------------------------------------------------------- 1 | import h5py 2 | import logging 3 | import numpy as np 4 | import os.path as osp 5 | import concurrent.futures 6 | import MinkowskiEngine as ME 7 | from torch.utils.data import Dataset 8 | from sklearn.cluster import KMeans 9 | 10 | from utils.pose_util import process_poses, calibrate_process_poses 11 | from utils.augmentor import Augmentor, AugmentParams 12 | 13 | _logger = logging.getLogger(__name__) 14 | BASE_DIR = osp.dirname(osp.abspath(__file__)) 15 | 16 | 17 | def compute_distances_chunk(xyz, centers): 18 | """ 19 | Calculate the distance between xyz and centers, and return the index of the minimum distance 20 | """ 21 | dist_mat = np.linalg.norm(xyz[:, np.newaxis, :2] - centers[np.newaxis, :, :2], axis=-1) 22 | lbl_chunk = np.argmin(dist_mat, axis=1) 23 | 24 | return lbl_chunk 25 | 26 | 27 | def parallel_kmeans(xyz, centers, n_jobs=8): 28 | """ 29 | Use concurrent.futures to compute the distance matrix in parallel and return the index of the minimum distance. 
30 | """ 31 | chunk_size = len(xyz) // n_jobs 32 | xyz_chunks = [xyz[i:i + chunk_size] for i in range(0, len(xyz), chunk_size)] 33 | 34 | with concurrent.futures.ProcessPoolExecutor(max_workers=n_jobs) as executor: 35 | results = list(executor.map(compute_distances_chunk, xyz_chunks, [centers] * len(xyz_chunks))) 36 | 37 | lbl = np.concatenate(results, axis=0) 38 | 39 | return lbl 40 | 41 | 42 | class LiDARLocDataset(Dataset): 43 | """LiDAR localization dataset. 44 | 45 | Access to point clouds, calibration and ground truth data given a dataset directory 46 | """ 47 | 48 | def __init__(self, 49 | root_dir, 50 | train=True, 51 | sample_cls=False, 52 | augment=False, 53 | aug_rotation=10, 54 | aug_translation=1, 55 | generate_clusters=False, 56 | level_clusters=25, 57 | voxel_size=0.25): 58 | 59 | # Only support Oxford and NCLT 60 | self.root_dir = root_dir 61 | 62 | self.train = train 63 | self.augment = augment 64 | self.aug_rotation = aug_rotation 65 | self.aug_translation = aug_translation 66 | self.voxel_size = voxel_size 67 | 68 | # which dataset? 69 | self.scene = osp.split(root_dir)[-1] 70 | if self.scene == 'Oxford': 71 | if self.train: 72 | seqs = ['2019-01-11-14-02-26-radar-oxford-10k', '2019-01-14-12-05-52-radar-oxford-10k', 73 | '2019-01-14-14-48-55-radar-oxford-10k', '2019-01-18-15-20-12-radar-oxford-10k'] 74 | else: 75 | # change test seqs 76 | seqs = ['2019-01-15-13-06-37-radar-oxford-10k'] 77 | # seqs = ['2019-01-17-13-26-39-radar-oxford-10k'] 78 | # seqs = ['2019-01-17-14-03-00-radar-oxford-10k'] 79 | # seqs = ['2019-01-18-14-14-42-radar-oxford-10k'] 80 | 81 | self.scene = 'QEOxford' # If you use the QEOxford dataset, please uncomment this note 82 | 83 | elif self.scene == 'NCLT': 84 | if self.train: 85 | seqs = ['2012-01-22', '2012-02-02', '2012-02-18', '2012-05-11'] 86 | else: 87 | # seqs = ['2012-02-12'] 88 | # seqs = ['2012-02-19'] 89 | seqs = ['2012-03-31'] 90 | # seqs = ['2012-05-26'] 91 | else: 92 | raise RuntimeError('Only support Oxford and NCLT!') 93 | 94 | ps = {} 95 | ts = {} 96 | vo_stats = {} 97 | self.pcs = [] 98 | 99 | for seq in seqs: 100 | seq_dir = osp.join(self.root_dir, seq) 101 | if self.scene == 'QEOxford': 102 | h5_path = osp.join(self.root_dir, seq, 'velodyne_left_' + 'calibrateFalse.h5') 103 | else: 104 | h5_path = osp.join(self.root_dir, seq, 'velodyne_left_' + 'False.h5') 105 | 106 | # load h5 file, save pose interpolating time 107 | print("load " + seq + ' pose from ' + h5_path) 108 | h5_file = h5py.File(h5_path, 'r') 109 | ts[seq] = h5_file['valid_timestamps'][5:-5] 110 | ps[seq] = h5_file['poses'][5:-5] 111 | self.pcs.extend( 112 | [osp.join(seq_dir, 'velodyne_left', '{:d}.bin'.format(t)) for t in ts[seq]]) 113 | 114 | vo_stats[seq] = {'R': np.eye(3), 't': np.zeros(3), 's': 1} 115 | 116 | poses = np.empty((0, 12)) 117 | for p in ps.values(): 118 | poses = np.vstack((poses, p)) 119 | 120 | kmeans_pose_stats_filename = osp.join(self.root_dir, self.scene + '_cls_pose_stats.txt') 121 | mean_pose_stats_filename = osp.join(self.root_dir, self.scene + '_pose_stats.txt') 122 | 123 | self.poses = np.empty((0, 6)) 124 | self.rots = np.empty((0, 3, 3)) 125 | 126 | if self.train: 127 | if self.scene == 'QEOxford': 128 | # Calculate the mean value of translations in training seqs 129 | self.mean_t = np.mean(poses[:, 9:], axis=0) # (3,) 130 | # Save 131 | np.savetxt(mean_pose_stats_filename, self.mean_t, fmt='%8.7f') 132 | else: 133 | # Calculate the mean value of translations in training seqs 134 | self.mean_t = np.mean(poses[:, [3, 7, 11]], axis=0) # (3,) 
135 | # Save 136 | np.savetxt(mean_pose_stats_filename, self.mean_t, fmt='%8.7f') 137 | else: 138 | self.mean_t = np.loadtxt(mean_pose_stats_filename) 139 | 140 | for seq in seqs: 141 | if self.scene == 'QEOxford': 142 | pss, rotation = calibrate_process_poses(poses_in=ps[seq], mean_t=self.mean_t, align_R=vo_stats[seq]['R'], 143 | align_t=vo_stats[seq]['t'], align_s=vo_stats[seq]['s']) 144 | else: 145 | pss, rotation = process_poses(poses_in=ps[seq], mean_t=self.mean_t, align_R=vo_stats[seq]['R'], 146 | align_t=vo_stats[seq]['t'], align_s=vo_stats[seq]['s']) 147 | self.poses = np.vstack((self.poses, pss)) 148 | self.rots = np.vstack((self.rots, rotation)) 149 | 150 | xyz = self.poses[:, :3] 151 | 152 | if self.train: 153 | if sample_cls and generate_clusters: 154 | print("generate clusters") 155 | centers = [] 156 | level = KMeans(n_clusters=level_clusters, random_state=0).fit(xyz) 157 | for l_cid in range(level_clusters): 158 | centers.append(level.cluster_centers_[l_cid]) 159 | centers = np.array(centers) 160 | np.savetxt(kmeans_pose_stats_filename, centers, fmt='%8.7f') 161 | self.lbl = parallel_kmeans(xyz[:, :2], centers[:, :2]).reshape(-1, 1) 162 | else: 163 | centers = np.loadtxt(kmeans_pose_stats_filename) 164 | self.lbl = parallel_kmeans(xyz[:, :2], centers[:, :2]).reshape(-1, 1) 165 | else: 166 | if sample_cls: 167 | centers = np.loadtxt(kmeans_pose_stats_filename) 168 | self.lbl = parallel_kmeans(xyz[:, :2], centers[:, :2]).reshape(-1, 1) 169 | else: 170 | centers = np.zeros((level_clusters, 3)) 171 | self.lbl = np.zeros((len(xyz), 1)) 172 | 173 | self.centers = centers 174 | 175 | # data augment 176 | augment_params = AugmentParams() 177 | if self.augment: 178 | augment_params.setTranslationParams( 179 | p_transx=0.5, trans_xmin=-1 * self.aug_translation, trans_xmax=self.aug_translation, 180 | p_transy=0.5, trans_ymin=-1 * self.aug_translation, trans_ymax=self.aug_translation, 181 | p_transz=0, trans_zmin=-1 * self.aug_translation, trans_zmax=self.aug_translation) 182 | augment_params.setRotationParams( 183 | p_rot_roll=0, rot_rollmin=-1 * self.aug_rotation, rot_rollmax=self.aug_rotation, 184 | p_rot_pitch=0, rot_pitchmin=-1 * self.aug_rotation, rot_pitchmax=self.aug_rotation, 185 | p_rot_yaw=0.5, rot_yawmin=-1 * self.aug_rotation, rot_yawmax=self.aug_rotation) 186 | self.augmentor = Augmentor(augment_params) 187 | else: 188 | self.augmentor = None 189 | 190 | if self.train: 191 | print("train data num:" + str(len(self.poses))) 192 | else: 193 | print("valid data num:" + str(len(self.poses))) 194 | 195 | def __len__(self): 196 | return len(self.poses) 197 | 198 | def __getitem__(self, idx): 199 | # For Classification training 200 | if type(idx) == list: 201 | scenes = [0] 202 | num = 0 203 | for i in idx: 204 | scan_path = self.pcs[i] 205 | if self.scene == 'NCLT': 206 | ptcld = np.fromfile(scan_path, dtype=np.float32).reshape(-1, 4)[:, :3] 207 | else: 208 | ptcld = np.fromfile(scan_path, dtype=np.float32).reshape(4, -1).transpose()[:, :3] 209 | # just for z-axis up 210 | ptcld[:, 2] = -1 * ptcld[:, 2] 211 | 212 | scan = ptcld 213 | 214 | scan = np.ascontiguousarray(scan) 215 | 216 | lbl = self.lbl[i] 217 | pose = self.poses[i] 218 | rot = self.rots[i] 219 | scan_gt = (rot @ scan.transpose(1, 0)).transpose(1, 0) + pose[:3].reshape(1, 3) 220 | 221 | if self.train & self.augment: 222 | scan = self.augmentor.doAugmentation(scan) # n, 5 223 | 224 | scan_gt_s8 = np.concatenate((scan, scan_gt), axis=1) 225 | 226 | coord, feat = ME.utils.sparse_quantize( 227 | coordinates=scan, 228 | 
features=scan_gt_s8, 229 | quantization_size=self.voxel_size) 230 | 231 | if num == 0: 232 | coords = coord 233 | feats = feat 234 | lbls = lbl.reshape(-1, 1) 235 | poses = pose.reshape(1, 6) 236 | scenes.append(len(coord)) 237 | else: 238 | coords = np.concatenate((coords, coord)) 239 | feats = np.concatenate((feats, feat)) 240 | lbls = np.concatenate((lbls, lbl.reshape(-1, 1))) 241 | poses = np.concatenate((poses, pose.reshape(1, 6))) 242 | scenes.append(scenes[num] + len(coord)) 243 | num += 1 244 | 245 | return (coords, feats, lbls, poses, scenes) 246 | 247 | else: 248 | # For Regression training 249 | scan_path = self.pcs[idx] 250 | if self.scene == 'NCLT': 251 | ptcld = np.fromfile(scan_path, dtype=np.float32).reshape(-1, 4)[:, :3] 252 | else: 253 | ptcld = np.fromfile(scan_path, dtype=np.float32).reshape(4, -1).transpose()[:, :3] 254 | # just for z-axis up 255 | ptcld[:, 2] = -1 * ptcld[:, 2] 256 | scan = ptcld 257 | 258 | scan = np.ascontiguousarray(scan) 259 | 260 | lbl = self.lbl[idx] 261 | pose = self.poses[idx] # (6,) 262 | rot = self.rots[idx] # [3, 3] 263 | 264 | scan_gt = (rot @ scan.transpose(1, 0)).transpose(1, 0) + pose[:3].reshape(1, 3) 265 | 266 | if self.train & self.augment: 267 | scan = self.augmentor.doAugmentation(scan) # n, 5 268 | 269 | scan_gt_s8 = np.concatenate((scan, scan_gt), axis=1) 270 | 271 | coords, feats = ME.utils.sparse_quantize( 272 | coordinates=scan, 273 | features=scan_gt_s8, 274 | quantization_size=self.voxel_size) 275 | 276 | return (coords, feats, lbl, idx, pose) 277 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/extrinsics/ins.txt: -------------------------------------------------------------------------------- 1 | -1.7132 0.1181 1.1948 -0.0125 0.0400 0.0050 2 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/extrinsics/ldmrs.txt: -------------------------------------------------------------------------------- 1 | 1.5349 0.0090 1.3777 0.0205 0.0767 -0.0299 -------------------------------------------------------------------------------- /datasets/robotcar_sdk/extrinsics/lms_front.txt: -------------------------------------------------------------------------------- 1 | 1.7589 0.2268 1.0411 -0.0437 -1.4572 0.0356 2 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/extrinsics/lms_rear.txt: -------------------------------------------------------------------------------- 1 | -2.5850 0.2852 1.0885 -2.8405 -1.5090 -0.3614 -------------------------------------------------------------------------------- /datasets/robotcar_sdk/extrinsics/mono_left.txt: -------------------------------------------------------------------------------- 1 | -0.0905 1.6375 0.2803 0.2079 -0.2339 1.2321 2 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/extrinsics/mono_rear.txt: -------------------------------------------------------------------------------- 1 | -2.0582 0.0894 0.3675 -0.0119 -0.2498 3.1283 2 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/extrinsics/mono_right.txt: -------------------------------------------------------------------------------- 1 | -0.2587 -1.6810 0.3226 -0.1961 -0.2469 -1.2675 2 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/extrinsics/nlct_velodyne.txt: 
-------------------------------------------------------------------------------- 1 | 0.002 -0.004 -0.957 0.807 0.166 -90.703 -------------------------------------------------------------------------------- /datasets/robotcar_sdk/extrinsics/radar.txt: -------------------------------------------------------------------------------- 1 | -0.71813 0.12 -0.54479 0 0.05 0 2 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/extrinsics/stereo.txt: -------------------------------------------------------------------------------- 1 | 0 0 0 0 0 0 2 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/extrinsics/velodyne_left.txt: -------------------------------------------------------------------------------- 1 | -0.60072 -0.34077 -0.26837 -0.0053948 -0.041998 -3.1337 2 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/extrinsics/velodyne_right.txt: -------------------------------------------------------------------------------- 1 | -0.61153 0.55676 -0.27023 0.0027052 -0.041999 -3.1357 2 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/README.md: -------------------------------------------------------------------------------- 1 | RobotCar Dataset Python Tools 2 | ============================= 3 | 4 | This directory contains sample python code for viewing and manipulating data from the [Oxford Robotcar Dataset](http://robotcar-dataset.robots.ox.ac.uk) and [Oxford Radar Robotcar Dataset](http://ori.ox.ac.uk/datasets/radar-robotcar-dataset). 5 | 6 | Requirements 7 | ------------ 8 | The python tools have been tested on Python 2.7. 9 | Python 3.* compatibility has not been verified. 10 | 11 | The following packages are required: 12 | * numpy 13 | * matplotlib 14 | * colour_demosaicing 15 | * pillow 16 | * opencv-python 17 | * open3d-python 18 | 19 | These can be installed with pip: 20 | 21 | ``` 22 | pip install numpy matplotlib colour_demosaicing pillow opencv-python open3d-python 23 | ``` 24 | 25 | Command Line Tools 26 | ------------------ 27 | 28 | ### Viewing Images 29 | The `play_images.py` script can be used to view images from the dataset. 30 | 31 | ```bash 32 | python play_images.py /path/to/data/yyyy-mm-dd-hh-mm-ss/stereo/centre 33 | ``` 34 | 35 | If you wish to undistort the images before viewing them, pass the camera model directory as a second argument: 36 | 37 | ```bash 38 | python play_images.py /path/to/data/yyyy-mm-dd-hh-mm-ss/stereo/centre --models_dir /path/to/camera/models 39 | ``` 40 | 41 | ### Viewing Radar Scans 42 | The `play_radar.py` script can be used to view radar scans. 43 | 44 | ```bash 45 | python play_radar.py /path/to/data/yyyy-mm-dd-hh-mm-ss/radar 46 | ``` 47 | 48 | ### Viewing Velodyne Scans 49 | The `play_velodyne.py` script can be used to view velodyne scans from raw or binary form. 50 | 51 | ```bash 52 | python play_velodyne.py /path/to/data/yyyy-mm-dd-hh-mm-ss/velodyne_left 53 | ``` 54 | 55 | ### Building Pointclouds 56 | The `build_pointcloud.py` script builds and displays a 3D pointcloud by combining multiple LIDAR scans with a pose source. 57 | The pose source can be either INS data or the supplied visual odometry data. 
For example: 58 | 59 | ```bash 60 | python build_pointcloud.py --laser_dir /path/to/data/yyyy-mm-dd-hh-mm-ss/lms_front --extrinsics_dir ../extrinsics --poses_file /path/to/data/yyyy-mm-dd-hh-mm-ss/vo/vo.csv 61 | ``` 62 | 63 | ### Projecting pointclouds into images 64 | The `project_laser_into_camera.py` script first builds a pointcloud, then projects it into a camera image using a pinhole camera model. 65 | For example: 66 | 67 | ```bash 68 | python project_laser_into_camera.py --image_dir /path/to/data/yyyy-mm-dd-hh-mm-ss/stereo/centre --laser_dir /path/to/data/yyyy-mm-dd-hh-mm-ss/ldmrs --poses_file /path/to/data/yyyy-mm-dd-hh-mm-ss/vo/vo.csv --models_dir /path/to/models --extrinsics_dir ../extrinsics --image_idx 200 69 | ``` 70 | 71 | Usage from Python 72 | ----------------- 73 | The scripts here are also designed to be used in your own scripts. 74 | 75 | * `build_pointcloud.py`: function for building a pointcloud from LIDAR and odometry data 76 | * `camera_model.py`: loads camera models from disk, and provides undistortion of images and projection of pointclouds 77 | * `interpolate_poses.py`: functions for interpolating VO or INS data to obtain pose estimates at arbitrary timestamps 78 | * `transform.py`: functions for converting between various transform representations 79 | * `image.py`: function for loading, Bayer demosaicing and undistorting images 80 | * `velodyne.py`: functions for loading Velodyne scan data and converting a raw scan representation to pointcloud 81 | * `radar.py`: functions for loading radar scan data and converting a polar scan representation to Cartesian 82 | 83 | For examples of how to use these functions, see the command line tools above. 84 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__init__.py: -------------------------------------------------------------------------------- 1 | from . 
import transform -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/__init__.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/__init__.cpython-310.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/__init__.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/__init__.cpython-36.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/__init__.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/__init__.cpython-37.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/__init__.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/__init__.cpython-38.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/__init__.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/__init__.cpython-39.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/interpolate_poses.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/interpolate_poses.cpython-36.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/interpolate_poses.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/interpolate_poses.cpython-37.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/interpolate_poses.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/interpolate_poses.cpython-38.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/transform.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/transform.cpython-310.pyc 
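The "Usage from Python" section of the RobotCar SDK README above points to the command-line tools for examples. As a complement, here is a minimal, hypothetical sketch of decoding a single binary Velodyne scan with the SDK helper, mirroring how `build_pointcloud.py` consumes it; the scan path is a placeholder, and the flat import assumes the script is run from `datasets/robotcar_sdk/python/`.

```python
# Minimal sketch (assumed paths, not part of the SDK): decode one binary Velodyne scan.
# Run from datasets/robotcar_sdk/python/ so the SDK's flat imports resolve.
import numpy as np
from velodyne import load_velodyne_binary

scan_path = '/path/to/data/yyyy-mm-dd-hh-mm-ss/velodyne_left/1547120000000000.bin'  # placeholder timestamp
ptcld = load_velodyne_binary(scan_path)   # 4xN array: x, y, z rows plus intensity
xyz, intensity = ptcld[:3], ptcld[3]
print(xyz.shape, intensity.min(), intensity.max())
```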
-------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/transform.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/transform.cpython-36.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/transform.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/transform.cpython-37.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/transform.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/transform.cpython-38.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/transform.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/transform.cpython-39.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-310.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-36.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-36.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-37.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-37.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-38.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/datasets/robotcar_sdk/python/__pycache__/velodyne.cpython-38.pyc -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/build_pointcloud.py: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # 3 | # Copyright (c) 2017 University of Oxford 4 | # Authors: 5 | # Geoff Pascoe (gmp@robots.ox.ac.uk) 6 | # 7 | # This work is licensed under the Creative Commons 8 | # 
Attribution-NonCommercial-ShareAlike 4.0 International License. 9 | # To view a copy of this license, visit 10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to 11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. 12 | # 13 | ################################################################################ 14 | 15 | import os 16 | import re 17 | import numpy as np 18 | 19 | from transform import build_se3_transform 20 | from interpolate_poses import interpolate_vo_poses, interpolate_ins_poses 21 | from velodyne import load_velodyne_raw, load_velodyne_binary, velodyne_raw_to_pointcloud 22 | 23 | 24 | def build_pointcloud(lidar_dir, poses_file, extrinsics_dir, start_time, end_time, origin_time=-1): 25 | """Builds a pointcloud by combining multiple LIDAR scans with odometry information. 26 | 27 | Args: 28 | lidar_dir (str): Directory containing LIDAR scans. 29 | poses_file (str): Path to a file containing pose information. Can be VO or INS data. 30 | extrinsics_dir (str): Directory containing extrinsic calibrations. 31 | start_time (int): UNIX timestamp of the start of the window over which to build the pointcloud. 32 | end_time (int): UNIX timestamp of the end of the window over which to build the pointcloud. 33 | origin_time (int): UNIX timestamp of origin frame. Pointcloud coordinates are relative to this frame. 34 | 35 | Returns: 36 | numpy.ndarray: 3xn array of (x, y, z) coordinates of pointcloud 37 | numpy.array: array of n reflectance values or None if no reflectance values are recorded (LDMRS) 38 | 39 | Raises: 40 | ValueError: if specified window doesn't contain any laser scans. 41 | IOError: if scan files are not found. 42 | 43 | """ 44 | if origin_time < 0: 45 | origin_time = start_time 46 | 47 | lidar = re.search('(lms_front|lms_rear|ldmrs|velodyne_left|velodyne_right)', lidar_dir).group(0) 48 | timestamps_path = os.path.join(lidar_dir, os.pardir, lidar + '.timestamps') 49 | 50 | timestamps = [] 51 | with open(timestamps_path) as timestamps_file: 52 | for line in timestamps_file: 53 | timestamp = int(line.split(' ')[0]) 54 | if start_time <= timestamp <= end_time: 55 | timestamps.append(timestamp) 56 | 57 | if len(timestamps) == 0: 58 | raise ValueError("No LIDAR data in the given time bracket.") 59 | 60 | with open(os.path.join(extrinsics_dir, lidar + '.txt')) as extrinsics_file: 61 | extrinsics = next(extrinsics_file) 62 | G_posesource_laser = build_se3_transform([float(x) for x in extrinsics.split(' ')]) 63 | 64 | poses_type = re.search('(vo|ins|rtk)\.csv', poses_file).group(1) 65 | 66 | if poses_type in ['ins', 'rtk']: 67 | with open(os.path.join(extrinsics_dir, 'ins.txt')) as extrinsics_file: 68 | extrinsics = next(extrinsics_file) 69 | G_posesource_laser = np.linalg.solve(build_se3_transform([float(x) for x in extrinsics.split(' ')]), 70 | G_posesource_laser) 71 | 72 | poses = interpolate_ins_poses(poses_file, timestamps, origin_time, use_rtk=(poses_type == 'rtk')) 73 | else: 74 | # sensor is VO, which is located at the main vehicle frame 75 | poses = interpolate_vo_poses(poses_file, timestamps, origin_time) 76 | 77 | pointcloud = np.array([[0], [0], [0], [0]]) 78 | if lidar == 'ldmrs': 79 | reflectance = None 80 | else: 81 | reflectance = np.empty((0)) 82 | 83 | for i in range(0, len(poses)): 84 | scan_path = os.path.join(lidar_dir, str(timestamps[i]) + '.bin') 85 | if "velodyne" not in lidar: 86 | if not os.path.isfile(scan_path): 87 | continue 88 | 89 | scan_file = open(scan_path) 90 | scan = np.fromfile(scan_file, np.double) 91 | 
scan_file.close() 92 | 93 | scan = scan.reshape((len(scan) // 3, 3)).transpose() 94 | 95 | if lidar != 'ldmrs': 96 | # LMS scans are tuples of (x, y, reflectance) 97 | reflectance = np.concatenate((reflectance, np.ravel(scan[2, :]))) 98 | scan[2, :] = np.zeros((1, scan.shape[1])) 99 | else: 100 | if os.path.isfile(scan_path): 101 | ptcld = load_velodyne_binary(scan_path) 102 | else: 103 | scan_path = os.path.join(lidar_dir, str(timestamps[i]) + '.png') 104 | if not os.path.isfile(scan_path): 105 | continue 106 | ranges, intensities, angles, approximate_timestamps = load_velodyne_raw(scan_path) 107 | ptcld = velodyne_raw_to_pointcloud(ranges, intensities, angles) 108 | 109 | reflectance = np.concatenate((reflectance, ptcld[3])) 110 | scan = ptcld[:3] 111 | 112 | scan = np.dot(np.dot(poses[i], G_posesource_laser), np.vstack([scan, np.ones((1, scan.shape[1]))])) 113 | pointcloud = np.hstack([pointcloud, scan]) 114 | 115 | pointcloud = pointcloud[:, 1:] 116 | if pointcloud.shape[1] == 0: 117 | raise IOError("Could not find scan files for given time range in directory " + lidar_dir) 118 | 119 | return pointcloud, reflectance 120 | 121 | 122 | if __name__ == "__main__": 123 | import argparse 124 | import open3d 125 | 126 | parser = argparse.ArgumentParser(description='Build and display a pointcloud') 127 | parser.add_argument('--poses_file', type=str, default=None, help='File containing relative or absolute poses') 128 | parser.add_argument('--extrinsics_dir', type=str, default=None, 129 | help='Directory containing extrinsic calibrations') 130 | parser.add_argument('--laser_dir', type=str, default=None, help='Directory containing LIDAR data') 131 | 132 | args = parser.parse_args() 133 | 134 | lidar = re.search('(lms_front|lms_rear|ldmrs|velodyne_left|velodyne_right)', args.laser_dir).group(0) 135 | timestamps_path = os.path.join(args.laser_dir, os.pardir, lidar + '.timestamps') 136 | with open(timestamps_path) as timestamps_file: 137 | start_time = int(next(timestamps_file).split(' ')[0]) 138 | 139 | end_time = start_time + 2e7 140 | 141 | pointcloud, reflectance = build_pointcloud(args.laser_dir, args.poses_file, 142 | args.extrinsics_dir, start_time, end_time) 143 | 144 | if reflectance is not None: 145 | colours = (reflectance - reflectance.min()) / (reflectance.max() - reflectance.min()) 146 | colours = 1 / (1 + np.exp(-10 * (colours - colours.mean()))) 147 | else: 148 | colours = 'gray' 149 | 150 | # Pointcloud Visualisation using Open3D 151 | vis = open3d.Visualizer() 152 | vis.create_window(window_name=os.path.basename(__file__)) 153 | render_option = vis.get_render_option() 154 | render_option.background_color = np.array([0.1529, 0.1569, 0.1333], np.float32) 155 | render_option.point_color_option = open3d.PointColorOption.ZCoordinate 156 | coordinate_frame = open3d.geometry.create_mesh_coordinate_frame() 157 | vis.add_geometry(coordinate_frame) 158 | pcd = open3d.geometry.PointCloud() 159 | pcd.points = open3d.utility.Vector3dVector( 160 | -np.ascontiguousarray(pointcloud[[1, 0, 2]].transpose().astype(np.float64))) 161 | pcd.colors = open3d.utility.Vector3dVector(np.tile(colours[:, np.newaxis], (1, 3)).astype(np.float64)) 162 | # Rotate pointcloud to align displayed coordinate frame colouring 163 | pcd.transform(build_se3_transform([0, 0, 0, np.pi, 0, -np.pi / 2])) 164 | vis.add_geometry(pcd) 165 | view_control = vis.get_view_control() 166 | params = view_control.convert_to_pinhole_camera_parameters() 167 | params.extrinsic = build_se3_transform([0, 3, 10, 0, -np.pi * 0.42, -np.pi / 2]) 
168 | view_control.convert_from_pinhole_camera_parameters(params) 169 | vis.run() 170 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/camera_model.py: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # 3 | # Copyright (c) 2017 University of Oxford 4 | # Authors: 5 | # Geoff Pascoe (gmp@robots.ox.ac.uk) 6 | # 7 | # This work is licensed under the Creative Commons 8 | # Attribution-NonCommercial-ShareAlike 4.0 International License. 9 | # To view a copy of this license, visit 10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to 11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. 12 | # 13 | ################################################################################ 14 | 15 | import re 16 | import os 17 | import numpy as np 18 | import scipy.interpolate as interp 19 | from scipy.ndimage import map_coordinates 20 | 21 | 22 | class CameraModel: 23 | """Provides intrinsic parameters and undistortion LUT for a camera. 24 | 25 | Attributes: 26 | camera (str): Name of the camera. 27 | camera sensor (str): Name of the sensor on the camera for multi-sensor cameras. 28 | focal_length (tuple[float]): Focal length of the camera in horizontal and vertical axis, in pixels. 29 | principal_point (tuple[float]): Principal point of camera for pinhole projection model, in pixels. 30 | G_camera_image (:obj: `numpy.matrixlib.defmatrix.matrix`): Transform from image frame to camera frame. 31 | bilinear_lut (:obj: `numpy.ndarray`): Look-up table for undistortion of images, mapping pixels in an undistorted 32 | image to pixels in the distorted image 33 | 34 | """ 35 | 36 | def __init__(self, models_dir, images_dir): 37 | """Loads a camera model from disk. 38 | 39 | Args: 40 | models_dir (str): directory containing camera model files. 41 | images_dir (str): directory containing images for which to read camera model. 42 | 43 | """ 44 | self.camera = None 45 | self.camera_sensor = None 46 | self.focal_length = None 47 | self.principal_point = None 48 | self.G_camera_image = None 49 | self.bilinear_lut = None 50 | 51 | self.__load_intrinsics(models_dir, images_dir) 52 | self.__load_lut(models_dir, images_dir) 53 | 54 | def project(self, xyz, image_size): 55 | """Projects a pointcloud into the camera using a pinhole camera model. 56 | 57 | Args: 58 | xyz (:obj: `numpy.ndarray`): 3xn array, where each column is (x, y, z) point relative to camera frame. 59 | image_size (tuple[int]): dimensions of image in pixels 60 | 61 | Returns: 62 | numpy.ndarray: 2xm array of points, where each column is the (u, v) pixel coordinates of a point in pixels. 63 | numpy.array: array of depth values for points in image. 64 | 65 | Note: 66 | Number of output points m will be less than or equal to number of input points n, as points that do not 67 | project into the image are discarded. 
68 | 69 | """ 70 | if xyz.shape[0] == 3: 71 | xyz = np.stack((xyz, np.ones((1, xyz.shape[1])))) 72 | xyzw = np.linalg.solve(self.G_camera_image, xyz) 73 | 74 | # Find which points lie in front of the camera 75 | in_front = [i for i in range(0, xyzw.shape[1]) if xyzw[2, i] >= 0] 76 | xyzw = xyzw[:, in_front] 77 | 78 | uv = np.vstack((self.focal_length[0] * xyzw[0, :] / xyzw[2, :] + self.principal_point[0], 79 | self.focal_length[1] * xyzw[1, :] / xyzw[2, :] + self.principal_point[1])) 80 | 81 | in_img = [i for i in range(0, uv.shape[1]) 82 | if 0.5 <= uv[0, i] <= image_size[1] and 0.5 <= uv[1, i] <= image_size[0]] 83 | 84 | return uv[:, in_img], np.ravel(xyzw[2, in_img]) 85 | 86 | def undistort(self, image): 87 | """Undistorts an image. 88 | 89 | Args: 90 | image (:obj: `numpy.ndarray`): A distorted image. Must be demosaiced - ie. must be a 3-channel RGB image. 91 | 92 | Returns: 93 | numpy.ndarray: Undistorted version of image. 94 | 95 | Raises: 96 | ValueError: if image size does not match camera model. 97 | ValueError: if image only has a single channel. 98 | 99 | """ 100 | if image.shape[0] * image.shape[1] != self.bilinear_lut.shape[0]: 101 | raise ValueError('Incorrect image size for camera model') 102 | 103 | lut = self.bilinear_lut[:, 1::-1].T.reshape((2, image.shape[0], image.shape[1])) 104 | 105 | if len(image.shape) == 1: 106 | raise ValueError('Undistortion function only works with multi-channel images') 107 | 108 | undistorted = np.rollaxis(np.array([map_coordinates(image[:, :, channel], lut, order=1) 109 | for channel in range(0, image.shape[2])]), 0, 3) 110 | 111 | return undistorted.astype(image.dtype) 112 | 113 | def __get_model_name(self, images_dir): 114 | self.camera = re.search('(stereo|mono_(left|right|rear))', images_dir).group(0) 115 | if self.camera == 'stereo': 116 | self.camera_sensor = re.search('(left|centre|right)', images_dir).group(0) 117 | if self.camera_sensor == 'left': 118 | return 'stereo_wide_left' 119 | elif self.camera_sensor == 'right': 120 | return 'stereo_wide_right' 121 | elif self.camera_sensor == 'centre': 122 | return 'stereo_narrow_left' 123 | else: 124 | raise RuntimeError('Unknown camera model for given directory: ' + images_dir) 125 | else: 126 | return self.camera 127 | 128 | def __load_intrinsics(self, models_dir, images_dir): 129 | model_name = self.__get_model_name(images_dir) 130 | intrinsics_path = os.path.join(models_dir, model_name + '.txt') 131 | 132 | with open(intrinsics_path) as intrinsics_file: 133 | vals = [float(x) for x in next(intrinsics_file).split()] 134 | self.focal_length = (vals[0], vals[1]) 135 | self.principal_point = (vals[2], vals[3]) 136 | 137 | G_camera_image = [] 138 | for line in intrinsics_file: 139 | G_camera_image.append([float(x) for x in line.split()]) 140 | self.G_camera_image = np.array(G_camera_image) 141 | 142 | def __load_lut(self, models_dir, images_dir): 143 | model_name = self.__get_model_name(images_dir) 144 | lut_path = os.path.join(models_dir, model_name + '_distortion_lut.bin') 145 | 146 | lut = np.fromfile(lut_path, np.double) 147 | lut = lut.reshape([2, lut.size // 2]) 148 | self.bilinear_lut = lut.transpose() 149 | 150 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/image.py: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # 3 | # Copyright (c) 2017 University of Oxford 4 | # Authors: 5 | # Geoff Pascoe 
(gmp@robots.ox.ac.uk) 6 | # 7 | # This work is licensed under the Creative Commons 8 | # Attribution-NonCommercial-ShareAlike 4.0 International License. 9 | # To view a copy of this license, visit 10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to 11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. 12 | # 13 | ############################################################################### 14 | 15 | import re 16 | from PIL import Image 17 | from colour_demosaicing import demosaicing_CFA_Bayer_bilinear as demosaic 18 | import numpy as np 19 | 20 | BAYER_STEREO = 'gbrg' 21 | BAYER_MONO = 'rggb' 22 | 23 | 24 | def load_image(image_path, model=None): 25 | """Loads and rectifies an image from file. 26 | 27 | Args: 28 | image_path (str): path to an image from the dataset. 29 | model (camera_model.CameraModel): if supplied, model will be used to undistort image. 30 | 31 | Returns: 32 | numpy.ndarray: demosaiced and optionally undistorted image 33 | 34 | """ 35 | if model: 36 | camera = model.camera 37 | else: 38 | camera = re.search('(stereo|mono_(left|right|rear))', image_path).group(0) 39 | if camera == 'stereo': 40 | pattern = BAYER_STEREO 41 | else: 42 | pattern = BAYER_MONO 43 | 44 | img = Image.open(image_path) 45 | img = demosaic(img, pattern) 46 | if model: 47 | img = model.undistort(img) 48 | 49 | return np.array(img).astype(np.uint8) 50 | 51 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/interpolate_poses.py: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # 3 | # Copyright (c) 2017 University of Oxford 4 | # Authors: 5 | # Geoff Pascoe (gmp@robots.ox.ac.uk) 6 | # 7 | # This work is licensed under the Creative Commons 8 | # Attribution-NonCommercial-ShareAlike 4.0 International License. 9 | # To view a copy of this license, visit 10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to 11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. 12 | # 13 | ################################################################################ 14 | 15 | import bisect 16 | import csv 17 | import numpy as np 18 | import numpy.matlib as ml 19 | import sys 20 | sys.path.append('.') 21 | from .transform import * 22 | 23 | def NCLT_interpolate_vo_poses(vo_path, pose_timestamps, origin_timestamp): 24 | """Interpolate poses from visual odometry. 25 | 26 | Args: 27 | vo_path (str): path to file containing relative poses from visual odometry. 28 | pose_timestamps (list[int]): UNIX timestamps at which interpolated poses are required. 29 | origin_timestamp (int): UNIX timestamp of origin frame. Poses will be reported relative to this frame. 30 | 31 | Returns: 32 | list[numpy.matrixlib.defmatrix.matrix]: SE3 matrix representing interpolated pose for each requested timestamp. 
33 | 34 | """ 35 | with open(vo_path) as vo_file: 36 | vo_reader = csv.reader(vo_file) 37 | headers = next(vo_file) 38 | 39 | vo_timestamps = [0] 40 | abs_poses = [ml.identity(4)] 41 | 42 | lower_timestamp = min(min(pose_timestamps), origin_timestamp) 43 | upper_timestamp = max(max(pose_timestamps), origin_timestamp) 44 | 45 | for row in vo_reader: 46 | timestamp = int(row[0]) 47 | if timestamp < lower_timestamp: 48 | vo_timestamps[0] = timestamp 49 | continue 50 | 51 | vo_timestamps.append(timestamp) 52 | 53 | xyzrpy = [float(v) for v in row[1:7]] 54 | rel_pose = build_se3_transform(xyzrpy) 55 | abs_pose = abs_poses[-1] * rel_pose 56 | abs_poses.append(abs_pose) 57 | 58 | if timestamp >= upper_timestamp: 59 | break 60 | 61 | return interpolate_poses(vo_timestamps, abs_poses, pose_timestamps, origin_timestamp) 62 | 63 | def interpolate_vo_poses(vo_path, pose_timestamps, origin_timestamp): 64 | """Interpolate poses from visual odometry. 65 | 66 | Args: 67 | vo_path (str): path to file containing relative poses from visual odometry. 68 | pose_timestamps (list[int]): UNIX timestamps at which interpolated poses are required. 69 | origin_timestamp (int): UNIX timestamp of origin frame. Poses will be reported relative to this frame. 70 | 71 | Returns: 72 | list[numpy.matrixlib.defmatrix.matrix]: SE3 matrix representing interpolated pose for each requested timestamp. 73 | 74 | """ 75 | with open(vo_path) as vo_file: 76 | vo_reader = csv.reader(vo_file) 77 | headers = next(vo_file) 78 | 79 | vo_timestamps = [0] 80 | abs_poses = [ml.identity(4)] 81 | 82 | lower_timestamp = min(min(pose_timestamps), origin_timestamp) 83 | upper_timestamp = max(max(pose_timestamps), origin_timestamp) 84 | 85 | for row in vo_reader: 86 | timestamp = int(row[0]) 87 | if timestamp < lower_timestamp: 88 | vo_timestamps[0] = timestamp 89 | continue 90 | 91 | vo_timestamps.append(timestamp) 92 | 93 | xyzrpy = [float(v) for v in row[2:8]] 94 | rel_pose = build_se3_transform(xyzrpy) 95 | abs_pose = abs_poses[-1] * rel_pose 96 | abs_poses.append(abs_pose) 97 | 98 | if timestamp >= upper_timestamp: 99 | break 100 | 101 | return interpolate_poses(vo_timestamps, abs_poses, pose_timestamps, origin_timestamp) 102 | 103 | 104 | def interpolate_ins_poses(ins_path, pose_timestamps, origin_timestamp, use_rtk=False): 105 | """Interpolate poses from INS. 106 | 107 | Args: 108 | ins_path (str): path to file containing poses from INS. 109 | pose_timestamps (list[int]): UNIX timestamps at which interpolated poses are required. 110 | origin_timestamp (int): UNIX timestamp of origin frame. Poses will be reported relative to this frame. 111 | 112 | Returns: 113 | list[numpy.matrixlib.defmatrix.matrix]: SE3 matrix representing interpolated pose for each requested timestamp. 
114 | 115 | """ 116 | with open(ins_path) as ins_file: 117 | ins_reader = csv.reader(ins_file) 118 | headers = next(ins_file) 119 | 120 | ins_timestamps = [0] 121 | abs_poses = [ml.identity(4)] 122 | 123 | upper_timestamp = max(max(pose_timestamps), origin_timestamp) 124 | 125 | for row in ins_reader: 126 | timestamp = int(row[0]) 127 | ins_timestamps.append(timestamp) 128 | 129 | utm = row[5:8] if not use_rtk else row[4:7] 130 | rpy = row[-3:] if not use_rtk else row[11:14] 131 | xyzrpy = [float(v) for v in utm] + [float(v) for v in rpy] 132 | abs_pose = build_se3_transform(xyzrpy) 133 | abs_poses.append(abs_pose) 134 | 135 | if timestamp >= upper_timestamp: 136 | break 137 | 138 | ins_timestamps = ins_timestamps[1:] 139 | abs_poses = abs_poses[1:] 140 | 141 | return interpolate_poses(ins_timestamps, abs_poses, pose_timestamps, origin_timestamp) 142 | 143 | 144 | def interpolate_ins_poses_lw(ins_pose, pose_timestamps, origin_timestamp, use_rtk=False): 145 | """Interpolate poses from INS. 146 | 147 | Args: 148 | ins_path (str): path to file containing poses from INS. 149 | pose_timestamps (list[int]): UNIX timestamps at which interpolated poses are required. 150 | origin_timestamp (int): UNIX timestamp of origin frame. Poses will be reported relative to this frame. 151 | 152 | Returns: 153 | list[numpy.matrixlib.defmatrix.matrix]: SE3 matrix representing interpolated pose for each requested timestamp. 154 | 155 | """ 156 | 157 | ins_timestamps = [0] 158 | abs_poses = [ml.identity(4)] 159 | 160 | upper_timestamp = max(max(pose_timestamps), origin_timestamp) 161 | abs_pose = np.eye(4) 162 | for row in ins_pose: 163 | timestamp = int(row[0]) 164 | ins_timestamps.append(timestamp) 165 | 166 | # utm = row[5:8] if not use_rtk else row[4:7] 167 | # rpy = row[-3:] if not use_rtk else row[11:14] 168 | # xyzrpy = [float(v) for v in utm] + [float(v) for v in rpy] 169 | # abs_pose = build_se3_transform(xyzrpy) 170 | abs_pose[:3, :3] = row[1:10].reshape(3, 3) 171 | abs_pose[:3, 3] = row[10:] 172 | abs_poses.append(abs_pose) 173 | 174 | if timestamp >= upper_timestamp: 175 | break 176 | 177 | ins_timestamps = ins_timestamps[1:] 178 | abs_poses = abs_poses[1:] 179 | 180 | return interpolate_poses(ins_timestamps, abs_poses, pose_timestamps, origin_timestamp) 181 | 182 | 183 | def interpolate_poses(pose_timestamps, abs_poses, requested_timestamps, origin_timestamp): 184 | """Interpolate between absolute poses. 185 | 186 | Args: 187 | pose_timestamps (list[int]): Timestamps of supplied poses. Must be in ascending order. 188 | abs_poses (list[numpy.matrixlib.defmatrix.matrix]): SE3 matrices representing poses at the timestamps specified. 189 | requested_timestamps (list[int]): Timestamps for which interpolated timestamps are required. 190 | origin_timestamp (int): UNIX timestamp of origin frame. Poses will be reported relative to this frame. 191 | 192 | Returns: 193 | list[numpy.matrixlib.defmatrix.matrix]: SE3 matrix representing interpolated pose for each requested timestamp. 
194 | 195 | Raises: 196 | ValueError: if pose_timestamps and abs_poses are not the same length 197 | ValueError: if pose_timestamps is not in ascending order 198 | 199 | """ 200 | requested_timestamps.insert(0, origin_timestamp) 201 | requested_timestamps = np.array(requested_timestamps) 202 | pose_timestamps = np.array(pose_timestamps) 203 | 204 | if len(pose_timestamps) != len(abs_poses): 205 | raise ValueError('Must supply same number of timestamps as poses') 206 | 207 | abs_quaternions = np.zeros((4, len(abs_poses))) 208 | abs_positions = np.zeros((3, len(abs_poses))) 209 | for i, pose in enumerate(abs_poses): 210 | if i > 0 and pose_timestamps[i-1] >= pose_timestamps[i]: 211 | raise ValueError('Pose timestamps must be in ascending order') 212 | 213 | abs_quaternions[:, i] = so3_to_quaternion(pose[0:3, 0:3]) 214 | abs_positions[:, i] = np.ravel(pose[0:3, 3]) 215 | 216 | upper_indices = [bisect.bisect(pose_timestamps, pt) for pt in requested_timestamps] 217 | lower_indices = [u - 1 for u in upper_indices] 218 | 219 | if max(upper_indices) >= len(pose_timestamps): 220 | upper_indices = [min(i, len(pose_timestamps) - 1) for i in upper_indices] 221 | 222 | fractions = (requested_timestamps - pose_timestamps[lower_indices]) / \ 223 | (pose_timestamps[upper_indices] - pose_timestamps[lower_indices]) 224 | # import warnings 225 | # warnings.filterwarnings('error') 226 | # try: 227 | # fractions = (requested_timestamps - pose_timestamps[lower_indices]) / \ 228 | # (pose_timestamps[upper_indices] - pose_timestamps[lower_indices]) 229 | # except Warning: 230 | # print('Warning was raised as an exception!') 231 | # print(pose_timestamps[upper_indices]) 232 | # print(pose_timestamps[lower_indices]) 233 | 234 | quaternions_lower = abs_quaternions[:, lower_indices] 235 | quaternions_upper = abs_quaternions[:, upper_indices] 236 | 237 | d_array = (quaternions_lower * quaternions_upper).sum(0) 238 | 239 | linear_interp_indices = np.nonzero(d_array >= 1) 240 | sin_interp_indices = np.nonzero(d_array < 1) 241 | 242 | scale0_array = np.zeros(d_array.shape) 243 | scale1_array = np.zeros(d_array.shape) 244 | 245 | scale0_array[linear_interp_indices] = 1 - fractions[linear_interp_indices] 246 | scale1_array[linear_interp_indices] = fractions[linear_interp_indices] 247 | 248 | theta_array = np.arccos(np.abs(d_array[sin_interp_indices])) 249 | 250 | scale0_array[sin_interp_indices] = \ 251 | np.sin((1 - fractions[sin_interp_indices]) * theta_array) / np.sin(theta_array) 252 | scale1_array[sin_interp_indices] = \ 253 | np.sin(fractions[sin_interp_indices] * theta_array) / np.sin(theta_array) 254 | 255 | negative_d_indices = np.nonzero(d_array < 0) 256 | scale1_array[negative_d_indices] = -scale1_array[negative_d_indices] 257 | 258 | quaternions_interp = np.tile(scale0_array, (4, 1)) * quaternions_lower \ 259 | + np.tile(scale1_array, (4, 1)) * quaternions_upper 260 | 261 | positions_lower = abs_positions[:, lower_indices] 262 | positions_upper = abs_positions[:, upper_indices] 263 | 264 | positions_interp = np.multiply(np.tile((1 - fractions), (3, 1)), positions_lower) \ 265 | + np.multiply(np.tile(fractions, (3, 1)), positions_upper) 266 | 267 | poses_mat = ml.zeros((4, 4 * len(requested_timestamps))) 268 | 269 | poses_mat[0, 0::4] = 1 - 2 * np.square(quaternions_interp[2, :]) - \ 270 | 2 * np.square(quaternions_interp[3, :]) 271 | poses_mat[0, 1::4] = 2 * np.multiply(quaternions_interp[1, :], quaternions_interp[2, :]) - \ 272 | 2 * np.multiply(quaternions_interp[3, :], quaternions_interp[0, :]) 273 | 
poses_mat[0, 2::4] = 2 * np.multiply(quaternions_interp[1, :], quaternions_interp[3, :]) + \ 274 | 2 * np.multiply(quaternions_interp[2, :], quaternions_interp[0, :]) 275 | 276 | poses_mat[1, 0::4] = 2 * np.multiply(quaternions_interp[1, :], quaternions_interp[2, :]) \ 277 | + 2 * np.multiply(quaternions_interp[3, :], quaternions_interp[0, :]) 278 | poses_mat[1, 1::4] = 1 - 2 * np.square(quaternions_interp[1, :]) \ 279 | - 2 * np.square(quaternions_interp[3, :]) 280 | poses_mat[1, 2::4] = 2 * np.multiply(quaternions_interp[2, :], quaternions_interp[3, :]) - \ 281 | 2 * np.multiply(quaternions_interp[1, :], quaternions_interp[0, :]) 282 | 283 | poses_mat[2, 0::4] = 2 * np.multiply(quaternions_interp[1, :], quaternions_interp[3, :]) - \ 284 | 2 * np.multiply(quaternions_interp[2, :], quaternions_interp[0, :]) 285 | poses_mat[2, 1::4] = 2 * np.multiply(quaternions_interp[2, :], quaternions_interp[3, :]) + \ 286 | 2 * np.multiply(quaternions_interp[1, :], quaternions_interp[0, :]) 287 | poses_mat[2, 2::4] = 1 - 2 * np.square(quaternions_interp[1, :]) - \ 288 | 2 * np.square(quaternions_interp[2, :]) 289 | 290 | poses_mat[0:3, 3::4] = positions_interp 291 | poses_mat[3, 3::4] = 1 292 | 293 | # don't use relative pose 294 | # poses_mat = np.linalg.solve(poses_mat[0:4, 0:4], poses_mat) 295 | 296 | poses_out = [0] * (len(requested_timestamps) - 1) 297 | for i in range(1, len(requested_timestamps)): 298 | poses_out[i - 1] = poses_mat[0:4, i * 4:(i + 1) * 4] 299 | 300 | return poses_out 301 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/play_images.py: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # 3 | # Copyright (c) 2017 University of Oxford 4 | # Authors: 5 | # Geoff Pascoe (gmp@robots.ox.ac.uk) 6 | # 7 | # This work is licensed under the Creative Commons 8 | # Attribution-NonCommercial-ShareAlike 4.0 International License. 9 | # To view a copy of this license, visit 10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to 11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. 12 | # 13 | ################################################################################ 14 | 15 | import argparse 16 | import os 17 | import re 18 | import matplotlib.pyplot as plt 19 | from datetime import datetime as dt 20 | from image import load_image 21 | from camera_model import CameraModel 22 | 23 | parser = argparse.ArgumentParser(description='Play back images from a given directory') 24 | 25 | parser.add_argument('dir', type=str, help='Directory containing images.') 26 | parser.add_argument('--models_dir', type=str, default=None, help='(optional) Directory containing camera model. 
If supplied, images will be undistorted before display') 27 | parser.add_argument('--scale', type=float, default=1.0, help='(optional) factor by which to scale images before display') 28 | 29 | args = parser.parse_args() 30 | 31 | camera = re.search('(stereo|mono_(left|right|rear))', args.dir).group(0) 32 | 33 | timestamps_path = os.path.join(os.path.join(args.dir, os.pardir, camera + '.timestamps')) 34 | if not os.path.isfile(timestamps_path): 35 | timestamps_path = os.path.join(args.dir, os.pardir, os.pardir, camera + '.timestamps') 36 | if not os.path.isfile(timestamps_path): 37 | raise IOError("Could not find timestamps file") 38 | 39 | model = None 40 | if args.models_dir: 41 | model = CameraModel(args.models_dir, args.dir) 42 | 43 | current_chunk = 0 44 | timestamps_file = open(timestamps_path) 45 | for line in timestamps_file: 46 | tokens = line.split() 47 | datetime = dt.utcfromtimestamp(int(tokens[0])/1000000) 48 | chunk = int(tokens[1]) 49 | 50 | filename = os.path.join(args.dir, tokens[0] + '.png') 51 | if not os.path.isfile(filename): 52 | if chunk != current_chunk: 53 | print("Chunk " + str(chunk) + " not found") 54 | current_chunk = chunk 55 | continue 56 | 57 | current_chunk = chunk 58 | 59 | img = load_image(filename, model) 60 | plt.imshow(img) 61 | plt.xlabel(datetime) 62 | plt.xticks([]) 63 | plt.yticks([]) 64 | plt.pause(0.01) 65 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/play_radar.py: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # 3 | # Copyright (c) 2017 University of Oxford 4 | # Authors: 5 | # Dan Barnes (dbarnes@robots.ox.ac.uk) 6 | # 7 | # This work is licensed under the Creative Commons 8 | # Attribution-NonCommercial-ShareAlike 4.0 International License. 9 | # To view a copy of this license, visit 10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to 11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. 
12 | # 13 | ################################################################################ 14 | 15 | import argparse 16 | import os 17 | from radar import load_radar, radar_polar_to_cartesian 18 | import numpy as np 19 | import cv2 20 | 21 | parser = argparse.ArgumentParser(description='Play back radar data from a given directory') 22 | 23 | parser.add_argument('dir', type=str, help='Directory containing radar data.') 24 | 25 | args = parser.parse_args() 26 | 27 | timestamps_path = os.path.join(os.path.join(args.dir, os.pardir, 'radar.timestamps')) 28 | if not os.path.isfile(timestamps_path): 29 | raise IOError("Could not find timestamps file") 30 | 31 | # Cartesian Visualsation Setup 32 | # Resolution of the cartesian form of the radar scan in metres per pixel 33 | cart_resolution = .25 34 | # Cartesian visualisation size (used for both height and width) 35 | cart_pixel_width = 501 # pixels 36 | interpolate_crossover = True 37 | 38 | title = "Radar Visualisation Example" 39 | 40 | radar_timestamps = np.loadtxt(timestamps_path, delimiter=' ', usecols=[0], dtype=np.int64) 41 | for radar_timestamp in radar_timestamps: 42 | filename = os.path.join(args.dir, str(radar_timestamp) + '.png') 43 | 44 | if not os.path.isfile(filename): 45 | raise FileNotFoundError("Could not find radar example: {}".format(filename)) 46 | 47 | timestamps, azimuths, valid, fft_data, radar_resolution = load_radar(filename) 48 | cart_img = radar_polar_to_cartesian(azimuths, fft_data, radar_resolution, cart_resolution, cart_pixel_width, 49 | interpolate_crossover) 50 | 51 | # Combine polar and cartesian for visualisation 52 | # The raw polar data is resized to the height of the cartesian representation 53 | downsample_rate = 4 54 | fft_data_vis = fft_data[:, ::downsample_rate] 55 | resize_factor = float(cart_img.shape[0]) / float(fft_data_vis.shape[0]) 56 | fft_data_vis = cv2.resize(fft_data_vis, (0, 0), None, resize_factor, resize_factor) 57 | vis = cv2.hconcat((fft_data_vis, fft_data_vis[:, :10] * 0 + 1, cart_img)) 58 | 59 | cv2.imshow(title, vis * 2.) # The data is doubled to improve visualisation 60 | cv2.waitKey(1) 61 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/play_velodyne.py: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # 3 | # Copyright (c) 2017 University of Oxford 4 | # Authors: 5 | # Dan Barnes (dbarnes@robots.ox.ac.uk) 6 | # 7 | # This work is licensed under the Creative Commons 8 | # Attribution-NonCommercial-ShareAlike 4.0 International License. 9 | # To view a copy of this license, visit 10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to 11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. 
12 | # 13 | ################################################################################ 14 | 15 | 16 | import argparse 17 | from argparse import RawTextHelpFormatter 18 | import os 19 | from velodyne import load_velodyne_raw, load_velodyne_binary, velodyne_raw_to_pointcloud 20 | import numpy as np 21 | import cv2 22 | from matplotlib.cm import get_cmap 23 | from scipy import interpolate 24 | import open3d 25 | from transform import build_se3_transform 26 | 27 | mode_flag_help = """Mode to run in, one of: (raw|raw_interp|raw_ptcld|bin_ptcld) 28 | - 'raw' - visualise the raw velodyne data (intensities and ranges) 29 | - 'raw_interp' - visualise the raw velodyne data (intensities and ranges) interpolated to consistent azimuth angles 30 | between scans. 31 | - 'raw_ptcld' - visualise the raw velodyne data converted to a pointcloud (converts files of the form 32 | .png to pointcloud) 33 | - 'bin_ptcld' - visualise the precomputed velodyne pointclouds (files of the form .bin). This is 34 | approximately 2x faster than running the conversion from raw data `raw_ptcld` at the cost of 35 | approximately 8x the storage space. 36 | """ 37 | 38 | parser = argparse.ArgumentParser(description='Play back velodyne data from a given directory', 39 | formatter_class=RawTextHelpFormatter) 40 | parser.add_argument('--mode', default="raw_interp", type=str, help=mode_flag_help) 41 | parser.add_argument('--scale', default=2., type=float, help="Scale visualisations by this amount") 42 | parser.add_argument('dir', type=str, help='Directory containing velodyne data.') 43 | 44 | args = parser.parse_args() 45 | 46 | 47 | def main(): 48 | velodyne_dir = args.dir 49 | if velodyne_dir[-1] == '/': 50 | velodyne_dir = velodyne_dir[:-1] 51 | 52 | velodyne_sensor = os.path.basename(velodyne_dir) 53 | if velodyne_sensor not in ["velodyne_left", "velodyne_right"]: 54 | raise ValueError("Velodyne directory not valid: {}".format(velodyne_dir)) 55 | 56 | timestamps_path = velodyne_dir + '.timestamps' 57 | if not os.path.isfile(timestamps_path): 58 | raise IOError("Could not find timestamps file: {}".format(timestamps_path)) 59 | 60 | title = "Velodyne Visualisation Example" 61 | extension = ".bin" if args.mode == "bin_ptcld" else ".png" 62 | velodyne_timestamps = np.loadtxt(timestamps_path, delimiter=' ', usecols=[0], dtype=np.int64) 63 | colourmap = (get_cmap("viridis")(np.linspace(0, 1, 255))[:, :3] * 255).astype(np.uint8)[:, ::-1] 64 | interp_angles = np.mod(np.linspace(np.pi, 3 * np.pi, 720), 2 * np.pi) 65 | vis = None 66 | 67 | for velodyne_timestamp in velodyne_timestamps: 68 | 69 | filename = os.path.join(args.dir, str(velodyne_timestamp) + extension) 70 | 71 | if args.mode == "bin_ptcld": 72 | ptcld = load_velodyne_binary(filename) 73 | else: 74 | ranges, intensities, angles, approximate_timestamps = load_velodyne_raw(filename) 75 | 76 | if args.mode == "raw_ptcld": 77 | ptcld = velodyne_raw_to_pointcloud(ranges, intensities, angles) 78 | elif args.mode == "raw_interp": 79 | intensities = interpolate.interp1d(angles[0], intensities, bounds_error=False)(interp_angles) 80 | ranges = interpolate.interp1d(angles[0], ranges, bounds_error=False)(interp_angles) 81 | intensities[np.isnan(intensities)] = 0 82 | ranges[np.isnan(ranges)] = 0 83 | 84 | if '_ptcld' in args.mode: 85 | # Pointcloud Visualisation using Open3D 86 | if vis is None: 87 | vis = open3d.Visualizer() 88 | vis.create_window(window_name=title) 89 | pcd = open3d.geometry.PointCloud() 90 | # initialise the geometry pre loop 91 | pcd.points = 
open3d.utility.Vector3dVector(ptcld[:3].transpose().astype(np.float64)) 92 | pcd.colors = open3d.utility.Vector3dVector(np.tile(ptcld[3:].transpose(), (1, 3)).astype(np.float64)) 93 | # Rotate pointcloud to align displayed coordinate frame colouring 94 | pcd.transform(build_se3_transform([0, 0, 0, np.pi, 0, -np.pi / 2])) 95 | vis.add_geometry(pcd) 96 | render_option = vis.get_render_option() 97 | render_option.background_color = np.array([0.1529, 0.1569, 0.1333], np.float32) 98 | render_option.point_color_option = open3d.PointColorOption.ZCoordinate 99 | coordinate_frame = open3d.geometry.create_mesh_coordinate_frame() 100 | vis.add_geometry(coordinate_frame) 101 | view_control = vis.get_view_control() 102 | params = view_control.convert_to_pinhole_camera_parameters() 103 | params.extrinsic = build_se3_transform([0, 3, 10, 0, -np.pi * 0.42, -np.pi / 2]) 104 | view_control.convert_from_pinhole_camera_parameters(params) 105 | 106 | pcd.points = open3d.utility.Vector3dVector(ptcld[:3].transpose().astype(np.float64)) 107 | pcd.colors = open3d.utility.Vector3dVector( 108 | np.tile(ptcld[3:].transpose(), (1, 3)).astype(np.float64) / 40) 109 | vis.update_geometry() 110 | vis.poll_events() 111 | vis.update_renderer() 112 | 113 | else: 114 | # Ranges and Intensities visualisation using OpenCV 115 | intensities_vis = colourmap[np.clip((intensities * 4).astype(np.int), 0, colourmap.shape[0] - 1)] 116 | ranges_vis = colourmap[np.clip((ranges * 4).astype(np.int), 0, colourmap.shape[0] - 1)] 117 | visualisation = np.concatenate((intensities_vis, ranges_vis), 0) 118 | visualisation = cv2.resize(visualisation, None, fy=6 * args.scale, fx=args.scale, 119 | interpolation=cv2.INTER_NEAREST) 120 | cv2.imshow(title, visualisation) 121 | cv2.waitKey(1) 122 | 123 | 124 | if __name__ == "__main__": 125 | main() 126 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/project_laser_into_camera.py: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # 3 | # Copyright (c) 2017 University of Oxford 4 | # Authors: 5 | # Geoff Pascoe (gmp@robots.ox.ac.uk) 6 | # 7 | # This work is licensed under the Creative Commons 8 | # Attribution-NonCommercial-ShareAlike 4.0 International License. 9 | # To view a copy of this license, visit 10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to 11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. 
12 | # 13 | ################################################################################ 14 | 15 | import os 16 | import re 17 | import numpy as np 18 | import matplotlib.pyplot as plt 19 | import argparse 20 | 21 | from build_pointcloud import build_pointcloud 22 | from transform import build_se3_transform 23 | from image import load_image 24 | from camera_model import CameraModel 25 | 26 | parser = argparse.ArgumentParser(description='Project LIDAR data into camera image') 27 | parser.add_argument('--image_dir', type=str, help='Directory containing images') 28 | parser.add_argument('--laser_dir', type=str, help='Directory containing LIDAR scans') 29 | parser.add_argument('--poses_file', type=str, help='File containing either INS or VO poses') 30 | parser.add_argument('--models_dir', type=str, help='Directory containing camera models') 31 | parser.add_argument('--extrinsics_dir', type=str, help='Directory containing sensor extrinsics') 32 | parser.add_argument('--image_idx', type=int, help='Index of image to display') 33 | 34 | args = parser.parse_args() 35 | 36 | model = CameraModel(args.models_dir, args.image_dir) 37 | 38 | extrinsics_path = os.path.join(args.extrinsics_dir, model.camera + '.txt') 39 | with open(extrinsics_path) as extrinsics_file: 40 | extrinsics = [float(x) for x in next(extrinsics_file).split(' ')] 41 | 42 | G_camera_vehicle = build_se3_transform(extrinsics) 43 | G_camera_posesource = None 44 | 45 | poses_type = re.search('(vo|ins|rtk)\.csv', args.poses_file).group(1) 46 | if poses_type in ['ins', 'rtk']: 47 | with open(os.path.join(args.extrinsics_dir, 'ins.txt')) as extrinsics_file: 48 | extrinsics = next(extrinsics_file) 49 | G_camera_posesource = G_camera_vehicle * build_se3_transform([float(x) for x in extrinsics.split(' ')]) 50 | else: 51 | # VO frame and vehicle frame are the same 52 | G_camera_posesource = G_camera_vehicle 53 | 54 | 55 | timestamps_path = os.path.join(args.image_dir, os.pardir, model.camera + '.timestamps') 56 | if not os.path.isfile(timestamps_path): 57 | timestamps_path = os.path.join(args.image_dir, os.pardir, os.pardir, model.camera + '.timestamps') 58 | 59 | timestamp = 0 60 | with open(timestamps_path) as timestamps_file: 61 | for i, line in enumerate(timestamps_file): 62 | if i == args.image_idx: 63 | timestamp = int(line.split(' ')[0]) 64 | 65 | pointcloud, reflectance = build_pointcloud(args.laser_dir, args.poses_file, args.extrinsics_dir, 66 | timestamp - 1e7, timestamp + 1e7, timestamp) 67 | 68 | pointcloud = np.dot(G_camera_posesource, pointcloud) 69 | 70 | image_path = os.path.join(args.image_dir, str(timestamp) + '.png') 71 | image = load_image(image_path, model) 72 | 73 | uv, depth = model.project(pointcloud, image.shape) 74 | 75 | plt.imshow(image) 76 | plt.scatter(np.ravel(uv[0, :]), np.ravel(uv[1, :]), s=2, c=depth, edgecolors='none', cmap='jet') 77 | plt.xlim(0, image.shape[1]) 78 | plt.ylim(image.shape[0], 0) 79 | plt.xticks([]) 80 | plt.yticks([]) 81 | plt.show() 82 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/radar.py: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # 3 | # Copyright (c) 2017 University of Oxford 4 | # Authors: 5 | # Dan Barnes (dbarnes@robots.ox.ac.uk) 6 | # 7 | # This work is licensed under the Creative Commons 8 | # Attribution-NonCommercial-ShareAlike 4.0 International License. 
9 | # To view a copy of this license, visit 10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to 11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. 12 | # 13 | ############################################################################### 14 | 15 | from typing import AnyStr, Tuple 16 | import numpy as np 17 | import cv2 18 | 19 | 20 | def load_radar(example_path: AnyStr) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray, float]: 21 | """Decode a single Oxford Radar RobotCar Dataset radar example 22 | Args: 23 | example_path (AnyStr): Oxford Radar RobotCar Dataset Example png 24 | Returns: 25 | timestamps (np.ndarray): Timestamp for each azimuth in int64 (UNIX time) 26 | azimuths (np.ndarray): Rotation for each polar radar azimuth (radians) 27 | valid (np.ndarray) Mask of whether azimuth data is an original sensor reading or interpolated from adjacent 28 | azimuths 29 | fft_data (np.ndarray): Radar power readings along each azimuth 30 | radar_resolution (float): Resolution of the polar radar data (metres per pixel) 31 | """ 32 | # Hard coded configuration to simplify parsing code 33 | radar_resolution = np.array([0.0432], np.float32) 34 | encoder_size = 5600 35 | 36 | raw_example_data = cv2.imread(example_path, cv2.IMREAD_GRAYSCALE) 37 | timestamps = raw_example_data[:, :8].copy().view(np.int64) 38 | azimuths = (raw_example_data[:, 8:10].copy().view(np.uint16) / float(encoder_size) * 2 * np.pi).astype(np.float32) 39 | valid = raw_example_data[:, 10:11] == 255 40 | fft_data = raw_example_data[:, 11:].astype(np.float32)[:, :, np.newaxis] / 255. 41 | 42 | return timestamps, azimuths, valid, fft_data, radar_resolution 43 | 44 | 45 | def radar_polar_to_cartesian(azimuths: np.ndarray, fft_data: np.ndarray, radar_resolution: float, 46 | cart_resolution: float, cart_pixel_width: int, interpolate_crossover=True) -> np.ndarray: 47 | """Convert a polar radar scan to cartesian. 48 | Args: 49 | azimuths (np.ndarray): Rotation for each polar radar azimuth (radians) 50 | fft_data (np.ndarray): Polar radar power readings 51 | radar_resolution (float): Resolution of the polar radar data (metres per pixel) 52 | cart_resolution (float): Cartesian resolution (metres per pixel) 53 | cart_pixel_size (int): Width and height of the returned square cartesian output (pixels). Please see the Notes 54 | below for a full explanation of how this is used. 55 | interpolate_crossover (bool, optional): If true interpolates between the end and start azimuth of the scan. In 56 | practice a scan before / after should be used but this prevents nan regions in the return cartesian form. 
57 | 58 | Returns: 59 | np.ndarray: Cartesian radar power readings 60 | Notes: 61 | After using the warping grid the output radar cartesian is defined as as follows where 62 | X and Y are the `real` world locations of the pixels in metres: 63 | If 'cart_pixel_width' is odd: 64 | +------ Y = -1 * cart_resolution (m) 65 | |+----- Y = 0 (m) at centre pixel 66 | ||+---- Y = 1 * cart_resolution (m) 67 | |||+--- Y = 2 * cart_resolution (m) 68 | |||| +- Y = cart_pixel_width // 2 * cart_resolution (m) (at last pixel) 69 | |||| +-----------+ 70 | vvvv v 71 | +---------------+---------------+ 72 | | | | 73 | | | | 74 | | | | 75 | | | | 76 | | | | 77 | | | | 78 | | | | 79 | +---------------+---------------+ <-- X = 0 (m) at centre pixel 80 | | | | 81 | | | | 82 | | | | 83 | | | | 84 | | | | 85 | | | | 86 | | | | 87 | +---------------+---------------+ 88 | <-------------------------------> 89 | cart_pixel_width (pixels) 90 | If 'cart_pixel_width' is even: 91 | +------ Y = -0.5 * cart_resolution (m) 92 | |+----- Y = 0.5 * cart_resolution (m) 93 | ||+---- Y = 1.5 * cart_resolution (m) 94 | |||+--- Y = 2.5 * cart_resolution (m) 95 | |||| +- Y = (cart_pixel_width / 2 - 0.5) * cart_resolution (m) (at last pixel) 96 | |||| +----------+ 97 | vvvv v 98 | +------------------------------+ 99 | | | 100 | | | 101 | | | 102 | | | 103 | | | 104 | | | 105 | | | 106 | | | 107 | | | 108 | | | 109 | | | 110 | | | 111 | | | 112 | | | 113 | | | 114 | +------------------------------+ 115 | <------------------------------> 116 | cart_pixel_width (pixels) 117 | """ 118 | if (cart_pixel_width % 2) == 0: 119 | cart_min_range = (cart_pixel_width / 2 - 0.5) * cart_resolution 120 | else: 121 | cart_min_range = cart_pixel_width // 2 * cart_resolution 122 | coords = np.linspace(-cart_min_range, cart_min_range, cart_pixel_width, dtype=np.float32) 123 | Y, X = np.meshgrid(coords, -coords) 124 | sample_range = np.sqrt(Y * Y + X * X) 125 | sample_angle = np.arctan2(Y, X) 126 | sample_angle += (sample_angle < 0).astype(np.float32) * 2. * np.pi 127 | 128 | # Interpolate Radar Data Coordinates 129 | azimuth_step = azimuths[1] - azimuths[0] 130 | sample_u = (sample_range - radar_resolution / 2) / radar_resolution 131 | sample_v = (sample_angle - azimuths[0]) / azimuth_step 132 | 133 | # We clip the sample points to the minimum sensor reading range so that we 134 | # do not have undefined results in the centre of the image. In practice 135 | # this region is simply undefined. 
136 | sample_u[sample_u < 0] = 0 137 | 138 | if interpolate_crossover: 139 | fft_data = np.concatenate((fft_data[-1:], fft_data, fft_data[:1]), 0) 140 | sample_v = sample_v + 1 141 | 142 | polar_to_cart_warp = np.stack((sample_u, sample_v), -1) 143 | cart_img = np.expand_dims(cv2.remap(fft_data, polar_to_cart_warp, None, cv2.INTER_LINEAR), -1) 144 | return cart_img 145 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/requirements.txt: -------------------------------------------------------------------------------- 1 | numpy 2 | matplotlib 3 | colour_demosaicing 4 | pillow 5 | opencv-python 6 | open3d-python 7 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/transform.py: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # 3 | # Copyright (c) 2017 University of Oxford 4 | # Authors: 5 | # Geoff Pascoe (gmp@robots.ox.ac.uk) 6 | # 7 | # This work is licensed under the Creative Commons 8 | # Attribution-NonCommercial-ShareAlike 4.0 International License. 9 | # To view a copy of this license, visit 10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to 11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. 12 | # 13 | ################################################################################ 14 | 15 | import numpy as np 16 | import numpy.matlib as matlib 17 | from math import sin, cos, atan2, sqrt 18 | 19 | MATRIX_MATCH_TOLERANCE = 1e-4 20 | 21 | 22 | def build_se3_transform(xyzrpy): 23 | """Creates an SE3 transform from translation and Euler angles. 24 | 25 | Args: 26 | xyzrpy (list[float]): translation and Euler angles for transform. Must have six components. 27 | 28 | Returns: 29 | numpy.matrixlib.defmatrix.matrix: SE3 homogeneous transformation matrix 30 | 31 | Raises: 32 | ValueError: if `len(xyzrpy) != 6` 33 | 34 | """ 35 | if len(xyzrpy) != 6: 36 | raise ValueError("Must supply 6 values to build transform") 37 | 38 | se3 = matlib.identity(4) 39 | se3[0:3, 0:3] = euler_to_so3(xyzrpy[3:6]) 40 | se3[0:3, 3] = np.matrix(xyzrpy[0:3]).transpose() 41 | return se3 42 | 43 | 44 | def euler_to_so3(rpy): 45 | """Converts Euler angles to an SO3 rotation matrix. 46 | 47 | Args: 48 | rpy (list[float]): Euler angles (in radians). Must have three components. 49 | 50 | Returns: 51 | numpy.matrixlib.defmatrix.matrix: 3x3 SO3 rotation matrix 52 | 53 | Raises: 54 | ValueError: if `len(rpy) != 3`. 
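    Example (illustrative; the angle values are arbitrary):
        R = euler_to_so3([0.1, -0.2, 1.5])   # roll, pitch, yaw in radians -> 3x3 numpy matrix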
55 | 56 | """ 57 | if len(rpy) != 3: 58 | raise ValueError("Euler angles must have three components") 59 | 60 | R_x = np.matrix([[1, 0, 0], 61 | [0, cos(rpy[0]), -sin(rpy[0])], 62 | [0, sin(rpy[0]), cos(rpy[0])]]) 63 | R_y = np.matrix([[cos(rpy[1]), 0, sin(rpy[1])], 64 | [0, 1, 0], 65 | [-sin(rpy[1]), 0, cos(rpy[1])]]) 66 | R_z = np.matrix([[cos(rpy[2]), -sin(rpy[2]), 0], 67 | [sin(rpy[2]), cos(rpy[2]), 0], 68 | [0, 0, 1]]) 69 | R_zyx = R_z * R_y * R_x 70 | return R_zyx 71 | 72 | 73 | def so3_to_euler(so3): 74 | """Converts an SO3 rotation matrix to Euler angles 75 | 76 | Args: 77 | so3: 3x3 rotation matrix 78 | 79 | Returns: 80 | numpy.matrixlib.defmatrix.matrix: list of Euler angles (size 3) 81 | 82 | Raises: 83 | ValueError: if so3 is not 3x3 84 | ValueError: if a valid Euler parametrisation cannot be found 85 | 86 | """ 87 | if so3.shape != (3, 3): 88 | raise ValueError("SO3 matrix must be 3x3") 89 | roll = atan2(so3[2, 1], so3[2, 2]) 90 | yaw = atan2(so3[1, 0], so3[0, 0]) 91 | denom = sqrt(so3[0, 0] ** 2 + so3[1, 0] ** 2) 92 | pitch_poss = [atan2(-so3[2, 0], denom), atan2(-so3[2, 0], -denom)] 93 | 94 | R = euler_to_so3((roll, pitch_poss[0], yaw)) 95 | 96 | if (so3 - R).sum() < MATRIX_MATCH_TOLERANCE: 97 | return np.matrix([roll, pitch_poss[0], yaw]) 98 | else: 99 | R = euler_to_so3((roll, pitch_poss[1], yaw)) 100 | if (so3 - R).sum() > MATRIX_MATCH_TOLERANCE: 101 | raise ValueError("Could not find valid pitch angle") 102 | return np.matrix([roll, pitch_poss[1], yaw]) 103 | 104 | 105 | def so3_to_quaternion(so3): 106 | """Converts an SO3 rotation matrix to a quaternion 107 | 108 | Args: 109 | so3: 3x3 rotation matrix 110 | 111 | Returns: 112 | numpy.ndarray: quaternion [w, x, y, z] 113 | 114 | Raises: 115 | ValueError: if so3 is not 3x3 116 | """ 117 | if so3.shape != (3, 3): 118 | raise ValueError("SO3 matrix must be 3x3") 119 | 120 | R_xx = so3[0, 0] 121 | R_xy = so3[0, 1] 122 | R_xz = so3[0, 2] 123 | R_yx = so3[1, 0] 124 | R_yy = so3[1, 1] 125 | R_yz = so3[1, 2] 126 | R_zx = so3[2, 0] 127 | R_zy = so3[2, 1] 128 | R_zz = so3[2, 2] 129 | 130 | try: 131 | w = sqrt(so3.trace() + 1) / 2 132 | except(ValueError): 133 | # w is non-real 134 | w = 0 135 | 136 | # Due to numerical precision the value passed to `sqrt` may be a negative of the order 1e-15. 137 | # To avoid this error we clip these values to a minimum value of 0. 
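    # Background for the clipping: for a rotation matrix, 4*w**2 = 1 + trace(R), 4*x**2 = 1 + R_xx - R_yy - R_zz,
    # 4*y**2 = 1 + R_yy - R_xx - R_zz and 4*z**2 = 1 + R_zz - R_yy - R_xx, so all four arguments are non-negative
    # in exact arithmetic and only rounding error can push them slightly below zero.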
138 | x = sqrt(max(1 + R_xx - R_yy - R_zz, 0)) / 2 139 | y = sqrt(max(1 + R_yy - R_xx - R_zz, 0)) / 2 140 | z = sqrt(max(1 + R_zz - R_yy - R_xx, 0)) / 2 141 | 142 | max_index = max(range(4), key=[w, x, y, z].__getitem__) 143 | 144 | if max_index == 0: 145 | x = (R_zy - R_yz) / (4 * w) 146 | y = (R_xz - R_zx) / (4 * w) 147 | z = (R_yx - R_xy) / (4 * w) 148 | elif max_index == 1: 149 | w = (R_zy - R_yz) / (4 * x) 150 | y = (R_xy + R_yx) / (4 * x) 151 | z = (R_zx + R_xz) / (4 * x) 152 | elif max_index == 2: 153 | w = (R_xz - R_zx) / (4 * y) 154 | x = (R_xy + R_yx) / (4 * y) 155 | z = (R_yz + R_zy) / (4 * y) 156 | elif max_index == 3: 157 | w = (R_yx - R_xy) / (4 * z) 158 | x = (R_zx + R_xz) / (4 * z) 159 | y = (R_yz + R_zy) / (4 * z) 160 | 161 | return np.array([w, x, y, z]) 162 | 163 | 164 | def se3_to_components(se3): 165 | """Converts an SE3 rotation matrix to linear translation and Euler angles 166 | 167 | Args: 168 | se3: 4x4 transformation matrix 169 | 170 | Returns: 171 | numpy.matrixlib.defmatrix.matrix: list of [x, y, z, roll, pitch, yaw] 172 | 173 | Raises: 174 | ValueError: if se3 is not 4x4 175 | ValueError: if a valid Euler parametrisation cannot be found 176 | 177 | """ 178 | if se3.shape != (4, 4): 179 | raise ValueError("SE3 transform must be a 4x4 matrix") 180 | xyzrpy = np.empty(6) 181 | xyzrpy[0:3] = se3[0:3, 3].transpose() 182 | xyzrpy[3:6] = so3_to_euler(se3[0:3, 0:3]) 183 | return xyzrpy 184 | -------------------------------------------------------------------------------- /datasets/robotcar_sdk/python/velodyne.py: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # 3 | # Copyright (c) 2017 University of Oxford 4 | # Authors: 5 | # Dan Barnes (dbarnes@robots.ox.ac.uk) 6 | # 7 | # This work is licensed under the Creative Commons 8 | # Attribution-NonCommercial-ShareAlike 4.0 International License. 9 | # To view a copy of this license, visit 10 | # http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to 11 | # Creative Commons, PO Box 1866, Mountain View, CA 94042, USA. 12 | # 13 | ############################################################################### 14 | 15 | from typing import AnyStr 16 | import numpy as np 17 | import os 18 | import cv2 19 | 20 | # Hard coded configuration to simplify parsing code 21 | hdl32e_range_resolution = 0.002 # m / pixel 22 | hdl32e_minimum_range = 1.0 23 | hdl32e_elevations = np.array([-0.1862, -0.1628, -0.1396, -0.1164, -0.0930, 24 | -0.0698, -0.0466, -0.0232, 0., 0.0232, 0.0466, 0.0698, 25 | 0.0930, 0.1164, 0.1396, 0.1628, 0.1862, 0.2094, 0.2327, 26 | 0.2560, 0.2793, 0.3025, 0.3259, 0.3491, 0.3723, 0.3957, 27 | 0.4189, 0.4421, 0.4655, 0.4887, 0.5119, 0.5353])[:, np.newaxis] 28 | hdl32e_base_to_fire_height = 0.090805 29 | hdl32e_cos_elevations = np.cos(hdl32e_elevations) 30 | hdl32e_sin_elevations = np.sin(hdl32e_elevations) 31 | 32 | 33 | def load_velodyne_binary(velodyne_bin_path: AnyStr): 34 | """Decode a binary Velodyne example (of the form '.bin') 35 | Args: 36 | example_path (AnyStr): Oxford Radar RobotCar Dataset binary Velodyne pointcloud example path 37 | Returns: 38 | ptcld (np.ndarray): XYZI pointcloud from the binary Velodyne data Nx4 39 | Notes: 40 | - The pre computed points are *NOT* motion compensated. 41 | - Converting a raw velodyne scan to pointcloud can be done using the 42 | `velodyne_ranges_intensities_angles_to_pointcloud` function. 
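    Example (illustrative; the path is a placeholder):
        ptcld = load_velodyne_binary('velodyne_left/1547120000000000.bin')
        xyz = ptcld[:3].transpose()   # as implemented (reshape((4, -1))), the returned array is 4 x N: x, y, z, intensity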
43 | """ 44 | ext = os.path.splitext(velodyne_bin_path)[1] 45 | if ext != ".bin": 46 | raise RuntimeError("Velodyne binary pointcloud file should have `.bin` extension but had: {}".format(ext)) 47 | if not os.path.isfile(velodyne_bin_path): 48 | raise FileNotFoundError("Could not find velodyne bin example: {}".format(velodyne_bin_path)) 49 | data = np.fromfile(velodyne_bin_path, dtype=np.float32) 50 | ptcld = data.reshape((4, -1)) 51 | return ptcld 52 | 53 | def load_velodyne_binary_seg(velodyne_bin_path: AnyStr): 54 | """Decode a binary Velodyne example (of the form '.bin') 55 | Args: 56 | example_path (AnyStr): Oxford Radar RobotCar Dataset binary Velodyne pointcloud example path 57 | Returns: 58 | ptcld (np.ndarray): XYZI pointcloud from the binary Velodyne data Nx4 59 | Notes: 60 | - The pre computed points are *NOT* motion compensated. 61 | - Converting a raw velodyne scan to pointcloud can be done using the 62 | `velodyne_ranges_intensities_angles_to_pointcloud` function. 63 | """ 64 | ext = os.path.splitext(velodyne_bin_path)[1] 65 | if ext != ".bin": 66 | raise RuntimeError("Velodyne binary pointcloud file should have `.bin` extension but had: {}".format(ext)) 67 | if not os.path.isfile(velodyne_bin_path): 68 | raise FileNotFoundError("Could not find velodyne bin example: {}".format(velodyne_bin_path)) 69 | data = np.fromfile(velodyne_bin_path, dtype=np.float32) 70 | ptcld = data.reshape((-1, 4)) 71 | return ptcld 72 | 73 | def load_velodyne_raw(velodyne_raw_path: AnyStr): 74 | """Decode a raw Velodyne example. (of the form '.png') 75 | Args: 76 | example_path (AnyStr): Oxford Radar RobotCar Dataset raw Velodyne example path 77 | Returns: 78 | ranges (np.ndarray): Range of each measurement in meters where 0 == invalid, (32 x N) 79 | intensities (np.ndarray): Intensity of each measurement where 0 == invalid, (32 x N) 80 | angles (np.ndarray): Angle of each measurement in radians (1 x N) 81 | approximate_timestamps (np.ndarray): Approximate linearly interpolated timestamps of each mesaurement (1 x N). 82 | Approximate as we only receive timestamps for each packet. The timestamp of the next frame will was used to 83 | interpolate the last packet timestamps. If there was no next frame, the last packet timestamps was 84 | extrapolated. The original packet timestamps can be recovered with: 85 | approximate_timestamps(:, 1:12:end) (12 is the number of azimuth returns in each packet) 86 | Notes: 87 | Reference: https://velodynelidar.com/lidar/products/manual/63-9113%20HDL-32E%20manual_Rev%20E_NOV2012.pdf 88 | """ 89 | ext = os.path.splitext(velodyne_raw_path)[1] 90 | if ext != ".png": 91 | raise RuntimeError("Velodyne raw file should have `.png` extension but had: {}".format(ext)) 92 | if not os.path.isfile(velodyne_raw_path): 93 | raise FileNotFoundError("Could not find velodyne raw example: {}".format(velodyne_raw_path)) 94 | example = cv2.imread(velodyne_raw_path, cv2.IMREAD_GRAYSCALE) 95 | intensities, ranges_raw, angles_raw, timestamps_raw = np.array_split(example, [32, 96, 98], 0) 96 | ranges = np.ascontiguousarray(ranges_raw.transpose()).view(np.uint16).transpose() 97 | ranges = ranges * hdl32e_range_resolution 98 | angles = np.ascontiguousarray(angles_raw.transpose()).view(np.uint16).transpose() 99 | angles = angles * (2. 
* np.pi) / 36000 100 | approximate_timestamps = np.ascontiguousarray(timestamps_raw.transpose()).view(np.int64).transpose() 101 | return ranges, intensities, angles, approximate_timestamps 102 | 103 | 104 | def velodyne_raw_to_pointcloud(ranges: np.ndarray, intensities: np.ndarray, angles: np.ndarray): 105 | """ Convert raw Velodyne data (from load_velodyne_raw) into a pointcloud 106 | Args: 107 | ranges (np.ndarray): Raw Velodyne range readings 108 | intensities (np.ndarray): Raw Velodyne intensity readings 109 | angles (np.ndarray): Raw Velodyne angles 110 | Returns: 111 | pointcloud (np.ndarray): XYZI pointcloud generated from the raw Velodyne data Nx4 112 | 113 | Notes: 114 | - This implementation does *NOT* perform motion compensation on the generated pointcloud. 115 | - Accessing the pointclouds in binary form via `load_velodyne_pointcloud` is approximately 2x faster at the cost 116 | of 8x the storage space 117 | """ 118 | valid = ranges > hdl32e_minimum_range 119 | z = hdl32e_sin_elevations * ranges - hdl32e_base_to_fire_height 120 | xy = hdl32e_cos_elevations * ranges 121 | x = np.sin(angles) * xy 122 | y = -np.cos(angles) * xy 123 | 124 | xf = x[valid].reshape(-1) 125 | yf = y[valid].reshape(-1) 126 | zf = z[valid].reshape(-1) 127 | intensityf = intensities[valid].reshape(-1).astype(np.float32) 128 | ptcld = np.stack((xf, yf, zf, intensityf), 0) 129 | return ptcld 130 | -------------------------------------------------------------------------------- /datasets/rsd.py: -------------------------------------------------------------------------------- 1 | """ 2 | This code is designed to implement the following: 3 | 1. After reaching a certain epoch, it clips the data based on the median loss. 4 | 1. After reaching a certain epoch, it restores the full dataset. 5 | 6 | Modified by Infobatch 7 | """ 8 | import math 9 | import torch 10 | import numpy as np 11 | import MinkowskiEngine as ME 12 | from torch.utils.data import Dataset 13 | 14 | 15 | class RSD(Dataset): 16 | """ 17 | Args: 18 | dataset: Dataset used for training. 19 | num_epochs (int): The number of epochs for pruning. 20 | prune_ratio (float, optional): The proportion of samples being pruned during training. 21 | delta (float, optional): The first delta * num_epochs the pruning process is conducted. It should be close to 1. Defaults to 0.875. 
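    Example (illustrative sketch; `train_set` and `batch_collate` are placeholders for whatever the training script builds):
        rsd_set = RSD(train_set, num_epochs=25, prune_ratio=0.25)
        loader = torch.utils.data.DataLoader(rsd_set, batch_size=256,
                                             sampler=rsd_set.sampler, collate_fn=batch_collate)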
22 | """ 23 | 24 | def __init__(self, dataset: Dataset, num_epochs: int, 25 | prune_ratio: float = 0.25, delta_start: float = 0.25, windows_radio: float = 0.1, 26 | delta_stop: float = 0.85): 27 | self.dataset = dataset 28 | self.keep_ratio = min(1.0, max(1e-1, 1.0 - prune_ratio)) 29 | self.prune_ratio = prune_ratio 30 | self.num_epochs = num_epochs 31 | self.delta_start = delta_start 32 | self.delta_stop = delta_stop 33 | self.windows = windows_radio # 0.1 34 | self.keep_windows_ratio = ((delta_stop - delta_start) - 2 * windows_radio) / 2 # 0.2 35 | self.weights = torch.ones(len(self.dataset), 1) 36 | self.select_idx = torch.zeros(len(dataset), dtype=torch.bool) 37 | self.num_pruned_samples = 0 38 | self.cur_batch_index = None 39 | 40 | def set_active_indices(self, cur_batch_indices: torch.Tensor): 41 | self.cur_batch_index = cur_batch_indices 42 | 43 | def update(self, loss, idx, batch_num): 44 | """ 45 | Scaling loss 46 | """ 47 | device = loss.device 48 | weights = self.weights.to(device)[idx][batch_num.long()] 49 | loss.mul_(weights) 50 | 51 | return loss.mean() 52 | 53 | def __len__(self): 54 | return len(self.dataset) 55 | 56 | def __getitem__(self, idx): 57 | scan_path = self.dataset.pcs[idx] 58 | if self.dataset.scene == 'NCLT': 59 | ptcld = np.fromfile(scan_path, dtype=np.float32).reshape(-1, 4)[:, :3] 60 | else: 61 | ptcld = np.fromfile(scan_path, dtype=np.float32).reshape(4, -1).transpose()[:, :3] 62 | ptcld[:, 2] = -1 * ptcld[:, 2] 63 | 64 | scan = ptcld 65 | 66 | scan = np.ascontiguousarray(scan) 67 | 68 | lbl = self.dataset.lbl[idx] 69 | pose = self.dataset.poses[idx] # (6,) 70 | rot = self.dataset.rots[idx] # [3, 3] 71 | scan_gt = (rot @ scan.transpose(1, 0)).transpose(1, 0) + pose[:3].reshape(1, 3) 72 | 73 | if self.dataset.train and self.dataset.augment: 74 | scan = self.dataset.augmentor.doAugmentation(scan) # n, 5 75 | 76 | scan_gt_s8 = np.concatenate((scan, scan_gt), axis=1) 77 | 78 | coords, feats = ME.utils.sparse_quantize( 79 | coordinates=scan, 80 | features=scan_gt_s8, 81 | quantization_size=self.dataset.voxel_size) 82 | 83 | return (coords, feats, lbl, idx, pose) 84 | 85 | # First Downsampling 86 | def cal_std(self, values, labels, label_num): 87 | # 1. Calculate the variance of each row 88 | variances = values.var(dim=1) 89 | # 2. 
Calculate separately for each cluster 90 | for i in range(label_num): 91 | # Obtain the sample indices for this cluster 92 | label = i 93 | label_indices = torch.where(labels == label)[0] 94 | # Sort by variance in descending order and select the top keep_ratio percentage of samples 95 | label_variances = variances[label_indices] 96 | num_top_samples = int(self.keep_ratio * len(label_variances)) 97 | # Obtain the indices of the samples with the largest num_top_samples variances 98 | _, top_indices = torch.topk(label_variances, num_top_samples) 99 | selected_indices = label_indices[top_indices] 100 | # Set the mask of these samples to true 101 | self.select_idx[selected_indices] = True 102 | 103 | remain_idx = torch.where(self.select_idx)[0] 104 | self.weights[remain_idx] = self.weights[remain_idx] * (1 / self.keep_ratio) 105 | 106 | return len(remain_idx), remain_idx 107 | 108 | # Second Downsampling, update select_idx 109 | def sec_cal_std(self, values, labels, selected_indices, label_num): 110 | values = values[selected_indices] 111 | labels = labels[selected_indices] 112 | variances = values.var(dim=1) 113 | self.select_idx[:] = False 114 | for i in range(label_num): 115 | label = i 116 | label_indices = torch.where(labels == label)[0] 117 | label_variances = variances[label_indices] 118 | num_top_samples = int(self.keep_ratio * len(label_variances)) 119 | _, top_indices = torch.topk(label_variances, num_top_samples) 120 | second_selected_indices = label_indices[top_indices] 121 | self.select_idx[second_selected_indices] = True 122 | 123 | remain_idx = torch.where(self.select_idx)[0] 124 | self.weights[remain_idx] = self.weights[remain_idx] * (1 / self.keep_ratio) 125 | 126 | return torch.sum(self.select_idx == True) 127 | 128 | def prune(self): 129 | # Prune samples that are well learned, rebalance the weight by scaling up remaining 130 | # well learned samples' learning rate to keep estimation about the same 131 | # for the next version, also consider new class balance 132 | remained_mask = self.select_idx.numpy() 133 | remained_indices = np.where(remained_mask)[0].tolist() 134 | np.random.shuffle(remained_indices) 135 | 136 | return remained_indices 137 | 138 | @property 139 | def sampler(self): 140 | sampler = IBSampler(self) 141 | return sampler 142 | 143 | def no_prune(self): 144 | samples_indices = list(range(len(self))) 145 | np.random.shuffle(samples_indices) 146 | return samples_indices 147 | 148 | def reset_weights(self): 149 | self.weights[:] = 1 150 | 151 | @property 152 | def first_prune(self): 153 | # Round down 154 | return math.floor(self.num_epochs * self.delta_start) 155 | 156 | @property 157 | def stop_prune(self): 158 | # Round down 159 | return math.floor(self.num_epochs * self.delta_stop) 160 | 161 | @property 162 | def second_prune(self): 163 | # Round down 164 | # ceil(25 * (0.25 + 0.1 + 0.2) = 13.75) = 14 165 | return math.floor(self.num_epochs * (self.delta_start + self.windows + self.keep_windows_ratio)) 166 | 167 | @property 168 | def win_std(self): 169 | # Round up 170 | return math.ceil(self.num_epochs * self.windows) 171 | 172 | 173 | class IBSampler(object): 174 | def __init__(self, dataset: RSD): 175 | self.dataset = dataset 176 | self.first_prune = dataset.first_prune 177 | self.stop_prune = dataset.stop_prune 178 | self.windows = dataset.win_std 179 | self.iterations = -1 180 | self.sample_indices = None 181 | self.iter_obj = None 182 | self.reset() 183 | 184 | def reset(self): 185 | # np.random.seed(self.iterations) 186 | print("iterations: %d" % 
self.iterations) 187 | if self.iterations >= self.stop_prune or self.iterations < (self.first_prune + self.windows): 188 | print("no pruning") 189 | if self.iterations == self.stop_prune: 190 | self.dataset.reset_weights() 191 | self.sample_indices = self.dataset.no_prune() 192 | else: 193 | print("pruning") 194 | self.sample_indices = self.dataset.prune() 195 | print(len(self.sample_indices)) 196 | self.iter_obj = iter(self.sample_indices) 197 | self.iterations += 1 198 | 199 | def __next__(self): 200 | nxt = next(self.iter_obj) 201 | return nxt 202 | 203 | def __len__(self): 204 | return len(self.sample_indices) 205 | 206 | def __iter__(self): 207 | self.reset() 208 | return self 209 | -------------------------------------------------------------------------------- /img/Pipline.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/img/Pipline.png -------------------------------------------------------------------------------- /img/Teaser.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/img/Teaser.png -------------------------------------------------------------------------------- /install.sh: -------------------------------------------------------------------------------- 1 | conda deactivate 2 | 3 | # Set environment variables 4 | export ENV_NAME=LightLoc 5 | export PYTHON_VERSION=3.8 6 | export PYTORCH_VERSION=1.11.0 7 | export CUDA_VERSION=11.3 8 | 9 | # Create a new conda environment and activate it 10 | conda create -n $ENV_NAME python=$PYTHON_VERSION -y 11 | conda activate $ENV_NAME 12 | 13 | # Install PyTorch 14 | conda install pytorch==$PYTORCH_VERSION torchvision torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=$CUDA_VERSION -c pytorch -y 15 | # Install MinkowskiEngine 16 | conda install openblas-devel -c anaconda -y 17 | pip install pip==22.2.1 18 | # Downgrade the version of setuptools 19 | pip uninstall setuptools -y 20 | pip install setuptools==69.5.1 21 | # gcc g++ 9.5 22 | pip install -U git+https://github.com/NVIDIA/MinkowskiEngine -v --no-deps --install-option="--blas_include_dirs=${CONDA_PREFIX}/include" --install-option="--blas=openblas" 23 | 24 | # Install pip packages 25 | pip install matplotlib 26 | pip install tqdm 27 | pip install h5py 28 | pip install pandas 29 | pip install transforms3d 30 | pip install open3d 31 | -------------------------------------------------------------------------------- /models/__pycache__/ace.cpython-310.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/models/__pycache__/ace.cpython-310.pyc -------------------------------------------------------------------------------- /models/__pycache__/ace.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/models/__pycache__/ace.cpython-39.pyc -------------------------------------------------------------------------------- /models/__pycache__/loss.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/models/__pycache__/loss.cpython-39.pyc 
-------------------------------------------------------------------------------- /models/__pycache__/sc2pcr.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/liw95/LightLoc/1f57f87f9bfdd573238d0d7bdb506a57603a2e13/models/__pycache__/sc2pcr.cpython-39.pyc -------------------------------------------------------------------------------- /models/lightloc.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # Copyright © Niantic, Inc. 2022. 3 | import re 4 | import torch 5 | import logging 6 | import torch.nn as nn 7 | import MinkowskiEngine as ME 8 | import MinkowskiEngine.MinkowskiFunctional as MEF 9 | import torch.nn.functional as F 10 | 11 | 12 | _logger = logging.getLogger(__name__) 13 | 14 | 15 | def Norm(norm_type, num_feats, bn_momentum=0.1, D=-1): 16 | if norm_type == 'BN': 17 | return ME.MinkowskiBatchNorm(num_feats, momentum=bn_momentum) 18 | elif norm_type == 'IN': 19 | return ME.MinkowskiInstanceNorm(num_feats, dimension=D) 20 | else: 21 | raise ValueError(f'Type {norm_type}, not defined') 22 | 23 | 24 | class Conv(nn.Module): 25 | def __init__(self, 26 | inplanes, 27 | planes, 28 | kernel_size=3, 29 | stride=1, 30 | dilation=1, 31 | bias=False, 32 | dimension=3): 33 | super(Conv, self).__init__() 34 | 35 | self.net = nn.Sequential(ME.MinkowskiConvolution(inplanes, 36 | planes, 37 | kernel_size=kernel_size, 38 | stride=stride, 39 | dilation=dilation, 40 | bias=bias, 41 | dimension=dimension),) 42 | 43 | def forward(self, x): 44 | return self.net(x) 45 | 46 | 47 | class Encoder(ME.MinkowskiNetwork): 48 | """ 49 | FCN encoder, used to extract features from the input point clouds. 50 | 51 | The number of output channels is configurable, the default used in the paper is 512. 
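    Example (illustrative sketch; `batched_coords` and `feats` stand in for the quantized batch produced by the dataset):
        encoder = Encoder(out_channels=512, norm_type='BN')
        x = ME.SparseTensor(features=feats.float(), coordinates=batched_coords)   # feats must have 3 channels
        f = encoder(x)   # sparse tensor with out_channels-dim features, spatially subsampled by a factor of 8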
52 | """ 53 | 54 | def __init__(self, out_channels, norm_type, D=3): 55 | super(Encoder, self).__init__(D) 56 | 57 | self.in_channels = 3 58 | self.out_channels = out_channels 59 | self.norm_type = norm_type 60 | self.conv_planes = [32, 64, 128, 256, 256, 256, 256, 512, 512] 61 | 62 | # in_channels, conv_planes, kernel_size, stride dilation bias 63 | self.conv1 = Conv(self.in_channels, self.conv_planes[0], 3, 1, 1, True) 64 | self.conv2 = Conv(self.conv_planes[0], self.conv_planes[1], 3, 2, bias=True) 65 | self.conv3 = Conv(self.conv_planes[1], self.conv_planes[2], 3, 2, bias=True) 66 | self.conv4 = Conv(self.conv_planes[2], self.conv_planes[3], 3, 2, bias=True) 67 | 68 | self.res1_conv1 = Conv(self.conv_planes[3], self.conv_planes[4], 3, 1, bias=True) 69 | # 1 70 | self.res1_conv2 = Conv(self.conv_planes[4], self.conv_planes[5], 1, 1, bias=True) 71 | self.res1_conv3 = Conv(self.conv_planes[5], self.conv_planes[6], 3, 1, bias=True) 72 | 73 | self.res2_conv1 = Conv(self.conv_planes[6], self.conv_planes[7], 3, 1, bias=True) 74 | # 2 75 | self.res2_conv2 = Conv(self.conv_planes[7], self.conv_planes[8], 1, 1, bias=True) 76 | self.res2_conv3 = Conv(self.conv_planes[8], self.out_channels, 3, 1, bias=True) 77 | 78 | self.res2_skip = Conv(self.conv_planes[6], self.out_channels, 1, 1, bias=True) 79 | 80 | def forward(self, x): 81 | 82 | x = MEF.relu(self.conv1(x)) 83 | x = MEF.relu(self.conv2(x)) 84 | x = MEF.relu(self.conv3(x)) 85 | res = MEF.relu(self.conv4(x)) 86 | 87 | x = MEF.relu(self.res1_conv1(res)) 88 | x = MEF.relu(self.res1_conv2(x)) 89 | x._F = x.F.to(torch.float32) 90 | x = MEF.relu(self.res1_conv3(x)) 91 | 92 | res = res + x 93 | 94 | x = MEF.relu(self.res2_conv1(res)) 95 | x = MEF.relu(self.res2_conv2(x)) 96 | x._F = x.F.to(torch.float32) 97 | x = MEF.relu(self.res2_conv3(x)) 98 | 99 | x = self.res2_skip(res) + x 100 | 101 | return x 102 | 103 | 104 | def one_hot(x, N): 105 | one_hot = torch.FloatTensor(x.size(0), N, x.size(1), x.size(2)).zero_().to(x.device) 106 | one_hot = one_hot.scatter_(1, x.unsqueeze(1), 1) 107 | return one_hot 108 | 109 | 110 | class CondLayer(nn.Module): 111 | """ 112 | pixel-wise feature modulation. 113 | """ 114 | def __init__(self, in_channels): 115 | super(CondLayer, self).__init__() 116 | self.bn = nn.BatchNorm1d(in_channels) 117 | 118 | def forward(self, x, gammas, betas): 119 | return F.relu(self.bn((gammas * x) + betas)) 120 | 121 | 122 | class Cls_Head(nn.Module): 123 | """ 124 | Classification Head 125 | """ 126 | def __init__(self, in_channels=512, level_cluster=25): 127 | super(Cls_Head, self).__init__() 128 | 129 | channels_c = [512, 256, level_cluster] 130 | 131 | # level network. 132 | self.conv1_l1 = nn.Linear(in_channels, channels_c[0]) 133 | self.norm1_l1 = nn.BatchNorm1d(channels_c[0]) 134 | self.conv2_l1 = nn.Linear(channels_c[0], channels_c[1]) 135 | self.norm2_l1 = nn.BatchNorm1d(channels_c[1]) 136 | self.conv3_l1 = nn.Linear(channels_c[1], channels_c[2]) 137 | self.dp1_l1 = nn.Dropout(0.5) 138 | self.dp2_l1 = nn.Dropout(0.5) 139 | 140 | 141 | def forward(self, res): 142 | 143 | x1 = self.dp1_l1(F.relu(self.norm1_l1(self.conv1_l1(res)))) 144 | x1 = self.dp2_l1(F.relu(self.norm2_l1(self.conv2_l1(x1)))) 145 | # output the classification probability. 
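        # Note: conv3_l1 below yields raw class logits; apply softmax (or argmax) downstream if per-sample
        # probabilities or hard cluster labels are needed.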
146 | out_lbl_1 = self.conv3_l1(x1) 147 | 148 | return out_lbl_1 149 | 150 | 151 | class Reg_Head(nn.Module): 152 | """ 153 | nn.Linear版 154 | """ 155 | def __init__(self, num_head_blocks, in_channels=512, mlp_ratio=1.0): 156 | super(Reg_Head, self).__init__() 157 | self.in_channels = in_channels # Number of encoder features. 158 | self.head_channels = in_channels # Hardcoded. 159 | 160 | # We may need a skip layer if the number of features output by the encoder is different. 161 | self.head_skip = nn.Identity() if self.in_channels == self.head_channels \ 162 | else nn.Linear(self.in_channels, self.head_channels) 163 | 164 | block_channels = int(self.head_channels * mlp_ratio) 165 | self.res3_conv1 = nn.Linear(self.in_channels, self.head_channels) 166 | self.res3_conv2 = nn.Linear(self.head_channels, block_channels) 167 | self.res3_conv3 = nn.Linear(block_channels, self.head_channels) 168 | 169 | self.res_blocks = [] 170 | self.norm_blocks = [] 171 | 172 | for block in range(num_head_blocks): 173 | self.res_blocks.append(( 174 | nn.Linear(self.head_channels, self.head_channels), 175 | nn.Linear(self.head_channels, block_channels), 176 | nn.Linear(block_channels, self.head_channels), 177 | )) 178 | 179 | super(Reg_Head, self).add_module(str(block) + 'c0', self.res_blocks[block][0]) 180 | super(Reg_Head, self).add_module(str(block) + 'c1', self.res_blocks[block][1]) 181 | super(Reg_Head, self).add_module(str(block) + 'c2', self.res_blocks[block][2]) 182 | 183 | self.fc1 = nn.Linear(self.head_channels, self.head_channels) 184 | self.fc2 = nn.Linear(self.head_channels, block_channels) 185 | self.fc3 = nn.Linear(block_channels, 3) 186 | 187 | def forward(self, res): 188 | x = F.relu(self.res3_conv1(res)) 189 | x = F.relu(self.res3_conv2(x)) 190 | x = F.relu(self.res3_conv3(x)) 191 | 192 | res = self.head_skip(res) + x 193 | 194 | for res_block in self.res_blocks: 195 | 196 | x = F.relu(res_block[0](res)) 197 | x = F.relu(res_block[1](x)) 198 | x = F.relu(res_block[2](x)) 199 | res = res + x 200 | 201 | sc = F.relu(self.fc1(res)) 202 | sc = F.relu(self.fc2(sc)) 203 | sc = self.fc3(sc) 204 | 205 | return sc 206 | 207 | 208 | class Regressor(ME.MinkowskiNetwork): 209 | """ 210 | FCN architecture for scene coordinate regression. 211 | 212 | The network predicts a 3d scene coordinates, the output is subsampled by a factor of 8 compared to the input. 213 | """ 214 | 215 | OUTPUT_SUBSAMPLE = 8 216 | 217 | def __init__(self, num_head_blocks, num_encoder_features, level_clusters=25, 218 | mlp_ratio=1.0, sample_cls=False, D=3): 219 | """ 220 | Constructor. 221 | 222 | mean: Learn scene coordinates relative to a mean coordinate (e.g. the center of the scene). 223 | num_head_blocks: How many extra residual blocks to use in the head (one is always used). 224 | use_homogeneous: Whether to learn homogeneous or 3D coordinates. 225 | num_encoder_features: Number of channels output of the encoder network. 
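        Example (illustrative values, not necessarily the configuration used in the paper):
            model = Regressor(num_head_blocks=1, num_encoder_features=512, level_clusters=25,
                              mlp_ratio=1.0, sample_cls=False)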
226 | """ 227 | super(Regressor, self).__init__(D) 228 | 229 | self.feature_dim = num_encoder_features 230 | 231 | self.encoder = Encoder(out_channels=self.feature_dim, norm_type='BN') 232 | self.cls_heads = Cls_Head(in_channels=self.feature_dim, level_cluster=level_clusters) 233 | if not sample_cls: 234 | self.reg_heads = Reg_Head(num_head_blocks=num_head_blocks, in_channels=self.feature_dim + level_clusters, 235 | mlp_ratio=mlp_ratio) 236 | 237 | @classmethod 238 | def create_from_encoder(cls, encoder_state_dict, classifier_state_dict=None, 239 | num_head_blocks=None, level_clusters=25, mlp_ratio=1.0, sample_cls=False): 240 | """ 241 | Create a regressor using a pretrained encoder, loading encoder-specific parameters from the state dict. 242 | 243 | encoder_state_dict: pretrained encoder state dictionary. 244 | classifier_state_dict: trained classifier state dictionary 245 | num_head_blocks: How many extra residual blocks to use in the head. 246 | level_cluster: How many classification categories. 247 | mlp_ratio: Channel expansion ratio. 248 | sample_cls: training for classifier 249 | """ 250 | 251 | num_encoder_features = encoder_state_dict['res2_conv3.net.0.bias'].shape[1] 252 | 253 | # Create a regressor. 254 | _logger.info(f"Creating Regressor using pretrained encoder with {num_encoder_features} feature size.") 255 | regressor = cls(num_head_blocks, num_encoder_features, level_clusters, mlp_ratio, sample_cls) 256 | 257 | # Load encoder weights. 258 | regressor.encoder.load_state_dict(encoder_state_dict) 259 | 260 | if classifier_state_dict!=None: 261 | regressor.cls_heads.load_state_dict(classifier_state_dict) 262 | 263 | # Done. 264 | return regressor 265 | 266 | @classmethod 267 | def create_from_state_dict(cls, state_dict): 268 | """ 269 | Instantiate a regressor from a pretrained state dictionary. 270 | 271 | state_dict: pretrained state dictionary. 272 | """ 273 | # Count how many head blocks are in the dictionary. 274 | pattern = re.compile(r"^reg_heads\.\d+c0\.weight$") 275 | num_head_blocks = sum(1 for k in state_dict.keys() if pattern.match(k)) 276 | 277 | # Number of output channels of the last encoder layer. 278 | num_encoder_features = state_dict['encoder.res2_conv3.net.0.bias'].shape[1] 279 | num_decoder_features = state_dict['cls_heads.conv1_l1.weight'].shape[1] 280 | head_channels = state_dict['cls_heads.conv1_l1.weight'].shape[0] 281 | level_clusters = state_dict['cls_heads.conv3_l1.weight'].shape[0] 282 | reg = any(key.startswith("reg_heads") for key in state_dict) 283 | if reg: 284 | mlp_ratio = state_dict['reg_heads.res3_conv2.weight'].shape[0] / \ 285 | state_dict['reg_heads.res3_conv2.weight'].shape[1] 286 | else: 287 | mlp_ratio = 1 288 | 289 | # Create a regressor. 290 | _logger.info(f"Creating regressor from pretrained state_dict:" 291 | f"\n\tNum head blocks: {num_head_blocks}" 292 | f"\n\tEncoder feature size: {num_encoder_features}" 293 | f"\n\tDecoder feature size: {num_decoder_features}" 294 | f"\n\tHead channels: {head_channels}" 295 | f"\n\tMLP ratio: {mlp_ratio}") 296 | regressor = cls(num_head_blocks, num_encoder_features, mlp_ratio=mlp_ratio, level_clusters=level_clusters) 297 | 298 | # Load all weights. 299 | regressor.load_state_dict(state_dict) 300 | 301 | # Done. 302 | return regressor 303 | 304 | @classmethod 305 | def create_from_split_state_dict(cls, encoder_state_dict, cls_head_state_dict, reg_head_state_dict=None): 306 | """ 307 | Instantiate a regressor from a pretrained encoder (scene-agnostic) and a scene-specific head. 
308 | 309 | encoder_state_dict: encoder state dictionary 310 | head_state_dict: scene-specific head state dictionary 311 | """ 312 | # We simply merge the dictionaries and call the other constructor. 313 | merged_state_dict = {} 314 | 315 | for k, v in encoder_state_dict.items(): 316 | merged_state_dict[f"encoder.{k}"] = v 317 | 318 | for k, v in cls_head_state_dict.items(): 319 | merged_state_dict[f"cls_heads.{k}"] = v.squeeze(-1).squeeze(-1) 320 | 321 | if reg_head_state_dict != None: 322 | for k, v in reg_head_state_dict.items(): 323 | merged_state_dict[f"reg_heads.{k}"] = v.squeeze(-1).squeeze(-1) 324 | 325 | return cls.create_from_state_dict(merged_state_dict) 326 | 327 | 328 | def load_encoder(self, encoder_dict_file): 329 | """ 330 | Load weights into the encoder network. 331 | """ 332 | self.encoder.load_state_dict(torch.load(encoder_dict_file)) 333 | 334 | def get_features(self, inputs): 335 | return self.encoder(inputs) 336 | 337 | def get_scene_coordinates(self, features): 338 | out = self.reg_heads(features) 339 | return out 340 | 341 | def get_scene_classification(self, features): 342 | out = self.cls_heads(features) 343 | 344 | return out 345 | 346 | def forward(self, inputs): 347 | """ 348 | Forward pass. 349 | """ 350 | features = self.encoder(inputs) 351 | out = self.get_scene_coordinates(features.F) 352 | out = ME.SparseTensor( 353 | features=out, 354 | coordinates=features.C, 355 | ) 356 | 357 | return {'pred': out} -------------------------------------------------------------------------------- /models/loss.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch import nn 3 | 4 | 5 | class RSD_Criterion(nn.Module): 6 | def __init__(self, first_prune_epoch, second_prune_epoch, windows): 7 | super(RSD_Criterion, self).__init__() 8 | self.first_prune_epoch = first_prune_epoch 9 | self.second_prune_epoch = second_prune_epoch 10 | self.windows = windows 11 | 12 | def forward(self, pred_point, gt_point, batch_size, epoch_nums, idx, values): 13 | loss_map = torch.sum(torch.abs(pred_point - gt_point), axis=-1) # [B*N] 14 | loss = loss_map.detach().clone().view(batch_size, -1) # [B, N] 15 | if self.first_prune_epoch <= epoch_nums < (self.first_prune_epoch + self.windows): 16 | current_epoch = epoch_nums - self.first_prune_epoch 17 | values[idx, current_epoch] = torch.median(loss, dim=1).values 18 | elif self.second_prune_epoch <= epoch_nums < (self.second_prune_epoch + self.windows): 19 | current_epoch = epoch_nums - self.second_prune_epoch 20 | values[idx, current_epoch] = torch.median(loss, dim=1).values 21 | loss_map = torch.mean(loss_map) 22 | 23 | # loss_map = torch.sum(torch.abs(pred_point - gt_point), axis=-1, keepdims=True) 24 | # if self.first_prune_epoch <= epoch_nums < (self.first_prune_epoch + self.windows): 25 | # loss_lw = loss_map.detach().clone() 26 | # current_epoch = epoch_nums - self.first_prune_epoch 27 | # # print(batch_nums.shape) 28 | # # print(loss_map.shape) 29 | # for i in range(batch_size): 30 | # # 取出当前序列中的第N个batch数据 31 | # mask = batch_nums == i 32 | # values[idx[i], current_epoch] = torch.median(loss_lw[mask]) # 将其求中值,以忽略噪点影响 33 | # elif self.second_prune_epoch <= epoch_nums < (self.second_prune_epoch + self.windows): 34 | # loss_lw = loss_map.detach().clone() 35 | # current_epoch = epoch_nums - self.second_prune_epoch 36 | # for i in range(batch_size): 37 | # # 取出当前序列中的第N个batch数据 38 | # mask = batch_nums == i 39 | # values[idx[i], current_epoch] = torch.median(loss_lw[mask]) # 
将其求中值,以忽略噪点影响 40 | 41 | # loss_map = torch.mean(loss_map) 42 | 43 | return loss_map, values 44 | 45 | 46 | class REG_Criterion(nn.Module): 47 | def __init__(self): 48 | super(REG_Criterion, self).__init__() 49 | 50 | def forward(self, pred_point, gt_point): 51 | loss_map = torch.sum(torch.abs(pred_point - gt_point), axis=-1, keepdims=True) 52 | loss_map = torch.mean(loss_map) 53 | 54 | return loss_map 55 | 56 | 57 | class CLS_Criterion(nn.Module): 58 | def __init__(self): 59 | super(CLS_Criterion, self).__init__() 60 | self.loc_loss = nn.CrossEntropyLoss(label_smoothing=0.1) 61 | 62 | def forward(self, pred_loc, gt_loc): 63 | loss = self.loc_loss(pred_loc, gt_loc.long()) 64 | 65 | return loss -------------------------------------------------------------------------------- /models/sc2pcr.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import transforms3d.quaternions as txq 3 | import numpy as np 4 | from utils.pose_util import rigid_transform_3d, transform 5 | 6 | 7 | class Matcher(): 8 | def __init__(self, 9 | inlier_threshold=1.4, 10 | num_node='all', 11 | use_mutual=True, 12 | d_thre=2, 13 | num_iterations=10, 14 | ratio=0.1, # 0.2 15 | nms_radius=0.1, 16 | max_points=2000, 17 | k1=40, 18 | k2=20, 19 | select_scene=None, 20 | ): 21 | self.inlier_threshold = inlier_threshold 22 | self.num_node = num_node 23 | self.use_mutual = use_mutual 24 | self.d_thre = d_thre 25 | self.num_iterations = num_iterations # maximum iteration of power iteration algorithm 26 | self.ratio = ratio # the maximum ratio of seeds. 27 | self.max_points = max_points 28 | self.nms_radius = nms_radius 29 | self.k1 = k1 30 | self.k2 = k2 31 | 32 | def pick_seeds(self, dists, scores, R, max_num): 33 | """ 34 | Select seeding points using Non Maximum Suppression. (here we only support bs=1) 35 | Input: 36 | - dists: [bs, num_corr, num_corr] src keypoints distance matrix 37 | - scores: [bs, num_corr] initial confidence of each correspondence 38 | - R: float radius of nms 39 | - max_num: int maximum number of returned seeds 40 | Output: 41 | - picked_seeds: [bs, num_seeds] the index to the seeding correspondences 42 | """ 43 | assert scores.shape[0] == 1 44 | 45 | # parallel Non Maximum Suppression (more efficient) 46 | score_relation = scores.T >= scores # [num_corr, num_corr], save the relation of leading_eig 47 | # score_relation[dists[0] >= R] = 1 # mask out the non-neighborhood node 48 | score_relation = score_relation.bool() | (dists[0] >= R).bool() 49 | is_local_max = score_relation.min(-1)[0].float() 50 | 51 | score_local_max = scores * is_local_max 52 | sorted_score = torch.argsort(score_local_max, dim=1, descending=True) 53 | 54 | # max_num = scores.shape[1] 55 | 56 | return_idx = sorted_score[:, 0: max_num].detach() 57 | 58 | return return_idx 59 | 60 | def cal_seed_trans(self, seeds, SC2_measure, src_keypts, tgt_keypts): 61 | """ 62 | Calculate the transformation for each seeding correspondences. 63 | Input: 64 | - seeds: [bs, num_seeds] the index to the seeding correspondence 65 | - SC2_measure: [bs, num_corr, num_channels] 66 | - src_keypts: [bs, num_corr, 3] 67 | - tgt_keypts: [bs, num_corr, 3] 68 | Output: leading eigenvector 69 | - final_trans: [bs, 4, 4] best transformation matrix (after post refinement) for each batch. 
70 | """ 71 | bs, num_corr, num_channels = SC2_measure.shape[0], SC2_measure.shape[1], SC2_measure.shape[2] 72 | k1 = self.k1 73 | k2 = self.k2 74 | 75 | if k1 > num_channels: 76 | k1 = 4 77 | k2 = 4 78 | 79 | ################################# 80 | # The first stage consensus set sampling 81 | # Finding the k1 nearest neighbors around each seed 82 | ################################# 83 | sorted_score = torch.argsort(SC2_measure, dim=2, descending=True) 84 | knn_idx = sorted_score[:, :, 0: k1] 85 | sorted_value, _ = torch.sort(SC2_measure, dim=2, descending=True) 86 | idx_tmp = knn_idx.contiguous().view([bs, -1]) 87 | idx_tmp = idx_tmp[:, :, None] 88 | idx_tmp = idx_tmp.expand(-1, -1, 3) 89 | 90 | ################################# 91 | # construct the local SC2 measure of each consensus subset obtained in the first stage. 92 | ################################# 93 | src_knn = src_keypts.gather(dim=1, index=idx_tmp).view([bs, -1, k1, 3]) # [bs, num_seeds, k, 3] 94 | tgt_knn = tgt_keypts.gather(dim=1, index=idx_tmp).view([bs, -1, k1, 3]) 95 | src_dist = ((src_knn[:, :, :, None, :] - src_knn[:, :, None, :, :]) ** 2).sum(-1) ** 0.5 96 | tgt_dist = ((tgt_knn[:, :, :, None, :] - tgt_knn[:, :, None, :, :]) ** 2).sum(-1) ** 0.5 97 | cross_dist = torch.abs(src_dist - tgt_dist) 98 | local_hard_SC_measure = (cross_dist < self.d_thre).float() 99 | local_SC2_measure = torch.matmul(local_hard_SC_measure[:, :, :1, :], local_hard_SC_measure) 100 | 101 | ################################# 102 | # perform second stage consensus set sampling 103 | ################################# 104 | sorted_score = torch.argsort(local_SC2_measure, dim=3, descending=True) 105 | knn_idx_fine = sorted_score[:, :, :, 0: k2] 106 | 107 | ################################# 108 | # construct the soft SC2 matrix of the consensus set 109 | ################################# 110 | num = knn_idx_fine.shape[1] 111 | knn_idx_fine = knn_idx_fine.contiguous().view([bs, num, -1])[:, :, :, None] 112 | knn_idx_fine = knn_idx_fine.expand(-1, -1, -1, 3) 113 | src_knn_fine = src_knn.gather(dim=2, index=knn_idx_fine).view([bs, -1, k2, 3]) # [bs, num_seeds, k, 3] 114 | tgt_knn_fine = tgt_knn.gather(dim=2, index=knn_idx_fine).view([bs, -1, k2, 3]) 115 | 116 | src_dist = ((src_knn_fine[:, :, :, None, :] - src_knn_fine[:, :, None, :, :]) ** 2).sum(-1) ** 0.5 117 | tgt_dist = ((tgt_knn_fine[:, :, :, None, :] - tgt_knn_fine[:, :, None, :, :]) ** 2).sum(-1) ** 0.5 118 | cross_dist = torch.abs(src_dist - tgt_dist) 119 | local_hard_measure = (cross_dist < self.d_thre * 2).float() 120 | local_SC2_measure = torch.matmul(local_hard_measure, local_hard_measure) / k2 121 | local_SC_measure = torch.clamp(1 - cross_dist ** 2 / self.d_thre ** 2, min=0) 122 | # local_SC2_measure = local_SC_measure * local_SC2_measure 123 | local_SC2_measure = local_SC_measure 124 | local_SC2_measure = local_SC2_measure.view([-1, k2, k2]) 125 | 126 | 127 | ################################# 128 | # Power iteratation to get the inlier probability 129 | ################################# 130 | local_SC2_measure[:, torch.arange(local_SC2_measure.shape[1]), torch.arange(local_SC2_measure.shape[1])] = 0 131 | total_weight = self.cal_leading_eigenvector(local_SC2_measure, method='power') 132 | total_weight = total_weight.view([bs, -1, k2]) 133 | total_weight = total_weight / (torch.sum(total_weight, dim=-1, keepdim=True) + 1e-6) 134 | 135 | ################################# 136 | # calculate the transformation by weighted least-squares for each subsets in parallel 137 | 
################################# 138 | total_weight = total_weight.view([-1, k2]) 139 | src_knn = src_knn_fine 140 | tgt_knn = tgt_knn_fine 141 | src_knn, tgt_knn = src_knn.view([-1, k2, 3]), tgt_knn.view([-1, k2, 3]) 142 | 143 | ################################# 144 | # compute the rigid transformation for each seed by the weighted SVD 145 | ################################# 146 | seedwise_trans = rigid_transform_3d(src_knn, tgt_knn, total_weight) 147 | seedwise_trans = seedwise_trans.view([bs, -1, 4, 4]) 148 | 149 | ################################# 150 | # calculate the inlier number for each hypothesis, and find the best transformation for each point cloud pair 151 | ################################# 152 | pred_position = torch.einsum('bsnm,bmk->bsnk', seedwise_trans[:, :, :3, :3], 153 | src_keypts.permute(0, 2, 1)) + seedwise_trans[:, :, :3, 154 | 3:4] # [bs, num_seeds, num_corr, 3] 155 | ################################# 156 | # calculate the inlier number for each hypothesis, and find the best transformation for each point cloud pair 157 | ################################# 158 | pred_position = pred_position.permute(0, 1, 3, 2) 159 | L2_dis = torch.norm(pred_position - tgt_keypts[:, None, :, :], dim=-1) # [bs, num_seeds, num_corr] 160 | seedwise_fitness = torch.sum((L2_dis < self.inlier_threshold).float(), dim=-1) # [bs, num_seeds] 161 | batch_best_guess = seedwise_fitness.argmax(dim=1) 162 | best_guess_ratio = seedwise_fitness[0, batch_best_guess] 163 | final_trans = seedwise_trans.gather(dim=1,index=batch_best_guess[:, None, None, None].expand(-1, -1, 4, 4)).squeeze(1) 164 | 165 | return final_trans 166 | 167 | def cal_leading_eigenvector(self, M, method='power'): 168 | """ 169 | Calculate the leading eigenvector using power iteration algorithm or torch.symeig 170 | Input: 171 | - M: [bs, num_corr, num_corr] the compatibility matrix 172 | - method: select different method for calculating the learding eigenvector. 173 | Output: 174 | - solution: [bs, num_corr] leading eigenvector 175 | """ 176 | if method == 'power': 177 | # power iteration algorithm 178 | leading_eig = torch.ones_like(M[:, :, 0:1]) 179 | leading_eig_last = leading_eig 180 | for i in range(self.num_iterations): 181 | leading_eig = torch.bmm(M, leading_eig) 182 | leading_eig = leading_eig / (torch.norm(leading_eig, dim=1, keepdim=True) + 1e-6) 183 | if torch.allclose(leading_eig, leading_eig_last): 184 | break 185 | leading_eig_last = leading_eig 186 | leading_eig = leading_eig.squeeze(-1) 187 | return leading_eig 188 | elif method == 'eig': # cause NaN during back-prop 189 | e, v = torch.symeig(M, eigenvectors=True) 190 | leading_eig = v[:, :, -1] 191 | return leading_eig 192 | else: 193 | exit(-1) 194 | 195 | def cal_confidence(self, M, leading_eig, method='eig_value'): 196 | """ 197 | Calculate the confidence of the spectral matching solution based on spectral analysis. 
198 | Input: 199 | - M: [bs, num_corr, num_corr] the compatibility matrix 200 | - leading_eig [bs, num_corr] the leading eigenvector of matrix M 201 | Output: 202 | - confidence 203 | """ 204 | if method == 'eig_value': 205 | # max eigenvalue as the confidence (Rayleigh quotient) 206 | max_eig_value = (leading_eig[:, None, :] @ M @ leading_eig[:, :, None]) / ( 207 | leading_eig[:, None, :] @ leading_eig[:, :, None]) 208 | confidence = max_eig_value.squeeze(-1) 209 | return confidence 210 | elif method == 'eig_value_ratio': 211 | # max eigenvalue / second max eigenvalue as the confidence 212 | max_eig_value = (leading_eig[:, None, :] @ M @ leading_eig[:, :, None]) / ( 213 | leading_eig[:, None, :] @ leading_eig[:, :, None]) 214 | # compute the second largest eigen-value 215 | B = M - max_eig_value * leading_eig[:, :, None] @ leading_eig[:, None, :] 216 | solution = torch.ones_like(B[:, :, 0:1]) 217 | for i in range(self.num_iterations): 218 | solution = torch.bmm(B, solution) 219 | solution = solution / (torch.norm(solution, dim=1, keepdim=True) + 1e-6) 220 | solution = solution.squeeze(-1) 221 | second_eig = solution 222 | second_eig_value = (second_eig[:, None, :] @ B @ second_eig[:, :, None]) / ( 223 | second_eig[:, None, :] @ second_eig[:, :, None]) 224 | confidence = max_eig_value / second_eig_value 225 | return confidence 226 | elif method == 'xMx': 227 | # max xMx as the confidence (x is the binary solution) 228 | # rank = torch.argsort(leading_eig, dim=1, descending=True)[:, 0:int(M.shape[1]*self.ratio)] 229 | # binary_sol = torch.zeros_like(leading_eig) 230 | # binary_sol[0, rank[0]] = 1 231 | confidence = leading_eig[:, None, :] @ M @ leading_eig[:, :, None] 232 | confidence = confidence.squeeze(-1) / M.shape[1] 233 | return confidence 234 | 235 | def post_refinement(self, initial_trans, src_keypts, tgt_keypts, it_num, weights=None): 236 | """ 237 | Perform post refinement using the initial transformation matrix, only adopted during testing. 238 | Input 239 | - initial_trans: [bs, 4, 4] 240 | - src_keypts: [bs, num_corr, 3] 241 | - tgt_keypts: [bs, num_corr, 3] 242 | - weights: [bs, num_corr] 243 | Output: 244 | - final_trans: [bs, 4, 4] 245 | """ 246 | assert initial_trans.shape[0] == 1 247 | 248 | if self.inlier_threshold == 0.10: # for 3DMatch 249 | inlier_threshold_list = [0.10] * it_num 250 | else: # for KITTI 251 | inlier_threshold_list = [0.6] * it_num 252 | 253 | previous_inlier_num = 0 254 | for inlier_threshold in inlier_threshold_list: 255 | warped_src_keypts = transform(src_keypts, initial_trans) 256 | 257 | L2_dis = torch.norm(warped_src_keypts - tgt_keypts, dim=-1) 258 | pred_inlier = (L2_dis < inlier_threshold)[0] # assume bs = 1 259 | inlier_num = torch.sum(pred_inlier) 260 | if abs(int(inlier_num - previous_inlier_num)) < 1: 261 | break 262 | else: 263 | previous_inlier_num = inlier_num 264 | initial_trans = rigid_transform_3d( 265 | A=src_keypts[:, pred_inlier, :], 266 | B=tgt_keypts[:, pred_inlier, :], 267 | ## https://link.springer.com/article/10.1007/s10589-014-9643-2 268 | # weights=None, 269 | weights=1 / (1 + (L2_dis / inlier_threshold) ** 2)[:, pred_inlier], 270 | # weights=((1-L2_dis/inlier_threshold)**2)[:, pred_inlier] 271 | ) 272 | return initial_trans 273 | 274 | def match_pair(self, src_keypts, tgt_keypts, src_features, tgt_features): 275 | N_src = src_features.shape[1] 276 | N_tgt = tgt_features.shape[1] 277 | # use all point or sample points. 
278 | if self.num_node == 'all': 279 | src_sel_ind = np.arange(N_src) 280 | tgt_sel_ind = np.arange(N_tgt) 281 | else: 282 | src_sel_ind = np.random.choice(N_src, self.num_node) 283 | tgt_sel_ind = np.random.choice(N_tgt, self.num_node) 284 | src_desc = src_features[:, src_sel_ind, :] 285 | tgt_desc = tgt_features[:, tgt_sel_ind, :] 286 | src_keypts = src_keypts[:, src_sel_ind, :] 287 | tgt_keypts = tgt_keypts[:, tgt_sel_ind, :] 288 | 289 | # match points in feature space. 290 | distance = torch.sqrt(2 - 2 * (src_desc[0] @ tgt_desc[0].T) + 1e-6) 291 | distance = distance.unsqueeze(0) 292 | source_idx = torch.argmin(distance[0], dim=1) 293 | corr = torch.cat([torch.arange(source_idx.shape[0])[:, None].cuda(), source_idx[:, None]], dim=-1) 294 | 295 | # generate correspondences 296 | src_keypts_corr = src_keypts[:, corr[:, 0]] 297 | tgt_keypts_corr = tgt_keypts[:, corr[:, 1]] 298 | 299 | return src_keypts_corr, tgt_keypts_corr 300 | 301 | def SC2_PCR(self, src_keypts, tgt_keypts): 302 | # print(src_keypts.shape) 303 | # print(tgt_keypts.shape) 304 | """ 305 | Input: 306 | - src_keypts: [bs, num_corr, 3] 307 | - tgt_keypts: [bs, num_corr, 3] 308 | Output: 309 | - pred_trans: [bs, 4, 4], the predicted transformation matrix. 310 | - pred_labels: [bs, num_corr], the predicted inlier/outlier label (0,1), for classification loss calculation. 311 | """ 312 | bs, num_corr = src_keypts.shape[0], tgt_keypts.shape[1] 313 | 314 | ################################# 315 | # downsample points 316 | ################################# 317 | if num_corr > self.max_points: 318 | src_keypts = src_keypts[:, :self.max_points, :] 319 | tgt_keypts = tgt_keypts[:, :self.max_points, :] 320 | num_corr = self.max_points 321 | 322 | ################################# 323 | # compute cross dist 324 | ################################# 325 | src_dist = torch.norm((src_keypts[:, :, None, :] - src_keypts[:, None, :, :]), dim=-1) 326 | target_dist = torch.norm((tgt_keypts[:, :, None, :] - tgt_keypts[:, None, :, :]), dim=-1) 327 | cross_dist = torch.abs(src_dist - target_dist) 328 | 329 | ################################# 330 | # compute first order measure 331 | ################################# 332 | SC_dist_thre = self.d_thre 333 | SC_measure = torch.clamp(1.0 - cross_dist ** 2 / SC_dist_thre ** 2, min=0) 334 | hard_SC_measure = (cross_dist < SC_dist_thre).float() 335 | 336 | ################################# 337 | # select reliable seed correspondences 338 | ################################# 339 | confidence = self.cal_leading_eigenvector(SC_measure, method='power') 340 | seeds = self.pick_seeds(src_dist, confidence, R=self.nms_radius, max_num=int(num_corr * self.ratio)) 341 | 342 | ################################# 343 | # compute second order measure 344 | ################################# 345 | SC2_dist_thre = self.d_thre / 2 346 | hard_SC_measure_tight = (cross_dist < SC2_dist_thre).float() 347 | seed_hard_SC_measure = hard_SC_measure.gather(dim=1, 348 | index=seeds[:, :, None].expand(-1, -1, num_corr)) 349 | seed_hard_SC_measure_tight = hard_SC_measure_tight.gather(dim=1, 350 | index=seeds[:, :, None].expand(-1, -1, num_corr)) 351 | SC2_measure = torch.matmul(seed_hard_SC_measure_tight, hard_SC_measure_tight) * seed_hard_SC_measure 352 | 353 | ################################# 354 | # compute the seed-wise transformations and select the best one 355 | ################################# 356 | final_trans = self.cal_seed_trans(seeds, SC2_measure, src_keypts, tgt_keypts) 357 | 358 | ################################# 359 | # 
refine the result by recomputing the transformation over the whole set 360 | ################################# 361 | # 这里关闭了 362 | # final_trans = self.post_refinement(final_trans, src_keypts, tgt_keypts, 500) 363 | 364 | return final_trans 365 | 366 | def estimator(self, src_keypts_corr, tgt_keypts_corr): 367 | """ 368 | Input: 369 | - src_keypts_corr: [bs, num_corr, 3] 370 | - tgt_keypts_corr: [bs, num_corr, 3] 371 | Output: 372 | - pred_trans: [bs, 4, 4], the predicted transformation matrix 373 | - pred_trans: [bs, num_corr], the predicted inlier/outlier label (0,1) 374 | - src_keypts_corr: [bs, num_corr, 3], the source points in the matched correspondences 375 | - tgt_keypts_corr: [bs, num_corr, 3], the target points in the matched correspondences 376 | """ 377 | ################################# 378 | # generate coarse correspondences 379 | ################################# 380 | # src_keypts_corr, tgt_keypts_corr = self.match_pair(src_keypts, tgt_keypts, src_features, tgt_features) 381 | 382 | ################################# 383 | # use the proposed SC2-PCR to estimate the rigid transformation 384 | ################################# 385 | pred_t = np.zeros((1, 3)) 386 | pred_q = np.zeros((1, 4)) 387 | 388 | pred_trans = self.SC2_PCR(src_keypts_corr, tgt_keypts_corr) 389 | # print(pred_trans) 390 | pred_t[0, :] = pred_trans[0, :3, 3:].detach().squeeze().cpu().numpy() 391 | pred_q[0, :] = txq.mat2quat(pred_trans[0, :3, :3].detach().cpu().numpy()) 392 | pred_T = pred_trans.detach().squeeze().cpu().numpy() 393 | 394 | return pred_t, pred_q, pred_T -------------------------------------------------------------------------------- /nclt_process.py: -------------------------------------------------------------------------------- 1 | import os 2 | import glob 3 | import struct 4 | import numpy as np 5 | 6 | 7 | def convert_nclt(x_s, y_s, z_s): 8 | scaling = 0.005 # 5 mm 9 | offset = -100.0 10 | 11 | x = x_s * scaling + offset 12 | y = y_s * scaling + offset 13 | z = z_s * scaling + offset 14 | 15 | return x, y, z 16 | 17 | 18 | def load_velodyne_binary_nclt(filename): 19 | f_bin = open(filename, "rb") 20 | hits = [] 21 | while True: 22 | x_str = f_bin.read(2) 23 | if x_str == b'': #eof 24 | break 25 | x = struct.unpack(' 1: 368 | # d = 1 369 | # error = 2 * np.arccos(d) * 180 / np.pi0 370 | # d = abs(np.dot(groundtruth, predicted)) 371 | # d = min(1.0, max(-1.0, d)) 372 | 373 | d = np.abs(np.dot(groundtruth, predicted)) 374 | d = np.minimum(1.0, np.maximum(-1.0, d)) 375 | error = 2 * np.arccos(d) * 180 / np.pi 376 | 377 | return error 378 | 379 | 380 | def poses2mats(poses_in): 381 | poses_out = np.zeros((len(poses_in), 3, 3)) # (B, 3, 3) 382 | poses_qua = np.asarray([qexp(q) for q in poses_in.cpu().detach().numpy()]) 383 | 384 | # align 385 | for i in range(len(poses_out)): 386 | R = txq.quat2mat(poses_qua[i]) 387 | poses_out[i, ...] = R 388 | 389 | return poses_out 390 | 391 | 392 | def transform(pts, trans): 393 | """ 394 | Applies the SE3 transformations, support torch.Tensor and np.ndarry. 
Equation: trans_pts = R @ pts + t 395 | Input 396 | - pts: [num_pts, 3] or [bs, num_pts, 3], pts to be transformed 397 | - trans: [4, 4] or [bs, 4, 4], SE3 transformation matrix 398 | Output 399 | - pts: [num_pts, 3] or [bs, num_pts, 3] transformed pts 400 | """ 401 | if len(pts.shape) == 3: 402 | trans_pts = trans[:, :3, :3] @ pts.permute(0, 2, 1) + trans[:, :3, 3:4] 403 | return trans_pts.permute(0, 2, 1) 404 | else: 405 | trans_pts = trans[:3, :3] @ pts.T + trans[:3, 3:4] 406 | return trans_pts.T 407 | 408 | 409 | def rigid_transform_3d(A, B, weights=None, weight_threshold=0): 410 | """ 411 | Input: 412 | - A: [bs, num_corr, 3], source point cloud 413 | - B: [bs, num_corr, 3], target point cloud 414 | - weights: [bs, num_corr] weight for each correspondence 415 | - weight_threshold: float, clips points with weight below threshold 416 | Output: 417 | - R, t 418 | """ 419 | bs = A.shape[0] 420 | if weights is None: 421 | weights = torch.ones_like(A[:, :, 0]) 422 | weights[weights < weight_threshold] = 0 423 | # weights = weights / (torch.sum(weights, dim=-1, keepdim=True) + 1e-6) 424 | 425 | # find mean of point cloud 426 | centroid_A = torch.sum(A * weights[:, :, None], dim=1, keepdim=True) / ( 427 | torch.sum(weights, dim=1, keepdim=True)[:, :, None] + 1e-6) 428 | centroid_B = torch.sum(B * weights[:, :, None], dim=1, keepdim=True) / ( 429 | torch.sum(weights, dim=1, keepdim=True)[:, :, None] + 1e-6) 430 | 431 | # subtract mean 432 | Am = A - centroid_A 433 | Bm = B - centroid_B 434 | 435 | # construct weight covariance matrix 436 | Weight = torch.diag_embed(weights) 437 | H = Am.permute(0, 2, 1) @ Weight @ Bm 438 | 439 | # find rotation 440 | U, S, Vt = torch.svd(H.cpu()) 441 | U, S, Vt = U.to(weights.device), S.to(weights.device), Vt.to(weights.device) 442 | delta_UV = torch.det(Vt @ U.permute(0, 2, 1)) 443 | eye = torch.eye(3)[None, :, :].repeat(bs, 1, 1).to(A.device) 444 | eye[:, -1, -1] = delta_UV 445 | R = Vt @ eye @ U.permute(0, 2, 1) 446 | t = centroid_B.permute(0, 2, 1) - R @ centroid_A.permute(0, 2, 1) 447 | # warp_A = transform(A, integrate_trans(R,t)) 448 | # RMSE = torch.sum( (warp_A - B) ** 2, dim=-1).mean() 449 | return integrate_trans(R, t) 450 | 451 | 452 | def estimate_pose(gt_pc, pred_pc, threshold=1.4, device='cuda'): 453 | # print(source_pc.shape) 454 | source_pc = gt_pc.cpu().numpy().reshape(-1, 3) 455 | target_pc = pred_pc[:, :3].cpu().numpy().reshape(-1, 3) 456 | num_points = source_pc.shape[0] 457 | pred_t = np.zeros((1, 3)) 458 | pred_q = np.zeros((1, 4)) 459 | index1 = np.arange(0, num_points) 460 | index2 = np.arange(0, num_points) 461 | # np.random.shuffle(index1) 462 | index1 = np.expand_dims(index1, axis=1) 463 | index2 = np.expand_dims(index2, axis=1) 464 | corr = np.concatenate((index1, index2), axis=1) 465 | 466 | source_xyz = source_pc 467 | target_xyz = target_pc 468 | source = open3d.geometry.PointCloud() 469 | target = open3d.geometry.PointCloud() 470 | source.points = open3d.utility.Vector3dVector(source_xyz) 471 | target.points = open3d.utility.Vector3dVector(target_xyz) 472 | corres = open3d.utility.Vector2iVector(corr) 473 | 474 | M = open3d.pipelines.registration.registration_ransac_based_on_correspondence( 475 | source, 476 | target, 477 | corres, 478 | threshold, 479 | open3d.pipelines.registration.TransformationEstimationPointToPoint(False), 480 | 3, 481 | [ 482 | open3d.pipelines.registration. 
483 | CorrespondenceCheckerBasedOnEdgeLength(0.9), 484 | open3d.pipelines.registration.CorrespondenceCheckerBasedOnDistance(threshold) 485 | ], 486 | open3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999)) 487 | 488 | pred_t[0, :] = M.transformation[:3, 3:].squeeze() 489 | pred_q[0, :] = txq.mat2quat(M.transformation[:3, :3]) 490 | 491 | return pred_t, pred_q 492 | 493 | 494 | def integrate_trans(R, t): 495 | """ 496 | Integrate SE3 transformations from R and t, support torch.Tensor and np.ndarry. 497 | Input 498 | - R: [3, 3] or [bs, 3, 3], rotation matrix 499 | - t: [3, 1] or [bs, 3, 1], translation matrix 500 | Output 501 | - trans: [4, 4] or [bs, 4, 4], SE3 transformation matrix 502 | """ 503 | if len(R.shape) == 3: 504 | if isinstance(R, torch.Tensor): 505 | trans = torch.eye(4)[None].repeat(R.shape[0], 1, 1).to(R.device) 506 | else: 507 | trans = np.eye(4)[None] 508 | trans[:, :3, :3] = R 509 | trans[:, :3, 3:4] = t.view([-1, 3, 1]) 510 | else: 511 | if isinstance(R, torch.Tensor): 512 | trans = torch.eye(4).to(R.device) 513 | else: 514 | trans = np.eye(4) 515 | trans[:3, :3] = R 516 | trans[:3, 3:4] = t 517 | return trans 518 | 519 | 520 | def calc_vos_simple(poses): 521 | """ 522 | calculate the VOs, from a list of consecutive poses 523 | :param poses: N x T x 7 524 | :return: N x (T-1) x 7 525 | """ 526 | vos = [] 527 | for p in poses: 528 | pvos = [p[i + 1].unsqueeze(0) - p[i].unsqueeze(0) for i in range(len(p) - 1)] 529 | vos.append(torch.cat(pvos, dim=0)) 530 | vos = torch.stack(vos, dim=0) 531 | 532 | return vos 533 | 534 | 535 | def filter_overflow_ts(filename, ts_raw): 536 | # 537 | file_data = pd.read_csv(filename) 538 | base_name = osp.basename(filename) 539 | 540 | if base_name.find('vo') > -1: 541 | ts_key = 'source_timestamp' 542 | else: 543 | ts_key = 'timestamp' 544 | 545 | pose_timestamps = file_data[ts_key].values 546 | min_pose_timestamps = min(pose_timestamps) 547 | max_pose_timestamps = max(pose_timestamps) 548 | ts_filted = [t for t in ts_raw if min_pose_timestamps < t < max_pose_timestamps] 549 | abandon_num = len(ts_raw) - len(ts_filted) 550 | print('abandom %d pointclouds that exceed the range of %s' % (abandon_num, filename)) 551 | 552 | return ts_filted 553 | 554 | 555 | def filter_overflow_nclt(gt_filename, ts_raw): # 滤波函数 556 | # gt_filename: GT对应的文件名 557 | # ts_raw: 原始数据集提供的点云时间戳 558 | ground_truth = np.loadtxt(gt_filename, delimiter=",")[1:, 0] 559 | min_pose_timestamps = min(ground_truth) 560 | max_pose_timestamps = max(ground_truth) 561 | ts_filted = [t for t in ts_raw if min_pose_timestamps < t < max_pose_timestamps] 562 | abandon_num = len(ts_raw) - len(ts_filted) 563 | print('abandom %d pointclouds that exceed the range of %s' % (abandon_num, gt_filename)) 564 | 565 | return ts_filted 566 | 567 | 568 | def interpolate_pose_nclt(gt_filename, ts_raw): # 插值函数 569 | # gt_filename: GT对应文件名 570 | # ts_raw: 滤波后的点云时间戳 571 | ground_truth = np.loadtxt(gt_filename, delimiter=",") 572 | ground_truth = ground_truth[np.logical_not(np.any(np.isnan(ground_truth), 1))] 573 | interp = scipy.interpolate.interp1d(ground_truth[:, 0], ground_truth[:, 1:], kind='nearest', axis=0) 574 | pose_gt = interp(ts_raw) 575 | 576 | return pose_gt 577 | 578 | 579 | def so3_to_euler_nclt(poses_in): 580 | N = len(poses_in) 581 | poses_out = np.zeros((N, 4, 4)) 582 | for i in range(N): 583 | poses_out[i, :, :] = build_se3_transform([poses_in[i, 0], poses_in[i, 1], poses_in[i, 2], 584 | poses_in[i, 3], poses_in[i, 4], poses_in[i, 5]]) 585 | 586 | return poses_out 587 | 
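The helpers above (`transform`, `rigid_transform_3d`, `integrate_trans`) form a small weighted-SVD (Kabsch) toolkit for SE(3) transforms. The following minimal round-trip sketch is not part of the repository and the variable names are illustrative only: build a known transform with `integrate_trans`, warp points with `transform`, then recover the transform from the noise-free correspondences with `rigid_transform_3d`.
```
import torch
from utils.pose_util import integrate_trans, rigid_transform_3d, transform

# Known ground-truth transform: 90-degree rotation about z plus a translation.
R_gt = torch.tensor([[[0.0, -1.0, 0.0],
                      [1.0,  0.0, 0.0],
                      [0.0,  0.0, 1.0]]])        # [bs, 3, 3]
t_gt = torch.tensor([[[1.0], [2.0], [0.5]]])     # [bs, 3, 1]
gt_trans = integrate_trans(R_gt, t_gt)           # [bs, 4, 4]

src = torch.rand(1, 100, 3)                      # [bs, num_corr, 3] source points
tgt = transform(src, gt_trans)                   # perfect correspondences in the target frame

est_trans = rigid_transform_3d(src, tgt)         # weighted SVD with uniform weights
assert torch.allclose(est_trans, gt_trans, atol=1e-4)
```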
588 | 589 | def convert_nclt(x_s, y_s, z_s): # 输入点云转换函数 590 | # 文档种提供的转换函数 591 | # 原文档返回为x, y, z,但在绘制可视化图时z取负,此处先取负 592 | scaling = 0.005 # 5 mm 593 | offset = -100.0 594 | 595 | x = x_s * scaling + offset 596 | y = y_s * scaling + offset 597 | z = z_s * scaling + offset 598 | 599 | return x, y, -z 600 | 601 | 602 | def load_velodyne_binary_nclt(filename): # 读入二进制点云 603 | f_bin = open(filename, "rb") 604 | hits = [] 605 | while True: 606 | x_str = f_bin.read(2) 607 | if x_str == b'': # eof 608 | break 609 | x = struct.unpack(' None: 30 | np.random.seed(seed) 31 | torch.manual_seed(seed) 32 | random.seed(seed) 33 | 34 | 35 | class VizStats(Stats): 36 | def plot_stats( 37 | self, viz=None, visdom_env=None, plot_file=None, visdom_server=None, visdom_port=None 38 | ): 39 | # use the cached visdom env if none supplied 40 | if visdom_env is None: 41 | visdom_env = self.visdom_env 42 | if visdom_server is None: 43 | visdom_server = self.visdom_server 44 | if visdom_port is None: 45 | visdom_port = self.visdom_port 46 | if plot_file is None: 47 | plot_file = self.plot_file 48 | 49 | stat_sets = list(self.stats.keys()) 50 | 51 | logger.debug( 52 | f"printing charts to visdom env '{visdom_env}' ({visdom_server}:{visdom_port})" 53 | ) 54 | 55 | novisdom = False 56 | 57 | if viz is None: 58 | viz = get_visdom_connection(server=visdom_server, port=visdom_port) 59 | 60 | if viz is None or not viz.check_connection(): 61 | logger.info("no visdom server! -> skipping visdom plots") 62 | novisdom = True 63 | 64 | lines = [] 65 | 66 | # plot metrics 67 | if not novisdom: 68 | viz.close(env=visdom_env, win=None) 69 | 70 | for stat in self.log_vars: 71 | vals = [] 72 | stat_sets_now = [] 73 | for stat_set in stat_sets: 74 | val = self.stats[stat_set][stat].get_epoch_averages() 75 | if val is None: 76 | continue 77 | else: 78 | val = np.array(val).reshape(-1) 79 | stat_sets_now.append(stat_set) 80 | vals.append(val) 81 | 82 | if len(vals) == 0: 83 | continue 84 | 85 | lines.append((stat_sets_now, stat, vals)) 86 | 87 | if not novisdom: 88 | for tmodes, stat, vals in lines: 89 | title = "%s" % stat 90 | opts = {"title": title, "legend": list(tmodes)} 91 | for i, (tmode, val) in enumerate(zip(tmodes, vals)): 92 | update = "append" if i > 0 else None 93 | valid = np.where(np.isfinite(val))[0] 94 | if len(valid) == 0: 95 | continue 96 | x = np.arange(len(val)) 97 | viz.line( 98 | Y=val[valid], 99 | X=x[valid], 100 | env=visdom_env, 101 | opts=opts, 102 | win=f"stat_plot_{title}", 103 | name=tmode, 104 | update=update, 105 | ) 106 | 107 | if plot_file: 108 | logger.info(f"plotting stats to {plot_file}") 109 | ncol = 3 110 | nrow = int(np.ceil(float(len(lines)) / ncol)) 111 | matplotlib.rcParams.update({"font.size": 5}) 112 | color = cycle(plt.cm.tab10(np.linspace(0, 1, 10))) 113 | fig = plt.figure(1) 114 | plt.clf() 115 | for idx, (tmodes, stat, vals) in enumerate(lines): 116 | c = next(color) 117 | plt.subplot(nrow, ncol, idx + 1) 118 | plt.gca() 119 | for vali, vals_ in enumerate(vals): 120 | c_ = c * (1.0 - float(vali) * 0.3) 121 | valid = np.where(np.isfinite(vals_))[0] 122 | if len(valid) == 0: 123 | continue 124 | x = np.arange(len(vals_)) 125 | plt.plot(x[valid], vals_[valid], c=c_, linewidth=1) 126 | plt.ylabel(stat) 127 | plt.xlabel("epoch") 128 | plt.gca().yaxis.label.set_color(c[0:3] * 0.75) 129 | plt.legend(tmodes) 130 | gcolor = np.array(mcolors.to_rgba("lightgray")) 131 | grid_params = {"visible": True, "color": gcolor} 132 | plt.grid(**grid_params, which="major", linestyle="-", linewidth=0.4) 133 | 
plt.grid(**grid_params, which="minor", linestyle="--", linewidth=0.2) 134 | plt.minorticks_on() 135 | 136 | plt.tight_layout() 137 | plt.show() 138 | try: 139 | fig.savefig(plot_file) 140 | except PermissionError: 141 | warnings.warn("Cant dump stats due to insufficient permissions!") 142 | 143 | 144 | def rotation_angle(rot_gt, rot_pred, batch_size=None): 145 | # rot_gt, rot_pred (B, 3, 3) 146 | # masks_flat: B, 1 147 | rel_angle_cos = so3_relative_angle(rot_gt, rot_pred, eps=1e-4) 148 | rel_rangle_deg = rel_angle_cos * 180 / np.pi 149 | 150 | if batch_size is not None: 151 | rel_rangle_deg = rel_rangle_deg.reshape(batch_size, -1) 152 | 153 | return rel_rangle_deg 154 | 155 | 156 | def translation_angle(tvec_gt, tvec_pred, batch_size=None): 157 | rel_tangle_deg = evaluate_translation_batch(tvec_gt, tvec_pred) 158 | rel_tangle_deg = rel_tangle_deg * 180.0 / np.pi 159 | 160 | if batch_size is not None: 161 | rel_tangle_deg = rel_tangle_deg.reshape(batch_size, -1) 162 | 163 | return rel_tangle_deg 164 | 165 | 166 | def evaluate_translation_batch(t_gt, t, eps=1e-15, default_err=1e6): 167 | """Normalize the translation vectors and compute the angle between them.""" 168 | t_norm = torch.norm(t, dim=1, keepdim=True) 169 | t = t / (t_norm + eps) 170 | 171 | t_gt_norm = torch.norm(t_gt, dim=1, keepdim=True) 172 | t_gt = t_gt / (t_gt_norm + eps) 173 | 174 | loss_t = torch.clamp_min(1.0 - torch.sum(t * t_gt, dim=1) ** 2, eps) 175 | err_t = torch.acos(torch.sqrt(1 - loss_t)) 176 | 177 | err_t[torch.isnan(err_t) | torch.isinf(err_t)] = default_err 178 | return err_t 179 | 180 | 181 | def batched_all_pairs(B, N): 182 | # B, N = se3.shape[:2] 183 | i1_, i2_ = torch.combinations( 184 | torch.arange(N), 2, with_replacement=False 185 | ).unbind(-1) 186 | i1, i2 = [ 187 | (i[None] + torch.arange(B)[:, None] * N).reshape(-1) 188 | for i in [i1_, i2_] 189 | ] 190 | 191 | return i1, i2 192 | 193 | 194 | class WarmupCosineRestarts(torch.optim.lr_scheduler._LRScheduler): 195 | def __init__(self, optimizer, T_0, iters_per_epoch, T_mult=1, eta_min=0, warmup_ratio=0.1, warmup_lr_init=1e-7, 196 | last_epoch=-1): 197 | self.T_0 = T_0 * iters_per_epoch # 50 * 156988 198 | self.T_mult = T_mult # 1 199 | self.eta_min = eta_min # 0 200 | self.warmup_iters = int(T_0 * warmup_ratio * iters_per_epoch) # int(50 * 156988 * 0.1 * 156988) 201 | self.warmup_lr_init = warmup_lr_init # 1e-7 202 | super(WarmupCosineRestarts, self).__init__(optimizer, last_epoch) 203 | 204 | def get_lr(self): 205 | if self.T_mult == 1: 206 | i_restart = self.last_epoch // self.T_0 # 算出是否需要restart 207 | T_cur = self.last_epoch - i_restart * self.T_0 # 208 | else: 209 | n = int(math.log((self.last_epoch / self.T_0 * (self.T_mult - 1) + 1), self.T_mult)) 210 | T_cur = self.last_epoch - self.T_0 * (self.T_mult ** n - 1) // (self.T_mult - 1) 211 | 212 | if T_cur < self.warmup_iters: 213 | warmup_ratio = T_cur / self.warmup_iters 214 | return [self.warmup_lr_init + (base_lr - self.warmup_lr_init) * warmup_ratio for base_lr in self.base_lrs] 215 | else: 216 | T_cur_adjusted = T_cur - self.warmup_iters 217 | T_i = self.T_0 - self.warmup_iters 218 | return [self.eta_min + (base_lr - self.eta_min) * (1 + math.cos(math.pi * T_cur_adjusted / T_i)) / 2 219 | for base_lr in self.base_lrs] 220 | 221 | 222 | class WarmupCosineLR(toptim._LRScheduler): 223 | ''' Warmup learning rate scheduler. 224 | Initially, increases the learning rate from 0 to the final value, in a 225 | certain number of steps. 
After this number of steps, each step decreases 226 | LR exponentially. 227 | ''' 228 | 229 | def __init__(self, optimizer, lr, warmup_steps, momentum, max_steps): 230 | # cyclic params 231 | self.optimizer = optimizer 232 | self.lr = lr 233 | self.warmup_steps = warmup_steps 234 | self.momentum = momentum 235 | 236 | # cap to one 237 | if self.warmup_steps < 1: 238 | self.warmup_steps = 1 239 | 240 | # cyclic lr 241 | self.cosine_scheduler = toptim.CosineAnnealingLR( 242 | self.optimizer, T_max=max_steps) 243 | 244 | self.initial_scheduler = toptim.CyclicLR(self.optimizer, 245 | base_lr=0, 246 | max_lr=self.lr, 247 | step_size_up=self.warmup_steps, 248 | step_size_down=self.warmup_steps, 249 | cycle_momentum=False, 250 | base_momentum=self.momentum, 251 | max_momentum=self.momentum) 252 | 253 | self.last_epoch = -1 254 | self.finished = False 255 | super().__init__(optimizer) 256 | 257 | def step(self, epoch=None): 258 | if self.finished or self.initial_scheduler.last_epoch >= self.warmup_steps: 259 | if not self.finished: 260 | self.base_lrs = [self.lr for lr in self.base_lrs] 261 | self.finished = True 262 | return self.cosine_scheduler.step(epoch) 263 | else: 264 | return self.initial_scheduler.step(epoch) 265 | 266 | 267 | class CosineLR(toptim._LRScheduler): 268 | ''' learning rate scheduler. 269 | Each step decreases LR exponentially. 270 | ''' 271 | 272 | def __init__(self, optimizer, max_steps): 273 | # cyclic params 274 | self.optimizer = optimizer 275 | # cyclic lr 276 | self.cosine_scheduler = toptim.CosineAnnealingLR( 277 | self.optimizer, T_max=max_steps) 278 | 279 | super().__init__(optimizer) 280 | 281 | def step(self, epoch=None): 282 | return self.cosine_scheduler.step(epoch) 283 | 284 | 285 | # 训练策略 286 | POWER = 0.9 287 | 288 | 289 | def lr_poly(base_lr, iter, max_iter, power): 290 | return base_lr * ((1 - float(iter) / max_iter) ** power) 291 | 292 | 293 | def lr_warmup(base_lr, iter, max_iter, warmup_iter): 294 | return base_lr * (float(iter) / warmup_iter) 295 | 296 | 297 | def adjust_learning_rate(lr, i_iter, max_iter, PREHEAT_STEPS): 298 | if i_iter < PREHEAT_STEPS: 299 | lr = lr_warmup(lr, i_iter, max_iter, PREHEAT_STEPS) 300 | else: 301 | lr = lr_poly(lr, i_iter, max_iter, POWER) 302 | 303 | return lr 304 | 305 | 306 | class DynamicBatchSampler(BatchSampler): 307 | def __init__(self, num_sequences, dataset_len=1024, max_images=128, images_per_seq=(3, 20)): 308 | # len(dataset): 32; cfg.train.len_train: 16384; max_image: 512; images_per_seq: [3, 51] 309 | # self.dataset = dataset 310 | self.max_images = max_images 311 | self.images_per_seq = list(range(images_per_seq[0], images_per_seq[1])) 312 | self.num_sequences = num_sequences 313 | self.dataset_len = dataset_len 314 | 315 | def _capped_random_choice(self, x, size, replace: bool = True): 316 | len_x = x if isinstance(x, int) else len(x) 317 | if replace: 318 | return np.random.choice(x, size=size, replace=len_x < size) 319 | else: 320 | return np.random.choice(x, size=min(size, len_x), replace=False) 321 | 322 | def __iter__(self): 323 | for batch_idx in range(self.dataset_len): 324 | # NOTE batch_idx is never used later 325 | # print(f"process {batch_idx}") 326 | n_per_seq = np.random.choice(self.images_per_seq) 327 | n_seqs = (self.max_images // n_per_seq) 328 | 329 | chosen_seq = self._capped_random_choice(self.num_sequences, n_seqs) 330 | # print(f"get the chosen_seq for {batch_idx}") 331 | 332 | batches = [(bidx, n_per_seq) for bidx in chosen_seq] 333 | # print(f"yield the batches for {batch_idx}") 334 | 
yield batches 335 | 336 | def __len__(self): 337 | return self.dataset_len 338 | 339 | 340 | class FixBatchSampler(torch.utils.data.Sampler): 341 | def __init__(self, dataset, dataset_len=1024, batch_size=64, max_images=128, images_per_seq=(3, 20)): 342 | # dataset; len_train: 16384; max_images: 512; images_per_seq: [3, 51] 343 | self.dataset = dataset 344 | self.max_images = max_images # 32 345 | self.images_per_seq = list(range(images_per_seq[0], images_per_seq[1])) # [3, ..., 21] 346 | self.num_sequences = len(self.dataset) # 4 347 | self.dataset_len = dataset_len # 156988 348 | self.batch_size = 48 # batch_size 349 | self.fix_images_per_seq = True 350 | 351 | def _capped_random_choice(self, x, size, replace: bool = True): 352 | len_x = x if isinstance(x, int) else len(x) 353 | if replace: 354 | return np.random.choice(x, size=size, replace=len_x < size) 355 | else: 356 | return np.random.choice(x, size=min(size, len_x), replace=False) 357 | 358 | def __iter__(self): 359 | for batch_idx in range(self.dataset_len): 360 | # NOTE batch_idx is never used later 361 | # print(f"process {batch_idx}") 362 | if self.fix_images_per_seq: 363 | # n_per_seq = 12 364 | n_per_seq = 1 365 | else: 366 | n_per_seq = np.random.choice(self.images_per_seq) 367 | 368 | n_seqs = self.batch_size 369 | 370 | chosen_seq = self._capped_random_choice(self.num_sequences, n_seqs) 371 | # print(f"get the chosen_seq for {batch_idx}") 372 | 373 | batches = [(bidx, n_per_seq) for bidx in chosen_seq] 374 | # print(f"yield the batches for {batch_idx}") 375 | yield batches 376 | 377 | def __len__(self): 378 | return self.dataset_len 379 | 380 | 381 | def find_last_checkpoint( 382 | exp_dir, any_path: bool = False, all_checkpoints: bool = False 383 | ): 384 | if any_path: 385 | exts = [".pth", "_stats.jgz", "_opt.pth"] 386 | else: 387 | exts = [".pth"] 388 | 389 | for ext in exts: 390 | fls = sorted( 391 | glob.glob( 392 | os.path.join(glob.escape(exp_dir), "model_epoch_" + "[0-9]" * 8 + ext) 393 | ) 394 | ) 395 | if len(fls) > 0: 396 | break 397 | # pyre-fixme[61]: `fls` is undefined, or not always defined. 398 | if len(fls) == 0: 399 | fl = None 400 | else: 401 | if all_checkpoints: 402 | # pyre-fixme[61]: `fls` is undefined, or not always defined. 403 | fl = [f[0: -len(ext)] + ".pth" for f in fls] 404 | else: 405 | # pyre-fixme[61]: `ext` is undefined, or not always defined. 406 | fl = fls[-1][0: -len(ext)] + ".pth" 407 | 408 | return fl 409 | 410 | 411 | def get_equal_size_feat(feat_3d, sup_point, cls_label, B, pcs_num, max_select): 412 | # sample or duplicate points so that every sample ends up with a fixed number of points 413 | bs = B 414 | feat_3d_copy = feat_3d.F 415 | sup_point_copy = sup_point 416 | feat_3d_new = [] 417 | sup_point_new = [] 418 | for bs_i in range(bs): 419 | # 1. get the indices of batch element bs_i 420 | temp_bs_i_ind = feat_3d.C[:, 0] == bs_i 421 | # 2. get the features of batch element bs_i 422 | temp_bs_i_ft = feat_3d_copy[temp_bs_i_ind] 423 | temp_bs_i_sp = sup_point_copy[temp_bs_i_ind] 424 | # 3. 
draw pcs_num - len(temp_bs_i_ft) extra points for each sample 425 | len_rand_inds = pcs_num - len(temp_bs_i_ft) 426 | if len_rand_inds > 0: 427 | # fewer than pcs_num points: pad by randomly duplicating existing points 428 | rand_inds = np.random.choice(len(temp_bs_i_ft), len_rand_inds, replace=True) 429 | # for feat 430 | temp_rand_bs_i_ft = temp_bs_i_ft[rand_inds] 431 | new_bs_i_ft = torch.cat((temp_bs_i_ft, temp_rand_bs_i_ft), 0) 432 | # for sup 433 | temp_rand_bs_i_sp = temp_bs_i_sp[rand_inds] 434 | new_bs_i_sp = torch.cat((temp_bs_i_sp, temp_rand_bs_i_sp), 0) 435 | else: 436 | # more than pcs_num points: randomly draw pcs_num of them 437 | rand_inds = np.random.choice(len(temp_bs_i_ft), pcs_num, replace=True) 438 | new_bs_i_ft = temp_bs_i_ft[rand_inds] 439 | new_bs_i_sp = temp_bs_i_sp[rand_inds] 440 | 441 | # for feat 442 | feat_3d_new.append(new_bs_i_ft) 443 | # for sp 444 | sup_point_new.append(new_bs_i_sp) 445 | 446 | # flatten 447 | feat_3d_new = normalize_shape(torch.stack(feat_3d_new))[:max_select] 448 | sup_point_new = normalize_shape(torch.stack(sup_point_new))[:max_select] 449 | label_new = normalize_shape(cls_label.unsqueeze(1).repeat(1, pcs_num, 1))[:max_select] # [B, 3] 450 | 451 | # without flattening, the shapes here would be [B, N, C] 452 | # feat_3d_new = torch.stack(feat_3d_new) 453 | # sup_point_new = torch.stack(sup_point_new) 454 | # label_new = cls_label.unsqueeze(1).repeat(1, pcs_num, 1) # [B, N, 3] 455 | 456 | return feat_3d_new, sup_point_new, label_new 457 | 458 | 459 | def normalize_shape(tensor_in): 460 | """Bring tensor from shape BxNxC to BNxC""" 461 | return tensor_in.transpose(0, 2).flatten(1).transpose(0, 1) --------------------------------------------------------------------------------
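To tie the pieces together, the sketch below shows how the SC2-PCR `Matcher` from `models/sc2pcr.py` can recover a 6-DoF pose from per-point correspondences, such as the scene coordinates predicted by the regressor in `models/lightloc.py`. This is a simplified, hypothetical example rather than the evaluation pipeline in `test.py`: the correspondences are synthetic, and the thresholds and tensor shapes are illustrative only.
```
import torch
from models.sc2pcr import Matcher

matcher = Matcher(inlier_threshold=1.4, d_thre=2, max_points=2000)

# Synthetic correspondences: the same points expressed in two frames related by a
# known rigid transform (90-degree rotation about z plus a translation).
src = torch.rand(1, 500, 3)                 # [bs, num_corr, 3], e.g. points in the sensor frame
R = torch.tensor([[[0.0, -1.0, 0.0],
                   [1.0,  0.0, 0.0],
                   [0.0,  0.0, 1.0]]])
t = torch.tensor([[[5.0, -2.0, 0.3]]])
tgt = src @ R.transpose(1, 2) + t           # e.g. predicted scene (map-frame) coordinates

# estimator() runs SC2-PCR and returns the translation, the (w, x, y, z) quaternion
# and the full 4x4 transformation matrix.
pred_t, pred_q, pred_T = matcher.estimator(src, tgt)
```
In the actual pipeline, one point set comes from the input scan and the other from the regressor's per-point scene-coordinate predictions; the exact correspondence direction and thresholds are set in `test.py`.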