├── .gitignore
├── DATA.md
├── README.md
├── data
│   └── getIMU_KITTI.py
├── dataset
│   ├── KITTIDataset.py
│   └── MalagaDataset.py
├── eval
│   ├── associate.py
│   ├── devkit
│   │   ├── cpp
│   │   │   ├── evaluate_odometry
│   │   │   ├── evaluate_odometry.cpp
│   │   │   ├── io.h
│   │   │   ├── mail.h
│   │   │   ├── matrix.cpp
│   │   │   └── matrix.h
│   │   └── readme.txt
│   ├── gt
│   │   ├── poses
│   │   │   ├── 00.txt
│   │   │   ├── 01.txt
│   │   │   ├── 02.txt
│   │   │   ├── 04.txt
│   │   │   ├── 05.txt
│   │   │   ├── 06.txt
│   │   │   ├── 07.txt
│   │   │   ├── 08.txt
│   │   │   ├── 09.txt
│   │   │   ├── 10.txt
│   │   │   └── ori_08.txt
│   │   └── timestamp
│   │       ├── 00.txt
│   │       ├── 01.txt
│   │       ├── 02.txt
│   │       ├── 04.txt
│   │       ├── 05.txt
│   │       ├── 06.txt
│   │       ├── 07.txt
│   │       ├── 08.txt
│   │       ├── 09.txt
│   │       ├── 10.txt
│   │       └── ori_08.txt
│   ├── kitti_09.txt
│   ├── kitti_10.txt
│   ├── malaga_03.txt
│   ├── malaga_07.txt
│   └── malaga_09.txt
├── imgs
│   ├── .DS_Store
│   ├── disp_test
│   │   ├── 000000.png
│   │   ├── 000000_disp.png
│   │   └── predictions.npy
│   ├── kitti_09_final.gif
│   └── kitti_10_final.gif
├── losses
│   ├── InverseDepthSmoothnessLoss.py
│   ├── __init__.py
│   └── warp_loss_function.py
├── models
│   ├── DispResNet.py
│   ├── __init__.py
│   ├── models.py
│   └── monodepth2
│       ├── __init__.py
│       ├── depth_decoder.py
│       ├── layers.py
│       └── resnet_encoder.py
├── requirements.txt
├── test_disp.py
├── test_pose.py
├── train.py
└── utils
    ├── __init__.py
    ├── custom_transform.py
    ├── inverse_warp.py
    ├── logger.py
    ├── plot_traj.py
    ├── pose_transfer.py
    └── utils.py

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | .idea/
2 | */.idea/
3 | __pycache__/
4 | */__pycache__/
5 | *.pyc
6 | history/
7 | scripts/
8 | venv/
--------------------------------------------------------------------------------
/DATA.md:
--------------------------------------------------------------------------------
1 | Note that we resize the images of both
2 | the KITTI dataset and the Malaga dataset from their original resolution to $832 \times 256$.
3 | 
4 | Download [KITTI_dataset.tar.gz](https://pan.baidu.com/s/1GZx-v71DDBLhICxAcsJ1pg?pwd=xsf2) and [Malaga_dataset.tar.gz](https://pan.baidu.com/s/1SWlr2SiDxByoLRZiml3MRQ?pwd=fjf2)
5 | and extract them into the folders `KITTI_rec_256` and `Malaga_down` in `$DATA_ROOT`.
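For example, assuming the two archives have been downloaded into `$DATA_ROOT` and that each archive unpacks into the folder of the same name (an assumption on our part, not stated in the original instructions), extraction might look like:

```shell
cd $DATA_ROOT
# assumed to unpack into KITTI_rec_256/ and Malaga_down/ respectively
tar -xzf KITTI_dataset.tar.gz
tar -xzf Malaga_dataset.tar.gz
```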
6 | Then download [kitti_ckpt.tar.gz](https://pan.baidu.com/s/1LVadWtHBXJfh0Qz_lS63HQ?pwd=dj2f) and [malaga_ckpt.tar.gz](https://pan.baidu.com/s/1t2iSBwq_NpJ1PCKUOpIFMg?pwd=bggp) and extract them into
7 | the folders `kitti_ckpt` and `malaga_ckpt` in `$ROOT`.
8 | 
9 | ## KITTI dataset
10 | 
11 | We list the sequence indices of the KITTI Odometry split and the corresponding folder names
12 | in the KITTI raw split (03 is not included in the training set because its IMU data is not available):
13 | 
14 | | Seq index | Name                  |
15 | |-----------|-----------------------|
16 | | 00        | 2011_10_03_drive_0027 |
17 | | 01        | 2011_10_03_drive_0042 |
18 | | 02        | 2011_10_03_drive_0034 |
19 | | 03        | 2011_09_26_drive_0067 |
20 | | 04        | 2011_09_30_drive_0016 |
21 | | 05        | 2011_09_30_drive_0018 |
22 | | 06        | 2011_09_30_drive_0020 |
23 | | 07        | 2011_09_30_drive_0027 |
24 | | 08        | 2011_09_30_drive_0028 |
25 | | 09        | 2011_09_30_drive_0033 |
26 | | 10        | 2011_09_30_drive_0034 |
27 | 
28 | The processed KITTI data for training and validation is placed in the folder `KITTI_rec_256` as
29 | follows:
30 | 
31 | ```
32 | ├── 2011_09_26_drive_0067_sync_02
33 | ├── 2011_09_30_drive_0016_sync_02
34 | ├── 2011_09_30_drive_0018_sync_02
35 | ├── 2011_09_30_drive_0020_sync_02
36 | ├── 2011_09_30_drive_0027_sync_02
37 | ├── 2011_09_30_drive_0028_sync_02
38 | ├── 2011_09_30_drive_0033_sync_02
39 | ├── 2011_09_30_drive_0034_sync_02
40 | ├── 2011_10_03_drive_0027_sync_02
41 | ├── 2011_10_03_drive_0034_sync_02
42 | ├── 2011_10_03_drive_0042_sync_02
43 | ├── train.txt
44 | └── val.txt
45 | ```
46 | 
47 | Each sub-folder contains images, camera intrinsic parameters, sparse IMU
48 | values, dense IMU values, ground-truth poses and sampled IMU data,
49 | arranged as follows:
50 | 
51 | ```
52 | ├── 000000xxxx.jpg
53 | ├── ......
54 | ├── cam.txt
55 | ├── oxts.csv
56 | ├── oxts_ori.csv
57 | ├── poses.csv
58 | ├── sampled_imu_index1_index2.npy
59 | 
60 | 
61 | ```
62 | 
63 | ## Malaga dataset
64 | 
65 | The folder structure of the Malaga dataset is as follows:
66 | 
67 | ```
68 | ├── 01
69 | ├── 02
70 | ├── ...
71 | ```
72 | 
73 | Each folder contains IMU data, images and timestamps, arranged
74 | as follows:
75 | 
76 | ```
77 | ├── imu
78 | ├── left_256
79 | ├──── xxxxxxxxxx.xxxxxx.jpg
80 | ├──── ......
81 | ├──── cam.txt
82 | ├── data.csv
83 | ```
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Unsupervised Network for Visual Inertial Odometry
2 | IJCAI 2020 paper: Unsupervised Network for Visual Inertial Odometry.
3 | 
4 | | KITTI 09 | KITTI 10 |
5 | | ------------- | ------------- |
6 | | ![aa](./imgs/kitti_09_final.gif) | ![bb](./imgs/kitti_10_final.gif) |
7 | 
8 | 
9 | ## Introduction
10 | This repository is the official [PyTorch](https://pytorch.org/) implementation of the IJCAI 2020 paper [Unsupervised Network for Visual Inertial Odometry](https://robotics.pkusz.edu.cn/static/papers/IJCAI-weipeng.pdf).
11 | 
12 | 
13 | ## Installation
14 | 
15 | UnVIO has been tested on Ubuntu with PyTorch 1.4 and Python 3.7.10. For installation, it is recommended to use a conda environment:
16 | 
17 | 
18 | ```shell
19 | conda create -n unvio_env python=3.7.10
20 | conda activate unvio_env
21 | pip install -r requirements.txt
22 | ```
23 | 
24 | The following system package should also be installed:
25 | ```shell
26 | sudo apt install gnuplot
27 | ```
28 | 
29 | ## Data Preparing
30 | 
31 | The datasets used in this paper are the [KITTI raw](http://www.cvlibs.net/datasets/kitti/raw_data.php) dataset
32 | and the [Malaga](https://www.mrpt.org/MalagaUrbanDataset) dataset. Please refer to [Data preparing](DATA.md) for detailed
33 | instructions.
34 | 
35 | ## Validation
36 | 
37 | Validation covers depth estimation and odometry estimation.
38 | First specify the model path and the dataset path:
39 | 
40 | ```shell
41 | ROOT='MODEL_ROOT_HERE'
42 | DATA_ROOT='DATA_ROOT_HERE'
43 | ```
44 | 
45 | ### Depth Estimation
46 | 
47 | For depth estimation on KITTI 09 (to test on KITTI 10, change
48 | `--dataset-list` to `./eval/kitti_10.txt`; the Malaga sequences work the same way), run the following commands:
49 | 
50 | ```shell
51 | ROOT=$ROOT/kitti_ckpt
52 | #ROOT=$ROOT/malaga_ckpt
53 | DATA_ROOT=$DATA_ROOT/KITTI_rec_256/
54 | #DATA_ROOT=$DATA_ROOT/Malaga_down/
55 | ```
56 | 
57 | ```shell
58 | python test_disp.py \
59 | --pretrained-dispnet $ROOT/UnVIO_dispnet.pth.tar \
60 | --dataset-dir $DATA_ROOT \
61 | --dataset-list ./eval/kitti_09.txt \
62 | --output-dir $ROOT/results_disp \
63 | --save-depth
64 | ```
65 | 
66 | The `predictions.npy` file that stores all the depth values will be saved in `$ROOT/results_disp`. If `--save-depth` is added, the colored depth maps will also be saved in `$ROOT/results_disp/disp`.
67 | 
68 | ### Visual Odometry
69 | 
70 | For odometry estimation on KITTI 09 (to test on KITTI 10, change `testscene` to `2011_09_30_drive_0034_sync_02`), run the following commands:
71 | 
72 | ```shell
73 | ROOT=$ROOT/kitti_ckpt
74 | DATA_ROOT=$DATA_ROOT
75 | ```
76 | 
77 | 
78 | ```shell
79 | python test_pose.py \
80 | --pretrained-visualnet $ROOT/UnVIO_visualnet.pth.tar \
81 | --pretrained-imunet $ROOT/UnVIO_imunet.pth.tar \
82 | --pretrained-posenet $ROOT/UnVIO_posenet.pth.tar \
83 | --dataset_root $DATA_ROOT \
84 | --dataset KITTI \
85 | --testscene 2011_09_30_drive_0033_sync_02 \
86 | --show-traj
87 | ```
88 | 
89 | This will create a `.csv` file representing $T_{wc} \in \mathbb{R}^{3 \times 4}$ in the `$ROOT` directory. If `--show-traj` is added, a scale-aligned trajectory is plotted against the ground truth.
90 | 
91 | ## Train
92 | 
93 | Run the following command to train UnVIO from scratch:
94 | 
95 | ```shell
96 | DATA_ROOT=$DATA_ROOT
97 | ```
98 | 
99 | ```shell
100 | python train.py --dataset_root $DATA_ROOT --dataset KITTI
101 | ```
102 | 
103 | Specify `--dataset` (KITTI or Malaga) as needed.
104 | 
105 | ## Citation
106 | 
107 | ```
108 | @inproceedings{2020Unsupervised,
109 |   title={Unsupervised Monocular Visual-inertial Odometry Network},
110 |   author={Wei, P. and Hua, G. and Huang, W. and Meng, F. and Liu, H.},
111 |   booktitle={Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence {IJCAI-PRICAI-20}},
112 |   year={2020},
113 | }
114 | ```
115 | 
116 | ## License
117 | 
118 | This project is licensed under the terms of the MIT license.
119 | 120 | ## References 121 | 122 | The repository borrowed some code from [SC](https://github.com/JiawangBian/SC-SfMLearner-Release), [Monodepth2](https://github.com/nianticlabs/monodepth2.git) and [SfMLearner](https://github.com/ClementPinard/SfmLearner-Pytorch), thanks for their great work. 123 | 124 | 125 | -------------------------------------------------------------------------------- /data/getIMU_KITTI.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | ''' 3 | generate IMU data corresponding to image timestamp, which will be saved as .npy file 4 | ''' 5 | import pandas as pd 6 | import numpy as np 7 | 8 | def time2timestamp(time_): 9 | time1 = time_.split('.')[0] 10 | time2 = time_.split('.')[1] 11 | h, m, s = time1.split(':') 12 | 13 | integer = str(int(h)*3600 + int(m)*60 + int(s)) 14 | decimal = time2 15 | 16 | time_stamp = int(integer+decimal) 17 | return time_stamp 18 | 19 | def getClosestIndex(file_time, original_imu): 20 | foundIdx = 0 21 | for i in range(0, len(original_imu)): 22 | if time2timestamp(original_imu[i][0]) >= file_time: 23 | foundIdx = i 24 | break 25 | return foundIdx 26 | 27 | def get_IMUS(dataset_dir, before, after): 28 | for scene in dataset: 29 | scene_dir = dataset_dir + '/' + scene 30 | filecsv = np.array(pd.read_csv(scene_dir + '/oxts.csv')) 31 | 32 | # read original IMU data 33 | original_imu = np.array(pd.read_csv(scene_dir + '/oxts_ori.csv')) 34 | 35 | sampled_imu = [] 36 | for i in range(len(filecsv)): 37 | index_imu = [] 38 | file_time = time2timestamp(filecsv[i][0]) 39 | foundIdx = getClosestIndex(file_time, original_imu) 40 | for j in range(before, after+1): 41 | if foundIdx+j <= len(original_imu)-1: 42 | temp_imu = original_imu[foundIdx+j][1:] 43 | else: 44 | temp_imu = original_imu[len(original_imu)-1][1:] 45 | index_imu.append(temp_imu) 46 | sampled_imu.append(index_imu) 47 | 48 | sampled_imu = np.array(sampled_imu) 49 | 50 | np.save(scene_dir + '/sampled_imu_{}_{}.npy'.format(before, after), sampled_imu) 51 | 52 | return 53 | 54 | if __name__ == "__main__": 55 | dataset = {'2011_10_03_drive_0042_sync_02':'01', 56 | '2011_10_03_drive_0034_sync_02':'02', 57 | '2011_10_03_drive_0027_sync_02':'00', 58 | '2011_09_30_drive_0033_sync_02':'09', 59 | '2011_09_30_drive_0028_sync_02':'08', 60 | '2011_09_30_drive_0027_sync_02':'07', 61 | '2011_09_30_drive_0020_sync_02':'06', 62 | '2011_09_30_drive_0018_sync_02':'05', 63 | '2011_09_30_drive_0016_sync_02':'04', 64 | '2011_09_30_drive_0034_sync_02':'10'} 65 | 66 | dataset_dir = '/data3/Datasets/weip/KITTI/KITTI_rec_256' 67 | get_IMUS(dataset_dir, -10, 0) 68 | -------------------------------------------------------------------------------- /dataset/KITTIDataset.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | import torch.utils.data as data 3 | import sys 4 | sys.path.append("..") 5 | from imageio import imread 6 | from path import Path 7 | import random 8 | import pandas as pd 9 | import time 10 | from utils.pose_transfer import * 11 | 12 | train_set = {'2011_10_03_drive_0042_sync_02':'01', 13 | '2011_10_03_drive_0034_sync_02':'02', 14 | '2011_10_03_drive_0027_sync_02':'00', 15 | '2011_09_30_drive_0028_sync_02':'08', 16 | '2011_09_30_drive_0027_sync_02':'07', 17 | '2011_09_30_drive_0020_sync_02':'06', 18 | '2011_09_30_drive_0018_sync_02':'05', 19 | '2011_09_30_drive_0016_sync_02':'04' 20 | } 21 | test_set = {'2011_09_30_drive_0033_sync_02':'09', 22 | '2011_09_30_drive_0034_sync_02':'10'} 23 | 
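# (Added note, not in the original file) The two dictionaries above map KITTI raw
# drive folders (the trailing `_02` presumably refers to the image_02 / left color
# camera stream) to KITTI Odometry sequence indices; sequences 09 and 10 are held
# out for testing, and sequence 03 (2011_09_26_drive_0067) is absent because its
# IMU data is unavailable, matching the table in DATA.md.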
24 | def load_as_float(path): 25 | return imread(path).astype(np.float32) 26 | 27 | def pose12to16(mat): 28 | if mat.ndim == 1: 29 | mat = mat.reshape(3, -1) 30 | mat = np.vstack([mat, [0, 0, 0, 1]]) 31 | return mat 32 | else: 33 | mat = np.vstack([mat, [0, 0, 0, 1]]) 34 | return mat 35 | 36 | def mat_to_6dof(mat): 37 | if mat.shape[0] == 3: 38 | mat = pose12to16(mat) 39 | else: 40 | translation = list(mat[:3,3]) 41 | rotation = list(euler_from_matrix(mat)) 42 | pose = rotation + translation 43 | return pose 44 | 45 | def absolute2Relative(seqGT): 46 | sequence_length = len(seqGT) 47 | seqGT_mat = [pose12to16(item) for item in seqGT] 48 | seqGT_Rela_mat = [] 49 | seqGT_Rela_mat.append(seqGT_mat[0]) 50 | seqGT_Rela_Eul = [] 51 | seqGT_Rela_Eul.append(mat_to_6dof(seqGT_mat[0])) 52 | for i in range(1, sequence_length): 53 | seqGT_Rela_mat.append(np.linalg.inv(seqGT_mat[i-1]) @ seqGT_mat[i]) 54 | seqGT_Rela_mat = np.array(seqGT_Rela_mat) 55 | return seqGT_Rela_mat 56 | 57 | class DataSequence(data.Dataset): 58 | def __init__(self, root, 59 | seed=None, 60 | train=True, 61 | sequence_length=3, 62 | imu_range=[0, 0], 63 | transform=None, 64 | shuffle=True, 65 | scene='default', 66 | image_width=640, 67 | image_height=480): 68 | np.random.seed(seed) 69 | random.seed(seed) 70 | self.root = Path(root) 71 | self.settype = scene 72 | if self.settype == 'default': 73 | scene_list = train_set if train == True else test_set 74 | else: 75 | scene_list = scene 76 | self.scenes = [self.root/folder for folder in scene_list] 77 | self.transform = transform 78 | self.imu_range = imu_range 79 | self.shuffle = shuffle 80 | self.crawl_folders(sequence_length) 81 | 82 | def crawl_folders(self, sequence_length): 83 | sequence_set = [] 84 | demi_length = (sequence_length-1)//2 85 | shifts = list(range(-demi_length, demi_length+1)) 86 | 87 | for scene in self.scenes: 88 | imgs = sorted((scene).files('*.jpg')) 89 | intrinsics = np.genfromtxt(scene/'cam.txt').astype(np.float32).reshape((3, 3)) 90 | try: 91 | imus = np.load(scene/'sampled_imu_{}_{}.npy'.format(self.imu_range[0], self.imu_range[1]), allow_pickle=True).astype(np.float32) 92 | except EOFError as e: 93 | print("No npy files 'sampled_imu_{}_{}.npy' as commmand specified".format(self.imu_range[0], self.imu_range[1])) 94 | GT = np.array(pd.read_csv(scene/'poses.txt', sep=' ', header=None)) 95 | 96 | for i in range(demi_length, len(imgs)-demi_length): 97 | sample = {'imgs':[], 'imus':[], 'intrinsics': intrinsics, 'gts': []} 98 | seq_GT = [] 99 | # put the target in the middle 100 | for j in shifts: 101 | sample['imgs'].append(imgs[i+j]) 102 | sample['imus'].append(imus[i+j]) 103 | seq_GT.append(GT[i+j]) 104 | seq_GT = absolute2Relative(seq_GT) 105 | sample['gts'].append(seq_GT) 106 | sequence_set.append(sample) 107 | 108 | if self.shuffle: 109 | random.shuffle(sequence_set) 110 | self.samples = sequence_set 111 | 112 | def __getitem__(self, index): 113 | sample = self.samples[index] 114 | imgs = [load_as_float(img) for img in sample['imgs']] 115 | gts = np.squeeze(np.array(sample['gts'])).astype(np.float32) 116 | if self.transform is not None: 117 | imgs, imus, intrinsics = self.transform(imgs, np.copy(sample['imus']), np.copy(sample['intrinsics'])) 118 | else: 119 | intrinsics = np.copy(sample['intrinsics']) 120 | imus = np.copy(sample['imus']) 121 | if self.settype == 'default': 122 | return imgs, imus, intrinsics, np.linalg.inv(intrinsics) 123 | else: 124 | return imgs, imus, intrinsics, gts 125 | 126 | def __len__(self): 127 | return len(self.samples) 128 
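# (Added note, not in the original file) DataSequence is a regular
# torch.utils.data.Dataset, so training code can wrap it in a DataLoader.
# The lines below are only an illustrative sketch -- the root path, batch size,
# worker count and transform are assumptions, not the settings used by train.py:
#
#     from torch.utils.data import DataLoader
#     train_data = DataSequence(root='/path/to/KITTI_rec_256', train=True,
#                               sequence_length=5, imu_range=[-10, 0])
#     loader = DataLoader(train_data, batch_size=4, shuffle=True,
#                         num_workers=4, pin_memory=True)
#     imgs, imus, intrinsics, intrinsics_inv = next(iter(loader))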
| 129 | if __name__ == "__main__": 130 | start_time = time.time() 131 | D = DataSequence( 132 | sequence_length=5, 133 | root='/data3/Datasets/weip/KITTI/KITTI_rec_256', 134 | imu_range=[-10, 0], 135 | shuffle=False, 136 | image_width=832, 137 | image_height=256, 138 | scene=['2011_09_30_drive_0033_sync_02'] 139 | ) 140 | img, imu, intr, gt = D[0] 141 | print('time used {}'.format(time.time()-start_time)) 142 | -------------------------------------------------------------------------------- /dataset/MalagaDataset.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | import torch.utils.data as data 3 | import sys 4 | sys.path.append("..") 5 | from imageio import imread 6 | from path import Path 7 | import random 8 | import cv2 9 | import time 10 | from utils.pose_transfer import * 11 | 12 | train_set = ['01', '02', '04', '05', '06', '08'] 13 | test_set = ['03', '07', '09'] 14 | 15 | def load_as_float(path, width, height): 16 | return cv2.resize(imread(path), (width, height)).astype(np.float32) 17 | 18 | def pose12to16(mat): 19 | if mat.ndim == 1: 20 | mat = mat.reshape(3, -1) 21 | mat = np.vstack([mat, [0, 0, 0, 1]]) 22 | return mat 23 | else: 24 | mat = np.vstack([mat, [0, 0, 0, 1]]) 25 | return mat 26 | 27 | def mat_to_6dof(mat): 28 | if mat.shape[0] == 3: 29 | mat = pose12to16(mat) 30 | else: 31 | translation = list(mat[:3,3]) 32 | rotation = list(euler_from_matrix(mat)) 33 | pose = rotation + translation 34 | return pose 35 | 36 | def absolute2Relative(seqGT): 37 | sequence_length = len(seqGT) 38 | seqGT_mat = [pose12to16(item) for item in seqGT] 39 | seqGT_Rela_mat = [] 40 | seqGT_Rela_mat.append(seqGT_mat[0]) 41 | seqGT_Rela_Eul = [] 42 | seqGT_Rela_Eul.append(mat_to_6dof(seqGT_mat[0])) 43 | for i in range(1, sequence_length): 44 | seqGT_Rela_Eul.append(mat_to_6dof(np.linalg.inv(seqGT_mat[i-1]) @ seqGT_mat[i])) 45 | seqGT_Rela_Eul = np.array(seqGT_Rela_Eul) 46 | 47 | return seqGT_Rela_Eul 48 | 49 | class DataSequence(data.Dataset): 50 | 51 | def __init__(self, 52 | root, 53 | seed=None, 54 | train=True, 55 | sequence_length=3, 56 | imu_range=[0, 0], 57 | transform=None, 58 | shuffle=True, 59 | scene='default', 60 | image_width=640, 61 | image_height=480): 62 | np.random.seed(seed) 63 | random.seed(seed) 64 | self.root = Path(root) 65 | if scene == 'default': 66 | scene_list = train_set if train == True else test_set 67 | else: 68 | scene_list = scene 69 | self.scenes = [self.root/folder for folder in scene_list] 70 | self.transform = transform 71 | self.imu_range = imu_range 72 | self.shuffle = shuffle 73 | self.image_height = image_height 74 | self.image_width = image_width 75 | self.crawl_folders(sequence_length) 76 | 77 | def crawl_folders(self, sequence_length): 78 | sequence_set = [] 79 | demi_length = (sequence_length-1)//2 80 | shifts = list(range(-demi_length, demi_length+1)) 81 | 82 | for scene in self.scenes: 83 | imgs = sorted((scene/'left_256').files('*.jpg')) 84 | intrinsics = np.genfromtxt(scene/'left_256'/'cam.txt').astype(np.float32).reshape((3, 3)) 85 | intrinsics[0, :] *= (self.image_width/832) 86 | intrinsics[1, :] *= (self.image_height/256) 87 | try: 88 | imus = np.load(scene/'imu/sampled_imu_{}_{}.npy'.format(self.imu_range[0],self.imu_range[1])).astype(np.float32) 89 | except EOFError as e: 90 | print("No npy files 'sampled_imu_{}_{}.npy' as commmand specified".format(self.imu_range[0],self.imu_range[1])) 91 | for i in range(demi_length, len(imgs)-demi_length): 92 | sample = {'imgs': [], 'imus': [], 
'intrinsics': intrinsics} 93 | # put the target in the middle 94 | for j in shifts: 95 | sample['imgs'].append(imgs[i+j]) 96 | sample['imus'].append(imus[i+j]) 97 | sequence_set.append(sample) 98 | 99 | if self.shuffle: 100 | random.shuffle(sequence_set) 101 | self.samples = sequence_set 102 | 103 | def __getitem__(self, index): 104 | sample = self.samples[index] 105 | imgs = [load_as_float(img, self.image_width, self.image_height) for img in sample['imgs']] 106 | if self.transform is not None: 107 | imgs, imus, intrinsics = self.transform(imgs, np.copy(sample['imus']), np.copy(sample['intrinsics'])) 108 | else: 109 | intrinsics = np.copy(sample['intrinsics']) 110 | imus = np.copy(sample['imus']) 111 | return imgs, imus, intrinsics, np.linalg.inv(intrinsics) 112 | 113 | def __len__(self): 114 | return len(self.samples) 115 | 116 | if __name__ == "__main__": 117 | start_time = time.time() 118 | D = DataSequence( 119 | sequence_length=5, 120 | root='/data3/Datasets/weip/Malaga/Malaga_Down/', 121 | imu_range=[-10, 0], 122 | shuffle=False, 123 | image_width=832, 124 | image_height=256, 125 | train=True, 126 | ) 127 | img, imu, intr, inr_inv = D[390] 128 | print('time used {}'.format(time.time()-start_time)) -------------------------------------------------------------------------------- /eval/associate.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | # Software License Agreement (BSD License) 3 | # 4 | # Copyright (c) 2013, Juergen Sturm, TUM 5 | # All rights reserved. 6 | # 7 | # Redistribution and use in source and binary forms, with or without 8 | # modification, are permitted provided that the following conditions 9 | # are met: 10 | # 11 | # * Redistributions of source code must retain the above copyright 12 | # notice, this list of conditions and the following disclaimer. 13 | # * Redistributions in binary form must reproduce the above 14 | # copyright notice, this list of conditions and the following 15 | # disclaimer in the documentation and/or other materials provided 16 | # with the distribution. 17 | # * Neither the name of TUM nor the names of its 18 | # contributors may be used to endorse or promote products derived 19 | # from this software without specific prior written permission. 20 | # 21 | # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 22 | # "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT 23 | # LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS 24 | # FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE 25 | # COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, 26 | # INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, 27 | # BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; 28 | # LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 29 | # CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 30 | # LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN 31 | # ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 32 | # POSSIBILITY OF SUCH DAMAGE. 33 | # 34 | # Requirements: 35 | # sudo apt-get install python-argparse 36 | 37 | """ 38 | The Kinect provides the color and depth images in an un-synchronized way. This means that the set of time stamps from the color images do not intersect with those of the depth images. Therefore, we need some way of associating color images to depth images. 
39 | 40 | For this purpose, you can use the ''associate.py'' script. It reads the time stamps from the rgb.txt file and the depth.txt file, and joins them by finding the best matches. 41 | """ 42 | 43 | import argparse 44 | import sys 45 | import os 46 | import numpy 47 | 48 | def read_file_list(filename): 49 | """ 50 | Reads a trajectory from a text file. 51 | 52 | File format: 53 | The file format is "stamp d1 d2 d3 ...", where stamp denotes the time stamp (to be matched) 54 | and "d1 d2 d3.." is arbitary data (e.g., a 3D position and 3D orientation) associated to this timestamp. 55 | 56 | Input: 57 | filename -- File name 58 | 59 | Output: 60 | dict -- dictionary of (stamp,data) tuples 61 | 62 | """ 63 | file = open(filename) 64 | data = file.read() 65 | lines = data.replace(",", " ").replace("\t", " ").split("\n") 66 | list = [[v.strip() for v in line.split(" ") if v.strip() != ""] for line in lines if len(line)>0 and line[0] != "#"] 67 | list = [(float(l[0]), l[1:]) for l in list if len(l) > 1] 68 | return dict(list) 69 | 70 | def associate(first_list, second_list,offset,max_difference): 71 | """ 72 | Associate two dictionaries of (stamp,data). As the time stamps never match exactly, we aim 73 | to find the closest match for every input tuple. 74 | 75 | Input: 76 | first_list -- first dictionary of (stamp,data) tuples 77 | second_list -- second dictionary of (stamp,data) tuples 78 | offset -- time offset between both dictionaries (e.g., to model the delay between the sensors) 79 | max_difference -- search radius for candidate generation 80 | 81 | Output: 82 | matches -- list of matched tuples ((stamp1,data1),(stamp2,data2)) 83 | 84 | """ 85 | first_keys = first_list.keys() 86 | second_keys = second_list.keys() 87 | potential_matches = [(abs(a - (b + offset)), a, b) 88 | for a in first_keys 89 | for b in second_keys 90 | if abs(a - (b + offset)) < max_difference] 91 | potential_matches.sort() 92 | matches = [] 93 | for diff, a, b in potential_matches: 94 | if a in first_keys and b in second_keys: 95 | first_keys.remove(a) 96 | second_keys.remove(b) 97 | matches.append((a, b)) 98 | 99 | matches.sort() 100 | return matches 101 | 102 | if __name__ == '__main__': 103 | 104 | # parse command line 105 | parser = argparse.ArgumentParser(description=''' 106 | This script takes two data files with timestamps and associates them 107 | ''') 108 | parser.add_argument('first_file', help='first text file (format: timestamp data)') 109 | parser.add_argument('second_file', help='second text file (format: timestamp data)') 110 | parser.add_argument('--first_only', help='only output associated lines from first file', action='store_true') 111 | parser.add_argument('--offset', help='time offset added to the timestamps of the second file (default: 0.0)',default=0.0) 112 | parser.add_argument('--max_difference', help='maximally allowed time difference for matching entries (default: 0.02)',default=0.02) 113 | args = parser.parse_args() 114 | 115 | first_list = read_file_list(args.first_file) 116 | second_list = read_file_list(args.second_file) 117 | 118 | matches = associate(first_list, second_list,float(args.offset),float(args.max_difference)) 119 | 120 | if args.first_only: 121 | for a,b in matches: 122 | print("%f %s"%(a," ".join(first_list[a]))) 123 | else: 124 | for a,b in matches: 125 | print("%f %s %f %s"%(a," ".join(first_list[a]),b-float(args.offset)," ".join(second_list[b]))) 126 | 127 | -------------------------------------------------------------------------------- 
/eval/devkit/cpp/evaluate_odometry: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Ironbrotherstyle/UnVIO/173e8ef646074da45cb64c77da36e474e4746031/eval/devkit/cpp/evaluate_odometry -------------------------------------------------------------------------------- /eval/devkit/cpp/evaluate_odometry.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | #include 3 | #include 4 | #include 5 | #include 6 | #include 7 | #include 8 | #include "matrix.h" 9 | #include 10 | // #include 11 | 12 | using namespace std; 13 | 14 | // static parameter 15 | // float lengths[] = {5,10,50,100,150,200,250,300,350,400}; 16 | float lengths[] = {100,200,300,400,500,600,700,800}; 17 | int32_t num_lengths = 8; 18 | 19 | struct errors { 20 | int32_t first_frame; 21 | float r_err; 22 | float t_err; 23 | float len; 24 | float speed; 25 | errors (int32_t first_frame,float r_err,float t_err,float len,float speed) : 26 | first_frame(first_frame),r_err(r_err),t_err(t_err),len(len),speed(speed) {} 27 | }; 28 | 29 | vector loadPoses(string file_name) { 30 | vector poses; 31 | FILE *fp = fopen(file_name.c_str(),"r"); 32 | if (!fp) 33 | return poses; 34 | while (!feof(fp)) { 35 | Matrix P = Matrix::eye(4); 36 | if (fscanf(fp, "%lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf %lf", 37 | &P.val[0][0], &P.val[0][1], &P.val[0][2], &P.val[0][3], 38 | &P.val[1][0], &P.val[1][1], &P.val[1][2], &P.val[1][3], 39 | &P.val[2][0], &P.val[2][1], &P.val[2][2], &P.val[2][3] )==12) { 40 | poses.push_back(P); 41 | } 42 | } 43 | fclose(fp); 44 | return poses; 45 | } 46 | 47 | //计算的变换矩阵最后一列的位姿绝对误差方根之和 48 | vector trajectoryDistances (vector &poses) { 49 | vector dist; 50 | dist.push_back(0); 51 | for (int32_t i=1; i &dist,int32_t first_frame,float len) { 64 | for (int32_t i=first_frame; idist[first_frame]+len) 66 | return i; 67 | return -1; 68 | } 69 | 70 | // 旋转误差由对角线决定 71 | inline float rotationError(Matrix &pose_error) { 72 | float a = pose_error.val[0][0]; 73 | float b = pose_error.val[1][1]; 74 | float c = pose_error.val[2][2]; 75 | float d = 0.5*(a+b+c-1.0); 76 | return acos(max(min(d,1.0f),-1.0f)); 77 | } 78 | 79 | // 平移误差直接由平移的平方决定 80 | inline float translationError(Matrix &pose_error) { 81 | float dx = pose_error.val[0][3]; 82 | float dy = pose_error.val[1][3]; 83 | float dz = pose_error.val[2][3]; 84 | return sqrt(dx*dx+dy*dy+dz*dz); 85 | } 86 | 87 | vector calcSequenceErrors (vector &poses_gt,vector &poses_result) { 88 | 89 | // error vector 90 | vector err; 91 | 92 | // parameters 93 | int32_t step_size = 10; // every second 10fps 因此隔10张就是一秒钟 94 | 95 | // pre-compute distances (from ground truth as reference) 计算GT求和的的绝对轨迹误差 96 | vector dist = trajectoryDistances(poses_gt); 97 | 98 | // for all start positions do 10张10张的取样 99 | for (int32_t first_frame=0; first_frame &err,string file_name) { 135 | 136 | // open file 137 | FILE *fp; 138 | fp = fopen(file_name.c_str(),"w"); 139 | 140 | // write to file 141 | for (vector::iterator it=err.begin(); it!=err.end(); it++) 142 | fprintf(fp,"%d %f %f %f %f\n",it->first_frame,it->r_err,it->t_err,it->len,it->speed); 143 | 144 | // close file 145 | fclose(fp); 146 | } 147 | 148 | void savePathPlot (vector &poses_gt,vector &poses_result,string file_name) { 149 | 150 | // parameters 151 | int32_t step_size = 3; 152 | 153 | // open file 154 | FILE *fp = fopen(file_name.c_str(),"w"); 155 | 156 | // save x/z coordinates of all frames to file 把两个的xz坐标存下来 157 | for (int32_t i=0; i computeRoi 
(vector &poses_gt,vector &poses_result) { 167 | 168 | float x_min = numeric_limits::max(); 169 | float x_max = numeric_limits::min(); 170 | float z_min = numeric_limits::max(); 171 | float z_max = numeric_limits::min(); 172 | 173 | for (vector::iterator it=poses_gt.begin(); it!=poses_gt.end(); it++) { 174 | float x = it->val[0][3]; 175 | float z = it->val[2][3]; 176 | if (xx_max) x_max = x; 177 | if (zz_max) z_max = z; 178 | } 179 | 180 | for (vector::iterator it=poses_result.begin(); it!=poses_result.end(); it++) { 181 | float x = it->val[0][3]; 182 | float z = it->val[2][3]; 183 | if (xx_max) x_max = x; 184 | if (zz_max) z_max = z; 185 | } 186 | 187 | // 计算出变化量和中值,画出感兴趣区域图像 188 | float dx = 1.1*(x_max-x_min); 189 | float dz = 1.1*(z_max-z_min); 190 | float mx = 0.5*(x_max+x_min); 191 | float mz = 0.5*(z_max+z_min); 192 | float r = 0.5*max(dx,dz); 193 | 194 | vector roi; 195 | roi.push_back((int32_t)(mx-r)); 196 | roi.push_back((int32_t)(mx+r)); 197 | roi.push_back((int32_t)(mz-r)); 198 | roi.push_back((int32_t)(mz+r)); 199 | return roi; 200 | } 201 | 202 | // 这个函数用来输出记录? 203 | void plotPathPlot (string dir,vector &roi,int32_t idx) { 204 | 205 | // gnuplot file name 206 | char command[1024]; 207 | char file_name[256]; 208 | sprintf(file_name,"%02d.gp",idx); 209 | string full_name = dir + "/" + file_name; 210 | 211 | // create png + eps 212 | for (int32_t i=0; i<2; i++) { 213 | 214 | // open file 215 | FILE *fp = fopen(full_name.c_str(),"w"); 216 | 217 | // save gnuplot instructions 218 | if (i==0) { 219 | fprintf(fp,"set term png size 900,900\n"); 220 | fprintf(fp,"set output \"%02d.png\"\n",idx); 221 | } else { 222 | fprintf(fp,"set term postscript eps enhanced color\n"); 223 | fprintf(fp,"set output \"%02d.eps\"\n",idx); 224 | } 225 | 226 | fprintf(fp,"set size ratio -1\n"); 227 | fprintf(fp,"set xrange [%d:%d]\n",roi[0],roi[1]); 228 | fprintf(fp,"set yrange [%d:%d]\n",roi[2],roi[3]); 229 | fprintf(fp,"set xlabel \"x [m]\"\n"); 230 | fprintf(fp,"set ylabel \"z [m]\"\n"); 231 | fprintf(fp,"plot \"%02d.txt\" using 1:2 lc rgb \"#FF0000\" title 'Ground Truth' w lines,",idx); 232 | fprintf(fp,"\"%02d.txt\" using 3:4 lc rgb \"#0000FF\" title 'Visual Odometry' w lines,",idx); 233 | fprintf(fp,"\"< head -1 %02d.txt\" using 1:2 lc rgb \"#000000\" pt 4 ps 1 lw 2 title 'Sequence Start' w points\n",idx); 234 | 235 | // close file 236 | fclose(fp); 237 | 238 | // run gnuplot => create png + eps 239 | sprintf(command,"cd %s; gnuplot %s",dir.c_str(),file_name); 240 | if(system(command)); 241 | } 242 | 243 | // create pdf and crop 244 | // sprintf(command,"cd %s; ps2pdf %02d.eps %02d_large.pdf",dir.c_str(),idx,idx); 245 | // if(system(command)); 246 | // sprintf(command,"cd %s; pdfcrop %02d_large.pdf %02d.pdf",dir.c_str(),idx,idx); 247 | // if(system(command)); 248 | // sprintf(command,"cd %s; rm %02d_large.pdf",dir.c_str(),idx); 249 | // if(system(command)); 250 | } 251 | 252 | void saveErrorPlots(vector &seq_err,string plot_error_dir,char* prefix) { 253 | 254 | // file names 生成这四个文件的名字 255 | char file_name_tl[1024]; sprintf(file_name_tl,"%s/%s_tl.txt",plot_error_dir.c_str(),prefix); 256 | char file_name_rl[1024]; sprintf(file_name_rl,"%s/%s_rl.txt",plot_error_dir.c_str(),prefix); 257 | char file_name_ts[1024]; sprintf(file_name_ts,"%s/%s_ts.txt",plot_error_dir.c_str(),prefix); 258 | char file_name_rs[1024]; sprintf(file_name_rs,"%s/%s_rs.txt",plot_error_dir.c_str(),prefix); 259 | 260 | // open files 这几个值分别是平移相对于长度,旋转相对于长度,平移相对于速度,旋转相对于速度。 261 | FILE *fp_tl = fopen(file_name_tl,"w"); 262 | FILE *fp_rl 
= fopen(file_name_rl,"w"); 263 | FILE *fp_ts = fopen(file_name_ts,"w"); 264 | FILE *fp_rs = fopen(file_name_rs,"w"); 265 | 266 | // for each segment length do 267 | for (int32_t i=0; i::iterator it=seq_err.begin(); it!=seq_err.end(); it++) { 275 | if (fabs(it->len-lengths[i])<1.0) { 276 | t_err += it->t_err; 277 | r_err += it->r_err; 278 | num++; 279 | } 280 | } 281 | 282 | // we require at least 3 values 有三个不同长度以上的话 283 | if (num>2.5) { 284 | fprintf(fp_tl,"%f %f\n",lengths[i],t_err/num); 285 | fprintf(fp_rl,"%f %f\n",lengths[i],r_err/num); 286 | } 287 | } 288 | 289 | // for each driving speed do (in m/s) 在一个序列下,遍历速度,如果有片段速度和给定的接近,就把误差求和。 290 | for (float speed=2; speed<25; speed+=2) { 291 | 292 | float t_err = 0; 293 | float r_err = 0; 294 | float num = 0; 295 | 296 | // for all errors do 297 | for (vector::iterator it=seq_err.begin(); it!=seq_err.end(); it++) { 298 | if (fabs(it->speed-speed)<2.0) { 299 | t_err += it->t_err; 300 | r_err += it->r_err; 301 | num++; 302 | } 303 | } 304 | 305 | // we require at least 3 values 306 | if (num>2.5) { 307 | fprintf(fp_ts,"%f %f\n",speed,t_err/num); 308 | fprintf(fp_rs,"%f %f\n",speed,r_err/num); 309 | } 310 | } 311 | 312 | // close files 313 | fclose(fp_tl); 314 | fclose(fp_rl); 315 | fclose(fp_ts); 316 | fclose(fp_rs); 317 | } 318 | 319 | // 画出所有的error 320 | void plotErrorPlots (string dir,char* prefix) { 321 | 322 | char command[1024]; 323 | 324 | // for all four error plots do 循环四重指标 325 | for (int32_t i=0; i<4; i++) { 326 | 327 | // create suffix 328 | char suffix[16]; 329 | switch (i) { 330 | case 0: sprintf(suffix,"tl"); break; 331 | case 1: sprintf(suffix,"rl"); break; 332 | case 2: sprintf(suffix,"ts"); break; 333 | case 3: sprintf(suffix,"rs"); break; 334 | } 335 | 336 | // gnuplot file name 337 | char file_name[1024]; char full_name[1024]; 338 | sprintf(file_name,"%s_%s.gp",prefix,suffix); 339 | sprintf(full_name,"%s/%s",dir.c_str(),file_name); 340 | 341 | // create png + eps 342 | for (int32_t j=0; j<2; j++) { 343 | 344 | // open file 345 | FILE *fp = fopen(full_name,"w"); 346 | 347 | // save gnuplot instructions 348 | if (j==0) { 349 | fprintf(fp,"set term png size 500,250 font \"Helvetica\" 11\n"); 350 | fprintf(fp,"set output \"%s_%s.png\"\n",prefix,suffix); 351 | } else { 352 | fprintf(fp,"set term postscript eps enhanced color\n"); 353 | fprintf(fp,"set output \"%s_%s.eps\"\n",prefix,suffix); 354 | } 355 | 356 | // start plot at 0 357 | fprintf(fp,"set size ratio 0.5\n"); 358 | fprintf(fp,"set yrange [0:*]\n"); 359 | 360 | // x label 两种不同的X坐标 361 | if (i<=1) fprintf(fp,"set xlabel \"Path Length [m]\"\n"); 362 | else fprintf(fp,"set xlabel \"Speed [km/h]\"\n"); 363 | 364 | // y label 两种不同的Y坐标 365 | if (i==0 || i==2) fprintf(fp,"set ylabel \"Translation Error [%%]\"\n"); 366 | else fprintf(fp,"set ylabel \"Rotation Error [deg/m]\"\n"); 367 | 368 | // plot error curve 369 | fprintf(fp,"plot \"%s_%s.txt\" using ",prefix,suffix); 370 | switch (i) { 371 | case 0: fprintf(fp,"1:($2*100) title 'Translation Error'"); break; 372 | case 1: fprintf(fp,"1:($2*57.3) title 'Rotation Error'"); break; 373 | case 2: fprintf(fp,"($1*3.6):($2*100) title 'Translation Error'"); break; 374 | case 3: fprintf(fp,"($1*3.6):($2*57.3) title 'Rotation Error'"); break; 375 | } 376 | fprintf(fp," lc rgb \"#0000FF\" pt 4 w linespoints\n"); 377 | 378 | // close file 379 | fclose(fp); 380 | 381 | // run gnuplot => create png + eps 382 | sprintf(command,"cd %s; gnuplot %s",dir.c_str(),file_name); 383 | if(system(command)); 384 | } 385 | 386 | // create pdf and 
crop 387 | // sprintf(command,"cd %s; ps2pdf %s_%s.eps %s_%s_large.pdf",dir.c_str(),prefix,suffix,prefix,suffix); 388 | // if(system(command)); 389 | // sprintf(command,"cd %s; pdfcrop %s_%s_large.pdf %s_%s.pdf",dir.c_str(),prefix,suffix,prefix,suffix); 390 | // if(system(command)); 391 | // sprintf(command,"cd %s; rm %s_%s_large.pdf",dir.c_str(),prefix,suffix); 392 | // if(system(command)); 393 | } 394 | } 395 | 396 | void saveStats (vector err,string dir) { 397 | 398 | float t_err = 0; 399 | float r_err = 0; 400 | 401 | // for all errors do => compute sum of t_err, r_err 402 | for (vector::iterator it=err.begin(); it!=err.end(); it++) { 403 | t_err += it->t_err; 404 | r_err += it->r_err; 405 | } 406 | 407 | // open file 408 | FILE *fp = fopen((dir + "/stats.txt").c_str(),"w"); 409 | 410 | // save errors 411 | float num = err.size(); 412 | fprintf(fp,"%f %f\n",t_err/num,r_err/num); 413 | 414 | // close file 415 | fclose(fp); 416 | } 417 | 418 | void split(const string& s,vector& sv,const char flag = ' ') { 419 | sv.clear(); 420 | istringstream iss(s); 421 | string temp; 422 | 423 | while (getline(iss, temp, flag)) { 424 | sv.push_back(stoi(temp)); 425 | } 426 | return; 427 | } 428 | 429 | bool eval(string gt_direction, string prediction_dir, string sequence) { 430 | // ground truth and result directories 431 | // string gt_dir = "/home/weapon/Desktop/VINet1.0/evaluate/odometry/gt/poses"; 432 | string gt_dir = gt_direction; 433 | string result_dir = prediction_dir; 434 | string error_dir = result_dir + "/errors"; 435 | string plot_path_dir = result_dir + "/plot_path"; 436 | string plot_error_dir = result_dir + "/plot_error"; 437 | 438 | // create output directories 439 | if (access(error_dir.c_str(), 0) == -1){ //如果文件夹不存在 440 | // _mkdir(error_dir.c_str()); 441 | system(("mkdir " + error_dir).c_str()); 442 | printf("create dir: %s\n", error_dir.c_str()); 443 | } 444 | if (access(plot_path_dir.c_str(), 0) == -1){ 445 | // _mkdir(plot_path_dir.c_str()); 446 | system(("mkdir " + plot_path_dir).c_str()); 447 | printf("create dir: %s\n", plot_path_dir.c_str()); 448 | } 449 | if (access(plot_error_dir.c_str(), 0) == -1){ 450 | // _mkdir(plot_error_dir.c_str()); 451 | system(("mkdir " + plot_error_dir).c_str()); 452 | printf("create dir: %s\n", plot_error_dir.c_str()); 453 | } 454 | 455 | // if(system(("mkdir " + error_dir).c_str())); 456 | // if(system(("mkdir " + plot_path_dir).c_str())); 457 | // if(system(("mkdir " + plot_error_dir).c_str())); 458 | 459 | // total errors 460 | vector total_err; 461 | 462 | // as for eval_train 463 | // int seq[]={0, 1, 2, 8, 9}; 464 | 465 | // as for eval_test 466 | string sequence_ = sequence; 467 | sequence_ = sequence_.substr(1,sequence_.length()-2); 468 | // cout<< sequence_ < seq; 470 | split(sequence_, seq, ','); 471 | // int seq[]={3, 4, 5, 6, 7, 10}; 472 | int seq_length = 0; 473 | for (const auto& s : seq) { 474 | // cout << s << endl; 475 | seq_length +=1; 476 | } 477 | // cout << "sizeof(seq) " << sizeof(seq) << "sizeof(seq[0]) " << sizeof(seq[0]) < poses_gt = loadPoses(gt_dir + "/" + file_name); 488 | vector poses_result = loadPoses(result_dir + "/" + file_name); 489 | 490 | // plot status 491 | // printf("Processing: %s, poses: %zu/%zu\n",file_name,poses_result.size(),poses_gt.size()); 492 | 493 | // check for errors 判断序列长度是否相同 494 | if (poses_gt.size()==0 || poses_result.size()!=poses_gt.size()) { 495 | printf("ERROR: Couldn't read (all) poses of: %s/%s", result_dir.c_str(), file_name); 496 | return false; 497 | } 498 | 499 | // compute sequence 
errors 返回的是五列数据 it->first_frame,it->r_err,it->t_err,it->len,it->speed 500 | vector seq_err = calcSequenceErrors(poses_gt,poses_result); 501 | saveSequenceErrors(seq_err,error_dir + "/" + file_name); 502 | 503 | // add to total errors 将当前序列的结果加到总的err中去 504 | total_err.insert(total_err.end(),seq_err.begin(),seq_err.end()); 505 | 506 | // save + plot bird's eye view trajectories 画出轨迹范围图 先存两者的轨迹 507 | savePathPlot(poses_gt,poses_result,plot_path_dir + "/" + file_name); 508 | vector roi = computeRoi(poses_gt,poses_result); //返回四个坐标 509 | plotPathPlot(plot_path_dir,roi,i); 510 | 511 | // save + plot individual errors 512 | char prefix[16]; 513 | sprintf(prefix,"%02d",i); 514 | saveErrorPlots(seq_err,plot_error_dir,prefix); 515 | plotErrorPlots(plot_error_dir,prefix); 516 | } 517 | 518 | // save + plot total errors + summary statistics 画出带avg的图片 519 | if (total_err.size()>0) { 520 | char prefix[16]; 521 | sprintf(prefix,"avg"); 522 | saveErrorPlots(total_err,plot_error_dir,prefix); 523 | plotErrorPlots(plot_error_dir,prefix); 524 | saveStats(total_err,result_dir); 525 | } 526 | 527 | // success 528 | return true; 529 | } 530 | 531 | int32_t main (int32_t argc, char *argv[]) { 532 | // need only 2 arguments 533 | if (argc != 4) { 534 | cout << "Usage: ./eval_odometry gt_dir result_dir seq" << endl; 535 | return 1; 536 | } 537 | string gt_direction = argv[1]; 538 | string result_dir = argv[2]; 539 | string sequence = argv[3]; 540 | bool success = eval(gt_direction, result_dir, sequence); 541 | // if (success){ 542 | // printf("\nProcessing success!\n"); 543 | // } 544 | // else{ 545 | // printf("\nProcessing fail!\n"); 546 | // } 547 | 548 | 549 | return 0; 550 | } 551 | -------------------------------------------------------------------------------- /eval/devkit/cpp/io.h: -------------------------------------------------------------------------------- 1 | /* Copyright (C) 1996-2014 Free Software Foundation, Inc. 2 | This file is part of the GNU C Library. 3 | 4 | The GNU C Library is free software; you can redistribute it and/or 5 | modify it under the terms of the GNU Lesser General Public 6 | License as published by the Free Software Foundation; either 7 | version 2.1 of the License, or (at your option) any later version. 8 | 9 | The GNU C Library is distributed in the hope that it will be useful, 10 | but WITHOUT ANY WARRANTY; without even the implied warranty of 11 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU 12 | Lesser General Public License for more details. 13 | 14 | You should have received a copy of the GNU Lesser General Public 15 | License along with the GNU C Library; if not, see 16 | . */ 17 | 18 | #ifndef _SYS_IO_H 19 | #define _SYS_IO_H 1 20 | 21 | #include 22 | 23 | __BEGIN_DECLS 24 | 25 | /* If TURN_ON is TRUE, request for permission to do direct i/o on the 26 | port numbers in the range [FROM,FROM+NUM-1]. Otherwise, turn I/O 27 | permission off for that range. This call requires root privileges. 28 | 29 | Portability note: not all Linux platforms support this call. Most 30 | platforms based on the PC I/O architecture probably will, however. 31 | E.g., Linux/Alpha for Alpha PCs supports this. */ 32 | extern int ioperm (unsigned long int __from, unsigned long int __num, 33 | int __turn_on) __THROW; 34 | 35 | /* Set the I/O privilege level to LEVEL. If LEVEL>3, permission to 36 | access any I/O port is granted. This call requires root 37 | privileges. 
*/ 38 | extern int iopl (int __level) __THROW; 39 | 40 | #if defined __GNUC__ && __GNUC__ >= 2 41 | 42 | static __inline unsigned char 43 | inb (unsigned short int __port) 44 | { 45 | unsigned char _v; 46 | 47 | __asm__ __volatile__ ("inb %w1,%0":"=a" (_v):"Nd" (__port)); 48 | return _v; 49 | } 50 | 51 | static __inline unsigned char 52 | inb_p (unsigned short int __port) 53 | { 54 | unsigned char _v; 55 | 56 | __asm__ __volatile__ ("inb %w1,%0\noutb %%al,$0x80":"=a" (_v):"Nd" (__port)); 57 | return _v; 58 | } 59 | 60 | static __inline unsigned short int 61 | inw (unsigned short int __port) 62 | { 63 | unsigned short _v; 64 | 65 | __asm__ __volatile__ ("inw %w1,%0":"=a" (_v):"Nd" (__port)); 66 | return _v; 67 | } 68 | 69 | static __inline unsigned short int 70 | inw_p (unsigned short int __port) 71 | { 72 | unsigned short int _v; 73 | 74 | __asm__ __volatile__ ("inw %w1,%0\noutb %%al,$0x80":"=a" (_v):"Nd" (__port)); 75 | return _v; 76 | } 77 | 78 | static __inline unsigned int 79 | inl (unsigned short int __port) 80 | { 81 | unsigned int _v; 82 | 83 | __asm__ __volatile__ ("inl %w1,%0":"=a" (_v):"Nd" (__port)); 84 | return _v; 85 | } 86 | 87 | static __inline unsigned int 88 | inl_p (unsigned short int __port) 89 | { 90 | unsigned int _v; 91 | __asm__ __volatile__ ("inl %w1,%0\noutb %%al,$0x80":"=a" (_v):"Nd" (__port)); 92 | return _v; 93 | } 94 | 95 | static __inline void 96 | outb (unsigned char __value, unsigned short int __port) 97 | { 98 | __asm__ __volatile__ ("outb %b0,%w1": :"a" (__value), "Nd" (__port)); 99 | } 100 | 101 | static __inline void 102 | outb_p (unsigned char __value, unsigned short int __port) 103 | { 104 | __asm__ __volatile__ ("outb %b0,%w1\noutb %%al,$0x80": :"a" (__value), 105 | "Nd" (__port)); 106 | } 107 | 108 | static __inline void 109 | outw (unsigned short int __value, unsigned short int __port) 110 | { 111 | __asm__ __volatile__ ("outw %w0,%w1": :"a" (__value), "Nd" (__port)); 112 | 113 | } 114 | 115 | static __inline void 116 | outw_p (unsigned short int __value, unsigned short int __port) 117 | { 118 | __asm__ __volatile__ ("outw %w0,%w1\noutb %%al,$0x80": :"a" (__value), 119 | "Nd" (__port)); 120 | } 121 | 122 | static __inline void 123 | outl (unsigned int __value, unsigned short int __port) 124 | { 125 | __asm__ __volatile__ ("outl %0,%w1": :"a" (__value), "Nd" (__port)); 126 | } 127 | 128 | static __inline void 129 | outl_p (unsigned int __value, unsigned short int __port) 130 | { 131 | __asm__ __volatile__ ("outl %0,%w1\noutb %%al,$0x80": :"a" (__value), 132 | "Nd" (__port)); 133 | } 134 | 135 | static __inline void 136 | insb (unsigned short int __port, void *__addr, unsigned long int __count) 137 | { 138 | __asm__ __volatile__ ("cld ; rep ; insb":"=D" (__addr), "=c" (__count) 139 | :"d" (__port), "0" (__addr), "1" (__count)); 140 | } 141 | 142 | static __inline void 143 | insw (unsigned short int __port, void *__addr, unsigned long int __count) 144 | { 145 | __asm__ __volatile__ ("cld ; rep ; insw":"=D" (__addr), "=c" (__count) 146 | :"d" (__port), "0" (__addr), "1" (__count)); 147 | } 148 | 149 | static __inline void 150 | insl (unsigned short int __port, void *__addr, unsigned long int __count) 151 | { 152 | __asm__ __volatile__ ("cld ; rep ; insl":"=D" (__addr), "=c" (__count) 153 | :"d" (__port), "0" (__addr), "1" (__count)); 154 | } 155 | 156 | static __inline void 157 | outsb (unsigned short int __port, const void *__addr, 158 | unsigned long int __count) 159 | { 160 | __asm__ __volatile__ ("cld ; rep ; outsb":"=S" (__addr), "=c" (__count) 161 
| :"d" (__port), "0" (__addr), "1" (__count)); 162 | } 163 | 164 | static __inline void 165 | outsw (unsigned short int __port, const void *__addr, 166 | unsigned long int __count) 167 | { 168 | __asm__ __volatile__ ("cld ; rep ; outsw":"=S" (__addr), "=c" (__count) 169 | :"d" (__port), "0" (__addr), "1" (__count)); 170 | } 171 | 172 | static __inline void 173 | outsl (unsigned short int __port, const void *__addr, 174 | unsigned long int __count) 175 | { 176 | __asm__ __volatile__ ("cld ; rep ; outsl":"=S" (__addr), "=c" (__count) 177 | :"d" (__port), "0" (__addr), "1" (__count)); 178 | } 179 | 180 | #endif /* GNU C */ 181 | 182 | __END_DECLS 183 | #endif /* _SYS_IO_H */ 184 | -------------------------------------------------------------------------------- /eval/devkit/cpp/mail.h: -------------------------------------------------------------------------------- 1 | #ifndef MAIL_H 2 | #define MAIL_H 3 | 4 | #include 5 | #include 6 | #include 7 | 8 | class Mail { 9 | 10 | public: 11 | 12 | Mail (std::string email = "") { 13 | if (email.compare("")) { 14 | mail = popen("/usr/lib/sendmail -t -f noreply@cvlibs.net","w"); 15 | fprintf(mail,"To: %s\n", email.c_str()); 16 | fprintf(mail,"From: noreply@cvlibs.net\n"); 17 | fprintf(mail,"Subject: KITTI Evaluation Benchmark\n"); 18 | fprintf(mail,"\n\n"); 19 | } else { 20 | mail = 0; 21 | } 22 | } 23 | 24 | ~Mail() { 25 | if (mail) { 26 | pclose(mail); 27 | } 28 | } 29 | 30 | void msg (const char *format, ...) { 31 | va_list args; 32 | va_start(args,format); 33 | if (mail) { 34 | vfprintf(mail,format,args); 35 | fprintf(mail,"\n"); 36 | } 37 | vprintf(format,args); 38 | printf("\n"); 39 | va_end(args); 40 | } 41 | 42 | private: 43 | 44 | FILE *mail; 45 | 46 | }; 47 | 48 | #endif 49 | -------------------------------------------------------------------------------- /eval/devkit/cpp/matrix.h: -------------------------------------------------------------------------------- 1 | /* 2 | Copyright 2011. All rights reserved. 3 | Institute of Measurement and Control Systems 4 | Karlsruhe Institute of Technology, Germany 5 | 6 | This file is part of libviso2. 7 | Authors: Andreas Geiger 8 | 9 | libviso2 is free software; you can redistribute it and/or modify it under the 10 | terms of the GNU General Public License as published by the Free Software 11 | Foundation; either version 2 of the License, or any later version. 12 | 13 | libviso2 is distributed in the hope that it will be useful, but WITHOUT ANY 14 | WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A 15 | PARTICULAR PURPOSE. See the GNU General Public License for more details. 
16 | 17 | You should have received a copy of the GNU General Public License along with 18 | libviso2; if not, write to the Free Software Foundation, Inc., 51 Franklin 19 | Street, Fifth Floor, Boston, MA 02110-1301, USA 20 | */ 21 | 22 | #ifndef MATRIX_H 23 | #define MATRIX_H 24 | 25 | #include 26 | #include 27 | #include 28 | #include 29 | #include 30 | 31 | #ifndef _MSC_VER 32 | #include 33 | #else 34 | typedef __int8 int8_t; 35 | typedef __int16 int16_t; 36 | typedef __int32 int32_t; 37 | typedef __int64 int64_t; 38 | typedef unsigned __int8 uint8_t; 39 | typedef unsigned __int16 uint16_t; 40 | typedef unsigned __int32 uint32_t; 41 | typedef unsigned __int64 uint64_t; 42 | #endif 43 | 44 | #define endll endl << endl // double end line definition 45 | 46 | typedef double FLOAT; // double precision 47 | //typedef float FLOAT; // single precision 48 | 49 | class Matrix { 50 | 51 | public: 52 | 53 | // constructor / deconstructor 54 | Matrix (); // init empty 0x0 matrix 55 | Matrix (const int32_t m,const int32_t n); // init empty mxn matrix 56 | Matrix (const int32_t m,const int32_t n,const FLOAT* val_); // init mxn matrix with values from array 'val' 57 | Matrix (const Matrix &M); // creates deepcopy of M 58 | ~Matrix (); 59 | 60 | // assignment operator, copies contents of M 61 | Matrix& operator= (const Matrix &M); 62 | 63 | // copies submatrix of M into array 'val', default values copy whole row/column/matrix 64 | void getData(FLOAT* val_,int32_t i1=0,int32_t j1=0,int32_t i2=-1,int32_t j2=-1); 65 | 66 | // set or get submatrices of current matrix 67 | Matrix getMat(int32_t i1,int32_t j1,int32_t i2=-1,int32_t j2=-1); 68 | void setMat(const Matrix &M,const int32_t i,const int32_t j); 69 | 70 | // set sub-matrix to scalar (default 0), -1 as end replaces whole row/column/matrix 71 | void setVal(FLOAT s,int32_t i1=0,int32_t j1=0,int32_t i2=-1,int32_t j2=-1); 72 | 73 | // set (part of) diagonal to scalar, -1 as end replaces whole diagonal 74 | void setDiag(FLOAT s,int32_t i1=0,int32_t i2=-1); 75 | 76 | // clear matrix 77 | void zero(); 78 | 79 | // extract columns with given index 80 | Matrix extractCols (std::vector idx); 81 | 82 | // create identity matrix 83 | static Matrix eye (const int32_t m); 84 | void eye (); 85 | 86 | // create diagonal matrix with nx1 or 1xn matrix M as elements 87 | static Matrix diag(const Matrix &M); 88 | 89 | // returns the m-by-n matrix whose elements are taken column-wise from M 90 | static Matrix reshape(const Matrix &M,int32_t m,int32_t n); 91 | 92 | // create 3x3 rotation matrices (convention: http://en.wikipedia.org/wiki/Rotation_matrix) 93 | static Matrix rotMatX(const FLOAT &angle); 94 | static Matrix rotMatY(const FLOAT &angle); 95 | static Matrix rotMatZ(const FLOAT &angle); 96 | 97 | // simple arithmetic operations 98 | Matrix operator+ (const Matrix &M); // add matrix 99 | Matrix operator- (const Matrix &M); // subtract matrix 100 | Matrix operator* (const Matrix &M); // multiply with matrix 101 | Matrix operator* (const FLOAT &s); // multiply with scalar 102 | Matrix operator/ (const Matrix &M); // divide elementwise by matrix (or vector) 103 | Matrix operator/ (const FLOAT &s); // divide by scalar 104 | Matrix operator- (); // negative matrix 105 | Matrix operator~ (); // transpose 106 | FLOAT l2norm (); // euclidean norm (vectors) / frobenius norm (matrices) 107 | FLOAT mean (); // mean of all elements in matrix 108 | 109 | // complex arithmetic operations 110 | static Matrix cross (const Matrix &a, const Matrix &b); // cross product of two 
vectors 111 | static Matrix inv (const Matrix &M); // invert matrix M 112 | bool inv (); // invert this matrix 113 | FLOAT det (); // returns determinant of matrix 114 | bool solve (const Matrix &M,FLOAT eps=1e-20); // solve linear system M*x=B, replaces *this and M 115 | bool lu(int32_t *idx, FLOAT &d, FLOAT eps=1e-20); // replace *this by lower upper decomposition 116 | void svd(Matrix &U,Matrix &W,Matrix &V); // singular value decomposition *this = U*diag(W)*V^T 117 | 118 | // print matrix to stream 119 | friend std::ostream& operator<< (std::ostream& out,const Matrix& M); 120 | 121 | // direct data access 122 | FLOAT **val; 123 | int32_t m,n; 124 | 125 | private: 126 | 127 | void allocateMemory (const int32_t m_,const int32_t n_); 128 | void releaseMemory (); 129 | inline FLOAT pythag(FLOAT a,FLOAT b); 130 | 131 | }; 132 | 133 | #endif // MATRIX_H 134 | -------------------------------------------------------------------------------- /eval/devkit/readme.txt: -------------------------------------------------------------------------------- 1 | ########################################################################### 2 | # THE KITTI VISION BENCHMARK SUITE: VISUAL ODOMETRY / SLAM BENCHMARK # 3 | # Andreas Geiger Philip Lenz Raquel Urtasun # 4 | # Karlsruhe Institute of Technology # 5 | # Toyota Technological Institute at Chicago # 6 | # www.cvlibs.net # 7 | ########################################################################### 8 | 9 | This file describes the KITTI visual odometry / SLAM benchmark package. 10 | Accurate ground truth (<10cm) is provided by a GPS/IMU system with RTK 11 | float/integer corrections enabled. In order to enable a fair comparison of 12 | all methods, only ground truth for the sequences 00-10 is made publicly 13 | available. The remaining sequences (11-21) serve as evaluation sequences. 14 | 15 | NOTE: WHEN SUBMITTING RESULTS, PLEASE STORE THEM IN THE SAME DATA FORMAT IN 16 | WHICH THE GROUND TRUTH DATA IS PROVIDED (SEE 'POSES' BELOW), USING THE 17 | FILE NAMES 11.txt TO 21.txt. CREATE A ZIP ARCHIVE OF THEM AND STORE YOUR 18 | RESULTS IN ITS ROOT FOLDER. 19 | 20 | File description: 21 | ================= 22 | 23 | Folder 'sequences': 24 | 25 | Each folder within the folder 'sequences' contains a single sequence, where 26 | the left and right images are stored in the sub-folders image_0 and 27 | image_1, respectively. The images are provided as greyscale PNG images and 28 | can be loaded with MATLAB or libpng++. All images have been undistorted and 29 | rectified. Sequences 0-10 can be used for training, while results must be 30 | provided for the test sequences 11-21. 31 | 32 | Additionally we provide the velodyne point clouds for point-cloud-based 33 | methods. To save space, all scans have been stored as Nx4 float matrix into 34 | a binary file using the following code: 35 | 36 | stream = fopen (dst_file.c_str(),"wb"); 37 | fwrite(data,sizeof(float),4*num,stream); 38 | fclose(stream); 39 | 40 | Here, data contains 4*num values, where the first 3 values correspond to 41 | x,y and z, and the last value is the reflectance information. All scans 42 | are stored row-aligned, meaning that the first 4 values correspond to the 43 | first measurement. 
Since each scan might potentially have a different 44 | number of points, this must be determined from the file size when reading 45 | the file, where 1e6 is a good enough upper bound on the number of values: 46 | 47 | // allocate 4 MB buffer (only ~130*4*4 KB are needed) 48 | int32_t num = 1000000; 49 | float *data = (float*)malloc(num*sizeof(float)); 50 | 51 | // pointers 52 | float *px = data+0; 53 | float *py = data+1; 54 | float *pz = data+2; 55 | float *pr = data+3; 56 | 57 | // load point cloud 58 | FILE *stream; 59 | stream = fopen (currFilenameBinary.c_str(),"rb"); 60 | num = fread(data,sizeof(float),num,stream)/4; 61 | for (int32_t i=0; i>> idepth = torch.rand(1, 1, 4, 5) 27 | >>> image = torch.rand(1, 3, 4, 5) 28 | >>> smooth = tgm.losses.DepthSmoothnessLoss() 29 | >>> loss = smooth(idepth, image) 30 | """ 31 | 32 | def __init__(self) -> None: 33 | super(InverseDepthSmoothnessLoss, self).__init__() 34 | 35 | @staticmethod 36 | def gradient_x(img: torch.Tensor) -> torch.Tensor: 37 | assert len(img.shape) == 4, img.shape 38 | return img[:, :, :, :-1] - img[:, :, :, 1:] 39 | 40 | @staticmethod 41 | def gradient_y(img: torch.Tensor) -> torch.Tensor: 42 | assert len(img.shape) == 4, img.shape 43 | return img[:, :, :-1, :] - img[:, :, 1:, :] 44 | 45 | # @staticmethod 46 | # def gradient_x(image): 47 | # sobel_x = np.array([[1,0,-1],[2,0,-2],[1,0,-1]]) 48 | # sobel_x = torch.from_numpy(sobel_x).type_as(image).view(1,1,3,3) 49 | # if image.shape[1] == 3: 50 | # sobel_x = torch.cat([sobel_x, sobel_x, sobel_x], dim=1) 51 | # return F.conv2d(image, weight=sobel_x, padding=1) 52 | 53 | # @staticmethod 54 | # def gradient_y(image): 55 | # sobel_y = np.array([[1,2,-1],[0,0,0],[-1,-2,-1]]) 56 | # sobel_y = torch.from_numpy(sobel_y).type_as(image).view(1,1,3,3) 57 | # if image.shape[1] == 3: 58 | # sobel_y = torch.cat([sobel_y, sobel_y, sobel_y], dim=1) 59 | # return F.conv2d(image, weight=sobel_y, padding=1) 60 | 61 | def forward( 62 | self, 63 | idepth: torch.Tensor, 64 | image: torch.Tensor) -> torch.Tensor: 65 | if not torch.is_tensor(idepth): 66 | raise TypeError("Input idepth type is not a torch.Tensor. Got {}" 67 | .format(type(idepth))) 68 | if not torch.is_tensor(image): 69 | raise TypeError("Input image type is not a torch.Tensor. Got {}" 70 | .format(type(image))) 71 | if not len(idepth.shape) == 4: 72 | raise ValueError("Invalid idepth shape, we expect BxCxHxW. Got: {}" 73 | .format(idepth.shape)) 74 | if not len(image.shape) == 4: 75 | raise ValueError("Invalid image shape, we expect BxCxHxW. Got: {}" 76 | .format(image.shape)) 77 | if not idepth.shape[-2:] == image.shape[-2:]: 78 | raise ValueError("idepth and image shapes must be the same. Got: {}" 79 | .format(idepth.shape, image.shape)) 80 | if not idepth.device == image.device: 81 | raise ValueError( 82 | "idepth and image must be in the same device. Got: {}" .format( 83 | idepth.device, image.device)) 84 | if not idepth.dtype == image.dtype: 85 | raise ValueError( 86 | "idepth and image must be in the same dtype. 
Got: {}" .format( 87 | idepth.dtype, image.dtype)) 88 | # compute the gradients 89 | idepth_dx = self.gradient_x(idepth) 90 | idepth_dy = self.gradient_y(idepth) 91 | image_dx = self.gradient_x(image) 92 | image_dy = self.gradient_y(image) 93 | 94 | # compute image weights 95 | weights_x = torch.exp( 96 | -torch.mean(torch.abs(image_dx), dim=1, keepdim=True)) 97 | weights_y = torch.exp( 98 | -torch.mean(torch.abs(image_dy), dim=1, keepdim=True)) 99 | 100 | # apply image weights to depth 101 | smoothness_x = torch.abs(idepth_dx * weights_x) 102 | smoothness_y = torch.abs(idepth_dy * weights_y) 103 | return torch.mean(smoothness_x) + torch.mean(smoothness_y) 104 | 105 | ###################### 106 | # functional interface 107 | ###################### 108 | 109 | def inverse_depth_smoothness_loss( 110 | idepth: torch.Tensor, 111 | image: torch.Tensor) -> torch.Tensor: 112 | r"""Computes image-aware inverse depth smoothness loss. 113 | 114 | See :class:`~torchgeometry.losses.InvDepthSmoothnessLoss` for details. 115 | """ 116 | return InverseDepthSmoothnessLoss()(idepth, image) -------------------------------------------------------------------------------- /losses/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Ironbrotherstyle/UnVIO/173e8ef646074da45cb64c77da36e474e4746031/losses/__init__.py -------------------------------------------------------------------------------- /losses/warp_loss_function.py: -------------------------------------------------------------------------------- 1 | from __future__ import division 2 | import torch 3 | from torch import nn 4 | import torch.nn.functional as F 5 | from utils.inverse_warp import inverse_warp_3d 6 | from .InverseDepthSmoothnessLoss import inverse_depth_smoothness_loss 7 | 8 | def photometric_reconstruction_loss(tgt_img, ref_imgs, intrinsics, 9 | depth, pose, 10 | rotation_mode='euler', padding_mode='zeros', explainability_mask=None, ref_depth=None): 11 | def one_scale(depth, explainability_mask = None, ref_depth=None): 12 | assert(pose.size(1) == len(ref_imgs)) 13 | 14 | reconstruction_loss, loss_3d = 0, 0 15 | b, _, h, w = depth.size() 16 | downscale = tgt_img.size(2)/h 17 | tgt_img_scaled = F.interpolate(tgt_img, (h, w), mode='area') 18 | ref_imgs_scaled = [F.interpolate(ref_img, (h, w), mode='area') for ref_img in ref_imgs] 19 | 20 | intrinsics_scaled = torch.cat((intrinsics[:, 0:2]/downscale, intrinsics[:, 2:]), dim=1) 21 | for i, ref_img in enumerate(ref_imgs_scaled): 22 | current_pose = pose[:, i] 23 | if (ref_depth is not None) and (ref_depth[i] is not None): 24 | ref_img_warped, coords_3d = inverse_warp_3d(ref_img, depth[:,0], current_pose, 25 | intrinsics_scaled, rotation_mode, padding_mode, ref_depth=ref_depth[i][:, 0]) 26 | else: 27 | ref_img_warped, _ = inverse_warp_3d(ref_img, depth[:,0], current_pose, 28 | intrinsics_scaled, rotation_mode, padding_mode, ref_depth=None) 29 | out_of_bound = 1 - (ref_img_warped == 0).prod(1, keepdim=True).type_as(ref_img_warped) # B 1 H W 30 | diff = appearence_loss(tgt_img_scaled, ref_img_warped, out_of_bound) 31 | 32 | if (ref_depth is not None) and (ref_depth[i] is not None): 33 | loss_3d += (out_of_bound * (coords_3d[0][:,2:] - coords_3d[1][:,2:]).abs() / (coords_3d[0][:,2:] + coords_3d[1][:,2:])).mean() 34 | 35 | if explainability_mask is not None: 36 | diff = diff * explainability_mask[:,i:i+1].expand_as(diff) 37 | 38 | reconstruction_loss += diff.mean() 39 | assert((reconstruction_loss == reconstruction_loss).item() 
== 1) 40 | 41 | return reconstruction_loss/len(ref_imgs), loss_3d/len(ref_imgs) 42 | 43 | if type(explainability_mask) not in [tuple, list]: 44 | explainability_mask = [explainability_mask] 45 | if type(depth) not in [list, tuple]: 46 | depth = [depth] 47 | 48 | loss_photo, loss_3d = 0, 0 49 | 50 | for i, d in enumerate(depth): 51 | r_depth = ref_depth[i] 52 | tmp1, tmp2 = one_scale(d, None, ref_depth=r_depth) 53 | loss_photo += tmp1 54 | loss_3d += tmp2 55 | return loss_photo / len(depth), loss_3d/len(depth) 56 | 57 | def appearence_loss(img1, img2, valid_pixel, ternary=False, weights=[0.15, 0.85, 0.08]): 58 | 59 | diff = img1 - img2 60 | diff = charbonnier_loss(diff, valid_pixel, alpha=0.5, size_average=False) 61 | ssim_loss = s_SSIM(img1*valid_pixel, img2*valid_pixel, window_size=3).mean(1, keepdim=True) 62 | if ternary: 63 | ternaryloss = ternary_loss(img1, img2, valid_pixel) 64 | else: 65 | ternaryloss = 0 66 | return weights[0] * diff + weights[1] * ssim_loss + weights[2] * ternaryloss 67 | 68 | def s_SSIM(x, y, window_size): 69 | C1 = 0.01 ** 2 70 | C2 = 0.03 ** 2 71 | 72 | avg = nn.AvgPool2d(kernel_size=window_size, padding=window_size//2, stride=1) 73 | mu_x = avg(x) 74 | mu_y = avg(y) 75 | mu_x_mu_y = mu_x * mu_y 76 | mu_x_sq = mu_x.pow(2) 77 | mu_y_sq = mu_y.pow(2) 78 | 79 | sigma_x = avg(x * x) - mu_x_sq 80 | sigma_y = avg(y * y) - mu_y_sq 81 | sigma_xy = avg(x * y) - mu_x_mu_y 82 | 83 | SSIM_n = (2 * mu_x_mu_y + C1) * (2 * sigma_xy + C2) 84 | SSIM_d = (mu_x_sq + mu_y_sq + C1) * (sigma_x + sigma_y + C2) 85 | SSIM = SSIM_n / SSIM_d 86 | return torch.clamp((1 - SSIM) / 2, 0, 1) 87 | 88 | def charbonnier_loss(x, mask=None, truncate=None, alpha=0.45, beta=1.0, epsilon=0.001, size_average=False): 89 | """Compute the generalized charbonnier loss of the difference tensor x. 90 | All positions where mask == 0 are not taken into account. 91 | Args: 92 | x: a tensor of shape [num_batch, channels, height, width]. 93 | mask: a mask of shape [num_batch, mask_channels, height, width], 94 | where mask channels must be either 1 or the same number as 95 | the number of channels of x. Entries should be 0 or 1. 96 | Returns: 97 | loss as floatTensor 98 | """ 99 | if x.size(1) == 1 and mask.size(1) == 3: 100 | mask = mask.mean(1,keepdim=True) 101 | assert(x.size(1) == mask.size(1) or mask.size(1) == 1) 102 | b, c, h, w = x.size() 103 | 104 | error = ((x*beta)**2 + epsilon**2).pow(alpha) 105 | 106 | if mask is not None: 107 | error = error * mask 108 | 109 | if truncate is not None: 110 | error = torch.min(error, truncate) 111 | if size_average: 112 | return error.mean() 113 | else: 114 | return error.mean(1, keepdim=True) 115 | 116 | def explainability_loss(mask): 117 | if type(mask) not in [tuple, list]: 118 | mask = [mask] 119 | loss = 0 120 | for mask_scaled in mask: 121 | ones_var = torch.ones_like(mask_scaled) 122 | loss += nn.functional.binary_cross_entropy(mask_scaled, ones_var) 123 | return loss 124 | 125 | def smooth_loss(pred_map): 126 | def gradient(pred): 127 | D_dy = pred[:, :, 1:] - pred[:, :, :-1] 128 | D_dx = pred[:, :, :, 1:] - pred[:, :, :, :-1] 129 | return D_dx, D_dy 130 | 131 | if type(pred_map) not in [tuple, list]: 132 | pred_map = [pred_map] 133 | 134 | loss = 0 135 | weight = 1. 
136 | 137 | for scaled_map in pred_map: 138 | dx, dy = gradient(scaled_map) 139 | dx2, dxdy = gradient(dx) 140 | dydx, dy2 = gradient(dy) 141 | loss += (dx2.abs().mean() + dxdy.abs().mean() + dydx.abs().mean() + dy2.abs().mean())*weight 142 | weight /= 2.3 # don't ask me why it works better 143 | return loss 144 | 145 | def disp_smooth_loss(disparities, img): 146 | loss = 0 147 | weight = 1. 148 | for disp in disparities: 149 | b, _, h, w = disp.size() 150 | img_scaled = F.interpolate(img, (h, w), mode='area') 151 | loss += inverse_depth_smoothness_loss(disp, img_scaled) * weight 152 | weight /= 2.3 153 | return loss 154 | 155 | def spatial_normalize(disparities): 156 | ''' 157 | proposed by https://arxiv.org/abs/1712.00175 158 | ''' 159 | disp_mean = disparities.mean(1).mean(1).mean(1) 160 | disp_mean = disp_mean.unsqueeze(1).unsqueeze(1).unsqueeze(1) 161 | return disparities/disp_mean 162 | 163 | @torch.no_grad() 164 | def compute_errors(gt, pred, crop=True): 165 | abs_diff, abs_rel, sq_rel, a1, a2, a3 = 0, 0, 0, 0, 0, 0 166 | batch_size = gt.size(0) 167 | ''' 168 | crop used by Garg ECCV16 to reprocude Eigen NIPS14 results 169 | construct a mask of False values, with the same size as target 170 | and then set to True values inside the crop 171 | ''' 172 | if crop: 173 | crop_mask = gt[0] != gt[0] 174 | y1, y2 = int(0.40810811 * gt.size(1)), int(0.99189189 * gt.size(1)) 175 | x1, x2 = int(0.03594771 * gt.size(2)), int(0.96405229 * gt.size(2)) 176 | crop_mask[y1:y2, x1:x2] = 1 177 | 178 | for current_gt, current_pred in zip(gt, pred): 179 | valid = (current_gt > 0) & (current_gt < 80) 180 | if crop: 181 | valid = valid & crop_mask 182 | 183 | valid_gt = current_gt[valid] 184 | valid_pred = current_pred[valid].clamp(1e-3, 80) 185 | 186 | valid_pred = valid_pred * torch.median(valid_gt)/torch.median(valid_pred) 187 | 188 | thresh = torch.max((valid_gt / valid_pred), (valid_pred / valid_gt)) 189 | a1 += (thresh < 1.25).float().mean() 190 | a2 += (thresh < 1.25 ** 2).float().mean() 191 | a3 += (thresh < 1.25 ** 3).float().mean() 192 | 193 | abs_diff += torch.mean(torch.abs(valid_gt - valid_pred)) 194 | abs_rel += torch.mean(torch.abs(valid_gt - valid_pred) / valid_gt) 195 | 196 | sq_rel += torch.mean(((valid_gt - valid_pred)**2) / valid_gt) 197 | 198 | return [metric.item() / batch_size for metric in [abs_diff, abs_rel, sq_rel, a1, a2, a3]] 199 | -------------------------------------------------------------------------------- /models/DispResNet.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | 5 | def conv3x3(in_planes, out_planes, stride=1): 6 | """3x3 convolution with padding""" 7 | return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, 8 | padding=1, bias=False) 9 | 10 | class BasicBlock(nn.Module): 11 | expansion = 1 12 | 13 | def __init__(self, inplanes, planes, stride=1, downsample=None): 14 | super(BasicBlock, self).__init__() 15 | self.conv1 = conv3x3(inplanes, planes, stride) 16 | #self.bn1 = nn.BatchNorm2d(planes) 17 | self.relu = nn.ReLU(inplace=True) 18 | self.conv2 = conv3x3(planes, planes) 19 | #self.bn2 = nn.BatchNorm2d(planes) 20 | self.downsample = downsample 21 | self.stride = stride 22 | 23 | def forward(self, x): 24 | residual = x 25 | 26 | out = self.conv1(x) 27 | #out = self.bn1(out) 28 | out = self.relu(out) 29 | 30 | out = self.conv2(out) 31 | #out = self.bn2(out) 32 | 33 | if self.downsample is not None: 34 | residual = 
self.downsample(x) 35 | 36 | out += residual 37 | out = self.relu(out) 38 | 39 | return out 40 | 41 | def make_layer(inplanes, block, planes, blocks, stride=1): 42 | downsample = None 43 | if stride != 1 or inplanes != planes * block.expansion: 44 | downsample = nn.Sequential( 45 | nn.Conv2d(inplanes, planes * block.expansion, 46 | kernel_size=1, stride=stride, bias=False), 47 | nn.BatchNorm2d(planes * block.expansion), 48 | ) 49 | 50 | layers = [] 51 | layers.append(block(inplanes, planes, stride, downsample)) 52 | inplanes = planes * block.expansion 53 | for i in range(1, blocks): 54 | layers.append(block(inplanes, planes)) 55 | 56 | return nn.Sequential(*layers) 57 | 58 | def downsample_conv(in_planes, out_planes, kernel_size=3): 59 | return nn.Sequential( 60 | nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=2, padding=(kernel_size-1)//2), 61 | nn.ReLU(inplace=True), 62 | nn.Conv2d(out_planes, out_planes, kernel_size=kernel_size, padding=(kernel_size-1)//2), 63 | nn.ReLU(inplace=True) 64 | ) 65 | 66 | 67 | def predict_disp(in_planes): 68 | return nn.Sequential( 69 | nn.Conv2d(in_planes, 1, kernel_size=3, padding=1), 70 | nn.Sigmoid() 71 | ) 72 | 73 | 74 | def conv(in_planes, out_planes): 75 | return nn.Sequential( 76 | nn.Conv2d(in_planes, out_planes, kernel_size=3, padding=1), 77 | nn.ReLU(inplace=True) 78 | ) 79 | 80 | 81 | def upconv(in_planes, out_planes): 82 | return nn.Sequential( 83 | nn.ConvTranspose2d(in_planes, out_planes, kernel_size=3, stride=2, padding=1, output_padding=1), 84 | nn.ReLU(inplace=True) 85 | ) 86 | 87 | 88 | def crop_like(input, ref): 89 | assert(input.size(2) >= ref.size(2) and input.size(3) >= ref.size(3)) 90 | return input[:, :, :ref.size(2), :ref.size(3)] 91 | 92 | 93 | class DispResNet(nn.Module): 94 | 95 | def __init__(self, alpha=10, beta=0.01): 96 | super(DispResNet, self).__init__() 97 | 98 | self.alpha = alpha 99 | self.beta = beta 100 | 101 | conv_planes = [32, 64, 128, 256, 512, 512, 512] 102 | self.conv1 = downsample_conv(3, conv_planes[0], kernel_size=7) 103 | self.conv2 = make_layer(conv_planes[0], BasicBlock, conv_planes[1], blocks=2, stride=2) 104 | self.conv3 = make_layer(conv_planes[1], BasicBlock, conv_planes[2], blocks=2, stride=2) 105 | self.conv4 = make_layer(conv_planes[2], BasicBlock, conv_planes[3], blocks=3, stride=2) 106 | self.conv5 = make_layer(conv_planes[3], BasicBlock, conv_planes[4], blocks=3, stride=2) 107 | self.conv6 = make_layer(conv_planes[4], BasicBlock, conv_planes[5], blocks=3, stride=2) 108 | self.conv7 = make_layer(conv_planes[5], BasicBlock, conv_planes[6], blocks=3, stride=2) 109 | 110 | upconv_planes = [512, 512, 256, 128, 64, 32, 16] 111 | self.upconv7 = upconv(conv_planes[6], upconv_planes[0]) 112 | self.upconv6 = upconv(upconv_planes[0], upconv_planes[1]) 113 | self.upconv5 = upconv(upconv_planes[1], upconv_planes[2]) 114 | self.upconv4 = upconv(upconv_planes[2], upconv_planes[3]) 115 | self.upconv3 = upconv(upconv_planes[3], upconv_planes[4]) 116 | self.upconv2 = upconv(upconv_planes[4], upconv_planes[5]) 117 | self.upconv1 = upconv(upconv_planes[5], upconv_planes[6]) 118 | 119 | self.iconv7 = make_layer(upconv_planes[0] + conv_planes[5], BasicBlock, upconv_planes[0], blocks=2, stride=1) 120 | self.iconv6 = make_layer(upconv_planes[1] + conv_planes[4], BasicBlock, upconv_planes[1], blocks=2, stride=1) 121 | self.iconv5 = make_layer(upconv_planes[2] + conv_planes[3], BasicBlock, upconv_planes[2], blocks=2, stride=1) 122 | self.iconv4 = make_layer(upconv_planes[3] + conv_planes[2], BasicBlock, 
upconv_planes[3], blocks=2, stride=1) 123 | self.iconv3 = make_layer(1 + upconv_planes[4] + conv_planes[1], BasicBlock, upconv_planes[4], blocks=1, stride=1) 124 | self.iconv2 = make_layer(1 + upconv_planes[5] + conv_planes[0], BasicBlock, upconv_planes[5], blocks=1, stride=1) 125 | self.iconv1 = make_layer(1 + upconv_planes[6], BasicBlock, upconv_planes[6], blocks=1, stride=1) 126 | 127 | self.predict_disp6 = predict_disp(upconv_planes[1]) 128 | self.predict_disp5 = predict_disp(upconv_planes[2]) 129 | self.predict_disp4 = predict_disp(upconv_planes[3]) 130 | self.predict_disp3 = predict_disp(upconv_planes[4]) 131 | self.predict_disp2 = predict_disp(upconv_planes[5]) 132 | self.predict_disp1 = predict_disp(upconv_planes[6]) 133 | 134 | def init_weights(self): 135 | for m in self.modules(): 136 | if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d): 137 | nn.init.xavier_uniform_(m.weight.data) 138 | if m.bias is not None: 139 | m.bias.data.zero_() 140 | 141 | def forward(self, x): 142 | out_conv1 = self.conv1(x) 143 | out_conv2 = self.conv2(out_conv1) 144 | out_conv3 = self.conv3(out_conv2) 145 | out_conv4 = self.conv4(out_conv3) 146 | out_conv5 = self.conv5(out_conv4) 147 | out_conv6 = self.conv6(out_conv5) 148 | out_conv7 = self.conv7(out_conv6) 149 | 150 | out_upconv7 = crop_like(self.upconv7(out_conv7), out_conv6) 151 | concat7 = torch.cat((out_upconv7, out_conv6), 1) 152 | out_iconv7 = self.iconv7(concat7) 153 | 154 | out_upconv6 = crop_like(self.upconv6(out_iconv7), out_conv5) 155 | concat6 = torch.cat((out_upconv6, out_conv5), 1) 156 | out_iconv6 = self.iconv6(concat6) 157 | disp6 = self.alpha * self.predict_disp6(out_iconv6) + self.beta 158 | 159 | out_upconv5 = crop_like(self.upconv5(out_iconv6), out_conv4) 160 | concat5 = torch.cat((out_upconv5, out_conv4), 1) 161 | out_iconv5 = self.iconv5(concat5) 162 | disp5 = self.alpha * self.predict_disp5(out_iconv5) + self.beta 163 | 164 | out_upconv4 = crop_like(self.upconv4(out_iconv5), out_conv3) 165 | concat4 = torch.cat((out_upconv4, out_conv3), 1) 166 | out_iconv4 = self.iconv4(concat4) 167 | disp4 = self.alpha * self.predict_disp4(out_iconv4) + self.beta 168 | 169 | out_upconv3 = crop_like(self.upconv3(out_iconv4), out_conv2) 170 | disp4_up = crop_like(F.interpolate(disp4, scale_factor=2, mode='bilinear', align_corners=False), out_conv2) 171 | concat3 = torch.cat((out_upconv3, out_conv2, disp4_up), 1) 172 | out_iconv3 = self.iconv3(concat3) 173 | disp3 = self.alpha * self.predict_disp3(out_iconv3) + self.beta 174 | 175 | out_upconv2 = crop_like(self.upconv2(out_iconv3), out_conv1) 176 | disp3_up = crop_like(F.interpolate(disp3, scale_factor=2, mode='bilinear', align_corners=False), out_conv1) 177 | concat2 = torch.cat((out_upconv2, out_conv1, disp3_up), 1) 178 | out_iconv2 = self.iconv2(concat2) 179 | disp2 = self.alpha * self.predict_disp2(out_iconv2) + self.beta 180 | 181 | out_upconv1 = crop_like(self.upconv1(out_iconv2), x) 182 | disp2_up = crop_like(F.interpolate(disp2, scale_factor=2, mode='bilinear', align_corners=False), x) 183 | concat1 = torch.cat((out_upconv1, disp2_up), 1) 184 | out_iconv1 = self.iconv1(concat1) 185 | disp1 = self.alpha * self.predict_disp1(out_iconv1) + self.beta 186 | 187 | if self.training: 188 | return disp1, disp2, disp3, disp4, disp5, disp6 189 | else: 190 | return disp1 191 | 192 | if __name__ == "__main__": 193 | disp_res = DispResNet() 194 | input = torch.randn(4,3,256,832) 195 | out1, out2, out3, out4, out5, out6 = disp_res(input) 196 | out = [out1, out2, out3, out4, out5, out6] 197 
| for i in range(6): 198 | print('out{} shape: {}'.format(i+1, out[i].shape)) -------------------------------------------------------------------------------- /models/__init__.py: -------------------------------------------------------------------------------- 1 | from .DispResNet import DispResNet 2 | from .monodepth2 import * 3 | from . import models -------------------------------------------------------------------------------- /models/models.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import torch.nn.init as init 5 | import numpy as np 6 | 7 | def conv(in_planes, out_planes, kernel_size=3): 8 | return nn.Sequential( 9 | nn.Conv2d(in_planes, 10 | out_planes, 11 | kernel_size=kernel_size, 12 | padding=(kernel_size - 1) // 2, 13 | stride=2), nn.ReLU(inplace=True)) 14 | 15 | class VisualNet(nn.Module): 16 | ''' 17 | Encode imgs into visual features 18 | ''' 19 | def __init__(self): 20 | super(VisualNet, self).__init__() 21 | 22 | conv_planes = [16, 32, 64, 128, 256, 512, 512] 23 | self.conv1 = conv(6, conv_planes[0], kernel_size=7) 24 | self.conv2 = conv(conv_planes[0], conv_planes[1], kernel_size=5) 25 | self.conv3 = conv(conv_planes[1], conv_planes[2]) 26 | self.conv4 = conv(conv_planes[2], conv_planes[3]) 27 | self.conv5 = conv(conv_planes[3], conv_planes[4]) 28 | self.conv6 = conv(conv_planes[4], conv_planes[5]) 29 | self.conv7 = conv(conv_planes[5], conv_planes[6]) 30 | self.conv8 = nn.Conv2d(conv_planes[6], conv_planes[6], kernel_size=1) 31 | 32 | def init_weights(self): 33 | for m in self.modules(): 34 | if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d): 35 | init.xavier_uniform_(m.weight.data) 36 | if m.bias is not None: 37 | init.zeros_(m.bias) 38 | 39 | def forward(self, imgs): 40 | visual_fea = [] 41 | for i in range(len(imgs) - 1): 42 | input = torch.cat(imgs[i:i + 2], 1) 43 | out_conv1 = self.conv1(input) 44 | out_conv2 = self.conv2(out_conv1) 45 | out_conv3 = self.conv3(out_conv2) 46 | out_conv4 = self.conv4(out_conv3) 47 | out_conv5 = self.conv5(out_conv4) 48 | out_conv6 = self.conv6(out_conv5) 49 | out_conv7 = self.conv7(out_conv6) 50 | out_conv8 = self.conv8(out_conv7) 51 | 52 | visual_fea.append(out_conv8.mean(3).mean(2)) 53 | return torch.stack(visual_fea, dim=1) 54 | 55 | 56 | class ImuNet(nn.Module): 57 | ''' 58 | Encode imus into imu feature 59 | ''' 60 | def __init__(self, input_size=6, hidden_size=512): 61 | super(ImuNet, self).__init__() 62 | self.rnn = nn.LSTM(input_size=input_size, 63 | hidden_size=hidden_size, 64 | num_layers=2, 65 | batch_first=True) 66 | 67 | def init_weights(self): 68 | for m in self.modules(): 69 | if isinstance(m, nn.LSTM): 70 | init.xavier_normal_(m.all_weights[0][0], gain=np.sqrt(1)) 71 | init.xavier_normal_(m.all_weights[0][1], gain=np.sqrt(1)) 72 | init.xavier_normal_(m.all_weights[1][0], gain=np.sqrt(1)) 73 | init.xavier_normal_(m.all_weights[1][1], gain=np.sqrt(1)) 74 | 75 | def forward(self, imus): 76 | self.rnn.flatten_parameters() 77 | x = imus 78 | B, t, N, _ = x.shape # B T N 6 79 | x = x.reshape(B * t, N, -1) # B T*N 6 80 | out, (h, c) = self.rnn(x) # B*T 1000 81 | out = out[:, -1, :] 82 | return out.reshape(B, t, -1) 83 | 84 | 85 | class FuseModule(nn.Module): 86 | def __init__(self, channels, reduction): 87 | super(FuseModule, self).__init__() 88 | self.fc1 = nn.Linear(channels, channels // reduction) 89 | self.relu = nn.ReLU(inplace=True) 90 | self.fc2 = nn.Linear(channels // reduction, 
channels) 91 | self.sigmoid = nn.Sigmoid() 92 | 93 | def forward(self, x): 94 | module_input = x 95 | x = self.fc1(x) 96 | x = self.relu(x) 97 | x = self.fc2(x) 98 | x = self.sigmoid(x) 99 | return module_input * x 100 | 101 | class PoseNet(nn.Module): 102 | ''' 103 | Fuse both features and output the 6 DOF camera pose 104 | ''' 105 | def __init__(self, input_size=1024): 106 | super(PoseNet, self).__init__() 107 | 108 | self.se = FuseModule(input_size, 16) 109 | 110 | self.rnn = nn.LSTM(input_size=input_size, 111 | hidden_size=1024, 112 | num_layers=2, 113 | batch_first=True) 114 | 115 | self.fc1 = nn.Sequential(nn.Linear(1024, 6)) 116 | 117 | def init_weights(self): 118 | for m in self.modules(): 119 | if isinstance(m, nn.LSTM): 120 | init.xavier_normal_(m.all_weights[0][0], gain=np.sqrt(1)) 121 | init.xavier_normal_(m.all_weights[0][1], gain=np.sqrt(1)) 122 | init.xavier_normal_(m.all_weights[1][0], gain=np.sqrt(1)) 123 | init.xavier_normal_(m.all_weights[1][1], gain=np.sqrt(1)) 124 | elif isinstance(m, nn.Linear): 125 | init.xavier_normal_(m.weight.data, gain=np.sqrt(1)) 126 | if m.bias is not None: 127 | init.zeros_(m.bias) 128 | 129 | def forward(self, visual_fea, imu_fea): 130 | self.rnn.flatten_parameters() 131 | if imu_fea is not None: 132 | B, t, _ = imu_fea.shape 133 | imu_input = imu_fea.view(B, t, -1) 134 | visual_input = visual_fea.view(B, t, -1) 135 | inpt = torch.cat((visual_input, imu_input), dim=2) 136 | else: 137 | inpt = visual_fea 138 | inpt = self.se(inpt) 139 | out, (h, c) = self.rnn(inpt) 140 | out = 0.01 * self.fc1(out) 141 | return out 142 | 143 | if __name__ == '__main__': 144 | image_model = VisualNet() 145 | imgs = [torch.rand(4, 3, 256, 832)] * 5 146 | img_fea = image_model(imgs) 147 | print(img_fea.shape) 148 | 149 | imu_model = ImuNet() 150 | imus = torch.rand(4, 4, 11, 6) 151 | imu_fea = imu_model(imus) 152 | print(imu_fea.shape) 153 | 154 | pose_modle = PoseNet() 155 | pose = pose_modle(img_fea, imu_fea) 156 | print(pose.shape) 157 | -------------------------------------------------------------------------------- /models/monodepth2/__init__.py: -------------------------------------------------------------------------------- 1 | from .depth_decoder import DepthDecoder 2 | -------------------------------------------------------------------------------- /models/monodepth2/depth_decoder.py: -------------------------------------------------------------------------------- 1 | # Copyright Niantic 2019. Patent Pending. All rights reserved. 2 | # 3 | # This software is licensed under the terms of the Monodepth2 licence 4 | # which allows for non-commercial use only, the full terms of which are made 5 | # available in the LICENSE file. 
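# Note on the decoder below: DepthDecoder couples a ResnetEncoder backbone with
# a skip-connection decoder. At every output scale the raw sigmoid activation
# is mapped to a disparity map via alpha * sigmoid(x) + beta (defaults
# alpha=10, beta=0.01), so predicted disparities lie in (beta, alpha + beta).
# In training mode forward() returns one disparity map per scale, ordered from
# finest to coarsest; in eval mode it returns only the finest map, which is at
# the resolution of the input image.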
6 | 7 | from __future__ import absolute_import, division, print_function 8 | 9 | import numpy as np 10 | import torch 11 | import torch.nn as nn 12 | 13 | from collections import OrderedDict 14 | from .layers import * 15 | from .resnet_encoder import ResnetEncoder 16 | 17 | 18 | class DepthDecoder(nn.Module): 19 | def __init__(self, num_layers=18, pretrained=True, scales=range(4), 20 | num_output_channels=1, use_skips=True, alpha=10, beta=0.01): 21 | super(DepthDecoder, self).__init__() 22 | 23 | self.encoder = ResnetEncoder(num_layers=num_layers, pretrained=pretrained, num_input_images=1) 24 | self.num_output_channels = num_output_channels 25 | self.use_skips = use_skips 26 | self.upsample_mode = 'nearest' 27 | self.scales = scales 28 | self.alpha = alpha 29 | self.beta = beta 30 | 31 | self.num_ch_enc = self.encoder.num_ch_enc 32 | self.num_ch_dec = np.array([16, 32, 64, 128, 256]) 33 | 34 | # decoder 35 | self.convs = nn.ModuleDict() 36 | for i in range(4, -1, -1): 37 | # upconv_0 38 | num_ch_in = self.num_ch_enc[-1] if i == 4 else self.num_ch_dec[i + 1] 39 | num_ch_out = self.num_ch_dec[i] 40 | self.convs["upconv_{}_{}".format(i,0)] = ConvBlock(num_ch_in, num_ch_out) 41 | 42 | # upconv_1 43 | num_ch_in = self.num_ch_dec[i] 44 | if self.use_skips and i > 0: 45 | num_ch_in += self.num_ch_enc[i - 1] 46 | num_ch_out = self.num_ch_dec[i] 47 | self.convs["upconv_{}_{}".format(i,1)] = ConvBlock(num_ch_in, num_ch_out) 48 | 49 | for s in self.scales: 50 | self.convs["dispconv_{}".format(s)] = Conv3x3(self.num_ch_dec[s], self.num_output_channels) 51 | 52 | self.decoder = nn.ModuleList(list(self.convs.values())) 53 | self.sigmoid = nn.Sigmoid() 54 | 55 | def init_weights(self): 56 | for lay in self.convs.values(): 57 | for m in lay.modules(): 58 | if isinstance(m, nn.Conv2d) or isinstance(m, nn.ConvTranspose2d): 59 | nn.init.xavier_uniform_(m.weight.data) 60 | if m.bias is not None: 61 | m.bias.data.zero_() 62 | 63 | def forward(self, img): 64 | input_features = self.encoder(img) 65 | 66 | self.outputs = [] 67 | 68 | # decoder 69 | x = input_features[-1] 70 | for i in range(4, -1, -1): 71 | x = self.convs["upconv_{}_{}".format(i,0)](x) 72 | x = [upsample(x)] 73 | if self.use_skips and i > 0: 74 | x += [input_features[i - 1]] 75 | x = torch.cat(x, 1) 76 | x = self.convs["upconv_{}_{}".format(i,1)](x) 77 | if i in self.scales: 78 | self.outputs.append(self.alpha * self.sigmoid(self.convs["dispconv_{}".format(i)](x)) + self.beta) 79 | 80 | if self.training: 81 | return self.outputs[::-1] 82 | else: 83 | return self.outputs[-1] 84 | 85 | if __name__ == '__main__': 86 | model = DepthDecoder() 87 | x = torch.rand(1,3,128,416) 88 | for i in model(x): 89 | print(i.shape) -------------------------------------------------------------------------------- /models/monodepth2/layers.py: -------------------------------------------------------------------------------- 1 | # Copyright Niantic 2019. Patent Pending. All rights reserved. 2 | # 3 | # This software is licensed under the terms of the Monodepth2 licence 4 | # which allows for non-commercial use only, the full terms of which are made 5 | # available in the LICENSE file. 
6 | 7 | from __future__ import absolute_import, division, print_function 8 | 9 | import numpy as np 10 | 11 | import torch 12 | import torch.nn as nn 13 | import torch.nn.functional as F 14 | 15 | 16 | def disp_to_depth(disp, min_depth, max_depth): 17 | """Convert network's sigmoid output into depth prediction 18 | The formula for this conversion is given in the 'additional considerations' 19 | section of the paper. 20 | """ 21 | min_disp = 1 / max_depth 22 | max_disp = 1 / min_depth 23 | scaled_disp = min_disp + (max_disp - min_disp) * disp 24 | depth = 1 / scaled_disp 25 | return scaled_disp, depth 26 | 27 | 28 | def transformation_from_parameters(axisangle, translation, invert=False): 29 | """Convert the network's (axisangle, translation) output into a 4x4 matrix 30 | """ 31 | R = rot_from_axisangle(axisangle) 32 | t = translation.clone() 33 | 34 | if invert: 35 | R = R.transpose(1, 2) 36 | t *= -1 37 | 38 | T = get_translation_matrix(t) 39 | 40 | if invert: 41 | M = torch.matmul(R, T) 42 | else: 43 | M = torch.matmul(T, R) 44 | 45 | return M 46 | 47 | 48 | def get_translation_matrix(translation_vector): 49 | """Convert a translation vector into a 4x4 transformation matrix 50 | """ 51 | T = torch.zeros(translation_vector.shape[0], 4, 4).to(device=translation_vector.device) 52 | 53 | t = translation_vector.contiguous().view(-1, 3, 1) 54 | 55 | T[:, 0, 0] = 1 56 | T[:, 1, 1] = 1 57 | T[:, 2, 2] = 1 58 | T[:, 3, 3] = 1 59 | T[:, :3, 3, None] = t 60 | 61 | return T 62 | 63 | 64 | def rot_from_axisangle(vec): 65 | """Convert an axisangle rotation into a 4x4 transformation matrix 66 | (adapted from https://github.com/Wallacoloo/printipi) 67 | Input 'vec' has to be Bx1x3 68 | """ 69 | angle = torch.norm(vec, 2, 2, True) 70 | axis = vec / (angle + 1e-7) 71 | 72 | ca = torch.cos(angle) 73 | sa = torch.sin(angle) 74 | C = 1 - ca 75 | 76 | x = axis[..., 0].unsqueeze(1) 77 | y = axis[..., 1].unsqueeze(1) 78 | z = axis[..., 2].unsqueeze(1) 79 | 80 | xs = x * sa 81 | ys = y * sa 82 | zs = z * sa 83 | xC = x * C 84 | yC = y * C 85 | zC = z * C 86 | xyC = x * yC 87 | yzC = y * zC 88 | zxC = z * xC 89 | 90 | rot = torch.zeros((vec.shape[0], 4, 4)).to(device=vec.device) 91 | 92 | rot[:, 0, 0] = torch.squeeze(x * xC + ca) 93 | rot[:, 0, 1] = torch.squeeze(xyC - zs) 94 | rot[:, 0, 2] = torch.squeeze(zxC + ys) 95 | rot[:, 1, 0] = torch.squeeze(xyC + zs) 96 | rot[:, 1, 1] = torch.squeeze(y * yC + ca) 97 | rot[:, 1, 2] = torch.squeeze(yzC - xs) 98 | rot[:, 2, 0] = torch.squeeze(zxC - ys) 99 | rot[:, 2, 1] = torch.squeeze(yzC + xs) 100 | rot[:, 2, 2] = torch.squeeze(z * zC + ca) 101 | rot[:, 3, 3] = 1 102 | 103 | return rot 104 | 105 | 106 | class ConvBlock(nn.Module): 107 | """Layer to perform a convolution followed by ELU 108 | """ 109 | def __init__(self, in_channels, out_channels): 110 | super(ConvBlock, self).__init__() 111 | 112 | self.conv = Conv3x3(in_channels, out_channels) 113 | self.nonlin = nn.ELU(inplace=True) 114 | 115 | def forward(self, x): 116 | out = self.conv(x) 117 | out = self.nonlin(out) 118 | return out 119 | 120 | 121 | class Conv3x3(nn.Module): 122 | """Layer to pad and convolve input 123 | """ 124 | def __init__(self, in_channels, out_channels, use_refl=True): 125 | super(Conv3x3, self).__init__() 126 | 127 | if use_refl: 128 | self.pad = nn.ReflectionPad2d(1) 129 | else: 130 | self.pad = nn.ZeroPad2d(1) 131 | self.conv = nn.Conv2d(int(in_channels), int(out_channels), 3) 132 | 133 | def forward(self, x): 134 | out = self.pad(x) 135 | out = self.conv(out) 136 | return out 137 | 138 | 
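# Illustrative sketch of how the helpers in this file are typically chained to
# synthesize a target view. The batch size, image size, depth range and padding
# mode below are assumptions for the example, not values taken from this
# repository:
#
#   B, H, W = 4, 256, 832
#   disp = torch.rand(B, 1, H, W)                       # network sigmoid output
#   _, depth = disp_to_depth(disp, 0.1, 100.0)          # depth in [0.1, 100]
#   axisangle = torch.zeros(B, 1, 3)                    # predicted rotation (axis-angle)
#   translation = torch.zeros(B, 1, 3)                  # predicted translation
#   T = transformation_from_parameters(axisangle, translation)   # B x 4 x 4 camera motion
#   backproject = BackprojectDepth(B, H, W)             # defined below
#   project = Project3D(B, H, W)                        # defined below
#   cam_points = backproject(depth, inv_K)              # inv_K: B x 4 x 4 inverse intrinsics
#   pix_coords = project(cam_points, K, T)              # sampling grid normalized to [-1, 1]
#   warped = F.grid_sample(src_img, pix_coords, padding_mode="border")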
139 | class BackprojectDepth(nn.Module): 140 | """Layer to transform a depth image into a point cloud 141 | """ 142 | def __init__(self, batch_size, height, width): 143 | super(BackprojectDepth, self).__init__() 144 | 145 | self.batch_size = batch_size 146 | self.height = height 147 | self.width = width 148 | 149 | meshgrid = np.meshgrid(range(self.width), range(self.height), indexing='xy') 150 | self.id_coords = np.stack(meshgrid, axis=0).astype(np.float32) 151 | self.id_coords = nn.Parameter(torch.from_numpy(self.id_coords)) 152 | 153 | self.ones = nn.Parameter(torch.ones(self.batch_size, 1, self.height * self.width)) 154 | 155 | self.pix_coords = torch.unsqueeze(torch.stack( 156 | [self.id_coords[0].view(-1), self.id_coords[1].view(-1)], 0), 0) 157 | self.pix_coords = self.pix_coords.repeat(batch_size, 1, 1) 158 | self.pix_coords = nn.Parameter(torch.cat([self.pix_coords, self.ones], 1)) 159 | 160 | def forward(self, depth, inv_K): 161 | cam_points = torch.matmul(inv_K[:, :3, :3], self.pix_coords) 162 | cam_points = depth.view(self.batch_size, 1, -1) * cam_points 163 | cam_points = torch.cat([cam_points, self.ones], 1) 164 | 165 | return cam_points 166 | 167 | 168 | class Project3D(nn.Module): 169 | """Layer which projects 3D points into a camera with intrinsics K and at position T 170 | """ 171 | def __init__(self, batch_size, height, width, eps=1e-7): 172 | super(Project3D, self).__init__() 173 | 174 | self.batch_size = batch_size 175 | self.height = height 176 | self.width = width 177 | self.eps = eps 178 | 179 | def forward(self, points, K, T): 180 | P = torch.matmul(K, T)[:, :3, :] 181 | 182 | cam_points = torch.matmul(P, points) 183 | 184 | pix_coords = cam_points[:, :2, :] / (cam_points[:, 2, :].unsqueeze(1) + self.eps) 185 | pix_coords = pix_coords.view(self.batch_size, 2, self.height, self.width) 186 | pix_coords = pix_coords.permute(0, 2, 3, 1) 187 | pix_coords[..., 0] /= self.width - 1 188 | pix_coords[..., 1] /= self.height - 1 189 | pix_coords = (pix_coords - 0.5) * 2 190 | return pix_coords 191 | 192 | 193 | def upsample(x): 194 | """Upsample input tensor by a factor of 2 195 | """ 196 | return F.interpolate(x, scale_factor=2, mode="nearest") 197 | 198 | 199 | def get_smooth_loss(disp, img): 200 | """Computes the smoothness loss for a disparity image 201 | The color image is used for edge-aware smoothness 202 | """ 203 | grad_disp_x = torch.abs(disp[:, :, :, :-1] - disp[:, :, :, 1:]) 204 | grad_disp_y = torch.abs(disp[:, :, :-1, :] - disp[:, :, 1:, :]) 205 | 206 | grad_img_x = torch.mean(torch.abs(img[:, :, :, :-1] - img[:, :, :, 1:]), 1, keepdim=True) 207 | grad_img_y = torch.mean(torch.abs(img[:, :, :-1, :] - img[:, :, 1:, :]), 1, keepdim=True) 208 | 209 | grad_disp_x *= torch.exp(-grad_img_x) 210 | grad_disp_y *= torch.exp(-grad_img_y) 211 | 212 | return grad_disp_x.mean() + grad_disp_y.mean() 213 | 214 | 215 | class SSIM(nn.Module): 216 | """Layer to compute the SSIM loss between a pair of images 217 | """ 218 | def __init__(self): 219 | super(SSIM, self).__init__() 220 | self.mu_x_pool = nn.AvgPool2d(3, 1) 221 | self.mu_y_pool = nn.AvgPool2d(3, 1) 222 | self.sig_x_pool = nn.AvgPool2d(3, 1) 223 | self.sig_y_pool = nn.AvgPool2d(3, 1) 224 | self.sig_xy_pool = nn.AvgPool2d(3, 1) 225 | 226 | self.refl = nn.ReflectionPad2d(1) 227 | 228 | self.C1 = 0.01 ** 2 229 | self.C2 = 0.03 ** 2 230 | 231 | def forward(self, x, y): 232 | x = self.refl(x) 233 | y = self.refl(y) 234 | 235 | mu_x = self.mu_x_pool(x) 236 | mu_y = self.mu_y_pool(y) 237 | 238 | sigma_x = self.sig_x_pool(x ** 
2) - mu_x ** 2 239 | sigma_y = self.sig_y_pool(y ** 2) - mu_y ** 2 240 | sigma_xy = self.sig_xy_pool(x * y) - mu_x * mu_y 241 | 242 | SSIM_n = (2 * mu_x * mu_y + self.C1) * (2 * sigma_xy + self.C2) 243 | SSIM_d = (mu_x ** 2 + mu_y ** 2 + self.C1) * (sigma_x + sigma_y + self.C2) 244 | 245 | return torch.clamp((1 - SSIM_n / SSIM_d) / 2, 0, 1) 246 | 247 | 248 | def compute_depth_errors(gt, pred): 249 | """Computation of error metrics between predicted and ground truth depths 250 | """ 251 | thresh = torch.max((gt / pred), (pred / gt)) 252 | a1 = (thresh < 1.25 ).float().mean() 253 | a2 = (thresh < 1.25 ** 2).float().mean() 254 | a3 = (thresh < 1.25 ** 3).float().mean() 255 | 256 | rmse = (gt - pred) ** 2 257 | rmse = torch.sqrt(rmse.mean()) 258 | 259 | rmse_log = (torch.log(gt) - torch.log(pred)) ** 2 260 | rmse_log = torch.sqrt(rmse_log.mean()) 261 | 262 | abs_rel = torch.mean(torch.abs(gt - pred) / gt) 263 | 264 | sq_rel = torch.mean((gt - pred) ** 2 / gt) 265 | 266 | return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3 267 | -------------------------------------------------------------------------------- /models/monodepth2/resnet_encoder.py: -------------------------------------------------------------------------------- 1 | # Copyright Niantic 2019. Patent Pending. All rights reserved. 2 | # 3 | # This software is licensed under the terms of the Monodepth2 licence 4 | # which allows for non-commercial use only, the full terms of which are made 5 | # available in the LICENSE file. 6 | 7 | from __future__ import absolute_import, division, print_function 8 | 9 | import numpy as np 10 | 11 | import torch 12 | import torch.nn as nn 13 | import torchvision.models as models 14 | import torch.utils.model_zoo as model_zoo 15 | 16 | 17 | class ResNetMultiImageInput(models.ResNet): 18 | """Constructs a resnet model with varying number of input images. 19 | Adapted from https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py 20 | """ 21 | def __init__(self, block, layers, num_classes=1000, num_input_images=1): 22 | super(ResNetMultiImageInput, self).__init__(block, layers) 23 | self.inplanes = 64 24 | self.conv1 = nn.Conv2d( 25 | num_input_images * 3, 64, kernel_size=7, stride=2, padding=3, bias=False) 26 | self.bn1 = nn.BatchNorm2d(64) 27 | self.relu = nn.ReLU(inplace=True) 28 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1) 29 | self.layer1 = self._make_layer(block, 64, layers[0]) 30 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2) 31 | self.layer3 = self._make_layer(block, 256, layers[2], stride=2) 32 | self.layer4 = self._make_layer(block, 512, layers[3], stride=2) 33 | 34 | for m in self.modules(): 35 | if isinstance(m, nn.Conv2d): 36 | nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu') 37 | elif isinstance(m, nn.BatchNorm2d): 38 | nn.init.constant_(m.weight, 1) 39 | nn.init.constant_(m.bias, 0) 40 | 41 | 42 | def resnet_multiimage_input(num_layers, pretrained=False, num_input_images=1): 43 | """Constructs a ResNet model. 44 | Args: 45 | num_layers (int): Number of resnet layers. 
Must be 18 or 50 46 | pretrained (bool): If True, returns a model pre-trained on ImageNet 47 | num_input_images (int): Number of frames stacked as input 48 | """ 49 | assert num_layers in [18, 50], "Can only run with 18 or 50 layer resnet" 50 | blocks = {18: [2, 2, 2, 2], 50: [3, 4, 6, 3]}[num_layers] 51 | block_type = {18: models.resnet.BasicBlock, 50: models.resnet.Bottleneck}[num_layers] 52 | model = ResNetMultiImageInput(block_type, blocks, num_input_images=num_input_images) 53 | 54 | if pretrained: 55 | loaded = model_zoo.load_url(models.resnet.model_urls['resnet{}'.format(num_layers)]) 56 | loaded['conv1.weight'] = torch.cat( 57 | [loaded['conv1.weight']] * num_input_images, 1) / num_input_images 58 | model.load_state_dict(loaded) 59 | return model 60 | 61 | 62 | class ResnetEncoder(nn.Module): 63 | """Pytorch module for a resnet encoder 64 | """ 65 | def __init__(self, num_layers, pretrained, num_input_images=1): 66 | super(ResnetEncoder, self).__init__() 67 | 68 | self.num_ch_enc = np.array([64, 64, 128, 256, 512]) 69 | 70 | resnets = {18: models.resnet18, 71 | 34: models.resnet34, 72 | 50: models.resnet50, 73 | 101: models.resnet101, 74 | 152: models.resnet152} 75 | 76 | if num_layers not in resnets: 77 | raise ValueError("{} is not a valid number of resnet layers".format(num_layers)) 78 | 79 | if num_input_images > 1: 80 | self.encoder = resnet_multiimage_input(num_layers, pretrained, num_input_images) 81 | else: 82 | self.encoder = resnets[num_layers](pretrained) 83 | 84 | if num_layers > 34: 85 | self.num_ch_enc[1:] *= 4 86 | 87 | def forward(self, input_image): 88 | self.features = [] 89 | x = (input_image - 0.45) / 0.225 90 | x = self.encoder.conv1(x) 91 | x = self.encoder.bn1(x) 92 | self.features.append(self.encoder.relu(x)) 93 | self.features.append(self.encoder.layer1(self.encoder.maxpool(self.features[-1]))) 94 | self.features.append(self.encoder.layer2(self.features[-1])) 95 | self.features.append(self.encoder.layer3(self.features[-1])) 96 | self.features.append(self.encoder.layer4(self.features[-1])) 97 | 98 | return self.features 99 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | imageio==2.19.1 2 | matplotlib==3.5.2 3 | numpy==1.21.6 4 | opencv-python==4.5.5.64 5 | Pillow==9.1.0 6 | scipy==1.1.0 7 | tensorboard==2.9.0 8 | tensorboardX==2.5.1 9 | torch==1.10.0 10 | torchvision==0.11.1 11 | tqdm 12 | -------------------------------------------------------------------------------- /test_disp.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import cv2 4 | import numpy as np 5 | import torch 6 | from path import Path 7 | from scipy.misc import imread, imresize 8 | from scipy.ndimage.interpolation import zoom 9 | from tqdm import tqdm 10 | import matplotlib.pyplot as plt 11 | 12 | import models 13 | cmap = plt.cm.magma 14 | 15 | parser = argparse.ArgumentParser(description='Script for DispNet testing with corresponding groundTruth', formatter_class=argparse.ArgumentDefaultsHelpFormatter) 16 | parser.add_argument("--pretrained-dispnet", required=True, type=str, help="pretrained DispNet path") 17 | parser.add_argument("--img-height", default=256, type=int, help="Image height") 18 | parser.add_argument("--img-width", default=832, type=int, help="Image width") 19 | parser.add_argument("--no-resize", action='store_true', help="no resizing is done") 20 | 
parser.add_argument("--save-depth", action='store_true', help="if save depth map") 21 | parser.add_argument("--min-depth", default=1e-3) 22 | parser.add_argument("--max-depth", default=80) 23 | parser.add_argument("--dataset-dir", default='.', type=str, help="Dataset directory") 24 | parser.add_argument("--dataset-list", default=None, type=str, help="Dataset list file") 25 | parser.add_argument("--output-dir", default=None, type=str, help="Output directory for saving predictions in a big 3D numpy file") 26 | parser.add_argument("--gt-type", default='KITTI', type=str, help="GroundTruth data type", choices=['npy', 'png', 'KITTI', 'stillbox']) 27 | parser.add_argument("--img-exts", default=['png', 'jpg', 'bmp'], nargs='*', type=str, help="images extensions to glob") 28 | 29 | device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu") 30 | 31 | def load_tensor_image(filename, args): 32 | img = imread(filename).astype(np.float32) 33 | h, w, _ = img.shape 34 | if (not args.no_resize) and (h != args.img_height or w != args.img_width): 35 | img = imresize(img, 36 | (args.img_height, args.img_width)).astype(np.float32) 37 | img = np.transpose(img, (2, 0, 1)) 38 | tensor_img = ((torch.from_numpy(img).unsqueeze(0) / 255 - 0.5) / 39 | 0.5).to(device) 40 | return tensor_img 41 | 42 | def colored_depth(depth, d_min=None, d_max=None): 43 | if d_min is None: 44 | d_min = np.min(depth) 45 | if d_max is None: 46 | d_max = np.max(depth) 47 | depth_relative = ((depth-d_min)/(d_max-d_min)) 48 | return 255 * cmap(depth_relative)[:,:,:3] 49 | 50 | @torch.no_grad() 51 | def main(): 52 | args = parser.parse_args() 53 | disp_net = models.DepthDecoder(alpha=10, beta=0.01).to(device) 54 | weights = torch.load(args.pretrained_dispnet) 55 | disp_net.load_state_dict(weights['state_dict']) 56 | disp_net.eval() 57 | 58 | if args.dataset_list: 59 | dataset_dir = Path(args.dataset_dir) 60 | with open(args.dataset_list, 'r') as f: 61 | test_files = list(f.read().splitlines()) 62 | else: 63 | print('No dataset_list, please specify the dataset list for depth estimation') 64 | exit() 65 | 66 | print('{} files to test'.format(len(test_files))) 67 | 68 | if not args.output_dir: 69 | args.output_dir = args.dataset_dir 70 | output_dir = Path(args.output_dir) 71 | print('writing npy file in: ', output_dir) 72 | output_dir.makedirs_p() 73 | 74 | if args.save_depth: 75 | output_depth_dir = output_dir/'disp' 76 | print('saving depth maps in: ', output_depth_dir) 77 | output_depth_dir.makedirs_p() 78 | 79 | for j in tqdm(range(len(test_files))): 80 | tgt_img = load_tensor_image(dataset_dir + test_files[j], args) 81 | pred_disp = disp_net(tgt_img).cpu().numpy()[0, 0] 82 | 83 | if j == 0: 84 | predictions = np.zeros((len(test_files), *pred_disp.shape)) 85 | predictions[j] = 1 / pred_disp 86 | 87 | if args.save_depth: 88 | depth_colored = colored_depth(pred_disp)[:, :, ::-1] 89 | depth_colored = np.array(depth_colored, dtype=np.uint8) 90 | cv2.imwrite(output_depth_dir/'{}_disp.png'.format(str(j).zfill(6)), depth_colored) 91 | 92 | np.save(output_dir / 'predictions.npy', predictions) 93 | 94 | if __name__ == '__main__': 95 | main() 96 | -------------------------------------------------------------------------------- /test_pose.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import models 3 | import os 4 | from utils import custom_transform 5 | from utils.pose_transfer import * 6 | from utils.inverse_warp import * 7 | import argparse 8 | from utils.plot_traj 
import get_prediction_traj, plot_route3d, plot_route2d 9 | import pandas as pd 10 | 11 | parser = argparse.ArgumentParser(description='UnVIO test file', formatter_class=argparse.ArgumentDefaultsHelpFormatter) 12 | parser.add_argument('--dataset_root', type=str, help='dataset root path') 13 | parser.add_argument('--dataset', type=str, choices=['KITTI', 'Malaga'], default='KITTI', help='which dataset is used to test') 14 | parser.add_argument('--sequence-length', type=int, metavar='N', help='sequence length for testing', default=5) 15 | parser.add_argument('--rotation-mode', type=str, choices=['euler', 'quat'], default='euler', help='rotation mode for PoseExpnet : euler (yaw,pitch,roll) or quaternion (last 3 coefficients)') 16 | parser.add_argument('--pretrained-visualnet', dest='pretrained_visualnet', default=None, metavar='PATH', help='path to pre-trained visual net model') 17 | parser.add_argument('--pretrained-imunet', dest='pretrained_imunet', default=None, metavar='PATH', help='path to pre-trained imu net model') 18 | parser.add_argument('--pretrained-posenet', dest='pretrained_posenet', default=None, metavar='PATH', help='path to pre-trained pose net model') 19 | parser.add_argument('--testscene', type=str, help='test_scene of dataset') 20 | parser.add_argument("--show-traj", action='store_true', default=False, help="show trajectory") 21 | 22 | device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu") 23 | 24 | args = parser.parse_args() 25 | 26 | @torch.no_grad() 27 | def main(): 28 | torch.multiprocessing.set_sharing_strategy('file_system') 29 | data_root = args.dataset_root 30 | args.imu_range = [-10, 0] 31 | # load data 32 | if args.dataset == 'KITTI': 33 | from dataset.KITTIDataset import DataSequence as dataset 34 | data_path = '{}/KITTI_rec_256'.format(data_root) 35 | args.img_width = 832 36 | args.img_height = 256 37 | elif args.dataset == 'Malaga': 38 | from dataset.MalagaDataset import DataSequence as dataset 39 | data_path = '{}/Malaga_Down/'.format(data_root) 40 | args.img_width = 832 41 | args.img_height = 256 42 | else: 43 | raise ValueError('Unsupported dataset') 44 | 45 | # init visual net, imunet, posenet 46 | visual_net = models.models.VisualNet().to(device) 47 | imu_net = models.models.ImuNet().to(device) 48 | pose_net = models.models.PoseNet(input_size=1024).to(device) 49 | 50 | if args.pretrained_posenet: 51 | print("=> using pre-trained weights for pose_net") 52 | weights = torch.load(args.pretrained_posenet) 53 | pose_net.load_state_dict(weights['state_dict'], strict=False) 54 | else: 55 | pose_net.init_weights() 56 | 57 | if args.pretrained_visualnet: 58 | print("=> using pre-trained weights for visual_net") 59 | weights = torch.load(args.pretrained_visualnet) 60 | visual_net.load_state_dict(weights['state_dict'], strict=False) 61 | else: 62 | visual_net.init_weights() 63 | 64 | if args.pretrained_imunet: 65 | print("=> using pre-trained weights for imu_net") 66 | weights = torch.load(args.pretrained_imunet) 67 | imu_net.load_state_dict(weights['state_dict'], strict=True) 68 | else: 69 | imu_net.init_weights() 70 | 71 | test_scene = [args.testscene] 72 | print('test_scene: ', test_scene) 73 | 74 | save_path = os.path.split(args.pretrained_posenet)[0] 75 | print('=> Save results at {}'.format(save_path)) 76 | 77 | test_transform = custom_transform.Compose([ 78 | custom_transform.ToTensor(), 79 | custom_transform.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]) 80 | ]) 81 | test_set = dataset(root=data_path, 82 | seed=0, 83 | 
sequence_length=args.sequence_length, 84 | imu_range=args.imu_range, 85 | transform=test_transform, 86 | shuffle=False, 87 | scene=test_scene, 88 | image_width=args.img_width, 89 | image_height=args.img_height) 90 | 91 | test_loader = torch.utils.data.DataLoader(test_set, 92 | batch_size=1, 93 | shuffle=False, 94 | num_workers=4, 95 | pin_memory=True) 96 | 97 | print('sequence length: {}'.format(len(test_loader))) 98 | predictions_array = [] 99 | visual_net.eval() 100 | imu_net.eval() 101 | pose_net.eval() 102 | GTS = [] 103 | 104 | for i, (imgs, imus, intr, gt) in enumerate(test_loader): 105 | if args.dataset == 'KITTI': 106 | if i == 0: 107 | GTS.append(gt[0, 0]) 108 | for j in range(1, args.sequence_length): 109 | GTS.append(GTS[-1] @ gt[0, j]) 110 | 111 | else: 112 | GTS.append(GTS[-1] @ gt[0, -1]) 113 | 114 | print('{}th batch of total {}'.format(i + 1, len(test_loader)), end="") 115 | print('\r', end="", flush=True) 116 | 117 | imgs = [img.to(device) for img in imgs] 118 | imus = imus.to(device) 119 | visual_feature = visual_net(imgs) # B T 1024 120 | imu_feature = imu_net(imus[:, 1:]) 121 | out = pose_net(visual_feature, 122 | imu_feature).data.cpu().numpy() # B T-1 6 123 | 124 | if i == 0: 125 | predictions_array.append(np.zeros([1, 6])) 126 | for j in range(args.sequence_length - 1): 127 | predictions_array.append(out[:, j, :]) 128 | 129 | else: 130 | predictions_array.append(out[:, -1, :]) 131 | 132 | print('\nlength of predictions_array: ', len(predictions_array)) 133 | absolute_pose = np.array(relative2absolute(predictions_array)) 134 | tmppose = absolute_pose[:, :3].reshape(absolute_pose.shape[0], -1) 135 | save_csv = pd.DataFrame(tmppose) 136 | save_csv.to_csv(save_path + 137 | '/predpose_{}.csv'.format(''.join(test_scene)), 138 | header=None, 139 | index=None) 140 | 141 | if args.show_traj: 142 | if args.dataset == 'KITTI': 143 | GTS = np.stack(GTS, 0) 144 | scale = GTS[:, :3, 3].mean() / absolute_pose[:, :3, 3].mean() 145 | plot_route2d(get_prediction_traj(GTS), 146 | get_prediction_traj(absolute_pose * scale)) 147 | else: 148 | plot_route3d(None, get_prediction_traj(absolute_pose)) 149 | 150 | def out2pose(out): 151 | pose = [pose_vec2mat4(out[:, i]) for i in range(4)] 152 | pose = [ 153 | pose[0] @ pose[1], pose[1], 154 | b_inv(pose[2]), 155 | b_inv(pose[2] @ pose[3]) 156 | ] 157 | pose = torch.stack(pose, dim=1) 158 | return pose[:, :, :3, :] 159 | 160 | def relative2absolute(pose): 161 | abs_pose_mat = [] 162 | for i in range(len(pose)): 163 | temp_mat = _6Dofto16mat(pose[i]) 164 | if i == 0: 165 | abs_pose_mat.append(temp_mat) 166 | else: 167 | abs_pose_mat.append(abs_pose_mat[i - 1] @ temp_mat) 168 | return abs_pose_mat 169 | 170 | def _6Dofto16mat(pose): 171 | translation = pose[0][3:] 172 | rotation = pose[0][:3] 173 | R = euler_matrix(rotation[0], rotation[1], rotation[2]) 174 | T = np.vstack([np.hstack([R, translation.reshape(-1, 1)]), [0, 0, 0, 1]]) 175 | return T 176 | 177 | def _16mat26Dof(mat): 178 | translation = [str(item) for item in list(mat[:3, 3])] 179 | roation = [str(item) for item in list(euler_from_matrix(mat))] 180 | return roation + translation 181 | 182 | if __name__ == "__main__": 183 | main() 184 | -------------------------------------------------------------------------------- /utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Ironbrotherstyle/UnVIO/173e8ef646074da45cb64c77da36e474e4746031/utils/__init__.py 
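For reference, the pose evaluation script test_pose.py listed above can be run along the following lines; every path and the scene name are placeholders, only the command-line flags come from its argument parser:

    python test_pose.py --dataset_root /path/to/DATA_ROOT --dataset KITTI \
        --testscene SCENE_FOLDER_NAME \
        --pretrained-visualnet /path/to/visualnet.pth.tar \
        --pretrained-imunet /path/to/imunet.pth.tar \
        --pretrained-posenet /path/to/posenet.pth.tar \
        --show-traj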
-------------------------------------------------------------------------------- /utils/custom_transform.py: -------------------------------------------------------------------------------- 1 | from __future__ import division 2 | import torch 3 | import random 4 | import numpy as np 5 | from scipy.misc import imresize 6 | 7 | class Compose(object): 8 | def __init__(self, transforms): 9 | self.transforms = transforms 10 | 11 | def __call__(self, images, imus, intrinsics): 12 | for t in self.transforms: 13 | images, imus, intrinsics = t(images, imus, intrinsics) 14 | return images, imus, intrinsics 15 | 16 | class Normalize(object): 17 | def __init__(self, mean, std): 18 | self.mean = mean 19 | self.std = std 20 | 21 | def __call__(self, images, imus, intrinsics): 22 | for tensor in images: 23 | for t, m, s in zip(tensor, self.mean, self.std): 24 | t.sub_(m).div_(s) 25 | return images, imus, intrinsics 26 | 27 | class ToTensor(object): 28 | """Converts a list of numpy.ndarray (H x W x C) along with a intrinsics matrix to a list of torch.FloatTensor of shape (C x H x W) with a intrinsics tensor.""" 29 | 30 | def __call__(self, images, imus, intrinsics): 31 | tensors = [] 32 | for im in images: 33 | # put it from HWC to CHW format 34 | im = np.transpose(im, (2, 0, 1)) 35 | # handle numpy array 36 | tensors.append(torch.from_numpy(im).float()/255) 37 | return tensors, imus, intrinsics 38 | 39 | class RandomHorizontalFlip(object): 40 | """Randomly horizontally flips the given numpy array with a probability of 0.5""" 41 | 42 | def __call__(self, images, imus, intrinsics): 43 | assert intrinsics is not None 44 | if random.random() < 0.5: 45 | output_intrinsics = np.copy(intrinsics) 46 | output_images = [np.copy(np.fliplr(im)) for im in images] 47 | w = output_images[0].shape[1] 48 | output_intrinsics[0,2] = w - output_intrinsics[0,2] 49 | else: 50 | output_images = images 51 | output_intrinsics = intrinsics 52 | return output_images, imus, output_intrinsics 53 | 54 | 55 | class RandomScaleCrop(object): 56 | """Randomly zooms images up to 15% and crop them to keep same size as before.""" 57 | 58 | def __call__(self, images, imus, intrinsics): 59 | assert intrinsics is not None 60 | output_intrinsics = np.copy(intrinsics) 61 | 62 | in_h, in_w, _ = images[0].shape 63 | x_scaling, y_scaling = np.random.uniform(1,1.15,2) 64 | scaled_h, scaled_w = int(in_h * y_scaling), int(in_w * x_scaling) 65 | 66 | output_intrinsics[0] *= x_scaling 67 | output_intrinsics[1] *= y_scaling 68 | scaled_images = [imresize(im, (scaled_h, scaled_w)) for im in images] 69 | 70 | offset_y = np.random.randint(scaled_h - in_h + 1) 71 | offset_x = np.random.randint(scaled_w - in_w + 1) 72 | cropped_images = [im[offset_y:offset_y + in_h, offset_x:offset_x + in_w] for im in scaled_images] 73 | 74 | output_intrinsics[0,2] -= offset_x 75 | output_intrinsics[1,2] -= offset_y 76 | 77 | return cropped_images, imus, output_intrinsics 78 | 79 | 80 | class AugmentImagePair(object): 81 | def __init__(self, augment_parameters=[0.8, 1.2, 0.5, 2.0, 0.8, 1.2]): 82 | self.gamma_low = augment_parameters[0] # 0.8 83 | self.gamma_high = augment_parameters[1] # 1.2 84 | self.brightness_low = augment_parameters[2] # 0.5 85 | self.brightness_high = augment_parameters[3] # 2.0 86 | self.color_low = augment_parameters[4] # 0.8 87 | self.color_high = augment_parameters[5] # 1.2 88 | 89 | def __call__(self, images, imus, intrinsics): 90 | p = np.random.uniform(0, 1, 1) 91 | sample = [] 92 | if p > 0.5: 93 | random_gamma = np.random.uniform(self.gamma_low, 
self.gamma_high) 94 | random_brightness = np.random.uniform(self.brightness_low, self.brightness_high) 95 | random_colors = np.random.uniform(self.color_low, self.color_high, 3) 96 | for img in images: 97 | # randomly shift gamma 98 | img_aug = img ** random_gamma 99 | 100 | # randomly shift brightness 101 | img_aug = img_aug * random_brightness 102 | 103 | # randomly shift color 104 | for i in range(3): 105 | img_aug[i, :, :] *= random_colors[i] 106 | 107 | # saturate 108 | img_aug = torch.clamp(img_aug, 0, 1) 109 | sample.append(img_aug) 110 | else: 111 | sample = images 112 | return sample, imus, intrinsics -------------------------------------------------------------------------------- /utils/inverse_warp.py: -------------------------------------------------------------------------------- 1 | from __future__ import division 2 | import torch 3 | import torch.nn.functional as F 4 | 5 | device = torch.device( 6 | 'cuda') if torch.cuda.is_available() else torch.device('cpu') 7 | 8 | pixel_coords = None 9 | 10 | # def b_inv(b_mat): 11 | # eye = b_mat.new_ones(b_mat.size(-1)).diag().expand_as(b_mat) 12 | # b_inv, _ = torch.gesv(eye, b_mat) 13 | # return b_inv 14 | 15 | def b_inv(b_mat): 16 | b_inv = torch.inverse(b_mat) 17 | return b_inv 18 | 19 | def set_id_grid(depth): 20 | global pixel_coords 21 | b, h, w = depth.size() 22 | i_range = torch.arange(0, h).view(1, h, 1).expand(1, h, w).type_as(depth) # [1, H, W] 23 | j_range = torch.arange(0, w).view(1, 1, w).expand(1, h, w).type_as(depth) # [1, H, W] 24 | ones = torch.ones(1, h, w).type_as(depth) 25 | 26 | pixel_coords = torch.stack((j_range, i_range, ones), dim=1) # [1, 3, H, W] 27 | 28 | def check_sizes(input, input_name, expected): 29 | condition = [input.ndimension() == len(expected)] 30 | for i, size in enumerate(expected): 31 | if size.isdigit(): 32 | condition.append(input.size(i) == int(size)) 33 | assert(all(condition)), "wrong size for {}, expected {}, got {}".format(input_name, 'x'.join(expected), list(input.size())) 34 | 35 | def pixel2cam(depth, intrinsics_inv): 36 | global pixel_coords 37 | """Transform coordinates in the pixel frame to the camera frame. 38 | Args: 39 | depth: depth maps -- [B, H, W] 40 | intrinsics_inv: intrinsics_inv matrix for each element of batch -- [B, 3, 3] 41 | Returns: 42 | array of (u,v,1) cam coordinates -- [B, 3, H, W] 43 | """ 44 | b, h, w = depth.size() 45 | if (pixel_coords is None) or pixel_coords.size(2) < h: 46 | set_id_grid(depth) 47 | current_pixel_coords = pixel_coords[:, :, :h, :w].expand(b, 3, h, w).reshape(b, 3, -1) # [B, 3, H*W] 48 | cam_coords = (intrinsics_inv @ current_pixel_coords).reshape(b, 3, h, w) 49 | return cam_coords * depth.unsqueeze(1) 50 | 51 | def cam2pixel(cam_coords, proj_c2p_rot, proj_c2p_tr, padding_mode): 52 | """Transform coordinates in the camera frame to the pixel frame. 53 | Args: 54 | cam_coords: pixel coordinates defined in the first camera coordinates system -- [B, 4, H, W] #3? 55 | proj_c2p_rot: rotation matrix of cameras -- [B, 3, 4] [B, 3, 3]? 
56 | proj_c2p_tr: translation vectors of cameras -- [B, 3, 1] 57 | Returns: 58 | array of [-1,1] coordinates -- [B, 2, H, W] 59 | """ 60 | b, _, h, w = cam_coords.size() 61 | cam_coords_flat = cam_coords.reshape(b, 3, -1) # [B, 3, H*W] 62 | if proj_c2p_rot is not None: 63 | pcoords = proj_c2p_rot @ cam_coords_flat 64 | else: 65 | pcoords = cam_coords_flat 66 | 67 | if proj_c2p_tr is not None: 68 | pcoords = pcoords + proj_c2p_tr # [B, 3, H*W] 69 | X = pcoords[:, 0] 70 | Y = pcoords[:, 1] 71 | Z = pcoords[:, 2].clamp(min=1e-3) 72 | 73 | X_norm = 2*(X / Z)/(w-1) - 1 # Normalized, -1 if on extreme left, 1 if on extreme right (x = w-1) [B, H*W] 74 | Y_norm = 2*(Y / Z)/(h-1) - 1 # Idem [B, H*W] 75 | if padding_mode == 'zeros': 76 | X_mask = ((X_norm > 1)+(X_norm < -1)).detach() 77 | X_norm[X_mask] = 2 # make sure that no point in warped image is a combination of im and gray 78 | Y_mask = ((Y_norm > 1)+(Y_norm < -1)).detach() 79 | Y_norm[Y_mask] = 2 80 | 81 | pixel_coords = torch.stack([X_norm, Y_norm], dim=2) # [B, H*W, 2] 82 | return pixel_coords.reshape(b, h, w, 2) 83 | 84 | def euler2mat(angle): 85 | """Convert euler angles to rotation matrix. 86 | 87 | Reference: https://github.com/pulkitag/pycaffe-utils/blob/master/rot_utils.py#L174 88 | 89 | Args: 90 | angle: rotation angle along 3 axis (in radians) -- size = [B, 3] 91 | Returns: 92 | Rotation matrix corresponding to the euler angles -- size = [B, 3, 3] 93 | """ 94 | B = angle.size(0) 95 | x, y, z = angle[:, 0], angle[:, 1], angle[:, 2] 96 | 97 | cosz = torch.cos(z) 98 | sinz = torch.sin(z) 99 | 100 | zeros = z.detach()*0 101 | ones = zeros.detach()+1 102 | zmat = torch.stack([cosz, -sinz, zeros, 103 | sinz, cosz, zeros, 104 | zeros, zeros, ones], dim=1).reshape(B, 3, 3) 105 | 106 | cosy = torch.cos(y) 107 | siny = torch.sin(y) 108 | 109 | ymat = torch.stack([cosy, zeros, siny, 110 | zeros, ones, zeros, 111 | -siny, zeros, cosy], dim=1).reshape(B, 3, 3) 112 | 113 | cosx = torch.cos(x) 114 | sinx = torch.sin(x) 115 | 116 | xmat = torch.stack([ones, zeros, zeros, 117 | zeros, cosx, -sinx, 118 | zeros, sinx, cosx], dim=1).reshape(B, 3, 3) 119 | 120 | rotMat = xmat @ ymat @ zmat 121 | return rotMat 122 | 123 | def quat2mat(quat): 124 | """Convert quaternion coefficients to rotation matrix. 125 | 126 | Args: 127 | quat: first three coeff of quaternion of rotation. fourht is then computed to have a norm of 1 -- size = [B, 3] 128 | Returns: 129 | Rotation matrix corresponding to the quaternion -- size = [B, 3, 3] 130 | """ 131 | norm_quat = torch.cat([quat[:, :1].detach()*0 + 1, quat], dim=1) 132 | norm_quat = norm_quat/norm_quat.norm(p=2, dim=1, keepdim=True) 133 | w, x, y, z = norm_quat[:, 0], norm_quat[:, 1], norm_quat[:, 2], norm_quat[:, 3] 134 | 135 | B = quat.size(0) 136 | 137 | w2, x2, y2, z2 = w.pow(2), x.pow(2), y.pow(2), z.pow(2) 138 | wx, wy, wz = w*x, w*y, w*z 139 | xy, xz, yz = x*y, x*z, y*z 140 | 141 | rotMat = torch.stack([w2 + x2 - y2 - z2, 2*xy - 2*wz, 2*wy + 2*xz, 142 | 2*wz + 2*xy, w2 - x2 + y2 - z2, 2*yz - 2*wx, 143 | 2*xz - 2*wy, 2*wx + 2*yz, w2 - x2 - y2 + z2], dim=1).reshape(B, 3, 3) 144 | return rotMat 145 | 146 | def pose_vec2mat(vec, rotation_mode='euler'): 147 | """ 148 | Convert 6DoF parameters to transformation matrix. 
149 | 150 | Args: 151 | vec: 6DoF parameters in the order of rx, ry, rz, tx, ty, tz -- [B, 6] 152 | Returns: 153 | A transformation matrix -- [B, 3, 4] 154 | """ 155 | translation = vec[:, 3:].unsqueeze(-1) # [B, 3, 1] 156 | rot = vec[:,:3] 157 | 158 | if rotation_mode == 'euler': 159 | rot_mat = euler2mat(rot) # [B, 3, 3] 160 | elif rotation_mode == 'quat': 161 | rot_mat = quat2mat(rot) # [B, 3, 3] 162 | transform_mat = torch.cat([rot_mat, translation], dim=2) # [B, 3, 4] 163 | return transform_mat 164 | 165 | def pose_vec2mat4(vec, rotation_mode='euler'): 166 | """ 167 | Convert 6DoF parameters to transformation matrix. 168 | 169 | Args: 170 | vec: 6DoF parameters in the order of rx, ry, rz, tx, ty, tz -- [B, 6] 171 | Returns: 172 | A transformation matrix -- [B, 4, 4] 173 | """ 174 | translation = vec[:, 3:].unsqueeze(-1) # [B, 3, 1] 175 | rot = vec[:, :3] 176 | 177 | if rotation_mode == 'euler': 178 | rot_mat = euler2mat(rot) # [B, 3, 3] 179 | elif rotation_mode == 'quat': 180 | rot_mat = quat2mat(rot) # [B, 3, 3] 181 | transform_mat = torch.cat([rot_mat, translation], dim=2) # [B, 3, 4] 182 | add = torch.FloatTensor([0, 0, 0, 1]).view(1, 4).unsqueeze(0).repeat(transform_mat.size(0), 1, 1).to(device) 183 | transform_mat = torch.cat(([transform_mat, add]), dim=1) 184 | 185 | return transform_mat 186 | 187 | def inverse_warp(img, depth, pose, intrinsics, rotation_mode='euler', padding_mode='zeros'): 188 | """ 189 | Inverse warp a source image to the target image plane. 190 | formula: 191 | ps = K*T*D*K^-1*pt 192 | 193 | Args: 194 | img: the source image (where to sample pixels) -- [B, 3, H, W] 195 | depth: depth map of the target image -- [B, H, W] 196 | pose: pose transformation matrix from target to source -- [B, 3, 4] 197 | intrinsics: camera intrinsic matrix -- [B, 3, 3] 198 | Returns: 199 | Source image warped to the target image plane 200 | """ 201 | check_sizes(img, 'img', 'B3HW') 202 | check_sizes(depth, 'depth', 'BHW') 203 | check_sizes(pose, 'pose', 'B34') 204 | check_sizes(intrinsics, 'intrinsics', 'B33') 205 | 206 | batch_size, _, img_height, img_width = img.size() 207 | cam_coords = pixel2cam(depth, b_inv(intrinsics)) # [B,3,H,W] 208 | 209 | pose_mat = pose 210 | 211 | # Get projection matrix for tgt camera frame to source pixel frame 212 | proj_cam_to_src_pixel = intrinsics @ pose_mat # [B, 3, 4] 213 | 214 | src_pixel_coords = cam2pixel(cam_coords, proj_cam_to_src_pixel[:, :, :3], proj_cam_to_src_pixel[:, :, -1:], padding_mode) # [B,H,W,2] 215 | 216 | projected_img = F.grid_sample(img, src_pixel_coords, padding_mode=padding_mode, align_corners=True) 217 | return projected_img 218 | 219 | def inverse_warp_3d(img, depth, pose, intrinsics, rotation_mode='euler', padding_mode='zeros', ref_depth=None): 220 | """ 221 | Inverse warp a source image to the target image plane.
222 | 223 | formula: 224 | ps = K*T*D*K^-1*pt 225 | 226 | Args: 227 | img: the source image (where to sample pixels) -- [B, 3, H, W] 228 | depth: depth map of the target image -- [B, H, W] 229 | pose: pose transformation matrix from target to source -- [B, 3, 4] 230 | intrinsics: camera intrinsic matrix -- [B, 3, 3] 231 | Returns: 232 | Source image warped to the target image plane 233 | """ 234 | check_sizes(img, 'img', 'B3HW') 235 | check_sizes(depth, 'depth', 'BHW') 236 | check_sizes(pose, 'pose', 'B34') 237 | check_sizes(intrinsics, 'intrinsics', 'B33') 238 | 239 | batch_size, _, h, w = img.size() 240 | 241 | cam_coords = pixel2cam(depth, b_inv(intrinsics)) # [B,3,H,W] 242 | 243 | pose_mat = pose 244 | 245 | # Get projection matrix for tgt camera frame to source pixel frame 246 | proj_cam_to_src_pixel = intrinsics @ pose_mat # [B, 3, 4] 247 | 248 | src_pixel_coords = cam2pixel(cam_coords, proj_cam_to_src_pixel[:, :, :3], proj_cam_to_src_pixel[:, :, -1:], padding_mode) # [B,H,W,2] 249 | 250 | projected_img = F.grid_sample(img, src_pixel_coords, padding_mode=padding_mode, align_corners=True) 251 | 252 | if ref_depth is not None: 253 | cam_coords_flat = cam_coords.reshape(batch_size, 3, -1) # [B, 3, H*W] 254 | pcoords = pose_mat[:, :, :3] @ cam_coords_flat 255 | pcoords = pcoords + pose_mat[:, :, -1:] 256 | pcoords = pcoords.reshape(batch_size, 3, h, w) 257 | 258 | ref_coords_3d = pixel2cam(ref_depth, b_inv(intrinsics)) # [B,3,H,W] 259 | 260 | projected_3d_points = F.grid_sample(ref_coords_3d, src_pixel_coords, padding_mode=padding_mode, align_corners=True) # [B,3,H,W] 3D points of the reference frame sampled on the target-frame grid, compared against pcoords 261 | 262 | return projected_img, [pcoords, projected_3d_points] 263 | 264 | return projected_img, None -------------------------------------------------------------------------------- /utils/logger.py: -------------------------------------------------------------------------------- 1 | from blessings import Terminal 2 | import progressbar 3 | import sys 4 | 5 | class TermLogger(object): 6 | def __init__(self, n_epochs, train_size, valid_size): 7 | self.n_epochs = n_epochs 8 | self.train_size = train_size 9 | self.valid_size = valid_size 10 | self.t = Terminal() 11 | s = 10 12 | e = 1 # epoch bar position 13 | tr = 3 # train bar position 14 | ts = 6 # valid bar position 15 | h = self.t.height 16 | 17 | for i in range(10): 18 | print('') 19 | self.epoch_bar = progressbar.ProgressBar(maxval=n_epochs, fd=Writer(self.t, (0, h-s+e))).start() 20 | 21 | self.train_writer = Writer(self.t, (0, h-s+tr)) 22 | self.train_bar_writer = Writer(self.t, (0, h-s+tr+1)) 23 | 24 | self.valid_writer = Writer(self.t, (0, h-s+ts)) 25 | self.valid_bar_writer = Writer(self.t, (0, h-s+ts+1)) 26 | 27 | self.reset_train_bar() 28 | self.reset_valid_bar() 29 | 30 | def reset_train_bar(self): 31 | self.train_bar = progressbar.ProgressBar(maxval=self.train_size, fd=self.train_bar_writer).start() 32 | 33 | def reset_valid_bar(self): 34 | self.valid_bar = progressbar.ProgressBar(maxval=self.valid_size, fd=self.valid_bar_writer).start() 35 | 36 | 37 | class Writer(object): 38 | """Create an object with a write method that writes to a 39 | specific place on the screen, defined at instantiation. 40 | 41 | This is the glue between blessings and progressbar.
42 | """ 43 | 44 | def __init__(self, t, location): 45 | """ 46 | Input: location - tuple of ints (x, y), the position 47 | of the bar in the terminal 48 | """ 49 | self.location = location 50 | self.t = t 51 | 52 | def write(self, string): 53 | with self.t.location(*self.location): 54 | sys.stdout.write("\033[K") 55 | print(string) 56 | 57 | def flush(self): 58 | return 59 | 60 | 61 | class AverageMeter(object): 62 | """Computes and stores the average and current value""" 63 | 64 | def __init__(self, i=1, precision=3): 65 | self.meters = i 66 | self.precision = precision 67 | self.reset(self.meters) 68 | 69 | def reset(self, i): 70 | self.val = [0]*i 71 | self.avg = [0]*i 72 | self.sum = [0]*i 73 | self.count = 0 74 | 75 | def update(self, val, n=1): 76 | if not isinstance(val, list): 77 | val = [val] 78 | assert(len(val) == self.meters) 79 | self.count += n 80 | for i, v in enumerate(val): 81 | self.val[i] = v 82 | self.sum[i] += v * n 83 | self.avg[i] = self.sum[i] / self.count 84 | 85 | def __repr__(self): 86 | val = ' '.join(['{:.{}f}'.format(v, self.precision) for v in self.val]) 87 | avg = ' '.join(['{:.{}f}'.format(a, self.precision) for a in self.avg]) 88 | return '{} ({})'.format(val, avg) 89 | -------------------------------------------------------------------------------- /utils/plot_traj.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import matplotlib.pyplot as plt 3 | import os 4 | from mpl_toolkits.mplot3d import Axes3D 5 | 6 | def pose12to16(mat): 7 | if mat.ndim == 1: 8 | mat = mat.reshape(3, -1) 9 | mat = np.vstack([mat, [0, 0, 0, 1]]) 10 | return mat 11 | else: 12 | mat = np.vstack([mat, [0, 0, 0, 1]]) 13 | return mat 14 | def pose16to12(mat): 15 | return (mat[:3][:3]) 16 | 17 | def plot_route3d(gt_coords, pred_coords, c_gt='r', c_out='b'): 18 | fig = plt.figure() 19 | ax = fig.gca(projection='3d') 20 | ax.set_title("{}".format(1)) 21 | ax.set_xlabel("x") 22 | ax.set_ylabel("y") 23 | ax.set_zlabel("z") 24 | 25 | if gt_coords: 26 | x0, y0, z0 = gt_coords 27 | ax.plot(x0, y0, z0, color=c_gt, label='Ground Truth') 28 | if pred_coords: 29 | x1, y1, z1 = pred_coords 30 | ax.plot(x1, y1, z1, color=c_out, label='UnVIO') 31 | plt.show() 32 | 33 | def plot_route2d(gt_coords, pred_coords, c_gt='r', c_out='b'): 34 | fig = plt.figure() 35 | ax = fig.gca() 36 | ax.set_title("{}".format(1)) 37 | ax.set_xlabel("x") 38 | ax.set_ylabel("y") 39 | 40 | if gt_coords: 41 | x0, y0, z0 = gt_coords 42 | ax.plot(x0, z0, color=c_gt, label='Ground Truth') 43 | if pred_coords: 44 | x1, y1, z1 = pred_coords 45 | ax.plot(x1, z1, color=c_out, label='UnVIO') 46 | plt.show() 47 | 48 | def get_prediction_traj(predictions): 49 | X = predictions[:, 0, 3] 50 | Y = predictions[:, 1, 3] 51 | Z = predictions[:, 2, 3] 52 | return X, Y, Z 53 | 54 | def get_gt_traj(groundtruth): 55 | X = groundtruth[:,3] 56 | Y = groundtruth[:,7] 57 | Z = groundtruth[:,11] 58 | return X, Y, Z 59 | 60 | if __name__ == '__main__': 61 | scale = 1 62 | groundtruth = np.loadtxt('sampled_12Dpose.csv',delimiter=',',dtype=np.float32) 63 | predictions = np.load('predpose_09.npy') 64 | gt_coords = get_gt_traj(groundtruth) 65 | pred_coords = get_prediction_traj(predictions*scale) 66 | plot_route3d(None, pred_coords) 67 | plt.show() 68 | 69 | -------------------------------------------------------------------------------- /utils/pose_transfer.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | import numpy as np 3 | import 
numpy 4 | import math 5 | ''' 6 | brrowed from github:https://github.com/awesomebytes/delta_robot/blob/master/src/transformations.py 7 | ''' 8 | _EPS = np.finfo(float).eps * 4.0 9 | 10 | 11 | _AXES2TUPLE = { 12 | 'sxyz': (0, 0, 0, 0), 'sxyx': (0, 0, 1, 0), 'sxzy': (0, 1, 0, 0), 13 | 'sxzx': (0, 1, 1, 0), 'syzx': (1, 0, 0, 0), 'syzy': (1, 0, 1, 0), 14 | 'syxz': (1, 1, 0, 0), 'syxy': (1, 1, 1, 0), 'szxy': (2, 0, 0, 0), 15 | 'szxz': (2, 0, 1, 0), 'szyx': (2, 1, 0, 0), 'szyz': (2, 1, 1, 0), 16 | 'rzyx': (0, 0, 0, 1), 'rxyx': (0, 0, 1, 1), 'ryzx': (0, 1, 0, 1), 17 | 'rxzx': (0, 1, 1, 1), 'rxzy': (1, 0, 0, 1), 'ryzy': (1, 0, 1, 1), 18 | 'rzxy': (1, 1, 0, 1), 'ryxy': (1, 1, 1, 1), 'ryxz': (2, 0, 0, 1), 19 | 'rzxz': (2, 0, 1, 1), 'rxyz': (2, 1, 0, 1), 'rzyz': (2, 1, 1, 1)} 20 | 21 | _NEXT_AXIS = [1, 2, 0, 1] 22 | 23 | def euler_from_quaternion(quaternion, axes='sxyz'): 24 | """Return Euler angles from quaternion for specified axis sequence. 25 | >>> angles = euler_from_quaternion([0.99810947, 0.06146124, 0, 0]) 26 | >>> numpy.allclose(angles, [0.123, 0, 0]) 27 | True 28 | """ 29 | return euler_from_matrix(quaternion_matrix(quaternion), axes) 30 | 31 | def quaternion_matrix(quaternion): 32 | """Return homogeneous rotation matrix from quaternion. 33 | >>> M = quaternion_matrix([0.99810947, 0.06146124, 0, 0]) 34 | >>> numpy.allclose(M, rotation_matrix(0.123, [1, 0, 0])) 35 | True 36 | >>> M = quaternion_matrix([1, 0, 0, 0]) 37 | >>> numpy.allclose(M, numpy.identity(4)) 38 | True 39 | >>> M = quaternion_matrix([0, 1, 0, 0]) 40 | >>> numpy.allclose(M, numpy.diag([1, -1, -1, 1])) 41 | True 42 | """ 43 | q = numpy.array(quaternion, dtype=numpy.float64, copy=True) 44 | n = numpy.dot(q, q) 45 | if n < _EPS: 46 | return numpy.identity(4) 47 | q *= math.sqrt(2.0 / n) 48 | q = numpy.outer(q, q) 49 | return numpy.array([ 50 | [1.0-q[2, 2]-q[3, 3], q[1, 2]-q[3, 0], q[1, 3]+q[2, 0], 0.0], 51 | [ q[1, 2]+q[3, 0], 1.0-q[1, 1]-q[3, 3], q[2, 3]-q[1, 0], 0.0], 52 | [ q[1, 3]-q[2, 0], q[2, 3]+q[1, 0], 1.0-q[1, 1]-q[2, 2], 0.0], 53 | [ 0.0, 0.0, 0.0, 1.0]]) 54 | 55 | def quaternion_from_matrix(matrix, isprecise=False): 56 | """Return quaternion from rotation matrix. 57 | If isprecise is True, the input matrix is assumed to be a precise rotation 58 | matrix and a faster algorithm is used. 59 | >>> q = quaternion_from_matrix(numpy.identity(4), True) 60 | >>> numpy.allclose(q, [1, 0, 0, 0]) 61 | True 62 | >>> q = quaternion_from_matrix(numpy.diag([1, -1, -1, 1])) 63 | >>> numpy.allclose(q, [0, 1, 0, 0]) or numpy.allclose(q, [0, -1, 0, 0]) 64 | True 65 | >>> R = rotation_matrix(0.123, (1, 2, 3)) 66 | >>> q = quaternion_from_matrix(R, True) 67 | >>> numpy.allclose(q, [0.9981095, 0.0164262, 0.0328524, 0.0492786]) 68 | True 69 | >>> R = [[-0.545, 0.797, 0.260, 0], [0.733, 0.603, -0.313, 0], 70 | ... [-0.407, 0.021, -0.913, 0], [0, 0, 0, 1]] 71 | >>> q = quaternion_from_matrix(R) 72 | >>> numpy.allclose(q, [0.19069, 0.43736, 0.87485, -0.083611]) 73 | True 74 | >>> R = [[0.395, 0.362, 0.843, 0], [-0.626, 0.796, -0.056, 0], 75 | ... 
[-0.677, -0.498, 0.529, 0], [0, 0, 0, 1]] 76 | >>> q = quaternion_from_matrix(R) 77 | >>> numpy.allclose(q, [0.82336615, -0.13610694, 0.46344705, -0.29792603]) 78 | True 79 | >>> R = random_rotation_matrix() 80 | >>> q = quaternion_from_matrix(R) 81 | >>> is_same_transform(R, quaternion_matrix(q)) 82 | True 83 | """ 84 | M = numpy.array(matrix, dtype=numpy.float64, copy=False)[:4, :4] 85 | if isprecise: 86 | q = numpy.empty((4, )) 87 | t = numpy.trace(M) 88 | if t > M[3, 3]: 89 | q[0] = t 90 | q[3] = M[1, 0] - M[0, 1] 91 | q[2] = M[0, 2] - M[2, 0] 92 | q[1] = M[2, 1] - M[1, 2] 93 | else: 94 | i, j, k = 1, 2, 3 95 | if M[1, 1] > M[0, 0]: 96 | i, j, k = 2, 3, 1 97 | if M[2, 2] > M[i, i]: 98 | i, j, k = 3, 1, 2 99 | t = M[i, i] - (M[j, j] + M[k, k]) + M[3, 3] 100 | q[i] = t 101 | q[j] = M[i, j] + M[j, i] 102 | q[k] = M[k, i] + M[i, k] 103 | q[3] = M[k, j] - M[j, k] 104 | q *= 0.5 / math.sqrt(t * M[3, 3]) 105 | else: 106 | m00 = M[0, 0] 107 | m01 = M[0, 1] 108 | m02 = M[0, 2] 109 | m10 = M[1, 0] 110 | m11 = M[1, 1] 111 | m12 = M[1, 2] 112 | m20 = M[2, 0] 113 | m21 = M[2, 1] 114 | m22 = M[2, 2] 115 | # symmetric matrix K 116 | K = numpy.array([[m00-m11-m22, 0.0, 0.0, 0.0], 117 | [m01+m10, m11-m00-m22, 0.0, 0.0], 118 | [m02+m20, m12+m21, m22-m00-m11, 0.0], 119 | [m21-m12, m02-m20, m10-m01, m00+m11+m22]]) 120 | K /= 3.0 121 | # quaternion is eigenvector of K that corresponds to largest eigenvalue 122 | w, V = numpy.linalg.eigh(K) 123 | q = V[[3, 0, 1, 2], numpy.argmax(w)] 124 | if q[0] < 0.0: 125 | numpy.negative(q, q) 126 | return q 127 | 128 | 129 | def euler_matrix(ai, aj, ak, axes='sxyz'): 130 | """Return homogeneous rotation matrix from Euler angles and axis sequence. 131 | ai, aj, ak : Euler's roll, pitch and yaw angles 132 | axes : One of 24 axis sequences as string or encoded tuple 133 | >>> R = euler_matrix(1, 2, 3, 'syxz') 134 | >>> numpy.allclose(numpy.sum(R[0]), -1.34786452) 135 | True 136 | >>> R = euler_matrix(1, 2, 3, (0, 1, 0, 1)) 137 | >>> numpy.allclose(numpy.sum(R[0]), -0.383436184) 138 | True 139 | >>> ai, aj, ak = (4*math.pi) * (numpy.random.random(3) - 0.5) 140 | >>> for axes in _AXES2TUPLE.keys(): 141 | ... R = euler_matrix(ai, aj, ak, axes) 142 | >>> for axes in _TUPLE2AXES.keys(): 143 | ... R = euler_matrix(ai, aj, ak, axes) 144 | """ 145 | try: 146 | firstaxis, parity, repetition, frame = _AXES2TUPLE[axes] 147 | except (AttributeError, KeyError): 148 | _TUPLE2AXES[axes] # validation 149 | firstaxis, parity, repetition, frame = axes 150 | 151 | i = firstaxis 152 | j = _NEXT_AXIS[i+parity] 153 | k = _NEXT_AXIS[i-parity+1] 154 | 155 | if frame: 156 | ai, ak = ak, ai 157 | if parity: 158 | ai, aj, ak = -ai, -aj, -ak 159 | 160 | si, sj, sk = math.sin(ai), math.sin(aj), math.sin(ak) 161 | ci, cj, ck = math.cos(ai), math.cos(aj), math.cos(ak) 162 | cc, cs = ci*ck, ci*sk 163 | sc, ss = si*ck, si*sk 164 | 165 | M = numpy.identity(4) 166 | if repetition: 167 | M[i, i] = cj 168 | M[i, j] = sj*si 169 | M[i, k] = sj*ci 170 | M[j, i] = sj*sk 171 | M[j, j] = -cj*ss+cc 172 | M[j, k] = -cj*cs-sc 173 | M[k, i] = -sj*ck 174 | M[k, j] = cj*sc+cs 175 | M[k, k] = cj*cc-ss 176 | else: 177 | M[i, i] = cj*ck 178 | M[i, j] = sj*sc-cs 179 | M[i, k] = sj*cc+ss 180 | M[j, i] = cj*sk 181 | M[j, j] = sj*ss+cc 182 | M[j, k] = sj*cs-sc 183 | M[k, i] = -sj 184 | M[k, j] = cj*si 185 | M[k, k] = cj*ci 186 | 187 | M = M[0:3,0:3] 188 | return M 189 | 190 | def euler_from_matrix(matrix, axes='sxyz'): 191 | """Return Euler angles from rotation matrix for specified axis sequence. 
192 | axes : One of 24 axis sequences as string or encoded tuple 193 | Note that many Euler angle triplets can describe one matrix. 194 | >>> R0 = euler_matrix(1, 2, 3, 'syxz') 195 | >>> al, be, ga = euler_from_matrix(R0, 'syxz') 196 | >>> R1 = euler_matrix(al, be, ga, 'syxz') 197 | >>> numpy.allclose(R0, R1) 198 | True 199 | >>> angles = (4*math.pi) * (numpy.random.random(3) - 0.5) 200 | >>> for axes in _AXES2TUPLE.keys(): 201 | ... R0 = euler_matrix(axes=axes, *angles) 202 | ... R1 = euler_matrix(axes=axes, *euler_from_matrix(R0, axes)) 203 | ... if not numpy.allclose(R0, R1): print(axes, "failed") 204 | """ 205 | try: 206 | firstaxis, parity, repetition, frame = _AXES2TUPLE[axes.lower()] 207 | except (AttributeError, KeyError): 208 | _TUPLE2AXES[axes] # validation 209 | firstaxis, parity, repetition, frame = axes 210 | 211 | i = firstaxis 212 | j = _NEXT_AXIS[i+parity] 213 | k = _NEXT_AXIS[i-parity+1] 214 | 215 | M = numpy.array(matrix, dtype=numpy.float64, copy=False)[:3, :3] 216 | if repetition: 217 | sy = math.sqrt(M[i, j]*M[i, j] + M[i, k]*M[i, k]) 218 | if sy > _EPS: 219 | ax = math.atan2( M[i, j], M[i, k]) 220 | ay = math.atan2( sy, M[i, i]) 221 | az = math.atan2( M[j, i], -M[k, i]) 222 | else: 223 | ax = math.atan2(-M[j, k], M[j, j]) 224 | ay = math.atan2( sy, M[i, i]) 225 | az = 0.0 226 | else: 227 | cy = math.sqrt(M[i, i]*M[i, i] + M[j, i]*M[j, i]) 228 | if cy > _EPS: 229 | ax = math.atan2( M[k, j], M[k, k]) 230 | ay = math.atan2(-M[k, i], cy) 231 | az = math.atan2( M[j, i], M[i, i]) 232 | else: 233 | ax = math.atan2(-M[j, k], M[j, j]) 234 | ay = math.atan2(-M[k, i], cy) 235 | az = 0.0 236 | 237 | if parity: 238 | ax, ay, az = -ax, -ay, -az 239 | if frame: 240 | ax, az = az, ax 241 | return ax, ay, az 242 | 243 | def eulerAnglesToRotationMatrix(theta) : 244 | R_x = np.array([[1, 0, 0 ], 245 | [0, np.cos(theta[0]), -np.sin(theta[0]) ], 246 | [0, np.sin(theta[0]), np.cos(theta[0]) ] 247 | ]) 248 | R_y = np.array([[np.cos(theta[1]), 0, np.sin(theta[1]) ], 249 | [0, 1, 0 ], 250 | [-np.sin(theta[1]), 0, np.cos(theta[1]) ] 251 | ]) 252 | R_z = np.array([[np.cos(theta[2]), -np.sin(theta[2]), 0], 253 | [np.sin(theta[2]), np.cos(theta[2]), 0], 254 | [0, 0, 1] 255 | ]) 256 | R = np.dot(R_z, np.dot( R_y, R_x )) 257 | return R 258 | 259 | 260 | def normalize_angle_delta(angle): 261 | if (angle > np.pi): 262 | angle = angle - 2 * np.pi 263 | elif (angle < -np.pi): 264 | angle = 2 * np.pi + angle 265 | return angle 266 | 267 | if __name__ == "__main__": 268 | 269 | q = [0.534, -0.153, -0.827, -0.082] 270 | print(euler_from_quaternion(q, axes='rxyz')) 271 | e = [-3.0755151367020725, -1.1407976445564845, 2.733415384242117] 272 | r1 = quaternion_matrix(q) 273 | r2 = euler_matrix(e[0], e[1], e[2], axes='rxyz') 274 | print('r1', r1) 275 | print('r2', r2) 276 | q1 = euler_from_matrix(r1) 277 | q2 = euler_from_matrix(r2) 278 | print('e1', q1) 279 | print('e2', q2) 280 | m = euler_matrix(e[0], e[1], e[2]) 281 | print('m', m) -------------------------------------------------------------------------------- /utils/utils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import numpy 3 | import torch 4 | import shutil 5 | from path import Path 6 | import datetime 7 | from collections import OrderedDict 8 | from matplotlib import cm 9 | from matplotlib.colors import ListedColormap, LinearSegmentedColormap 10 | import matplotlib.pyplot as plt 11 | from collections import OrderedDict 12 | import json 13 | import os 14 | 15 | def 
json_out(dictionary, outpath, outname): 16 | with open(os.path.join(outpath, outname), 'w', encoding="utf-8") as f_out: 17 | jsObj = json.dumps(dictionary, indent=4) 18 | f_out.write(jsObj) 19 | f_out.close() 20 | 21 | def save_path_formatter(args, parser): 22 | def is_default(key, value): 23 | return value == parser.get_default(key) 24 | args_dict = vars(args) 25 | data_folder_name = str(Path(args_dict['data']).normpath().name) 26 | folder_string = [data_folder_name] 27 | if not is_default('epochs', args_dict['epochs']): 28 | folder_string.append('{}epochs'.format(args_dict['epochs'])) 29 | keys_with_prefix = OrderedDict() 30 | keys_with_prefix['epoch_size'] = 'epoch_size' 31 | keys_with_prefix['sequence_length'] = 'seq' 32 | keys_with_prefix['padding_mode'] = 'padding_' 33 | keys_with_prefix['batch_size'] = 'b' 34 | keys_with_prefix['lr'] = 'lr' 35 | keys_with_prefix['photo_loss_weight'] = 'p' 36 | keys_with_prefix['smooth_loss_weight'] = 's' 37 | 38 | for key, prefix in keys_with_prefix.items(): 39 | value = args_dict[key] 40 | if not is_default(key, value): 41 | folder_string.append('{}{}'.format(prefix, value)) 42 | save_path = Path(','.join(folder_string)) 43 | timestamp = datetime.datetime.now().strftime("%m-%d-%H-%M") 44 | return save_path/timestamp 45 | 46 | def save_checkpoint(save_path, dispnet_state, visualnet_state, imunet_state, posenet_state, is_best, filename='checkpoint.pth.tar'): 47 | file_prefixes = ['UnVIO_dispnet','UnVIO_visualnet', 'UnVIO_imunet', 'UnVIO_posenet'] 48 | states = [dispnet_state, visualnet_state, imunet_state, posenet_state] 49 | for (prefix, state) in zip(file_prefixes, states): 50 | torch.save(state, save_path/'{}_{}'.format(prefix,filename)) 51 | 52 | if is_best: 53 | for prefix in file_prefixes: 54 | shutil.copyfile(save_path/'{}_{}'.format(prefix,filename), save_path/'{}_best.pth.tar'.format(prefix)) 55 | 56 | def high_res_colormap(low_res_cmap, resolution=1000, max_value=1): 57 | # Construct the list colormap, with interpolated values for higer resolution 58 | # For a linear segmented colormap, you can just specify the number of point in 59 | # cm.get_cmap(name, lutsize) with the parameter lutsize 60 | x = np.linspace(0,1,low_res_cmap.N) 61 | low_res = low_res_cmap(x) 62 | new_x = np.linspace(0,max_value,resolution) 63 | high_res = np.stack([np.interp(new_x, x, low_res[:,i]) for i in range(low_res.shape[1])], axis=1) 64 | return ListedColormap(high_res) 65 | 66 | def opencv_rainbow(resolution=1000): 67 | # Construct the opencv equivalent of Rainbow 68 | opencv_rainbow_data = ( 69 | (0.000, (1.00, 0.00, 0.00)), 70 | (0.400, (1.00, 1.00, 0.00)), 71 | (0.600, (0.00, 1.00, 0.00)), 72 | (0.800, (0.00, 0.00, 1.00)), 73 | (1.000, (0.60, 0.00, 1.00)) 74 | ) 75 | return LinearSegmentedColormap.from_list('opencv_rainbow', opencv_rainbow_data, resolution) 76 | 77 | 78 | COLORMAPS = {'rainbow': opencv_rainbow(), 79 | 'magma': high_res_colormap(cm.get_cmap('magma')), 80 | 'bone': cm.get_cmap('bone', 10000)} 81 | 82 | 83 | def tensor2array(tensor, max_value=None, colormap='rainbow'): 84 | tensor = tensor.detach().cpu() 85 | if max_value is None: 86 | max_value = tensor.max().item() 87 | if tensor.ndimension() == 2 or tensor.size(0) == 1: 88 | norm_array = tensor.squeeze().numpy()/max_value 89 | array = COLORMAPS[colormap](norm_array).astype(np.float32) 90 | array = array.transpose(2, 0, 1) 91 | 92 | elif tensor.ndimension() == 3: 93 | assert(tensor.size(0) == 3) 94 | array = 0.5 + tensor.numpy()*0.5 95 | return array 96 | 97 | 98 | def tensor2array2(tensor, 
max_value=None, colormap='rainbow'): 99 | tensor = tensor.detach().cpu() 100 | if max_value is None: 101 | max_value = tensor.max().item() 102 | if tensor.ndimension() == 2 or tensor.size(0) == 1: 103 | norm_array = tensor.squeeze().numpy()/max_value 104 | array = COLORMAPS[colormap](norm_array).astype(np.float32) 105 | array = array.transpose(2, 0, 1) 106 | 107 | elif tensor.ndimension() == 3: 108 | assert(tensor.size(0) == 3) 109 | array = 0.5 + tensor.numpy()*0.5 110 | else: 111 | raise ValueError('wrong dimention of tensor.shape {}'.format(tensor.shape)) 112 | return (array.transpose(1,2,0) * 255).astype('uint8') 113 | 114 | def tensor2img(tensor): 115 | img = tensor.cpu().data.numpy() 116 | mean = np.array([0.5, 0.5, 0.5]) 117 | img = np.transpose(img, (0, 2, 3, 1)) 118 | img = img*mean + mean 119 | return img 120 | 121 | def show_imgs(imgs): 122 | ''' 123 | from utils import tensor2img as t2i 124 | from utils import show_imgs as shi 125 | tgi = t2i(tgt_img); rfi = t2i(ref_img); rwi = t2i(ref_img_warped); rci = t2i(rec_imgs[i]) 126 | tmp = diff_mask.squeeze().cpu().data.numpy() ; tmp = occ_mask.squeeze().cpu().data.numpy() 127 | m=0; shi([tgi[m], rwi[m], rfi[m], rci[m], tmp[m]]); m=0; shi([tgi[m], rwi[m], rfi[m], tmp[m]]) 128 | import matplotlib.pyplot as plt 129 | plt.subplot(311); plt.imshow(tgi[m]);plt.subplot(312); plt.imshow(rfi[m]); plt.subplot(313); plt.imshow(tgi[m]); plt.imshow(tmp[m],alpha=0.2,cmap='magma');plt.show() 130 | ''' 131 | if type(imgs) not in [list]: 132 | assert((len(imgs.shape)==3 and imgs.shape[2]==3) or len(imgs.shape)==2) 133 | plt.imshow(imgs) 134 | plt.show() 135 | else: 136 | num = len(imgs) 137 | for i, img in enumerate(imgs): 138 | assert((len(img.shape)==3) or len(img.shape)==2) 139 | plt.subplot(num, 1, i+1) 140 | plt.imshow(img) 141 | plt.show() 142 | 143 | 144 | def align(model, data): 145 | """Align two trajectories using the method of Horn (closed-form). 
146 | 147 | Input: 148 | model -- first trajectory (3xn) 149 | data -- second trajectory (3xn) 150 | 151 | Output: 152 | rot -- rotation matrix (3x3) 153 | trans -- translation vector (3x1) 154 | trans_error -- translational error per point (1xn) 155 | """ 156 | numpy.set_printoptions(precision=3, suppress=True) 157 | model_zerocentered = model - model.mean(1) 158 | data_zerocentered = data - data.mean(1) 159 | 160 | W = numpy.zeros((3, 3)) 161 | for column in range(model.shape[1]): 162 | W += numpy.outer(model_zerocentered[:, column], data_zerocentered[:, column]) 163 | U, d, Vh = numpy.linalg.linalg.svd(W.transpose()) 164 | S = numpy.matrix(numpy.identity(3)) 165 | if (numpy.linalg.det(U) * numpy.linalg.det(Vh) < 0): 166 | S[2, 2] = -1 167 | rot = U * S * Vh 168 | 169 | rotmodel = rot * model_zerocentered 170 | dots = 0.0 171 | norms = 0.0 172 | 173 | for column in range(data_zerocentered.shape[1]): 174 | dots += numpy.dot(data_zerocentered[:, column].transpose(), rotmodel[:, column]) 175 | normi = numpy.linalg.norm(model_zerocentered[:, column]) 176 | norms += normi * normi 177 | 178 | s = float(dots / norms) 179 | 180 | transGT = data.mean(1) - s * rot * model.mean(1) 181 | trans = data.mean(1) - rot * model.mean(1) 182 | 183 | model_alignedGT = s * rot * model + transGT 184 | model_aligned = rot * model + trans 185 | 186 | alignment_errorGT = model_alignedGT - data 187 | alignment_error = model_aligned - data 188 | 189 | trans_errorGT = numpy.sqrt(numpy.sum(numpy.multiply(alignment_errorGT, alignment_errorGT), 0)).A[0] 190 | trans_error = numpy.sqrt(numpy.sum(numpy.multiply(alignment_error, alignment_error), 0)).A[0] 191 | 192 | return rot, transGT, trans_errorGT, trans, trans_error, s, np.array(data.transpose()), np.array(model_alignedGT.transpose()) --------------------------------------------------------------------------------
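A minimal usage sketch (not part of the repository) showing how the warping utilities in `utils/inverse_warp.py` fit together. The tensor shapes, intrinsics values, and random inputs below are illustrative assumptions only; in the actual pipeline they come from the dataloaders and the depth/pose networks.

```python
import torch
from utils.inverse_warp import pose_vec2mat, inverse_warp  # run from the repo root

B, H, W = 1, 256, 832                     # illustrative image size
src_img = torch.rand(B, 3, H, W)          # source image to sample pixels from
tgt_depth = torch.rand(B, H, W) * 10 + 1  # positive depth of the target view
pose_vec = 0.01 * torch.randn(B, 6)       # vec[:, :3] is rotation (radians), vec[:, 3:] is translation
intrinsics = torch.tensor([[250.0, 0.0, W / 2],
                           [0.0, 250.0, H / 2],
                           [0.0, 0.0, 1.0]]).unsqueeze(0)  # [B, 3, 3], made-up focal lengths

# inverse_warp expects an already-converted [B, 3, 4] pose matrix (see check_sizes),
# so build it from the 6-DoF vector first.
pose_mat = pose_vec2mat(pose_vec, rotation_mode='euler')         # [B, 3, 4]
warped = inverse_warp(src_img, tgt_depth, pose_mat, intrinsics)  # [B, 3, H, W]
print(warped.shape)
```

The warped image can then be compared against the target view, e.g. by the photometric loss implemented in `losses/warp_loss_function.py`.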