├── .gitignore
├── README.md
├── datasets
│   ├── __init__.py
│   ├── cityscapes_dataset.py
│   ├── kitti_dataset.py
│   └── mono_dataset.py
├── depthComparison.png
├── evaluate_city_depth.py
├── evaluate_depth.py
├── evaluate_pose.py
├── export_gt_depth.py
├── inplace_abn
│   ├── __init__.py
│   ├── bn.py
│   ├── functions.py
│   └── src
│       ├── common.h
│       ├── inplace_abn.cpp
│       ├── inplace_abn.h
│       ├── inplace_abn_cpu.cpp
│       └── inplace_abn_cuda.cu
├── kitti_utils.py
├── layers.py
├── networks
│   ├── __init__.py
│   ├── asp_oc_block.py
│   ├── base_oc_block.py
│   ├── decoder.py
│   ├── encoder_selfattn.py
│   ├── monodepth2_decoder.py
│   ├── pose_cnn.py
│   ├── pose_decoder.py
│   ├── resnet_encoder.py
│   └── util.py
├── options.py
├── requirements.txt
├── splits
│   ├── benchmark
│   │   ├── eigen_to_benchmark_ids.npy
│   │   ├── test_files.txt
│   │   ├── train_files.txt
│   │   └── val_files.txt
│   ├── eigen
│   │   └── test_files.txt
│   ├── eigen_benchmark
│   │   └── test_files.txt
│   ├── eigen_full
│   │   ├── train_files.txt
│   │   └── val_files.txt
│   ├── eigen_zhou
│   │   ├── train_files.txt
│   │   └── val_files.txt
│   ├── kitti_archives_to_download.txt
│   └── odom
│       ├── test_files_09.txt
│       ├── test_files_10.txt
│       ├── train_files.txt
│       └── val_files.txt
├── test_simple.py
├── train.py
├── trainer.py
└── utils.py

/.gitignore:
--------------------------------------------------------------------------------
1 | __pycache__
2 | *.pyc
3 | *_disp.jpg
4 | *_disp.npy
5 | *.npz
6 | kitti_data
7 | models
8 | .idea
9 | tmp/
10 | *.pth
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Self-supervised Monocular Trained Depth Estimation using Self-attention and Discrete Disparity Volume - ML Reproducibility Challenge 2020
2 | 
3 | This project is a reproduction of the CVPR 2020 paper
4 | > **Self-supervised Monocular Trained Depth Estimation
5 | > using Self-attention and Discrete Disparity Volume**
6 | >
7 | > Adrian Johnston, Gustavo Carneiro
8 | >
9 | > [CVPR 2020 (arXiv pdf)](https://arxiv.org/pdf/2003.13951.pdf)
10 | 
11 | The paper aims to close the performance gap with fully-supervised
12 | methods while training only on monocular sequences, with the help of
13 | two additional components: a self-attention module and a discrete disparity volume.
14 | 
15 | 
16 | 
17 | Setup procedure
18 | ----------------
19 | 1. Clone the project from [GitHub](https://github.com/sjsu-smart-lab/Self-supervised-Monocular-Trained-Depth-Estimation-using-Self-attention-and-Discrete-Disparity-Volum)
20 | and change into the directory Self-supervised-Monocular-Trained-Depth-Estimation-using-Self-attention-and-Discrete-Disparity-Volum.
21 | 2. Install packages
22 | To reproduce the code, install the required packages by running the
23 | command below.
24 | 
25 |     pip install -r requirements.txt
26 | 
27 | This project uses Python 3.6.6, CUDA 10.1, PyTorch 0.4.1, torchvision 0.2.1, tensorboardX 1.4 and OpenCV.
28 | The experiments were conducted on an NVIDIA Tesla P100 GPU and an Intel Xeon E5-2660 v4 CPU (2.0 GHz, 35M cache).
29 | 
30 | 3. Download the required data sets.
31 | The data sets used in this project are [KITTI Raw](http://www.cvlibs.net/datasets/kitti/raw_data.php)
32 | and the leftImg8bit images of [Cityscapes](https://www.cityscapes-dataset.com/).
33 | 
34 | Training
35 | ---------
36 | The paper claims to achieve state-of-the-art results using only monocular sequences,
37 | unlike previous algorithms which relied on both stereo and monocular
38 | images.
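The command below expects the path to the KITTI Raw data and a name for the experiment. As a concrete illustration (the data path and model name here are placeholder values, not taken from the original instructions), a monocular training run could be launched with

    python3 train.py --data_path kitti_data/ --log_dir tmp/ --model_name mono_selfattn_ddv --png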
39 | 40 | python3 train.py --data_path --log_dir tmp/ --model_name --png 41 | 42 | For setting/ altering other input parameters for abalation study or 43 | hyperparameter search refer the [options.py](options.py) 44 | 45 | Evaluation 46 | ----------- 47 | Prepare ground truth data by 48 | 49 | python export_gt_depth.py --data_path kitti_data --split eigen --dataset 50 | 51 | The accuracy and loss values of a trained model can be infered using the 52 | below command 53 | 54 | python evaluate_depth.py --data_path --load_weights_folder --eval_mono --png 55 | 56 | Inference 57 | --------- 58 | The inference prints the depth map, space occupied by the model and inference time 59 | as output for a given image(s) file/folder. 60 | 61 | python test_simple.py --image_path --model_name 62 | 63 | 64 | Results 65 | -------- 66 | 67 | Below are the results obtained on the KITTI Raw test set for the models trained in the project. 68 | 69 | > NOTE 70 | The results obtained are system specific. Due to different combinations of the neural 71 | network cudnn library versions and NVIDIA driver library versions, the results can be 72 | slightly different. To the best of my knowledge, upon reproducing the environment, the 73 | ballpark number will be close to the results obtained. 74 | 75 | | abs_rel | sq_rel | RMSE | RMSE log | a1 | a2 | a3 | 76 | |:-------:|:------:|:-----:|:--------:|:-----:|:-----:|:-----:| 77 | | 0.108 | 93.13 | 4.682 | 0.185 | 0.889 | 0.962 | 0.982 | 78 | 79 | | Training time | Inference time (CPU) | Inference time (GPU) | Memory | 80 | |:-------------:|:--------------------:|:--------------------:|:--------:| 81 | | 204 hours | 6108.5 +/- 12.23 | 653.21 +/- 0.98 | 252.7 MB | 82 | 83 | References 84 | ----------- 85 | 1. Monodepth2 - https://github.com/nianticlabs/monodepth2 86 | 2. OCNet - https://github.com/openseg-group/OCNet.pytorch 87 | 3. 
DORN - https://arxiv.org/abs/1806.02446 88 | -------------------------------------------------------------------------------- /datasets/__init__.py: -------------------------------------------------------------------------------- 1 | from .kitti_dataset import KITTIRAWDataset, KITTIOdomDataset, KITTIDepthDataset 2 | from .cityscapes_dataset import CityscapesData 3 | -------------------------------------------------------------------------------- /datasets/cityscapes_dataset.py: -------------------------------------------------------------------------------- 1 | import os 2 | from torch.utils.data import Dataset 3 | import cv2 4 | import numpy as np 5 | from pathlib import Path 6 | 7 | 8 | class CityscapesData(Dataset): 9 | """ 10 | Cityscapes dataset for inference 11 | """ 12 | def __init__(self, folder_path): 13 | self.folder_path = folder_path 14 | self.all_imgs = sorted(list(Path(folder_path).glob('**/*.png'))) 15 | 16 | def __len__(self): 17 | return len(self.all_imgs) 18 | 19 | def __getitem__(self, index): 20 | image_path = self.all_imgs[index] 21 | image = cv2.imread(str(image_path), cv2.IMREAD_UNCHANGED) 22 | # resize in accordance to the trained model and normalize 23 | image = (cv2.resize(image, (640, 192), cv2.INTER_AREA)*1.0)/256 24 | image = image.astype(np.float32) 25 | 26 | return image -------------------------------------------------------------------------------- /datasets/kitti_dataset.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | import os 4 | import skimage.transform 5 | import numpy as np 6 | import PIL.Image as pil 7 | 8 | from kitti_utils import generate_depth_map 9 | from .mono_dataset import MonoDataset 10 | 11 | 12 | class KITTIDataset(MonoDataset): 13 | """Superclass for different types of KITTI dataset loaders 14 | """ 15 | def __init__(self, *args, **kwargs): 16 | super(KITTIDataset, self).__init__(*args, **kwargs) 17 | 18 | # NOTE: Make sure your intrinsics matrix is *normalized* by the original image size 19 | self.K = np.array([[0.58, 0, 0.5, 0], 20 | [0, 1.92, 0.5, 0], 21 | [0, 0, 1, 0], 22 | [0, 0, 0, 1]], dtype=np.float32) 23 | 24 | self.full_res_shape = (1242, 375) 25 | self.side_map = {"2": 2, "3": 3, "l": 2, "r": 3} 26 | 27 | def check_depth(self): 28 | line = self.filenames[0].split() 29 | scene_name = line[0] 30 | frame_index = int(line[1]) 31 | 32 | velo_filename = os.path.join( 33 | self.data_path, 34 | scene_name, 35 | "velodyne_points/data/{:010d}.bin".format(int(frame_index))) 36 | 37 | return os.path.isfile(velo_filename) 38 | 39 | def get_color(self, folder, frame_index, side, do_flip): 40 | color = self.loader(self.get_image_path(folder, frame_index, side)) 41 | 42 | if do_flip: 43 | color = color.transpose(pil.FLIP_LEFT_RIGHT) 44 | 45 | return color 46 | 47 | 48 | class KITTIRAWDataset(KITTIDataset): 49 | """KITTI dataset which loads the original velodyne depth maps for ground truth 50 | """ 51 | def __init__(self, *args, **kwargs): 52 | super(KITTIRAWDataset, self).__init__(*args, **kwargs) 53 | 54 | def get_image_path(self, folder, frame_index, side): 55 | f_str = "{:010d}{}".format(frame_index, self.img_ext) 56 | image_path = os.path.join( 57 | self.data_path, folder, "image_0{}/data".format(self.side_map[side]), f_str) 58 | return image_path 59 | 60 | def get_depth(self, folder, frame_index, side, do_flip): 61 | calib_path = os.path.join(self.data_path, folder.split("/")[0]) 62 | 63 | velo_filename = os.path.join( 64 | 
self.data_path, 65 | folder, 66 | "velodyne_points/data/{:010d}.bin".format(int(frame_index))) 67 | 68 | depth_gt = generate_depth_map(calib_path, velo_filename, self.side_map[side]) 69 | depth_gt = skimage.transform.resize( 70 | depth_gt, self.full_res_shape[::-1], order=0, preserve_range=True, mode='constant') 71 | 72 | if do_flip: 73 | depth_gt = np.fliplr(depth_gt) 74 | 75 | return depth_gt 76 | 77 | 78 | class KITTIOdomDataset(KITTIDataset): 79 | """KITTI dataset for odometry training and testing 80 | """ 81 | def __init__(self, *args, **kwargs): 82 | super(KITTIOdomDataset, self).__init__(*args, **kwargs) 83 | 84 | def get_image_path(self, folder, frame_index, side): 85 | f_str = "{:06d}{}".format(frame_index, self.img_ext) 86 | image_path = os.path.join( 87 | self.data_path, 88 | "sequences/{:02d}".format(int(folder)), 89 | "image_{}".format(self.side_map[side]), 90 | f_str) 91 | return image_path 92 | 93 | 94 | class KITTIDepthDataset(KITTIDataset): 95 | """KITTI dataset which uses the updated ground truth depth maps 96 | """ 97 | def __init__(self, *args, **kwargs): 98 | super(KITTIDepthDataset, self).__init__(*args, **kwargs) 99 | 100 | def get_image_path(self, folder, frame_index, side): 101 | f_str = "{:010d}{}".format(frame_index, self.img_ext) 102 | image_path = os.path.join( 103 | self.data_path, 104 | folder, 105 | "image_0{}/data".format(self.side_map[side]), 106 | f_str) 107 | return image_path 108 | 109 | def get_depth(self, folder, frame_index, side, do_flip): 110 | f_str = "{:010d}.png".format(frame_index) 111 | depth_path = os.path.join( 112 | self.data_path, 113 | folder, 114 | "proj_depth/groundtruth/image_0{}".format(self.side_map[side]), 115 | f_str) 116 | 117 | depth_gt = pil.open(depth_path) 118 | depth_gt = depth_gt.resize(self.full_res_shape, pil.NEAREST) 119 | depth_gt = np.array(depth_gt).astype(np.float32) / 256 120 | 121 | if do_flip: 122 | depth_gt = np.fliplr(depth_gt) 123 | 124 | return depth_gt 125 | -------------------------------------------------------------------------------- /datasets/mono_dataset.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | import os 4 | import random 5 | import numpy as np 6 | import copy 7 | from PIL import Image # using pillow-simd for increased speed 8 | 9 | import torch 10 | import torch.utils.data as data 11 | from torchvision import transforms 12 | 13 | 14 | def pil_loader(path): 15 | # open path as file to avoid ResourceWarning 16 | # (https://github.com/python-pillow/Pillow/issues/835) 17 | with open(path, 'rb') as f: 18 | with Image.open(f) as img: 19 | return img.convert('RGB') 20 | 21 | 22 | class MonoDataset(data.Dataset): 23 | """Superclass for monocular dataloaders 24 | 25 | Args: 26 | data_path: path where KITTI Raw dataset is stored 27 | filenames: Training, validation ans test split file in 'splits' folder 28 | height: height of the image 29 | width: width of the image 30 | frame_idxs: temporally adjacent frames 31 | num_scales: scales used in the loss 32 | is_train: is the dataloader used for training 33 | img_ext: .png or .jpg or .jpeg 34 | """ 35 | def __init__(self, 36 | data_path, 37 | filenames, 38 | height, 39 | width, 40 | frame_idxs, 41 | num_scales, 42 | is_train=False, 43 | img_ext='.jpg'): 44 | super(MonoDataset, self).__init__() 45 | 46 | self.data_path = data_path 47 | self.filenames = filenames 48 | self.height = height 49 | self.width = width 50 | self.num_scales = num_scales 51 | 
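        # NOTE (assumption, not part of the original code): recent Pillow releases have
        # removed Image.ANTIALIAS in favour of Image.Resampling.LANCZOS, so on newer
        # environments the assignment below may need a guarded fallback such as
        # getattr(Image, "ANTIALIAS", Image.Resampling.LANCZOS).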
self.interp = Image.ANTIALIAS 52 | 53 | self.frame_idxs = frame_idxs 54 | 55 | self.is_train = is_train 56 | self.img_ext = img_ext 57 | 58 | self.loader = pil_loader 59 | self.to_tensor = transforms.ToTensor() 60 | 61 | # We need to specify augmentations differently in newer versions of torchvision. 62 | # We first try the newer tuple version; if this fails we fall back to scalars 63 | try: 64 | self.brightness = (0.8, 1.2) 65 | self.contrast = (0.8, 1.2) 66 | self.saturation = (0.8, 1.2) 67 | self.hue = (-0.1, 0.1) 68 | transforms.ColorJitter.get_params( 69 | self.brightness, self.contrast, self.saturation, self.hue) 70 | except TypeError: 71 | self.brightness = 0.2 72 | self.contrast = 0.2 73 | self.saturation = 0.2 74 | self.hue = 0.1 75 | 76 | self.resize = {} 77 | for i in range(self.num_scales): 78 | s = 2 ** i 79 | self.resize[i] = transforms.Resize((self.height // s, self.width // s), 80 | interpolation=self.interp) 81 | 82 | self.load_depth = self.check_depth() 83 | 84 | def preprocess(self, inputs, color_aug): 85 | """Resize colour images to the required scales and augment if required 86 | 87 | color_aug object is created in advance and the same augmentation is applied to all 88 | images in this item. This ensures that all images input to the pose network receive the 89 | same augmentation. 90 | """ 91 | for k in list(inputs): 92 | frame = inputs[k] 93 | if "color" in k: 94 | n, im, i = k 95 | for i in range(self.num_scales): 96 | inputs[(n, im, i)] = self.resize[i](inputs[(n, im, i - 1)]) 97 | 98 | for k in list(inputs): 99 | f = inputs[k] 100 | if "color" in k: 101 | n, im, i = k 102 | inputs[(n, im, i)] = self.to_tensor(f) 103 | inputs[(n + "_aug", im, i)] = self.to_tensor(color_aug(f)) 104 | 105 | def __len__(self): 106 | return len(self.filenames) 107 | 108 | def __getitem__(self, index): 109 | """Returns a single training item from the dataset as a dictionary. 110 | 111 | Values correspond to torch tensors. 112 | Keys in the dictionary are either strings or tuples: 113 | 114 | ("color", , ) for raw colour images, 115 | ("color_aug", , ) for augmented colour images, 116 | ("K", scale) or ("inv_K", scale) for camera intrinsics, 117 | "stereo_T" for camera extrinsics, and 118 | "depth_gt" for ground truth depth maps. 119 | 120 | is either: 121 | an integer (e.g. 0, -1, or 1) representing the temporal step relative to 'index', 122 | or 123 | "s" for the opposite image in the stereo pair. 
124 | 125 | is an integer representing the scale of the image relative to the fullsize image: 126 | -1 images at native resolution as loaded from disk 127 | 0 images resized to (self.width, self.height ) 128 | 1 images resized to (self.width // 2, self.height // 2) 129 | 2 images resized to (self.width // 4, self.height // 4) 130 | 3 images resized to (self.width // 8, self.height // 8) 131 | """ 132 | inputs = {} 133 | 134 | do_color_aug = self.is_train and random.random() > 0.5 135 | do_flip = self.is_train and random.random() > 0.5 136 | 137 | line = self.filenames[index].split() 138 | folder = line[0] 139 | 140 | if len(line) == 3: 141 | frame_index = int(line[1]) 142 | else: 143 | frame_index = 0 144 | 145 | if len(line) == 3: 146 | side = line[2] 147 | else: 148 | side = None 149 | 150 | for i in self.frame_idxs: 151 | if i == "s": 152 | other_side = {"r": "l", "l": "r"}[side] 153 | inputs[("color", i, -1)] = self.get_color(folder, frame_index, other_side, do_flip) 154 | else: 155 | inputs[("color", i, -1)] = self.get_color(folder, frame_index + i, side, do_flip) 156 | 157 | # adjusting intrinsics to match each scale in the pyramid 158 | for scale in range(self.num_scales): 159 | K = self.K.copy() 160 | 161 | K[0, :] *= self.width // (2 ** scale) 162 | K[1, :] *= self.height // (2 ** scale) 163 | 164 | inv_K = np.linalg.pinv(K) 165 | 166 | inputs[("K", scale)] = torch.from_numpy(K) 167 | inputs[("inv_K", scale)] = torch.from_numpy(inv_K) 168 | 169 | if do_color_aug: 170 | color_aug = transforms.ColorJitter.get_params( 171 | self.brightness, self.contrast, self.saturation, self.hue) 172 | else: 173 | color_aug = (lambda x: x) 174 | 175 | self.preprocess(inputs, color_aug) 176 | 177 | for i in self.frame_idxs: 178 | del inputs[("color", i, -1)] 179 | del inputs[("color_aug", i, -1)] 180 | 181 | if self.load_depth: 182 | depth_gt = self.get_depth(folder, frame_index, side, do_flip) 183 | inputs["depth_gt"] = np.expand_dims(depth_gt, 0) 184 | inputs["depth_gt"] = torch.from_numpy(inputs["depth_gt"].astype(np.float32)) 185 | 186 | if "s" in self.frame_idxs: 187 | stereo_T = np.eye(4, dtype=np.float32) 188 | baseline_sign = -1 if do_flip else 1 189 | side_sign = -1 if side == "l" else 1 190 | stereo_T[0, 3] = side_sign * baseline_sign * 0.1 191 | 192 | inputs["stereo_T"] = torch.from_numpy(stereo_T) 193 | 194 | return inputs 195 | 196 | def get_color(self, folder, frame_index, side, do_flip): 197 | raise NotImplementedError 198 | 199 | def check_depth(self): 200 | raise NotImplementedError 201 | 202 | def get_depth(self, folder, frame_index, side, do_flip): 203 | raise NotImplementedError 204 | -------------------------------------------------------------------------------- /depthComparison.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sjsu-smart-lab/Self-supervised-Monocular-Trained-Depth-Estimation-using-Self-attention-and-Discrete-Disparity-Volum/3c6f46ab03cfd424b677dfeb0c4a45d6269415a9/depthComparison.png -------------------------------------------------------------------------------- /evaluate_city_depth.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | import os 4 | import cv2 5 | import numpy as np 6 | 7 | import torch 8 | from torch.utils.data import DataLoader 9 | 10 | from layers import disp_to_depth 11 | from options import MonodepthOptions 12 | import datasets 13 | import networks 14 | 15 | 
cv2.setNumThreads(0) # This speeds up evaluation 5x on our unix systems (OpenCV 3.3.1) 16 | 17 | splits_dir = os.path.join(os.path.dirname(__file__), "splits") 18 | 19 | 20 | def compute_errors(gt, pred): 21 | """Computation of error metrics between predicted and ground truth depths 22 | """ 23 | thresh = np.maximum((gt / pred), (pred / gt)) 24 | a1 = (thresh < 1.25).mean() 25 | a2 = (thresh < 1.25 ** 2).mean() 26 | a3 = (thresh < 1.25 ** 3).mean() 27 | 28 | rmse = (gt - pred) ** 2 29 | rmse = np.sqrt(rmse.mean()) 30 | 31 | rmse_log = (np.log(gt) - np.log(pred)) ** 2 32 | rmse_log = np.sqrt(rmse_log.mean()) 33 | 34 | abs_rel = np.mean(np.abs(gt - pred) / gt) 35 | 36 | sq_rel = np.mean(((gt - pred) ** 2) / gt) 37 | 38 | return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3 39 | 40 | 41 | def batch_post_process_disparity(l_disp, r_disp): 42 | """Apply the disparity post-processing method as introduced in Monodepthv1 43 | """ 44 | _, h, w = l_disp.shape 45 | m_disp = 0.5 * (l_disp + r_disp) 46 | l, _ = np.meshgrid(np.linspace(0, 1, w), np.linspace(0, 1, h)) 47 | l_mask = (1.0 - np.clip(20 * (l - 0.05), 0, 1))[None, ...] 48 | r_mask = l_mask[:, :, ::-1] 49 | return r_mask * l_disp + l_mask * r_disp + (1.0 - l_mask - r_mask) * m_disp 50 | 51 | 52 | def evaluate(opt): 53 | """Evaluates a pretrained model using a specified test set 54 | """ 55 | MIN_DEPTH = 0.001 56 | MAX_DEPTH = 80 57 | 58 | opt.load_weights_folder = os.path.expanduser(opt.load_weights_folder) 59 | 60 | assert os.path.isdir(opt.load_weights_folder), \ 61 | "Cannot find a folder at {}".format(opt.load_weights_folder) 62 | 63 | print("-> Loading weights from {}".format(opt.load_weights_folder)) 64 | 65 | encoder_path = os.path.join(opt.load_weights_folder, "encoder.pth") 66 | decoder_path = os.path.join(opt.load_weights_folder, "depth.pth") 67 | 68 | encoder_dict = torch.load(encoder_path) 69 | 70 | # Evaluate on cityscapes dataset 71 | CITY_DIR = opt.data_path 72 | cityscapes = datasets.CityscapesData(CITY_DIR) 73 | cityscapes_dataloader = DataLoader(cityscapes, 4, shuffle=False, num_workers=6, 74 | pin_memory=True, drop_last=False) 75 | 76 | encoder = networks.get_resnet101_asp_oc_dsn( 77 | 128, opt.no_self_attention, False) 78 | depth_decoder = networks.MSDepthDecoder(encoder.num_ch_enc, 79 | discretization=opt.discretization) 80 | 81 | model_dict = encoder.state_dict() 82 | encoder.load_state_dict({k: v for k, v in encoder_dict.items() if k in model_dict}) 83 | depth_decoder.load_state_dict(torch.load(decoder_path)) 84 | 85 | encoder.cuda() 86 | encoder.eval() 87 | depth_decoder.cuda() 88 | depth_decoder.eval() 89 | 90 | pred_disps = [] 91 | 92 | print("-> Computing predictions with size {}x{}".format( 93 | encoder_dict['width'], encoder_dict['height'])) 94 | 95 | with torch.no_grad(): 96 | for data in cityscapes_dataloader: 97 | input_color = data.permute(0, 3, 1, 2) 98 | input_color = input_color.cuda() 99 | output = depth_decoder(encoder(input_color)) 100 | 101 | pred_disp, _ = disp_to_depth(output[("disp", 0)], opt.min_depth, opt.max_depth) 102 | pred_disp = pred_disp.cpu()[:, 0].numpy() 103 | 104 | pred_disps.append(pred_disp) 105 | 106 | pred_disps = np.concatenate(pred_disps) 107 | 108 | gt_path = os.path.join(splits_dir, "gt_depths_cityscapes.npz") 109 | gt_depths = np.load(gt_path, fix_imports=True, encoding='latin1', allow_pickle=True)["data"] 110 | 111 | print("-> Evaluating") 112 | print(" Mono evaluation - using median scaling") 113 | 114 | errors = [] 115 | ratios = [] 116 | 117 | for i in 
range(pred_disps.shape[0]): 118 | 119 | gt_depth = gt_depths[i] 120 | gt_depth[gt_depth > 0] = (gt_depth[gt_depth > 0] - 1) / 256 121 | gt_depth[gt_depth > 0] = (0.209313 * 2262.52) / gt_depth[gt_depth > 0] 122 | gt_depth[gt_depth > MAX_DEPTH] = 0 123 | gt_height, gt_width = gt_depth.shape[:2] 124 | 125 | pred_disp = pred_disps[i] 126 | pred_disp = cv2.resize(pred_disp, (gt_width, gt_height)) 127 | pred_depth = 1 / pred_disp 128 | 129 | mask = np.logical_and(gt_depth > MIN_DEPTH, gt_depth < MAX_DEPTH) 130 | 131 | crop = np.array([0.05 * gt_height, 0.80 * gt_height, 132 | 0.05 * gt_width, 0.99 * gt_width]).astype(np.int32) 133 | crop_mask = np.zeros(mask.shape) 134 | crop_mask[crop[0]:crop[1], crop[2]:crop[3]] = 1 135 | mask = np.logical_and(mask, crop_mask) 136 | 137 | pred_depth = pred_depth[mask] 138 | gt_depth = gt_depth[mask] 139 | 140 | pred_depth *= opt.pred_depth_scale_factor 141 | if not opt.disable_median_scaling: 142 | ratio = np.median(gt_depth) / np.median(pred_depth) 143 | ratios.append(ratio) 144 | pred_depth *= ratio 145 | 146 | pred_depth[pred_depth < MIN_DEPTH] = MIN_DEPTH 147 | 148 | pred_depth[pred_depth > MAX_DEPTH] = MAX_DEPTH 149 | 150 | errors.append(compute_errors(gt_depth, pred_depth)) 151 | 152 | if not opt.disable_median_scaling: 153 | ratios = np.array(ratios) 154 | med = np.median(ratios) 155 | print(" Scaling ratios | med: {:0.3f} | std: {:0.3f}".format(med, np.std(ratios / med))) 156 | 157 | mean_errors = np.array(errors).mean(0) 158 | 159 | print("\n " + ("{:>8} | " * 7).format("abs_rel", "sq_rel", "rmse", "rmse_log", "a1", "a2", "a3")) 160 | print(("&{: 8.3f} " * 7).format(*mean_errors.tolist()) + "\\\\") 161 | print("\n-> Done!") 162 | 163 | 164 | if __name__ == "__main__": 165 | options = MonodepthOptions() 166 | evaluate(options.parse()) 167 | -------------------------------------------------------------------------------- /evaluate_depth.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | import os 4 | import cv2 5 | import numpy as np 6 | 7 | import torch 8 | from torch.utils.data import DataLoader 9 | 10 | from layers import disp_to_depth 11 | from utils import readlines 12 | from options import MonodepthOptions 13 | import datasets 14 | import networks 15 | 16 | cv2.setNumThreads(0) # This speeds up evaluation 5x on our unix systems (OpenCV 3.3.1) 17 | 18 | 19 | splits_dir = os.path.join(os.path.dirname(__file__), "splits") 20 | 21 | # Models which were trained with stereo supervision were trained with a nominal 22 | # baseline of 0.1 units. The KITTI rig has a baseline of 54cm. Therefore, 23 | # to convert our stereo predictions to real-world scale we multiply our depths by 5.4. 
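# In other words, 0.54 m (actual KITTI baseline) / 0.1 (nominal training baseline) = 5.4.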
24 | STEREO_SCALE_FACTOR = 5.4 25 | 26 | 27 | def compute_errors(gt, pred): 28 | """Computation of error metrics between predicted and ground truth depths 29 | """ 30 | thresh = np.maximum((gt / pred), (pred / gt)) 31 | a1 = (thresh < 1.25 ).mean() 32 | a2 = (thresh < 1.25 ** 2).mean() 33 | a3 = (thresh < 1.25 ** 3).mean() 34 | 35 | rmse = (gt - pred) ** 2 36 | rmse = np.sqrt(rmse.mean()) 37 | 38 | rmse_log = (np.log(gt) - np.log(pred)) ** 2 39 | rmse_log = np.sqrt(rmse_log.mean()) 40 | 41 | abs_rel = np.mean(np.abs(gt - pred) / gt) 42 | 43 | sq_rel = np.mean(((gt - pred) ** 2) / gt) 44 | 45 | return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3 46 | 47 | 48 | def batch_post_process_disparity(l_disp, r_disp): 49 | """Apply the disparity post-processing method as introduced in Monodepthv1 50 | """ 51 | _, h, w = l_disp.shape 52 | m_disp = 0.5 * (l_disp + r_disp) 53 | l, _ = np.meshgrid(np.linspace(0, 1, w), np.linspace(0, 1, h)) 54 | l_mask = (1.0 - np.clip(20 * (l - 0.05), 0, 1))[None, ...] 55 | r_mask = l_mask[:, :, ::-1] 56 | return r_mask * l_disp + l_mask * r_disp + (1.0 - l_mask - r_mask) * m_disp 57 | 58 | 59 | def evaluate(opt): 60 | """Evaluates a pretrained model using a specified test set 61 | """ 62 | MIN_DEPTH = 1e-3 63 | MAX_DEPTH = 80 64 | 65 | assert sum((opt.eval_mono, opt.eval_stereo)) == 1, \ 66 | "Please choose mono or stereo evaluation by setting either --eval_mono or --eval_stereo" 67 | 68 | if opt.ext_disp_to_eval is None: 69 | 70 | opt.load_weights_folder = os.path.expanduser(opt.load_weights_folder) 71 | 72 | assert os.path.isdir(opt.load_weights_folder), \ 73 | "Cannot find a folder at {}".format(opt.load_weights_folder) 74 | 75 | print("-> Loading weights from {}".format(opt.load_weights_folder)) 76 | 77 | filenames = readlines(os.path.join(splits_dir, opt.eval_split, "test_files.txt")) 78 | encoder_path = os.path.join(opt.load_weights_folder, "encoder.pth") 79 | decoder_path = os.path.join(opt.load_weights_folder, "depth.pth") 80 | 81 | encoder_dict = torch.load(encoder_path) 82 | 83 | img_ext = '.png' if opt.png else '.jpg' 84 | dataset = datasets.KITTIRAWDataset(opt.data_path, filenames, 85 | encoder_dict['height'], encoder_dict['width'], 86 | [0], 4, is_train=False, img_ext=img_ext) 87 | dataloader = DataLoader(dataset, 16, shuffle=False, num_workers=opt.num_workers, 88 | pin_memory=True, drop_last=False) 89 | 90 | if opt.no_ddv: 91 | encoder = networks.get_resnet101_asp_oc_dsn( 92 | 2048, opt.no_self_attention, False) 93 | depth_decoder = networks.DepthDecoder( 94 | encoder.num_ch_enc) 95 | else: 96 | encoder = networks.get_resnet101_asp_oc_dsn( 97 | 128, opt.no_self_attention, False) 98 | depth_decoder = networks.MSDepthDecoder( 99 | encoder.num_ch_enc, discretization=opt.discretization) 100 | 101 | model_dict = encoder.state_dict() 102 | encoder.load_state_dict({k: v for k, v in encoder_dict.items() if k in model_dict}) 103 | depth_decoder.load_state_dict(torch.load(decoder_path)) 104 | 105 | encoder.cuda() 106 | encoder.eval() 107 | depth_decoder.cuda() 108 | depth_decoder.eval() 109 | 110 | pred_disps = [] 111 | 112 | print("-> Computing predictions with size {}x{}".format( 113 | encoder_dict['width'], encoder_dict['height'])) 114 | 115 | with torch.no_grad(): 116 | for data in dataloader: 117 | input_color = data[("color", 0, 0)].cuda() 118 | 119 | if opt.post_process: 120 | # Post-processed results require each image to have two forward passes 121 | input_color = torch.cat((input_color, torch.flip(input_color, [3])), 0) 122 | 123 | features = 
encoder(input_color) 124 | if opt.no_ddv: 125 | output = depth_decoder(features) 126 | else: 127 | all_features = {} 128 | all_features['conv3'] = features[0] 129 | all_features['layer1'] = features[1] 130 | all_features['output'] = features[-1] 131 | output = depth_decoder(all_features) 132 | 133 | pred_disp, _ = disp_to_depth(output[("disp", 0)], opt.min_depth, opt.max_depth) 134 | pred_disp = pred_disp.cpu()[:, 0].numpy() 135 | 136 | if opt.post_process: 137 | N = pred_disp.shape[0] // 2 138 | pred_disp = batch_post_process_disparity(pred_disp[:N], pred_disp[N:, :, ::-1]) 139 | 140 | pred_disps.append(pred_disp) 141 | 142 | pred_disps = np.concatenate(pred_disps) 143 | 144 | else: 145 | # Load predictions from file 146 | print("-> Loading predictions from {}".format(opt.ext_disp_to_eval)) 147 | pred_disps = np.load(opt.ext_disp_to_eval) 148 | 149 | if opt.eval_eigen_to_benchmark: 150 | eigen_to_benchmark_ids = np.load( 151 | os.path.join(splits_dir, "benchmark", "eigen_to_benchmark_ids.npy")) 152 | 153 | pred_disps = pred_disps[eigen_to_benchmark_ids] 154 | 155 | if opt.save_pred_disps: 156 | output_path = os.path.join( 157 | opt.load_weights_folder, "disps_{}_split.npy".format(opt.eval_split)) 158 | print("-> Saving predicted disparities to ", output_path) 159 | np.save(output_path, pred_disps) 160 | 161 | if opt.no_eval: 162 | print("-> Evaluation disabled. Done.") 163 | quit() 164 | 165 | elif opt.eval_split == 'benchmark': 166 | save_dir = os.path.join(opt.load_weights_folder, "benchmark_predictions") 167 | print("-> Saving out benchmark predictions to {}".format(save_dir)) 168 | if not os.path.exists(save_dir): 169 | os.makedirs(save_dir) 170 | 171 | for idx in range(len(pred_disps)): 172 | disp_resized = cv2.resize(pred_disps[idx], (1216, 352)) 173 | depth = STEREO_SCALE_FACTOR / disp_resized 174 | depth = np.clip(depth, 0, 80) 175 | depth = np.uint16(depth * 256) 176 | save_path = os.path.join(save_dir, "{:010d}.png".format(idx)) 177 | cv2.imwrite(save_path, depth) 178 | 179 | print("-> No ground truth is available for the KITTI benchmark, so not evaluating. 
Done.") 180 | quit() 181 | 182 | gt_path = os.path.join(splits_dir, opt.eval_split, "gt_depths.npz") 183 | gt_depths = np.load(gt_path, fix_imports=True, encoding='latin1', allow_pickle=True)["data"] 184 | 185 | print("-> Evaluating") 186 | 187 | if opt.eval_stereo: 188 | print(" Stereo evaluation - " 189 | "disabling median scaling, scaling by {}".format(STEREO_SCALE_FACTOR)) 190 | opt.disable_median_scaling = True 191 | opt.pred_depth_scale_factor = STEREO_SCALE_FACTOR 192 | else: 193 | print(" Mono evaluation - using median scaling") 194 | 195 | errors = [] 196 | ratios = [] 197 | 198 | for i in range(pred_disps.shape[0]): 199 | 200 | gt_depth = gt_depths[i] 201 | gt_height, gt_width = gt_depth.shape[:2] 202 | 203 | pred_disp = pred_disps[i] 204 | pred_disp = cv2.resize(pred_disp, (gt_width, gt_height)) 205 | pred_depth = 1 / pred_disp 206 | 207 | if opt.eval_split == "eigen": 208 | mask = np.logical_and(gt_depth > MIN_DEPTH, gt_depth < MAX_DEPTH) 209 | 210 | crop = np.array([0.40810811 * gt_height, 0.99189189 * gt_height, 211 | 0.03594771 * gt_width, 0.96405229 * gt_width]).astype(np.int32) 212 | crop_mask = np.zeros(mask.shape) 213 | crop_mask[crop[0]:crop[1], crop[2]:crop[3]] = 1 214 | mask = np.logical_and(mask, crop_mask) 215 | 216 | else: 217 | mask = gt_depth > 0 218 | 219 | pred_depth = pred_depth[mask] 220 | gt_depth = gt_depth[mask] 221 | 222 | pred_depth *= opt.pred_depth_scale_factor 223 | if not opt.disable_median_scaling: 224 | ratio = np.median(gt_depth) / np.median(pred_depth) 225 | ratios.append(ratio) 226 | pred_depth *= ratio 227 | 228 | pred_depth[pred_depth < MIN_DEPTH] = MIN_DEPTH 229 | pred_depth[pred_depth > MAX_DEPTH] = MAX_DEPTH 230 | 231 | errors.append(compute_errors(gt_depth, pred_depth)) 232 | 233 | if not opt.disable_median_scaling: 234 | ratios = np.array(ratios) 235 | med = np.median(ratios) 236 | print(" Scaling ratios | med: {:0.3f} | std: {:0.3f}".format(med, np.std(ratios / med))) 237 | 238 | mean_errors = np.array(errors).mean(0) 239 | 240 | print("\n " + ("{:>8} | " * 7).format("abs_rel", "sq_rel", "rmse", "rmse_log", "a1", "a2", "a3")) 241 | print(("&{: 8.3f} " * 7).format(*mean_errors.tolist()) + "\\\\") 242 | print("\n-> Done!") 243 | 244 | 245 | if __name__ == "__main__": 246 | options = MonodepthOptions() 247 | evaluate(options.parse()) 248 | -------------------------------------------------------------------------------- /evaluate_pose.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | import os 4 | import numpy as np 5 | 6 | import torch 7 | from torch.utils.data import DataLoader 8 | 9 | from layers import transformation_from_parameters 10 | from utils import readlines 11 | from options import MonodepthOptions 12 | from datasets import KITTIOdomDataset 13 | import networks 14 | 15 | 16 | def dump_xyz(source_to_target_transformations): 17 | xyzs = [] 18 | cam_to_world = np.eye(4) 19 | xyzs.append(cam_to_world[:3, 3]) 20 | for source_to_target_transformation in source_to_target_transformations: 21 | cam_to_world = np.dot(cam_to_world, source_to_target_transformation) 22 | xyzs.append(cam_to_world[:3, 3]) 23 | return xyzs 24 | 25 | 26 | def compute_ate(gtruth_xyz, pred_xyz_o): 27 | 28 | # Make sure that the first matched frames align (no need for rotational alignment as 29 | # all the predicted/ground-truth snippets have been converted to use the same coordinate 30 | # system with the first frame of the snippet being the origin). 
31 | offset = gtruth_xyz[0] - pred_xyz_o[0] 32 | pred_xyz = pred_xyz_o + offset[None, :] 33 | 34 | # Optimize the scaling factor 35 | scale = np.sum(gtruth_xyz * pred_xyz) / np.sum(pred_xyz ** 2) 36 | alignment_error = pred_xyz * scale - gtruth_xyz 37 | rmse = np.sqrt(np.sum(alignment_error ** 2)) / gtruth_xyz.shape[0] 38 | return rmse 39 | 40 | 41 | def evaluate(opt): 42 | """Evaluate odometry on the KITTI dataset 43 | """ 44 | assert os.path.isdir(opt.load_weights_folder), \ 45 | "Cannot find a folder at {}".format(opt.load_weights_folder) 46 | 47 | assert opt.eval_split == "odom_9" or opt.eval_split == "odom_10", \ 48 | "eval_split should be either odom_9 or odom_10" 49 | 50 | sequence_id = int(opt.eval_split.split("_")[1]) 51 | 52 | filenames = readlines( 53 | os.path.join(os.path.dirname(__file__), "splits", "odom", 54 | "test_files_{:02d}.txt".format(sequence_id))) 55 | 56 | dataset = KITTIOdomDataset(opt.data_path, filenames, opt.height, opt.width, 57 | [0, 1], 4, is_train=False) 58 | dataloader = DataLoader(dataset, opt.batch_size, shuffle=False, 59 | num_workers=opt.num_workers, pin_memory=True, drop_last=False) 60 | 61 | pose_encoder_path = os.path.join(opt.load_weights_folder, "pose_encoder.pth") 62 | pose_decoder_path = os.path.join(opt.load_weights_folder, "pose.pth") 63 | 64 | pose_encoder = networks.ResnetEncoder(opt.num_layers, False, 2) 65 | pose_encoder.load_state_dict(torch.load(pose_encoder_path)) 66 | 67 | pose_decoder = networks.PoseDecoder(pose_encoder.num_ch_enc, 1, 2) 68 | pose_decoder.load_state_dict(torch.load(pose_decoder_path)) 69 | 70 | pose_encoder.cuda() 71 | pose_encoder.eval() 72 | pose_decoder.cuda() 73 | pose_decoder.eval() 74 | 75 | pred_poses = [] 76 | 77 | print("-> Computing pose predictions") 78 | 79 | opt.frame_ids = [0, 1] # pose network only takes two frames as input 80 | 81 | with torch.no_grad(): 82 | for inputs in dataloader: 83 | for key, ipt in inputs.items(): 84 | inputs[key] = ipt.cuda() 85 | 86 | all_color_aug = torch.cat([inputs[("color_aug", i, 0)] for i in opt.frame_ids], 1) 87 | 88 | features = [pose_encoder(all_color_aug)] 89 | axisangle, translation = pose_decoder(features) 90 | 91 | pred_poses.append( 92 | transformation_from_parameters(axisangle[:, 0], translation[:, 0]).cpu().numpy()) 93 | 94 | pred_poses = np.concatenate(pred_poses) 95 | 96 | gt_poses_path = os.path.join(opt.data_path, "poses", "{:02d}.txt".format(sequence_id)) 97 | gt_global_poses = np.loadtxt(gt_poses_path).reshape(-1, 3, 4) 98 | gt_global_poses = np.concatenate( 99 | (gt_global_poses, np.zeros((gt_global_poses.shape[0], 1, 4))), 1) 100 | gt_global_poses[:, 3, 3] = 1 101 | gt_xyzs = gt_global_poses[:, :3, 3] 102 | 103 | gt_local_poses = [] 104 | for i in range(1, len(gt_global_poses)): 105 | gt_local_poses.append( 106 | np.linalg.inv(np.dot(np.linalg.inv(gt_global_poses[i - 1]), gt_global_poses[i]))) 107 | 108 | ates = [] 109 | num_frames = gt_xyzs.shape[0] 110 | track_length = 5 111 | for i in range(0, num_frames - 1): 112 | local_xyzs = np.array(dump_xyz(pred_poses[i:i + track_length - 1])) 113 | gt_local_xyzs = np.array(dump_xyz(gt_local_poses[i:i + track_length - 1])) 114 | 115 | ates.append(compute_ate(gt_local_xyzs, local_xyzs)) 116 | 117 | print("\n Trajectory error: {:0.3f}, std: {:0.3f}\n".format(np.mean(ates), np.std(ates))) 118 | 119 | save_path = os.path.join(opt.load_weights_folder, "poses.npy") 120 | np.save(save_path, pred_poses) 121 | print("-> Predictions saved to", save_path) 122 | 123 | 124 | if __name__ == "__main__": 125 | options = 
MonodepthOptions() 126 | evaluate(options.parse()) 127 | -------------------------------------------------------------------------------- /export_gt_depth.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | import os 4 | 5 | import argparse 6 | import numpy as np 7 | import PIL.Image as pil 8 | import cv2 9 | from pathlib import Path 10 | 11 | from utils import readlines 12 | from kitti_utils import generate_depth_map 13 | 14 | 15 | def export_gt_depths_kitti(opt): 16 | """ 17 | Generate ground-truth data and store as .npz file 18 | """ 19 | split_folder = os.path.join(os.path.dirname(__file__), "splits", opt.split) 20 | lines = readlines(os.path.join(split_folder, "test_files.txt")) 21 | 22 | print("Exporting ground truth depths for {}".format(opt.split)) 23 | 24 | gt_depths = [] 25 | for line in lines: 26 | 27 | folder, frame_id, _ = line.split() 28 | frame_id = int(frame_id) 29 | 30 | if opt.split == "eigen": 31 | calib_dir = os.path.join(opt.data_path, folder.split("/")[0]) 32 | velo_filename = os.path.join(opt.data_path, folder, 33 | "velodyne_points/data", "{:010d}.bin".format(frame_id)) 34 | gt_depth = generate_depth_map(calib_dir, velo_filename, 2, True) 35 | elif opt.split == "eigen_benchmark": 36 | gt_depth_path = os.path.join(opt.data_path, folder, "proj_depth", 37 | "groundtruth", "image_02", "{:010d}.png".format(frame_id)) 38 | gt_depth = np.array(pil.open(gt_depth_path)).astype(np.float32) / 256 39 | 40 | gt_depths.append(gt_depth.astype(np.float32)) 41 | 42 | output_path = os.path.join(split_folder, "gt_depths.npz") 43 | 44 | print("Saving to {}".format(opt.split)) 45 | 46 | np.savez_compressed(output_path, data=np.array(gt_depths)) 47 | 48 | 49 | def export_gt_depths_cityscapes(opt): 50 | """ 51 | Load ground-truth in the dataset an store as .npz file 52 | """ 53 | split_folder = os.path.join(os.path.dirname(__file__), "splits") 54 | gt_depths = [] 55 | print("Exporting ground truth depths for {}".format(opt.dataset)) 56 | folder_path = opt.data_path 57 | all_imgs = sorted(list(Path(folder_path).glob('**/*.png'))) 58 | for line in all_imgs: 59 | # gt_depth_path = os.path.join(opt.data_path, line) 60 | gt_depth = cv2.imread(str(line), cv2.IMREAD_UNCHANGED) 61 | gt_depth = (cv2.resize(gt_depth, (1242, 375), cv2.INTER_AREA)) 62 | 63 | gt_depths.append(gt_depth.astype(np.float32)) 64 | 65 | output_path = os.path.join(split_folder, "gt_depths_cityscapes.npz") 66 | 67 | print("Saving to {}".format(output_path)) 68 | 69 | np.savez_compressed(output_path, data=np.array(gt_depths)) 70 | 71 | 72 | def main(): 73 | parser = argparse.ArgumentParser(description='export_gt_depth') 74 | 75 | parser.add_argument('--data_path', 76 | type=str, 77 | help='path to the root of the KITTI data', 78 | required=True) 79 | parser.add_argument('--dataset', 80 | type=str, 81 | help='which split to export gt from', 82 | required=True, 83 | choices=["kitti", "cityscapes"]) 84 | parser.add_argument('--split', 85 | type=str, 86 | help='which split to export gt from', 87 | choices=["eigen", "eigen_benchmark"]) 88 | 89 | opt = parser.parse_args() 90 | 91 | 92 | if opt.dataset == 'kitti': 93 | export_gt_depths_kitti(opt) 94 | elif opt.dataset == 'cityscapes': 95 | export_gt_depths_cityscapes(opt) 96 | 97 | 98 | if __name__ == "__main__": 99 | main() 100 | -------------------------------------------------------------------------------- /inplace_abn/__init__.py: 
-------------------------------------------------------------------------------- 1 | from .bn import ABN, InPlaceABN, InPlaceABNSync 2 | from .functions import ACT_RELU, ACT_LEAKY_RELU, ACT_ELU, ACT_NONE 3 | -------------------------------------------------------------------------------- /inplace_abn/bn.py: -------------------------------------------------------------------------------- 1 | import os, sys 2 | import torch 3 | import torch.nn as nn 4 | import torch.nn.functional as functional 5 | from queue import Queue 6 | BASE_DIR = os.path.dirname(os.path.abspath(__file__)) 7 | sys.path.append(BASE_DIR) 8 | sys.path.append(os.path.join(BASE_DIR, '../src')) 9 | from inplace_abn.functions import * 10 | 11 | 12 | class ABN(nn.Module): 13 | """Activated Batch Normalization 14 | 15 | This gathers a `BatchNorm2d` and an activation function in a single module 16 | """ 17 | 18 | def __init__(self, num_features, eps=1e-5, momentum=0.1, affine=True, activation="leaky_relu", slope=0.01): 19 | """Creates an Activated Batch Normalization module 20 | 21 | Args: 22 | num_features : Number of feature channels in the input and output. 23 | eps : Small constant to prevent numerical issues. 24 | momentum : Momentum factor applied to compute running statistics as. 25 | affine : If `True` apply learned scale and shift transformation after normalization. 26 | activation : Name of the activation functions, one of: `leaky_relu`, `elu` or `none`. 27 | slope : Negative slope for the `leaky_relu` activation. 28 | """ 29 | super(ABN, self).__init__() 30 | self.num_features = num_features 31 | self.affine = affine 32 | self.eps = eps 33 | self.momentum = momentum 34 | self.activation = activation 35 | self.slope = slope 36 | if self.affine: 37 | self.weight = nn.Parameter(torch.ones(num_features)) 38 | self.bias = nn.Parameter(torch.zeros(num_features)) 39 | else: 40 | self.register_parameter('weight', None) 41 | self.register_parameter('bias', None) 42 | self.register_buffer('running_mean', torch.zeros(num_features)) 43 | self.register_buffer('running_var', torch.ones(num_features)) 44 | self.reset_parameters() 45 | 46 | def reset_parameters(self): 47 | nn.init.constant_(self.running_mean, 0) 48 | nn.init.constant_(self.running_var, 1) 49 | if self.affine: 50 | nn.init.constant_(self.weight, 1) 51 | nn.init.constant_(self.bias, 0) 52 | 53 | def forward(self, x): 54 | x = functional.batch_norm(x, self.running_mean, self.running_var, self.weight, self.bias, 55 | self.training, self.momentum, self.eps) 56 | 57 | if self.activation == ACT_RELU: 58 | return functional.relu(x, inplace=True) 59 | elif self.activation == ACT_LEAKY_RELU: 60 | return functional.leaky_relu(x, negative_slope=self.slope, inplace=True) 61 | elif self.activation == ACT_ELU: 62 | return functional.elu(x, inplace=True) 63 | else: 64 | return x 65 | 66 | def __repr__(self): 67 | rep = '{name}({num_features}, eps={eps}, momentum={momentum},' \ 68 | ' affine={affine}, activation={activation}' 69 | if self.activation == "leaky_relu": 70 | rep += ', slope={slope})' 71 | else: 72 | rep += ')' 73 | return rep.format(name=self.__class__.__name__, **self.__dict__) 74 | 75 | 76 | class InPlaceABN(ABN): 77 | """InPlace Activated Batch Normalization""" 78 | 79 | def __init__(self, num_features, eps=1e-5, momentum=0.1, affine=True, activation="leaky_relu", slope=0.01): 80 | """Creates an InPlace Activated Batch Normalization module 81 | 82 | Args: 83 | num_features : Number of feature channels in the input and output. 
84 | eps : mall constant to prevent numerical issues. 85 | momentum : Momentum factor applied to compute running statistics as. 86 | affine : If `True` apply learned scale and shift transformation after normalization. 87 | activation : Name of the activation functions, one of: `leaky_relu`, `elu` or `none`. 88 | slope : Negative slope for the `leaky_relu` activation. 89 | """ 90 | super(InPlaceABN, self).__init__(num_features, eps, momentum, affine, activation, slope) 91 | 92 | def forward(self, x): 93 | return inplace_abn(x, self.weight, self.bias, self.running_mean, self.running_var, 94 | self.training, self.momentum, self.eps, self.activation, self.slope) 95 | 96 | 97 | class InPlaceABNSync(ABN): 98 | """InPlace Activated Batch Normalization with cross-GPU synchronization 99 | 100 | This assumes that it will be replicated across GPUs using the same mechanism as in `nn.DataParallel`. 101 | """ 102 | 103 | def __init__(self, num_features, devices=None, eps=1e-5, momentum=0.1, affine=True, activation="leaky_relu", 104 | slope=0.01): 105 | """Creates a synchronized, InPlace Activated Batch Normalization module 106 | 107 | Args: 108 | num_features : Number of feature channels in the input and output. 109 | devices : IDs of the GPUs that will run the replicas of this module. 110 | eps : Small constant to prevent numerical issues. 111 | momentum : Momentum factor applied to compute running statistics as. 112 | affine : If `True` apply learned scale and shift transformation after normalization. 113 | activation : Name of the activation functions, one of: `leaky_relu`, `elu` or `none`. 114 | slope : Negative slope for the `leaky_relu` activation. 115 | """ 116 | super(InPlaceABNSync, self).__init__(num_features, eps, momentum, affine, activation, slope) 117 | self.devices = devices if devices else list(range(torch.cuda.device_count())) 118 | 119 | # Initialize queues 120 | self.worker_ids = self.devices[1:] 121 | self.master_queue = Queue(len(self.worker_ids)) 122 | self.worker_queues = [Queue(1) for _ in self.worker_ids] 123 | 124 | def forward(self, x): 125 | if x.get_device() == self.devices[0]: 126 | # Master mode 127 | extra = { 128 | "is_master": True, 129 | "master_queue": self.master_queue, 130 | "worker_queues": self.worker_queues, 131 | "worker_ids": self.worker_ids 132 | } 133 | else: 134 | # Worker mode 135 | extra = { 136 | "is_master": False, 137 | "master_queue": self.master_queue, 138 | "worker_queue": self.worker_queues[self.worker_ids.index(x.get_device())] 139 | } 140 | 141 | return inplace_abn_sync(x, self.weight, self.bias, self.running_mean, self.running_var, 142 | extra, self.training, self.momentum, self.eps, self.activation, self.slope) 143 | 144 | def __repr__(self): 145 | rep = '{name}({num_features}, eps={eps}, momentum={momentum},' \ 146 | ' affine={affine}, devices={devices}, activation={activation}' 147 | if self.activation == "leaky_relu": 148 | rep += ', slope={slope})' 149 | else: 150 | rep += ')' 151 | return rep.format(name=self.__class__.__name__, **self.__dict__) 152 | -------------------------------------------------------------------------------- /inplace_abn/functions.py: -------------------------------------------------------------------------------- 1 | from os import path 2 | 3 | import torch.autograd as autograd 4 | import torch.cuda.comm as comm 5 | from torch.autograd.function import once_differentiable 6 | from torch.utils.cpp_extension import load 7 | 8 | _src_path = path.join(path.dirname(path.abspath(__file__)), "src") 9 | _backend = 
load(name="inplace_abn", 10 | extra_cflags=["-O3"], 11 | sources=[path.join(_src_path, f) for f in [ 12 | "inplace_abn.cpp", 13 | "inplace_abn_cpu.cpp", 14 | "inplace_abn_cuda.cu" 15 | ]], 16 | extra_cuda_cflags=["--expt-extended-lambda"]) 17 | 18 | # Activation names 19 | ACT_RELU = "relu" 20 | ACT_LEAKY_RELU = "leaky_relu" 21 | ACT_ELU = "elu" 22 | ACT_NONE = "none" 23 | 24 | 25 | def _check(fn, *args, **kwargs): 26 | success = fn(*args, **kwargs) 27 | if not success: 28 | raise RuntimeError("CUDA Error encountered in {}".format(fn)) 29 | 30 | 31 | def _broadcast_shape(x): 32 | out_size = [] 33 | for i, s in enumerate(x.size()): 34 | if i != 1: 35 | out_size.append(1) 36 | else: 37 | out_size.append(s) 38 | return out_size 39 | 40 | 41 | def _reduce(x): 42 | if len(x.size()) == 2: 43 | return x.sum(dim=0) 44 | else: 45 | n, c = x.size()[0:2] 46 | return x.contiguous().view((n, c, -1)).sum(2).sum(0) 47 | 48 | 49 | def _count_samples(x): 50 | count = 1 51 | for i, s in enumerate(x.size()): 52 | if i != 1: 53 | count *= s 54 | return count 55 | 56 | 57 | def _act_forward(ctx, x): 58 | if ctx.activation == ACT_LEAKY_RELU: 59 | _backend.leaky_relu_forward(x, ctx.slope) 60 | elif ctx.activation == ACT_ELU: 61 | _backend.elu_forward(x) 62 | elif ctx.activation == ACT_NONE: 63 | pass 64 | 65 | 66 | def _act_backward(ctx, x, dx): 67 | if ctx.activation == ACT_LEAKY_RELU: 68 | _backend.leaky_relu_backward(x, dx, ctx.slope) 69 | elif ctx.activation == ACT_ELU: 70 | _backend.elu_backward(x, dx) 71 | elif ctx.activation == ACT_NONE: 72 | pass 73 | 74 | 75 | class InPlaceABN(autograd.Function): 76 | @staticmethod 77 | def forward(ctx, x, weight, bias, running_mean, running_var, 78 | training=True, momentum=0.1, eps=1e-05, activation=ACT_LEAKY_RELU, slope=0.01): 79 | # Save context 80 | ctx.training = training 81 | ctx.momentum = momentum 82 | ctx.eps = eps 83 | ctx.activation = activation 84 | ctx.slope = slope 85 | ctx.affine = weight is not None and bias is not None 86 | 87 | # Prepare inputs 88 | count = _count_samples(x) 89 | x = x.contiguous() 90 | weight = weight.contiguous() if ctx.affine else x.new_empty(0) 91 | bias = bias.contiguous() if ctx.affine else x.new_empty(0) 92 | 93 | if ctx.training: 94 | mean, var = _backend.mean_var(x) 95 | 96 | # Update running stats 97 | running_mean.mul_((1 - ctx.momentum)).add_(ctx.momentum * mean) 98 | running_var.mul_((1 - ctx.momentum)).add_(ctx.momentum * var * count / (count - 1)) 99 | 100 | # Mark in-place modified tensors 101 | ctx.mark_dirty(x, running_mean, running_var) 102 | else: 103 | mean, var = running_mean.contiguous(), running_var.contiguous() 104 | ctx.mark_dirty(x) 105 | 106 | # BN forward + activation 107 | _backend.forward(x, mean, var, weight, bias, ctx.affine, ctx.eps) 108 | _act_forward(ctx, x) 109 | 110 | # Output 111 | ctx.var = var 112 | ctx.save_for_backward(x, var, weight, bias) 113 | return x 114 | 115 | @staticmethod 116 | @once_differentiable 117 | def backward(ctx, dz): 118 | z, var, weight, bias = ctx.saved_tensors 119 | dz = dz.contiguous() 120 | 121 | # Undo activation 122 | _act_backward(ctx, z, dz) 123 | 124 | if ctx.training: 125 | edz, eydz = _backend.edz_eydz(z, dz, weight, bias, ctx.affine, ctx.eps) 126 | else: 127 | # TODO: implement simplified CUDA backward for inference mode 128 | edz = dz.new_zeros(dz.size(1)) 129 | eydz = dz.new_zeros(dz.size(1)) 130 | 131 | dx, dweight, dbias = _backend.backward(z, dz, var, weight, bias, edz, eydz, ctx.affine, ctx.eps) 132 | dweight = dweight if ctx.affine else None 133 | dbias 
= dbias if ctx.affine else None 134 | 135 | return dx, dweight, dbias, None, None, None, None, None, None, None 136 | 137 | 138 | class InPlaceABNSync(autograd.Function): 139 | @classmethod 140 | def forward(cls, ctx, x, weight, bias, running_mean, running_var, 141 | extra, training=True, momentum=0.1, eps=1e-05, activation=ACT_LEAKY_RELU, slope=0.01): 142 | # Save context 143 | cls._parse_extra(ctx, extra) 144 | ctx.training = training 145 | ctx.momentum = momentum 146 | ctx.eps = eps 147 | ctx.activation = activation 148 | ctx.slope = slope 149 | ctx.affine = weight is not None and bias is not None 150 | 151 | # Prepare inputs 152 | count = _count_samples(x) * (ctx.master_queue.maxsize + 1) 153 | x = x.contiguous() 154 | weight = weight.contiguous() if ctx.affine else x.new_empty(0) 155 | bias = bias.contiguous() if ctx.affine else x.new_empty(0) 156 | 157 | if ctx.training: 158 | mean, var = _backend.mean_var(x) 159 | 160 | if ctx.is_master: 161 | means, vars = [mean.unsqueeze(0)], [var.unsqueeze(0)] 162 | for _ in range(ctx.master_queue.maxsize): 163 | mean_w, var_w = ctx.master_queue.get() 164 | ctx.master_queue.task_done() 165 | means.append(mean_w.unsqueeze(0)) 166 | vars.append(var_w.unsqueeze(0)) 167 | 168 | means = comm.gather(means) 169 | vars = comm.gather(vars) 170 | 171 | mean = means.mean(0) 172 | var = (vars + (mean - means) ** 2).mean(0) 173 | 174 | tensors = comm.broadcast_coalesced((mean, var), [mean.get_device()] + ctx.worker_ids) 175 | for ts, queue in zip(tensors[1:], ctx.worker_queues): 176 | queue.put(ts) 177 | else: 178 | ctx.master_queue.put((mean, var)) 179 | mean, var = ctx.worker_queue.get() 180 | ctx.worker_queue.task_done() 181 | 182 | # Update running stats 183 | running_mean.mul_((1 - ctx.momentum)).add_(ctx.momentum * mean) 184 | running_var.mul_((1 - ctx.momentum)).add_(ctx.momentum * var * count / (count - 1)) 185 | 186 | # Mark in-place modified tensors 187 | ctx.mark_dirty(x, running_mean, running_var) 188 | else: 189 | mean, var = running_mean.contiguous(), running_var.contiguous() 190 | ctx.mark_dirty(x) 191 | 192 | # BN forward + activation 193 | _backend.forward(x, mean, var, weight, bias, ctx.affine, ctx.eps) 194 | _act_forward(ctx, x) 195 | 196 | # Output 197 | ctx.var = var 198 | ctx.save_for_backward(x, var, weight, bias) 199 | return x 200 | 201 | @staticmethod 202 | @once_differentiable 203 | def backward(ctx, dz): 204 | z, var, weight, bias = ctx.saved_tensors 205 | dz = dz.contiguous() 206 | 207 | # Undo activation 208 | _act_backward(ctx, z, dz) 209 | 210 | if ctx.training: 211 | edz, eydz = _backend.edz_eydz(z, dz, weight, bias, ctx.affine, ctx.eps) 212 | 213 | if ctx.is_master: 214 | edzs, eydzs = [edz], [eydz] 215 | for _ in range(len(ctx.worker_queues)): 216 | edz_w, eydz_w = ctx.master_queue.get() 217 | ctx.master_queue.task_done() 218 | edzs.append(edz_w) 219 | eydzs.append(eydz_w) 220 | 221 | edz = comm.reduce_add(edzs) / (ctx.master_queue.maxsize + 1) 222 | eydz = comm.reduce_add(eydzs) / (ctx.master_queue.maxsize + 1) 223 | 224 | tensors = comm.broadcast_coalesced((edz, eydz), [edz.get_device()] + ctx.worker_ids) 225 | for ts, queue in zip(tensors[1:], ctx.worker_queues): 226 | queue.put(ts) 227 | else: 228 | ctx.master_queue.put((edz, eydz)) 229 | edz, eydz = ctx.worker_queue.get() 230 | ctx.worker_queue.task_done() 231 | else: 232 | edz = dz.new_zeros(dz.size(1)) 233 | eydz = dz.new_zeros(dz.size(1)) 234 | 235 | dx, dweight, dbias = _backend.backward(z, dz, var, weight, bias, edz, eydz, ctx.affine, ctx.eps) 236 | dweight = 
dweight if ctx.affine else None 237 | dbias = dbias if ctx.affine else None 238 | 239 | return dx, dweight, dbias, None, None, None, None, None, None, None, None 240 | 241 | @staticmethod 242 | def _parse_extra(ctx, extra): 243 | ctx.is_master = extra["is_master"] 244 | if ctx.is_master: 245 | ctx.master_queue = extra["master_queue"] 246 | ctx.worker_queues = extra["worker_queues"] 247 | ctx.worker_ids = extra["worker_ids"] 248 | else: 249 | ctx.master_queue = extra["master_queue"] 250 | ctx.worker_queue = extra["worker_queue"] 251 | 252 | 253 | inplace_abn = InPlaceABN.apply 254 | inplace_abn_sync = InPlaceABNSync.apply 255 | 256 | __all__ = ["inplace_abn", "inplace_abn_sync", "ACT_RELU", "ACT_LEAKY_RELU", "ACT_ELU", "ACT_NONE"] 257 | -------------------------------------------------------------------------------- /inplace_abn/src/common.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include 4 | 5 | /* 6 | * General settings 7 | */ 8 | const int WARP_SIZE = 32; 9 | const int MAX_BLOCK_SIZE = 512; 10 | 11 | template 12 | struct Pair { 13 | T v1, v2; 14 | __device__ Pair() {} 15 | __device__ Pair(T _v1, T _v2) : v1(_v1), v2(_v2) {} 16 | __device__ Pair(T v) : v1(v), v2(v) {} 17 | __device__ Pair(int v) : v1(v), v2(v) {} 18 | __device__ Pair &operator+=(const Pair &a) { 19 | v1 += a.v1; 20 | v2 += a.v2; 21 | return *this; 22 | } 23 | }; 24 | 25 | /* 26 | * Utility functions 27 | */ 28 | template 29 | __device__ __forceinline__ T WARP_SHFL_XOR(T value, int laneMask, int width = warpSize, 30 | unsigned int mask = 0xffffffff) { 31 | #if CUDART_VERSION >= 9000 32 | return __shfl_xor_sync(mask, value, laneMask, width); 33 | #else 34 | return __shfl_xor(value, laneMask, width); 35 | #endif 36 | } 37 | 38 | __device__ __forceinline__ int getMSB(int val) { return 31 - __clz(val); } 39 | 40 | static int getNumThreads(int nElem) { 41 | int threadSizes[5] = {32, 64, 128, 256, MAX_BLOCK_SIZE}; 42 | for (int i = 0; i != 5; ++i) { 43 | if (nElem <= threadSizes[i]) { 44 | return threadSizes[i]; 45 | } 46 | } 47 | return MAX_BLOCK_SIZE; 48 | } 49 | 50 | template 51 | static __device__ __forceinline__ T warpSum(T val) { 52 | #if __CUDA_ARCH__ >= 300 53 | for (int i = 0; i < getMSB(WARP_SIZE); ++i) { 54 | val += WARP_SHFL_XOR(val, 1 << i, WARP_SIZE); 55 | } 56 | #else 57 | __shared__ T values[MAX_BLOCK_SIZE]; 58 | values[threadIdx.x] = val; 59 | __threadfence_block(); 60 | const int base = (threadIdx.x / WARP_SIZE) * WARP_SIZE; 61 | for (int i = 1; i < WARP_SIZE; i++) { 62 | val += values[base + ((i + threadIdx.x) % WARP_SIZE)]; 63 | } 64 | #endif 65 | return val; 66 | } 67 | 68 | template 69 | static __device__ __forceinline__ Pair warpSum(Pair value) { 70 | value.v1 = warpSum(value.v1); 71 | value.v2 = warpSum(value.v2); 72 | return value; 73 | } 74 | 75 | template 76 | __device__ T reduce(Op op, int plane, int N, int C, int S) { 77 | T sum = (T)0; 78 | for (int batch = 0; batch < N; ++batch) { 79 | for (int x = threadIdx.x; x < S; x += blockDim.x) { 80 | sum += op(batch, plane, x); 81 | } 82 | } 83 | 84 | // sum over NumThreads within a warp 85 | sum = warpSum(sum); 86 | 87 | // 'transpose', and reduce within warp again 88 | __shared__ T shared[32]; 89 | __syncthreads(); 90 | if (threadIdx.x % WARP_SIZE == 0) { 91 | shared[threadIdx.x / WARP_SIZE] = sum; 92 | } 93 | if (threadIdx.x >= blockDim.x / WARP_SIZE && threadIdx.x < WARP_SIZE) { 94 | // zero out the other entries in shared 95 | shared[threadIdx.x] = (T)0; 96 | } 97 | __syncthreads(); 98 
| if (threadIdx.x / WARP_SIZE == 0) { 99 | sum = warpSum(shared[threadIdx.x]); 100 | if (threadIdx.x == 0) { 101 | shared[0] = sum; 102 | } 103 | } 104 | __syncthreads(); 105 | 106 | // Everyone picks it up, should be broadcast into the whole gradInput 107 | return shared[0]; 108 | } -------------------------------------------------------------------------------- /inplace_abn/src/inplace_abn.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | #include 4 | 5 | #include "inplace_abn.h" 6 | 7 | std::vector mean_var(at::Tensor x) { 8 | if (x.is_cuda()) { 9 | return mean_var_cuda(x); 10 | } else { 11 | return mean_var_cpu(x); 12 | } 13 | } 14 | 15 | at::Tensor forward(at::Tensor x, at::Tensor mean, at::Tensor var, at::Tensor weight, at::Tensor bias, 16 | bool affine, float eps) { 17 | if (x.is_cuda()) { 18 | return forward_cuda(x, mean, var, weight, bias, affine, eps); 19 | } else { 20 | return forward_cpu(x, mean, var, weight, bias, affine, eps); 21 | } 22 | } 23 | 24 | std::vector edz_eydz(at::Tensor z, at::Tensor dz, at::Tensor weight, at::Tensor bias, 25 | bool affine, float eps) { 26 | if (z.is_cuda()) { 27 | return edz_eydz_cuda(z, dz, weight, bias, affine, eps); 28 | } else { 29 | return edz_eydz_cpu(z, dz, weight, bias, affine, eps); 30 | } 31 | } 32 | 33 | std::vector backward(at::Tensor z, at::Tensor dz, at::Tensor var, at::Tensor weight, at::Tensor bias, 34 | at::Tensor edz, at::Tensor eydz, bool affine, float eps) { 35 | if (z.is_cuda()) { 36 | return backward_cuda(z, dz, var, weight, bias, edz, eydz, affine, eps); 37 | } else { 38 | return backward_cpu(z, dz, var, weight, bias, edz, eydz, affine, eps); 39 | } 40 | } 41 | 42 | void leaky_relu_forward(at::Tensor z, float slope) { 43 | at::leaky_relu_(z, slope); 44 | } 45 | 46 | void leaky_relu_backward(at::Tensor z, at::Tensor dz, float slope) { 47 | if (z.is_cuda()) { 48 | return leaky_relu_backward_cuda(z, dz, slope); 49 | } else { 50 | return leaky_relu_backward_cpu(z, dz, slope); 51 | } 52 | } 53 | 54 | void elu_forward(at::Tensor z) { 55 | at::elu_(z); 56 | } 57 | 58 | void elu_backward(at::Tensor z, at::Tensor dz) { 59 | if (z.is_cuda()) { 60 | return elu_backward_cuda(z, dz); 61 | } else { 62 | return elu_backward_cpu(z, dz); 63 | } 64 | } 65 | 66 | PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) { 67 | m.def("mean_var", &mean_var, "Mean and variance computation"); 68 | m.def("forward", &forward, "In-place forward computation"); 69 | m.def("edz_eydz", &edz_eydz, "First part of backward computation"); 70 | m.def("backward", &backward, "Second part of backward computation"); 71 | m.def("leaky_relu_forward", &leaky_relu_forward, "Leaky relu forward computation"); 72 | m.def("leaky_relu_backward", &leaky_relu_backward, "Leaky relu backward computation and inversion"); 73 | m.def("elu_forward", &elu_forward, "Elu forward computation"); 74 | m.def("elu_backward", &elu_backward, "Elu backward computation and inversion"); 75 | } -------------------------------------------------------------------------------- /inplace_abn/src/inplace_abn.h: -------------------------------------------------------------------------------- 1 | #pragma once 2 | 3 | #include 4 | 5 | #include 6 | 7 | std::vector mean_var_cpu(at::Tensor x); 8 | std::vector mean_var_cuda(at::Tensor x); 9 | 10 | at::Tensor forward_cpu(at::Tensor x, at::Tensor mean, at::Tensor var, at::Tensor weight, at::Tensor bias, 11 | bool affine, float eps); 12 | at::Tensor forward_cuda(at::Tensor x, at::Tensor mean, at::Tensor var, at::Tensor 
weight, at::Tensor bias, 13 | bool affine, float eps); 14 | 15 | std::vector edz_eydz_cpu(at::Tensor z, at::Tensor dz, at::Tensor weight, at::Tensor bias, 16 | bool affine, float eps); 17 | std::vector edz_eydz_cuda(at::Tensor z, at::Tensor dz, at::Tensor weight, at::Tensor bias, 18 | bool affine, float eps); 19 | 20 | std::vector backward_cpu(at::Tensor z, at::Tensor dz, at::Tensor var, at::Tensor weight, at::Tensor bias, 21 | at::Tensor edz, at::Tensor eydz, bool affine, float eps); 22 | std::vector backward_cuda(at::Tensor z, at::Tensor dz, at::Tensor var, at::Tensor weight, at::Tensor bias, 23 | at::Tensor edz, at::Tensor eydz, bool affine, float eps); 24 | 25 | void leaky_relu_backward_cpu(at::Tensor z, at::Tensor dz, float slope); 26 | void leaky_relu_backward_cuda(at::Tensor z, at::Tensor dz, float slope); 27 | 28 | void elu_backward_cpu(at::Tensor z, at::Tensor dz); 29 | void elu_backward_cuda(at::Tensor z, at::Tensor dz); -------------------------------------------------------------------------------- /inplace_abn/src/inplace_abn_cpu.cpp: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | #include 4 | 5 | #include "inplace_abn.h" 6 | 7 | at::Tensor reduce_sum(at::Tensor x) { 8 | if (x.ndimension() == 2) { 9 | return x.sum(0); 10 | } else { 11 | auto x_view = x.view({x.size(0), x.size(1), -1}); 12 | return x_view.sum(-1).sum(0); 13 | } 14 | } 15 | 16 | at::Tensor broadcast_to(at::Tensor v, at::Tensor x) { 17 | if (x.ndimension() == 2) { 18 | return v; 19 | } else { 20 | std::vector broadcast_size = {1, -1}; 21 | for (int64_t i = 2; i < x.ndimension(); ++i) 22 | broadcast_size.push_back(1); 23 | 24 | return v.view(broadcast_size); 25 | } 26 | } 27 | 28 | int64_t count(at::Tensor x) { 29 | int64_t count = x.size(0); 30 | for (int64_t i = 2; i < x.ndimension(); ++i) 31 | count *= x.size(i); 32 | 33 | return count; 34 | } 35 | 36 | at::Tensor invert_affine(at::Tensor z, at::Tensor weight, at::Tensor bias, bool affine, float eps) { 37 | if (affine) { 38 | return (z - broadcast_to(bias, z)) / broadcast_to(at::abs(weight) + eps, z); 39 | } else { 40 | return z; 41 | } 42 | } 43 | 44 | std::vector mean_var_cpu(at::Tensor x) { 45 | auto num = count(x); 46 | auto mean = reduce_sum(x) / num; 47 | auto diff = x - broadcast_to(mean, x); 48 | auto var = reduce_sum(diff.pow(2)) / num; 49 | 50 | return {mean, var}; 51 | } 52 | 53 | at::Tensor forward_cpu(at::Tensor x, at::Tensor mean, at::Tensor var, at::Tensor weight, at::Tensor bias, 54 | bool affine, float eps) { 55 | auto gamma = affine ? at::abs(weight) + eps : at::ones_like(var); 56 | auto mul = at::rsqrt(var + eps) * gamma; 57 | 58 | x.sub_(broadcast_to(mean, x)); 59 | x.mul_(broadcast_to(mul, x)); 60 | if (affine) x.add_(broadcast_to(bias, x)); 61 | 62 | return x; 63 | } 64 | 65 | std::vector edz_eydz_cpu(at::Tensor z, at::Tensor dz, at::Tensor weight, at::Tensor bias, 66 | bool affine, float eps) { 67 | auto edz = reduce_sum(dz); 68 | auto y = invert_affine(z, weight, bias, affine, eps); 69 | auto eydz = reduce_sum(y * dz); 70 | 71 | return {edz, eydz}; 72 | } 73 | 74 | std::vector backward_cpu(at::Tensor z, at::Tensor dz, at::Tensor var, at::Tensor weight, at::Tensor bias, 75 | at::Tensor edz, at::Tensor eydz, bool affine, float eps) { 76 | auto y = invert_affine(z, weight, bias, affine, eps); 77 | auto mul = affine ? 
at::rsqrt(var + eps) * (at::abs(weight) + eps) : at::rsqrt(var + eps); 78 | 79 | auto num = count(z); 80 | auto dx = (dz - broadcast_to(edz / num, dz) - y * broadcast_to(eydz / num, dz)) * broadcast_to(mul, dz); 81 | 82 | auto dweight = at::empty(z.type(), {0}); 83 | auto dbias = at::empty(z.type(), {0}); 84 | if (affine) { 85 | dweight = eydz * at::sign(weight); 86 | dbias = edz; 87 | } 88 | 89 | return {dx, dweight, dbias}; 90 | } 91 | 92 | void leaky_relu_backward_cpu(at::Tensor z, at::Tensor dz, float slope) { 93 | AT_DISPATCH_FLOATING_TYPES(z.type(), "leaky_relu_backward_cpu", ([&] { 94 | int64_t count = z.numel(); 95 | auto *_z = z.data(); 96 | auto *_dz = dz.data(); 97 | 98 | for (int64_t i = 0; i < count; ++i) { 99 | if (_z[i] < 0) { 100 | _z[i] *= 1 / slope; 101 | _dz[i] *= slope; 102 | } 103 | } 104 | })); 105 | } 106 | 107 | void elu_backward_cpu(at::Tensor z, at::Tensor dz) { 108 | AT_DISPATCH_FLOATING_TYPES(z.type(), "elu_backward_cpu", ([&] { 109 | int64_t count = z.numel(); 110 | auto *_z = z.data(); 111 | auto *_dz = dz.data(); 112 | 113 | for (int64_t i = 0; i < count; ++i) { 114 | if (_z[i] < 0) { 115 | _z[i] = log1p(_z[i]); 116 | _dz[i] *= (_z[i] + 1.f); 117 | } 118 | } 119 | })); 120 | } -------------------------------------------------------------------------------- /inplace_abn/src/inplace_abn_cuda.cu: -------------------------------------------------------------------------------- 1 | #include 2 | 3 | #include 4 | #include 5 | 6 | #include 7 | 8 | #include "common.h" 9 | #include "inplace_abn.h" 10 | 11 | // Checks 12 | #ifndef AT_CHECK 13 | #define AT_CHECK AT_ASSERT 14 | #endif 15 | #define CHECK_CUDA(x) AT_CHECK(x.type().is_cuda(), #x " must be a CUDA tensor") 16 | #define CHECK_CONTIGUOUS(x) AT_CHECK(x.is_contiguous(), #x " must be contiguous") 17 | #define CHECK_INPUT(x) CHECK_CUDA(x); CHECK_CONTIGUOUS(x) 18 | 19 | // Utilities 20 | void get_dims(at::Tensor x, int64_t& num, int64_t& chn, int64_t& sp) { 21 | num = x.size(0); 22 | chn = x.size(1); 23 | sp = 1; 24 | for (int64_t i = 2; i < x.ndimension(); ++i) 25 | sp *= x.size(i); 26 | } 27 | 28 | // Operations for reduce 29 | template 30 | struct SumOp { 31 | __device__ SumOp(const T *t, int c, int s) 32 | : tensor(t), chn(c), sp(s) {} 33 | __device__ __forceinline__ T operator()(int batch, int plane, int n) { 34 | return tensor[(batch * chn + plane) * sp + n]; 35 | } 36 | const T *tensor; 37 | const int chn; 38 | const int sp; 39 | }; 40 | 41 | template 42 | struct VarOp { 43 | __device__ VarOp(T m, const T *t, int c, int s) 44 | : mean(m), tensor(t), chn(c), sp(s) {} 45 | __device__ __forceinline__ T operator()(int batch, int plane, int n) { 46 | T val = tensor[(batch * chn + plane) * sp + n]; 47 | return (val - mean) * (val - mean); 48 | } 49 | const T mean; 50 | const T *tensor; 51 | const int chn; 52 | const int sp; 53 | }; 54 | 55 | template 56 | struct GradOp { 57 | __device__ GradOp(T _weight, T _bias, const T *_z, const T *_dz, int c, int s) 58 | : weight(_weight), bias(_bias), z(_z), dz(_dz), chn(c), sp(s) {} 59 | __device__ __forceinline__ Pair operator()(int batch, int plane, int n) { 60 | T _y = (z[(batch * chn + plane) * sp + n] - bias) / weight; 61 | T _dz = dz[(batch * chn + plane) * sp + n]; 62 | return Pair(_dz, _y * _dz); 63 | } 64 | const T weight; 65 | const T bias; 66 | const T *z; 67 | const T *dz; 68 | const int chn; 69 | const int sp; 70 | }; 71 | 72 | /*********** 73 | * mean_var 74 | ***********/ 75 | 76 | template 77 | __global__ void mean_var_kernel(const T *x, T *mean, T *var, int 
num, int chn, int sp) { 78 | int plane = blockIdx.x; 79 | T norm = T(1) / T(num * sp); 80 | 81 | T _mean = reduce>(SumOp(x, chn, sp), plane, num, chn, sp) * norm; 82 | __syncthreads(); 83 | T _var = reduce>(VarOp(_mean, x, chn, sp), plane, num, chn, sp) * norm; 84 | 85 | if (threadIdx.x == 0) { 86 | mean[plane] = _mean; 87 | var[plane] = _var; 88 | } 89 | } 90 | 91 | std::vector mean_var_cuda(at::Tensor x) { 92 | CHECK_INPUT(x); 93 | 94 | // Extract dimensions 95 | int64_t num, chn, sp; 96 | get_dims(x, num, chn, sp); 97 | 98 | // Prepare output tensors 99 | auto mean = at::empty(x.type(), {chn}); 100 | auto var = at::empty(x.type(), {chn}); 101 | 102 | // Run kernel 103 | dim3 blocks(chn); 104 | dim3 threads(getNumThreads(sp)); 105 | AT_DISPATCH_FLOATING_TYPES(x.type(), "mean_var_cuda", ([&] { 106 | mean_var_kernel<<>>( 107 | x.data(), 108 | mean.data(), 109 | var.data(), 110 | num, chn, sp); 111 | })); 112 | 113 | return {mean, var}; 114 | } 115 | 116 | /********** 117 | * forward 118 | **********/ 119 | 120 | template 121 | __global__ void forward_kernel(T *x, const T *mean, const T *var, const T *weight, const T *bias, 122 | bool affine, float eps, int num, int chn, int sp) { 123 | int plane = blockIdx.x; 124 | 125 | T _mean = mean[plane]; 126 | T _var = var[plane]; 127 | T _weight = affine ? abs(weight[plane]) + eps : T(1); 128 | T _bias = affine ? bias[plane] : T(0); 129 | 130 | T mul = rsqrt(_var + eps) * _weight; 131 | 132 | for (int batch = 0; batch < num; ++batch) { 133 | for (int n = threadIdx.x; n < sp; n += blockDim.x) { 134 | T _x = x[(batch * chn + plane) * sp + n]; 135 | T _y = (_x - _mean) * mul + _bias; 136 | 137 | x[(batch * chn + plane) * sp + n] = _y; 138 | } 139 | } 140 | } 141 | 142 | at::Tensor forward_cuda(at::Tensor x, at::Tensor mean, at::Tensor var, at::Tensor weight, at::Tensor bias, 143 | bool affine, float eps) { 144 | CHECK_INPUT(x); 145 | CHECK_INPUT(mean); 146 | CHECK_INPUT(var); 147 | CHECK_INPUT(weight); 148 | CHECK_INPUT(bias); 149 | 150 | // Extract dimensions 151 | int64_t num, chn, sp; 152 | get_dims(x, num, chn, sp); 153 | 154 | // Run kernel 155 | dim3 blocks(chn); 156 | dim3 threads(getNumThreads(sp)); 157 | AT_DISPATCH_FLOATING_TYPES(x.type(), "forward_cuda", ([&] { 158 | forward_kernel<<>>( 159 | x.data(), 160 | mean.data(), 161 | var.data(), 162 | weight.data(), 163 | bias.data(), 164 | affine, eps, num, chn, sp); 165 | })); 166 | 167 | return x; 168 | } 169 | 170 | /*********** 171 | * edz_eydz 172 | ***********/ 173 | 174 | template 175 | __global__ void edz_eydz_kernel(const T *z, const T *dz, const T *weight, const T *bias, 176 | T *edz, T *eydz, bool affine, float eps, int num, int chn, int sp) { 177 | int plane = blockIdx.x; 178 | 179 | T _weight = affine ? abs(weight[plane]) + eps : 1.f; 180 | T _bias = affine ? 
bias[plane] : 0.f; 181 | 182 | Pair res = reduce, GradOp>(GradOp(_weight, _bias, z, dz, chn, sp), plane, num, chn, sp); 183 | __syncthreads(); 184 | 185 | if (threadIdx.x == 0) { 186 | edz[plane] = res.v1; 187 | eydz[plane] = res.v2; 188 | } 189 | } 190 | 191 | std::vector edz_eydz_cuda(at::Tensor z, at::Tensor dz, at::Tensor weight, at::Tensor bias, 192 | bool affine, float eps) { 193 | CHECK_INPUT(z); 194 | CHECK_INPUT(dz); 195 | CHECK_INPUT(weight); 196 | CHECK_INPUT(bias); 197 | 198 | // Extract dimensions 199 | int64_t num, chn, sp; 200 | get_dims(z, num, chn, sp); 201 | 202 | auto edz = at::empty(z.type(), {chn}); 203 | auto eydz = at::empty(z.type(), {chn}); 204 | 205 | // Run kernel 206 | dim3 blocks(chn); 207 | dim3 threads(getNumThreads(sp)); 208 | AT_DISPATCH_FLOATING_TYPES(z.type(), "edz_eydz_cuda", ([&] { 209 | edz_eydz_kernel<<>>( 210 | z.data(), 211 | dz.data(), 212 | weight.data(), 213 | bias.data(), 214 | edz.data(), 215 | eydz.data(), 216 | affine, eps, num, chn, sp); 217 | })); 218 | 219 | return {edz, eydz}; 220 | } 221 | 222 | /*********** 223 | * backward 224 | ***********/ 225 | 226 | template 227 | __global__ void backward_kernel(const T *z, const T *dz, const T *var, const T *weight, const T *bias, const T *edz, 228 | const T *eydz, T *dx, T *dweight, T *dbias, 229 | bool affine, float eps, int num, int chn, int sp) { 230 | int plane = blockIdx.x; 231 | 232 | T _weight = affine ? abs(weight[plane]) + eps : 1.f; 233 | T _bias = affine ? bias[plane] : 0.f; 234 | T _var = var[plane]; 235 | T _edz = edz[plane]; 236 | T _eydz = eydz[plane]; 237 | 238 | T _mul = _weight * rsqrt(_var + eps); 239 | T count = T(num * sp); 240 | 241 | for (int batch = 0; batch < num; ++batch) { 242 | for (int n = threadIdx.x; n < sp; n += blockDim.x) { 243 | T _dz = dz[(batch * chn + plane) * sp + n]; 244 | T _y = (z[(batch * chn + plane) * sp + n] - _bias) / _weight; 245 | 246 | dx[(batch * chn + plane) * sp + n] = (_dz - _edz / count - _y * _eydz / count) * _mul; 247 | } 248 | } 249 | 250 | if (threadIdx.x == 0) { 251 | if (affine) { 252 | dweight[plane] = weight[plane] > 0 ? 
_eydz : -_eydz; 253 | dbias[plane] = _edz; 254 | } 255 | } 256 | } 257 | 258 | std::vector backward_cuda(at::Tensor z, at::Tensor dz, at::Tensor var, at::Tensor weight, at::Tensor bias, 259 | at::Tensor edz, at::Tensor eydz, bool affine, float eps) { 260 | CHECK_INPUT(z); 261 | CHECK_INPUT(dz); 262 | CHECK_INPUT(var); 263 | CHECK_INPUT(weight); 264 | CHECK_INPUT(bias); 265 | CHECK_INPUT(edz); 266 | CHECK_INPUT(eydz); 267 | 268 | // Extract dimensions 269 | int64_t num, chn, sp; 270 | get_dims(z, num, chn, sp); 271 | 272 | auto dx = at::zeros_like(z); 273 | auto dweight = at::zeros_like(weight); 274 | auto dbias = at::zeros_like(bias); 275 | 276 | // Run kernel 277 | dim3 blocks(chn); 278 | dim3 threads(getNumThreads(sp)); 279 | AT_DISPATCH_FLOATING_TYPES(z.type(), "backward_cuda", ([&] { 280 | backward_kernel<<>>( 281 | z.data(), 282 | dz.data(), 283 | var.data(), 284 | weight.data(), 285 | bias.data(), 286 | edz.data(), 287 | eydz.data(), 288 | dx.data(), 289 | dweight.data(), 290 | dbias.data(), 291 | affine, eps, num, chn, sp); 292 | })); 293 | 294 | return {dx, dweight, dbias}; 295 | } 296 | 297 | /************** 298 | * activations 299 | **************/ 300 | 301 | template 302 | inline void leaky_relu_backward_impl(T *z, T *dz, float slope, int64_t count) { 303 | // Create thrust pointers 304 | thrust::device_ptr th_z = thrust::device_pointer_cast(z); 305 | thrust::device_ptr th_dz = thrust::device_pointer_cast(dz); 306 | 307 | thrust::transform_if(th_dz, th_dz + count, th_z, th_dz, 308 | [slope] __device__ (const T& dz) { return dz * slope; }, 309 | [] __device__ (const T& z) { return z < 0; }); 310 | thrust::transform_if(th_z, th_z + count, th_z, 311 | [slope] __device__ (const T& z) { return z / slope; }, 312 | [] __device__ (const T& z) { return z < 0; }); 313 | } 314 | 315 | void leaky_relu_backward_cuda(at::Tensor z, at::Tensor dz, float slope) { 316 | CHECK_INPUT(z); 317 | CHECK_INPUT(dz); 318 | 319 | int64_t count = z.numel(); 320 | 321 | AT_DISPATCH_FLOATING_TYPES(z.type(), "leaky_relu_backward_cuda", ([&] { 322 | leaky_relu_backward_impl(z.data(), dz.data(), slope, count); 323 | })); 324 | } 325 | 326 | template 327 | inline void elu_backward_impl(T *z, T *dz, int64_t count) { 328 | // Create thrust pointers 329 | thrust::device_ptr th_z = thrust::device_pointer_cast(z); 330 | thrust::device_ptr th_dz = thrust::device_pointer_cast(dz); 331 | 332 | thrust::transform_if(th_dz, th_dz + count, th_z, th_z, th_dz, 333 | [] __device__ (const T& dz, const T& z) { return dz * (z + 1.); }, 334 | [] __device__ (const T& z) { return z < 0; }); 335 | thrust::transform_if(th_z, th_z + count, th_z, 336 | [] __device__ (const T& z) { return log1p(z); }, 337 | [] __device__ (const T& z) { return z < 0; }); 338 | } 339 | 340 | void elu_backward_cuda(at::Tensor z, at::Tensor dz) { 341 | CHECK_INPUT(z); 342 | CHECK_INPUT(dz); 343 | 344 | int64_t count = z.numel(); 345 | 346 | AT_DISPATCH_FLOATING_TYPES(z.type(), "leaky_relu_backward_cuda", ([&] { 347 | elu_backward_impl(z.data(), dz.data(), count); 348 | })); 349 | } 350 | -------------------------------------------------------------------------------- /kitti_utils.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | import os 4 | import numpy as np 5 | from collections import Counter 6 | 7 | 8 | def load_velodyne_points(filename): 9 | """Load 3D point cloud from KITTI file format 10 | """ 11 | points = np.fromfile(filename, 
dtype=np.float32).reshape(-1, 4) 12 | points[:, 3] = 1.0 # homogeneous 13 | return points 14 | 15 | 16 | def read_calib_file(path): 17 | """Read KITTI calibration file 18 | """ 19 | float_chars = set("0123456789.e+- ") 20 | data = {} 21 | with open(path, 'r') as f: 22 | for line in f.readlines(): 23 | key, value = line.split(':', 1) 24 | value = value.strip() 25 | data[key] = value 26 | if float_chars.issuperset(value): 27 | # try to cast to float array 28 | try: 29 | data[key] = np.array(list(map(float, value.split(' ')))) 30 | except ValueError: 31 | # casting error: data[key] already eq. value, so pass 32 | pass 33 | 34 | return data 35 | 36 | 37 | def sub2ind(matrixSize, rowSub, colSub): 38 | """Convert row, col matrix subscripts to linear indices 39 | """ 40 | m, n = matrixSize 41 | return rowSub * (n-1) + colSub - 1 42 | 43 | 44 | def generate_depth_map(calib_dir, velo_filename, cam=2, vel_depth=False): 45 | """Generate a depth map from velodyne data 46 | """ 47 | # load calibration files 48 | cam2cam = read_calib_file(os.path.join(calib_dir, 'calib_cam_to_cam.txt')) 49 | velo2cam = read_calib_file(os.path.join(calib_dir, 'calib_velo_to_cam.txt')) 50 | velo2cam = np.hstack((velo2cam['R'].reshape(3, 3), velo2cam['T'][..., np.newaxis])) 51 | velo2cam = np.vstack((velo2cam, np.array([0, 0, 0, 1.0]))) 52 | 53 | # get image shape 54 | im_shape = cam2cam["S_rect_02"][::-1].astype(np.int32) 55 | 56 | # compute projection matrix velodyne->image plane 57 | R_cam2rect = np.eye(4) 58 | R_cam2rect[:3, :3] = cam2cam['R_rect_00'].reshape(3, 3) 59 | P_rect = cam2cam['P_rect_0'+str(cam)].reshape(3, 4) 60 | P_velo2im = np.dot(np.dot(P_rect, R_cam2rect), velo2cam) 61 | 62 | # load velodyne points and remove all behind image plane (approximation) 63 | # each row of the velodyne data is forward, left, up, reflectance 64 | velo = load_velodyne_points(velo_filename) 65 | velo = velo[velo[:, 0] >= 0, :] 66 | 67 | # project the points to the camera 68 | velo_pts_im = np.dot(P_velo2im, velo.T).T 69 | velo_pts_im[:, :2] = velo_pts_im[:, :2] / velo_pts_im[:, 2][..., np.newaxis] 70 | 71 | if vel_depth: 72 | velo_pts_im[:, 2] = velo[:, 0] 73 | 74 | # check if in bounds 75 | # use minus 1 to get the exact same value as KITTI matlab code 76 | velo_pts_im[:, 0] = np.round(velo_pts_im[:, 0]) - 1 77 | velo_pts_im[:, 1] = np.round(velo_pts_im[:, 1]) - 1 78 | val_inds = (velo_pts_im[:, 0] >= 0) & (velo_pts_im[:, 1] >= 0) 79 | val_inds = val_inds & (velo_pts_im[:, 0] < im_shape[1]) & (velo_pts_im[:, 1] < im_shape[0]) 80 | velo_pts_im = velo_pts_im[val_inds, :] 81 | 82 | # project to image 83 | depth = np.zeros((im_shape[:2])) 84 | depth[velo_pts_im[:, 1].astype(np.int), velo_pts_im[:, 0].astype(np.int)] = velo_pts_im[:, 2] 85 | 86 | # find the duplicate points and choose the closest depth 87 | inds = sub2ind(depth.shape, velo_pts_im[:, 1], velo_pts_im[:, 0]) 88 | dupe_inds = [item for item, count in Counter(inds).items() if count > 1] 89 | for dd in dupe_inds: 90 | pts = np.where(inds == dd)[0] 91 | x_loc = int(velo_pts_im[pts[0], 0]) 92 | y_loc = int(velo_pts_im[pts[0], 1]) 93 | depth[y_loc, x_loc] = velo_pts_im[pts, 2].min() 94 | depth[depth < 0] = 0 95 | 96 | return depth 97 | -------------------------------------------------------------------------------- /layers.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | import numpy as np 4 | 5 | import torch 6 | import torch.nn as nn 7 | import 
torch.nn.functional as F 8 | 9 | 10 | def disp_to_depth(disp, min_depth, max_depth): 11 | """Convert network's sigmoid output into depth prediction 12 | """ 13 | min_disp = 1 / max_depth 14 | max_disp = 1 / min_depth 15 | scaled_disp = min_disp + (max_disp - min_disp) * disp 16 | depth = 1 / scaled_disp 17 | return scaled_disp, depth 18 | 19 | 20 | def transformation_from_parameters(axisangle, translation, invert=False): 21 | """Convert the network's (axisangle, translation) output into a 4x4 matrix 22 | """ 23 | R = rot_from_axisangle(axisangle) 24 | t = translation.clone() 25 | 26 | if invert: 27 | R = R.transpose(1, 2) 28 | t *= -1 29 | 30 | T = get_translation_matrix(t) 31 | 32 | if invert: 33 | M = torch.matmul(R, T) 34 | else: 35 | M = torch.matmul(T, R) 36 | 37 | return M 38 | 39 | 40 | def get_translation_matrix(translation_vector): 41 | """Convert a translation vector into a 4x4 transformation matrix 42 | """ 43 | T = torch.zeros(translation_vector.shape[0], 4, 4).to(device=translation_vector.device) 44 | 45 | t = translation_vector.contiguous().view(-1, 3, 1) 46 | 47 | T[:, 0, 0] = 1 48 | T[:, 1, 1] = 1 49 | T[:, 2, 2] = 1 50 | T[:, 3, 3] = 1 51 | T[:, :3, 3, None] = t 52 | 53 | return T 54 | 55 | 56 | def rot_from_axisangle(vec): 57 | """Convert an axisangle rotation into a 4x4 transformation matrix 58 | Input 'vec' has to be Bx1x3 59 | """ 60 | angle = torch.norm(vec, 2, 2, True) 61 | axis = vec / (angle + 1e-7) 62 | 63 | ca = torch.cos(angle) 64 | sa = torch.sin(angle) 65 | C = 1 - ca 66 | 67 | x = axis[..., 0].unsqueeze(1) 68 | y = axis[..., 1].unsqueeze(1) 69 | z = axis[..., 2].unsqueeze(1) 70 | 71 | xs = x * sa 72 | ys = y * sa 73 | zs = z * sa 74 | xC = x * C 75 | yC = y * C 76 | zC = z * C 77 | xyC = x * yC 78 | yzC = y * zC 79 | zxC = z * xC 80 | 81 | rot = torch.zeros((vec.shape[0], 4, 4)).to(device=vec.device) 82 | 83 | rot[:, 0, 0] = torch.squeeze(x * xC + ca) 84 | rot[:, 0, 1] = torch.squeeze(xyC - zs) 85 | rot[:, 0, 2] = torch.squeeze(zxC + ys) 86 | rot[:, 1, 0] = torch.squeeze(xyC + zs) 87 | rot[:, 1, 1] = torch.squeeze(y * yC + ca) 88 | rot[:, 1, 2] = torch.squeeze(yzC - xs) 89 | rot[:, 2, 0] = torch.squeeze(zxC - ys) 90 | rot[:, 2, 1] = torch.squeeze(yzC + xs) 91 | rot[:, 2, 2] = torch.squeeze(z * zC + ca) 92 | rot[:, 3, 3] = 1 93 | 94 | return rot 95 | 96 | 97 | class ConvBlock(nn.Module): 98 | """Layer to perform a convolution followed by ELU 99 | """ 100 | def __init__(self, in_channels, out_channels): 101 | super(ConvBlock, self).__init__() 102 | 103 | self.conv = Conv3x3(in_channels, out_channels) 104 | self.nonlin = nn.ELU(inplace=True) 105 | 106 | def forward(self, x): 107 | out = self.conv(x) 108 | out = self.nonlin(out) 109 | return out 110 | 111 | 112 | class Conv3x3(nn.Module): 113 | """Layer to pad and convolve input 114 | """ 115 | def __init__(self, in_channels, out_channels, use_refl=True): 116 | super(Conv3x3, self).__init__() 117 | 118 | if use_refl: 119 | self.pad = nn.ReflectionPad2d(1) 120 | else: 121 | self.pad = nn.ZeroPad2d(1) 122 | self.conv = nn.Conv2d(int(in_channels), int(out_channels), 3) 123 | 124 | def forward(self, x): 125 | out = self.pad(x) 126 | out = self.conv(out) 127 | return out 128 | 129 | 130 | class BackprojectDepth(nn.Module): 131 | """Layer to transform a depth image into a point cloud 132 | """ 133 | def __init__(self, batch_size, height, width): 134 | super(BackprojectDepth, self).__init__() 135 | 136 | self.batch_size = batch_size 137 | self.height = height 138 | self.width = width 139 | 140 | meshgrid = 
np.meshgrid(range(self.width), range(self.height), indexing='xy') 141 | self.id_coords = np.stack(meshgrid, axis=0).astype(np.float32) 142 | self.id_coords = nn.Parameter(torch.from_numpy(self.id_coords), 143 | requires_grad=False) 144 | 145 | self.ones = nn.Parameter(torch.ones(self.batch_size, 1, self.height * self.width), 146 | requires_grad=False) 147 | 148 | self.pix_coords = torch.unsqueeze(torch.stack( 149 | [self.id_coords[0].view(-1), self.id_coords[1].view(-1)], 0), 0) 150 | self.pix_coords = self.pix_coords.repeat(batch_size, 1, 1) 151 | self.pix_coords = nn.Parameter(torch.cat([self.pix_coords, self.ones], 1), 152 | requires_grad=False) 153 | 154 | def forward(self, depth, inv_K): 155 | cam_points = torch.matmul(inv_K[:, :3, :3], self.pix_coords) 156 | cam_points = depth.view(self.batch_size, 1, -1) * cam_points 157 | cam_points = torch.cat([cam_points, self.ones], 1) 158 | 159 | return cam_points 160 | 161 | 162 | class Project3D(nn.Module): 163 | """Layer which projects 3D points into a camera with intrinsics K and at position T 164 | """ 165 | def __init__(self, batch_size, height, width, eps=1e-7): 166 | super(Project3D, self).__init__() 167 | 168 | self.batch_size = batch_size 169 | self.height = height 170 | self.width = width 171 | self.eps = eps 172 | 173 | def forward(self, points, K, T): 174 | P = torch.matmul(K, T)[:, :3, :] 175 | 176 | cam_points = torch.matmul(P, points) 177 | 178 | pix_coords = cam_points[:, :2, :] / (cam_points[:, 2, :].unsqueeze(1) + self.eps) 179 | pix_coords = pix_coords.view(self.batch_size, 2, self.height, self.width) 180 | pix_coords = pix_coords.permute(0, 2, 3, 1) 181 | pix_coords[..., 0] /= self.width - 1 182 | pix_coords[..., 1] /= self.height - 1 183 | pix_coords = (pix_coords - 0.5) * 2 184 | return pix_coords 185 | 186 | 187 | def upsample(x): 188 | """Upsample input tensor by a factor of 2 189 | """ 190 | return F.interpolate(x, scale_factor=2, mode="nearest") 191 | 192 | 193 | def get_smooth_loss(disp, img): 194 | """Computes the smoothness loss for a disparity image 195 | The color image is used for edge-aware smoothness 196 | """ 197 | grad_disp_x = torch.abs(disp[:, :, :, :-1] - disp[:, :, :, 1:]) 198 | grad_disp_y = torch.abs(disp[:, :, :-1, :] - disp[:, :, 1:, :]) 199 | 200 | grad_img_x = torch.mean(torch.abs(img[:, :, :, :-1] - img[:, :, :, 1:]), 1, keepdim=True) 201 | grad_img_y = torch.mean(torch.abs(img[:, :, :-1, :] - img[:, :, 1:, :]), 1, keepdim=True) 202 | 203 | grad_disp_x *= torch.exp(-grad_img_x) 204 | grad_disp_y *= torch.exp(-grad_img_y) 205 | 206 | return grad_disp_x.mean() + grad_disp_y.mean() 207 | 208 | 209 | class SSIM(nn.Module): 210 | """Layer to compute the SSIM loss between a pair of images 211 | """ 212 | def __init__(self): 213 | super(SSIM, self).__init__() 214 | self.mu_x_pool = nn.AvgPool2d(3, 1) 215 | self.mu_y_pool = nn.AvgPool2d(3, 1) 216 | self.sig_x_pool = nn.AvgPool2d(3, 1) 217 | self.sig_y_pool = nn.AvgPool2d(3, 1) 218 | self.sig_xy_pool = nn.AvgPool2d(3, 1) 219 | 220 | self.refl = nn.ReflectionPad2d(1) 221 | 222 | self.C1 = 0.01 ** 2 223 | self.C2 = 0.03 ** 2 224 | 225 | def forward(self, x, y): 226 | x = self.refl(x) 227 | y = self.refl(y) 228 | 229 | mu_x = self.mu_x_pool(x) 230 | mu_y = self.mu_y_pool(y) 231 | 232 | sigma_x = self.sig_x_pool(x ** 2) - mu_x ** 2 233 | sigma_y = self.sig_y_pool(y ** 2) - mu_y ** 2 234 | sigma_xy = self.sig_xy_pool(x * y) - mu_x * mu_y 235 | 236 | SSIM_n = (2 * mu_x * mu_y + self.C1) * (2 * sigma_xy + self.C2) 237 | SSIM_d = (mu_x ** 2 + mu_y ** 2 + self.C1) 
* (sigma_x + sigma_y + self.C2) 238 | 239 | return torch.clamp((1 - SSIM_n / SSIM_d) / 2, 0, 1) 240 | 241 | 242 | def compute_depth_errors(gt, pred): 243 | """Computation of error metrics between predicted and ground truth depths 244 | """ 245 | thresh = torch.max((gt / pred), (pred / gt)) 246 | a1 = (thresh < 1.25 ).float().mean() 247 | a2 = (thresh < 1.25 ** 2).float().mean() 248 | a3 = (thresh < 1.25 ** 3).float().mean() 249 | 250 | rmse = (gt - pred) ** 2 251 | rmse = torch.sqrt(rmse.mean()) 252 | 253 | rmse_log = (torch.log(gt) - torch.log(pred)) ** 2 254 | rmse_log = torch.sqrt(rmse_log.mean()) 255 | 256 | abs_rel = torch.mean(torch.abs(gt - pred) / gt) 257 | 258 | sq_rel = torch.mean((gt - pred) ** 2 / gt) 259 | 260 | return abs_rel, sq_rel, rmse, rmse_log, a1, a2, a3 261 | -------------------------------------------------------------------------------- /networks/__init__.py: -------------------------------------------------------------------------------- 1 | from .resnet_encoder import ResnetEncoder 2 | from .monodepth2_decoder import DepthDecoder 3 | from .pose_decoder import PoseDecoder 4 | from .pose_cnn import PoseCNN 5 | from .encoder_selfattn import get_resnet101_asp_oc_dsn 6 | from .decoder import MSDepthDecoder 7 | -------------------------------------------------------------------------------- /networks/asp_oc_block.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | from torch.nn import functional as F 3 | import math 4 | import torch.utils.model_zoo as model_zoo 5 | import torch 6 | import os 7 | import sys 8 | import pdb 9 | import numpy as np 10 | from torch.autograd import Variable 11 | import functools 12 | from inplace_abn.bn import InPlaceABNSync, InPlaceABN 13 | 14 | ABN_module = InPlaceABN 15 | BatchNorm2d = functools.partial(ABN_module, activation='none') 16 | 17 | from networks.base_oc_block import BaseOC_Context_Module 18 | 19 | 20 | class ASP_OC_Module(nn.Module): 21 | """ 22 | Network to perform Atrous Spatial Pyramid Pooling (ASPP). 
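    Rough shape walk-through of the fusion performed in forward() — a minimal
    sketch, assuming the configuration this project uses (features=512,
    out_features=256, default dilations 12/24/36) and an input x of shape
    (N, 512, H, W); the names refer to the sub-modules defined in __init__ below:

        feat1 = self.context_oc(x)   # 3x3 conv + self-attention context -> (N, 256, H, W)
        feat2 = self.conv2(x)        # 1x1 conv                          -> (N, 256, H, W)
        feat3 = self.conv3(x)        # 3x3 conv, dilation 12             -> (N, 256, H, W)
        feat4 = self.conv4(x)        # 3x3 conv, dilation 24             -> (N, 256, H, W)
        feat5 = self.conv5(x)        # 3x3 conv, dilation 36             -> (N, 256, H, W)
        out   = torch.cat((feat1, feat2, feat3, feat4, feat5), 1)        # (N, 1280, H, W)
        out   = self.conv_bn_dropout(out)                                # (N, 512, H, W)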
23 | """ 24 | def __init__(self, features, out_features=256, dilations=(12, 24, 36), disable_self_attn=False): 25 | super(ASP_OC_Module, self).__init__() 26 | self.disable_self_attn = disable_self_attn 27 | self.context_oc = nn.Sequential( 28 | nn.Conv2d(features, out_features, kernel_size=3, padding=1, dilation=1, bias=True), 29 | ABN_module(out_features), 30 | BaseOC_Context_Module(in_channels=out_features, out_channels=out_features, 31 | key_channels=out_features // 2, value_channels=out_features, 32 | dropout=0, sizes=([2]))) 33 | self.context = nn.Sequential( 34 | nn.Conv2d(features, out_features, kernel_size=3, padding=1, dilation=1, bias=True), 35 | ABN_module(out_features)) 36 | 37 | self.conv2 = nn.Sequential(nn.Conv2d(features, out_features, kernel_size=1, padding=0, dilation=1, bias=False), 38 | ABN_module(out_features)) 39 | self.conv3 = nn.Sequential( 40 | nn.Conv2d(features, out_features, kernel_size=3, padding=dilations[0], dilation=dilations[0], bias=False), 41 | ABN_module(out_features)) 42 | self.conv4 = nn.Sequential( 43 | nn.Conv2d(features, out_features, kernel_size=3, padding=dilations[1], dilation=dilations[1], bias=False), 44 | ABN_module(out_features)) 45 | self.conv5 = nn.Sequential( 46 | nn.Conv2d(features, out_features, kernel_size=3, padding=dilations[2], dilation=dilations[2], bias=False), 47 | ABN_module(out_features)) 48 | 49 | self.conv_bn_dropout = nn.Sequential( 50 | nn.Conv2d(out_features * 5, out_features * 2, kernel_size=1, padding=0, dilation=1, bias=False), 51 | ABN_module(out_features * 2), 52 | nn.Dropout2d(0.1) 53 | ) 54 | 55 | def _cat_each(self, feat1, feat2, feat3, feat4, feat5): 56 | """ 57 | Concatenate parallel convolution layers with different dilation rates 58 | to perform ASPP. 59 | """ 60 | assert (len(feat1) == len(feat2)) 61 | z = [] 62 | for i in range(len(feat1)): 63 | z.append(torch.cat((feat1[i], feat2[i], feat3[i], feat4[i], feat5[i]), 1)) 64 | return z 65 | 66 | def forward(self, x): 67 | if isinstance(x, Variable): 68 | _, _, h, w = x.size() 69 | elif isinstance(x, tuple) or isinstance(x, list): 70 | _, _, h, w = x[0].size() 71 | else: 72 | raise RuntimeError('unknown input type') 73 | 74 | if self.disable_self_attn: 75 | feat1 = self.context(x) 76 | else: 77 | feat1 = self.context_oc(x) 78 | feat2 = self.conv2(x) 79 | feat3 = self.conv3(x) 80 | feat4 = self.conv4(x) 81 | feat5 = self.conv5(x) 82 | 83 | if isinstance(x, Variable): 84 | out = torch.cat((feat1, feat2, feat3, feat4, feat5), 1) 85 | elif isinstance(x, tuple) or isinstance(x, list): 86 | out = self._cat_each(feat1, feat2, feat3, feat4, feat5) 87 | else: 88 | raise RuntimeError('unknown input type') 89 | 90 | output = self.conv_bn_dropout(out) 91 | return output 92 | -------------------------------------------------------------------------------- /networks/base_oc_block.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import os 3 | import sys 4 | import pdb 5 | import numpy as np 6 | from torch import nn 7 | from torch.nn import functional as F 8 | import functools 9 | from inplace_abn.bn import InPlaceABNSync, InPlaceABN 10 | ABN_module = InPlaceABN 11 | BatchNorm2d = functools.partial(ABN_module, activation='none') 12 | 13 | 14 | class _SelfAttentionBlock(nn.Module): 15 | """ 16 | The basic implementation for self-attention block/non-local block 17 | Input: 18 | N X C X H X W 19 | Args: 20 | in_channels : the dimension of the input feature map 21 | key_channels : the dimension after the key/query transform 22 
| value_channels : the dimension after the value transform 23 | scale : choose the scale to downsample the input feature maps (save memory cost) 24 | Return: 25 | N X C X H X W 26 | position-aware context features.(w/o concate or add with the input) 27 | """ 28 | 29 | def __init__(self, in_channels, key_channels, value_channels, out_channels=None, scale=1): 30 | super(_SelfAttentionBlock, self).__init__() 31 | self.scale = scale 32 | self.in_channels = in_channels 33 | self.out_channels = out_channels 34 | self.key_channels = key_channels 35 | self.value_channels = value_channels 36 | if out_channels is None: 37 | self.out_channels = in_channels 38 | self.pool = nn.MaxPool2d(kernel_size=(scale, scale)) 39 | self.f_key = nn.Sequential( 40 | nn.Conv2d(in_channels=self.in_channels, out_channels=self.key_channels, 41 | kernel_size=1, stride=1, padding=0), 42 | ABN_module(self.key_channels), 43 | ) 44 | self.f_query = self.f_key 45 | self.f_value = nn.Conv2d(in_channels=self.in_channels, out_channels=self.value_channels, 46 | kernel_size=1, stride=1, padding=0) 47 | self.W = nn.Conv2d(in_channels=self.value_channels, out_channels=self.out_channels, 48 | kernel_size=1, stride=1, padding=0) 49 | nn.init.constant(self.W.weight, 0) 50 | nn.init.constant(self.W.bias, 0) 51 | 52 | def forward(self, x): 53 | batch_size, h, w = x.size(0), x.size(2), x.size(3) 54 | if self.scale > 1: 55 | x = self.pool(x) 56 | 57 | value = self.f_value(x).view(batch_size, self.value_channels, -1) 58 | value = value.permute(0, 2, 1) 59 | query = self.f_query(x).view(batch_size, self.key_channels, -1) 60 | query = query.permute(0, 2, 1) 61 | key = self.f_key(x).view(batch_size, self.key_channels, -1) 62 | 63 | sim_map = torch.matmul(query, key) 64 | sim_map = (self.key_channels ** -.5) * sim_map 65 | sim_map = F.softmax(sim_map, dim=-1) 66 | 67 | context = torch.matmul(sim_map, value) 68 | context = context.permute(0, 2, 1).contiguous() 69 | context = context.view(batch_size, self.value_channels, *x.size()[2:]) 70 | context = self.W(context) 71 | if self.scale > 1: 72 | context = F.upsample(input=context, size=(h, w), mode='bilinear', align_corners=True) 73 | return context 74 | 75 | 76 | class SelfAttentionBlock2D(_SelfAttentionBlock): 77 | def __init__(self, in_channels, key_channels, value_channels, out_channels=None, scale=1): 78 | super(SelfAttentionBlock2D, self).__init__(in_channels, 79 | key_channels, 80 | value_channels, 81 | out_channels, 82 | scale) 83 | 84 | 85 | class BaseOC_Module(nn.Module): 86 | """ 87 | Implementation of the BaseOC module 88 | Args: 89 | in_features / out_features: the channels of the input / output feature maps. 90 | dropout: we choose 0.05 as the default value. 91 | size: you can apply multiple sizes. Here we only use one size. 92 | Return: 93 | features fused with Object context information. 
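    Each stage here is a SelfAttentionBlock2D (see _SelfAttentionBlock above),
    whose forward pass amounts to scaled dot-product attention over the H*W
    spatial positions. A simplified restatement with illustrative shapes only
    (Ck = key_channels, Cv = value_channels), not a drop-in replacement:

        query = self.f_query(x).view(N, Ck, H * W).permute(0, 2, 1)   # (N, HW, Ck)
        key   = self.f_key(x).view(N, Ck, H * W)                      # (N, Ck, HW)
        value = self.f_value(x).view(N, Cv, H * W).permute(0, 2, 1)   # (N, HW, Cv)

        sim_map = F.softmax((Ck ** -0.5) * torch.matmul(query, key), dim=-1)  # (N, HW, HW)
        context = torch.matmul(sim_map, value)                                # (N, HW, Cv)
        context = self.W(context.permute(0, 2, 1).contiguous().view(N, Cv, H, W))

    BaseOC_Module then concatenates this context with the input features and
    fuses them with the 1x1 conv_bn_dropout defined below.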
94 | """ 95 | 96 | def __init__(self, in_channels, out_channels, key_channels, value_channels, dropout, sizes=([1])): 97 | super(BaseOC_Module, self).__init__() 98 | self.stages = [] 99 | self.stages = nn.ModuleList( 100 | [self._make_stage(in_channels, out_channels, key_channels, value_channels, size) for size in sizes]) 101 | self.conv_bn_dropout = nn.Sequential( 102 | nn.Conv2d(2 * in_channels, out_channels, kernel_size=1, padding=0), 103 | ABN_module(out_channels), 104 | nn.Dropout2d(dropout) 105 | ) 106 | 107 | def _make_stage(self, in_channels, output_channels, key_channels, value_channels, size): 108 | return SelfAttentionBlock2D(in_channels, 109 | key_channels, 110 | value_channels, 111 | output_channels, 112 | size) 113 | 114 | def forward(self, feats): 115 | priors = [stage(feats) for stage in self.stages] 116 | context = priors[0] 117 | for i in range(1, len(priors)): 118 | context += priors[i] 119 | output = self.conv_bn_dropout(torch.cat([context, feats], 1)) 120 | return output 121 | 122 | 123 | class BaseOC_Context_Module(nn.Module): 124 | """ 125 | Output only the context features. 126 | Args: 127 | in_features / out_features: the channels of the input / output feature maps. 128 | dropout: specify the dropout ratio 129 | fusion: We provide two different fusion method, "concat" or "add" 130 | size: we find that directly learn the attention weights on even 1/8 feature maps is hard. 131 | Return: 132 | features after "concat" or "add" 133 | """ 134 | 135 | def __init__(self, in_channels, out_channels, key_channels, value_channels, dropout, sizes=([1])): 136 | super(BaseOC_Context_Module, self).__init__() 137 | self.stages = [] 138 | self.stages = nn.ModuleList( 139 | [self._make_stage(in_channels, out_channels, key_channels, value_channels, size) for size in sizes]) 140 | self.conv_bn_dropout = nn.Sequential( 141 | nn.Conv2d(in_channels, out_channels, kernel_size=1, padding=0), 142 | ABN_module(out_channels), 143 | ) 144 | 145 | def _make_stage(self, in_channels, output_channels, key_channels, value_channels, size): 146 | return SelfAttentionBlock2D(in_channels, 147 | key_channels, 148 | value_channels, 149 | output_channels, 150 | size) 151 | 152 | def forward(self, feats): 153 | priors = [stage(feats) for stage in self.stages] 154 | context = priors[0] 155 | for i in range(1, len(priors)): 156 | context += priors[i] 157 | output = self.conv_bn_dropout(context) 158 | return output 159 | -------------------------------------------------------------------------------- /networks/decoder.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | import numpy as np 4 | import torch 5 | import torch.nn as nn 6 | import torch.nn.functional as F 7 | from collections import OrderedDict 8 | import warnings 9 | 10 | 11 | class ConvBlock(nn.Module): 12 | """Layer to perform a convolution followed by ELU 13 | """ 14 | 15 | def __init__(self, in_channels, out_channels): 16 | super(ConvBlock, self).__init__() 17 | 18 | self.conv = Conv3x3(in_channels, out_channels) 19 | self.nonlin = nn.ELU(inplace=True) 20 | 21 | def forward(self, x): 22 | out = self.conv(x) 23 | out = self.nonlin(out) 24 | return out 25 | 26 | 27 | class Conv3x3(nn.Module): 28 | """Layer to pad and convolve input 29 | """ 30 | 31 | def __init__(self, in_channels, out_channels, use_refl=True, use_norm=False): 32 | super(Conv3x3, self).__init__() 33 | 34 | if use_refl: 35 | self.pad = nn.ReflectionPad2d(1) 36 | else: 37 | 
self.pad = nn.ZeroPad2d(1) 38 | self.conv = nn.Conv2d(int(in_channels), int(out_channels), 3) 39 | 40 | self.use_norm = use_norm 41 | if use_norm: 42 | self.norm = nn.BatchNorm2d(in_channels) 43 | 44 | def forward(self, x): 45 | out = self.pad(x) 46 | 47 | if self.use_norm: 48 | out = self.norm(out) 49 | 50 | out = self.conv(out) 51 | 52 | return out 53 | 54 | 55 | class SoftAttnDepth(nn.Module): 56 | def __init__(self, alpha=0.01, beta=1.0, dim=1, discretization='UD'): 57 | super(SoftAttnDepth, self).__init__() 58 | self.dim = dim 59 | self.alpha = alpha 60 | self.beta = beta 61 | self.discretization = discretization 62 | 63 | def get_depth_sid(self, depth_labels): 64 | alpha_ = torch.FloatTensor([self.alpha]) 65 | beta_ = torch.FloatTensor([self.beta]) 66 | t = [] 67 | for K in range(depth_labels): 68 | K_ = torch.FloatTensor([K]) 69 | t.append(torch.exp(torch.log(alpha_) + torch.log(beta_ / alpha_) * K_ / depth_labels)) 70 | t = torch.FloatTensor(t) 71 | return t 72 | 73 | def forward(self, input_t, eps=1e-6): 74 | batch_size, depth, height, width = input_t.shape 75 | if self.discretization == 'SID': 76 | grid = self.get_depth_sid(depth).unsqueeze(0).unsqueeze(2).unsqueeze(3) 77 | else: 78 | grid = torch.linspace( 79 | self.alpha, self.beta, depth, 80 | requires_grad=False).unsqueeze(0).unsqueeze(2).unsqueeze(3) 81 | grid = grid.repeat(batch_size, 1, height, width).float() 82 | 83 | z = F.softmax(input_t, dim=self.dim) 84 | z = z * (grid.to(z.device)) 85 | z = torch.sum(z, dim=1, keepdim=True) 86 | 87 | return z 88 | 89 | 90 | def upsample(x): 91 | """Upsample input tensor by a factor of 2 92 | """ 93 | return F.interpolate(x, scale_factor=2, mode="nearest") 94 | 95 | 96 | class UpBlock(nn.Module): 97 | def __init__(self, 98 | in_planes, 99 | out_planes, 100 | use_skip=True, 101 | skip_planes=None, 102 | is_output_scale=False, 103 | output_scale_planes=128): 104 | super(UpBlock, self).__init__() 105 | 106 | self.in_planes = in_planes 107 | self.out_planes = out_planes 108 | self.use_skip = use_skip 109 | self.skip_planes = skip_planes 110 | self.is_output_scale = is_output_scale 111 | self.output_scale_planes = output_scale_planes 112 | 113 | self.conv1 = ConvBlock(in_planes, out_planes) 114 | 115 | self.block_two_feature = (out_planes + 116 | skip_planes) if use_skip else out_planes 117 | 118 | self.conv2 = ConvBlock(self.block_two_feature, out_planes) 119 | 120 | self.is_output_scale = is_output_scale 121 | if self.is_output_scale: 122 | self.output_layer = Conv3x3(out_planes, 123 | output_scale_planes, 124 | use_norm=True) 125 | 126 | def forward(self, input_t, skip_feature=None): 127 | 128 | # import ipdb; ipdb.set_trace() 129 | x = self.conv1(input_t) 130 | x = upsample(x) 131 | 132 | if self.use_skip and skip_feature is not None: 133 | x = torch.cat([x, skip_feature], 1) 134 | 135 | x = self.conv2(x) 136 | output_scale = None 137 | if self.is_output_scale: 138 | output_scale = self.output_layer(x) 139 | 140 | return x, output_scale 141 | 142 | 143 | class MSDepthDecoder(nn.Module): 144 | def __init__(self, 145 | num_ch_enc, 146 | scales=range(4), 147 | num_output_channels=128, 148 | use_skips=True, 149 | alpha=1e-3, 150 | beta=1.0, 151 | volume_output=False, 152 | discretization='UD'): 153 | super(MSDepthDecoder, self).__init__() 154 | """ 155 | To Replicate the paper, use num_output_channels=128 156 | """ 157 | self.num_output_channels = num_output_channels 158 | self.use_skips = use_skips 159 | self.upsample_mode = 'nearest' 160 | self.scales = scales 161 | 162 | self.num_ch_enc 
= num_ch_enc 163 | self.num_ch_enc[-1] = num_output_channels 164 | self.num_ch_dec = np.array([16, 32, 64, 128, 256]) 165 | 166 | # Decoder 167 | 168 | # (24, 80) -> (48, 160) 169 | self.deconv1 = UpBlock(in_planes=128, 170 | out_planes=64, 171 | use_skip=True, 172 | skip_planes=256, 173 | is_output_scale=True, 174 | output_scale_planes=self.num_output_channels) 175 | 176 | # (48, 160) -> (96, 320) 177 | self.deconv2 = UpBlock(in_planes=64, 178 | out_planes=64, 179 | use_skip=True, 180 | skip_planes=128, 181 | is_output_scale=True, 182 | output_scale_planes=self.num_output_channels) 183 | 184 | # (96, 320) -> (192, 640) 185 | self.deconv3 = UpBlock(in_planes=64, 186 | out_planes=32, 187 | use_skip=False, 188 | skip_planes=128, 189 | is_output_scale=True, 190 | output_scale_planes=self.num_output_channels) 191 | 192 | self.sigmoid = nn.Sigmoid() 193 | self.depth_layer = SoftAttnDepth(alpha=alpha, beta=beta, discretization=discretization) 194 | 195 | self.volume_output = volume_output 196 | 197 | def forward(self, features): 198 | self.outputs = {} 199 | # decoder 200 | x = features["output"] 201 | self.outputs[("disp", 3)] = self.depth_layer(x) 202 | 203 | x, z = self.deconv1(x, features["layer1"]) 204 | self.outputs[("disp", 2)] = self.depth_layer(z) 205 | 206 | x, z = self.deconv2(x, features["conv3"]) 207 | self.outputs[("disp", 1)] = self.depth_layer(z) 208 | 209 | x, z = self.deconv3(x, None) 210 | 211 | if self.volume_output: 212 | self.outputs[("volume", 0)] = z 213 | 214 | self.outputs[("disp", 0)] = self.depth_layer(z) 215 | 216 | return self.outputs 217 | -------------------------------------------------------------------------------- /networks/encoder_selfattn.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | from torch.nn import functional as F 3 | import math 4 | import torch.utils.model_zoo as model_zoo 5 | import torch 6 | import os 7 | import sys 8 | import pdb 9 | import numpy as np 10 | from torch.autograd import Variable 11 | import functools 12 | from networks.util import conv3x3, Bottleneck 13 | from networks.asp_oc_block import ASP_OC_Module 14 | from inplace_abn.bn import InPlaceABNSync, InPlaceABN 15 | from torchsummary import summary 16 | 17 | ABN_module = InPlaceABN 18 | BatchNorm2d = functools.partial(ABN_module, activation='none') 19 | affine_par = True 20 | 21 | 22 | class ResNet(nn.Module): 23 | """ 24 | Basic ResNet101. 
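    As a rough guide (a sketch only, assuming the 192x640 KITTI crop used
    elsewhere in this project), forward() below returns a five-level feature
    pyramid; layer3 and layer4 keep the 1/8 resolution because they trade
    stride for dilation:

        features = resnet(x)     # x: (N, 3, 192, 640), 'resnet' is an instance of this class
        # features[0]: (N,  128, 96, 320)  three-conv stem, overall stride 2
        # features[1]: (N,  256, 48, 160)  layer1, after the stride-2 maxpool
        # features[2]: (N,  512, 24,  80)  layer2, stride 2
        # features[3]: (N, 1024, 24,  80)  layer3, dilation 2, stride 1
        # features[4]: (N, 2048, 24,  80)  layer4, dilation 4, stride 1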
25 | """ 26 | def __init__(self, block, layers): 27 | self.inplanes = 128 28 | super(ResNet, self).__init__() 29 | self.conv1 = conv3x3(3, 64, stride=1, dilation=2, padding=2) 30 | self.bn1 = BatchNorm2d(64) 31 | self.relu1 = nn.ReLU(inplace=False) 32 | self.conv2 = conv3x3(64, 64, stride=2) 33 | self.bn2 = BatchNorm2d(64) 34 | self.relu2 = nn.ReLU(inplace=False) 35 | self.conv3 = conv3x3(64, 128) 36 | self.bn3 = BatchNorm2d(128) 37 | self.relu3 = nn.ReLU(inplace=False) 38 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2) 39 | self.relu = nn.ReLU(inplace=False) 40 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True) # change 41 | self.layer1 = self._make_layer(block, 64, layers[0]) 42 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2) 43 | self.layer3 = self._make_layer(block, 256, layers[2], stride=1, dilation=2) 44 | self.layer4 = self._make_layer(block, 512, layers[3], stride=1, dilation=4, multi_grid=(1, 1, 1)) 45 | 46 | def _make_layer(self, block, planes, blocks, stride=1, dilation=1, multi_grid=1): 47 | downsample = None 48 | if stride != 1 or self.inplanes != planes * block.expansion: 49 | downsample = nn.Sequential( 50 | nn.Conv2d(self.inplanes, planes * block.expansion, 51 | kernel_size=1, stride=stride, bias=False), 52 | BatchNorm2d(planes * block.expansion, affine=affine_par)) 53 | 54 | layers = [] 55 | generate_multi_grid = lambda index, grids: grids[index % len(grids)] if isinstance(grids, tuple) else 1 56 | layers.append(block(self.inplanes, planes, stride, dilation=dilation, downsample=downsample, 57 | multi_grid=generate_multi_grid(0, multi_grid))) 58 | self.inplanes = planes * block.expansion 59 | for i in range(1, blocks): 60 | layers.append( 61 | block(self.inplanes, planes, dilation=dilation, multi_grid=generate_multi_grid(i, multi_grid))) 62 | return nn.Sequential(*layers) 63 | 64 | def forward(self, x): 65 | self.features = [] 66 | x = (x - 0.45) / 0.225 67 | x = self.relu1(self.bn1(self.conv1(x))) 68 | x = self.relu2(self.bn2(self.conv2(x))) 69 | x = self.relu3(self.bn3(self.conv3(x))) 70 | self.features.append(x) 71 | x = self.maxpool(x) 72 | x = self.layer1(x) 73 | self.features.append(x) 74 | x = self.layer2(x) 75 | self.features.append(x) 76 | x = self.layer3(x) 77 | self.features.append(x) 78 | # x_dsn = self.dsn(x) 79 | x = self.layer4(x) 80 | self.features.append(x) 81 | 82 | return self.features 83 | 84 | 85 | class ResNet_context(nn.Module): 86 | """ 87 | ResNet101 + self-attention 88 | """ 89 | def __init__(self, num_classes, disable_self_attn, pretrained): 90 | self.num_ch_enc = np.array([128, 256, 512, 1024, 2048]) 91 | self.disable_self_attn = disable_self_attn 92 | super(ResNet_context, self).__init__() 93 | 94 | self.basedir = os.path.dirname(os.path.abspath(__file__)) 95 | self.resnet_model = ResNet(Bottleneck, [3, 4, 23, 3]) 96 | if pretrained: 97 | pretrained_weights = torch.load(os.path.join(self.basedir, '../splits/resnet101-imagenet.pth')) 98 | model_dict = self.resnet_model.state_dict() 99 | self.resnet_model.load_state_dict({k: v for k, v in pretrained_weights.items() if k in model_dict}) 100 | 101 | # extra added layers 102 | self.context = nn.Sequential( 103 | nn.Conv2d(2048, 512, kernel_size=3, stride=1, padding=1), 104 | ABN_module(512), 105 | ASP_OC_Module(512, 256, disable_self_attn=self.disable_self_attn) 106 | ) 107 | self.cls = nn.Conv2d(512, num_classes, kernel_size=1, stride=1, padding=0, bias=True) 108 | self.dsn = nn.Sequential( 109 | nn.Conv2d(1024, 512, kernel_size=3, stride=1, padding=1), 
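            # Auxiliary deep-supervision ('dsn') head over the 1024-channel layer3
            # features, presumably kept for parity with the OCNet encoder this file
            # is adapted from. Note that forward() below only calls self.context and
            # self.cls, so this branch appears to be unused in this reproduction.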
110 | ABN_module(512), 111 | nn.Dropout2d(0.10), 112 | nn.Conv2d(512, num_classes, kernel_size=1, stride=1, padding=0, bias=True) 113 | ) 114 | 115 | def forward(self, x): 116 | all_features = self.resnet_model(x) 117 | all_features[-1] = self.context(all_features[-1]) 118 | all_features[-1] = self.cls(all_features[-1]) 119 | 120 | return all_features 121 | 122 | 123 | def get_resnet101_asp_oc_dsn(num_classes=128, disable_self_attn=False, pretrained=True): 124 | model = ResNet_context(num_classes, disable_self_attn, pretrained) 125 | return model 126 | 127 | 128 | if __name__ == '__main__': 129 | base = os.path.dirname(os.path.abspath(__file__)) 130 | pretrained_weights = torch.load(os.path.join(base, '../splits/resnet101-imagenet.pth')) 131 | print(pretrained_weights.keys()) 132 | 133 | 134 | -------------------------------------------------------------------------------- /networks/monodepth2_decoder.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | import numpy as np 4 | import torch 5 | import torch.nn as nn 6 | 7 | from collections import OrderedDict 8 | from layers import * 9 | 10 | 11 | class DepthDecoder(nn.Module): 12 | def __init__(self, num_ch_enc, scales=range(4), num_output_channels=1, use_skips=True): 13 | super(DepthDecoder, self).__init__() 14 | 15 | self.num_output_channels = num_output_channels 16 | self.use_skips = use_skips 17 | self.upsample_mode = 'nearest' 18 | self.scales = scales 19 | 20 | self.num_ch_enc = num_ch_enc 21 | self.num_ch_dec = np.array([64, 128, 256, 512, 1024]) 22 | 23 | # decoder 24 | self.convs = OrderedDict() 25 | for i in range(4, -1, -1): 26 | # upconv_0 27 | num_ch_in = self.num_ch_enc[-1] if i == 4 else self.num_ch_dec[i + 1] 28 | num_ch_out = self.num_ch_dec[i] 29 | self.convs[("upconv", i, 0)] = ConvBlock(num_ch_in, num_ch_out) 30 | 31 | # upconv_1 32 | num_ch_in = self.num_ch_dec[i] 33 | if self.use_skips and i > 0: 34 | num_ch_in += self.num_ch_enc[i - 1] 35 | num_ch_out = self.num_ch_dec[i] 36 | self.convs[("upconv", i, 1)] = ConvBlock(num_ch_in, num_ch_out) 37 | 38 | for s in self.scales: 39 | self.convs[("dispconv", s)] = Conv3x3(self.num_ch_dec[s], self.num_output_channels) 40 | 41 | self.decoder = nn.ModuleList(list(self.convs.values())) 42 | self.sigmoid = nn.Sigmoid() 43 | 44 | def forward(self, input_features): 45 | self.outputs = {} 46 | 47 | # decoder 48 | x = input_features[-1] 49 | for i in range(4, -1, -1): 50 | x = self.convs[("upconv", i, 0)](x) 51 | if i < 3: 52 | x = [upsample(x)] 53 | else: 54 | x = [x] 55 | if self.use_skips and i > 0: 56 | x += [input_features[i - 1]] 57 | x = torch.cat(x, 1) 58 | x = self.convs[("upconv", i, 1)](x) 59 | if i in self.scales: 60 | self.outputs[("disp", i)] = self.sigmoid(self.convs[("dispconv", i)](x)) 61 | 62 | return self.outputs 63 | -------------------------------------------------------------------------------- /networks/pose_cnn.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | import torch 4 | import torch.nn as nn 5 | 6 | 7 | class PoseCNN(nn.Module): 8 | def __init__(self, num_input_frames): 9 | super(PoseCNN, self).__init__() 10 | 11 | self.num_input_frames = num_input_frames 12 | 13 | self.convs = {} 14 | self.convs[0] = nn.Conv2d(3 * num_input_frames, 16, 7, 2, 3) 15 | self.convs[1] = nn.Conv2d(16, 32, 5, 2, 2) 16 | self.convs[2] = nn.Conv2d(32, 64, 3, 2, 
1) 17 | self.convs[3] = nn.Conv2d(64, 128, 3, 2, 1) 18 | self.convs[4] = nn.Conv2d(128, 256, 3, 2, 1) 19 | self.convs[5] = nn.Conv2d(256, 256, 3, 2, 1) 20 | self.convs[6] = nn.Conv2d(256, 256, 3, 2, 1) 21 | 22 | self.pose_conv = nn.Conv2d(256, 6 * (num_input_frames - 1), 1) 23 | 24 | self.num_convs = len(self.convs) 25 | 26 | self.relu = nn.ReLU(True) 27 | 28 | self.net = nn.ModuleList(list(self.convs.values())) 29 | 30 | def forward(self, out): 31 | 32 | for i in range(self.num_convs): 33 | out = self.convs[i](out) 34 | out = self.relu(out) 35 | 36 | out = self.pose_conv(out) 37 | out = out.mean(3).mean(2) 38 | 39 | out = 0.01 * out.view(-1, self.num_input_frames - 1, 1, 6) 40 | 41 | axisangle = out[..., :3] 42 | translation = out[..., 3:] 43 | 44 | return axisangle, translation 45 | -------------------------------------------------------------------------------- /networks/pose_decoder.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | import torch 4 | import torch.nn as nn 5 | from collections import OrderedDict 6 | 7 | 8 | class PoseDecoder(nn.Module): 9 | def __init__(self, num_ch_enc, num_input_features, num_frames_to_predict_for=None, stride=1): 10 | super(PoseDecoder, self).__init__() 11 | 12 | self.num_ch_enc = num_ch_enc 13 | self.num_input_features = num_input_features 14 | 15 | if num_frames_to_predict_for is None: 16 | num_frames_to_predict_for = num_input_features - 1 17 | self.num_frames_to_predict_for = num_frames_to_predict_for 18 | 19 | self.convs = OrderedDict() 20 | self.convs[("squeeze")] = nn.Conv2d(self.num_ch_enc[-1], 256, 1) 21 | self.convs[("pose", 0)] = nn.Conv2d(num_input_features * 256, 256, 3, stride, 1) 22 | self.convs[("pose", 1)] = nn.Conv2d(256, 256, 3, stride, 1) 23 | self.convs[("pose", 2)] = nn.Conv2d(256, 6 * num_frames_to_predict_for, 1) 24 | 25 | self.relu = nn.ReLU() 26 | 27 | self.net = nn.ModuleList(list(self.convs.values())) 28 | 29 | def forward(self, input_features): 30 | last_features = [f[-1] for f in input_features] 31 | 32 | cat_features = [self.relu(self.convs["squeeze"](f)) for f in last_features] 33 | cat_features = torch.cat(cat_features, 1) 34 | 35 | out = cat_features 36 | for i in range(3): 37 | out = self.convs[("pose", i)](out) 38 | if i != 2: 39 | out = self.relu(out) 40 | 41 | out = out.mean(3).mean(2) 42 | 43 | out = 0.01 * out.view(-1, self.num_frames_to_predict_for, 1, 6) 44 | 45 | axisangle = out[..., :3] 46 | translation = out[..., 3:] 47 | 48 | return axisangle, translation 49 | -------------------------------------------------------------------------------- /networks/resnet_encoder.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | import numpy as np 4 | 5 | import torch 6 | import torch.nn as nn 7 | import torchvision.models as models 8 | import torch.utils.model_zoo as model_zoo 9 | 10 | 11 | class ResNetMultiImageInput(models.ResNet): 12 | """ 13 | Constructs a resnet model with varying number of input images. 
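    When num_input_images > 1 the input frames are concatenated along the
    channel axis, and resnet_multiimage_input() below tiles the pretrained
    ImageNet conv1 weights across the extra frames (dividing by the frame
    count). A minimal usage sketch, via the ResnetEncoder wrapper defined later
    in this file, for a two-frame pose encoder; frame_t and frame_t1 are
    hypothetical (N, 3, H, W) tensors:

        pose_encoder = ResnetEncoder(num_layers=18, pretrained=True, num_input_images=2)
        pair = torch.cat([frame_t, frame_t1], dim=1)   # (N, 6, H, W)
        pose_features = pose_encoder(pair)             # list of 5 feature maps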
14 | """ 15 | def __init__(self, block, layers, num_classes=1000, num_input_images=1): 16 | super(ResNetMultiImageInput, self).__init__(block, layers) 17 | self.inplanes = 64 18 | self.conv1 = nn.Conv2d( 19 | num_input_images * 3, 64, kernel_size=7, stride=2, padding=3, bias=False) 20 | self.bn1 = nn.BatchNorm2d(64) 21 | self.relu = nn.ReLU(inplace=True) 22 | self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1) 23 | self.layer1 = self._make_layer(block, 64, layers[0]) 24 | self.layer2 = self._make_layer(block, 128, layers[1], stride=2) 25 | self.layer3 = self._make_layer(block, 256, layers[2], stride=2) 26 | self.layer4 = self._make_layer(block, 512, layers[3], stride=2) 27 | 28 | for m in self.modules(): 29 | if isinstance(m, nn.Conv2d): 30 | nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu') 31 | elif isinstance(m, nn.BatchNorm2d): 32 | nn.init.constant_(m.weight, 1) 33 | nn.init.constant_(m.bias, 0) 34 | 35 | 36 | def resnet_multiimage_input(num_layers, pretrained=False, num_input_images=1): 37 | """Constructs a ResNet model. 38 | Args: 39 | num_layers (int): Number of resnet layers. Must be 18 or 50 40 | pretrained (bool): If True, returns a model pre-trained on ImageNet 41 | num_input_images (int): Number of frames stacked as input 42 | """ 43 | assert num_layers in [18, 50], "Can only run with 18 or 50 layer resnet" 44 | blocks = {18: [2, 2, 2, 2], 50: [3, 4, 6, 3]}[num_layers] 45 | block_type = {18: models.resnet.BasicBlock, 50: models.resnet.Bottleneck}[num_layers] 46 | model = ResNetMultiImageInput(block_type, blocks, num_input_images=num_input_images) 47 | 48 | if pretrained: 49 | loaded = model_zoo.load_url(models.resnet.model_urls['resnet{}'.format(num_layers)]) 50 | loaded['conv1.weight'] = torch.cat( 51 | [loaded['conv1.weight']] * num_input_images, 1) / num_input_images 52 | model.load_state_dict(loaded) 53 | return model 54 | 55 | 56 | class ResnetEncoder(nn.Module): 57 | """ 58 | Pytorch module for a resnet encoder 59 | """ 60 | def __init__(self, num_layers, pretrained, num_input_images=1): 61 | super(ResnetEncoder, self).__init__() 62 | 63 | self.num_ch_enc = np.array([64, 64, 128, 256, 512]) 64 | 65 | resnets = {18: models.resnet18, 66 | 34: models.resnet34, 67 | 50: models.resnet50, 68 | 101: models.resnet101, 69 | 152: models.resnet152} 70 | 71 | if num_layers not in resnets: 72 | raise ValueError("{} is not a valid number of resnet layers".format(num_layers)) 73 | 74 | if num_input_images > 1: 75 | self.encoder = resnet_multiimage_input(num_layers, pretrained, num_input_images) 76 | else: 77 | self.encoder = resnets[num_layers](pretrained) 78 | 79 | if num_layers > 34: 80 | self.num_ch_enc[1:] *= 4 81 | 82 | def forward(self, input_image): 83 | self.features = [] 84 | x = (input_image - 0.45) / 0.225 85 | x = self.encoder.conv1(x) 86 | x = self.encoder.bn1(x) 87 | self.features.append(self.encoder.relu(x)) 88 | self.features.append(self.encoder.layer1(self.encoder.maxpool(self.features[-1]))) 89 | self.features.append(self.encoder.layer2(self.features[-1])) 90 | self.features.append(self.encoder.layer3(self.features[-1])) 91 | self.features.append(self.encoder.layer4(self.features[-1])) 92 | 93 | return self.features 94 | -------------------------------------------------------------------------------- /networks/util.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | from torch.nn import functional as F 3 | import math 4 | import torch.utils.model_zoo as model_zoo 5 | import 
torch 6 | import os 7 | import sys 8 | import pdb 9 | import numpy as np 10 | from torch.autograd import Variable 11 | import functools 12 | from inplace_abn.bn import InPlaceABNSync, InPlaceABN 13 | ABN_module = InPlaceABN 14 | 15 | BatchNorm2d = functools.partial(ABN_module, activation='none') 16 | 17 | def outS(i): 18 | i = int(i) 19 | i = (i + 1) / 2 20 | i = int(np.ceil((i + 1) / 2.0)) 21 | i = (i + 1) / 2 22 | return i 23 | 24 | 25 | def conv3x3(in_planes, out_planes, stride=1, padding=1, dilation=1): 26 | "3x3 convolution with padding" 27 | return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, 28 | padding=padding, bias=False, dilation=dilation) 29 | 30 | 31 | class Bottleneck(nn.Module): 32 | expansion = 4 33 | 34 | def __init__(self, inplanes, planes, stride=1, dilation=1, downsample=None, fist_dilation=1, multi_grid=1): 35 | super(Bottleneck, self).__init__() 36 | self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False) 37 | self.bn1 = BatchNorm2d(planes) 38 | self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, 39 | padding=dilation * multi_grid, dilation=dilation * multi_grid, bias=False) 40 | self.bn2 = BatchNorm2d(planes) 41 | self.conv3 = nn.Conv2d(planes, planes * 4, kernel_size=1, bias=False) 42 | self.bn3 = BatchNorm2d(planes * 4) 43 | self.relu = nn.ReLU(inplace=False) 44 | self.relu_inplace = nn.ReLU(inplace=True) 45 | self.downsample = downsample 46 | self.dilation = dilation 47 | self.stride = stride 48 | 49 | def _sum_each(self, x, y): 50 | assert (len(x) == len(y)) 51 | z = [] 52 | for i in range(len(x)): 53 | z.append(x[i] + y[i]) 54 | return z 55 | 56 | def forward(self, x): 57 | residual = x 58 | 59 | out = self.conv1(x) 60 | out = self.bn1(out) 61 | out = self.relu(out) 62 | 63 | out = self.conv2(out) 64 | out = self.bn2(out) 65 | out = self.relu(out) 66 | 67 | out = self.conv3(out) 68 | out = self.bn3(out) 69 | 70 | if self.downsample is not None: 71 | residual = self.downsample(x) 72 | 73 | out = out + residual 74 | out = self.relu_inplace(out) 75 | return out -------------------------------------------------------------------------------- /options.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | import os 4 | import argparse 5 | 6 | file_dir = os.path.dirname(__file__) # the directory that options.py resides in 7 | 8 | 9 | class MonodepthOptions: 10 | def __init__(self): 11 | self.parser = argparse.ArgumentParser(description="Monodepthv2 options") 12 | 13 | # PATHS 14 | self.parser.add_argument("--data_path", 15 | type=str, 16 | help="path to the training data", 17 | default=os.path.join(file_dir, "kitti_data")) 18 | self.parser.add_argument("--log_dir", 19 | type=str, 20 | help="log directory", 21 | default=os.path.join(os.path.expanduser("~"), "tmp")) 22 | 23 | # TRAINING options 24 | self.parser.add_argument("--model_name", 25 | type=str, 26 | help="the name of the folder to save the model in", 27 | default="mdp") 28 | self.parser.add_argument("--split", 29 | type=str, 30 | help="which training split to use", 31 | choices=["eigen_zhou", "eigen_full", "odom", "benchmark"], 32 | default="eigen_zhou") 33 | self.parser.add_argument("--num_layers", 34 | type=int, 35 | help="number of resnet layers", 36 | default=18, 37 | choices=[18, 34, 50, 101, 152]) 38 | self.parser.add_argument("--dataset", 39 | type=str, 40 | help="dataset to train on", 41 | default="kitti", 42 | choices=["kitti", "kitti_odom", 
"kitti_depth", "kitti_test"]) 43 | self.parser.add_argument("--discretization", 44 | type=str, 45 | help="disparity discretization method", 46 | default="UD", 47 | choices=["UD", "SID"]) 48 | self.parser.add_argument("--png", 49 | help="if set, trains from raw KITTI png files (instead of jpgs)", 50 | action="store_true") 51 | self.parser.add_argument("--height", 52 | type=int, 53 | help="input image height", 54 | default=192) 55 | self.parser.add_argument("--width", 56 | type=int, 57 | help="input image width", 58 | default=640) 59 | self.parser.add_argument("--disparity_smoothness", 60 | type=float, 61 | help="disparity smoothness weight", 62 | default=1e-3) 63 | self.parser.add_argument("--scales", 64 | nargs="+", 65 | type=int, 66 | help="scales used in the loss", 67 | default=[0, 1, 2, 3]) 68 | self.parser.add_argument("--min_depth", 69 | type=float, 70 | help="minimum depth", 71 | default=0.1) 72 | self.parser.add_argument("--max_depth", 73 | type=float, 74 | help="maximum depth", 75 | default=100.0) 76 | self.parser.add_argument("--use_stereo", 77 | help="if set, uses stereo pair for training", 78 | action="store_true") 79 | self.parser.add_argument("--frame_ids", 80 | nargs="+", 81 | type=int, 82 | help="frames to load", 83 | default=[0, -1, 1]) 84 | 85 | # OPTIMIZATION options 86 | self.parser.add_argument("--batch_size", 87 | type=int, 88 | help="batch size", 89 | default=12) 90 | self.parser.add_argument("--learning_rate", 91 | type=float, 92 | help="learning rate", 93 | default=1e-4) 94 | self.parser.add_argument("--num_epochs", 95 | type=int, 96 | help="number of epochs", 97 | default=20) 98 | self.parser.add_argument("--scheduler_step_size", 99 | type=int, 100 | help="step size of the scheduler", 101 | default=15) 102 | 103 | # ABLATION options 104 | self.parser.add_argument("--v1_multiscale", 105 | help="if set, uses monodepth v1 multiscale", 106 | action="store_true") 107 | self.parser.add_argument("--avg_reprojection", 108 | help="if set, uses average reprojection loss", 109 | action="store_true") 110 | self.parser.add_argument("--disable_automasking", 111 | help="if set, doesn't do auto-masking", 112 | action="store_true") 113 | self.parser.add_argument("--predictive_mask", 114 | help="if set, uses a predictive masking scheme as in Zhou et al", 115 | action="store_true") 116 | self.parser.add_argument("--no_ssim", 117 | help="if set, disables ssim in the loss", 118 | action="store_true") 119 | self.parser.add_argument("--weights_init", 120 | type=str, 121 | help="pretrained or scratch", 122 | default="pretrained", 123 | choices=["pretrained", "scratch"]) 124 | self.parser.add_argument("--pose_model_input", 125 | type=str, 126 | help="how many images the pose network gets", 127 | default="pairs", 128 | choices=["pairs", "all"]) 129 | self.parser.add_argument("--pose_model_type", 130 | type=str, 131 | help="normal or shared", 132 | default="separate_resnet", 133 | choices=["posecnn", "separate_resnet", "shared"]) 134 | # Self-attention and DDV 135 | self.parser.add_argument("--no_self_attention", 136 | help="if set, diables self-attention", 137 | action="store_true") 138 | self.parser.add_argument("--no_ddv", 139 | help="is set, disable discrete disparity volume", 140 | action="store_true") 141 | 142 | # SYSTEM options 143 | self.parser.add_argument("--no_cuda", 144 | help="if set disables CUDA", 145 | action="store_true") 146 | self.parser.add_argument("--num_workers", 147 | type=int, 148 | help="number of dataloader workers", 149 | default=12) 150 | 151 | # LOADING options 
152 | self.parser.add_argument("--load_weights_folder", 153 | type=str, 154 | help="name of model to load") 155 | self.parser.add_argument("--models_to_load", 156 | nargs="+", 157 | type=str, 158 | help="models to load", 159 | default=["encoder", "depth", "pose_encoder", "pose"]) 160 | 161 | # LOGGING options 162 | self.parser.add_argument("--log_frequency", 163 | type=int, 164 | help="number of batches between each tensorboard log", 165 | default=250) 166 | self.parser.add_argument("--save_frequency", 167 | type=int, 168 | help="number of epochs between each save", 169 | default=1) 170 | 171 | # EVALUATION options 172 | self.parser.add_argument("--eval_stereo", 173 | help="if set evaluates in stereo mode", 174 | action="store_true") 175 | self.parser.add_argument("--eval_mono", 176 | help="if set evaluates in mono mode", 177 | action="store_true") 178 | self.parser.add_argument("--disable_median_scaling", 179 | help="if set disables median scaling in evaluation", 180 | action="store_true") 181 | self.parser.add_argument("--pred_depth_scale_factor", 182 | help="if set multiplies predictions by this number", 183 | type=float, 184 | default=1) 185 | self.parser.add_argument("--ext_disp_to_eval", 186 | type=str, 187 | help="optional path to a .npy disparities file to evaluate") 188 | self.parser.add_argument("--eval_split", 189 | type=str, 190 | default="eigen", 191 | choices=[ 192 | "eigen", "eigen_benchmark", "benchmark", "odom_9", "odom_10"], 193 | help="which split to run eval on") 194 | self.parser.add_argument("--save_pred_disps", 195 | help="if set saves predicted disparities", 196 | action="store_true") 197 | self.parser.add_argument("--no_eval", 198 | help="if set disables evaluation", 199 | action="store_true") 200 | self.parser.add_argument("--eval_eigen_to_benchmark", 201 | help="if set assume we are loading eigen results from npy but " 202 | "we want to evaluate using the new benchmark.", 203 | action="store_true") 204 | self.parser.add_argument("--eval_out_dir", 205 | help="if set will output the disparities to this folder", 206 | type=str) 207 | self.parser.add_argument("--post_process", 208 | help="if set will perform the flipping post processing " 209 | "from the original monodepth paper", 210 | action="store_true") 211 | 212 | def parse(self): 213 | self.options = self.parser.parse_args() 214 | return self.options 215 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | future==0.18.2 2 | google-auth==1.21.1 3 | google-auth-oauthlib==0.4.1 4 | google-pasta==0.2.0 5 | h5py==2.10.0 6 | Keras-Applications==1.0.8 7 | Keras-Preprocessing==1.1.2 8 | Markdown==3.3.3 9 | numpy==1.18.5 10 | opencv-python==4.2.0.34 11 | requests==2.24.0 12 | requests-oauthlib==1.3.0 13 | scikit-image==0.17.2 14 | scipy==1.4.1 15 | six==1.15.0 16 | tensorboard==2.3.0 17 | tensorboardX==1.4 18 | tensorflow==2.3.1 19 | tensorflow-gpu==1.2.0 20 | torch==0.4.1 21 | torchsummary==1.5.1 22 | torchvision==0.2.1 23 | -------------------------------------------------------------------------------- /splits/benchmark/eigen_to_benchmark_ids.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/sjsu-smart-lab/Self-supervised-Monocular-Trained-Depth-Estimation-using-Self-attention-and-Discrete-Disparity-Volum/3c6f46ab03cfd424b677dfeb0c4a45d6269415a9/splits/benchmark/eigen_to_benchmark_ids.npy 
-------------------------------------------------------------------------------- /splits/benchmark/test_files.txt: -------------------------------------------------------------------------------- 1 | image 0 2 | image 1 3 | image 2 4 | image 3 5 | image 4 6 | image 5 7 | image 6 8 | image 7 9 | image 8 10 | image 9 11 | image 10 12 | image 11 13 | image 12 14 | image 13 15 | image 14 16 | image 15 17 | image 16 18 | image 17 19 | image 18 20 | image 19 21 | image 20 22 | image 21 23 | image 22 24 | image 23 25 | image 24 26 | image 25 27 | image 26 28 | image 27 29 | image 28 30 | image 29 31 | image 30 32 | image 31 33 | image 32 34 | image 33 35 | image 34 36 | image 35 37 | image 36 38 | image 37 39 | image 38 40 | image 39 41 | image 40 42 | image 41 43 | image 42 44 | image 43 45 | image 44 46 | image 45 47 | image 46 48 | image 47 49 | image 48 50 | image 49 51 | image 50 52 | image 51 53 | image 52 54 | image 53 55 | image 54 56 | image 55 57 | image 56 58 | image 57 59 | image 58 60 | image 59 61 | image 60 62 | image 61 63 | image 62 64 | image 63 65 | image 64 66 | image 65 67 | image 66 68 | image 67 69 | image 68 70 | image 69 71 | image 70 72 | image 71 73 | image 72 74 | image 73 75 | image 74 76 | image 75 77 | image 76 78 | image 77 79 | image 78 80 | image 79 81 | image 80 82 | image 81 83 | image 82 84 | image 83 85 | image 84 86 | image 85 87 | image 86 88 | image 87 89 | image 88 90 | image 89 91 | image 90 92 | image 91 93 | image 92 94 | image 93 95 | image 94 96 | image 95 97 | image 96 98 | image 97 99 | image 98 100 | image 99 101 | image 100 102 | image 101 103 | image 102 104 | image 103 105 | image 104 106 | image 105 107 | image 106 108 | image 107 109 | image 108 110 | image 109 111 | image 110 112 | image 111 113 | image 112 114 | image 113 115 | image 114 116 | image 115 117 | image 116 118 | image 117 119 | image 118 120 | image 119 121 | image 120 122 | image 121 123 | image 122 124 | image 123 125 | image 124 126 | image 125 127 | image 126 128 | image 127 129 | image 128 130 | image 129 131 | image 130 132 | image 131 133 | image 132 134 | image 133 135 | image 134 136 | image 135 137 | image 136 138 | image 137 139 | image 138 140 | image 139 141 | image 140 142 | image 141 143 | image 142 144 | image 143 145 | image 144 146 | image 145 147 | image 146 148 | image 147 149 | image 148 150 | image 149 151 | image 150 152 | image 151 153 | image 152 154 | image 153 155 | image 154 156 | image 155 157 | image 156 158 | image 157 159 | image 158 160 | image 159 161 | image 160 162 | image 161 163 | image 162 164 | image 163 165 | image 164 166 | image 165 167 | image 166 168 | image 167 169 | image 168 170 | image 169 171 | image 170 172 | image 171 173 | image 172 174 | image 173 175 | image 174 176 | image 175 177 | image 176 178 | image 177 179 | image 178 180 | image 179 181 | image 180 182 | image 181 183 | image 182 184 | image 183 185 | image 184 186 | image 185 187 | image 186 188 | image 187 189 | image 188 190 | image 189 191 | image 190 192 | image 191 193 | image 192 194 | image 193 195 | image 194 196 | image 195 197 | image 196 198 | image 197 199 | image 198 200 | image 199 201 | image 200 202 | image 201 203 | image 202 204 | image 203 205 | image 204 206 | image 205 207 | image 206 208 | image 207 209 | image 208 210 | image 209 211 | image 210 212 | image 211 213 | image 212 214 | image 213 215 | image 214 216 | image 215 217 | image 216 218 | image 217 219 | image 218 220 | image 219 221 | image 220 222 | image 221 223 | image 222 224 | 
image 223 225 | image 224 226 | image 225 227 | image 226 228 | image 227 229 | image 228 230 | image 229 231 | image 230 232 | image 231 233 | image 232 234 | image 233 235 | image 234 236 | image 235 237 | image 236 238 | image 237 239 | image 238 240 | image 239 241 | image 240 242 | image 241 243 | image 242 244 | image 243 245 | image 244 246 | image 245 247 | image 246 248 | image 247 249 | image 248 250 | image 249 251 | image 250 252 | image 251 253 | image 252 254 | image 253 255 | image 254 256 | image 255 257 | image 256 258 | image 257 259 | image 258 260 | image 259 261 | image 260 262 | image 261 263 | image 262 264 | image 263 265 | image 264 266 | image 265 267 | image 266 268 | image 267 269 | image 268 270 | image 269 271 | image 270 272 | image 271 273 | image 272 274 | image 273 275 | image 274 276 | image 275 277 | image 276 278 | image 277 279 | image 278 280 | image 279 281 | image 280 282 | image 281 283 | image 282 284 | image 283 285 | image 284 286 | image 285 287 | image 286 288 | image 287 289 | image 288 290 | image 289 291 | image 290 292 | image 291 293 | image 292 294 | image 293 295 | image 294 296 | image 295 297 | image 296 298 | image 297 299 | image 298 300 | image 299 301 | image 300 302 | image 301 303 | image 302 304 | image 303 305 | image 304 306 | image 305 307 | image 306 308 | image 307 309 | image 308 310 | image 309 311 | image 310 312 | image 311 313 | image 312 314 | image 313 315 | image 314 316 | image 315 317 | image 316 318 | image 317 319 | image 318 320 | image 319 321 | image 320 322 | image 321 323 | image 322 324 | image 323 325 | image 324 326 | image 325 327 | image 326 328 | image 327 329 | image 328 330 | image 329 331 | image 330 332 | image 331 333 | image 332 334 | image 333 335 | image 334 336 | image 335 337 | image 336 338 | image 337 339 | image 338 340 | image 339 341 | image 340 342 | image 341 343 | image 342 344 | image 343 345 | image 344 346 | image 345 347 | image 346 348 | image 347 349 | image 348 350 | image 349 351 | image 350 352 | image 351 353 | image 352 354 | image 353 355 | image 354 356 | image 355 357 | image 356 358 | image 357 359 | image 358 360 | image 359 361 | image 360 362 | image 361 363 | image 362 364 | image 363 365 | image 364 366 | image 365 367 | image 366 368 | image 367 369 | image 368 370 | image 369 371 | image 370 372 | image 371 373 | image 372 374 | image 373 375 | image 374 376 | image 375 377 | image 376 378 | image 377 379 | image 378 380 | image 379 381 | image 380 382 | image 381 383 | image 382 384 | image 383 385 | image 384 386 | image 385 387 | image 386 388 | image 387 389 | image 388 390 | image 389 391 | image 390 392 | image 391 393 | image 392 394 | image 393 395 | image 394 396 | image 395 397 | image 396 398 | image 397 399 | image 398 400 | image 399 401 | image 400 402 | image 401 403 | image 402 404 | image 403 405 | image 404 406 | image 405 407 | image 406 408 | image 407 409 | image 408 410 | image 409 411 | image 410 412 | image 411 413 | image 412 414 | image 413 415 | image 414 416 | image 415 417 | image 416 418 | image 417 419 | image 418 420 | image 419 421 | image 420 422 | image 421 423 | image 422 424 | image 423 425 | image 424 426 | image 425 427 | image 426 428 | image 427 429 | image 428 430 | image 429 431 | image 430 432 | image 431 433 | image 432 434 | image 433 435 | image 434 436 | image 435 437 | image 436 438 | image 437 439 | image 438 440 | image 439 441 | image 440 442 | image 441 443 | image 442 444 | image 443 445 | image 444 446 | 
image 445 447 | image 446 448 | image 447 449 | image 448 450 | image 449 451 | image 450 452 | image 451 453 | image 452 454 | image 453 455 | image 454 456 | image 455 457 | image 456 458 | image 457 459 | image 458 460 | image 459 461 | image 460 462 | image 461 463 | image 462 464 | image 463 465 | image 464 466 | image 465 467 | image 466 468 | image 467 469 | image 468 470 | image 469 471 | image 470 472 | image 471 473 | image 472 474 | image 473 475 | image 474 476 | image 475 477 | image 476 478 | image 477 479 | image 478 480 | image 479 481 | image 480 482 | image 481 483 | image 482 484 | image 483 485 | image 484 486 | image 485 487 | image 486 488 | image 487 489 | image 488 490 | image 489 491 | image 490 492 | image 491 493 | image 492 494 | image 493 495 | image 494 496 | image 495 497 | image 496 498 | image 497 499 | image 498 500 | image 499 501 | -------------------------------------------------------------------------------- /splits/kitti_archives_to_download.txt: -------------------------------------------------------------------------------- 1 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_calib.zip 2 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0001/2011_09_26_drive_0001_sync.zip 3 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0002/2011_09_26_drive_0002_sync.zip 4 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0005/2011_09_26_drive_0005_sync.zip 5 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0009/2011_09_26_drive_0009_sync.zip 6 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0011/2011_09_26_drive_0011_sync.zip 7 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0013/2011_09_26_drive_0013_sync.zip 8 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0014/2011_09_26_drive_0014_sync.zip 9 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0015/2011_09_26_drive_0015_sync.zip 10 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0017/2011_09_26_drive_0017_sync.zip 11 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0018/2011_09_26_drive_0018_sync.zip 12 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0019/2011_09_26_drive_0019_sync.zip 13 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0020/2011_09_26_drive_0020_sync.zip 14 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0022/2011_09_26_drive_0022_sync.zip 15 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0023/2011_09_26_drive_0023_sync.zip 16 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0027/2011_09_26_drive_0027_sync.zip 17 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0028/2011_09_26_drive_0028_sync.zip 18 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0029/2011_09_26_drive_0029_sync.zip 19 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0032/2011_09_26_drive_0032_sync.zip 20 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0035/2011_09_26_drive_0035_sync.zip 21 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0036/2011_09_26_drive_0036_sync.zip 22 | 
https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0039/2011_09_26_drive_0039_sync.zip 23 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0046/2011_09_26_drive_0046_sync.zip 24 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0048/2011_09_26_drive_0048_sync.zip 25 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0051/2011_09_26_drive_0051_sync.zip 26 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0052/2011_09_26_drive_0052_sync.zip 27 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0056/2011_09_26_drive_0056_sync.zip 28 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0057/2011_09_26_drive_0057_sync.zip 29 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0059/2011_09_26_drive_0059_sync.zip 30 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0060/2011_09_26_drive_0060_sync.zip 31 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0061/2011_09_26_drive_0061_sync.zip 32 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0064/2011_09_26_drive_0064_sync.zip 33 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0070/2011_09_26_drive_0070_sync.zip 34 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0079/2011_09_26_drive_0079_sync.zip 35 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0084/2011_09_26_drive_0084_sync.zip 36 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0086/2011_09_26_drive_0086_sync.zip 37 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0087/2011_09_26_drive_0087_sync.zip 38 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0091/2011_09_26_drive_0091_sync.zip 39 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0093/2011_09_26_drive_0093_sync.zip 40 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0095/2011_09_26_drive_0095_sync.zip 41 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0096/2011_09_26_drive_0096_sync.zip 42 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0101/2011_09_26_drive_0101_sync.zip 43 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0104/2011_09_26_drive_0104_sync.zip 44 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0106/2011_09_26_drive_0106_sync.zip 45 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0113/2011_09_26_drive_0113_sync.zip 46 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0117/2011_09_26_drive_0117_sync.zip 47 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_28_calib.zip 48 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_28_drive_0001/2011_09_28_drive_0001_sync.zip 49 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_28_drive_0002/2011_09_28_drive_0002_sync.zip 50 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_29_calib.zip 51 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_29_drive_0004/2011_09_29_drive_0004_sync.zip 52 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_29_drive_0026/2011_09_29_drive_0026_sync.zip 53 | 
https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_29_drive_0071/2011_09_29_drive_0071_sync.zip 54 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_30_calib.zip 55 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_30_drive_0016/2011_09_30_drive_0016_sync.zip 56 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_30_drive_0018/2011_09_30_drive_0018_sync.zip 57 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_30_drive_0020/2011_09_30_drive_0020_sync.zip 58 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_30_drive_0027/2011_09_30_drive_0027_sync.zip 59 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_30_drive_0028/2011_09_30_drive_0028_sync.zip 60 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_30_drive_0033/2011_09_30_drive_0033_sync.zip 61 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_30_drive_0034/2011_09_30_drive_0034_sync.zip 62 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_10_03_calib.zip 63 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_10_03_drive_0027/2011_10_03_drive_0027_sync.zip 64 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_10_03_drive_0034/2011_10_03_drive_0034_sync.zip 65 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_10_03_drive_0042/2011_10_03_drive_0042_sync.zip 66 | https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_10_03_drive_0047/2011_10_03_drive_0047_sync.zip 67 | -------------------------------------------------------------------------------- /splits/odom/test_files_09.txt: -------------------------------------------------------------------------------- 1 | 9 0 l 2 | 9 1 l 3 | 9 2 l 4 | 9 3 l 5 | 9 4 l 6 | 9 5 l 7 | 9 6 l 8 | 9 7 l 9 | 9 8 l 10 | 9 9 l 11 | 9 10 l 12 | 9 11 l 13 | 9 12 l 14 | 9 13 l 15 | 9 14 l 16 | 9 15 l 17 | 9 16 l 18 | 9 17 l 19 | 9 18 l 20 | 9 19 l 21 | 9 20 l 22 | 9 21 l 23 | 9 22 l 24 | 9 23 l 25 | 9 24 l 26 | 9 25 l 27 | 9 26 l 28 | 9 27 l 29 | 9 28 l 30 | 9 29 l 31 | 9 30 l 32 | 9 31 l 33 | 9 32 l 34 | 9 33 l 35 | 9 34 l 36 | 9 35 l 37 | 9 36 l 38 | 9 37 l 39 | 9 38 l 40 | 9 39 l 41 | 9 40 l 42 | 9 41 l 43 | 9 42 l 44 | 9 43 l 45 | 9 44 l 46 | 9 45 l 47 | 9 46 l 48 | 9 47 l 49 | 9 48 l 50 | 9 49 l 51 | 9 50 l 52 | 9 51 l 53 | 9 52 l 54 | 9 53 l 55 | 9 54 l 56 | 9 55 l 57 | 9 56 l 58 | 9 57 l 59 | 9 58 l 60 | 9 59 l 61 | 9 60 l 62 | 9 61 l 63 | 9 62 l 64 | 9 63 l 65 | 9 64 l 66 | 9 65 l 67 | 9 66 l 68 | 9 67 l 69 | 9 68 l 70 | 9 69 l 71 | 9 70 l 72 | 9 71 l 73 | 9 72 l 74 | 9 73 l 75 | 9 74 l 76 | 9 75 l 77 | 9 76 l 78 | 9 77 l 79 | 9 78 l 80 | 9 79 l 81 | 9 80 l 82 | 9 81 l 83 | 9 82 l 84 | 9 83 l 85 | 9 84 l 86 | 9 85 l 87 | 9 86 l 88 | 9 87 l 89 | 9 88 l 90 | 9 89 l 91 | 9 90 l 92 | 9 91 l 93 | 9 92 l 94 | 9 93 l 95 | 9 94 l 96 | 9 95 l 97 | 9 96 l 98 | 9 97 l 99 | 9 98 l 100 | 9 99 l 101 | 9 100 l 102 | 9 101 l 103 | 9 102 l 104 | 9 103 l 105 | 9 104 l 106 | 9 105 l 107 | 9 106 l 108 | 9 107 l 109 | 9 108 l 110 | 9 109 l 111 | 9 110 l 112 | 9 111 l 113 | 9 112 l 114 | 9 113 l 115 | 9 114 l 116 | 9 115 l 117 | 9 116 l 118 | 9 117 l 119 | 9 118 l 120 | 9 119 l 121 | 9 120 l 122 | 9 121 l 123 | 9 122 l 124 | 9 123 l 125 | 9 124 l 126 | 9 125 l 127 | 9 126 l 128 | 9 127 l 129 | 9 128 l 130 | 9 129 l 131 | 9 130 l 132 | 9 131 l 133 | 9 132 l 134 | 9 133 l 135 | 9 134 l 136 | 9 135 l 137 | 9 136 l 138 | 9 137 l 139 | 9 138 l 140 | 9 139 l 141 | 9 140 l 142 | 9 141 l 143 | 9 142 l 144 | 9 143 l 145 | 9 144 l 
146 | 9 145 l 147 | 9 146 l 148 | 9 147 l 149 | 9 148 l 150 | 9 149 l 151 | 9 150 l 152 | 9 151 l 153 | 9 152 l 154 | 9 153 l 155 | 9 154 l 156 | 9 155 l 157 | 9 156 l 158 | 9 157 l 159 | 9 158 l 160 | 9 159 l 161 | 9 160 l 162 | 9 161 l 163 | 9 162 l 164 | 9 163 l 165 | 9 164 l 166 | 9 165 l 167 | 9 166 l 168 | 9 167 l 169 | 9 168 l 170 | 9 169 l 171 | 9 170 l 172 | 9 171 l 173 | 9 172 l 174 | 9 173 l 175 | 9 174 l 176 | 9 175 l 177 | 9 176 l 178 | 9 177 l 179 | 9 178 l 180 | 9 179 l 181 | 9 180 l 182 | 9 181 l 183 | 9 182 l 184 | 9 183 l 185 | 9 184 l 186 | 9 185 l 187 | 9 186 l 188 | 9 187 l 189 | 9 188 l 190 | 9 189 l 191 | 9 190 l 192 | 9 191 l 193 | 9 192 l 194 | 9 193 l 195 | 9 194 l 196 | 9 195 l 197 | 9 196 l 198 | 9 197 l 199 | 9 198 l 200 | 9 199 l 201 | 9 200 l 202 | 9 201 l 203 | 9 202 l 204 | 9 203 l 205 | 9 204 l 206 | 9 205 l 207 | 9 206 l 208 | 9 207 l 209 | 9 208 l 210 | 9 209 l 211 | 9 210 l 212 | 9 211 l 213 | 9 212 l 214 | 9 213 l 215 | 9 214 l 216 | 9 215 l 217 | 9 216 l 218 | 9 217 l 219 | 9 218 l 220 | 9 219 l 221 | 9 220 l 222 | 9 221 l 223 | 9 222 l 224 | 9 223 l 225 | 9 224 l 226 | 9 225 l 227 | 9 226 l 228 | 9 227 l 229 | 9 228 l 230 | 9 229 l 231 | 9 230 l 232 | 9 231 l 233 | 9 232 l 234 | 9 233 l 235 | 9 234 l 236 | 9 235 l 237 | 9 236 l 238 | 9 237 l 239 | 9 238 l 240 | 9 239 l 241 | 9 240 l 242 | 9 241 l 243 | 9 242 l 244 | 9 243 l 245 | 9 244 l 246 | 9 245 l 247 | 9 246 l 248 | 9 247 l 249 | 9 248 l 250 | 9 249 l 251 | 9 250 l 252 | 9 251 l 253 | 9 252 l 254 | 9 253 l 255 | 9 254 l 256 | 9 255 l 257 | 9 256 l 258 | 9 257 l 259 | 9 258 l 260 | 9 259 l 261 | 9 260 l 262 | 9 261 l 263 | 9 262 l 264 | 9 263 l 265 | 9 264 l 266 | 9 265 l 267 | 9 266 l 268 | 9 267 l 269 | 9 268 l 270 | 9 269 l 271 | 9 270 l 272 | 9 271 l 273 | 9 272 l 274 | 9 273 l 275 | 9 274 l 276 | 9 275 l 277 | 9 276 l 278 | 9 277 l 279 | 9 278 l 280 | 9 279 l 281 | 9 280 l 282 | 9 281 l 283 | 9 282 l 284 | 9 283 l 285 | 9 284 l 286 | 9 285 l 287 | 9 286 l 288 | 9 287 l 289 | 9 288 l 290 | 9 289 l 291 | 9 290 l 292 | 9 291 l 293 | 9 292 l 294 | 9 293 l 295 | 9 294 l 296 | 9 295 l 297 | 9 296 l 298 | 9 297 l 299 | 9 298 l 300 | 9 299 l 301 | 9 300 l 302 | 9 301 l 303 | 9 302 l 304 | 9 303 l 305 | 9 304 l 306 | 9 305 l 307 | 9 306 l 308 | 9 307 l 309 | 9 308 l 310 | 9 309 l 311 | 9 310 l 312 | 9 311 l 313 | 9 312 l 314 | 9 313 l 315 | 9 314 l 316 | 9 315 l 317 | 9 316 l 318 | 9 317 l 319 | 9 318 l 320 | 9 319 l 321 | 9 320 l 322 | 9 321 l 323 | 9 322 l 324 | 9 323 l 325 | 9 324 l 326 | 9 325 l 327 | 9 326 l 328 | 9 327 l 329 | 9 328 l 330 | 9 329 l 331 | 9 330 l 332 | 9 331 l 333 | 9 332 l 334 | 9 333 l 335 | 9 334 l 336 | 9 335 l 337 | 9 336 l 338 | 9 337 l 339 | 9 338 l 340 | 9 339 l 341 | 9 340 l 342 | 9 341 l 343 | 9 342 l 344 | 9 343 l 345 | 9 344 l 346 | 9 345 l 347 | 9 346 l 348 | 9 347 l 349 | 9 348 l 350 | 9 349 l 351 | 9 350 l 352 | 9 351 l 353 | 9 352 l 354 | 9 353 l 355 | 9 354 l 356 | 9 355 l 357 | 9 356 l 358 | 9 357 l 359 | 9 358 l 360 | 9 359 l 361 | 9 360 l 362 | 9 361 l 363 | 9 362 l 364 | 9 363 l 365 | 9 364 l 366 | 9 365 l 367 | 9 366 l 368 | 9 367 l 369 | 9 368 l 370 | 9 369 l 371 | 9 370 l 372 | 9 371 l 373 | 9 372 l 374 | 9 373 l 375 | 9 374 l 376 | 9 375 l 377 | 9 376 l 378 | 9 377 l 379 | 9 378 l 380 | 9 379 l 381 | 9 380 l 382 | 9 381 l 383 | 9 382 l 384 | 9 383 l 385 | 9 384 l 386 | 9 385 l 387 | 9 386 l 388 | 9 387 l 389 | 9 388 l 390 | 9 389 l 391 | 9 390 l 392 | 9 391 l 393 | 9 392 l 394 | 9 393 l 395 | 9 394 l 396 | 9 395 l 397 | 9 396 l 398 | 9 397 l 399 | 9 398 
l 400 | 9 399 l 401 | 9 400 l 402 | 9 401 l 403 | 9 402 l 404 | 9 403 l 405 | 9 404 l 406 | 9 405 l 407 | 9 406 l 408 | 9 407 l 409 | 9 408 l 410 | 9 409 l 411 | 9 410 l 412 | 9 411 l 413 | 9 412 l 414 | 9 413 l 415 | 9 414 l 416 | 9 415 l 417 | 9 416 l 418 | 9 417 l 419 | 9 418 l 420 | 9 419 l 421 | 9 420 l 422 | 9 421 l 423 | 9 422 l 424 | 9 423 l 425 | 9 424 l 426 | 9 425 l 427 | 9 426 l 428 | 9 427 l 429 | 9 428 l 430 | 9 429 l 431 | 9 430 l 432 | 9 431 l 433 | 9 432 l 434 | 9 433 l 435 | 9 434 l 436 | 9 435 l 437 | 9 436 l 438 | 9 437 l 439 | 9 438 l 440 | 9 439 l 441 | 9 440 l 442 | 9 441 l 443 | 9 442 l 444 | 9 443 l 445 | 9 444 l 446 | 9 445 l 447 | 9 446 l 448 | 9 447 l 449 | 9 448 l 450 | 9 449 l 451 | 9 450 l 452 | 9 451 l 453 | 9 452 l 454 | 9 453 l 455 | 9 454 l 456 | 9 455 l 457 | 9 456 l 458 | 9 457 l 459 | 9 458 l 460 | 9 459 l 461 | 9 460 l 462 | 9 461 l 463 | 9 462 l 464 | 9 463 l 465 | 9 464 l 466 | 9 465 l 467 | 9 466 l 468 | 9 467 l 469 | 9 468 l 470 | 9 469 l 471 | 9 470 l 472 | 9 471 l 473 | 9 472 l 474 | 9 473 l 475 | 9 474 l 476 | 9 475 l 477 | 9 476 l 478 | 9 477 l 479 | 9 478 l 480 | 9 479 l 481 | 9 480 l 482 | 9 481 l 483 | 9 482 l 484 | 9 483 l 485 | 9 484 l 486 | 9 485 l 487 | 9 486 l 488 | 9 487 l 489 | 9 488 l 490 | 9 489 l 491 | 9 490 l 492 | 9 491 l 493 | 9 492 l 494 | 9 493 l 495 | 9 494 l 496 | 9 495 l 497 | 9 496 l 498 | 9 497 l 499 | 9 498 l 500 | 9 499 l 501 | 9 500 l 502 | 9 501 l 503 | 9 502 l 504 | 9 503 l 505 | 9 504 l 506 | 9 505 l 507 | 9 506 l 508 | 9 507 l 509 | 9 508 l 510 | 9 509 l 511 | 9 510 l 512 | 9 511 l 513 | 9 512 l 514 | 9 513 l 515 | 9 514 l 516 | 9 515 l 517 | 9 516 l 518 | 9 517 l 519 | 9 518 l 520 | 9 519 l 521 | 9 520 l 522 | 9 521 l 523 | 9 522 l 524 | 9 523 l 525 | 9 524 l 526 | 9 525 l 527 | 9 526 l 528 | 9 527 l 529 | 9 528 l 530 | 9 529 l 531 | 9 530 l 532 | 9 531 l 533 | 9 532 l 534 | 9 533 l 535 | 9 534 l 536 | 9 535 l 537 | 9 536 l 538 | 9 537 l 539 | 9 538 l 540 | 9 539 l 541 | 9 540 l 542 | 9 541 l 543 | 9 542 l 544 | 9 543 l 545 | 9 544 l 546 | 9 545 l 547 | 9 546 l 548 | 9 547 l 549 | 9 548 l 550 | 9 549 l 551 | 9 550 l 552 | 9 551 l 553 | 9 552 l 554 | 9 553 l 555 | 9 554 l 556 | 9 555 l 557 | 9 556 l 558 | 9 557 l 559 | 9 558 l 560 | 9 559 l 561 | 9 560 l 562 | 9 561 l 563 | 9 562 l 564 | 9 563 l 565 | 9 564 l 566 | 9 565 l 567 | 9 566 l 568 | 9 567 l 569 | 9 568 l 570 | 9 569 l 571 | 9 570 l 572 | 9 571 l 573 | 9 572 l 574 | 9 573 l 575 | 9 574 l 576 | 9 575 l 577 | 9 576 l 578 | 9 577 l 579 | 9 578 l 580 | 9 579 l 581 | 9 580 l 582 | 9 581 l 583 | 9 582 l 584 | 9 583 l 585 | 9 584 l 586 | 9 585 l 587 | 9 586 l 588 | 9 587 l 589 | 9 588 l 590 | 9 589 l 591 | 9 590 l 592 | 9 591 l 593 | 9 592 l 594 | 9 593 l 595 | 9 594 l 596 | 9 595 l 597 | 9 596 l 598 | 9 597 l 599 | 9 598 l 600 | 9 599 l 601 | 9 600 l 602 | 9 601 l 603 | 9 602 l 604 | 9 603 l 605 | 9 604 l 606 | 9 605 l 607 | 9 606 l 608 | 9 607 l 609 | 9 608 l 610 | 9 609 l 611 | 9 610 l 612 | 9 611 l 613 | 9 612 l 614 | 9 613 l 615 | 9 614 l 616 | 9 615 l 617 | 9 616 l 618 | 9 617 l 619 | 9 618 l 620 | 9 619 l 621 | 9 620 l 622 | 9 621 l 623 | 9 622 l 624 | 9 623 l 625 | 9 624 l 626 | 9 625 l 627 | 9 626 l 628 | 9 627 l 629 | 9 628 l 630 | 9 629 l 631 | 9 630 l 632 | 9 631 l 633 | 9 632 l 634 | 9 633 l 635 | 9 634 l 636 | 9 635 l 637 | 9 636 l 638 | 9 637 l 639 | 9 638 l 640 | 9 639 l 641 | 9 640 l 642 | 9 641 l 643 | 9 642 l 644 | 9 643 l 645 | 9 644 l 646 | 9 645 l 647 | 9 646 l 648 | 9 647 l 649 | 9 648 l 650 | 9 649 l 651 | 9 650 l 652 | 9 651 l 653 | 9 
652 l 654 | 9 653 l 655 | 9 654 l 656 | 9 655 l 657 | 9 656 l 658 | 9 657 l 659 | 9 658 l 660 | 9 659 l 661 | 9 660 l 662 | 9 661 l 663 | 9 662 l 664 | 9 663 l 665 | 9 664 l 666 | 9 665 l 667 | 9 666 l 668 | 9 667 l 669 | 9 668 l 670 | 9 669 l 671 | 9 670 l 672 | 9 671 l 673 | 9 672 l 674 | 9 673 l 675 | 9 674 l 676 | 9 675 l 677 | 9 676 l 678 | 9 677 l 679 | 9 678 l 680 | 9 679 l 681 | 9 680 l 682 | 9 681 l 683 | 9 682 l 684 | 9 683 l 685 | 9 684 l 686 | 9 685 l 687 | 9 686 l 688 | 9 687 l 689 | 9 688 l 690 | 9 689 l 691 | 9 690 l 692 | 9 691 l 693 | 9 692 l 694 | 9 693 l 695 | 9 694 l 696 | 9 695 l 697 | 9 696 l 698 | 9 697 l 699 | 9 698 l 700 | 9 699 l 701 | 9 700 l 702 | 9 701 l 703 | 9 702 l 704 | 9 703 l 705 | 9 704 l 706 | 9 705 l 707 | 9 706 l 708 | 9 707 l 709 | 9 708 l 710 | 9 709 l 711 | 9 710 l 712 | 9 711 l 713 | 9 712 l 714 | 9 713 l 715 | 9 714 l 716 | 9 715 l 717 | 9 716 l 718 | 9 717 l 719 | 9 718 l 720 | 9 719 l 721 | 9 720 l 722 | 9 721 l 723 | 9 722 l 724 | 9 723 l 725 | 9 724 l 726 | 9 725 l 727 | 9 726 l 728 | 9 727 l 729 | 9 728 l 730 | 9 729 l 731 | 9 730 l 732 | 9 731 l 733 | 9 732 l 734 | 9 733 l 735 | 9 734 l 736 | 9 735 l 737 | 9 736 l 738 | 9 737 l 739 | 9 738 l 740 | 9 739 l 741 | 9 740 l 742 | 9 741 l 743 | 9 742 l 744 | 9 743 l 745 | 9 744 l 746 | 9 745 l 747 | 9 746 l 748 | 9 747 l 749 | 9 748 l 750 | 9 749 l 751 | 9 750 l 752 | 9 751 l 753 | 9 752 l 754 | 9 753 l 755 | 9 754 l 756 | 9 755 l 757 | 9 756 l 758 | 9 757 l 759 | 9 758 l 760 | 9 759 l 761 | 9 760 l 762 | 9 761 l 763 | 9 762 l 764 | 9 763 l 765 | 9 764 l 766 | 9 765 l 767 | 9 766 l 768 | 9 767 l 769 | 9 768 l 770 | 9 769 l 771 | 9 770 l 772 | 9 771 l 773 | 9 772 l 774 | 9 773 l 775 | 9 774 l 776 | 9 775 l 777 | 9 776 l 778 | 9 777 l 779 | 9 778 l 780 | 9 779 l 781 | 9 780 l 782 | 9 781 l 783 | 9 782 l 784 | 9 783 l 785 | 9 784 l 786 | 9 785 l 787 | 9 786 l 788 | 9 787 l 789 | 9 788 l 790 | 9 789 l 791 | 9 790 l 792 | 9 791 l 793 | 9 792 l 794 | 9 793 l 795 | 9 794 l 796 | 9 795 l 797 | 9 796 l 798 | 9 797 l 799 | 9 798 l 800 | 9 799 l 801 | 9 800 l 802 | 9 801 l 803 | 9 802 l 804 | 9 803 l 805 | 9 804 l 806 | 9 805 l 807 | 9 806 l 808 | 9 807 l 809 | 9 808 l 810 | 9 809 l 811 | 9 810 l 812 | 9 811 l 813 | 9 812 l 814 | 9 813 l 815 | 9 814 l 816 | 9 815 l 817 | 9 816 l 818 | 9 817 l 819 | 9 818 l 820 | 9 819 l 821 | 9 820 l 822 | 9 821 l 823 | 9 822 l 824 | 9 823 l 825 | 9 824 l 826 | 9 825 l 827 | 9 826 l 828 | 9 827 l 829 | 9 828 l 830 | 9 829 l 831 | 9 830 l 832 | 9 831 l 833 | 9 832 l 834 | 9 833 l 835 | 9 834 l 836 | 9 835 l 837 | 9 836 l 838 | 9 837 l 839 | 9 838 l 840 | 9 839 l 841 | 9 840 l 842 | 9 841 l 843 | 9 842 l 844 | 9 843 l 845 | 9 844 l 846 | 9 845 l 847 | 9 846 l 848 | 9 847 l 849 | 9 848 l 850 | 9 849 l 851 | 9 850 l 852 | 9 851 l 853 | 9 852 l 854 | 9 853 l 855 | 9 854 l 856 | 9 855 l 857 | 9 856 l 858 | 9 857 l 859 | 9 858 l 860 | 9 859 l 861 | 9 860 l 862 | 9 861 l 863 | 9 862 l 864 | 9 863 l 865 | 9 864 l 866 | 9 865 l 867 | 9 866 l 868 | 9 867 l 869 | 9 868 l 870 | 9 869 l 871 | 9 870 l 872 | 9 871 l 873 | 9 872 l 874 | 9 873 l 875 | 9 874 l 876 | 9 875 l 877 | 9 876 l 878 | 9 877 l 879 | 9 878 l 880 | 9 879 l 881 | 9 880 l 882 | 9 881 l 883 | 9 882 l 884 | 9 883 l 885 | 9 884 l 886 | 9 885 l 887 | 9 886 l 888 | 9 887 l 889 | 9 888 l 890 | 9 889 l 891 | 9 890 l 892 | 9 891 l 893 | 9 892 l 894 | 9 893 l 895 | 9 894 l 896 | 9 895 l 897 | 9 896 l 898 | 9 897 l 899 | 9 898 l 900 | 9 899 l 901 | 9 900 l 902 | 9 901 l 903 | 9 902 l 904 | 9 903 l 905 | 9 904 l 906 | 9 905 l 907 | 
9 906 l 908 | 9 907 l 909 | 9 908 l 910 | 9 909 l 911 | 9 910 l 912 | 9 911 l 913 | 9 912 l 914 | 9 913 l 915 | 9 914 l 916 | 9 915 l 917 | 9 916 l 918 | 9 917 l 919 | 9 918 l 920 | 9 919 l 921 | 9 920 l 922 | 9 921 l 923 | 9 922 l 924 | 9 923 l 925 | 9 924 l 926 | 9 925 l 927 | 9 926 l 928 | 9 927 l 929 | 9 928 l 930 | 9 929 l 931 | 9 930 l 932 | 9 931 l 933 | 9 932 l 934 | 9 933 l 935 | 9 934 l 936 | 9 935 l 937 | 9 936 l 938 | 9 937 l 939 | 9 938 l 940 | 9 939 l 941 | 9 940 l 942 | 9 941 l 943 | 9 942 l 944 | 9 943 l 945 | 9 944 l 946 | 9 945 l 947 | 9 946 l 948 | 9 947 l 949 | 9 948 l 950 | 9 949 l 951 | 9 950 l 952 | 9 951 l 953 | 9 952 l 954 | 9 953 l 955 | 9 954 l 956 | 9 955 l 957 | 9 956 l 958 | 9 957 l 959 | 9 958 l 960 | 9 959 l 961 | 9 960 l 962 | 9 961 l 963 | 9 962 l 964 | 9 963 l 965 | 9 964 l 966 | 9 965 l 967 | 9 966 l 968 | 9 967 l 969 | 9 968 l 970 | 9 969 l 971 | 9 970 l 972 | 9 971 l 973 | 9 972 l 974 | 9 973 l 975 | 9 974 l 976 | 9 975 l 977 | 9 976 l 978 | 9 977 l 979 | 9 978 l 980 | 9 979 l 981 | 9 980 l 982 | 9 981 l 983 | 9 982 l 984 | 9 983 l 985 | 9 984 l 986 | 9 985 l 987 | 9 986 l 988 | 9 987 l 989 | 9 988 l 990 | 9 989 l 991 | 9 990 l 992 | 9 991 l 993 | 9 992 l 994 | 9 993 l 995 | 9 994 l 996 | 9 995 l 997 | 9 996 l 998 | 9 997 l 999 | 9 998 l 1000 | 9 999 l 1001 | 9 1000 l 1002 | 9 1001 l 1003 | 9 1002 l 1004 | 9 1003 l 1005 | 9 1004 l 1006 | 9 1005 l 1007 | 9 1006 l 1008 | 9 1007 l 1009 | 9 1008 l 1010 | 9 1009 l 1011 | 9 1010 l 1012 | 9 1011 l 1013 | 9 1012 l 1014 | 9 1013 l 1015 | 9 1014 l 1016 | 9 1015 l 1017 | 9 1016 l 1018 | 9 1017 l 1019 | 9 1018 l 1020 | 9 1019 l 1021 | 9 1020 l 1022 | 9 1021 l 1023 | 9 1022 l 1024 | 9 1023 l 1025 | 9 1024 l 1026 | 9 1025 l 1027 | 9 1026 l 1028 | 9 1027 l 1029 | 9 1028 l 1030 | 9 1029 l 1031 | 9 1030 l 1032 | 9 1031 l 1033 | 9 1032 l 1034 | 9 1033 l 1035 | 9 1034 l 1036 | 9 1035 l 1037 | 9 1036 l 1038 | 9 1037 l 1039 | 9 1038 l 1040 | 9 1039 l 1041 | 9 1040 l 1042 | 9 1041 l 1043 | 9 1042 l 1044 | 9 1043 l 1045 | 9 1044 l 1046 | 9 1045 l 1047 | 9 1046 l 1048 | 9 1047 l 1049 | 9 1048 l 1050 | 9 1049 l 1051 | 9 1050 l 1052 | 9 1051 l 1053 | 9 1052 l 1054 | 9 1053 l 1055 | 9 1054 l 1056 | 9 1055 l 1057 | 9 1056 l 1058 | 9 1057 l 1059 | 9 1058 l 1060 | 9 1059 l 1061 | 9 1060 l 1062 | 9 1061 l 1063 | 9 1062 l 1064 | 9 1063 l 1065 | 9 1064 l 1066 | 9 1065 l 1067 | 9 1066 l 1068 | 9 1067 l 1069 | 9 1068 l 1070 | 9 1069 l 1071 | 9 1070 l 1072 | 9 1071 l 1073 | 9 1072 l 1074 | 9 1073 l 1075 | 9 1074 l 1076 | 9 1075 l 1077 | 9 1076 l 1078 | 9 1077 l 1079 | 9 1078 l 1080 | 9 1079 l 1081 | 9 1080 l 1082 | 9 1081 l 1083 | 9 1082 l 1084 | 9 1083 l 1085 | 9 1084 l 1086 | 9 1085 l 1087 | 9 1086 l 1088 | 9 1087 l 1089 | 9 1088 l 1090 | 9 1089 l 1091 | 9 1090 l 1092 | 9 1091 l 1093 | 9 1092 l 1094 | 9 1093 l 1095 | 9 1094 l 1096 | 9 1095 l 1097 | 9 1096 l 1098 | 9 1097 l 1099 | 9 1098 l 1100 | 9 1099 l 1101 | 9 1100 l 1102 | 9 1101 l 1103 | 9 1102 l 1104 | 9 1103 l 1105 | 9 1104 l 1106 | 9 1105 l 1107 | 9 1106 l 1108 | 9 1107 l 1109 | 9 1108 l 1110 | 9 1109 l 1111 | 9 1110 l 1112 | 9 1111 l 1113 | 9 1112 l 1114 | 9 1113 l 1115 | 9 1114 l 1116 | 9 1115 l 1117 | 9 1116 l 1118 | 9 1117 l 1119 | 9 1118 l 1120 | 9 1119 l 1121 | 9 1120 l 1122 | 9 1121 l 1123 | 9 1122 l 1124 | 9 1123 l 1125 | 9 1124 l 1126 | 9 1125 l 1127 | 9 1126 l 1128 | 9 1127 l 1129 | 9 1128 l 1130 | 9 1129 l 1131 | 9 1130 l 1132 | 9 1131 l 1133 | 9 1132 l 1134 | 9 1133 l 1135 | 9 1134 l 1136 | 9 1135 l 1137 | 9 1136 l 1138 | 9 1137 l 1139 | 9 1138 l 1140 | 9 1139 l 
1141 | 9 1140 l 1142 | 9 1141 l 1143 | 9 1142 l 1144 | 9 1143 l 1145 | 9 1144 l 1146 | 9 1145 l 1147 | 9 1146 l 1148 | 9 1147 l 1149 | 9 1148 l 1150 | 9 1149 l 1151 | 9 1150 l 1152 | 9 1151 l 1153 | 9 1152 l 1154 | 9 1153 l 1155 | 9 1154 l 1156 | 9 1155 l 1157 | 9 1156 l 1158 | 9 1157 l 1159 | 9 1158 l 1160 | 9 1159 l 1161 | 9 1160 l 1162 | 9 1161 l 1163 | 9 1162 l 1164 | 9 1163 l 1165 | 9 1164 l 1166 | 9 1165 l 1167 | 9 1166 l 1168 | 9 1167 l 1169 | 9 1168 l 1170 | 9 1169 l 1171 | 9 1170 l 1172 | 9 1171 l 1173 | 9 1172 l 1174 | 9 1173 l 1175 | 9 1174 l 1176 | 9 1175 l 1177 | 9 1176 l 1178 | 9 1177 l 1179 | 9 1178 l 1180 | 9 1179 l 1181 | 9 1180 l 1182 | 9 1181 l 1183 | 9 1182 l 1184 | 9 1183 l 1185 | 9 1184 l 1186 | 9 1185 l 1187 | 9 1186 l 1188 | 9 1187 l 1189 | 9 1188 l 1190 | 9 1189 l 1191 | 9 1190 l 1192 | 9 1191 l 1193 | 9 1192 l 1194 | 9 1193 l 1195 | 9 1194 l 1196 | 9 1195 l 1197 | 9 1196 l 1198 | 9 1197 l 1199 | 9 1198 l 1200 | 9 1199 l 1201 | 9 1200 l 1202 | 9 1201 l 1203 | 9 1202 l 1204 | 9 1203 l 1205 | 9 1204 l 1206 | 9 1205 l 1207 | 9 1206 l 1208 | 9 1207 l 1209 | 9 1208 l 1210 | 9 1209 l 1211 | 9 1210 l 1212 | 9 1211 l 1213 | 9 1212 l 1214 | 9 1213 l 1215 | 9 1214 l 1216 | 9 1215 l 1217 | 9 1216 l 1218 | 9 1217 l 1219 | 9 1218 l 1220 | 9 1219 l 1221 | 9 1220 l 1222 | 9 1221 l 1223 | 9 1222 l 1224 | 9 1223 l 1225 | 9 1224 l 1226 | 9 1225 l 1227 | 9 1226 l 1228 | 9 1227 l 1229 | 9 1228 l 1230 | 9 1229 l 1231 | 9 1230 l 1232 | 9 1231 l 1233 | 9 1232 l 1234 | 9 1233 l 1235 | 9 1234 l 1236 | 9 1235 l 1237 | 9 1236 l 1238 | 9 1237 l 1239 | 9 1238 l 1240 | 9 1239 l 1241 | 9 1240 l 1242 | 9 1241 l 1243 | 9 1242 l 1244 | 9 1243 l 1245 | 9 1244 l 1246 | 9 1245 l 1247 | 9 1246 l 1248 | 9 1247 l 1249 | 9 1248 l 1250 | 9 1249 l 1251 | 9 1250 l 1252 | 9 1251 l 1253 | 9 1252 l 1254 | 9 1253 l 1255 | 9 1254 l 1256 | 9 1255 l 1257 | 9 1256 l 1258 | 9 1257 l 1259 | 9 1258 l 1260 | 9 1259 l 1261 | 9 1260 l 1262 | 9 1261 l 1263 | 9 1262 l 1264 | 9 1263 l 1265 | 9 1264 l 1266 | 9 1265 l 1267 | 9 1266 l 1268 | 9 1267 l 1269 | 9 1268 l 1270 | 9 1269 l 1271 | 9 1270 l 1272 | 9 1271 l 1273 | 9 1272 l 1274 | 9 1273 l 1275 | 9 1274 l 1276 | 9 1275 l 1277 | 9 1276 l 1278 | 9 1277 l 1279 | 9 1278 l 1280 | 9 1279 l 1281 | 9 1280 l 1282 | 9 1281 l 1283 | 9 1282 l 1284 | 9 1283 l 1285 | 9 1284 l 1286 | 9 1285 l 1287 | 9 1286 l 1288 | 9 1287 l 1289 | 9 1288 l 1290 | 9 1289 l 1291 | 9 1290 l 1292 | 9 1291 l 1293 | 9 1292 l 1294 | 9 1293 l 1295 | 9 1294 l 1296 | 9 1295 l 1297 | 9 1296 l 1298 | 9 1297 l 1299 | 9 1298 l 1300 | 9 1299 l 1301 | 9 1300 l 1302 | 9 1301 l 1303 | 9 1302 l 1304 | 9 1303 l 1305 | 9 1304 l 1306 | 9 1305 l 1307 | 9 1306 l 1308 | 9 1307 l 1309 | 9 1308 l 1310 | 9 1309 l 1311 | 9 1310 l 1312 | 9 1311 l 1313 | 9 1312 l 1314 | 9 1313 l 1315 | 9 1314 l 1316 | 9 1315 l 1317 | 9 1316 l 1318 | 9 1317 l 1319 | 9 1318 l 1320 | 9 1319 l 1321 | 9 1320 l 1322 | 9 1321 l 1323 | 9 1322 l 1324 | 9 1323 l 1325 | 9 1324 l 1326 | 9 1325 l 1327 | 9 1326 l 1328 | 9 1327 l 1329 | 9 1328 l 1330 | 9 1329 l 1331 | 9 1330 l 1332 | 9 1331 l 1333 | 9 1332 l 1334 | 9 1333 l 1335 | 9 1334 l 1336 | 9 1335 l 1337 | 9 1336 l 1338 | 9 1337 l 1339 | 9 1338 l 1340 | 9 1339 l 1341 | 9 1340 l 1342 | 9 1341 l 1343 | 9 1342 l 1344 | 9 1343 l 1345 | 9 1344 l 1346 | 9 1345 l 1347 | 9 1346 l 1348 | 9 1347 l 1349 | 9 1348 l 1350 | 9 1349 l 1351 | 9 1350 l 1352 | 9 1351 l 1353 | 9 1352 l 1354 | 9 1353 l 1355 | 9 1354 l 1356 | 9 1355 l 1357 | 9 1356 l 1358 | 9 1357 l 1359 | 9 1358 l 1360 | 9 1359 l 1361 | 9 1360 l 1362 | 9 1361 l 
1363 | 9 1362 l 1364 | 9 1363 l 1365 | 9 1364 l 1366 | 9 1365 l 1367 | 9 1366 l 1368 | 9 1367 l 1369 | 9 1368 l 1370 | 9 1369 l 1371 | 9 1370 l 1372 | 9 1371 l 1373 | 9 1372 l 1374 | 9 1373 l 1375 | 9 1374 l 1376 | 9 1375 l 1377 | 9 1376 l 1378 | 9 1377 l 1379 | 9 1378 l 1380 | 9 1379 l 1381 | 9 1380 l 1382 | 9 1381 l 1383 | 9 1382 l 1384 | 9 1383 l 1385 | 9 1384 l 1386 | 9 1385 l 1387 | 9 1386 l 1388 | 9 1387 l 1389 | 9 1388 l 1390 | 9 1389 l 1391 | 9 1390 l 1392 | 9 1391 l 1393 | 9 1392 l 1394 | 9 1393 l 1395 | 9 1394 l 1396 | 9 1395 l 1397 | 9 1396 l 1398 | 9 1397 l 1399 | 9 1398 l 1400 | 9 1399 l 1401 | 9 1400 l 1402 | 9 1401 l 1403 | 9 1402 l 1404 | 9 1403 l 1405 | 9 1404 l 1406 | 9 1405 l 1407 | 9 1406 l 1408 | 9 1407 l 1409 | 9 1408 l 1410 | 9 1409 l 1411 | 9 1410 l 1412 | 9 1411 l 1413 | 9 1412 l 1414 | 9 1413 l 1415 | 9 1414 l 1416 | 9 1415 l 1417 | 9 1416 l 1418 | 9 1417 l 1419 | 9 1418 l 1420 | 9 1419 l 1421 | 9 1420 l 1422 | 9 1421 l 1423 | 9 1422 l 1424 | 9 1423 l 1425 | 9 1424 l 1426 | 9 1425 l 1427 | 9 1426 l 1428 | 9 1427 l 1429 | 9 1428 l 1430 | 9 1429 l 1431 | 9 1430 l 1432 | 9 1431 l 1433 | 9 1432 l 1434 | 9 1433 l 1435 | 9 1434 l 1436 | 9 1435 l 1437 | 9 1436 l 1438 | 9 1437 l 1439 | 9 1438 l 1440 | 9 1439 l 1441 | 9 1440 l 1442 | 9 1441 l 1443 | 9 1442 l 1444 | 9 1443 l 1445 | 9 1444 l 1446 | 9 1445 l 1447 | 9 1446 l 1448 | 9 1447 l 1449 | 9 1448 l 1450 | 9 1449 l 1451 | 9 1450 l 1452 | 9 1451 l 1453 | 9 1452 l 1454 | 9 1453 l 1455 | 9 1454 l 1456 | 9 1455 l 1457 | 9 1456 l 1458 | 9 1457 l 1459 | 9 1458 l 1460 | 9 1459 l 1461 | 9 1460 l 1462 | 9 1461 l 1463 | 9 1462 l 1464 | 9 1463 l 1465 | 9 1464 l 1466 | 9 1465 l 1467 | 9 1466 l 1468 | 9 1467 l 1469 | 9 1468 l 1470 | 9 1469 l 1471 | 9 1470 l 1472 | 9 1471 l 1473 | 9 1472 l 1474 | 9 1473 l 1475 | 9 1474 l 1476 | 9 1475 l 1477 | 9 1476 l 1478 | 9 1477 l 1479 | 9 1478 l 1480 | 9 1479 l 1481 | 9 1480 l 1482 | 9 1481 l 1483 | 9 1482 l 1484 | 9 1483 l 1485 | 9 1484 l 1486 | 9 1485 l 1487 | 9 1486 l 1488 | 9 1487 l 1489 | 9 1488 l 1490 | 9 1489 l 1491 | 9 1490 l 1492 | 9 1491 l 1493 | 9 1492 l 1494 | 9 1493 l 1495 | 9 1494 l 1496 | 9 1495 l 1497 | 9 1496 l 1498 | 9 1497 l 1499 | 9 1498 l 1500 | 9 1499 l 1501 | 9 1500 l 1502 | 9 1501 l 1503 | 9 1502 l 1504 | 9 1503 l 1505 | 9 1504 l 1506 | 9 1505 l 1507 | 9 1506 l 1508 | 9 1507 l 1509 | 9 1508 l 1510 | 9 1509 l 1511 | 9 1510 l 1512 | 9 1511 l 1513 | 9 1512 l 1514 | 9 1513 l 1515 | 9 1514 l 1516 | 9 1515 l 1517 | 9 1516 l 1518 | 9 1517 l 1519 | 9 1518 l 1520 | 9 1519 l 1521 | 9 1520 l 1522 | 9 1521 l 1523 | 9 1522 l 1524 | 9 1523 l 1525 | 9 1524 l 1526 | 9 1525 l 1527 | 9 1526 l 1528 | 9 1527 l 1529 | 9 1528 l 1530 | 9 1529 l 1531 | 9 1530 l 1532 | 9 1531 l 1533 | 9 1532 l 1534 | 9 1533 l 1535 | 9 1534 l 1536 | 9 1535 l 1537 | 9 1536 l 1538 | 9 1537 l 1539 | 9 1538 l 1540 | 9 1539 l 1541 | 9 1540 l 1542 | 9 1541 l 1543 | 9 1542 l 1544 | 9 1543 l 1545 | 9 1544 l 1546 | 9 1545 l 1547 | 9 1546 l 1548 | 9 1547 l 1549 | 9 1548 l 1550 | 9 1549 l 1551 | 9 1550 l 1552 | 9 1551 l 1553 | 9 1552 l 1554 | 9 1553 l 1555 | 9 1554 l 1556 | 9 1555 l 1557 | 9 1556 l 1558 | 9 1557 l 1559 | 9 1558 l 1560 | 9 1559 l 1561 | 9 1560 l 1562 | 9 1561 l 1563 | 9 1562 l 1564 | 9 1563 l 1565 | 9 1564 l 1566 | 9 1565 l 1567 | 9 1566 l 1568 | 9 1567 l 1569 | 9 1568 l 1570 | 9 1569 l 1571 | 9 1570 l 1572 | 9 1571 l 1573 | 9 1572 l 1574 | 9 1573 l 1575 | 9 1574 l 1576 | 9 1575 l 1577 | 9 1576 l 1578 | 9 1577 l 1579 | 9 1578 l 1580 | 9 1579 l 1581 | 9 1580 l 1582 | 9 1581 l 1583 | 9 1582 l 1584 | 9 1583 l 
1585 | 9 1584 l 1586 | 9 1585 l 1587 | 9 1586 l 1588 | 9 1587 l 1589 | 9 1588 l 1590 | 9 1589 l 1591 | -------------------------------------------------------------------------------- /splits/odom/test_files_10.txt: -------------------------------------------------------------------------------- 1 | 10 0 l 2 | 10 1 l 3 | 10 2 l 4 | 10 3 l 5 | 10 4 l 6 | 10 5 l 7 | 10 6 l 8 | 10 7 l 9 | 10 8 l 10 | 10 9 l 11 | 10 10 l 12 | 10 11 l 13 | 10 12 l 14 | 10 13 l 15 | 10 14 l 16 | 10 15 l 17 | 10 16 l 18 | 10 17 l 19 | 10 18 l 20 | 10 19 l 21 | 10 20 l 22 | 10 21 l 23 | 10 22 l 24 | 10 23 l 25 | 10 24 l 26 | 10 25 l 27 | 10 26 l 28 | 10 27 l 29 | 10 28 l 30 | 10 29 l 31 | 10 30 l 32 | 10 31 l 33 | 10 32 l 34 | 10 33 l 35 | 10 34 l 36 | 10 35 l 37 | 10 36 l 38 | 10 37 l 39 | 10 38 l 40 | 10 39 l 41 | 10 40 l 42 | 10 41 l 43 | 10 42 l 44 | 10 43 l 45 | 10 44 l 46 | 10 45 l 47 | 10 46 l 48 | 10 47 l 49 | 10 48 l 50 | 10 49 l 51 | 10 50 l 52 | 10 51 l 53 | 10 52 l 54 | 10 53 l 55 | 10 54 l 56 | 10 55 l 57 | 10 56 l 58 | 10 57 l 59 | 10 58 l 60 | 10 59 l 61 | 10 60 l 62 | 10 61 l 63 | 10 62 l 64 | 10 63 l 65 | 10 64 l 66 | 10 65 l 67 | 10 66 l 68 | 10 67 l 69 | 10 68 l 70 | 10 69 l 71 | 10 70 l 72 | 10 71 l 73 | 10 72 l 74 | 10 73 l 75 | 10 74 l 76 | 10 75 l 77 | 10 76 l 78 | 10 77 l 79 | 10 78 l 80 | 10 79 l 81 | 10 80 l 82 | 10 81 l 83 | 10 82 l 84 | 10 83 l 85 | 10 84 l 86 | 10 85 l 87 | 10 86 l 88 | 10 87 l 89 | 10 88 l 90 | 10 89 l 91 | 10 90 l 92 | 10 91 l 93 | 10 92 l 94 | 10 93 l 95 | 10 94 l 96 | 10 95 l 97 | 10 96 l 98 | 10 97 l 99 | 10 98 l 100 | 10 99 l 101 | 10 100 l 102 | 10 101 l 103 | 10 102 l 104 | 10 103 l 105 | 10 104 l 106 | 10 105 l 107 | 10 106 l 108 | 10 107 l 109 | 10 108 l 110 | 10 109 l 111 | 10 110 l 112 | 10 111 l 113 | 10 112 l 114 | 10 113 l 115 | 10 114 l 116 | 10 115 l 117 | 10 116 l 118 | 10 117 l 119 | 10 118 l 120 | 10 119 l 121 | 10 120 l 122 | 10 121 l 123 | 10 122 l 124 | 10 123 l 125 | 10 124 l 126 | 10 125 l 127 | 10 126 l 128 | 10 127 l 129 | 10 128 l 130 | 10 129 l 131 | 10 130 l 132 | 10 131 l 133 | 10 132 l 134 | 10 133 l 135 | 10 134 l 136 | 10 135 l 137 | 10 136 l 138 | 10 137 l 139 | 10 138 l 140 | 10 139 l 141 | 10 140 l 142 | 10 141 l 143 | 10 142 l 144 | 10 143 l 145 | 10 144 l 146 | 10 145 l 147 | 10 146 l 148 | 10 147 l 149 | 10 148 l 150 | 10 149 l 151 | 10 150 l 152 | 10 151 l 153 | 10 152 l 154 | 10 153 l 155 | 10 154 l 156 | 10 155 l 157 | 10 156 l 158 | 10 157 l 159 | 10 158 l 160 | 10 159 l 161 | 10 160 l 162 | 10 161 l 163 | 10 162 l 164 | 10 163 l 165 | 10 164 l 166 | 10 165 l 167 | 10 166 l 168 | 10 167 l 169 | 10 168 l 170 | 10 169 l 171 | 10 170 l 172 | 10 171 l 173 | 10 172 l 174 | 10 173 l 175 | 10 174 l 176 | 10 175 l 177 | 10 176 l 178 | 10 177 l 179 | 10 178 l 180 | 10 179 l 181 | 10 180 l 182 | 10 181 l 183 | 10 182 l 184 | 10 183 l 185 | 10 184 l 186 | 10 185 l 187 | 10 186 l 188 | 10 187 l 189 | 10 188 l 190 | 10 189 l 191 | 10 190 l 192 | 10 191 l 193 | 10 192 l 194 | 10 193 l 195 | 10 194 l 196 | 10 195 l 197 | 10 196 l 198 | 10 197 l 199 | 10 198 l 200 | 10 199 l 201 | 10 200 l 202 | 10 201 l 203 | 10 202 l 204 | 10 203 l 205 | 10 204 l 206 | 10 205 l 207 | 10 206 l 208 | 10 207 l 209 | 10 208 l 210 | 10 209 l 211 | 10 210 l 212 | 10 211 l 213 | 10 212 l 214 | 10 213 l 215 | 10 214 l 216 | 10 215 l 217 | 10 216 l 218 | 10 217 l 219 | 10 218 l 220 | 10 219 l 221 | 10 220 l 222 | 10 221 l 223 | 10 222 l 224 | 10 223 l 225 | 10 224 l 226 | 10 225 l 227 | 10 226 l 228 | 10 227 l 229 | 10 228 l 230 | 10 229 l 231 | 10 230 l 232 | 10 
231 l 233 | 10 232 l [… lines 234–1199 of this split file are omitted here: they repeat the same pattern, "10 <frame index> l", one left-image frame per line, for frames 233 through 1198 …] 1200 | 10 1199 l 1201 | 
-------------------------------------------------------------------------------- /test_simple.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | import os 4 | import sys 5 | import time 6 | import glob 7 | import argparse 8 | import numpy as np 9 | import PIL.Image as pil 10 | import matplotlib as mpl 11 | import matplotlib.cm as cm 12 | 13 | import torch 14 | from torchvision import transforms, datasets 15 | 16 | import networks 17 | from layers import disp_to_depth 18 | 19 | 20 | def parse_args(): 21 | parser = argparse.ArgumentParser( 22 | description='Simple testing function for Monodepthv2 models.') 23 | 24 | parser.add_argument('--image_path', type=str, 25 | help='path to a test image or folder of images', required=True) 26 | parser.add_argument('--model_name', type=str, 27 | help='name of a pretrained model to use') 28 | parser.add_argument('--ext', type=str, 29 | help='image extension to search for in folder', default="jpg") 30 | parser.add_argument("--no_cuda", 31 | help='if set, disables CUDA', 32 | action='store_true') 33 | parser.add_argument("--no_ddv", 34 | help='if set, disables ddv', 35 | action='store_true') 36 | parser.add_argument("--no_self_attention", 37 | help='if set, disables self-attn', 38 | action='store_true') 39 | parser.add_argument("--discretization", 40 | type=str, 41 | help="disparity discretization method", 42 | default="UD", 43 | choices=["SID", "UD"]) 44 | 45 | return parser.parse_args() 46 | 47 | 48 | def print_size_of_model(model): 49 | """ Print the size of the model. 50 | Args: 51 | model: model whose size needs to be determined 52 | """ 53 | torch.save(model.state_dict(), "temp.p") 54 | print('Size of the model(MB):', os.path.getsize("temp.p") / 1e6) 55 | os.remove('temp.p') 56 | 57 | 58 | def test_simple(args): 59 | """Function to predict for a single image or folder of images 60 | """ 61 | assert args.model_name is not None, \ 62 | "You must specify the --model_name parameter."
63 | 64 | if torch.cuda.is_available() and not args.no_cuda: 65 | device = torch.device("cuda") 66 | else: 67 | device = torch.device("cpu") 68 | 69 | print("-> Loading model from ", args.model_name) 70 | encoder_path = os.path.join(args.model_name, "encoder.pth") 71 | depth_decoder_path = os.path.join(args.model_name, "depth.pth") 72 | 73 | # LOADING PRETRAINED MODEL 74 | if args.no_ddv: 75 | encoder = networks.get_resnet101_asp_oc_dsn( 76 | 2048, args.no_self_attention, False) 77 | depth_decoder = networks.DepthDecoder( 78 | encoder.num_ch_enc) 79 | else: 80 | encoder = networks.get_resnet101_asp_oc_dsn( 81 | 128, args.no_self_attention, False) 82 | depth_decoder = networks.MSDepthDecoder( 83 | encoder.num_ch_enc, discretization=args.discretization) 84 | 85 | print(" Loading pretrained encoder") 86 | loaded_dict_enc = torch.load(encoder_path, map_location=device) 87 | 88 | # extract the height and width of image that this model was trained with 89 | feed_height = loaded_dict_enc['height'] 90 | feed_width = loaded_dict_enc['width'] 91 | filtered_dict_enc = {k: v for k, v in loaded_dict_enc.items() if k in encoder.state_dict()} 92 | encoder.load_state_dict(filtered_dict_enc) 93 | encoder.to(device) 94 | encoder.eval() 95 | 96 | print(" Loading pretrained decoder") 97 | loaded_dict = torch.load(depth_decoder_path, map_location=device) 98 | depth_decoder.load_state_dict(loaded_dict) 99 | 100 | print_size_of_model(encoder) 101 | print_size_of_model(depth_decoder) 102 | 103 | depth_decoder.to(device) 104 | depth_decoder.eval() 105 | 106 | # FINDING INPUT IMAGES 107 | if os.path.isfile(args.image_path): 108 | # Only testing on a single image 109 | paths = [args.image_path] 110 | output_directory = os.path.dirname(args.image_path) 111 | elif os.path.isdir(args.image_path): 112 | # Searching folder for images 113 | paths = glob.glob(os.path.join(args.image_path, '*.{}'.format(args.ext))) 114 | output_directory = args.image_path 115 | else: 116 | raise Exception("Can not find args.image_path: {}".format(args.image_path)) 117 | 118 | print("-> Predicting on {:d} test images".format(len(paths))) 119 | 120 | # PREDICTING ON EACH IMAGE IN TURN 121 | timings = list() 122 | with torch.no_grad(): 123 | for idx, image_path in enumerate(paths): 124 | 125 | if image_path.endswith("_disp.jpg"): 126 | # don't try to predict disparity for a disparity image! 
127 | continue 128 | 129 | # Load image and preprocess 130 | input_image = pil.open(image_path).convert('RGB') 131 | original_width, original_height = input_image.size 132 | input_image = input_image.resize((feed_width, feed_height), pil.LANCZOS) 133 | input_image = transforms.ToTensor()(input_image).unsqueeze(0) 134 | 135 | # PREDICTION 136 | input_image = input_image.to(device) 137 | 138 | st = time.time() 139 | features = encoder(input_image) 140 | if args.no_ddv: 141 | outputs = depth_decoder(features) 142 | else: 143 | all_features = {} 144 | all_features['conv3'] = features[0] 145 | all_features['layer1'] = features[1] 146 | all_features['output'] = features[-1] 147 | outputs = depth_decoder(all_features) 148 | et = time.time() 149 | print('Elapsed time = {:0.4f} ms'.format((et - st) * 1000)) 150 | timings.append((et - st) * 1000) 151 | 152 | disp = outputs[("disp", 0)] 153 | disp_resized = torch.nn.functional.interpolate( 154 | disp, (original_height, original_width), mode="bilinear", align_corners=False) 155 | 156 | # Saving numpy file 157 | output_name = os.path.splitext(os.path.basename(image_path))[0] 158 | name_dest_npy = os.path.join(output_directory, "{}_disp.npy".format(output_name)) 159 | scaled_disp, _ = disp_to_depth(disp, 0.1, 100) 160 | np.save(name_dest_npy, scaled_disp.cpu().numpy()) 161 | 162 | # Saving colormapped depth image 163 | disp_resized_np = disp_resized.squeeze().cpu().numpy() 164 | vmax = np.percentile(disp_resized_np, 95) 165 | normalizer = mpl.colors.Normalize(vmin=disp_resized_np.min(), vmax=vmax) 166 | mapper = cm.ScalarMappable(norm=normalizer, cmap='magma') 167 | colormapped_im = (mapper.to_rgba(disp_resized_np)[:, :, :3] * 255).astype(np.uint8) 168 | im = pil.fromarray(colormapped_im) 169 | 170 | name_dest_im = os.path.join(output_directory, "{}_disp.jpeg".format(output_name)) 171 | im.save(name_dest_im) 172 | 173 | print(" Processed {:d} of {:d} images - saved prediction to {}".format( 174 | idx + 1, len(paths), name_dest_im)) 175 | 176 | print('Mean time elapsed: {:0.4f}'.format(np.mean(timings[11:]))) 177 | print('Std time elapsed: {:0.4f}'.format(np.std(timings[11:]))) 178 | print('-> Done!') 179 | 180 | 181 | if __name__ == '__main__': 182 | args = parse_args() 183 | test_simple(args) 184 | -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | 3 | from trainer import Trainer 4 | from options import MonodepthOptions 5 | 6 | options = MonodepthOptions() 7 | opts = options.parse() 8 | 9 | 10 | if __name__ == "__main__": 11 | trainer = Trainer(opts) 12 | trainer.train() 13 | print('Done') 14 | 15 | -------------------------------------------------------------------------------- /utils.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function 2 | import os 3 | import hashlib 4 | import zipfile 5 | from six.moves import urllib 6 | 7 | 8 | def readlines(filename): 9 | """Read all the lines in a text file and return as a list 10 | """ 11 | with open(filename, 'r') as f: 12 | lines = f.read().splitlines() 13 | return lines 14 | 15 | 16 | def normalize_image(x): 17 | """Rescale image pixels to span range [0, 1] 18 | """ 19 | ma = float(x.max().cpu().data) 20 | mi = float(x.min().cpu().data) 21 | d = ma - mi if ma != mi else 1e5 22 | return (x - mi) / d 23 | 24 | 25 | def 
sec_to_hm(t): 26 | """Convert time in seconds to time in hours, minutes and seconds 27 | e.g. 10239 -> (2, 50, 39) 28 | """ 29 | t = int(t) 30 | s = t % 60 31 | t //= 60 32 | m = t % 60 33 | t //= 60 34 | return t, m, s 35 | 36 | 37 | def sec_to_hm_str(t): 38 | """Convert time in seconds to a nice string 39 | e.g. 10239 -> '02h50m39s' 40 | """ 41 | h, m, s = sec_to_hm(t) 42 | return "{:02d}h{:02d}m{:02d}s".format(h, m, s) 43 | 44 | 45 | def download_model_if_doesnt_exist(model_name): 46 | """If pretrained kitti model doesn't exist, download and unzip it 47 | """ 48 | # values are tuples of (<download url>, <md5 checksum>) 49 | download_paths = { 50 | "mono_640x192": 51 | ("https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono_640x192.zip", 52 | "a964b8356e08a02d009609d9e3928f7c"), 53 | "stereo_640x192": 54 | ("https://storage.googleapis.com/niantic-lon-static/research/monodepth2/stereo_640x192.zip", 55 | "3dfb76bcff0786e4ec07ac00f658dd07"), 56 | "mono+stereo_640x192": 57 | ("https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono%2Bstereo_640x192.zip", 58 | "c024d69012485ed05d7eaa9617a96b81"), 59 | "mono_no_pt_640x192": 60 | ("https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono_no_pt_640x192.zip", 61 | "9c2f071e35027c895a4728358ffc913a"), 62 | "stereo_no_pt_640x192": 63 | ("https://storage.googleapis.com/niantic-lon-static/research/monodepth2/stereo_no_pt_640x192.zip", 64 | "41ec2de112905f85541ac33a854742d1"), 65 | "mono+stereo_no_pt_640x192": 66 | ("https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono%2Bstereo_no_pt_640x192.zip", 67 | "46c3b824f541d143a45c37df65fbab0a"), 68 | "mono_1024x320": 69 | ("https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono_1024x320.zip", 70 | "0ab0766efdfeea89a0d9ea8ba90e1e63"), 71 | "stereo_1024x320": 72 | ("https://storage.googleapis.com/niantic-lon-static/research/monodepth2/stereo_1024x320.zip", 73 | "afc2f2126d70cf3fdf26b550898b501a"), 74 | "mono+stereo_1024x320": 75 | ("https://storage.googleapis.com/niantic-lon-static/research/monodepth2/mono%2Bstereo_1024x320.zip", 76 | "cdc5fc9b23513c07d5b19235d9ef08f7"), 77 | } 78 | 79 | if not os.path.exists("models"): 80 | os.makedirs("models") 81 | 82 | model_path = os.path.join("models", model_name) 83 | 84 | def check_file_matches_md5(checksum, fpath): 85 | if not os.path.exists(fpath): 86 | return False 87 | with open(fpath, 'rb') as f: 88 | current_md5checksum = hashlib.md5(f.read()).hexdigest() 89 | return current_md5checksum == checksum 90 | 91 | # see if we have the model already downloaded... 92 | if not os.path.exists(os.path.join(model_path, "encoder.pth")): 93 | 94 | model_url, required_md5checksum = download_paths[model_name] 95 | 96 | if not check_file_matches_md5(required_md5checksum, model_path + ".zip"): 97 | print("-> Downloading pretrained model to {}".format(model_path + ".zip")) 98 | urllib.request.urlretrieve(model_url, model_path + ".zip") 99 | 100 | if not check_file_matches_md5(required_md5checksum, model_path + ".zip"): 101 | print(" Failed to download a file which matches the checksum - quitting") 102 | quit() 103 | 104 | print(" Unzipping model...") 105 | with zipfile.ZipFile(model_path + ".zip", 'r') as f: 106 | f.extractall(model_path) 107 | 108 | print(" Model unzipped to {}".format(model_path)) 109 | --------------------------------------------------------------------------------
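
Usage note (not part of the repository): `download_model_if_doesnt_exist` can be called on its own to fetch one of the Monodepth2 reference checkpoints listed in `download_paths`. The snippet below is a minimal sketch; it assumes the working directory is the repository root, the chosen model name is just an example, and the downloaded weights belong to the Monodepth2 baseline rather than to the self-attention model trained in this project.

    # Minimal sketch: download, md5-verify and unzip one Monodepth2 checkpoint.
    # Assumes it is run from the repository root so that utils.py is importable
    # and the models/ folder is created alongside it.
    import os
    from utils import download_model_if_doesnt_exist

    model_name = "mono_640x192"                # any key of download_paths above
    download_model_if_doesnt_exist(model_name)

    # The archive is extracted to models/<model_name>/, which is expected to
    # contain encoder.pth among the shipped weight files.
    print(os.path.join("models", model_name, "encoder.pth"))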