├── README.md
├── create_proxy_stereo.py
├── datasets
│   ├── booster
│   │   └── train_stereo.txt
│   ├── dataloader.py
│   ├── msd
│   │   ├── test.txt
│   │   ├── train.txt
│   │   ├── virtual_depth_dpt_large.txt
│   │   └── virtual_depth_midas_v21.txt
│   └── trans10k
│       ├── test.txt
│       ├── train.txt
│       ├── validation.txt
│       ├── virtual_depth_dpt_large.txt
│       └── virtual_depth_midas_v21.txt
├── evaluate_mono.py
├── finetune.py
├── images
│   ├── framework_mono.png
│   └── qualitatives.png
├── loss.py
├── midas
│   ├── base_model.py
│   ├── blocks.py
│   ├── dpt_depth.py
│   ├── midas_net.py
│   ├── midas_net_custom.py
│   ├── transforms.py
│   └── vit.py
├── requirements.txt
├── run.py
├── scripts
│   ├── finetune.sh
│   ├── generate_virtual_depth.sh
│   ├── table2.sh
│   └── table3.sh
└── utils.py
/README.md: -------------------------------------------------------------------------------- 1 | 2 |

Learning Depth Estimation for Transparent and Mirror Surfaces (ICCV 2023)

3 | 4 | 5 |
6 | 7 | :rotating_light: This repository contains download links to our dataset, code snippets, and trained deep models of our work "**Learning Depth Estimation for Transparent and Mirror Surfaces**", [ICCV 2023](https://iccv2023.thecvf.com/) 8 | 9 | by [Alex Costanzino*](https://www.unibo.it/sitoweb/alex.costanzino), [Pierluigi Zama Ramirez*](https://pierlui92.github.io/), [Matteo Poggi*](https://mattpoggi.github.io/), [Fabio Tosi](https://fabiotosi92.github.io/), [Stefano Mattoccia](https://www.unibo.it/sitoweb/stefano.mattoccia), and [Luigi Di Stefano](https://www.unibo.it/sitoweb/luigi.distefano). \* _Equal Contribution_ 10 | 11 | University of Bologna 12 | 13 | 14 |
15 | 16 | 17 |

18 | 19 | [Project Page](https://cvlab-unibo.github.io/Depth4ToM/) | [Paper](https://arxiv.org/abs/2307.15052) 20 |

21 | 22 | 23 | ## :bookmark_tabs: Table of Contents 24 | 25 | 1. [Introduction](#clapper-introduction) 26 | 2. [Dataset](#file_cabinet-dataset) 27 | - [Download](#arrow_down-get-your-hands-on-the-data) 28 | 3. [Pretrained Models](#inbox_tray-pretrained-models) 29 | 4. [Code](#memo-code) 30 | 5. [Qualitative Results](#art-qualitative-results) 31 | 6. [Contacts](#envelope-contacts) 32 | 33 |
34 | 35 | ## :clapper: Introduction 36 | Inferring the depth of transparent or mirror (ToM) surfaces represents a hard challenge for sensors, algorithms, and deep networks alike. We propose a simple pipeline for learning to estimate depth properly for such surfaces with neural networks, without requiring any ground-truth annotation. We unveil how to obtain reliable pseudo labels by in-painting ToM objects in images and processing them with a monocular depth estimation model. These labels can be used to fine-tune existing monocular or stereo networks, to let them learn how to deal with ToM surfaces. Experimental results on the Booster dataset show the dramatic improvements enabled by our remarkably simple proposal. 37 | 38 |
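In code, the idea boils down to a few lines. The sketch below is illustrative only: `predict_depth` is a stand-in for a Base monocular network (e.g. MiDaS or DPT), the image and mask are synthetic, and all names are hypothetical; the actual implementation lives in `run.py`.

```python
import numpy as np

def colorize_tom(image, mask, rng):
    # "In-paint" ToM objects by flooding the masked region with a random flat color.
    color = rng.integers(0, 256, size=3, dtype=np.uint8)
    out = image.copy()
    out[mask > 0] = color
    return out

def predict_depth(image):
    # Placeholder for a Base monocular network (MiDaS / DPT); NOT a real model.
    return image.mean(axis=-1).astype(np.float32)

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)  # input RGB frame
mask = np.zeros((480, 640), dtype=np.uint8)
mask[100:300, 200:400] = 1  # ToM region, e.g. from Trans10K or MSD segmentation masks

# Run the Base network on several random colorizations (cf. the --it option of run.py)
# and combine the outputs (here simply averaged): the result is the "virtual depth".
virtual_depth = np.mean(
    [predict_depth(colorize_tom(image, mask, rng)) for _ in range(3)], axis=0
)
print(virtual_depth.shape)  # (480, 640)
```

In the full pipeline, the virtual depths obtained this way serve as the pseudo labels used to finetune the networks (see `finetune.py` and `scripts/generate_virtual_depth.sh`).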

39 | 40 |

41 | 42 | Alt text 43 | 44 | :fountain_pen: If you find this code useful in your research, please cite: 45 | 46 | ```bibtex 47 | @inproceedings{costanzino2023iccv, 48 | title = {Learning Depth Estimation for Transparent and Mirror Surfaces}, 49 | author = {Costanzino, Alex and Zama Ramirez, Pierluigi and Poggi, Matteo and Tosi, Fabio and Mattoccia, Stefano and Di Stefano, Luigi}, 50 | booktitle = {The IEEE International Conference on Computer Vision}, 51 | note = {ICCV}, 52 | year = {2023}, 53 | } 54 | ``` 55 | 56 | ## :file_cabinet: Dataset 57 | 58 | In our experiments, we employed two datasets featuring transparent or mirror objects: [Trans10K](https://xieenze.github.io/projects/TransLAB/TransLAB.html) and [MSD](https://mhaiyang.github.io/ICCV2019_MirrorNet/index). With our in-painting technique we obtain virtual depth maps to finetune monocular networks. For the sake of reproducibility, we make available Trans10K and MSD together with the proxy labels used to finetune our models. 59 | 60 | ### :arrow_down: Get Your Hands on the Data 61 | Trans10K and MSD with Virtual Depths. [[Download]](https://1drv.ms/u/s!AgV49D1Z6rmGgZAz2I7tMepfdVrZYQ?e=jbuaJB) 62 | 63 | We also employed the Booster Dataset in our experiments. [[Download]](https://cvlab-unibo.github.io/booster-web/) 64 | 65 | ## :inbox_tray: Pretrained Models 66 | 67 | Here, you can download the weights of the **MiDaS** and **DPT** architectures employed to obtain the results of Table 2 and Table 3 of our paper. If you just need the best model, use `Table 2/Ft. Virtual Depth/dpt_large_final.pt`. 68 | 69 | To use these weights, please follow these steps: 70 | 71 | 1. Create a folder named `weights` in the project directory. 72 | 2. Download the weights [[Download]](https://1drv.ms/u/s!AgV49D1Z6rmGgZAyTbFLjjTMdgsE_A?e=1xcf4y) 73 | 3. Copy the downloaded weights into the `weights` folder. 74 | 75 | ## :memo: Code 76 | 77 |
78 | 79 | **Warning**: 80 | - Please be aware that we will not be releasing the training code for deep stereo models. We provide only our algorithm to obtain proxy depth labels by merging monocular and stereo predictions. 81 | - The code uses `wandb` to log results during training, so make sure you have a wandb account. If you prefer not to use `wandb`, comment out the wandb logging lines in `finetune.py`. 82 | 83 |
84 | 85 | 86 | ### :hammer_and_wrench: Setup Instructions 87 | 88 | **Dependencies**: Ensure that you have installed all the necessary dependencies. The list of dependencies can be found in the `./requirements.txt` file. 89 | 90 | 91 | ### :rocket: Inference Monocular Networks 92 | 93 | The `run.py` script runs inference with monocular networks. It can be used to predict monocular depth maps with pretrained networks, or to apply our in-painting strategy to Base networks to obtain Virtual Depths. 94 | 95 | You can specify the following options: 96 | - `--input_path`: Path to the root directory of the dataset. E.g., _Booster/balanced/train_ if you want to test the model on the training set of Booster. 97 | - `--dataset_txt`: The list of dataset samples. Each line contains the path of an image, relative to `input_path`. You can find some examples in the folder _datasets/_. E.g., to run on the training set of Booster use *datasets/booster/train_stereo.txt* 98 | - `--mask_path`: Optional path to the folder containing masks. Each mask should have the same relative path as the corresponding image. When this path is specified, masks are applied to colorize ToM objects. 99 | - `--cls2mask`: IDs referring to ToM objects in masks. 100 | - `--it`: Number of inferences for each image. Used when in-painting with several random colors. 101 | - `--output_path`: Output directory. 102 | - `--output_list`: Save the prediction paths in a txt file. 103 | - `--save_full_res`: Save the predictions at the input resolution. If not specified, predictions are saved at the model output resolution. 104 | - `--model_weights`: Path to the trained weights of the model. If not specified, the Base network weights are loaded from default paths. 105 | - `--model_type`: Model type. Either `dpt_large` or `midas_v21`. 106 | 107 | You can reproduce the results of Table 2 and Table 3 of the paper by running `scripts/table2.sh` and `scripts/table3.sh`. 108 | 109 | If you haven't downloaded the pretrained models yet, you can find the download links in the **Pretrained Models** section above. 110 | 111 | ### :rocket: Train Monocular Networks 112 | 113 | To finetune networks, refer to the example in `scripts/finetune.sh` 114 | 115 | ### :rocket: Monocular Virtual Depth Generation 116 | 117 | To generate virtual depths from monocular depth networks using our in-painting strategy, refer to the example in `scripts/generate_virtual_depth.sh` 118 | 119 | ### :rocket: Stereo Proxy Depth Generation 120 | 121 | To generate the proxy depth maps used to finetune stereo networks with our merging strategy, use `create_proxy_stereo.py`. 122 | 123 | As explained above, we will not release the code for finetuning stereo networks. However, our implementation was based on the official code of [RAFT-Stereo](https://github.com/princeton-vl/RAFT-Stereo) and [CREStereo](https://github.com/megvii-research/CREStereo). 124 | 125 | ## :art: Qualitative Results 126 | 127 | In this section, we present illustrative examples that demonstrate the effectiveness of our proposal. 128 | 129 |

130 | GIF 131 |

132 | 133 | ## :envelope: Contacts 134 | 135 | For questions, please send an email to alex.costanzino@unibo.it, pierluigi.zama@unibo.it, m.poggi@unibo.it, or fabio.tosi5@unibo.it 136 | 137 | ## :pray: Acknowledgements 138 | 139 | We would like to extend our sincere appreciation to the authors of the following projects for making their code available, which we have utilized in our work: 140 | 141 | - [MiDaS](https://github.com/isl-org/MiDaS), [RAFT-Stereo](https://github.com/princeton-vl/RAFT-Stereo) and [CREStereo](https://github.com/megvii-research/CREStereo), whose code has been instrumental in our experiments. 142 | 143 | We deeply appreciate the authors of the competing research papers for their helpful responses and for providing their model weights, which greatly aided accurate comparisons. -------------------------------------------------------------------------------- /create_proxy_stereo.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import cv2 4 | import matplotlib.pyplot as plt 5 | import argparse 6 | 7 | 8 | # This script assumes the same subfolder structure for each root (mono_root, stereo_root, mask_root). 9 | 10 | parser = argparse.ArgumentParser() 11 | parser.add_argument('--mono_root', help="folder with mono predictions") 12 | parser.add_argument('--stereo_root', help="folder with stereo predictions") 13 | parser.add_argument('--stereo_ext', default=".npy", help="stereo extension.") 14 | parser.add_argument('--scale_factor_16bit_stereo', default=64, type=float, help="16bit scale factor used during saving") 15 | parser.add_argument('--mask_root', default="", help="folder with semantic masks") 16 | parser.add_argument('--output_root', default="results_merge", help="output folder for the merged proxy depth maps") 17 | parser.add_argument('--debug', action="store_true") 18 | args = parser.parse_args() 19 | 20 | debug=args.debug 21 | stereo_root=args.stereo_root 22 | mono_root=args.mono_root 23 | mask_root=args.mask_root 24 | output_root=args.output_root 25 | scale_factor_16bit_stereo = args.scale_factor_16bit_stereo 26 | stereo_ext = args.stereo_ext 27 | 28 | def compute_scale_and_shift(prediction, target, mask): 29 | # system matrix: A = [[a_00, a_01], [a_10, a_11]] 30 | a_00 = np.sum(mask * prediction * prediction, axis=(1, 2)) 31 | a_01 = np.sum(mask * prediction, axis=(1, 2)) 32 | a_11 = np.sum(mask, axis=(1, 2)) 33 | 34 | # right hand side: b = [b_0, b_1] 35 | b_0 = np.sum(mask * prediction * target, axis=(1, 2)) 36 | b_1 = np.sum(mask * target, axis=(1, 2)) 37 | 38 | # solution: x = A^-1 . b = [[a_11, -a_01], [-a_10, a_00]] / (a_00 * a_11 - a_01 * a_10) . b 39 | x_0 = np.zeros_like(b_0) 40 | x_1 = np.zeros_like(b_1) 41 | 42 | det = a_00 * a_11 - a_01 * a_01 43 | # A needs to be a positive definite matrix.
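# x_0 is the per-image scale and x_1 the per-image shift that best align `prediction`
# to `target` in the least-squares sense over the pixels selected by `mask` (in the
# merging loop below, the monocular prediction is aligned to the stereo one using
# lambertian, i.e. non-ToM, pixels only). The 2x2 system is solved with Cramer's rule,
# so a solution is computed only where det > 0.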
44 | valid = det > 0 45 | 46 | x_0[valid] = (a_11[valid] * b_0[valid] - a_01[valid] * b_1[valid]) / det[valid] 47 | x_1[valid] = (-a_01[valid] * b_0[valid] + a_00[valid] * b_1[valid]) / det[valid] 48 | 49 | return x_0, x_1 50 | 51 | 52 | for root, dirs, files in os.walk(mono_root): 53 | for mono_path in files: 54 | if mono_path.endswith(".npy"): 55 | mono_path = os.path.join(root, mono_path) 56 | 57 | stereo_path = mono_path.replace(mono_root, stereo_root).replace("camera_00/", "") 58 | if "npy" in stereo_ext: 59 | stereo = np.load(stereo_path) 60 | elif "png" in stereo_ext: 61 | stereo_path = stereo_path.replace(".npy", ".png") 62 | stereo = cv2.imread(stereo_path, -1).astype(np.float32) / scale_factor_16bit_stereo 63 | 64 | mono = np.load(os.path.join(mono_root, mono_path)) 65 | mono = cv2.resize(mono, (stereo.shape[1], stereo.shape[0]), cv2.INTER_CUBIC) 66 | 67 | valid = (stereo > 0).astype(np.float32) 68 | mono[valid == 0] = 0 69 | 70 | mask_path = mono_path.replace(mono_root, mask_root).replace(".npy", ".png") 71 | mask = cv2.imread(mask_path, 0) 72 | mask_transparent = (mask * valid) > 0 73 | mask_lambertian = ((1 - mask) * valid) > 0 74 | 75 | mono = (mono - np.min(mono[valid > 0])) / (mono[valid > 0].max() - mono[valid > 0].min()) 76 | a, b = compute_scale_and_shift(np.expand_dims(mono, axis=0), np.expand_dims(stereo, axis=0), np.expand_dims(mask_lambertian.astype(np.float32), axis=0)) 77 | mono = mono * a + b 78 | 79 | merged = np.zeros(stereo.shape) 80 | merged[mask_transparent] = mono[mask_transparent] 81 | merged[mask_lambertian] = stereo[mask_lambertian] 82 | 83 | output_path = os.path.join(output_root, os.path.dirname(mono_path).replace(mono_root + "/", "")) 84 | basename = os.path.basename(mono_path) 85 | os.makedirs(output_path, exist_ok=True) 86 | 87 | if debug: 88 | plt.subplot(3,2,1) 89 | plt.title("mask_seg") 90 | plt.imshow(cv2.resize((mask*255).astype(np.uint8), None, fx=0.25, fy=0.25, interpolation=cv2.INTER_NEAREST)) 91 | plt.subplot(3,2,2) 92 | plt.title("mask_trasp") 93 | plt.imshow(cv2.resize(mask_transparent.astype(np.float32), None, fx=0.25, fy=0.25, interpolation=cv2.INTER_NEAREST)) 94 | plt.subplot(3,2,3) 95 | plt.title("mask_lamb") 96 | plt.imshow(cv2.resize(mask_lambertian.astype(np.float32), None, fx=0.25, fy=0.25, interpolation=cv2.INTER_NEAREST)) 97 | plt.subplot(3,2,4) 98 | plt.title("stereo") 99 | plt.imshow(cv2.resize(stereo, None, fx=0.25, fy=0.25), vmin=stereo.min(), vmax=stereo.max(), cmap="jet") 100 | plt.subplot(3,2,5) 101 | plt.title("mono") 102 | plt.imshow(cv2.resize(mono, None, fx=0.25, fy=0.25), vmin=stereo.min(), vmax=stereo.max(), cmap="jet") 103 | plt.subplot(3,2,6) 104 | plt.title("merged") 105 | plt.imshow(cv2.resize(merged, None, fx=0.25, fy=0.25), vmin=stereo.min(), vmax=stereo.max(), cmap="jet") 106 | plt.savefig(os.path.join(output_path, basename.replace(".npy", ".png"))) 107 | else: 108 | np.save(os.path.join(output_path, basename), merged) 109 | -------------------------------------------------------------------------------- /datasets/booster/train_stereo.txt: -------------------------------------------------------------------------------- 1 | Bathroom/camera_00/im0.png Bathroom/disp_00.npy Bathroom/calib_00-02.xml 2 | Bathroom/camera_00/im1.png Bathroom/disp_00.npy Bathroom/calib_00-02.xml 3 | Bathroom/camera_00/im2.png Bathroom/disp_00.npy Bathroom/calib_00-02.xml 4 | Bedroom/camera_00/im0.png Bedroom/disp_00.npy Bedroom/calib_00-02.xml 5 | Bedroom/camera_00/im1.png Bedroom/disp_00.npy Bedroom/calib_00-02.xml 6 | 
Bedroom/camera_00/im2.png Bedroom/disp_00.npy Bedroom/calib_00-02.xml 7 | Bottle/camera_00/im0.png Bottle/disp_00.npy Bottle/calib_00-02.xml 8 | Bottle/camera_00/im1.png Bottle/disp_00.npy Bottle/calib_00-02.xml 9 | Bottle1/camera_00/im0.png Bottle1/disp_00.npy Bottle1/calib_00-02.xml 10 | Bottle1/camera_00/im1.png Bottle1/disp_00.npy Bottle1/calib_00-02.xml 11 | Bottle1/camera_00/im2.png Bottle1/disp_00.npy Bottle1/calib_00-02.xml 12 | Bottle1/camera_00/im3.png Bottle1/disp_00.npy Bottle1/calib_00-02.xml 13 | BottledWater/camera_00/im0.png BottledWater/disp_00.npy BottledWater/calib_00-02.xml 14 | BottledWater/camera_00/im1.png BottledWater/disp_00.npy BottledWater/calib_00-02.xml 15 | BottledWater/camera_00/im2.png BottledWater/disp_00.npy BottledWater/calib_00-02.xml 16 | BottledWater/camera_00/im3.png BottledWater/disp_00.npy BottledWater/calib_00-02.xml 17 | BottledWater/camera_00/im4.png BottledWater/disp_00.npy BottledWater/calib_00-02.xml 18 | Bottles1/camera_00/im0.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml 19 | Bottles1/camera_00/im1.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml 20 | Bottles1/camera_00/im2.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml 21 | Bottles1/camera_00/im3.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml 22 | Bottles1/camera_00/im4.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml 23 | Bottles1/camera_00/im5.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml 24 | Bottles1/camera_00/im6.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml 25 | Bottles1/camera_00/im7.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml 26 | Bucket/camera_00/im0.png Bucket/disp_00.npy Bucket/calib_00-02.xml 27 | Bucket/camera_00/im1.png Bucket/disp_00.npy Bucket/calib_00-02.xml 28 | Bucket/camera_00/im2.png Bucket/disp_00.npy Bucket/calib_00-02.xml 29 | Bucket/camera_00/im3.png Bucket/disp_00.npy Bucket/calib_00-02.xml 30 | Bucket/camera_00/im4.png Bucket/disp_00.npy Bucket/calib_00-02.xml 31 | Bucket/camera_00/im5.png Bucket/disp_00.npy Bucket/calib_00-02.xml 32 | Bucket/camera_00/im6.png Bucket/disp_00.npy Bucket/calib_00-02.xml 33 | Canteen/camera_00/im0.png Canteen/disp_00.npy Canteen/calib_00-02.xml 34 | Canteen/camera_00/im1.png Canteen/disp_00.npy Canteen/calib_00-02.xml 35 | Canteen/camera_00/im2.png Canteen/disp_00.npy Canteen/calib_00-02.xml 36 | Canteen/camera_00/im3.png Canteen/disp_00.npy Canteen/calib_00-02.xml 37 | Canteen/camera_00/im4.png Canteen/disp_00.npy Canteen/calib_00-02.xml 38 | Canteen/camera_00/im5.png Canteen/disp_00.npy Canteen/calib_00-02.xml 39 | Canteen/camera_00/im6.png Canteen/disp_00.npy Canteen/calib_00-02.xml 40 | Canteen/camera_00/im7.png Canteen/disp_00.npy Canteen/calib_00-02.xml 41 | Canteen/camera_00/im8.png Canteen/disp_00.npy Canteen/calib_00-02.xml 42 | Canteen/camera_00/im9.png Canteen/disp_00.npy Canteen/calib_00-02.xml 43 | Case/camera_00/im0.png Case/disp_00.npy Case/calib_00-02.xml 44 | Case/camera_00/im10.png Case/disp_00.npy Case/calib_00-02.xml 45 | Case/camera_00/im1.png Case/disp_00.npy Case/calib_00-02.xml 46 | Case/camera_00/im2.png Case/disp_00.npy Case/calib_00-02.xml 47 | Case/camera_00/im3.png Case/disp_00.npy Case/calib_00-02.xml 48 | Case/camera_00/im4.png Case/disp_00.npy Case/calib_00-02.xml 49 | Case/camera_00/im5.png Case/disp_00.npy Case/calib_00-02.xml 50 | Case/camera_00/im6.png Case/disp_00.npy Case/calib_00-02.xml 51 | Case/camera_00/im7.png Case/disp_00.npy Case/calib_00-02.xml 52 | Case/camera_00/im8.png Case/disp_00.npy Case/calib_00-02.xml 53 | Case/camera_00/im9.png Case/disp_00.npy 
Case/calib_00-02.xml 54 | CashBox/camera_00/im0.png CashBox/disp_00.npy CashBox/calib_00-02.xml 55 | CashBox/camera_00/im1.png CashBox/disp_00.npy CashBox/calib_00-02.xml 56 | CashBox/camera_00/im2.png CashBox/disp_00.npy CashBox/calib_00-02.xml 57 | CashBox/camera_00/im3.png CashBox/disp_00.npy CashBox/calib_00-02.xml 58 | CashBox/camera_00/im4.png CashBox/disp_00.npy CashBox/calib_00-02.xml 59 | CashBox/camera_00/im5.png CashBox/disp_00.npy CashBox/calib_00-02.xml 60 | CashBox/camera_00/im6.png CashBox/disp_00.npy CashBox/calib_00-02.xml 61 | CashBox/camera_00/im7.png CashBox/disp_00.npy CashBox/calib_00-02.xml 62 | CoffeeMaker/camera_00/im0.png CoffeeMaker/disp_00.npy CoffeeMaker/calib_00-02.xml 63 | CoffeeMaker/camera_00/im1.png CoffeeMaker/disp_00.npy CoffeeMaker/calib_00-02.xml 64 | CoffeeMaker/camera_00/im2.png CoffeeMaker/disp_00.npy CoffeeMaker/calib_00-02.xml 65 | Cooker1/camera_00/im0.png Cooker1/disp_00.npy Cooker1/calib_00-02.xml 66 | Cooker1/camera_00/im1.png Cooker1/disp_00.npy Cooker1/calib_00-02.xml 67 | Cooker1/camera_00/im2.png Cooker1/disp_00.npy Cooker1/calib_00-02.xml 68 | Cooker1/camera_00/im3.png Cooker1/disp_00.npy Cooker1/calib_00-02.xml 69 | Cooker1/camera_00/im4.png Cooker1/disp_00.npy Cooker1/calib_00-02.xml 70 | Cooker1/camera_00/im5.png Cooker1/disp_00.npy Cooker1/calib_00-02.xml 71 | Cooker1/camera_00/im6.png Cooker1/disp_00.npy Cooker1/calib_00-02.xml 72 | Cosmetics/camera_00/im0.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml 73 | Cosmetics/camera_00/im1.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml 74 | Cosmetics/camera_00/im2.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml 75 | Cosmetics/camera_00/im3.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml 76 | Cosmetics/camera_00/im4.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml 77 | Cosmetics/camera_00/im5.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml 78 | Cosmetics/camera_00/im6.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml 79 | Cosmetics/camera_00/im7.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml 80 | Cosmetics/camera_00/im8.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml 81 | Cosmetics/camera_00/im9.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml 82 | DogHouse/camera_00/im0.png DogHouse/disp_00.npy DogHouse/calib_00-02.xml 83 | DogHouse/camera_00/im1.png DogHouse/disp_00.npy DogHouse/calib_00-02.xml 84 | DogHouse/camera_00/im2.png DogHouse/disp_00.npy DogHouse/calib_00-02.xml 85 | DogHouse/camera_00/im3.png DogHouse/disp_00.npy DogHouse/calib_00-02.xml 86 | Door/camera_00/im0.png Door/disp_00.npy Door/calib_00-02.xml 87 | Door/camera_00/im1.png Door/disp_00.npy Door/calib_00-02.xml 88 | Door/camera_00/im2.png Door/disp_00.npy Door/calib_00-02.xml 89 | Door/camera_00/im3.png Door/disp_00.npy Door/calib_00-02.xml 90 | Door/camera_00/im4.png Door/disp_00.npy Door/calib_00-02.xml 91 | Door/camera_00/im5.png Door/disp_00.npy Door/calib_00-02.xml 92 | Door/camera_00/im6.png Door/disp_00.npy Door/calib_00-02.xml 93 | ExtractorFan/camera_00/im0.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml 94 | ExtractorFan/camera_00/im1.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml 95 | ExtractorFan/camera_00/im2.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml 96 | ExtractorFan/camera_00/im3.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml 97 | ExtractorFan/camera_00/im4.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml 98 | ExtractorFan/camera_00/im5.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml 99 | 
ExtractorFan/camera_00/im6.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml 100 | ExtractorFan/camera_00/im7.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml 101 | ExtractorFan/camera_00/im8.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml 102 | ExtractorFan/camera_00/im9.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml 103 | Fridge/camera_00/im0.png Fridge/disp_00.npy Fridge/calib_00-02.xml 104 | Fridge/camera_00/im1.png Fridge/disp_00.npy Fridge/calib_00-02.xml 105 | Fridge/camera_00/im2.png Fridge/disp_00.npy Fridge/calib_00-02.xml 106 | Lunch/camera_00/im0.png Lunch/disp_00.npy Lunch/calib_00-02.xml 107 | Microwave/camera_00/im0.png Microwave/disp_00.npy Microwave/calib_00-02.xml 108 | Microwave/camera_00/im1.png Microwave/disp_00.npy Microwave/calib_00-02.xml 109 | Microwave/camera_00/im2.png Microwave/disp_00.npy Microwave/calib_00-02.xml 110 | Microwave/camera_00/im3.png Microwave/disp_00.npy Microwave/calib_00-02.xml 111 | Microwave/camera_00/im4.png Microwave/disp_00.npy Microwave/calib_00-02.xml 112 | Microwave/camera_00/im5.png Microwave/disp_00.npy Microwave/calib_00-02.xml 113 | Microwave/camera_00/im6.png Microwave/disp_00.npy Microwave/calib_00-02.xml 114 | Mirror/camera_00/im0.png Mirror/disp_00.npy Mirror/calib_00-02.xml 115 | Mirror/camera_00/im1.png Mirror/disp_00.npy Mirror/calib_00-02.xml 116 | Moka/camera_00/im0.png Moka/disp_00.npy Moka/calib_00-02.xml 117 | Moka/camera_00/im1.png Moka/disp_00.npy Moka/calib_00-02.xml 118 | Moka/camera_00/im2.png Moka/disp_00.npy Moka/calib_00-02.xml 119 | Moka/camera_00/im3.png Moka/disp_00.npy Moka/calib_00-02.xml 120 | Moka/camera_00/im4.png Moka/disp_00.npy Moka/calib_00-02.xml 121 | Moka1/camera_00/im0.png Moka1/disp_00.npy Moka1/calib_00-02.xml 122 | Moka1/camera_00/im1.png Moka1/disp_00.npy Moka1/calib_00-02.xml 123 | Moka1/camera_00/im2.png Moka1/disp_00.npy Moka1/calib_00-02.xml 124 | Moka1/camera_00/im3.png Moka1/disp_00.npy Moka1/calib_00-02.xml 125 | Moka1/camera_00/im4.png Moka1/disp_00.npy Moka1/calib_00-02.xml 126 | Moka1/camera_00/im5.png Moka1/disp_00.npy Moka1/calib_00-02.xml 127 | Moka1/camera_00/im6.png Moka1/disp_00.npy Moka1/calib_00-02.xml 128 | Moka1/camera_00/im7.png Moka1/disp_00.npy Moka1/calib_00-02.xml 129 | Moka1/camera_00/im8.png Moka1/disp_00.npy Moka1/calib_00-02.xml 130 | Moka1/camera_00/im9.png Moka1/disp_00.npy Moka1/calib_00-02.xml 131 | Motorcycle/camera_00/im0.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml 132 | Motorcycle/camera_00/im1.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml 133 | Motorcycle/camera_00/im2.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml 134 | Motorcycle/camera_00/im3.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml 135 | Motorcycle/camera_00/im4.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml 136 | Motorcycle/camera_00/im5.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml 137 | Motorcycle/camera_00/im6.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml 138 | Motorcycle/camera_00/im7.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml 139 | Motorcycle/camera_00/im8.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml 140 | Mouthwash/camera_00/im0.png Mouthwash/disp_00.npy Mouthwash/calib_00-02.xml 141 | Mouthwash/camera_00/im1.png Mouthwash/disp_00.npy Mouthwash/calib_00-02.xml 142 | Mouthwash/camera_00/im2.png Mouthwash/disp_00.npy Mouthwash/calib_00-02.xml 143 | Mouthwash/camera_00/im3.png Mouthwash/disp_00.npy Mouthwash/calib_00-02.xml 144 | Mouthwash/camera_00/im4.png Mouthwash/disp_00.npy 
Mouthwash/calib_00-02.xml 145 | Mouthwash/camera_00/im5.png Mouthwash/disp_00.npy Mouthwash/calib_00-02.xml 146 | Mouthwash/camera_00/im6.png Mouthwash/disp_00.npy Mouthwash/calib_00-02.xml 147 | OilCan/camera_00/im0.png OilCan/disp_00.npy OilCan/calib_00-02.xml 148 | OilCan/camera_00/im1.png OilCan/disp_00.npy OilCan/calib_00-02.xml 149 | OilCan/camera_00/im2.png OilCan/disp_00.npy OilCan/calib_00-02.xml 150 | OilCan/camera_00/im3.png OilCan/disp_00.npy OilCan/calib_00-02.xml 151 | OilCan/camera_00/im4.png OilCan/disp_00.npy OilCan/calib_00-02.xml 152 | OilCan/camera_00/im5.png OilCan/disp_00.npy OilCan/calib_00-02.xml 153 | OilCan/camera_00/im6.png OilCan/disp_00.npy OilCan/calib_00-02.xml 154 | OilCan/camera_00/im7.png OilCan/disp_00.npy OilCan/calib_00-02.xml 155 | Oven1/camera_00/im0.png Oven1/disp_00.npy Oven1/calib_00-02.xml 156 | Oven1/camera_00/im1.png Oven1/disp_00.npy Oven1/calib_00-02.xml 157 | Oven1/camera_00/im2.png Oven1/disp_00.npy Oven1/calib_00-02.xml 158 | Oven1/camera_00/im3.png Oven1/disp_00.npy Oven1/calib_00-02.xml 159 | Oven1/camera_00/im4.png Oven1/disp_00.npy Oven1/calib_00-02.xml 160 | Oven1/camera_00/im5.png Oven1/disp_00.npy Oven1/calib_00-02.xml 161 | Oven2/camera_00/im0.png Oven2/disp_00.npy Oven2/calib_00-02.xml 162 | Oven2/camera_00/im1.png Oven2/disp_00.npy Oven2/calib_00-02.xml 163 | Oven2/camera_00/im2.png Oven2/disp_00.npy Oven2/calib_00-02.xml 164 | Oven2/camera_00/im3.png Oven2/disp_00.npy Oven2/calib_00-02.xml 165 | Oven2/camera_00/im4.png Oven2/disp_00.npy Oven2/calib_00-02.xml 166 | Oven2/camera_00/im5.png Oven2/disp_00.npy Oven2/calib_00-02.xml 167 | Pots1/camera_00/im0.png Pots1/disp_00.npy Pots1/calib_00-02.xml 168 | Pots1/camera_00/im1.png Pots1/disp_00.npy Pots1/calib_00-02.xml 169 | Pots1/camera_00/im2.png Pots1/disp_00.npy Pots1/calib_00-02.xml 170 | Pots1/camera_00/im3.png Pots1/disp_00.npy Pots1/calib_00-02.xml 171 | Pots1/camera_00/im4.png Pots1/disp_00.npy Pots1/calib_00-02.xml 172 | Pots1/camera_00/im5.png Pots1/disp_00.npy Pots1/calib_00-02.xml 173 | Shower/camera_00/im0.png Shower/disp_00.npy Shower/calib_00-02.xml 174 | Shower/camera_00/im1.png Shower/disp_00.npy Shower/calib_00-02.xml 175 | Shower/camera_00/im2.png Shower/disp_00.npy Shower/calib_00-02.xml 176 | Shower/camera_00/im3.png Shower/disp_00.npy Shower/calib_00-02.xml 177 | Sink/camera_00/im0.png Sink/disp_00.npy Sink/calib_00-02.xml 178 | Sink/camera_00/im1.png Sink/disp_00.npy Sink/calib_00-02.xml 179 | Sink/camera_00/im2.png Sink/disp_00.npy Sink/calib_00-02.xml 180 | Sink/camera_00/im3.png Sink/disp_00.npy Sink/calib_00-02.xml 181 | Sink/camera_00/im4.png Sink/disp_00.npy Sink/calib_00-02.xml 182 | SoapDishes/camera_00/im0.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml 183 | SoapDishes/camera_00/im1.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml 184 | SoapDishes/camera_00/im2.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml 185 | SoapDishes/camera_00/im3.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml 186 | SoapDishes/camera_00/im4.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml 187 | SoapDishes/camera_00/im5.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml 188 | SoapDishes/camera_00/im6.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml 189 | SoapDishes/camera_00/im7.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml 190 | Tablet/camera_00/im0.png Tablet/disp_00.npy Tablet/calib_00-02.xml 191 | Tablet/camera_00/im1.png Tablet/disp_00.npy Tablet/calib_00-02.xml 192 | Tablet/camera_00/im2.png Tablet/disp_00.npy 
Tablet/calib_00-02.xml 193 | Tablet/camera_00/im3.png Tablet/disp_00.npy Tablet/calib_00-02.xml 194 | Tablet/camera_00/im4.png Tablet/disp_00.npy Tablet/calib_00-02.xml 195 | Tablet/camera_00/im5.png Tablet/disp_00.npy Tablet/calib_00-02.xml 196 | Tablet/camera_00/im6.png Tablet/disp_00.npy Tablet/calib_00-02.xml 197 | Tablet/camera_00/im7.png Tablet/disp_00.npy Tablet/calib_00-02.xml 198 | Tablet/camera_00/im8.png Tablet/disp_00.npy Tablet/calib_00-02.xml 199 | Toilet/camera_00/im0.png Toilet/disp_00.npy Toilet/calib_00-02.xml 200 | Toilet/camera_00/im1.png Toilet/disp_00.npy Toilet/calib_00-02.xml 201 | Toilet/camera_00/im2.png Toilet/disp_00.npy Toilet/calib_00-02.xml 202 | Toilet/camera_00/im3.png Toilet/disp_00.npy Toilet/calib_00-02.xml 203 | Toilet/camera_00/im4.png Toilet/disp_00.npy Toilet/calib_00-02.xml 204 | TV/camera_00/im0.png TV/disp_00.npy TV/calib_00-02.xml 205 | TV/camera_00/im1.png TV/disp_00.npy TV/calib_00-02.xml 206 | TV/camera_00/im2.png TV/disp_00.npy TV/calib_00-02.xml 207 | TV/camera_00/im3.png TV/disp_00.npy TV/calib_00-02.xml 208 | TV1/camera_00/im0.png TV1/disp_00.npy TV1/calib_00-02.xml 209 | TV1/camera_00/im1.png TV1/disp_00.npy TV1/calib_00-02.xml 210 | TV1/camera_00/im2.png TV1/disp_00.npy TV1/calib_00-02.xml 211 | TV1/camera_00/im3.png TV1/disp_00.npy TV1/calib_00-02.xml 212 | TV2/camera_00/im0.png TV2/disp_00.npy TV2/calib_00-02.xml 213 | TV2/camera_00/im1.png TV2/disp_00.npy TV2/calib_00-02.xml 214 | Vodka/camera_00/im0.png Vodka/disp_00.npy Vodka/calib_00-02.xml 215 | Vodka/camera_00/im1.png Vodka/disp_00.npy Vodka/calib_00-02.xml 216 | Vodka/camera_00/im2.png Vodka/disp_00.npy Vodka/calib_00-02.xml 217 | Vodka/camera_00/im3.png Vodka/disp_00.npy Vodka/calib_00-02.xml 218 | Vodka/camera_00/im4.png Vodka/disp_00.npy Vodka/calib_00-02.xml 219 | Vodka/camera_00/im5.png Vodka/disp_00.npy Vodka/calib_00-02.xml 220 | Vodka/camera_00/im6.png Vodka/disp_00.npy Vodka/calib_00-02.xml 221 | Vodka/camera_00/im7.png Vodka/disp_00.npy Vodka/calib_00-02.xml 222 | Washer/camera_00/im0.png Washer/disp_00.npy Washer/calib_00-02.xml 223 | Washer/camera_00/im1.png Washer/disp_00.npy Washer/calib_00-02.xml 224 | Washer/camera_00/im2.png Washer/disp_00.npy Washer/calib_00-02.xml 225 | Washer/camera_00/im3.png Washer/disp_00.npy Washer/calib_00-02.xml 226 | Washer/camera_00/im4.png Washer/disp_00.npy Washer/calib_00-02.xml 227 | Washer/camera_00/im5.png Washer/disp_00.npy Washer/calib_00-02.xml 228 | Washer/camera_00/im6.png Washer/disp_00.npy Washer/calib_00-02.xml 229 | -------------------------------------------------------------------------------- /datasets/dataloader.py: -------------------------------------------------------------------------------- 1 | import os 2 | import random 3 | import numpy as np 4 | import torch 5 | import cv2 6 | from torch.utils.data import Dataset 7 | from utils import parse_dataset_txt, read_image 8 | 9 | ###-----[Booster]-----### 10 | rgb_str = "camera_00" 11 | disp_str = "disp_00.npy" 12 | mask_str = "mask_00.png" 13 | mask_c_str = "mask_cat.png" 14 | 15 | 16 | class Trans10KLoader(Dataset): 17 | def __init__(self, dataset_dir, dataset_txt, transform): 18 | self.dataset_dir = dataset_dir 19 | self.transform = transform 20 | dataset_dict = parse_dataset_txt(dataset_txt) 21 | 22 | self.images_names = dataset_dict["basenames"] 23 | self.ground_truth_names = dataset_dict["gt_paths"] 24 | 25 | 26 | def __len__(self): 27 | return len(self.images_names) 28 | 29 | def __getitem__(self, idx): 30 | rgb_path = os.path.join(self.dataset_dir, 
self.images_names[idx]) 31 | disp_path = os.path.join(self.dataset_dir, self.ground_truth_names[idx]) 32 | 33 | # Read all the images in the folder and stack them to form the batch. 34 | rgb_image = read_image(rgb_path) # [0,1] rgb hxwxc image 35 | ground_truth = np.load(disp_path).astype(np.float32) 36 | ground_truth = cv2.resize(ground_truth, (rgb_image.shape[1], rgb_image.shape[0]), cv2.INTER_NEAREST) 37 | 38 | transformed_dict = self.transform({"image": rgb_image, "depth": ground_truth}) 39 | rgb_image = transformed_dict["image"] 40 | ground_truth = transformed_dict["depth"] 41 | rgb_image = torch.from_numpy(rgb_image) 42 | ground_truth = torch.from_numpy(ground_truth) 43 | 44 | return rgb_image, ground_truth, rgb_path 45 | 46 | class MSDLoader(Trans10KLoader): 47 | pass -------------------------------------------------------------------------------- /datasets/msd/test.txt: -------------------------------------------------------------------------------- 1 | 5398_512x640.jpg _ 2 | 4986_640x512.jpg _ 3 | 4996_640x512.jpg _ 4 | 586_512x640.jpg _ 5 | 5162_512x640.jpg _ 6 | 5107_512x640.jpg _ 7 | 5291_512x640.jpg _ 8 | 5120_512x640.jpg _ 9 | 5354_512x640.jpg _ 10 | 1860_512x640.jpg _ 11 | 3792_640x512.jpg _ 12 | 1971_512x640.jpg _ 13 | 3316_512x640.jpg _ 14 | 5148_640x512.jpg _ 15 | 3309_512x640.jpg _ 16 | 5304_640x512.jpg _ 17 | 119_512x640.jpg _ 18 | 1830_640x512.jpg _ 19 | 5279_512x640.jpg _ 20 | 5248_640x512.jpg _ 21 | 3975_512x640.jpg _ 22 | 654_512x640.jpg _ 23 | 4345_512x640.jpg _ 24 | 1777_512x640.jpg _ 25 | 5310_640x512.jpg _ 26 | 3771_512x640.jpg _ 27 | 3423_512x640.jpg _ 28 | 1652_512x640.jpg _ 29 | 5025_512x640.jpg _ 30 | 5439_512x640.jpg _ 31 | 2119_512x640.jpg _ 32 | 2711_512x640.jpg _ 33 | 429_512x640.jpg _ 34 | 5119_512x640.jpg _ 35 | 2907_512x640.jpg _ 36 | 4969_512x640.jpg _ 37 | 5131_640x512.jpg _ 38 | 4989_512x640.jpg _ 39 | 5092_512x640.jpg _ 40 | 5496_512x640.jpg _ 41 | 3895_512x640.jpg _ 42 | 5246_512x640.jpg _ 43 | 1852_512x640.jpg _ 44 | 3398_512x640.jpg _ 45 | 5452_512x640.jpg _ 46 | 1734_512x640.jpg _ 47 | 5169_512x640.jpg _ 48 | 1680_512x640.jpg _ 49 | 3658_512x640.jpg _ 50 | 4340_512x640.jpg _ 51 | 1881_640x512.jpg _ 52 | 5341_512x640.jpg _ 53 | 1693_512x640.jpg _ 54 | 2130_512x640.jpg _ 55 | 4376_512x640.jpg _ 56 | 5250_512x640.jpg _ 57 | 2897_512x640.jpg _ 58 | 5237_512x640.jpg _ 59 | 2961_512x640.jpg _ 60 | 160_640x512.jpg _ 61 | 1678_512x640.jpg _ 62 | 2763_512x640.jpg _ 63 | 4325_512x640.jpg _ 64 | 4991_512x640.jpg _ 65 | 3328_512x640.jpg _ 66 | 1770_512x640.jpg _ 67 | 5202_512x640.jpg _ 68 | 5242_512x640.jpg _ 69 | 1774_512x640.jpg _ 70 | 4963_640x512.jpg _ 71 | 3242_512x640.jpg _ 72 | 1789_512x640.jpg _ 73 | 2744_512x640.jpg _ 74 | 4393_512x640.jpg _ 75 | 5375_512x640.jpg _ 76 | 1929_512x640.jpg _ 77 | 5265_640x512.jpg _ 78 | 5331_512x640.jpg _ 79 | 5027_512x640.jpg _ 80 | 1932_512x640.jpg _ 81 | 195_640x512.jpg _ 82 | 1858_512x640.jpg _ 83 | 2079_640x512.jpg _ 84 | 1828_512x640.jpg _ 85 | 5089_640x512.jpg _ 86 | 5183_512x640.jpg _ 87 | 1983_512x640.jpg _ 88 | 1050_512x640.jpg _ 89 | 4391_512x640.jpg _ 90 | 5097_640x512.jpg _ 91 | 5269_512x640.jpg _ 92 | 1749_512x640.jpg _ 93 | 5520_512x640.jpg _ 94 | 5509_512x640.jpg _ 95 | 5319_640x512.jpg _ 96 | 5069_512x640.jpg _ 97 | 5118_640x512.jpg _ 98 | 4946_512x640.jpg _ 99 | 675_512x640.jpg _ 100 | 5112_640x512.jpg _ 101 | 5355_512x640.jpg _ 102 | 5263_512x640.jpg _ 103 | 5026_640x512.jpg _ 104 | 3005_512x640.jpg _ 105 | 3071_512x640.jpg _ 106 | 5535_512x640.jpg _ 107 | 1907_640x512.jpg _ 108 | 5429_512x640.jpg 
_ 109 | 4971_512x640.jpg _ 110 | 1935_640x512.jpg _ 111 | 3701_512x640.jpg _ 112 | 3205_512x640.jpg _ 113 | 1759_512x640.jpg _ 114 | 1985_512x640.jpg _ 115 | 3229_512x640.jpg _ 116 | 3650_512x640.jpg _ 117 | 1778_512x640.jpg _ 118 | 3260_512x640.jpg _ 119 | 5204_640x512.jpg _ 120 | 4378_512x640.jpg _ 121 | 893_512x640.jpg _ 122 | 5011_640x512.jpg _ 123 | 5008_640x512.jpg _ 124 | 3160_512x640.jpg _ 125 | 5143_512x640.jpg _ 126 | 5176_512x640.jpg _ 127 | 1912_512x640.jpg _ 128 | 5363_512x640.jpg _ 129 | 5161_512x640.jpg _ 130 | 5144_512x640.jpg _ 131 | 111_512x640.jpg _ 132 | 5288_512x640.jpg _ 133 | 4379_512x640.jpg _ 134 | 5258_512x640.jpg _ 135 | 844_512x640.jpg _ 136 | 2103_512x640.jpg _ 137 | 4956_640x512.jpg _ 138 | 608_512x640.jpg _ 139 | 2886_512x640.jpg _ 140 | 4045_512x640.jpg _ 141 | 3311_512x640.jpg _ 142 | 1791_512x640.jpg _ 143 | 5067_512x640.jpg _ 144 | 5510_512x640.jpg _ 145 | 2139_640x512.jpg _ 146 | 3693_512x640.jpg _ 147 | 5140_640x512.jpg _ 148 | 5287_640x512.jpg _ 149 | 4385_512x640.jpg _ 150 | 3905_512x640.jpg _ 151 | 5353_512x640.jpg _ 152 | 1846_512x640.jpg _ 153 | 3240_512x640.jpg _ 154 | 5454_512x640.jpg _ 155 | 1768_512x640.jpg _ 156 | 3397_512x640.jpg _ 157 | 5098_512x640.jpg _ 158 | 5301_512x640.jpg _ 159 | 1959_512x640.jpg _ 160 | 4809_512x640.jpg _ 161 | 5396_512x640.jpg _ 162 | 5321_640x512.jpg _ 163 | 1956_512x640.jpg _ 164 | 3696_512x640.jpg _ 165 | 3691_512x640.jpg _ 166 | 1918_512x640.jpg _ 167 | 4982_512x640.jpg _ 168 | 5016_512x640.jpg _ 169 | 1668_512x640.jpg _ 170 | 5524_512x640.jpg _ 171 | 3425_512x640.jpg _ 172 | 1751_512x640.jpg _ 173 | 5028_512x640.jpg _ 174 | 4033_512x640.jpg _ 175 | 5000_640x512.jpg _ 176 | 3702_512x640.jpg _ 177 | 4970_512x640.jpg _ 178 | 4384_512x640.jpg _ 179 | 5224_512x640.jpg _ 180 | 3594_512x640.jpg _ 181 | 5146_512x640.jpg _ 182 | 5175_512x640.jpg _ 183 | 4978_512x640.jpg _ 184 | 3797_512x640.jpg _ 185 | 5328_512x640.jpg _ 186 | 5074_640x512.jpg _ 187 | 4095_512x640.jpg _ 188 | 949_512x640.jpg _ 189 | 5132_640x512.jpg _ 190 | 4386_512x640.jpg _ 191 | 22_512x640.jpg _ 192 | 3166_512x640.jpg _ 193 | 196_512x640.jpg _ 194 | 5268_512x640.jpg _ 195 | 1694_512x640.jpg _ 196 | 1726_512x640.jpg _ 197 | 5428_512x640.jpg _ 198 | 5315_512x640.jpg _ 199 | 5289_512x640.jpg _ 200 | 1954_512x640.jpg _ 201 | 5033_640x512.jpg _ 202 | 5356_512x640.jpg _ 203 | 1986_640x512.jpg _ 204 | 2087_512x640.jpg _ 205 | 3652_512x640.jpg _ 206 | 1762_512x640.jpg _ 207 | 4944_512x640.jpg _ 208 | 5362_512x640.jpg _ 209 | 3491_512x640.jpg _ 210 | 5282_640x512.jpg _ 211 | 3239_512x640.jpg _ 212 | 421_512x640.jpg _ 213 | 3428_512x640.jpg _ 214 | 5059_512x640.jpg _ 215 | 5061_512x640.jpg _ 216 | 3543_512x640.jpg _ 217 | 5003_640x512.jpg _ 218 | 5414_512x640.jpg _ 219 | 3453_512x640.jpg _ 220 | 5277_512x640.jpg _ 221 | 3396_512x640.jpg _ 222 | 5332_512x640.jpg _ 223 | 2094_512x640.jpg _ 224 | 5102_640x512.jpg _ 225 | 5049_512x640.jpg _ 226 | 5080_512x640.jpg _ 227 | 5503_640x512.jpg _ 228 | 3909_512x640.jpg _ 229 | 5472_512x640.jpg _ 230 | 4364_512x640.jpg _ 231 | 5506_512x640.jpg _ 232 | 5427_512x640.jpg _ 233 | 320_512x640.jpg _ 234 | 5036_512x640.jpg _ 235 | 3329_512x640.jpg _ 236 | 5020_640x512.jpg _ 237 | 916_512x640.jpg _ 238 | 5350_640x512.jpg _ 239 | 5511_512x640.jpg _ 240 | 2102_512x640.jpg _ 241 | 5054_640x512.jpg _ 242 | 3478_512x640.jpg _ 243 | 5426_512x640.jpg _ 244 | 5359_512x640.jpg _ 245 | 1934_512x640.jpg _ 246 | 4998_640x512.jpg _ 247 | 5325_512x640.jpg _ 248 | 1702_512x640.jpg _ 249 | 2972_512x640.jpg _ 250 | 5022_512x640.jpg _ 251 | 
1683_512x640.jpg _ 252 | 4042_512x640.jpg _ 253 | 5514_512x640.jpg _ 254 | 5139_512x640.jpg _ 255 | 5382_512x640.jpg _ 256 | 5370_512x640.jpg _ 257 | 5243_512x640.jpg _ 258 | 5207_640x512.jpg _ 259 | 4098_512x640.jpg _ 260 | 4363_512x640.jpg _ 261 | 5329_512x640.jpg _ 262 | 3454_512x640.jpg _ 263 | 5115_640x512.jpg _ 264 | 1699_512x640.jpg _ 265 | 5220_512x640.jpg _ 266 | 2667_512x640.jpg _ 267 | 5087_512x640.jpg _ 268 | 3646_640x512.jpg _ 269 | 1654_512x640.jpg _ 270 | 3937_512x640.jpg _ 271 | 4383_512x640.jpg _ 272 | 1833_640x512.jpg _ 273 | 5096_512x640.jpg _ 274 | 1893_640x512.jpg _ 275 | 5105_512x640.jpg _ 276 | 5433_512x640.jpg _ 277 | 318_512x640.jpg _ 278 | 1976_512x640.jpg _ 279 | 1923_640x512.jpg _ 280 | 336_512x640.jpg _ 281 | 3612_512x640.jpg _ 282 | 5171_512x640.jpg _ 283 | 3116_512x640.jpg _ 284 | 5351_512x640.jpg _ 285 | 5457_512x640.jpg _ 286 | 1036_512x640.jpg _ 287 | 5415_512x640.jpg _ 288 | 3458_512x640.jpg _ 289 | 5058_512x640.jpg _ 290 | 5128_640x512.jpg _ 291 | 4987_640x512.jpg _ 292 | 4974_512x640.jpg _ 293 | 1993_512x640.jpg _ 294 | 1647_512x640.jpg _ 295 | 5376_640x512.jpg _ 296 | 5012_640x512.jpg _ 297 | 1945_640x512.jpg _ 298 | 5208_512x640.jpg _ 299 | 4343_512x640.jpg _ 300 | 1961_512x640.jpg _ 301 | 5401_512x640.jpg _ 302 | 1730_512x640.jpg _ 303 | 5037_512x640.jpg _ 304 | 5244_640x512.jpg _ 305 | 3588_512x640.jpg _ 306 | 2798_512x640.jpg _ 307 | 5213_512x640.jpg _ 308 | 5364_640x512.jpg _ 309 | 1992_512x640.jpg _ 310 | 5151_512x640.jpg _ 311 | 5272_512x640.jpg _ 312 | 694_512x640.jpg _ 313 | 5186_512x640.jpg _ 314 | 5369_640x512.jpg _ 315 | 1854_512x640.jpg _ 316 | 5219_512x640.jpg _ 317 | 1994_512x640.jpg _ 318 | 5014_512x640.jpg _ 319 | 5085_640x512.jpg _ 320 | 1937_512x640.jpg _ 321 | 1001_512x640.jpg _ 322 | 1910_640x512.jpg _ 323 | 5007_512x640.jpg _ 324 | 2137_512x640.jpg _ 325 | 5399_512x640.jpg _ 326 | 5109_640x512.jpg _ 327 | 5264_512x640.jpg _ 328 | 5241_640x512.jpg _ 329 | 4099_512x640.jpg _ 330 | 5378_640x512.jpg _ 331 | 5114_512x640.jpg _ 332 | 138_512x640.jpg _ 333 | 5333_512x640.jpg _ 334 | 5337_640x512.jpg _ 335 | 5157_512x640.jpg _ 336 | 3514_512x640.jpg _ 337 | 5300_512x640.jpg _ 338 | 5073_512x640.jpg _ 339 | 879_512x640.jpg _ 340 | 1696_640x512.jpg _ 341 | 1981_512x640.jpg _ 342 | 1793_512x640.jpg _ 343 | 1973_512x640.jpg _ 344 | 4361_512x640.jpg _ 345 | 5262_512x640.jpg _ 346 | 5344_640x512.jpg _ 347 | 5254_512x640.jpg _ 348 | 66_512x640.jpg _ 349 | 5130_640x512.jpg _ 350 | 4374_512x640.jpg _ 351 | 5486_512x640.jpg _ 352 | 5374_512x640.jpg _ 353 | 680_512x640.jpg _ 354 | 1914_640x512.jpg _ 355 | 2655_512x640.jpg _ 356 | 5308_640x512.jpg _ 357 | 3325_512x640.jpg _ 358 | 1834_640x512.jpg _ 359 | 5476_512x640.jpg _ 360 | 5113_512x640.jpg _ 361 | 5216_512x640.jpg _ 362 | 3430_512x640.jpg _ 363 | 5348_512x640.jpg _ 364 | 4342_512x640.jpg _ 365 | 711_512x640.jpg _ 366 | 5330_512x640.jpg _ 367 | 5448_512x640.jpg _ 368 | 1921_640x512.jpg _ 369 | 5136_512x640.jpg _ 370 | 1810_512x640.jpg _ 371 | 3331_512x640.jpg _ 372 | 1880_512x640.jpg _ 373 | 3400_512x640.jpg _ 374 | 2116_512x640.jpg _ 375 | 5111_512x640.jpg _ 376 | 5134_512x640.jpg _ 377 | 5526_512x640.jpg _ 378 | 5528_512x640.jpg _ 379 | 5029_640x512.jpg _ 380 | 1724_512x640.jpg _ 381 | 3182_512x640.jpg _ 382 | 5394_512x640.jpg _ 383 | 2733_512x640.jpg _ 384 | 5298_640x512.jpg _ 385 | 1700_512x640.jpg _ 386 | 4957_512x640.jpg _ 387 | 5233_512x640.jpg _ 388 | 3455_512x640.jpg _ 389 | 2157_512x640.jpg _ 390 | 4981_512x640.jpg _ 391 | 3157_512x640.jpg _ 392 | 4967_512x640.jpg _ 393 | 
5215_512x640.jpg _ 394 | 1952_512x640.jpg _ 395 | 5384_512x640.jpg _ 396 | 5435_512x640.jpg _ 397 | 5274_512x640.jpg _ 398 | 1805_512x640.jpg _ 399 | 5500_512x640.jpg _ 400 | 3878_512x640.jpg _ 401 | 1755_512x640.jpg _ 402 | 5323_512x640.jpg _ 403 | 1926_512x640.jpg _ 404 | 1951_512x640.jpg _ 405 | 3241_512x640.jpg _ 406 | 5459_512x640.jpg _ 407 | 5252_512x640.jpg _ 408 | 5468_512x640.jpg _ 409 | 5086_512x640.jpg _ 410 | 809_512x640.jpg _ 411 | 5481_512x640.jpg _ 412 | 4382_512x640.jpg _ 413 | 5464_512x640.jpg _ 414 | 5184_512x640.jpg _ 415 | 1728_512x640.jpg _ 416 | 5091_512x640.jpg _ 417 | 385_512x640.jpg _ 418 | 3624_512x640.jpg _ 419 | 3094_512x640.jpg _ 420 | 1864_512x640.jpg _ 421 | 1843_640x512.jpg _ 422 | 1967_512x640.jpg _ 423 | 1972_512x640.jpg _ 424 | 5019_512x640.jpg _ 425 | 5031_640x512.jpg _ 426 | 2689_640x512.jpg _ 427 | 5281_512x640.jpg _ 428 | 3027_512x640.jpg _ 429 | 5397_640x512.jpg _ 430 | 1794_512x640.jpg _ 431 | 5385_512x640.jpg _ 432 | 1824_512x640.jpg _ 433 | 5172_512x640.jpg _ 434 | 3314_512x640.jpg _ 435 | 5240_512x640.jpg _ 436 | 4348_512x640.jpg _ 437 | 3463_512x640.jpg _ 438 | 5106_512x640.jpg _ 439 | 3312_512x640.jpg _ 440 | 5044_640x512.jpg _ 441 | 5187_512x640.jpg _ 442 | 4984_640x512.jpg _ 443 | 998_512x640.jpg _ 444 | 1970_512x640.jpg _ 445 | 2142_512x640.jpg _ 446 | 1840_512x640.jpg _ 447 | 3934_512x640.jpg _ 448 | 1820_512x640.jpg _ 449 | 3349_512x640.jpg _ 450 | 5212_640x512.jpg _ 451 | 4965_512x640.jpg _ 452 | 3503_512x640.jpg _ 453 | 1685_512x640.jpg _ 454 | 5226_512x640.jpg _ 455 | 3429_512x640.jpg _ 456 | 2678_512x640.jpg _ 457 | 5297_640x512.jpg _ 458 | 4362_512x640.jpg _ 459 | 884_512x640.jpg _ 460 | 37_512x640.jpg _ 461 | 5179_512x640.jpg _ 462 | 1656_512x640.jpg _ 463 | 5460_512x640.jpg _ 464 | 5498_512x640.jpg _ 465 | 1731_512x640.jpg _ 466 | 1756_512x640.jpg _ 467 | 5373_640x512.jpg _ 468 | 5104_640x512.jpg _ 469 | 3644_512x640.jpg _ 470 | 5178_512x640.jpg _ 471 | 5478_640x512.jpg _ 472 | 3105_512x640.jpg _ 473 | 5530_512x640.jpg _ 474 | 1842_512x640.jpg _ 475 | 5078_512x640.jpg _ 476 | 5001_640x512.jpg _ 477 | 1965_512x640.jpg _ 478 | 4950_640x512.jpg _ 479 | 5048_512x640.jpg _ 480 | 5234_512x640.jpg _ 481 | 5122_512x640.jpg _ 482 | 3415_512x640.jpg _ 483 | 5479_512x640.jpg _ 484 | 3060_512x640.jpg _ 485 | 5280_640x512.jpg _ 486 | 3616_512x640.jpg _ 487 | 1784_512x640.jpg _ 488 | 5295_640x512.jpg _ 489 | 5010_512x640.jpg _ 490 | 4977_512x640.jpg _ 491 | 5266_640x512.jpg _ 492 | 5056_640x512.jpg _ 493 | 639_512x640.jpg _ 494 | 5041_640x512.jpg _ 495 | 1806_512x640.jpg _ 496 | 1982_512x640.jpg _ 497 | 3322_512x640.jpg _ 498 | 5125_512x640.jpg _ 499 | 1917_512x640.jpg _ 500 | 3694_512x640.jpg _ 501 | 5347_640x512.jpg _ 502 | 2950_640x512.jpg _ 503 | 4339_512x640.jpg _ 504 | 5196_640x512.jpg _ 505 | 5209_512x640.jpg _ 506 | 5121_512x640.jpg _ 507 | 3151_512x640.jpg _ 508 | 3193_512x640.jpg _ 509 | 5523_512x640.jpg _ 510 | 5023_640x512.jpg _ 511 | 5361_512x640.jpg _ 512 | 4958_512x640.jpg _ 513 | 2876_512x640.jpg _ 514 | 5532_512x640.jpg _ 515 | 5352_512x640.jpg _ 516 | 5166_512x640.jpg _ 517 | 2831_512x640.jpg _ 518 | 1887_640x512.jpg _ 519 | 1832_640x512.jpg _ 520 | 4346_512x640.jpg _ 521 | 5462_640x512.jpg _ 522 | 1831_640x512.jpg _ 523 | 5062_512x640.jpg _ 524 | 5273_512x640.jpg _ 525 | 5466_512x640.jpg _ 526 | 985_512x640.jpg _ 527 | 5365_512x640.jpg _ 528 | 3324_512x640.jpg _ 529 | 3152_512x640.jpg _ 530 | 5193_640x512.jpg _ 531 | 1963_512x640.jpg _ 532 | 5366_512x640.jpg _ 533 | 1938_512x640.jpg _ 534 | 1695_512x640.jpg _ 535 | 
3593_512x640.jpg _ 536 | 4994_640x512.jpg _ 537 | 5018_512x640.jpg _ 538 | 5084_512x640.jpg _ 539 | 3332_512x640.jpg _ 540 | 1752_512x640.jpg _ 541 | 5030_512x640.jpg _ 542 | 5339_512x640.jpg _ 543 | 1848_512x640.jpg _ 544 | 4990_512x640.jpg _ 545 | 1906_640x512.jpg _ 546 | 5367_512x640.jpg _ 547 | 1841_512x640.jpg _ 548 | 4365_512x640.jpg _ 549 | 5392_640x512.jpg _ 550 | 5493_512x640.jpg _ 551 | 414_512x640.jpg _ 552 | 5299_640x512.jpg _ 553 | 5522_512x640.jpg _ 554 | 5276_512x640.jpg _ 555 | 5453_512x640.jpg _ 556 | 5194_640x512.jpg _ 557 | 5145_640x512.jpg _ 558 | 5190_512x640.jpg _ 559 | 1883_640x512.jpg _ 560 | 5380_512x640.jpg _ 561 | 1979_512x640.jpg _ 562 | 101_512x640.jpg _ 563 | 4347_512x640.jpg _ 564 | 5302_512x640.jpg _ 565 | 5004_512x640.jpg _ 566 | 852_512x640.jpg _ 567 | 672_512x640.jpg _ 568 | 3326_512x640.jpg _ 569 | 5451_512x640.jpg _ 570 | 5173_512x640.jpg _ 571 | 5326_512x640.jpg _ 572 | 3153_512x640.jpg _ 573 | 4945_512x640.jpg _ 574 | 5413_512x640.jpg _ 575 | 3226_512x640.jpg _ 576 | 5013_512x640.jpg _ 577 | 1779_512x640.jpg _ 578 | 5117_512x640.jpg _ 579 | 5286_512x640.jpg _ 580 | 5249_512x640.jpg _ 581 | 3988_512x640.jpg _ 582 | 5153_512x640.jpg _ 583 | 5501_512x640.jpg _ 584 | 2117_512x640.jpg _ 585 | 343_512x640.jpg _ 586 | 4389_512x640.jpg _ 587 | 1684_512x640.jpg _ 588 | 5101_640x512.jpg _ 589 | 1869_640x512.jpg _ 590 | 322_512x640.jpg _ 591 | 5042_640x512.jpg _ 592 | 5180_640x512.jpg _ 593 | 3318_512x640.jpg _ 594 | 663_512x640.jpg _ 595 | 5024_512x640.jpg _ 596 | 5253_640x512.jpg _ 597 | 3417_512x640.jpg _ 598 | 5159_512x640.jpg _ 599 | 4377_512x640.jpg _ 600 | 3457_512x640.jpg _ 601 | 1924_512x640.jpg _ 602 | 5223_512x640.jpg _ 603 | 3505_512x640.jpg _ 604 | 5227_640x512.jpg _ 605 | 5475_512x640.jpg _ 606 | 4976_640x512.jpg _ 607 | 1765_512x640.jpg _ 608 | 4979_640x512.jpg _ 609 | 5346_640x512.jpg _ 610 | 387_512x640.jpg _ 611 | 5412_512x640.jpg _ 612 | 3150_512x640.jpg _ 613 | 5185_512x640.jpg _ 614 | 4082_512x640.jpg _ 615 | 5214_512x640.jpg _ 616 | 4358_512x640.jpg _ 617 | 5275_512x640.jpg _ 618 | 5419_512x640.jpg _ 619 | 5090_640x512.jpg _ 620 | 3418_512x640.jpg _ 621 | 5163_512x640.jpg _ 622 | 3602_512x640.jpg _ 623 | 727_512x640.jpg _ 624 | 4995_512x640.jpg _ 625 | 5529_512x640.jpg _ 626 | 2755_512x640.jpg _ 627 | 1950_512x640.jpg _ 628 | 3607_512x640.jpg _ 629 | 3317_512x640.jpg _ 630 | 4968_512x640.jpg _ 631 | 4975_512x640.jpg _ 632 | 1837_640x512.jpg _ 633 | 5147_512x640.jpg _ 634 | 754_512x640.jpg _ 635 | 2863_512x640.jpg _ 636 | 3721_512x640.jpg _ 637 | 2722_512x640.jpg _ 638 | 5108_640x512.jpg _ 639 | 5127_640x512.jpg _ 640 | 5293_512x640.jpg _ 641 | 3705_512x640.jpg _ 642 | 1787_512x640.jpg _ 643 | 5306_512x640.jpg _ 644 | 3171_512x640.jpg _ 645 | 5038_512x640.jpg _ 646 | 3016_512x640.jpg _ 647 | 642_512x640.jpg _ 648 | 4390_512x640.jpg _ 649 | 2078_512x640.jpg _ 650 | 5322_640x512.jpg _ 651 | 1767_512x640.jpg _ 652 | 1657_640x512.jpg _ 653 | 5445_640x512.jpg _ 654 | 5247_512x640.jpg _ 655 | 1677_640x512.jpg _ 656 | 4999_512x640.jpg _ 657 | 1750_512x640.jpg _ 658 | 2115_512x640.jpg _ 659 | 2135_640x512.jpg _ 660 | 5100_512x640.jpg _ 661 | 5123_640x512.jpg _ 662 | 5390_512x640.jpg _ 663 | 1876_512x640.jpg _ 664 | 5349_512x640.jpg _ 665 | 5261_640x512.jpg _ 666 | 3236_512x640.jpg _ 667 | 5338_512x640.jpg _ 668 | 1025_512x640.jpg _ 669 | 5316_512x640.jpg _ 670 | 5471_512x640.jpg _ 671 | 4344_512x640.jpg _ 672 | 1825_512x640.jpg _ 673 | 3038_512x640.jpg _ 674 | 5441_512x640.jpg _ 675 | 5231_512x640.jpg _ 676 | 1757_512x640.jpg _ 677 | 
5368_640x512.jpg _ 678 | 5228_512x640.jpg _ 679 | 5198_512x640.jpg _ 680 | 3718_512x640.jpg _ 681 | 4392_512x640.jpg _ 682 | 4973_512x640.jpg _ 683 | 598_512x640.jpg _ 684 | 5465_512x640.jpg _ 685 | 1940_640x512.jpg _ 686 | 5284_640x512.jpg _ 687 | 3282_512x640.jpg _ 688 | 1809_512x640.jpg _ 689 | 5296_640x512.jpg _ 690 | 1761_512x640.jpg _ 691 | 1930_512x640.jpg _ 692 | 5446_512x640.jpg _ 693 | 634_512x640.jpg _ 694 | 184_512x640.jpg _ 695 | 5317_512x640.jpg _ 696 | 3695_512x640.jpg _ 697 | 3424_512x640.jpg _ 698 | 2918_512x640.jpg _ 699 | 1771_640x512.jpg _ 700 | 3833_512x640.jpg _ 701 | 5103_512x640.jpg _ 702 | 5260_640x512.jpg _ 703 | 5307_512x640.jpg _ 704 | 5432_512x640.jpg _ 705 | 5188_512x640.jpg _ 706 | 3438_512x640.jpg _ 707 | 4954_512x640.jpg _ 708 | 2642_512x640.jpg _ 709 | 3371_512x640.jpg _ 710 | 4992_512x640.jpg _ 711 | 5485_512x640.jpg _ 712 | 1861_512x640.jpg _ 713 | 5002_512x640.jpg _ 714 | 5152_512x640.jpg _ 715 | 5055_512x640.jpg _ 716 | 5229_512x640.jpg _ 717 | 3138_512x640.jpg _ 718 | 4962_512x640.jpg _ 719 | 3360_512x640.jpg _ 720 | 1682_512x640.jpg _ 721 | 5232_640x512.jpg _ 722 | 5081_512x640.jpg _ 723 | 2995_512x640.jpg _ 724 | 5324_640x512.jpg _ 725 | 5340_640x512.jpg _ 726 | 4100_512x640.jpg _ 727 | 1753_512x640.jpg _ 728 | 251_512x640.jpg _ 729 | 3393_512x640.jpg _ 730 | 5174_512x640.jpg _ 731 | 5537_512x640.jpg _ 732 | 5387_640x512.jpg _ 733 | 1962_512x640.jpg _ 734 | 2700_512x640.jpg _ 735 | 5372_512x640.jpg _ 736 | 5221_640x512.jpg _ 737 | 5071_512x640.jpg _ 738 | 3635_512x640.jpg _ 739 | 3628_512x640.jpg _ 740 | 1729_512x640.jpg _ 741 | 1978_512x640.jpg _ 742 | 1838_640x512.jpg _ 743 | 5492_512x640.jpg _ 744 | 1927_512x640.jpg _ 745 | 1958_512x640.jpg _ 746 | 5006_640x512.jpg _ 747 | 5065_640x512.jpg _ 748 | 2620_512x640.jpg _ 749 | 3704_512x640.jpg _ 750 | 1733_512x640.jpg _ 751 | 4953_512x640.jpg _ 752 | 1989_512x640.jpg _ 753 | 1786_512x640.jpg _ 754 | 1727_512x640.jpg _ 755 | 1915_512x640.jpg _ 756 | 5518_512x640.jpg _ 757 | 5245_512x640.jpg _ 758 | 5165_512x640.jpg _ 759 | 5292_512x640.jpg _ 760 | 5411_512x640.jpg _ 761 | 5051_512x640.jpg _ 762 | 5156_512x640.jpg _ 763 | 3452_512x640.jpg _ 764 | 1782_512x640.jpg _ 765 | 5416_512x640.jpg _ 766 | 1863_512x640.jpg _ 767 | 5095_512x640.jpg _ 768 | 281_512x640.jpg _ 769 | 1913_640x512.jpg _ 770 | 5192_512x640.jpg _ 771 | 5531_512x640.jpg _ 772 | 3700_512x640.jpg _ 773 | 5083_640x512.jpg _ 774 | 5043_640x512.jpg _ 775 | 3082_512x640.jpg _ 776 | 845_512x640.jpg _ 777 | 5238_512x640.jpg _ 778 | 5197_512x640.jpg _ 779 | 3321_512x640.jpg _ 780 | 230_512x640.jpg _ 781 | 1688_512x640.jpg _ 782 | 5129_512x640.jpg _ 783 | 5126_512x640.jpg _ 784 | 1949_512x640.jpg _ 785 | 1975_512x640.jpg _ 786 | 5154_512x640.jpg _ 787 | 5336_640x512.jpg _ 788 | 5005_512x640.jpg _ 789 | 3304_512x640.jpg _ 790 | 5490_512x640.jpg _ 791 | 2929_512x640.jpg _ 792 | 3427_512x640.jpg _ 793 | 5487_512x640.jpg _ 794 | 3320_512x640.jpg _ 795 | 789_512x640.jpg _ 796 | 5133_512x640.jpg _ 797 | 2631_512x640.jpg _ 798 | 4983_512x640.jpg _ 799 | 2098_512x640.jpg _ 800 | 1697_512x640.jpg _ 801 | 1928_512x640.jpg _ 802 | 3669_512x640.jpg _ 803 | 5425_512x640.jpg _ 804 | 3873_640x512.jpg _ 805 | 1691_512x640.jpg _ 806 | 5484_512x640.jpg _ 807 | 5124_512x640.jpg _ 808 | 1859_512x640.jpg _ 809 | 5494_512x640.jpg _ 810 | 4980_512x640.jpg _ 811 | 5099_640x512.jpg _ 812 | 5395_512x640.jpg _ 813 | 2085_512x640.jpg _ 814 | 2940_512x640.jpg _ 815 | 962_640x512.jpg _ 816 | 5075_640x512.jpg _ 817 | 5189_640x512.jpg _ 818 | 5182_640x512.jpg _ 819 | 
5077_640x512.jpg _ 820 | 5512_512x640.jpg _ 821 | 1990_512x640.jpg _ 822 | 5534_512x640.jpg _ 823 | 1775_512x640.jpg _ 824 | 603_512x640.jpg _ 825 | 5497_512x640.jpg _ 826 | 1748_512x640.jpg _ 827 | 4951_512x640.jpg _ 828 | 3630_512x640.jpg _ 829 | 5142_512x640.jpg _ 830 | 5488_512x640.jpg _ 831 | 5379_640x512.jpg _ 832 | 3382_512x640.jpg _ 833 | 5057_512x640.jpg _ 834 | 1892_512x640.jpg _ 835 | 3613_512x640.jpg _ 836 | 1811_512x640.jpg _ 837 | 5200_640x512.jpg _ 838 | 5211_640x512.jpg _ 839 | 601_512x640.jpg _ 840 | 3449_512x640.jpg _ 841 | 1984_512x640.jpg _ 842 | 5088_512x640.jpg _ 843 | 5206_640x512.jpg _ 844 | 509_640x512.jpg _ 845 | 3330_512x640.jpg _ 846 | 5504_640x512.jpg _ 847 | 5311_640x512.jpg _ 848 | 3422_512x640.jpg _ 849 | 1868_512x640.jpg _ 850 | 5327_512x640.jpg _ 851 | 5150_640x512.jpg _ 852 | 3154_512x640.jpg _ 853 | 1964_640x512.jpg _ 854 | 1974_512x640.jpg _ 855 | 5064_512x640.jpg _ 856 | 2112_512x640.jpg _ 857 | 5516_512x640.jpg _ 858 | 1857_512x640.jpg _ 859 | 3127_512x640.jpg _ 860 | 5383_512x640.jpg _ 861 | 3703_512x640.jpg _ 862 | 5393_512x640.jpg _ 863 | 1905_640x512.jpg _ 864 | 2083_640x512.jpg _ 865 | 1687_512x640.jpg _ 866 | 1891_512x640.jpg _ 867 | 1649_512x640.jpg _ 868 | 3814_512x640.jpg _ 869 | 3399_512x640.jpg _ 870 | 5222_512x640.jpg _ 871 | 5533_512x640.jpg _ 872 | 5168_640x512.jpg _ 873 | 2131_512x640.jpg _ 874 | 1980_512x640.jpg _ 875 | 3310_512x640.jpg _ 876 | 5052_512x640.jpg _ 877 | 5060_512x640.jpg _ 878 | 3510_512x640.jpg _ 879 | 1853_512x640.jpg _ 880 | 5141_512x640.jpg _ 881 | 3237_512x640.jpg _ 882 | 5271_512x640.jpg _ 883 | 3404_512x640.jpg _ 884 | 1807_512x640.jpg _ 885 | 4972_512x640.jpg _ 886 | 5155_512x640.jpg _ 887 | 1916_640x512.jpg _ 888 | 1936_512x640.jpg _ 889 | 1845_512x640.jpg _ 890 | 4960_640x512.jpg _ 891 | 5110_640x512.jpg _ 892 | 5164_512x640.jpg _ 893 | 5312_512x640.jpg _ 894 | 5482_512x640.jpg _ 895 | 1741_512x640.jpg _ 896 | 5053_512x640.jpg _ 897 | 1804_512x640.jpg _ 898 | 4388_512x640.jpg _ 899 | 1957_512x640.jpg _ 900 | 5076_512x640.jpg _ 901 | 5400_512x640.jpg _ 902 | 3395_512x640.jpg _ 903 | 1847_512x640.jpg _ 904 | 2128_512x640.jpg _ 905 | 3149_512x640.jpg _ 906 | 5305_640x512.jpg _ 907 | 5239_640x512.jpg _ 908 | 5201_640x512.jpg _ 909 | 3243_512x640.jpg _ 910 | 1043_512x640.jpg _ 911 | 3319_512x640.jpg _ 912 | 730_512x640.jpg _ 913 | 3904_512x640.jpg _ 914 | 1835_640x512.jpg _ 915 | 5473_512x640.jpg _ 916 | 3731_512x640.jpg _ 917 | 4387_512x640.jpg _ 918 | 5149_512x640.jpg _ 919 | 1920_640x512.jpg _ 920 | 5342_512x640.jpg _ 921 | 5357_640x512.jpg _ 922 | 5313_512x640.jpg _ 923 | 4966_640x512.jpg _ 924 | 1839_512x640.jpg _ 925 | 1690_512x640.jpg _ 926 | 5217_512x640.jpg _ 927 | 3049_512x640.jpg _ 928 | 5116_512x640.jpg _ 929 | 3459_512x640.jpg _ 930 | 5389_512x640.jpg _ 931 | 1763_512x640.jpg _ 932 | 4357_512x640.jpg _ 933 | 5167_512x640.jpg _ 934 | 5015_512x640.jpg _ 935 | 1851_512x640.jpg _ 936 | 1737_512x640.jpg _ 937 | 5070_512x640.jpg _ 938 | 864_512x640.jpg _ 939 | 1953_512x640.jpg _ 940 | 1968_512x640.jpg _ 941 | 5138_512x640.jpg _ 942 | 5158_512x640.jpg _ 943 | 1826_512x640.jpg _ 944 | 1796_512x640.jpg _ 945 | 1872_640x512.jpg _ 946 | 5094_640x512.jpg _ 947 | 3680_512x640.jpg _ 948 | 5218_512x640.jpg _ 949 | 5021_512x640.jpg _ 950 | 5068_640x512.jpg _ 951 | 3629_512x640.jpg _ 952 | 5039_512x640.jpg _ 953 | 5444_512x640.jpg _ 954 | 3293_512x640.jpg _ 955 | 5259_512x640.jpg _ 956 | -------------------------------------------------------------------------------- /datasets/trans10k/validation.txt: 
-------------------------------------------------------------------------------- 1 | 7621.jpg _ 2 | 2533.jpg _ 3 | 6098.jpg _ 4 | 8130.jpg _ 5 | 3091.jpg _ 6 | 1360.jpg _ 7 | 9693.jpg _ 8 | 1342.jpg _ 9 | 4478.jpg _ 10 | 6746.jpg _ 11 | 3645.jpg _ 12 | 6033.jpg _ 13 | 5321.jpg _ 14 | 4179.jpg _ 15 | 4109.jpg _ 16 | 7240.jpg _ 17 | 3071.jpg _ 18 | 1363.jpg _ 19 | 510.jpg _ 20 | 675.jpg _ 21 | 3265.jpg _ 22 | 3947.jpg _ 23 | 7272.jpg _ 24 | 3671.jpg _ 25 | 1620.jpg _ 26 | 3859.jpg _ 27 | 8475.jpg _ 28 | 5237.jpg _ 29 | 1629.jpg _ 30 | 4910.jpg _ 31 | 754.jpg _ 32 | 4018.jpg _ 33 | 7743.jpg _ 34 | 9218.jpg _ 35 | 1562.jpg _ 36 | 1634.jpg _ 37 | 7949.jpg _ 38 | 9279.jpg _ 39 | 2430.jpg _ 40 | 5859.jpg _ 41 | 7029.jpg _ 42 | 8054.jpg _ 43 | 639.jpg _ 44 | 8139.jpg _ 45 | 5301.jpg _ 46 | 1777.jpg _ 47 | 6078.jpg _ 48 | 1259.jpg _ 49 | 3759.jpg _ 50 | 6828.jpg _ 51 | 3144.jpg _ 52 | 1474.jpg _ 53 | 2309.jpg _ 54 | 3647.jpg _ 55 | 677.jpg _ 56 | 6330.jpg _ 57 | 3321.jpg _ 58 | 1316.jpg _ 59 | 58.jpg _ 60 | 6046.jpg _ 61 | 6800.jpg _ 62 | 4453.jpg _ 63 | 3563.jpg _ 64 | 5319.jpg _ 65 | 6862.jpg _ 66 | 2629.jpg _ 67 | 3676.jpg _ 68 | 924.jpg _ 69 | 8667.jpg _ 70 | 5188.jpg _ 71 | 6476.jpg _ 72 | 10006.jpg _ 73 | 8625.jpg _ 74 | 6106.jpg _ 75 | 2605.jpg _ 76 | 9504.jpg _ 77 | 10442.jpg _ 78 | 2563.jpg _ 79 | 8582.jpg _ 80 | 7167.jpg _ 81 | 4686.jpg _ 82 | 2145.jpg _ 83 | 8411.jpg _ 84 | 2645.jpg _ 85 | 5104.jpg _ 86 | 3508.jpg _ 87 | 634.jpg _ 88 | 3897.jpg _ 89 | 3103.jpg _ 90 | 5403.jpg _ 91 | 9775.jpg _ 92 | 1467.jpg _ 93 | 4246.jpg _ 94 | 1300.jpg _ 95 | 5663.jpg _ 96 | 1501.jpg _ 97 | 9591.jpg _ 98 | 1906.jpg _ 99 | 2448.jpg _ 100 | 7077.jpg _ 101 | 3233.jpg _ 102 | 2819.jpg _ 103 | 772.jpg _ 104 | 423.jpg _ 105 | 6938.jpg _ 106 | 4688.jpg _ 107 | 1759.jpg _ 108 | 2754.jpg _ 109 | 4449.jpg _ 110 | 8842.jpg _ 111 | 8603.jpg _ 112 | 1182.jpg _ 113 | 1395.jpg _ 114 | 8157.jpg _ 115 | 9640.jpg _ 116 | 10181.jpg _ 117 | 2805.jpg _ 118 | 5975.jpg _ 119 | 5910.jpg _ 120 | 942.jpg _ 121 | 6325.jpg _ 122 | 8795.jpg _ 123 | 7911.jpg _ 124 | 4586.jpg _ 125 | 6625.jpg _ 126 | 3665.jpg _ 127 | 6739.jpg _ 128 | 1810.jpg _ 129 | 5953.jpg _ 130 | 2893.jpg _ 131 | 5889.jpg _ 132 | 8925.jpg _ 133 | 2406.jpg _ 134 | 9113.jpg _ 135 | 2147.jpg _ 136 | 3057.jpg _ 137 | 539.jpg _ 138 | 2765.jpg _ 139 | 208.jpg _ 140 | 5699.jpg _ 141 | 1438.jpg _ 142 | 9571.jpg _ 143 | 8456.jpg _ 144 | 4104.jpg _ 145 | 2033.jpg _ 146 | 1721.jpg _ 147 | 1233.jpg _ 148 | 6286.jpg _ 149 | 7532.jpg _ 150 | 2568.jpg _ 151 | 10447.jpg _ 152 | 6284.jpg _ 153 | 4621.jpg _ 154 | 1449.jpg _ 155 | 8432.jpg _ 156 | 7256.jpg _ 157 | 5498.jpg _ 158 | 5177.jpg _ 159 | 2329.jpg _ 160 | 3138.jpg _ 161 | 6618.jpg _ 162 | 6366.jpg _ 163 | 9247.jpg _ 164 | 4535.jpg _ 165 | 1247.jpg _ 166 | 10097.jpg _ 167 | 361.jpg _ 168 | 6455.jpg _ 169 | 10000.jpg _ 170 | 2569.jpg _ 171 | 3843.jpg _ 172 | 7280.jpg _ 173 | 4658.jpg _ 174 | 3801.jpg _ 175 | 10114.jpg _ 176 | 7332.jpg _ 177 | 1459.jpg _ 178 | 1535.jpg _ 179 | 1368.jpg _ 180 | 542.jpg _ 181 | 10145.jpg _ 182 | 4461.jpg _ 183 | 4703.jpg _ 184 | 1478.jpg _ 185 | 3724.jpg _ 186 | 4832.jpg _ 187 | 5318.jpg _ 188 | 1749.jpg _ 189 | 8809.jpg _ 190 | 2346.jpg _ 191 | 4226.jpg _ 192 | 7309.jpg _ 193 | 2713.jpg _ 194 | 5456.jpg _ 195 | 5615.jpg _ 196 | 6398.jpg _ 197 | 9966.jpg _ 198 | 1470.jpg _ 199 | 8485.jpg _ 200 | 8199.jpg _ 201 | 2345.jpg _ 202 | 5144.jpg _ 203 | 9125.jpg _ 204 | 5202.jpg _ 205 | 4721.jpg _ 206 | 4638.jpg _ 207 | 314.jpg _ 208 | 2767.jpg _ 209 | 10437.jpg _ 210 | 8833.jpg _ 211 | 3608.jpg _ 
212 | 4120.jpg _ 213 | 10026.jpg _ 214 | 7540.jpg _ 215 | 8202.jpg _ 216 | 6103.jpg _ 217 | 4276.jpg _ 218 | 6119.jpg _ 219 | 4842.jpg _ 220 | 3584.jpg _ 221 | 4289.jpg _ 222 | 2640.jpg _ 223 | 9782.jpg _ 224 | 2259.jpg _ 225 | 7324.jpg _ 226 | 2386.jpg _ 227 | 10178.jpg _ 228 | 5956.jpg _ 229 | 166.jpg _ 230 | 2409.jpg _ 231 | 611.jpg _ 232 | 1135.jpg _ 233 | 7327.jpg _ 234 | 9305.jpg _ 235 | 5165.jpg _ 236 | 1322.jpg _ 237 | 9625.jpg _ 238 | 9122.jpg _ 239 | 8070.jpg _ 240 | 4633.jpg _ 241 | 2183.jpg _ 242 | 8300.jpg _ 243 | 8121.jpg _ 244 | 8467.jpg _ 245 | 2964.jpg _ 246 | 6859.jpg _ 247 | 3324.jpg _ 248 | 9518.jpg _ 249 | 1427.jpg _ 250 | 2960.jpg _ 251 | 4724.jpg _ 252 | 2049.jpg _ 253 | 6074.jpg _ 254 | 3264.jpg _ 255 | 7070.jpg _ 256 | 9507.jpg _ 257 | 6335.jpg _ 258 | 9644.jpg _ 259 | 7590.jpg _ 260 | 9015.jpg _ 261 | 2233.jpg _ 262 | 9690.jpg _ 263 | 6282.jpg _ 264 | 4981.jpg _ 265 | 10040.jpg _ 266 | 5466.jpg _ 267 | 8376.jpg _ 268 | 6207.jpg _ 269 | 9941.jpg _ 270 | 582.jpg _ 271 | 9942.jpg _ 272 | 6136.jpg _ 273 | 4664.jpg _ 274 | 2485.jpg _ 275 | 10223.jpg _ 276 | 7527.jpg _ 277 | 9358.jpg _ 278 | 4827.jpg _ 279 | 5521.jpg _ 280 | 8668.jpg _ 281 | 7219.jpg _ 282 | 9458.jpg _ 283 | 2608.jpg _ 284 | 8929.jpg _ 285 | 988.jpg _ 286 | 3629.jpg _ 287 | 7415.jpg _ 288 | 1920.jpg _ 289 | 1623.jpg _ 290 | 8388.jpg _ 291 | 4225.jpg _ 292 | 1926.jpg _ 293 | 9282.jpg _ 294 | 2331.jpg _ 295 | 5632.jpg _ 296 | 8209.jpg _ 297 | 3024.jpg _ 298 | 9225.jpg _ 299 | 3692.jpg _ 300 | 5260.jpg _ 301 | 3666.jpg _ 302 | 111.jpg _ 303 | 7930.jpg _ 304 | 2652.jpg _ 305 | 8081.jpg _ 306 | 781.jpg _ 307 | 7229.jpg _ 308 | 1175.jpg _ 309 | 7722.jpg _ 310 | 8466.jpg _ 311 | 4785.jpg _ 312 | 6446.jpg _ 313 | 4815.jpg _ 314 | 9449.jpg _ 315 | 9649.jpg _ 316 | 7908.jpg _ 317 | 10045.jpg _ 318 | 5844.jpg _ 319 | 9334.jpg _ 320 | 10436.jpg _ 321 | 6464.jpg _ 322 | 2740.jpg _ 323 | 5040.jpg _ 324 | 3339.jpg _ 325 | 3260.jpg _ 326 | 4903.jpg _ 327 | 8599.jpg _ 328 | 10148.jpg _ 329 | 8105.jpg _ 330 | 10216.jpg _ 331 | 6010.jpg _ 332 | 1662.jpg _ 333 | 688.jpg _ 334 | 1700.jpg _ 335 | 3790.jpg _ 336 | 5865.jpg _ 337 | 6430.jpg _ 338 | 5007.jpg _ 339 | 6920.jpg _ 340 | 736.jpg _ 341 | 2213.jpg _ 342 | 5937.jpg _ 343 | 9801.jpg _ 344 | 9982.jpg _ 345 | 7989.jpg _ 346 | 8110.jpg _ 347 | 10130.jpg _ 348 | 5214.jpg _ 349 | 8811.jpg _ 350 | 1325.jpg _ 351 | 5494.jpg _ 352 | 3911.jpg _ 353 | 9540.jpg _ 354 | 9078.jpg _ 355 | 7424.jpg _ 356 | 5536.jpg _ 357 | 2671.jpg _ 358 | 533.jpg _ 359 | 10395.jpg _ 360 | 5963.jpg _ 361 | 6402.jpg _ 362 | 818.jpg _ 363 | 908.jpg _ 364 | 955.jpg _ 365 | 6263.jpg _ 366 | 6638.jpg _ 367 | 2766.jpg _ 368 | 6817.jpg _ 369 | 6883.jpg _ 370 | 8522.jpg _ 371 | 8696.jpg _ 372 | 3326.jpg _ 373 | 6025.jpg _ 374 | 835.jpg _ 375 | 394.jpg _ 376 | 9283.jpg _ 377 | 1878.jpg _ 378 | 5328.jpg _ 379 | 7171.jpg _ 380 | 2619.jpg _ 381 | 6316.jpg _ 382 | 336.jpg _ 383 | 3815.jpg _ 384 | 1529.jpg _ 385 | 7032.jpg _ 386 | 7537.jpg _ 387 | 7690.jpg _ 388 | 6918.jpg _ 389 | 9629.jpg _ 390 | 3950.jpg _ 391 | 3259.jpg _ 392 | 3140.jpg _ 393 | 3432.jpg _ 394 | 6575.jpg _ 395 | 1967.jpg _ 396 | 81.jpg _ 397 | 2830.jpg _ 398 | 2002.jpg _ 399 | 4804.jpg _ 400 | 1958.jpg _ 401 | 427.jpg _ 402 | 2022.jpg _ 403 | 7197.jpg _ 404 | 221.jpg _ 405 | 140.jpg _ 406 | 1514.jpg _ 407 | 8992.jpg _ 408 | 1956.jpg _ 409 | 8891.jpg _ 410 | 4691.jpg _ 411 | 4569.jpg _ 412 | 2611.jpg _ 413 | 1205.jpg _ 414 | 1612.jpg _ 415 | 358.jpg _ 416 | 7767.jpg _ 417 | 8447.jpg _ 418 | 8239.jpg _ 419 | 2621.jpg _ 420 | 7281.jpg _ 421 | 
2024.jpg _ 422 | 8097.jpg _ 423 | 7840.jpg _ 424 | 8354.jpg _ 425 | 504.jpg _ 426 | 8440.jpg _ 427 | 7662.jpg _ 428 | 10346.jpg _ 429 | 1017.jpg _ 430 | 7315.jpg _ 431 | 5203.jpg _ 432 | 9929.jpg _ 433 | 7041.jpg _ 434 | 565.jpg _ 435 | 278.jpg _ 436 | 616.jpg _ 437 | 4689.jpg _ 438 | 6779.jpg _ 439 | 3842.jpg _ 440 | 3013.jpg _ 441 | 4372.jpg _ 442 | 1643.jpg _ 443 | 5884.jpg _ 444 | 5708.jpg _ 445 | 8156.jpg _ 446 | 8401.jpg _ 447 | 884.jpg _ 448 | 9750.jpg _ 449 | 8936.jpg _ 450 | 8865.jpg _ 451 | 1122.jpg _ 452 | 2179.jpg _ 453 | 3447.jpg _ 454 | 9857.jpg _ 455 | 6690.jpg _ 456 | 10327.jpg _ 457 | 3275.jpg _ 458 | 9494.jpg _ 459 | 957.jpg _ 460 | 4978.jpg _ 461 | 7535.jpg _ 462 | 6905.jpg _ 463 | 5809.jpg _ 464 | 7002.jpg _ 465 | 2587.jpg _ 466 | 5522.jpg _ 467 | 5417.jpg _ 468 | 247.jpg _ 469 | 6336.jpg _ 470 | 7288.jpg _ 471 | 4126.jpg _ 472 | 3946.jpg _ 473 | 8444.jpg _ 474 | 6130.jpg _ 475 | 8482.jpg _ 476 | 7036.jpg _ 477 | 5023.jpg _ 478 | 8154.jpg _ 479 | 5629.jpg _ 480 | 9771.jpg _ 481 | 1820.jpg _ 482 | 7772.jpg _ 483 | 7380.jpg _ 484 | 8483.jpg _ 485 | 4470.jpg _ 486 | 1947.jpg _ 487 | 8598.jpg _ 488 | 6656.jpg _ 489 | 1212.jpg _ 490 | 87.jpg _ 491 | 6742.jpg _ 492 | 1250.jpg _ 493 | 9089.jpg _ 494 | 3201.jpg _ 495 | 6169.jpg _ 496 | 10020.jpg _ 497 | 8677.jpg _ 498 | 7634.jpg _ 499 | 5736.jpg _ 500 | 9698.jpg _ 501 | 7665.jpg _ 502 | 531.jpg _ 503 | 5406.jpg _ 504 | 2601.jpg _ 505 | 5404.jpg _ 506 | 9380.jpg _ 507 | 983.jpg _ 508 | 9681.jpg _ 509 | 5460.jpg _ 510 | 9303.jpg _ 511 | 7866.jpg _ 512 | 6276.jpg _ 513 | 8457.jpg _ 514 | 7282.jpg _ 515 | 2520.jpg _ 516 | 7287.jpg _ 517 | 5816.jpg _ 518 | 5045.jpg _ 519 | 7541.jpg _ 520 | 3054.jpg _ 521 | 8371.jpg _ 522 | 7381.jpg _ 523 | 1505.jpg _ 524 | 8915.jpg _ 525 | 3278.jpg _ 526 | 2310.jpg _ 527 | 10201.jpg _ 528 | 3872.jpg _ 529 | 8616.jpg _ 530 | 6196.jpg _ 531 | 973.jpg _ 532 | 3444.jpg _ 533 | 1121.jpg _ 534 | 5733.jpg _ 535 | 6657.jpg _ 536 | 6901.jpg _ 537 | 10472.jpg _ 538 | 8841.jpg _ 539 | 6655.jpg _ 540 | 4228.jpg _ 541 | 7900.jpg _ 542 | 6993.jpg _ 543 | 1606.jpg _ 544 | 5349.jpg _ 545 | 6448.jpg _ 546 | 2000.jpg _ 547 | 8787.jpg _ 548 | 4350.jpg _ 549 | 8651.jpg _ 550 | 334.jpg _ 551 | 1821.jpg _ 552 | 6975.jpg _ 553 | 2375.jpg _ 554 | 3014.jpg _ 555 | 1558.jpg _ 556 | 940.jpg _ 557 | 6061.jpg _ 558 | 3713.jpg _ 559 | 548.jpg _ 560 | 3512.jpg _ 561 | 4753.jpg _ 562 | 148.jpg _ 563 | 2815.jpg _ 564 | 891.jpg _ 565 | 8013.jpg _ 566 | 3172.jpg _ 567 | 1424.jpg _ 568 | 8383.jpg _ 569 | 6879.jpg _ 570 | 4907.jpg _ 571 | 7263.jpg _ 572 | 1428.jpg _ 573 | 8566.jpg _ 574 | 9278.jpg _ 575 | 189.jpg _ 576 | 6356.jpg _ 577 | 9173.jpg _ 578 | 7750.jpg _ 579 | 702.jpg _ 580 | 7201.jpg _ 581 | 2927.jpg _ 582 | 7511.jpg _ 583 | 1668.jpg _ 584 | 4187.jpg _ 585 | 1506.jpg _ 586 | 1663.jpg _ 587 | 8633.jpg _ 588 | 3463.jpg _ 589 | 2788.jpg _ 590 | 6756.jpg _ 591 | 2418.jpg _ 592 | 6060.jpg _ 593 | 8881.jpg _ 594 | 8606.jpg _ 595 | 7763.jpg _ 596 | 8838.jpg _ 597 | 2866.jpg _ 598 | 2413.jpg _ 599 | 1077.jpg _ 600 | 8365.jpg _ 601 | 3702.jpg _ 602 | 6992.jpg _ 603 | 9331.jpg _ 604 | 9660.jpg _ 605 | 3855.jpg _ 606 | 2733.jpg _ 607 | 6513.jpg _ 608 | 674.jpg _ 609 | 96.jpg _ 610 | 4733.jpg _ 611 | 2419.jpg _ 612 | 2129.jpg _ 613 | 6254.jpg _ 614 | 10269.jpg _ 615 | 8897.jpg _ 616 | 1635.jpg _ 617 | 4137.jpg _ 618 | 4821.jpg _ 619 | 7152.jpg _ 620 | 9823.jpg _ 621 | 664.jpg _ 622 | 5133.jpg _ 623 | 4249.jpg _ 624 | 6112.jpg _ 625 | 8636.jpg _ 626 | 2666.jpg _ 627 | 8885.jpg _ 628 | 7776.jpg _ 629 | 5011.jpg _ 630 | 1362.jpg _ 631 | 
8663.jpg _ 632 | 8568.jpg _ 633 | 1328.jpg _ 634 | 2562.jpg _ 635 | 10034.jpg _ 636 | 9923.jpg _ 637 | 9372.jpg _ 638 | 1359.jpg _ 639 | 5888.jpg _ 640 | 239.jpg _ 641 | 5497.jpg _ 642 | 1511.jpg _ 643 | 5950.jpg _ 644 | 1768.jpg _ 645 | 5993.jpg _ 646 | 10468.jpg _ 647 | 10031.jpg _ 648 | 812.jpg _ 649 | 6784.jpg _ 650 | 1461.jpg _ 651 | 4098.jpg _ 652 | 9557.jpg _ 653 | 5035.jpg _ 654 | 9727.jpg _ 655 | 7706.jpg _ 656 | 4549.jpg _ 657 | 9605.jpg _ 658 | 10150.jpg _ 659 | 698.jpg _ 660 | 1045.jpg _ 661 | 431.jpg _ 662 | 2461.jpg _ 663 | 1409.jpg _ 664 | 2057.jpg _ 665 | 8928.jpg _ 666 | 5196.jpg _ 667 | 4157.jpg _ 668 | 1294.jpg _ 669 | 3298.jpg _ 670 | 2168.jpg _ 671 | 1004.jpg _ 672 | 6671.jpg _ 673 | 5240.jpg _ 674 | 8523.jpg _ 675 | 2813.jpg _ 676 | 6278.jpg _ 677 | 9022.jpg _ 678 | 2721.jpg _ 679 | 3188.jpg _ 680 | 8963.jpg _ 681 | 8969.jpg _ 682 | 8259.jpg _ 683 | 5181.jpg _ 684 | 10466.jpg _ 685 | 1564.jpg _ 686 | 7661.jpg _ 687 | 6745.jpg _ 688 | 2887.jpg _ 689 | 6571.jpg _ 690 | 6285.jpg _ 691 | 4871.jpg _ 692 | 3705.jpg _ 693 | 5078.jpg _ 694 | 1881.jpg _ 695 | 5620.jpg _ 696 | 1975.jpg _ 697 | 5167.jpg _ 698 | 2366.jpg _ 699 | 3693.jpg _ 700 | 465.jpg _ 701 | 637.jpg _ 702 | 19.jpg _ 703 | 1108.jpg _ 704 | 8037.jpg _ 705 | 7865.jpg _ 706 | 2926.jpg _ 707 | 8220.jpg _ 708 | 9230.jpg _ 709 | 9825.jpg _ 710 | 1734.jpg _ 711 | 149.jpg _ 712 | 9663.jpg _ 713 | 2948.jpg _ 714 | 5936.jpg _ 715 | 9321.jpg _ 716 | 6108.jpg _ 717 | 5815.jpg _ 718 | 8621.jpg _ 719 | 9035.jpg _ 720 | 2139.jpg _ 721 | 7425.jpg _ 722 | 2987.jpg _ 723 | 902.jpg _ 724 | 10320.jpg _ 725 | 3465.jpg _ 726 | 2364.jpg _ 727 | 8976.jpg _ 728 | 9417.jpg _ 729 | 3455.jpg _ 730 | 9454.jpg _ 731 | 2305.jpg _ 732 | 3081.jpg _ 733 | 167.jpg _ 734 | 811.jpg _ 735 | 1874.jpg _ 736 | 10225.jpg _ 737 | 6840.jpg _ 738 | 6574.jpg _ 739 | 3187.jpg _ 740 | 9730.jpg _ 741 | 4323.jpg _ 742 | 218.jpg _ 743 | 8733.jpg _ 744 | 9161.jpg _ 745 | 2055.jpg _ 746 | 1480.jpg _ 747 | 7804.jpg _ 748 | 9927.jpg _ 749 | 9361.jpg _ 750 | 8679.jpg _ 751 | 6829.jpg _ 752 | 4448.jpg _ 753 | 2236.jpg _ 754 | 644.jpg _ 755 | 953.jpg _ 756 | 796.jpg _ 757 | 714.jpg _ 758 | 8734.jpg _ 759 | 4086.jpg _ 760 | 5338.jpg _ 761 | 1106.jpg _ 762 | 2076.jpg _ 763 | 7847.jpg _ 764 | 3454.jpg _ 765 | 2947.jpg _ 766 | 2048.jpg _ 767 | 563.jpg _ 768 | 5824.jpg _ 769 | 5954.jpg _ 770 | 8507.jpg _ 771 | 2769.jpg _ 772 | 3141.jpg _ 773 | 10300.jpg _ 774 | 6948.jpg _ 775 | 3131.jpg _ 776 | 6081.jpg _ 777 | 943.jpg _ 778 | 8257.jpg _ 779 | 8780.jpg _ 780 | 4755.jpg _ 781 | 9778.jpg _ 782 | 1856.jpg _ 783 | 3937.jpg _ 784 | 2074.jpg _ 785 | 9955.jpg _ 786 | 536.jpg _ 787 | 10401.jpg _ 788 | 1585.jpg _ 789 | 7430.jpg _ 790 | 4200.jpg _ 791 | 8798.jpg _ 792 | 8938.jpg _ 793 | 9538.jpg _ 794 | 6050.jpg _ 795 | 3331.jpg _ 796 | 10111.jpg _ 797 | 10313.jpg _ 798 | 7577.jpg _ 799 | 10063.jpg _ 800 | 7744.jpg _ 801 | 734.jpg _ 802 | 5505.jpg _ 803 | 1094.jpg _ 804 | 10237.jpg _ 805 | 9244.jpg _ 806 | 6246.jpg _ 807 | 2833.jpg _ 808 | 9183.jpg _ 809 | 8057.jpg _ 810 | 2755.jpg _ 811 | 9561.jpg _ 812 | 5097.jpg _ 813 | 2662.jpg _ 814 | 5842.jpg _ 815 | 8775.jpg _ 816 | 1884.jpg _ 817 | 8452.jpg _ 818 | 8067.jpg _ 819 | 3309.jpg _ 820 | 7297.jpg _ 821 | 8645.jpg _ 822 | 9831.jpg _ 823 | 5446.jpg _ 824 | 8476.jpg _ 825 | 4737.jpg _ 826 | 3945.jpg _ 827 | 6163.jpg _ 828 | 7880.jpg _ 829 | 7964.jpg _ 830 | 2992.jpg _ 831 | 7780.jpg _ 832 | 5079.jpg _ 833 | 6107.jpg _ 834 | 6483.jpg _ 835 | 7369.jpg _ 836 | 5411.jpg _ 837 | 6809.jpg _ 838 | 5679.jpg _ 839 | 201.jpg _ 840 | 10180.jpg 
_ 841 | 3084.jpg _ 842 | 5296.jpg _ 843 | 6373.jpg _ 844 | 7477.jpg _ 845 | 2913.jpg _ 846 | 5341.jpg _ 847 | 3697.jpg _ 848 | 7506.jpg _ 849 | 681.jpg _ 850 | 1383.jpg _ 851 | 10228.jpg _ 852 | 586.jpg _ 853 | 7769.jpg _ 854 | 1883.jpg _ 855 | 1984.jpg _ 856 | 4201.jpg _ 857 | 7939.jpg _ 858 | 727.jpg _ 859 | 8551.jpg _ 860 | 2118.jpg _ 861 | 498.jpg _ 862 | 7738.jpg _ 863 | 4796.jpg _ 864 | 3165.jpg _ 865 | 248.jpg _ 866 | 9967.jpg _ 867 | 10009.jpg _ 868 | 5118.jpg _ 869 | 4995.jpg _ 870 | 9074.jpg _ 871 | 67.jpg _ 872 | 7544.jpg _ 873 | 3549.jpg _ 874 | 3669.jpg _ 875 | 6607.jpg _ 876 | 4006.jpg _ 877 | 508.jpg _ 878 | 3199.jpg _ 879 | 6907.jpg _ 880 | 9645.jpg _ 881 | 782.jpg _ 882 | 3204.jpg _ 883 | 10256.jpg _ 884 | 5565.jpg _ 885 | 6494.jpg _ 886 | 225.jpg _ 887 | 8549.jpg _ 888 | 5211.jpg _ 889 | 8693.jpg _ 890 | 8845.jpg _ 891 | 7283.jpg _ 892 | 1600.jpg _ 893 | 4101.jpg _ 894 | 10291.jpg _ 895 | 6091.jpg _ 896 | 8171.jpg _ 897 | 2528.jpg _ 898 | 2102.jpg _ 899 | 6071.jpg _ 900 | 3992.jpg _ 901 | 4262.jpg _ 902 | 2160.jpg _ 903 | 8351.jpg _ 904 | 8995.jpg _ 905 | 10474.jpg _ 906 | 7986.jpg _ 907 | 9063.jpg _ 908 | 5656.jpg _ 909 | 5618.jpg _ 910 | 9593.jpg _ 911 | 8228.jpg _ 912 | 9843.jpg _ 913 | 6001.jpg _ 914 | 163.jpg _ 915 | 3615.jpg _ 916 | 5524.jpg _ 917 | 110.jpg _ 918 | 1545.jpg _ 919 | 9848.jpg _ 920 | 10032.jpg _ 921 | 3292.jpg _ 922 | 8997.jpg _ 923 | 3354.jpg _ 924 | 5730.jpg _ 925 | 6668.jpg _ 926 | 2581.jpg _ 927 | 6665.jpg _ 928 | 3716.jpg _ 929 | 9050.jpg _ 930 | 3916.jpg _ 931 | 4367.jpg _ 932 | 7124.jpg _ 933 | 8608.jpg _ 934 | 2427.jpg _ 935 | 629.jpg _ 936 | 550.jpg _ 937 | 3293.jpg _ 938 | 9781.jpg _ 939 | 9460.jpg _ 940 | 6709.jpg _ 941 | 8712.jpg _ 942 | 7106.jpg _ 943 | 2042.jpg _ 944 | 7893.jpg _ 945 | 2398.jpg _ 946 | 1432.jpg _ 947 | 1251.jpg _ 948 | 753.jpg _ 949 | 5005.jpg _ 950 | 5519.jpg _ 951 | 4757.jpg _ 952 | 1666.jpg _ 953 | 7793.jpg _ 954 | 8745.jpg _ 955 | 10095.jpg _ 956 | 9118.jpg _ 957 | 9679.jpg _ 958 | 3717.jpg _ 959 | 9553.jpg _ 960 | 8075.jpg _ 961 | 6933.jpg _ 962 | 679.jpg _ 963 | 3185.jpg _ 964 | 164.jpg _ 965 | 2319.jpg _ 966 | 9314.jpg _ 967 | 2457.jpg _ 968 | 3197.jpg _ 969 | 1039.jpg _ 970 | 5701.jpg _ 971 | 6857.jpg _ 972 | 5688.jpg _ 973 | 8286.jpg _ 974 | 8690.jpg _ 975 | 1490.jpg _ 976 | 3105.jpg _ 977 | 9097.jpg _ 978 | 3118.jpg _ 979 | 2169.jpg _ 980 | 9997.jpg _ 981 | 6032.jpg _ 982 | 6934.jpg _ 983 | 4866.jpg _ 984 | 7215.jpg _ 985 | 7576.jpg _ 986 | 10027.jpg _ 987 | 6041.jpg _ 988 | 7069.jpg _ 989 | 10174.jpg _ 990 | 5168.jpg _ 991 | 7470.jpg _ 992 | 8464.jpg _ 993 | 2772.jpg _ 994 | 1054.jpg _ 995 | 1326.jpg _ 996 | 2513.jpg _ 997 | 7074.jpg _ 998 | 10230.jpg _ 999 | 2470.jpg _ 1000 | 6714.jpg _ 1001 | -------------------------------------------------------------------------------- /evaluate_mono.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import skimage.io 4 | import argparse 5 | import cv2 6 | from tqdm import tqdm 7 | from utils import read_d, parse_dataset_txt, compute_scale_and_shift, read_calib_xml 8 | import threading 9 | 10 | CATEGORIES = ['All', 'ToM', 'Other'] 11 | METRICS = ['delta1.25', 'delta1.20', 'delta1.15', 'delta1.10', 'delta1.05', 'mae', 'absrel', 'rmse'] 12 | 13 | class evalThread(threading.Thread): 14 | def __init__(self, idxs, gts, preds, focals, baselines, acc, categories, min_depth=1, max_depth=10000, resize_factor=0.25, baseline_factor=1000, median_scale_and_shift=False): 15 | super(evalThread, self).__init__() 
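        # Note: each evalThread processes a contiguous slice of sample indices and appends its
        # per-image metrics to the shared `acc` dictionary; no explicit lock is used, which
        # relies on list.append being atomic in CPython.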
16 | self.idxs = idxs 17 | self.gts = gts 18 | self.preds = preds 19 | self.focals = focals 20 | self.baselines = baselines 21 | self.min_depth = min_depth 22 | self.max_depth = max_depth 23 | self.acc = acc 24 | self.categories = categories 25 | self.baseline_factor = baseline_factor 26 | self.median_scale_and_shift = median_scale_and_shift 27 | self.resize_factor = resize_factor 28 | 29 | def run(self): 30 | for idx in self.idxs: 31 | gt = read_d(self.gts[idx], scale_factor=256.) 32 | fx = self.focals[idx] 33 | baseline = self.baselines[idx] 34 | baseline = baseline * self.baseline_factor 35 | 36 | gt = cv2.resize(gt, None, fx=self.resize_factor, fy=self.resize_factor, interpolation=cv2.INTER_NEAREST) 37 | fx = fx * self.resize_factor 38 | gt = gt.astype(np.float32) * self.resize_factor 39 | 40 | # CLIP DEPTH GT 41 | gt[gt > fx * baseline / self.min_depth] = 0 # INVALID IF LESS THAN 1mm (very high disparity values) 42 | gt[gt < fx * baseline / self.max_depth] = 0 # INVALID IF MORE THAN max_depth meters (very small disparity values) 43 | 44 | pred = read_d(self.preds[idx], scale_factor=256.) 45 | pred = cv2.resize(pred, (gt.shape[1], gt.shape[0]), interpolation=cv2.INTER_CUBIC) 46 | pred = (pred - np.min(pred[gt > 0])) / (pred[gt > 0].max() - pred[gt > 0].min()) 47 | if self.median_scale_and_shift: 48 | gt_shifted = gt - gt[gt>0].min() 49 | scale = np.median(gt_shifted[gt > 0])/np.median(pred[gt > 0]) 50 | pred = pred * scale 51 | shift = np.median(gt[gt > 0] - pred[gt > 0]) 52 | pred = pred + shift 53 | else: 54 | scale, shift = compute_scale_and_shift(np.expand_dims(pred, axis=0), 55 | np.expand_dims(gt, axis=0), 56 | np.expand_dims((gt > 0).astype(np.float32), axis=0)) 57 | pred = pred * scale + shift 58 | 59 | pred = baseline * fx / pred 60 | 61 | # CLIP PRED TO WORKING RANGE 62 | pred[np.isinf(pred)] = self.max_depth 63 | pred[pred > self.max_depth] = self.max_depth 64 | pred[pred < self.min_depth] = self.min_depth 65 | 66 | 67 | 68 | 69 | if len(self.categories) > 1: 70 | seg_mask = skimage.io.imread(self.gts[idx].replace(os.path.basename(self.gts[idx]), 'mask_cat.png')) 71 | seg_mask = cv2.resize(seg_mask, None, fx=self.resize_factor, fy=self.resize_factor, interpolation=cv2.INTER_NEAREST) 72 | 73 | for category in self.categories: 74 | valid = (gt>0).astype(np.float32) 75 | 76 | if category != 'All': 77 | if category == "Other": 78 | mask0 = seg_mask == 0 79 | mask1 = seg_mask == 1 80 | else: 81 | mask0 = seg_mask == 2 82 | mask1 = seg_mask == 3 83 | mask = mask0 | mask1 84 | mask = mask.astype(np.float32) 85 | valid = valid * mask 86 | 87 | if valid.sum() > 0: 88 | metrics = booster_metrics(pred, gt, valid) 89 | for k in METRICS: 90 | self.acc[category][k].append(metrics[k]) 91 | 92 | 93 | # Main evaluation function 94 | def booster_metrics(d, gt, valid): 95 | error = np.abs(d-gt) 96 | error[valid==0] = 0 97 | 98 | thresh = np.maximum((d[valid > 0] / gt[valid > 0]), (gt[valid > 0] / d[valid > 0])) 99 | delta3 = (thresh < 1.25).astype(np.float32).mean() 100 | delta4 = (thresh < 1.20).astype(np.float32).mean() 101 | delta5 = (thresh < 1.15).astype(np.float32).mean() 102 | delta6 = (thresh < 1.10).astype(np.float32).mean() 103 | delta7 = (thresh < 1.05).astype(np.float32).mean() 104 | 105 | avgerr = error[valid>0].mean() 106 | abs_rel = (error[valid>0]/gt[valid>0]).mean() 107 | 108 | rms = (d-gt)**2 109 | rms = np.sqrt( rms[valid>0].mean() ) 110 | 111 | return {'delta1.25':delta3*100., 'delta1.20':delta4*100.,'delta1.15':delta5*100., 'delta1.10':delta6*100., 'delta1.05':delta7*100., 'mae':avgerr, 'absrel': abs_rel, 'rmse':rms, 'errormap':error*(valid>0)} 112 | 113 | 114 | def 
eval(gts, preds, focals, baselines, min_depth=1, max_depth=10000, resize_factor=0.25, baseline_factor=1000, median_scale_and_shift=False): 115 | # Check all files OK 116 | for test_img in preds: 117 | if not os.path.exists(test_img): 118 | print("Missing files in the submission") 119 | exit(-1) 120 | 121 | if not os.path.exists(gts[0].replace(os.path.basename(gts[0]), 'mask_cat.png')): 122 | categories = ['All'] 123 | else: 124 | categories = CATEGORIES 125 | 126 | # INIT 127 | acc = {} 128 | results = {} 129 | for category in categories: 130 | acc[category] = {} 131 | results[category] = {} 132 | for metric in METRICS: 133 | acc[category][metric] = [] 134 | results[category][metric] = [] 135 | 136 | num_samples = len(gts) 137 | print("Number of samples", num_samples) 138 | num_workers = 32 139 | threads = [] 140 | for i in range(num_workers): 141 | start_idx = num_samples//num_workers * i 142 | if i != num_workers -1: 143 | end_idx = num_samples//num_workers * (i+1) 144 | else: 145 | end_idx = num_samples 146 | idxs = range(start_idx, end_idx) 147 | t = evalThread(idxs, gts, preds, focals, baselines, acc, categories, min_depth, max_depth, resize_factor, baseline_factor, median_scale_and_shift) 148 | threads.append(t) 149 | t.start() 150 | 151 | for t in threads: 152 | t.join() 153 | 154 | for category in categories: 155 | for k in acc[category]: 156 | results[category][k] = np.array(acc[category][k]).mean() 157 | 158 | return results 159 | 160 | 161 | def result2string(result): 162 | result_string = "{:<12}".format("CLASS") 163 | for k in METRICS: 164 | result_string += "{:<12}".format(k) 165 | result_string += "\n" 166 | for cat in CATEGORIES: 167 | if cat in result: 168 | result_string += "{:<12}".format(cat) 169 | for metric in METRICS: 170 | tmp = "" 171 | if metric in result[cat]: tmp = "{:.2f}".format(result[cat][metric]) 172 | result_string += "{:<12}".format(tmp) 173 | result_string += "\n" 174 | return result_string 175 | 176 | 177 | if __name__ == "__main__": 178 | 179 | parser = argparse.ArgumentParser() 180 | parser.add_argument('--gt_root', 181 | help='folder with gt' 182 | ) 183 | parser.add_argument('--pred_root', 184 | help='folder with predictions' 185 | ) 186 | parser.add_argument('--pred_ext', 187 | default=".npy", 188 | help='prediction extension' 189 | ) 190 | parser.add_argument('--dataset_txt', 191 | help='txt file with a set of $basename $gtpath $calib_file or $basename $gtpath $fx $baseline or $basename $gtpath' 192 | ) 193 | 194 | parser.add_argument('--output_path', 195 | default="results.txt", 196 | help='output file' 197 | ) 198 | parser.add_argument('--resize_factor', 199 | default=0.25, 200 | type=float, 201 | help='resize gt images with this factor. 
Evaluation will be done at the gt resolution' 202 | ) 203 | parser.add_argument('--baseline_factor', 204 | default=1000, 205 | type=float, 206 | help='scale baseline using this factor' 207 | ) 208 | parser.add_argument('--min_depth', 209 | default=1, 210 | type=float, 211 | help='min depth in millimeters' 212 | ) 213 | parser.add_argument('--max_depth', 214 | default=10000, 215 | type=float, 216 | help='max depth in millimeters' 217 | ) 218 | parser.add_argument('--median_scale_and_shift', 219 | action="store_true", 220 | help='rescale prediction with median instead of least square scale and shift' 221 | ) 222 | args = parser.parse_args() 223 | 224 | # Getting dataset paths 225 | dataset_dict = parse_dataset_txt(args.dataset_txt) 226 | 227 | gt_files = [os.path.join(args.gt_root, f) for f in dataset_dict["gt_paths"]] 228 | basenames = [os.path.join(args.pred_root, os.path.splitext(f)[0] + args.pred_ext) for f in dataset_dict["basenames"]] 229 | 230 | if "calib_paths" in dataset_dict: 231 | focals = [] 232 | baselines = [] 233 | for calib_path in dataset_dict["calib_paths"]: 234 | fx, baseline = read_calib_xml(os.path.join(args.gt_root, calib_path)) 235 | focals.append(fx) 236 | baselines.append(baseline) 237 | elif "focals" in dataset_dict and "baselines" in dataset_dict: 238 | focals = dataset_dict["focals"] 239 | baselines = dataset_dict["baselines"] 240 | else: 241 | print("Missing focals and baselines or calib files") 242 | exit(-1) 243 | 244 | # Evaluation 245 | results = eval(gt_files, basenames, focals, baselines, args.min_depth, args.max_depth, args.resize_factor, args.baseline_factor, args.median_scale_and_shift) 246 | 247 | # Saving results 248 | results_str = result2string(results) 249 | print(results_str) 250 | with open(args.output_path, "w") as fout: 251 | fout.write(results_str) -------------------------------------------------------------------------------- /finetune.py: -------------------------------------------------------------------------------- 1 | import os 2 | import torch 3 | import cv2 4 | import argparse 5 | import time 6 | 7 | from tqdm import tqdm, trange 8 | import numpy as np 9 | import matplotlib.pyplot as plt 10 | 11 | from torchvision.transforms import Compose 12 | from torchvision.utils import make_grid 13 | from torch.optim.lr_scheduler import ExponentialLR 14 | from torch.utils.data import DataLoader, ConcatDataset 15 | 16 | import wandb 17 | 18 | from midas.dpt_depth import DPTDepthModel 19 | from midas.midas_net import MidasNet 20 | from midas.midas_net_custom import MidasNet_small 21 | from midas.transforms import Resize, ResizeTrain, NormalizeImage, PrepareForNet, RandomCrop, MirrorSquarePad, ColorAug, RandomHorizontalFlip 22 | 23 | from datasets.dataloader import MSDLoader, Trans10KLoader 24 | 25 | from loss import ScaleAndShiftInvariantLoss, GradientLoss, MSELoss 26 | 27 | def rescale(x, a = 0.0, b = 1.0): 28 | return a + (b - a)*((x - x.min())/(x.max() - x.min())) 29 | 30 | 31 | def run(args): 32 | """Run MonoDepthNN to train on novel depth maps.""" 33 | 34 | training_datasets = args.training_datasets 35 | training_datasets_dir = args.training_datasets_dir 36 | training_datasets_txt = args.training_datasets_txt 37 | output_path= os.path.join(args.output_path, args.exp_name) 38 | model_path=args.model_path 39 | model_type=args.model_type 40 | 41 | wandb.init(project = f"finetuning-{model_type}", 42 | name = args.exp_name, 43 | config = {"epochs" : args.epochs, 44 | "batch_size" : args.batch_size, 45 | "model_type" : model_type, 46 | "model_path": 
model_path, 47 | "training_datasets" : training_datasets, 48 | "training_datasets_dir": training_datasets_dir, 49 | "training_datasets_txt": training_datasets_txt, 50 | }) 51 | 52 | # Select device. 53 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 54 | print("Device: %s." % device) 55 | 56 | 57 | #### MODEL 58 | # Load network. 59 | if model_type == "dpt_large": # DPT-Large 60 | model = DPTDepthModel( 61 | path=None, 62 | backbone="vitl16_384", 63 | non_negative=True, 64 | ) 65 | net_w, net_h = 384, 384 66 | normalization = NormalizeImage(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]) 67 | transform = Compose( 68 | [ 69 | RandomHorizontalFlip(prob=0.5), 70 | ResizeTrain( 71 | net_w, 72 | net_h, 73 | resize_target=True, 74 | keep_aspect_ratio=True, 75 | ensure_multiple_of=32, 76 | resize_method="lower_bound", 77 | image_interpolation_method=cv2.INTER_CUBIC, 78 | ), 79 | RandomCrop(net_w, net_h), 80 | ColorAug(prob=0.5), 81 | normalization, 82 | PrepareForNet(), 83 | ] 84 | ) 85 | elif model_type == "midas_v21": 86 | model = MidasNet(None, non_negative=True) 87 | net_w, net_h = 384, 384 88 | normalization = NormalizeImage( 89 | mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225] 90 | ) 91 | transform = Compose( 92 | [ 93 | RandomHorizontalFlip(prob=0.5), 94 | MirrorSquarePad(), 95 | ResizeTrain( 96 | net_w, 97 | net_h, 98 | resize_target=True, 99 | keep_aspect_ratio=False, 100 | ensure_multiple_of=32, 101 | resize_method="upper_bound", 102 | image_interpolation_method=cv2.INTER_CUBIC, 103 | ), 104 | ColorAug(prob=0.5), 105 | normalization, 106 | PrepareForNet(), 107 | ] 108 | ) 109 | else: 110 | print(f"model_type '{model_type}' not implemented, use: --model_type dpt_large or midas_v21") 111 | assert False 112 | 113 | reload = torch.load(model_path) 114 | if "model_state_dict" in reload.keys(): 115 | checkpoint = reload['model_state_dict'] 116 | else: 117 | checkpoint = reload 118 | model.load_state_dict(checkpoint) 119 | 120 | optimizer = torch.optim.NAdam(model.parameters(), lr = 1e-7) 121 | if "optimizer_state_dict" in reload.keys() and args.continue_train: 122 | optimizer.load_state_dict(reload['optimizer_state_dict']) 123 | 124 | 125 | scheduler = ExponentialLR(optimizer, gamma = 0.95) 126 | if "scheduler" in reload.keys() and args.continue_train: 127 | scheduler.load_state_dict(reload['scheduler']) 128 | 129 | ss_loss, grad_loss, mse_loss = ScaleAndShiftInvariantLoss(), GradientLoss(), MSELoss() 130 | 131 | # Un-freeze all layers. 
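    # All parameters are left trainable for full fine-tuning; setting requires_grad to False
    # below would freeze the corresponding layers instead (hence the commented-out value).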
132 | for param in model.parameters(): 133 | param.requires_grad = True # False 134 | 135 | # wandb.watch(model, log_freq=100) 136 | model.to(device) 137 | 138 | ### DATASETS 139 | t_datasets = [] 140 | 141 | if "trans10k" in training_datasets: 142 | idx = training_datasets.index("trans10k") 143 | train_t10k = Trans10KLoader(training_datasets_dir[idx], training_datasets_txt[idx], transform=transform) 144 | print("Training Samples Trans10K", len(train_t10k)) 145 | t_datasets.append(train_t10k) 146 | if "msd" in training_datasets: 147 | idx = training_datasets.index("msd") 148 | train_msd = MSDLoader(training_datasets_dir[idx], training_datasets_txt[idx], transform=transform) 149 | print("Training Samples MSD", len(train_msd)) 150 | t_datasets.append(train_msd) 151 | 152 | training_data = ConcatDataset(t_datasets) 153 | dataloader_train = DataLoader(dataset = training_data, batch_size = args.batch_size, shuffle = True, num_workers=8) 154 | 155 | running_time = 0.0 156 | train_step = 0 157 | for e in trange(args.epochs): 158 | start_time_epoch = time.time() 159 | 160 | ###---------------[Training loop]---------------### 161 | print(f"Training phase for epoch {e}: ") 162 | 163 | for img, depth, _ in tqdm(dataloader_train): 164 | if train_step % args.step_save == 0 and train_step != 0: 165 | # Save checkpoint. 166 | torch.save({'epoch': e, 167 | 'model_state_dict': model.state_dict(), 168 | 'optimizer_state_dict': optimizer.state_dict(), 169 | 'scheduler': scheduler.state_dict(), 170 | 'loss': loss, 171 | }, output_path + "/{}_{}.pt".format(model_type, train_step)) 172 | 173 | model.train(True) # I think it's redundant... 174 | 175 | # Turn to tensor and send to device. 176 | sample = img.to(device) 177 | gt = depth.to(device) 178 | optimizer.zero_grad() 179 | prediction = model(sample) 180 | 181 | mask_idx = torch.full(size = prediction.shape, fill_value = 1).to(device) 182 | loss = ss_loss(prediction, gt, mask_idx) + grad_loss(prediction, gt, mask_idx) + mse_loss(prediction, gt, mask_idx) 183 | 184 | if train_step % args.step_log == 0: 185 | wandb.log({"train/batch-wise-loss" : loss.detach().cpu()}) 186 | if train_step % args.step_log_images == 0: 187 | vis_rgbs = torch.nn.functional.interpolate(sample, scale_factor=0.25, mode="bilinear") 188 | vis_preds = torch.nn.functional.interpolate(prediction.unsqueeze(1), scale_factor=0.25) 189 | vis_gts = torch.nn.functional.interpolate(gt.unsqueeze(1), scale_factor=0.25) 190 | wandb.log({ 191 | "train/rgb": wandb.Image(make_grid(vis_rgbs, nrow = 4)), 192 | "train/prediction": wandb.Image(make_grid(vis_preds, nrow = 4)), 193 | "train/groundtruth" : wandb.Image(make_grid(vis_gts, nrow = 4)) 194 | }) 195 | if torch.isnan(loss) or torch.isinf(loss): 196 | exit() 197 | 198 | if not torch.isnan(loss) and not torch.isinf(loss): 199 | loss.backward() 200 | optimizer.step() 201 | train_step += 1 202 | 203 | scheduler.step() 204 | 205 | epoch_time = (time.time() - start_time_epoch) 206 | running_time += epoch_time 207 | print(f'Epoch {e} done in {epoch_time} s.') 208 | 209 | 210 | # Save checkpoint. 
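    # The checkpoint saved here keeps the optimizer and scheduler state dicts alongside the model,
    # so fine-tuning can be resumed with --continue_train; the final checkpoint below stores only
    # the model weights.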
211 | torch.save({'epoch': e, 212 | 'model_state_dict': model.state_dict(), 213 | 'optimizer_state_dict': optimizer.state_dict(), 214 | 'scheduler': scheduler.state_dict(), 215 | 'loss': loss, 216 | }, output_path + "/{}_{}.pt".format(model_type, train_step)) 217 | 218 | # Save final ckpt without optimizer and scheduler 219 | torch.save({'model_state_dict': model.state_dict()}, output_path + "/{}_final.pt".format(model_type)) 220 | 221 | if __name__ == "__main__": 222 | 223 | parser = argparse.ArgumentParser() 224 | 225 | parser.add_argument('--exp_name', 226 | default='midas-ft', 227 | ) 228 | 229 | # Paths 230 | parser.add_argument('--training_datasets', 231 | nargs='+', 232 | default=['msd', 'trans10k'], 233 | help='training datasets' 234 | ) 235 | 236 | parser.add_argument('--training_datasets_dir', 237 | nargs='+', 238 | default=['MSD/', 'Trans10K/'], 239 | help='list of root directories, one per training dataset' 240 | ) 241 | 242 | parser.add_argument('--training_datasets_txt', 243 | nargs='+', 244 | default=['datasets/msd/train.txt', 'datasets/trans10k/train.txt'], 245 | help='list of txt split files, one per training dataset' 246 | ) 247 | 248 | parser.add_argument('-o', '--output_path', 249 | default='./experiment_models', 250 | help='where to save the model' 251 | ) 252 | 253 | # Model specs 254 | parser.add_argument('-m', '--model_path', 255 | default=None, 256 | help='path to the trained weights of model' 257 | ) 258 | 259 | parser.add_argument('-t', '--model_type', 260 | default='dpt_large', 261 | help='model type: dpt_large, midas_v21' 262 | ) 263 | 264 | # Training params 265 | parser.add_argument('-e', '--epochs', 266 | default=20, 267 | type=int, 268 | help='number of epochs' 269 | ) 270 | 271 | parser.add_argument('-bs', '--batch_size', 272 | default=8, 273 | type=int, 274 | help='batch_size' 275 | ) 276 | 277 | parser.add_argument('--continue_train', 278 | action="store_true", 279 | help='load optimizer and scheduler state dict' 280 | ) 281 | 282 | # Logging params 283 | parser.add_argument('--step_save', 284 | default=5000, 285 | type=int, 286 | help='number of steps to save the model' 287 | ) 288 | parser.add_argument('--step_log', 289 | default=10, 290 | type=int, 291 | help='number of steps between loss logs' 292 | ) 293 | parser.add_argument('--step_log_images', 294 | default=1000, 295 | type=int, 296 | help='number of steps between image logs' 297 | ) 298 | 299 | args = parser.parse_args() 300 | print(args) 301 | 302 | os.makedirs(os.path.join(args.output_path, args.exp_name), exist_ok=True) 303 | 304 | default_models = { 305 | "midas_v21" : "weights/Base/midas_v21-base.pt", 306 | "dpt_large" : "weights/Base/dpt_large-base.pt", 307 | } 308 | 309 | if args.model_path is None: 310 | args.model_path = default_models[args.model_type] 311 | 312 | # Set torch options 313 | torch.backends.cudnn.enabled = True 314 | torch.backends.cudnn.benchmark = True 315 | 316 | # Start fine-tuning. 
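    # run() builds the selected MiDaS/DPT model, concatenates the Trans10K and MSD loaders,
    # and executes the fine-tuning loop defined above.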
317 | run(args) 318 | -------------------------------------------------------------------------------- /images/framework_mono.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CVLAB-Unibo/Depth4ToM-code/5de0f869d66edc48b79d2f9f197756e71b342f9a/images/framework_mono.png -------------------------------------------------------------------------------- /images/qualitatives.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CVLAB-Unibo/Depth4ToM-code/5de0f869d66edc48b79d2f9f197756e71b342f9a/images/qualitatives.png -------------------------------------------------------------------------------- /loss.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | 5 | def compute_scale_and_shift(prediction, target, mask): 6 | # system matrix: A = [[a_00, a_01], [a_10, a_11]] 7 | a_00 = torch.sum(mask * prediction * prediction, (1, 2)) 8 | a_01 = torch.sum(mask * prediction, (1, 2)) 9 | a_11 = torch.sum(mask, (1, 2)) 10 | 11 | # right hand side: b = [b_0, b_1] 12 | b_0 = torch.sum(mask * prediction * target, (1, 2)) 13 | b_1 = torch.sum(mask * target, (1, 2)) 14 | 15 | # solution: x = A^-1 . b = [[a_11, -a_01], [-a_10, a_00]] / (a_00 * a_11 - a_01 * a_10) . b 16 | x_0 = torch.zeros_like(b_0) 17 | x_1 = torch.zeros_like(b_1) 18 | 19 | det = a_00 * a_11 - a_01 * a_01 20 | valid = det.nonzero() 21 | 22 | x_0[valid] = (a_11[valid] * b_0[valid] - a_01[valid] * b_1[valid]) / det[valid] 23 | x_1[valid] = (-a_01[valid] * b_0[valid] + a_00[valid] * b_1[valid]) / det[valid] 24 | 25 | return x_0, x_1 26 | 27 | 28 | def reduction_batch_based(image_loss, M): 29 | # average of all valid pixels of the batch 30 | 31 | # avoid division by 0 (if sum(M) = sum(sum(mask)) = 0: sum(image_loss) = 0) 32 | divisor = torch.sum(M) 33 | 34 | if divisor == 0: 35 | return 0 36 | else: 37 | return torch.sum(image_loss) / divisor 38 | 39 | 40 | def reduction_image_based(image_loss, M): 41 | # mean of average of valid pixels of an image 42 | 43 | # avoid division by 0 (if M = sum(mask) = 0: image_loss = 0) 44 | valid = M.nonzero() 45 | 46 | image_loss[valid] = image_loss[valid] / M[valid] 47 | 48 | return torch.mean(image_loss) 49 | 50 | 51 | def mse_loss(prediction, target, mask, reduction=reduction_batch_based): 52 | 53 | M = torch.sum(mask, (1, 2)) 54 | res = prediction - target 55 | image_loss = torch.sum(mask * res * res, (1, 2)) 56 | 57 | return reduction(image_loss, 2 * M) 58 | 59 | 60 | def gradient_loss(prediction, target, mask, reduction=reduction_batch_based): 61 | 62 | M = torch.sum(mask, (1, 2)) 63 | 64 | diff = prediction - target 65 | diff = torch.mul(mask, diff) 66 | 67 | grad_x = torch.abs(diff[:, :, 1:] - diff[:, :, :-1]) 68 | mask_x = torch.mul(mask[:, :, 1:], mask[:, :, :-1]) 69 | grad_x = torch.mul(mask_x, grad_x) 70 | 71 | grad_y = torch.abs(diff[:, 1:, :] - diff[:, :-1, :]) 72 | mask_y = torch.mul(mask[:, 1:, :], mask[:, :-1, :]) 73 | grad_y = torch.mul(mask_y, grad_y) 74 | 75 | image_loss = torch.sum(grad_x, (1, 2)) + torch.sum(grad_y, (1, 2)) 76 | 77 | return reduction(image_loss, M) 78 | 79 | 80 | class MSELoss(nn.Module): 81 | def __init__(self, reduction='batch-based'): 82 | super().__init__() 83 | 84 | if reduction == 'batch-based': 85 | self.__reduction = reduction_batch_based 86 | else: 87 | self.__reduction = reduction_image_based 88 | 89 | def forward(self, prediction, target, mask): 90 | 
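        # The binary mask selects the valid pixels; mse_loss sums the squared error over them
        # and normalizes it according to the chosen reduction (batch-based or image-based).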
return mse_loss(prediction, target, mask, reduction=self.__reduction) 91 | 92 | 93 | class GradientLoss(nn.Module): 94 | def __init__(self, scales=4, reduction='batch-based'): 95 | super().__init__() 96 | 97 | if reduction == 'batch-based': 98 | self.__reduction = reduction_batch_based 99 | else: 100 | self.__reduction = reduction_image_based 101 | 102 | self.__scales = scales 103 | 104 | def forward(self, prediction, target, mask): 105 | total = 0 106 | 107 | for scale in range(self.__scales): 108 | step = pow(2, scale) 109 | 110 | total += gradient_loss(prediction[:, ::step, ::step], target[:, ::step, ::step], 111 | mask[:, ::step, ::step], reduction=self.__reduction) 112 | 113 | return total 114 | 115 | 116 | class ScaleAndShiftInvariantLoss(nn.Module): 117 | def __init__(self, alpha=0.5, scales=4, reduction='batch-based'): 118 | super().__init__() 119 | 120 | self.__data_loss = MSELoss(reduction=reduction) 121 | self.__regularization_loss = GradientLoss(scales=scales, reduction=reduction) 122 | self.__alpha = alpha 123 | 124 | self.__prediction_ssi = None 125 | 126 | def forward(self, prediction, target, mask): 127 | 128 | scale, shift = compute_scale_and_shift(prediction, target, mask) 129 | self.__prediction_ssi = scale.view(-1, 1, 1) * prediction + shift.view(-1, 1, 1) 130 | 131 | total = self.__data_loss(self.__prediction_ssi, target, mask) 132 | if self.__alpha > 0: 133 | total += self.__alpha * self.__regularization_loss(self.__prediction_ssi, target, mask) 134 | 135 | return total 136 | 137 | def __get_prediction_ssi(self): 138 | return self.__prediction_ssi 139 | 140 | prediction_ssi = property(__get_prediction_ssi) -------------------------------------------------------------------------------- /midas/base_model.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | 4 | class BaseModel(torch.nn.Module): 5 | def load(self, path): 6 | """Load model from file. 
7 | 8 | Args: 9 | path (str): file path 10 | """ 11 | parameters = torch.load(path, map_location=torch.device('cpu')) 12 | 13 | if "optimizer" in parameters: 14 | parameters = parameters["model"] 15 | 16 | self.load_state_dict(parameters) 17 | -------------------------------------------------------------------------------- /midas/blocks.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | from .vit import ( 5 | _make_pretrained_vitb_rn50_384, 6 | _make_pretrained_vitl16_384, 7 | _make_pretrained_vitb16_384, 8 | forward_vit, 9 | ) 10 | 11 | def _make_encoder(backbone, features, use_pretrained, groups=1, expand=False, exportable=True, hooks=None, use_vit_only=False, use_readout="ignore",): 12 | if backbone == "vitl16_384": 13 | pretrained = _make_pretrained_vitl16_384( 14 | use_pretrained, hooks=hooks, use_readout=use_readout 15 | ) 16 | scratch = _make_scratch( 17 | [256, 512, 1024, 1024], features, groups=groups, expand=expand 18 | ) # ViT-L/16 - 85.0% Top1 (backbone) 19 | elif backbone == "vitb_rn50_384": 20 | pretrained = _make_pretrained_vitb_rn50_384( 21 | use_pretrained, 22 | hooks=hooks, 23 | use_vit_only=use_vit_only, 24 | use_readout=use_readout, 25 | ) 26 | scratch = _make_scratch( 27 | [256, 512, 768, 768], features, groups=groups, expand=expand 28 | ) # ViT-H/16 - 85.0% Top1 (backbone) 29 | elif backbone == "vitb16_384": 30 | pretrained = _make_pretrained_vitb16_384( 31 | use_pretrained, hooks=hooks, use_readout=use_readout 32 | ) 33 | scratch = _make_scratch( 34 | [96, 192, 384, 768], features, groups=groups, expand=expand 35 | ) # ViT-B/16 - 84.6% Top1 (backbone) 36 | elif backbone == "resnext101_wsl": 37 | pretrained = _make_pretrained_resnext101_wsl(use_pretrained) 38 | scratch = _make_scratch([256, 512, 1024, 2048], features, groups=groups, expand=expand) # efficientnet_lite3 39 | elif backbone == "efficientnet_lite3": 40 | pretrained = _make_pretrained_efficientnet_lite3(use_pretrained, exportable=exportable) 41 | scratch = _make_scratch([32, 48, 136, 384], features, groups=groups, expand=expand) # efficientnet_lite3 42 | else: 43 | print(f"Backbone '{backbone}' not implemented") 44 | assert False 45 | 46 | return pretrained, scratch 47 | 48 | 49 | def _make_scratch(in_shape, out_shape, groups=1, expand=False): 50 | scratch = nn.Module() 51 | 52 | out_shape1 = out_shape 53 | out_shape2 = out_shape 54 | out_shape3 = out_shape 55 | out_shape4 = out_shape 56 | if expand==True: 57 | out_shape1 = out_shape 58 | out_shape2 = out_shape*2 59 | out_shape3 = out_shape*4 60 | out_shape4 = out_shape*8 61 | 62 | scratch.layer1_rn = nn.Conv2d( 63 | in_shape[0], out_shape1, kernel_size=3, stride=1, padding=1, bias=False, groups=groups 64 | ) 65 | scratch.layer2_rn = nn.Conv2d( 66 | in_shape[1], out_shape2, kernel_size=3, stride=1, padding=1, bias=False, groups=groups 67 | ) 68 | scratch.layer3_rn = nn.Conv2d( 69 | in_shape[2], out_shape3, kernel_size=3, stride=1, padding=1, bias=False, groups=groups 70 | ) 71 | scratch.layer4_rn = nn.Conv2d( 72 | in_shape[3], out_shape4, kernel_size=3, stride=1, padding=1, bias=False, groups=groups 73 | ) 74 | 75 | return scratch 76 | 77 | 78 | def _make_pretrained_efficientnet_lite3(use_pretrained, exportable=False): 79 | efficientnet = torch.hub.load( 80 | "rwightman/gen-efficientnet-pytorch", 81 | "tf_efficientnet_lite3", 82 | pretrained=use_pretrained, 83 | exportable=exportable 84 | ) 85 | return _make_efficientnet_backbone(efficientnet) 86 | 87 | 88 | def 
_make_efficientnet_backbone(effnet): 89 | pretrained = nn.Module() 90 | 91 | pretrained.layer1 = nn.Sequential( 92 | effnet.conv_stem, effnet.bn1, effnet.act1, *effnet.blocks[0:2] 93 | ) 94 | pretrained.layer2 = nn.Sequential(*effnet.blocks[2:3]) 95 | pretrained.layer3 = nn.Sequential(*effnet.blocks[3:5]) 96 | pretrained.layer4 = nn.Sequential(*effnet.blocks[5:9]) 97 | 98 | return pretrained 99 | 100 | 101 | def _make_resnet_backbone(resnet): 102 | pretrained = nn.Module() 103 | pretrained.layer1 = nn.Sequential( 104 | resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool, resnet.layer1 105 | ) 106 | 107 | pretrained.layer2 = resnet.layer2 108 | pretrained.layer3 = resnet.layer3 109 | pretrained.layer4 = resnet.layer4 110 | 111 | return pretrained 112 | 113 | 114 | def _make_pretrained_resnext101_wsl(use_pretrained): 115 | resnet = torch.hub.load("facebookresearch/WSL-Images", "resnext101_32x8d_wsl") 116 | return _make_resnet_backbone(resnet) 117 | 118 | 119 | 120 | class Interpolate(nn.Module): 121 | """Interpolation module. 122 | """ 123 | 124 | def __init__(self, scale_factor, mode, align_corners=False): 125 | """Init. 126 | 127 | Args: 128 | scale_factor (float): scaling 129 | mode (str): interpolation mode 130 | """ 131 | super(Interpolate, self).__init__() 132 | 133 | self.interp = nn.functional.interpolate 134 | self.scale_factor = scale_factor 135 | self.mode = mode 136 | self.align_corners = align_corners 137 | 138 | def forward(self, x): 139 | """Forward pass. 140 | 141 | Args: 142 | x (tensor): input 143 | 144 | Returns: 145 | tensor: interpolated data 146 | """ 147 | 148 | x = self.interp( 149 | x, scale_factor=self.scale_factor, mode=self.mode, align_corners=self.align_corners 150 | ) 151 | 152 | return x 153 | 154 | 155 | class ResidualConvUnit(nn.Module): 156 | """Residual convolution module. 157 | """ 158 | 159 | def __init__(self, features): 160 | """Init. 161 | 162 | Args: 163 | features (int): number of features 164 | """ 165 | super().__init__() 166 | 167 | self.conv1 = nn.Conv2d( 168 | features, features, kernel_size=3, stride=1, padding=1, bias=True 169 | ) 170 | 171 | self.conv2 = nn.Conv2d( 172 | features, features, kernel_size=3, stride=1, padding=1, bias=True 173 | ) 174 | 175 | self.relu = nn.ReLU(inplace=True) 176 | 177 | def forward(self, x): 178 | """Forward pass. 179 | 180 | Args: 181 | x (tensor): input 182 | 183 | Returns: 184 | tensor: output 185 | """ 186 | out = self.relu(x) 187 | out = self.conv1(out) 188 | out = self.relu(out) 189 | out = self.conv2(out) 190 | 191 | return out + x 192 | 193 | 194 | class FeatureFusionBlock(nn.Module): 195 | """Feature fusion block. 196 | """ 197 | 198 | def __init__(self, features): 199 | """Init. 200 | 201 | Args: 202 | features (int): number of features 203 | """ 204 | super(FeatureFusionBlock, self).__init__() 205 | 206 | self.resConfUnit1 = ResidualConvUnit(features) 207 | self.resConfUnit2 = ResidualConvUnit(features) 208 | 209 | def forward(self, *xs): 210 | """Forward pass. 211 | 212 | Returns: 213 | tensor: output 214 | """ 215 | output = xs[0] 216 | 217 | if len(xs) == 2: 218 | output += self.resConfUnit1(xs[1]) 219 | 220 | output = self.resConfUnit2(output) 221 | 222 | output = nn.functional.interpolate( 223 | output, scale_factor=2, mode="bilinear", align_corners=True 224 | ) 225 | 226 | return output 227 | 228 | 229 | 230 | 231 | class ResidualConvUnit_custom(nn.Module): 232 | """Residual convolution module. 233 | """ 234 | 235 | def __init__(self, features, activation, bn): 236 | """Init. 
237 | 238 | Args: 239 | features (int): number of features 240 | """ 241 | super().__init__() 242 | 243 | self.bn = bn 244 | 245 | self.groups=1 246 | 247 | self.conv1 = nn.Conv2d( 248 | features, features, kernel_size=3, stride=1, padding=1, bias=True, groups=self.groups 249 | ) 250 | 251 | self.conv2 = nn.Conv2d( 252 | features, features, kernel_size=3, stride=1, padding=1, bias=True, groups=self.groups 253 | ) 254 | 255 | if self.bn==True: 256 | self.bn1 = nn.BatchNorm2d(features) 257 | self.bn2 = nn.BatchNorm2d(features) 258 | 259 | self.activation = activation 260 | 261 | self.skip_add = nn.quantized.FloatFunctional() 262 | 263 | def forward(self, x): 264 | """Forward pass. 265 | 266 | Args: 267 | x (tensor): input 268 | 269 | Returns: 270 | tensor: output 271 | """ 272 | 273 | out = self.activation(x) 274 | out = self.conv1(out) 275 | if self.bn==True: 276 | out = self.bn1(out) 277 | 278 | out = self.activation(out) 279 | out = self.conv2(out) 280 | if self.bn==True: 281 | out = self.bn2(out) 282 | 283 | if self.groups > 1: 284 | out = self.conv_merge(out) 285 | 286 | return self.skip_add.add(out, x) 287 | 288 | # return out + x 289 | 290 | 291 | class FeatureFusionBlock_custom(nn.Module): 292 | """Feature fusion block. 293 | """ 294 | 295 | def __init__(self, features, activation, deconv=False, bn=False, expand=False, align_corners=True): 296 | """Init. 297 | 298 | Args: 299 | features (int): number of features 300 | """ 301 | super(FeatureFusionBlock_custom, self).__init__() 302 | 303 | self.deconv = deconv 304 | self.align_corners = align_corners 305 | 306 | self.groups=1 307 | 308 | self.expand = expand 309 | out_features = features 310 | if self.expand==True: 311 | out_features = features//2 312 | 313 | self.out_conv = nn.Conv2d(features, out_features, kernel_size=1, stride=1, padding=0, bias=True, groups=1) 314 | 315 | self.resConfUnit1 = ResidualConvUnit_custom(features, activation, bn) 316 | self.resConfUnit2 = ResidualConvUnit_custom(features, activation, bn) 317 | 318 | self.skip_add = nn.quantized.FloatFunctional() 319 | 320 | def forward(self, *xs): 321 | """Forward pass. 
322 | 323 | Returns: 324 | tensor: output 325 | """ 326 | output = xs[0] 327 | 328 | if len(xs) == 2: 329 | res = self.resConfUnit1(xs[1]) 330 | output = self.skip_add.add(output, res) 331 | # output += res 332 | 333 | output = self.resConfUnit2(output) 334 | 335 | output = nn.functional.interpolate( 336 | output, scale_factor=2, mode="bilinear", align_corners=self.align_corners 337 | ) 338 | 339 | output = self.out_conv(output) 340 | 341 | return output 342 | 343 | -------------------------------------------------------------------------------- /midas/dpt_depth.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | 5 | from .base_model import BaseModel 6 | from .blocks import ( 7 | FeatureFusionBlock, 8 | FeatureFusionBlock_custom, 9 | Interpolate, 10 | _make_encoder, 11 | forward_vit, 12 | ) 13 | 14 | 15 | def _make_fusion_block(features, use_bn): 16 | return FeatureFusionBlock_custom( 17 | features, 18 | nn.ReLU(False), 19 | deconv=False, 20 | bn=use_bn, 21 | expand=False, 22 | align_corners=True, 23 | ) 24 | 25 | 26 | class DPT(BaseModel): 27 | def __init__( 28 | self, 29 | head, 30 | features=256, 31 | backbone="vitb_rn50_384", 32 | readout="project", 33 | channels_last=False, 34 | use_bn=False, 35 | ): 36 | 37 | super(DPT, self).__init__() 38 | 39 | self.channels_last = channels_last 40 | 41 | hooks = { 42 | "vitb_rn50_384": [0, 1, 8, 11], 43 | "vitb16_384": [2, 5, 8, 11], 44 | "vitl16_384": [5, 11, 17, 23], 45 | } 46 | 47 | # Instantiate backbone and reassemble blocks 48 | self.pretrained, self.scratch = _make_encoder( 49 | backbone, 50 | features, 51 | False, # Set to true of you want to train from scratch, uses ImageNet weights 52 | groups=1, 53 | expand=False, 54 | exportable=False, 55 | hooks=hooks[backbone], 56 | use_readout=readout, 57 | ) 58 | 59 | self.scratch.refinenet1 = _make_fusion_block(features, use_bn) 60 | self.scratch.refinenet2 = _make_fusion_block(features, use_bn) 61 | self.scratch.refinenet3 = _make_fusion_block(features, use_bn) 62 | self.scratch.refinenet4 = _make_fusion_block(features, use_bn) 63 | 64 | self.scratch.output_conv = head 65 | 66 | 67 | def forward(self, x): 68 | if self.channels_last == True: 69 | x.contiguous(memory_format=torch.channels_last) 70 | 71 | layer_1, layer_2, layer_3, layer_4 = forward_vit(self.pretrained, x) 72 | 73 | layer_1_rn = self.scratch.layer1_rn(layer_1) 74 | layer_2_rn = self.scratch.layer2_rn(layer_2) 75 | layer_3_rn = self.scratch.layer3_rn(layer_3) 76 | layer_4_rn = self.scratch.layer4_rn(layer_4) 77 | 78 | path_4 = self.scratch.refinenet4(layer_4_rn) 79 | path_3 = self.scratch.refinenet3(path_4, layer_3_rn) 80 | path_2 = self.scratch.refinenet2(path_3, layer_2_rn) 81 | path_1 = self.scratch.refinenet1(path_2, layer_1_rn) 82 | 83 | out = self.scratch.output_conv(path_1) 84 | 85 | return out 86 | 87 | 88 | class DPTDepthModel(DPT): 89 | def __init__(self, path=None, non_negative=True, **kwargs): 90 | features = kwargs["features"] if "features" in kwargs else 256 91 | 92 | head = nn.Sequential( 93 | nn.Conv2d(features, features // 2, kernel_size=3, stride=1, padding=1), 94 | Interpolate(scale_factor=2, mode="bilinear", align_corners=True), 95 | nn.Conv2d(features // 2, 32, kernel_size=3, stride=1, padding=1), 96 | nn.ReLU(True), 97 | nn.Conv2d(32, 1, kernel_size=1, stride=1, padding=0), 98 | nn.ReLU(True) if non_negative else nn.Identity(), 99 | nn.Identity(), 100 | ) 101 | 102 | super().__init__(head, 
**kwargs) 103 | 104 | if path is not None: 105 | self.load(path) 106 | 107 | def forward(self, x): 108 | return super().forward(x).squeeze(dim=1) 109 | 110 | -------------------------------------------------------------------------------- /midas/midas_net.py: -------------------------------------------------------------------------------- 1 | """MidashNet: Network for monocular depth estimation trained by mixing several datasets. 2 | This file contains code that is adapted from 3 | https://github.com/thomasjpfan/pytorch_refinenet/blob/master/pytorch_refinenet/refinenet/refinenet_4cascade.py 4 | """ 5 | import torch 6 | import torch.nn as nn 7 | 8 | from .base_model import BaseModel 9 | from .blocks import FeatureFusionBlock, Interpolate, _make_encoder 10 | 11 | 12 | class MidasNet(BaseModel): 13 | """Network for monocular depth estimation. 14 | """ 15 | 16 | def __init__(self, path=None, features=256, non_negative=True): 17 | """Init. 18 | 19 | Args: 20 | path (str, optional): Path to saved model. Defaults to None. 21 | features (int, optional): Number of features. Defaults to 256. 22 | backbone (str, optional): Backbone network for encoder. Defaults to resnet50 23 | """ 24 | print("Loading weights: ", path) 25 | 26 | super(MidasNet, self).__init__() 27 | 28 | use_pretrained = False if path is None else True 29 | 30 | self.pretrained, self.scratch = _make_encoder(backbone="resnext101_wsl", features=features, use_pretrained=use_pretrained) 31 | 32 | self.scratch.refinenet4 = FeatureFusionBlock(features) 33 | self.scratch.refinenet3 = FeatureFusionBlock(features) 34 | self.scratch.refinenet2 = FeatureFusionBlock(features) 35 | self.scratch.refinenet1 = FeatureFusionBlock(features) 36 | 37 | self.scratch.output_conv = nn.Sequential( 38 | nn.Conv2d(features, 128, kernel_size=3, stride=1, padding=1), 39 | Interpolate(scale_factor=2, mode="bilinear"), 40 | nn.Conv2d(128, 32, kernel_size=3, stride=1, padding=1), 41 | nn.ReLU(True), 42 | nn.Conv2d(32, 1, kernel_size=1, stride=1, padding=0), 43 | nn.ReLU(True) if non_negative else nn.Identity(), 44 | ) 45 | 46 | if path: 47 | self.load(path) 48 | 49 | def forward(self, x): 50 | """Forward pass. 51 | 52 | Args: 53 | x (tensor): input data (image) 54 | 55 | Returns: 56 | tensor: depth 57 | """ 58 | 59 | layer_1 = self.pretrained.layer1(x) 60 | layer_2 = self.pretrained.layer2(layer_1) 61 | layer_3 = self.pretrained.layer3(layer_2) 62 | layer_4 = self.pretrained.layer4(layer_3) 63 | 64 | layer_1_rn = self.scratch.layer1_rn(layer_1) 65 | layer_2_rn = self.scratch.layer2_rn(layer_2) 66 | layer_3_rn = self.scratch.layer3_rn(layer_3) 67 | layer_4_rn = self.scratch.layer4_rn(layer_4) 68 | 69 | path_4 = self.scratch.refinenet4(layer_4_rn) 70 | path_3 = self.scratch.refinenet3(path_4, layer_3_rn) 71 | path_2 = self.scratch.refinenet2(path_3, layer_2_rn) 72 | path_1 = self.scratch.refinenet1(path_2, layer_1_rn) 73 | 74 | out = self.scratch.output_conv(path_1) 75 | 76 | return torch.squeeze(out, dim=1) 77 | -------------------------------------------------------------------------------- /midas/midas_net_custom.py: -------------------------------------------------------------------------------- 1 | """MidashNet: Network for monocular depth estimation trained by mixing several datasets. 
2 | This file contains code that is adapted from 3 | https://github.com/thomasjpfan/pytorch_refinenet/blob/master/pytorch_refinenet/refinenet/refinenet_4cascade.py 4 | """ 5 | import torch 6 | import torch.nn as nn 7 | 8 | from .base_model import BaseModel 9 | from .blocks import FeatureFusionBlock, FeatureFusionBlock_custom, Interpolate, _make_encoder 10 | 11 | 12 | class MidasNet_small(BaseModel): 13 | """Network for monocular depth estimation. 14 | """ 15 | 16 | def __init__(self, path=None, features=64, backbone="efficientnet_lite3", non_negative=True, exportable=True, channels_last=False, align_corners=True, 17 | blocks={'expand': True}): 18 | """Init. 19 | 20 | Args: 21 | path (str, optional): Path to saved model. Defaults to None. 22 | features (int, optional): Number of features. Defaults to 256. 23 | backbone (str, optional): Backbone network for encoder. Defaults to resnet50 24 | """ 25 | print("Loading weights: ", path) 26 | 27 | super(MidasNet_small, self).__init__() 28 | 29 | use_pretrained = False if path else True 30 | 31 | self.channels_last = channels_last 32 | self.blocks = blocks 33 | self.backbone = backbone 34 | 35 | self.groups = 1 36 | 37 | features1=features 38 | features2=features 39 | features3=features 40 | features4=features 41 | self.expand = False 42 | if "expand" in self.blocks and self.blocks['expand'] == True: 43 | self.expand = True 44 | features1=features 45 | features2=features*2 46 | features3=features*4 47 | features4=features*8 48 | 49 | self.pretrained, self.scratch = _make_encoder(self.backbone, features, use_pretrained, groups=self.groups, expand=self.expand, exportable=exportable) 50 | 51 | self.scratch.activation = nn.ReLU(False) 52 | 53 | self.scratch.refinenet4 = FeatureFusionBlock_custom(features4, self.scratch.activation, deconv=False, bn=False, expand=self.expand, align_corners=align_corners) 54 | self.scratch.refinenet3 = FeatureFusionBlock_custom(features3, self.scratch.activation, deconv=False, bn=False, expand=self.expand, align_corners=align_corners) 55 | self.scratch.refinenet2 = FeatureFusionBlock_custom(features2, self.scratch.activation, deconv=False, bn=False, expand=self.expand, align_corners=align_corners) 56 | self.scratch.refinenet1 = FeatureFusionBlock_custom(features1, self.scratch.activation, deconv=False, bn=False, align_corners=align_corners) 57 | 58 | 59 | self.scratch.output_conv = nn.Sequential( 60 | nn.Conv2d(features, features//2, kernel_size=3, stride=1, padding=1, groups=self.groups), 61 | Interpolate(scale_factor=2, mode="bilinear"), 62 | nn.Conv2d(features//2, 32, kernel_size=3, stride=1, padding=1), 63 | self.scratch.activation, 64 | nn.Conv2d(32, 1, kernel_size=1, stride=1, padding=0), 65 | nn.ReLU(True) if non_negative else nn.Identity(), 66 | nn.Identity(), 67 | ) 68 | 69 | if path: 70 | self.load(path) 71 | 72 | 73 | def forward(self, x): 74 | """Forward pass. 
75 | 76 | Args: 77 | x (tensor): input data (image) 78 | 79 | Returns: 80 | tensor: depth 81 | """ 82 | if self.channels_last==True: 83 | print("self.channels_last = ", self.channels_last) 84 | x.contiguous(memory_format=torch.channels_last) 85 | 86 | 87 | layer_1 = self.pretrained.layer1(x) 88 | layer_2 = self.pretrained.layer2(layer_1) 89 | layer_3 = self.pretrained.layer3(layer_2) 90 | layer_4 = self.pretrained.layer4(layer_3) 91 | 92 | layer_1_rn = self.scratch.layer1_rn(layer_1) 93 | layer_2_rn = self.scratch.layer2_rn(layer_2) 94 | layer_3_rn = self.scratch.layer3_rn(layer_3) 95 | layer_4_rn = self.scratch.layer4_rn(layer_4) 96 | 97 | 98 | path_4 = self.scratch.refinenet4(layer_4_rn) 99 | path_3 = self.scratch.refinenet3(path_4, layer_3_rn) 100 | path_2 = self.scratch.refinenet2(path_3, layer_2_rn) 101 | path_1 = self.scratch.refinenet1(path_2, layer_1_rn) 102 | 103 | out = self.scratch.output_conv(path_1) 104 | 105 | return torch.squeeze(out, dim=1) 106 | 107 | 108 | 109 | def fuse_model(m): 110 | prev_previous_type = nn.Identity() 111 | prev_previous_name = '' 112 | previous_type = nn.Identity() 113 | previous_name = '' 114 | for name, module in m.named_modules(): 115 | if prev_previous_type == nn.Conv2d and previous_type == nn.BatchNorm2d and type(module) == nn.ReLU: 116 | # print("FUSED ", prev_previous_name, previous_name, name) 117 | torch.quantization.fuse_modules(m, [prev_previous_name, previous_name, name], inplace=True) 118 | elif prev_previous_type == nn.Conv2d and previous_type == nn.BatchNorm2d: 119 | # print("FUSED ", prev_previous_name, previous_name) 120 | torch.quantization.fuse_modules(m, [prev_previous_name, previous_name], inplace=True) 121 | # elif previous_type == nn.Conv2d and type(module) == nn.ReLU: 122 | # print("FUSED ", previous_name, name) 123 | # torch.quantization.fuse_modules(m, [previous_name, name], inplace=True) 124 | 125 | prev_previous_type = previous_type 126 | prev_previous_name = previous_name 127 | previous_type = type(module) 128 | previous_name = name -------------------------------------------------------------------------------- /midas/transforms.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cv2 3 | import math 4 | import random 5 | 6 | 7 | def apply_min_size(sample, size, image_interpolation_method=cv2.INTER_AREA): 8 | """Rezise the sample to ensure the given size. Keeps aspect ratio. 
9 | 10 | Args: 11 | sample (dict): sample 12 | size (tuple): image size 13 | 14 | Returns: 15 | tuple: new size 16 | """ 17 | shape = list(sample["disparity"].shape) 18 | 19 | if shape[0] >= size[0] and shape[1] >= size[1]: 20 | return sample 21 | 22 | scale = [0, 0] 23 | scale[0] = size[0] / shape[0] 24 | scale[1] = size[1] / shape[1] 25 | 26 | scale = max(scale) 27 | 28 | shape[0] = math.ceil(scale * shape[0]) 29 | shape[1] = math.ceil(scale * shape[1]) 30 | 31 | # resize 32 | sample["image"] = cv2.resize( 33 | sample["image"], tuple(shape[::-1]), interpolation=image_interpolation_method 34 | ) 35 | 36 | sample["disparity"] = cv2.resize( 37 | sample["disparity"], tuple(shape[::-1]), interpolation=cv2.INTER_NEAREST 38 | ) 39 | sample["mask"] = cv2.resize( 40 | sample["mask"].astype(np.float32), 41 | tuple(shape[::-1]), 42 | interpolation=cv2.INTER_NEAREST, 43 | ) 44 | sample["mask"] = sample["mask"].astype(bool) 45 | 46 | return tuple(shape) 47 | 48 | class ResizeTrain(object): 49 | """Resize sample to given size (width, height).""" 50 | def __init__( 51 | self, 52 | width, 53 | height, 54 | resize_target=True, 55 | keep_aspect_ratio=False, 56 | ensure_multiple_of=1, 57 | resize_method="lower_bound", 58 | image_interpolation_method=cv2.INTER_AREA, 59 | ): 60 | """Init. 61 | 62 | Args: 63 | width (int): desired output width 64 | height (int): desired output height 65 | resize_target (bool, optional): 66 | True: Resize the full sample (image, mask, target). 67 | False: Resize image only. 68 | Defaults to True. 69 | keep_aspect_ratio (bool, optional): 70 | True: Keep the aspect ratio of the input sample. 71 | Output sample might not have the given width and height, and 72 | resize behaviour depends on the parameter 'resize_method'. 73 | Defaults to False. 74 | ensure_multiple_of (int, optional): 75 | Output width and height is constrained to be multiple of this parameter. 76 | Defaults to 1. 77 | resize_method (str, optional): 78 | "lower_bound": Output will be at least as large as the given size. 79 | "upper_bound": Output will be at max as large as the given size. (Output size might be smaller than given size.) 80 | "minimal": Scale as least as possible. (Output size might be smaller than given size.) 81 | Defaults to "lower_bound". 
82 | """ 83 | self.__width = width 84 | self.__height = height 85 | 86 | self.__resize_target = resize_target 87 | self.__keep_aspect_ratio = keep_aspect_ratio 88 | self.__multiple_of = ensure_multiple_of 89 | self.__resize_method = resize_method 90 | self.__image_interpolation_method = image_interpolation_method 91 | 92 | def constrain_to_multiple_of(self, x, min_val=0, max_val=None): 93 | y = (np.round(x / self.__multiple_of) * self.__multiple_of).astype(int) 94 | 95 | if max_val is not None and y > max_val: 96 | y = (np.floor(x / self.__multiple_of) * self.__multiple_of).astype(int) 97 | 98 | if y < min_val: 99 | y = (np.ceil(x / self.__multiple_of) * self.__multiple_of).astype(int) 100 | 101 | return y 102 | 103 | def get_size(self, width, height): 104 | # determine new height and width 105 | scale_height = self.__height / height 106 | scale_width = self.__width / width 107 | 108 | if self.__keep_aspect_ratio: 109 | if self.__resize_method == "lower_bound": 110 | # scale such that output size is lower bound 111 | if scale_width > scale_height: 112 | # fit width 113 | scale_height = scale_width 114 | else: 115 | # fit height 116 | scale_width = scale_height 117 | elif self.__resize_method == "upper_bound": 118 | # scale such that output size is upper bound 119 | if scale_width < scale_height: 120 | # fit width 121 | scale_height = scale_width 122 | else: 123 | # fit height 124 | scale_width = scale_height 125 | elif self.__resize_method == "minimal": 126 | # scale as least as possbile 127 | if abs(1 - scale_width) < abs(1 - scale_height): 128 | # fit width 129 | scale_height = scale_width 130 | else: 131 | # fit height 132 | scale_width = scale_height 133 | else: 134 | raise ValueError( 135 | f"resize_method {self.__resize_method} not implemented" 136 | ) 137 | 138 | if self.__resize_method == "lower_bound": 139 | new_height = self.constrain_to_multiple_of( 140 | scale_height * height, min_val=self.__height 141 | ) 142 | new_width = self.constrain_to_multiple_of( 143 | scale_width * width, min_val=self.__width 144 | ) 145 | elif self.__resize_method == "upper_bound": 146 | new_height = self.constrain_to_multiple_of( 147 | scale_height * height, max_val=self.__height 148 | ) 149 | new_width = self.constrain_to_multiple_of( 150 | scale_width * width, max_val=self.__width 151 | ) 152 | elif self.__resize_method == "minimal": 153 | new_height = self.constrain_to_multiple_of(scale_height * height) 154 | new_width = self.constrain_to_multiple_of(scale_width * width) 155 | else: 156 | raise ValueError(f"resize_method {self.__resize_method} not implemented") 157 | 158 | return (new_width, new_height) 159 | 160 | def __call__(self, sample): 161 | width, height = self.get_size( 162 | sample["image"].shape[1], sample["image"].shape[0] 163 | ) 164 | 165 | # resize sample 166 | sample["image"] = cv2.resize( 167 | sample["image"], 168 | (width, height), 169 | interpolation=self.__image_interpolation_method, 170 | ) 171 | sample["image"] = np.clip(sample["image"], 0, 1) 172 | 173 | if self.__resize_target: 174 | if "disparity" in sample: 175 | sample["disparity"] = cv2.resize( 176 | sample["disparity"], 177 | (width, height), 178 | interpolation=cv2.INTER_NEAREST, 179 | ) 180 | 181 | if "depth" in sample: 182 | sample["depth"] = cv2.resize( 183 | sample["depth"], (width, height), interpolation=cv2.INTER_NEAREST 184 | ) 185 | 186 | if "mask" in sample: 187 | sample["mask"] = cv2.resize( 188 | sample["mask"].astype(np.float32), 189 | (width, height), 190 | interpolation=cv2.INTER_NEAREST, 191 | ) 192 | 
sample["mask"] = sample["mask"].astype(bool) 193 | return sample 194 | 195 | 196 | 197 | class Resize(object): 198 | """Resize sample to given size (width, height). 199 | """ 200 | 201 | def __init__( 202 | self, 203 | width, 204 | height, 205 | resize_target=True, 206 | keep_aspect_ratio=False, 207 | ensure_multiple_of=1, 208 | resize_method="lower_bound", 209 | image_interpolation_method=cv2.INTER_AREA, 210 | ): 211 | """Init. 212 | 213 | Args: 214 | width (int): desired output width 215 | height (int): desired output height 216 | resize_target (bool, optional): 217 | True: Resize the full sample (image, mask, target). 218 | False: Resize image only. 219 | Defaults to True. 220 | keep_aspect_ratio (bool, optional): 221 | True: Keep the aspect ratio of the input sample. 222 | Output sample might not have the given width and height, and 223 | resize behaviour depends on the parameter 'resize_method'. 224 | Defaults to False. 225 | ensure_multiple_of (int, optional): 226 | Output width and height is constrained to be multiple of this parameter. 227 | Defaults to 1. 228 | resize_method (str, optional): 229 | "lower_bound": Output will be at least as large as the given size. 230 | "upper_bound": Output will be at max as large as the given size. (Output size might be smaller than given size.) 231 | "minimal": Scale as least as possible. (Output size might be smaller than given size.) 232 | Defaults to "lower_bound". 233 | """ 234 | self.__width = width 235 | self.__height = height 236 | 237 | self.__resize_target = resize_target 238 | self.__keep_aspect_ratio = keep_aspect_ratio 239 | self.__multiple_of = ensure_multiple_of 240 | self.__resize_method = resize_method 241 | self.__image_interpolation_method = image_interpolation_method 242 | 243 | def constrain_to_multiple_of(self, x, min_val=0, max_val=None): 244 | y = (np.round(x / self.__multiple_of) * self.__multiple_of).astype(int) 245 | 246 | if max_val is not None and y > max_val: 247 | y = (np.floor(x / self.__multiple_of) * self.__multiple_of).astype(int) 248 | 249 | if y < min_val: 250 | y = (np.ceil(x / self.__multiple_of) * self.__multiple_of).astype(int) 251 | 252 | return y 253 | 254 | def get_size(self, width, height): 255 | # determine new height and width 256 | scale_height = self.__height / height 257 | scale_width = self.__width / width 258 | 259 | if self.__keep_aspect_ratio: 260 | if self.__resize_method == "lower_bound": 261 | # scale such that output size is lower bound 262 | if scale_width > scale_height: 263 | # fit width 264 | scale_height = scale_width 265 | else: 266 | # fit height 267 | scale_width = scale_height 268 | elif self.__resize_method == "upper_bound": 269 | # scale such that output size is upper bound 270 | if scale_width < scale_height: 271 | # fit width 272 | scale_height = scale_width 273 | else: 274 | # fit height 275 | scale_width = scale_height 276 | elif self.__resize_method == "minimal": 277 | # scale as least as possbile 278 | if abs(1 - scale_width) < abs(1 - scale_height): 279 | # fit width 280 | scale_height = scale_width 281 | else: 282 | # fit height 283 | scale_width = scale_height 284 | else: 285 | raise ValueError( 286 | f"resize_method {self.__resize_method} not implemented" 287 | ) 288 | 289 | if self.__resize_method == "lower_bound": 290 | new_height = self.constrain_to_multiple_of( 291 | scale_height * height, min_val=self.__height 292 | ) 293 | new_width = self.constrain_to_multiple_of( 294 | scale_width * width, min_val=self.__width 295 | ) 296 | elif self.__resize_method == 
"upper_bound": 297 | new_height = self.constrain_to_multiple_of( 298 | scale_height * height, max_val=self.__height 299 | ) 300 | new_width = self.constrain_to_multiple_of( 301 | scale_width * width, max_val=self.__width 302 | ) 303 | elif self.__resize_method == "minimal": 304 | new_height = self.constrain_to_multiple_of(scale_height * height) 305 | new_width = self.constrain_to_multiple_of(scale_width * width) 306 | else: 307 | raise ValueError(f"resize_method {self.__resize_method} not implemented") 308 | 309 | return (new_width, new_height) 310 | 311 | def __call__(self, sample): 312 | width, height = self.get_size( 313 | sample["image"].shape[1], sample["image"].shape[0] 314 | ) 315 | 316 | # resize sample 317 | sample["image"] = cv2.resize( 318 | sample["image"], 319 | (width, height), 320 | interpolation=self.__image_interpolation_method, 321 | ) 322 | sample["image"] = np.clip(sample["image"], 0, 1) 323 | 324 | if self.__resize_target: 325 | if "disparity" in sample: 326 | sample["disparity"] = cv2.resize( 327 | sample["disparity"], 328 | (width, height), 329 | interpolation=cv2.INTER_NEAREST, 330 | ) 331 | 332 | if "depth" in sample: 333 | sample["depth"] = cv2.resize( 334 | sample["depth"], (width, height), interpolation=cv2.INTER_NEAREST 335 | ) 336 | 337 | if "mask" in sample: 338 | sample["mask"] = cv2.resize( 339 | sample["mask"].astype(np.float32), 340 | (width, height), 341 | interpolation=cv2.INTER_NEAREST, 342 | ) 343 | sample["mask"] = sample["mask"].astype(bool) 344 | return sample 345 | 346 | 347 | class NormalizeImage(object): 348 | """Normlize image by given mean and std. 349 | """ 350 | 351 | def __init__(self, mean, std): 352 | self.__mean = mean 353 | self.__std = std 354 | 355 | def __call__(self, sample): 356 | sample["image"] = (sample["image"] - self.__mean) / self.__std 357 | 358 | return sample 359 | 360 | 361 | class PrepareForNet(object): 362 | """Prepare sample for usage as network input. 
363 | """ 364 | 365 | def __init__(self): 366 | pass 367 | 368 | def __call__(self, sample): 369 | image = np.transpose(sample["image"], (2, 0, 1)) 370 | sample["image"] = np.ascontiguousarray(image).astype(np.float32) 371 | 372 | if "mask" in sample: 373 | sample["mask"] = sample["mask"].astype(np.float32) 374 | sample["mask"] = np.ascontiguousarray(sample["mask"]) 375 | 376 | if "disparity" in sample: 377 | disparity = sample["disparity"].astype(np.float32) 378 | sample["disparity"] = np.ascontiguousarray(disparity) 379 | 380 | if "depth" in sample: 381 | depth = sample["depth"].astype(np.float32) 382 | sample["depth"] = np.ascontiguousarray(depth) 383 | 384 | return sample 385 | 386 | class RandomCrop(object): 387 | def __init__(self, width, height): 388 | self.__width = width 389 | self.__height = height 390 | 391 | def __call__(self, sample): 392 | h, w = sample["image"].shape[:2] 393 | x = random.randint(0, w - self.__width) 394 | y = random.randint(0, h - self.__height) 395 | 396 | sample["image"] = sample["image"][y : y + self.__height, x : x + self.__width, :] 397 | 398 | if "mask" in sample: 399 | sample["mask"] = sample["mask"][y : y + self.__height, x : x + self.__width] 400 | 401 | if "disparity" in sample: 402 | sample["disparity"] = sample["disparity"][y : y + self.__height, x : x + self.__width] 403 | 404 | if "depth" in sample: 405 | sample["depth"] = sample["depth"][y : y + self.__height, x : x + self.__width] 406 | 407 | return sample 408 | 409 | class MirrorSquarePad(object): 410 | def __call__(self, sample): 411 | h, w = sample["image"].shape[:2] 412 | 413 | if h > w: 414 | new_h = h 415 | new_w = h 416 | else: 417 | new_h = w 418 | new_w = w 419 | 420 | sample["image"] = cv2.copyMakeBorder(sample["image"], 421 | (new_h-h)//2, 422 | (new_h-h) - (new_h-h)//2, 423 | (new_w-w)//2, 424 | (new_w-w) - (new_w-w)//2, 425 | cv2.BORDER_REFLECT_101) 426 | 427 | if "mask" in sample: 428 | sample["mask"] = cv2.copyMakeBorder(sample["mask"], 429 | (new_h-h)//2, 430 | (new_h-h) - (new_h-h)//2, 431 | (new_w-w)//2, 432 | (new_w-w) - (new_w-w)//2, 433 | cv2.BORDER_REFLECT_101) 434 | 435 | if "disparity" in sample: 436 | sample["disparity"] = cv2.copyMakeBorder(sample["disparity"], 437 | (new_h-h)//2, 438 | (new_h-h) - (new_h-h)//2, 439 | (new_w-w)//2, 440 | (new_w-w) - (new_w-w)//2, 441 | cv2.BORDER_REFLECT_101) 442 | 443 | if "depth" in sample: 444 | 445 | sample["depth"] = cv2.copyMakeBorder(sample["depth"], 446 | (new_h-h)//2, 447 | (new_h-h) - (new_h-h)//2, 448 | (new_w-w)//2, 449 | (new_w-w) - (new_w-w)//2, 450 | cv2.BORDER_REFLECT_101) 451 | return sample 452 | 453 | 454 | class RandomHorizontalFlip(object): 455 | def __init__(self, prob): 456 | self.__prob = prob 457 | 458 | def __call__(self, sample): 459 | cond = np.random.uniform(0, 1, 1) 460 | if cond > self.__prob: 461 | # NOTE: to solve negative slice problem, we create a copy 462 | sample["image"] = np.fliplr(sample["image"] ) 463 | sample["image"] = np.copy(sample["image"]) 464 | 465 | if "mask" in sample: 466 | sample["mask"] = np.fliplr(sample["mask"] ) 467 | sample["mask"] = np.copy(sample["mask"]) 468 | 469 | if "disparity" in sample: 470 | sample["disparity"] = np.fliplr(sample["disparity"] ) 471 | sample["disparity"] = np.copy(sample["disparity"]) 472 | 473 | if "depth" in sample: 474 | sample["depth"] = np.fliplr(sample["depth"] ) 475 | sample["depth"] = np.copy(sample["depth"]) 476 | 477 | return sample 478 | 479 | class ColorAug(object): 480 | def __init__(self, 481 | gamma_low=0.8, 482 | gamma_high=1.2, 483 | 
brightness_low=0.5, 484 | brightness_high=1.2, 485 | color_low=0.8, 486 | color_high=1.2, 487 | prob=0.5, 488 | ): 489 | self.__gamma_low = gamma_low 490 | self.__gamma_high = gamma_high 491 | self.__brightness_low = brightness_low 492 | self.__brightness_high = brightness_high 493 | self.__color_low = color_low 494 | self.__color_high = color_high 495 | self.__prob = prob 496 | 497 | def __call__(self, sample): 498 | sample["image"] = np.clip(sample["image"], 0, 1) 499 | if np.random.uniform(0, 1, 1) < self.__prob: 500 | # randomly shift gamma 501 | random_gamma = np.random.uniform(self.__gamma_low, self.__gamma_high) 502 | sample["image"] = sample["image"] ** random_gamma 503 | 504 | # randomly shift brightness 505 | random_brightness = np.random.uniform(self.__brightness_low, self.__brightness_high) 506 | sample["image"] = sample["image"] * random_brightness 507 | 508 | if sample["image"].shape[2] == 3: 509 | # randomly shift color 510 | random_colors = np.random.uniform(self.__color_low, self.__color_high, 3) 511 | sample["image"] *= random_colors 512 | 513 | # saturate 514 | sample["image"] = np.clip(sample["image"], 0, 1) 515 | return sample -------------------------------------------------------------------------------- /midas/vit.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import timm 4 | import types 5 | import math 6 | import torch.nn.functional as F 7 | 8 | 9 | class Slice(nn.Module): 10 | def __init__(self, start_index=1): 11 | super(Slice, self).__init__() 12 | self.start_index = start_index 13 | 14 | def forward(self, x): 15 | return x[:, self.start_index :] 16 | 17 | 18 | class AddReadout(nn.Module): 19 | def __init__(self, start_index=1): 20 | super(AddReadout, self).__init__() 21 | self.start_index = start_index 22 | 23 | def forward(self, x): 24 | if self.start_index == 2: 25 | readout = (x[:, 0] + x[:, 1]) / 2 26 | else: 27 | readout = x[:, 0] 28 | return x[:, self.start_index :] + readout.unsqueeze(1) 29 | 30 | 31 | class ProjectReadout(nn.Module): 32 | def __init__(self, in_features, start_index=1): 33 | super(ProjectReadout, self).__init__() 34 | self.start_index = start_index 35 | 36 | self.project = nn.Sequential(nn.Linear(2 * in_features, in_features), nn.GELU()) 37 | 38 | def forward(self, x): 39 | readout = x[:, 0].unsqueeze(1).expand_as(x[:, self.start_index :]) 40 | features = torch.cat((x[:, self.start_index :], readout), -1) 41 | 42 | return self.project(features) 43 | 44 | 45 | class Transpose(nn.Module): 46 | def __init__(self, dim0, dim1): 47 | super(Transpose, self).__init__() 48 | self.dim0 = dim0 49 | self.dim1 = dim1 50 | 51 | def forward(self, x): 52 | x = x.transpose(self.dim0, self.dim1) 53 | return x 54 | 55 | 56 | def forward_vit(pretrained, x): 57 | b, c, h, w = x.shape 58 | 59 | glob = pretrained.model.forward_flex(x) 60 | 61 | layer_1 = pretrained.activations["1"] 62 | layer_2 = pretrained.activations["2"] 63 | layer_3 = pretrained.activations["3"] 64 | layer_4 = pretrained.activations["4"] 65 | 66 | layer_1 = pretrained.act_postprocess1[0:2](layer_1) 67 | layer_2 = pretrained.act_postprocess2[0:2](layer_2) 68 | layer_3 = pretrained.act_postprocess3[0:2](layer_3) 69 | layer_4 = pretrained.act_postprocess4[0:2](layer_4) 70 | 71 | unflatten = nn.Sequential( 72 | nn.Unflatten( 73 | 2, 74 | torch.Size( 75 | [ 76 | h // pretrained.model.patch_size[1], 77 | w // pretrained.model.patch_size[0], 78 | ] 79 | ), 80 | ) 81 | ) 82 | 83 | if layer_1.ndim == 3: 84 | 
layer_1 = unflatten(layer_1) 85 | if layer_2.ndim == 3: 86 | layer_2 = unflatten(layer_2) 87 | if layer_3.ndim == 3: 88 | layer_3 = unflatten(layer_3) 89 | if layer_4.ndim == 3: 90 | layer_4 = unflatten(layer_4) 91 | 92 | layer_1 = pretrained.act_postprocess1[3 : len(pretrained.act_postprocess1)](layer_1) 93 | layer_2 = pretrained.act_postprocess2[3 : len(pretrained.act_postprocess2)](layer_2) 94 | layer_3 = pretrained.act_postprocess3[3 : len(pretrained.act_postprocess3)](layer_3) 95 | layer_4 = pretrained.act_postprocess4[3 : len(pretrained.act_postprocess4)](layer_4) 96 | 97 | return layer_1, layer_2, layer_3, layer_4 98 | 99 | 100 | def _resize_pos_embed(self, posemb, gs_h, gs_w): 101 | posemb_tok, posemb_grid = ( 102 | posemb[:, : self.start_index], 103 | posemb[0, self.start_index :], 104 | ) 105 | 106 | gs_old = int(math.sqrt(len(posemb_grid))) 107 | 108 | posemb_grid = posemb_grid.reshape(1, gs_old, gs_old, -1).permute(0, 3, 1, 2) 109 | posemb_grid = F.interpolate(posemb_grid, size=(gs_h, gs_w), mode="bilinear") 110 | posemb_grid = posemb_grid.permute(0, 2, 3, 1).reshape(1, gs_h * gs_w, -1) 111 | 112 | posemb = torch.cat([posemb_tok, posemb_grid], dim=1) 113 | 114 | return posemb 115 | 116 | 117 | def forward_flex(self, x): 118 | b, c, h, w = x.shape 119 | 120 | pos_embed = self._resize_pos_embed( 121 | self.pos_embed, h // self.patch_size[1], w // self.patch_size[0] 122 | ) 123 | 124 | B = x.shape[0] 125 | 126 | if hasattr(self.patch_embed, "backbone"): 127 | x = self.patch_embed.backbone(x) 128 | if isinstance(x, (list, tuple)): 129 | x = x[-1] # last feature if backbone outputs list/tuple of features 130 | 131 | x = self.patch_embed.proj(x).flatten(2).transpose(1, 2) 132 | 133 | if getattr(self, "dist_token", None) is not None: 134 | cls_tokens = self.cls_token.expand( 135 | B, -1, -1 136 | ) # stole cls_tokens impl from Phil Wang, thanks 137 | dist_token = self.dist_token.expand(B, -1, -1) 138 | x = torch.cat((cls_tokens, dist_token, x), dim=1) 139 | else: 140 | cls_tokens = self.cls_token.expand( 141 | B, -1, -1 142 | ) # stole cls_tokens impl from Phil Wang, thanks 143 | x = torch.cat((cls_tokens, x), dim=1) 144 | 145 | x = x + pos_embed 146 | x = self.pos_drop(x) 147 | 148 | for blk in self.blocks: 149 | x = blk(x) 150 | 151 | x = self.norm(x) 152 | 153 | return x 154 | 155 | 156 | activations = {} 157 | 158 | 159 | def get_activation(name): 160 | def hook(model, input, output): 161 | activations[name] = output 162 | 163 | return hook 164 | 165 | 166 | def get_readout_oper(vit_features, features, use_readout, start_index=1): 167 | if use_readout == "ignore": 168 | readout_oper = [Slice(start_index)] * len(features) 169 | elif use_readout == "add": 170 | readout_oper = [AddReadout(start_index)] * len(features) 171 | elif use_readout == "project": 172 | readout_oper = [ 173 | ProjectReadout(vit_features, start_index) for out_feat in features 174 | ] 175 | else: 176 | assert ( 177 | False 178 | ), "wrong operation for readout token, use_readout can be 'ignore', 'add', or 'project'" 179 | 180 | return readout_oper 181 | 182 | 183 | def _make_vit_b16_backbone( 184 | model, 185 | features=[96, 192, 384, 768], 186 | size=[384, 384], 187 | hooks=[2, 5, 8, 11], 188 | vit_features=768, 189 | use_readout="ignore", 190 | start_index=1, 191 | ): 192 | pretrained = nn.Module() 193 | 194 | pretrained.model = model 195 | pretrained.model.blocks[hooks[0]].register_forward_hook(get_activation("1")) 196 | pretrained.model.blocks[hooks[1]].register_forward_hook(get_activation("2")) 197 | 
pretrained.model.blocks[hooks[2]].register_forward_hook(get_activation("3")) 198 | pretrained.model.blocks[hooks[3]].register_forward_hook(get_activation("4")) 199 | 200 | pretrained.activations = activations 201 | 202 | readout_oper = get_readout_oper(vit_features, features, use_readout, start_index) 203 | 204 | # 32, 48, 136, 384 205 | pretrained.act_postprocess1 = nn.Sequential( 206 | readout_oper[0], 207 | Transpose(1, 2), 208 | nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])), 209 | nn.Conv2d( 210 | in_channels=vit_features, 211 | out_channels=features[0], 212 | kernel_size=1, 213 | stride=1, 214 | padding=0, 215 | ), 216 | nn.ConvTranspose2d( 217 | in_channels=features[0], 218 | out_channels=features[0], 219 | kernel_size=4, 220 | stride=4, 221 | padding=0, 222 | bias=True, 223 | dilation=1, 224 | groups=1, 225 | ), 226 | ) 227 | 228 | pretrained.act_postprocess2 = nn.Sequential( 229 | readout_oper[1], 230 | Transpose(1, 2), 231 | nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])), 232 | nn.Conv2d( 233 | in_channels=vit_features, 234 | out_channels=features[1], 235 | kernel_size=1, 236 | stride=1, 237 | padding=0, 238 | ), 239 | nn.ConvTranspose2d( 240 | in_channels=features[1], 241 | out_channels=features[1], 242 | kernel_size=2, 243 | stride=2, 244 | padding=0, 245 | bias=True, 246 | dilation=1, 247 | groups=1, 248 | ), 249 | ) 250 | 251 | pretrained.act_postprocess3 = nn.Sequential( 252 | readout_oper[2], 253 | Transpose(1, 2), 254 | nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])), 255 | nn.Conv2d( 256 | in_channels=vit_features, 257 | out_channels=features[2], 258 | kernel_size=1, 259 | stride=1, 260 | padding=0, 261 | ), 262 | ) 263 | 264 | pretrained.act_postprocess4 = nn.Sequential( 265 | readout_oper[3], 266 | Transpose(1, 2), 267 | nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])), 268 | nn.Conv2d( 269 | in_channels=vit_features, 270 | out_channels=features[3], 271 | kernel_size=1, 272 | stride=1, 273 | padding=0, 274 | ), 275 | nn.Conv2d( 276 | in_channels=features[3], 277 | out_channels=features[3], 278 | kernel_size=3, 279 | stride=2, 280 | padding=1, 281 | ), 282 | ) 283 | 284 | pretrained.model.start_index = start_index 285 | pretrained.model.patch_size = [16, 16] 286 | 287 | # We inject this function into the VisionTransformer instances so that 288 | # we can use it with interpolated position embeddings without modifying the library source. 
289 | pretrained.model.forward_flex = types.MethodType(forward_flex, pretrained.model) 290 | pretrained.model._resize_pos_embed = types.MethodType( 291 | _resize_pos_embed, pretrained.model 292 | ) 293 | 294 | return pretrained 295 | 296 | 297 | def _make_pretrained_vitl16_384(pretrained, use_readout="ignore", hooks=None): 298 | model = timm.create_model("vit_large_patch16_384", pretrained=pretrained) 299 | 300 | hooks = [5, 11, 17, 23] if hooks == None else hooks 301 | return _make_vit_b16_backbone( 302 | model, 303 | features=[256, 512, 1024, 1024], 304 | hooks=hooks, 305 | vit_features=1024, 306 | use_readout=use_readout, 307 | ) 308 | 309 | 310 | def _make_pretrained_vitb16_384(pretrained, use_readout="ignore", hooks=None): 311 | model = timm.create_model("vit_base_patch16_384", pretrained=pretrained) 312 | 313 | hooks = [2, 5, 8, 11] if hooks == None else hooks 314 | return _make_vit_b16_backbone( 315 | model, features=[96, 192, 384, 768], hooks=hooks, use_readout=use_readout 316 | ) 317 | 318 | 319 | def _make_pretrained_deitb16_384(pretrained, use_readout="ignore", hooks=None): 320 | model = timm.create_model("vit_deit_base_patch16_384", pretrained=pretrained) 321 | 322 | hooks = [2, 5, 8, 11] if hooks == None else hooks 323 | return _make_vit_b16_backbone( 324 | model, features=[96, 192, 384, 768], hooks=hooks, use_readout=use_readout 325 | ) 326 | 327 | 328 | def _make_pretrained_deitb16_distil_384(pretrained, use_readout="ignore", hooks=None): 329 | model = timm.create_model( 330 | "vit_deit_base_distilled_patch16_384", pretrained=pretrained 331 | ) 332 | 333 | hooks = [2, 5, 8, 11] if hooks == None else hooks 334 | return _make_vit_b16_backbone( 335 | model, 336 | features=[96, 192, 384, 768], 337 | hooks=hooks, 338 | use_readout=use_readout, 339 | start_index=2, 340 | ) 341 | 342 | 343 | def _make_vit_b_rn50_backbone( 344 | model, 345 | features=[256, 512, 768, 768], 346 | size=[384, 384], 347 | hooks=[0, 1, 8, 11], 348 | vit_features=768, 349 | use_vit_only=False, 350 | use_readout="ignore", 351 | start_index=1, 352 | ): 353 | pretrained = nn.Module() 354 | 355 | pretrained.model = model 356 | 357 | if use_vit_only == True: 358 | pretrained.model.blocks[hooks[0]].register_forward_hook(get_activation("1")) 359 | pretrained.model.blocks[hooks[1]].register_forward_hook(get_activation("2")) 360 | else: 361 | pretrained.model.patch_embed.backbone.stages[0].register_forward_hook( 362 | get_activation("1") 363 | ) 364 | pretrained.model.patch_embed.backbone.stages[1].register_forward_hook( 365 | get_activation("2") 366 | ) 367 | 368 | pretrained.model.blocks[hooks[2]].register_forward_hook(get_activation("3")) 369 | pretrained.model.blocks[hooks[3]].register_forward_hook(get_activation("4")) 370 | 371 | pretrained.activations = activations 372 | 373 | readout_oper = get_readout_oper(vit_features, features, use_readout, start_index) 374 | 375 | if use_vit_only == True: 376 | pretrained.act_postprocess1 = nn.Sequential( 377 | readout_oper[0], 378 | Transpose(1, 2), 379 | nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])), 380 | nn.Conv2d( 381 | in_channels=vit_features, 382 | out_channels=features[0], 383 | kernel_size=1, 384 | stride=1, 385 | padding=0, 386 | ), 387 | nn.ConvTranspose2d( 388 | in_channels=features[0], 389 | out_channels=features[0], 390 | kernel_size=4, 391 | stride=4, 392 | padding=0, 393 | bias=True, 394 | dilation=1, 395 | groups=1, 396 | ), 397 | ) 398 | 399 | pretrained.act_postprocess2 = nn.Sequential( 400 | readout_oper[1], 401 | Transpose(1, 2), 402 | 
nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])), 403 | nn.Conv2d( 404 | in_channels=vit_features, 405 | out_channels=features[1], 406 | kernel_size=1, 407 | stride=1, 408 | padding=0, 409 | ), 410 | nn.ConvTranspose2d( 411 | in_channels=features[1], 412 | out_channels=features[1], 413 | kernel_size=2, 414 | stride=2, 415 | padding=0, 416 | bias=True, 417 | dilation=1, 418 | groups=1, 419 | ), 420 | ) 421 | else: 422 | pretrained.act_postprocess1 = nn.Sequential( 423 | nn.Identity(), nn.Identity(), nn.Identity() 424 | ) 425 | pretrained.act_postprocess2 = nn.Sequential( 426 | nn.Identity(), nn.Identity(), nn.Identity() 427 | ) 428 | 429 | pretrained.act_postprocess3 = nn.Sequential( 430 | readout_oper[2], 431 | Transpose(1, 2), 432 | nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])), 433 | nn.Conv2d( 434 | in_channels=vit_features, 435 | out_channels=features[2], 436 | kernel_size=1, 437 | stride=1, 438 | padding=0, 439 | ), 440 | ) 441 | 442 | pretrained.act_postprocess4 = nn.Sequential( 443 | readout_oper[3], 444 | Transpose(1, 2), 445 | nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])), 446 | nn.Conv2d( 447 | in_channels=vit_features, 448 | out_channels=features[3], 449 | kernel_size=1, 450 | stride=1, 451 | padding=0, 452 | ), 453 | nn.Conv2d( 454 | in_channels=features[3], 455 | out_channels=features[3], 456 | kernel_size=3, 457 | stride=2, 458 | padding=1, 459 | ), 460 | ) 461 | 462 | pretrained.model.start_index = start_index 463 | pretrained.model.patch_size = [16, 16] 464 | 465 | # We inject this function into the VisionTransformer instances so that 466 | # we can use it with interpolated position embeddings without modifying the library source. 467 | pretrained.model.forward_flex = types.MethodType(forward_flex, pretrained.model) 468 | 469 | # We inject this function into the VisionTransformer instances so that 470 | # we can use it with interpolated position embeddings without modifying the library source. 471 | pretrained.model._resize_pos_embed = types.MethodType( 472 | _resize_pos_embed, pretrained.model 473 | ) 474 | 475 | return pretrained 476 | 477 | 478 | def _make_pretrained_vitb_rn50_384( 479 | pretrained, use_readout="ignore", hooks=None, use_vit_only=False 480 | ): 481 | model = timm.create_model("vit_base_resnet50_384", pretrained=pretrained) 482 | 483 | hooks = [0, 1, 8, 11] if hooks == None else hooks 484 | return _make_vit_b_rn50_backbone( 485 | model, 486 | features=[256, 512, 768, 768], 487 | size=[384, 384], 488 | hooks=hooks, 489 | use_vit_only=use_vit_only, 490 | use_readout=use_readout, 491 | ) 492 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | matplotlib==3.6.3 2 | numpy==1.24.2 3 | opencv==4.7.0 4 | timm==0.6.12 5 | torch==1.13.1+cu116 6 | torchaudio==0.13.1+cu116 7 | torchvision==0.14.1+cu116 8 | wandb==0.13.10 -------------------------------------------------------------------------------- /run.py: -------------------------------------------------------------------------------- 1 | """Compute depth maps for images in the input folder. 
2 | """ 3 | import os 4 | import glob 5 | import torch 6 | import utils 7 | import cv2 8 | import argparse 9 | import numpy as np 10 | 11 | from torchvision.transforms import Compose 12 | from midas.dpt_depth import DPTDepthModel 13 | from midas.midas_net import MidasNet 14 | from midas.midas_net_custom import MidasNet_small 15 | from midas.transforms import Resize, ResizeTrain, NormalizeImage, PrepareForNet, RandomCrop, MirrorSquarePad, ColorAug, RandomHorizontalFlip 16 | 17 | from utils import parse_dataset_txt 18 | 19 | def run(input_path, output_path, dataset_txt, model_path, model_type="large", save_full=False, mask_path="", cls2mask=[], mean=False, it=5, output_list=False): 20 | """Run MonoDepthNN to compute depth maps. 21 | 22 | Args: 23 | input_path (str): path to input folder 24 | output_path (str): path to output folder 25 | model_path (str): path to saved model 26 | """ 27 | print("initialize") 28 | 29 | # select device 30 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu") 31 | print("device: %s" % device) 32 | 33 | # load network 34 | if model_type == "dpt_large": # DPT-Large 35 | model = DPTDepthModel( 36 | path=None, 37 | backbone="vitl16_384", 38 | non_negative=True, 39 | ) 40 | net_w, net_h = 384, 384 41 | normalization = NormalizeImage(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]) 42 | 43 | transform = Compose( 44 | [ 45 | Resize( 46 | net_w, 47 | net_h, 48 | resize_target=True, 49 | keep_aspect_ratio=True, 50 | ensure_multiple_of=32, 51 | resize_method="lower_bound", 52 | image_interpolation_method=cv2.INTER_CUBIC, 53 | ), 54 | normalization, 55 | PrepareForNet(), 56 | ] 57 | ) 58 | elif model_type == "midas_v21": 59 | model = MidasNet(None, non_negative=True) 60 | net_w, net_h = 384, 384 61 | normalization = NormalizeImage( 62 | mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225] 63 | ) 64 | # Mirror Square Pad and Resize 65 | transform = Compose( 66 | [ 67 | Resize( 68 | net_w, 69 | net_h, 70 | resize_target=True, 71 | keep_aspect_ratio=True, 72 | ensure_multiple_of=32, 73 | resize_method="upper_bound", 74 | image_interpolation_method=cv2.INTER_CUBIC, 75 | ), 76 | normalization, 77 | PrepareForNet(), 78 | ] 79 | ) 80 | else: 81 | print(f"model_type '{model_type}' not implemented, use: --model_type large") 82 | assert False 83 | 84 | checkpoint = torch.load(model_path) 85 | 86 | if 'model_state_dict' in checkpoint.keys(): 87 | model.load_state_dict(checkpoint['model_state_dict']) 88 | else: 89 | model.load_state_dict(checkpoint) 90 | 91 | model.eval() 92 | model.to(device) 93 | 94 | # get input 95 | dataset_dict = parse_dataset_txt(dataset_txt) 96 | num_images = len(dataset_dict["basenames"]) 97 | 98 | # create output folder 99 | os.makedirs(output_path, exist_ok=True) 100 | if output_list: 101 | fout = open(output_list, "w") 102 | 103 | print("start processing") 104 | np.random.seed(0) 105 | for ind, basename in enumerate(dataset_dict["basenames"]): 106 | img_name = os.path.join(input_path, basename) 107 | print(" processing {} ({}/{})".format(img_name, ind + 1, num_images)) 108 | # input 109 | img = utils.read_image(img_name) 110 | if mask_path: 111 | mask_name = img_name.replace(input_path, mask_path).replace(".jpg",".png") 112 | mask = cv2.imread(mask_name, 0) 113 | 114 | preds = [] 115 | for _ in range(args.it): 116 | if mask_path: 117 | if args.it == 1: 118 | color = np.array([0.5, 0.5, 0.5]) 119 | else: 120 | color = np.random.random([3]) 121 | for cls in cls2mask: 122 | img[mask == cls] = color 123 | 124 | img_input = transform({"image": 
img})["image"] 125 | # compute 126 | with torch.no_grad(): 127 | sample = torch.from_numpy(img_input).to(device).unsqueeze(0) 128 | prediction = model.forward(sample) 129 | 130 | if save_full: 131 | prediction = ( 132 | torch.nn.functional.interpolate( 133 | prediction.unsqueeze(1), 134 | size=img.shape[:2], 135 | mode="bicubic", 136 | align_corners=False, 137 | ) 138 | .squeeze() 139 | .cpu() 140 | .numpy() 141 | ) 142 | else: 143 | prediction = prediction.squeeze().cpu().numpy() 144 | preds.append(prediction) 145 | 146 | prediction = np.median(np.stack(preds,axis=0), axis=0) 147 | 148 | output_dir = os.path.join(output_path, os.path.dirname(basename)) 149 | os.makedirs(output_dir, exist_ok=True) 150 | filename = os.path.join(output_dir, os.path.splitext(os.path.basename(img_name))[0]) 151 | 152 | np.save(filename, prediction.astype(np.float32)) 153 | if output_list: 154 | fout.write(img_name + " " + filename + ".npy\n") 155 | 156 | utils.write_depth(filename, prediction, bytes=2) 157 | 158 | print("finished") 159 | 160 | 161 | if __name__ == "__main__": 162 | parser = argparse.ArgumentParser() 163 | 164 | parser.add_argument('-i', '--input_path', 165 | default='input', 166 | help='folder with images' 167 | ) 168 | 169 | parser.add_argument('--dataset_txt', 170 | default='dataset.txt', 171 | help='dataset txt file', 172 | ) 173 | 174 | parser.add_argument('--mask_path', 175 | default='', 176 | help='folder with mask images' 177 | ) 178 | 179 | parser.add_argument('--cls2mask', 180 | default=[1], 181 | type=int, 182 | nargs='+', 183 | help='classes to mask' 184 | ) 185 | 186 | parser.add_argument('--it', 187 | default=1, 188 | type=int, 189 | help="number of iteration to run midas" 190 | ) 191 | 192 | parser.add_argument('-o', '--output_path', 193 | default='output', 194 | help='folder for output images' 195 | ) 196 | 197 | parser.add_argument('--output_list', 198 | default='', 199 | help='output list of generated depths as txt file' 200 | ) 201 | 202 | parser.add_argument('--save_full_res', 203 | action='store_true', 204 | help='save original resolution' 205 | ) 206 | 207 | parser.add_argument('-m', '--model_weights', 208 | default=None, 209 | help='path to the trained weights of model' 210 | ) 211 | 212 | parser.add_argument('-t', '--model_type', 213 | default='dpt_large', 214 | help='model type: dpt_large, midas_v21' 215 | ) 216 | 217 | args = parser.parse_args() 218 | 219 | default_models = { 220 | "midas_v21": "weights/Base/midas_v21-base.pt", 221 | "dpt_large": "weights/Base/dpt_large-base.pt", 222 | } 223 | 224 | if args.model_weights is None: 225 | args.model_weights = default_models[args.model_type] 226 | 227 | # set torch options 228 | torch.backends.cudnn.enabled = True 229 | torch.backends.cudnn.benchmark = True 230 | 231 | print(args) 232 | # compute depth maps 233 | run(args.input_path, args.output_path, args.dataset_txt, args.model_weights, args.model_type, save_full=args.save_full_res, mask_path=args.mask_path, cls2mask=args.cls2mask, it=args.it, output_list=args.output_list) 234 | -------------------------------------------------------------------------------- /scripts/finetune.sh: -------------------------------------------------------------------------------- 1 | cd .. 2 | 3 | model="dpt_large" # ["midas_v21", "dpt_large"] 4 | output_path=./experiment_models/ 5 | dataroot="data" 6 | txtroot="datasets" 7 | exp_name="Ft. 
Virtual Depth" 8 | 9 | python finetune.py --exp_name "$exp_name" \ 10 | --training_datasets trans10k msd \ 11 | --training_datasets_dir $dataroot"/Trans10K" $dataroot"/MSD" \ 12 | --training_datasets_txt $txtroot"/trans10k/virtual_depth_"$model".txt" $txtroot"/msd/virtual_depth_"$model".txt" \ 13 | --output_path $output_path \ 14 | --model_type $model -------------------------------------------------------------------------------- /scripts/generate_virtual_depth.sh: -------------------------------------------------------------------------------- 1 | root="path_to_dataset_root" 2 | cd .. 3 | 4 | model="dpt_large" # ["midas_v21", "dpt_large"] 5 | dataset="Trans10K" # ["Trans10K", "MSD"] 6 | splits="train test validation" 7 | for split in $splits 8 | do 9 | echo $model $dataset $split 10 | input_dir=$root/$dataset/$split/images # path to dataset folder with images 11 | mask_dir=$root/$dataset/$split/masks # path to dataset folder with segmentations, either GT or proxy 12 | output_dir=$root"/"$dataset/$split/$model"_proxies"/$exp # output path 13 | 14 | dataset_lower=$(echo $dataset | tr '[:upper:]' '[:lower:]') 15 | dataset_txt="datasets/"$dataset_lower"/"$split".txt" # inference list 16 | 17 | ### define output_list if you want to save the list of the generated virtual depths 18 | exp="base" 19 | output_list="datasets/"$dataset_lower"/"$split"_"$model"_"$exp".txt" 20 | ### 21 | 22 | if [ -f $dataset_txt ] 23 | then 24 | python run.py --model_type $model \ 25 | --input_path $input_dir \ 26 | --dataset_txt $dataset_txt \ 27 | --output_path $output_dir \ 28 | --output_list $output_list \ 29 | --mask_path $mask_dir \ 30 | --it 5 \ 31 | --cls2mask 255 # list of class ids in segmentation maps relative to ToM surfaces. 32 | fi 33 | done -------------------------------------------------------------------------------- /scripts/table2.sh: -------------------------------------------------------------------------------- 1 | cd .. 2 | 3 | ### Change this path ### 4 | dataset_root="/media/data2/Booster/train/balanced" 5 | ######################## 6 | 7 | dataset_txt="datasets/booster/train_stereo.txt" 8 | 9 | # RESULTS TABLE 2 10 | for model in "midas_v21" "dpt_large" 11 | do 12 | ## BASE MODEL ### 13 | output_dir="results/Base/"$model 14 | python run.py --model_type $model \ 15 | --input_path $dataset_root \ 16 | --dataset_txt $dataset_txt \ 17 | --output_path $output_dir 18 | result_path="results/table2_base_"$model".txt" 19 | python evaluate_mono.py --gt_root $dataset_root \ 20 | --pred_root $output_dir \ 21 | --dataset_txt $dataset_txt \ 22 | --output_path $result_path 23 | 24 | ## FT. BASE MODEL ### 25 | output_dir="results/Table2/Ft. Base/"$model 26 | model_weights="weights/Table 2/Ft. Base/"$model"_final.pt" 27 | python run.py --model_type $model \ 28 | --input_path $dataset_root \ 29 | --dataset_txt $dataset_txt \ 30 | --output_path "$output_dir" \ 31 | --model_weights "$model_weights" 32 | result_path="results/table2_ftbase_"$model".txt" 33 | python evaluate_mono.py --gt_root $dataset_root \ 34 | --pred_root "$output_dir" \ 35 | --dataset_txt $dataset_txt \ 36 | --output_path $result_path 37 | 38 | ## FT. VIRTUAL DEPTH MODEL - OUR ### 39 | output_dir="results/Table2/Ft. Virtual Depth/"$model 40 | model_weights="weights/Table 2/Ft. 
Virtual Depth/"$model"_final.pt" 41 | python run.py --model_type $model \ 42 | --input_path $dataset_root \ 43 | --dataset_txt $dataset_txt \ 44 | --output_path "$output_dir" \ 45 | --model_weights "$model_weights" 46 | result_path="results/table2_ftvirtualdepth_"$model".txt" 47 | python evaluate_mono.py --gt_root $dataset_root \ 48 | --pred_root "$output_dir" \ 49 | --dataset_txt $dataset_txt \ 50 | --output_path $result_path 51 | done -------------------------------------------------------------------------------- /scripts/table3.sh: -------------------------------------------------------------------------------- 1 | cd .. 2 | 3 | ### Change this path ### 4 | dataset_root="/media/data2/Booster/train/balanced" 5 | ######################## 6 | 7 | dataset_txt="datasets/booster/train_stereo.txt" 8 | 9 | # RESULTS TABLE 3 10 | for model in "midas_v21" "dpt_large" 11 | do 12 | ## Ft. Virtual Depth (GT) MODEL ### 13 | output_dir="results/Table3/Ft. Virtual Depth (GT)/"$model 14 | model_weights="weights/Table 3/Ft. Virtual Depth (GT)/"$model"_final.pt" 15 | python run.py --model_type $model \ 16 | --input_path $dataset_root \ 17 | --dataset_txt $dataset_txt \ 18 | --output_path "$output_dir" \ 19 | --model_weights "$model_weights" 20 | result_path="results/Table3_ftvirutaldepthgt_"$model".txt" 21 | python evaluate_mono.py --gt_root $dataset_root \ 22 | --pred_root "$output_dir" \ 23 | --dataset_txt $dataset_txt \ 24 | --output_path $result_path 25 | 26 | ## Ft. Virtual Depth (Proxy) MODEL - OUR ### 27 | output_dir="results/Table3/Ft. Virtual Depth (Proxy)/"$model 28 | model_weights="weights/Table 3/Ft. Virtual Depth (Proxy)/"$model"_final.pt" 29 | python run.py --model_type $model \ 30 | --input_path $dataset_root \ 31 | --dataset_txt $dataset_txt \ 32 | --output_path "$output_dir" \ 33 | --model_weights "$model_weights" 34 | result_path="results/Table3_ftvirtualdepthproxy_"$model".txt" 35 | python evaluate_mono.py --gt_root $dataset_root \ 36 | --pred_root "$output_dir" \ 37 | --dataset_txt $dataset_txt \ 38 | --output_path $result_path 39 | done -------------------------------------------------------------------------------- /utils.py: -------------------------------------------------------------------------------- 1 | """ 2 | Utils for monoDepth. 3 | """ 4 | import sys 5 | import re 6 | import numpy as np 7 | import cv2 8 | 9 | def decode_3_channels(raw, max_depth=1000): 10 | """Carla format to depth 11 | Args: 12 | raw: carla format depth image. Expected in BGR. 13 | max_depth: max depth used during rendering 14 | """ 15 | raw = raw.astype(np.float32) 16 | out = raw[:,:,2] + raw[:,:,1] * 256 + raw[:,:,0]*256*256 17 | out = out / (256*256*256 - 1) * max_depth 18 | return out 19 | 20 | 21 | def read_pfm(path): 22 | """Read pfm file. 
23 | 24 | Args: 25 | path (str): path to file 26 | 27 | Returns: 28 | tuple: (data, scale) 29 | """ 30 | with open(path, "rb") as file: 31 | 32 | color = None 33 | width = None 34 | height = None 35 | scale = None 36 | endian = None 37 | 38 | header = file.readline().rstrip() 39 | if header.decode("ascii") == "PF": 40 | color = True 41 | elif header.decode("ascii") == "Pf": 42 | color = False 43 | else: 44 | raise Exception("Not a PFM file: " + path) 45 | 46 | dim_match = re.match(r"^(\d+)\s(\d+)\s$", file.readline().decode("ascii")) 47 | if dim_match: 48 | width, height = list(map(int, dim_match.groups())) 49 | else: 50 | raise Exception("Malformed PFM header.") 51 | 52 | scale = float(file.readline().decode("ascii").rstrip()) 53 | if scale < 0: 54 | # little-endian 55 | endian = "<" 56 | scale = -scale 57 | else: 58 | # big-endian 59 | endian = ">" 60 | 61 | data = np.fromfile(file, endian + "f") 62 | shape = (height, width, 3) if color else (height, width) 63 | 64 | data = np.reshape(data, shape) 65 | data = np.flipud(data) 66 | 67 | return data, scale 68 | 69 | 70 | def write_pfm(path, image, scale=1): 71 | """Write pfm file. 72 | 73 | Args: 74 | path (str): pathto file 75 | image (array): data 76 | scale (int, optional): Scale. Defaults to 1. 77 | """ 78 | 79 | with open(path, "wb") as file: 80 | color = None 81 | 82 | if image.dtype.name != "float32": 83 | raise Exception("Image dtype must be float32.") 84 | 85 | image = np.flipud(image) 86 | 87 | if len(image.shape) == 3 and image.shape[2] == 3: # color image 88 | color = True 89 | elif ( 90 | len(image.shape) == 2 or len(image.shape) == 3 and image.shape[2] == 1 91 | ): # greyscale 92 | color = False 93 | else: 94 | raise Exception("Image must have H x W x 3, H x W x 1 or H x W dimensions.") 95 | 96 | file.write("PF\n" if color else "Pf\n".encode()) 97 | file.write("%d %d\n".encode() % (image.shape[1], image.shape[0])) 98 | 99 | endian = image.dtype.byteorder 100 | 101 | if endian == "<" or endian == "=" and sys.byteorder == "little": 102 | scale = -scale 103 | 104 | file.write("%f\n".encode() % scale) 105 | 106 | image.tofile(file) 107 | 108 | 109 | def read_d(path, scale_factor=256.): 110 | """Read depth or disp Map 111 | Args: 112 | path: path to depth or disp 113 | scale_factor: scale factor used to decode png 16 bit images 114 | """ 115 | 116 | if path.endswith("pfm"): 117 | d = read_pfm(path) 118 | elif path.endswith("npy"): 119 | d = np.load(path) 120 | elif path.endswith("exr"): 121 | d = cv2.imread(path, cv2.IMREAD_UNCHANGED) 122 | d = d[:,:,0] 123 | elif path.endswith("png"): 124 | d = cv2.imread(path, cv2.IMREAD_UNCHANGED) 125 | if len(d.shape) == 3: 126 | d = decode_3_channels(d) 127 | elif d.dtype == np.uint16: 128 | d = d.astype(np.float32) 129 | d = d / scale_factor 130 | else: 131 | d = cv2.imread(path)[:,:,0] 132 | 133 | return d 134 | 135 | def read_image(path): 136 | """Read image and output RGB image (0-1). 137 | 138 | Args: 139 | path (str): path to file 140 | 141 | Returns: 142 | array: RGB image (0-1) 143 | """ 144 | img = cv2.imread(path) 145 | 146 | if img.ndim == 2: 147 | img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR) 148 | 149 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) / 255.0 150 | 151 | return img 152 | 153 | def write_depth(path, depth, bytes=1): 154 | """Write depth map to pfm and png file. 
155 | 156 | Args: 157 | path (str): filepath without extension 158 | depth (array): depth 159 | """ 160 | 161 | depth_min = depth.min() 162 | depth_max = depth.max() 163 | 164 | max_val = (2**(8*bytes))-1 165 | 166 | if depth_max - depth_min > np.finfo("float").eps: 167 | out = max_val * (depth - depth_min) / (depth_max - depth_min) 168 | else: 169 | out = np.zeros(depth.shape, dtype=depth.type) 170 | 171 | if bytes == 1: 172 | cv2.imwrite(path + ".png", out.astype("uint8")) 173 | elif bytes == 2: 174 | cv2.imwrite(path + ".png", out.astype("uint16")) 175 | 176 | 177 | def read_calib_xml(calib_path, factor_baseline=0.001): 178 | cv_file = cv2.FileStorage(calib_path, cv2.FILE_STORAGE_READ) 179 | calib = cv_file.getNode("proj_matL").mat()[:3,:3] 180 | fx = calib[0,0] 181 | baseline = float(cv_file.getNode("baselineLR").real())*factor_baseline 182 | return fx, baseline 183 | 184 | 185 | def parse_dataset_txt(dataset_txt): 186 | with open(dataset_txt) as data_txt: 187 | gt_files = [] 188 | basenames = [] 189 | focals = [] 190 | baselines = [] 191 | calib_files = [] 192 | 193 | for line in data_txt: 194 | values = line.split(" ") 195 | 196 | if len(values) == 2: 197 | basenames.append(values[0].strip()) 198 | gt_files.append(values[1].strip()) 199 | 200 | elif len(values) == 3: 201 | basenames.append(values[0].strip()) 202 | gt_files.append(values[1].strip()) 203 | calib_files.append(values[2].strip()) 204 | 205 | elif len(values) == 4: 206 | basenames.append(values[0].strip()) 207 | gt_files.append(values[1].strip()) 208 | focals.append(float(values[2].strip())) 209 | baselines.append(float(values[3].strip())) 210 | 211 | else: 212 | print("Wrong format dataset txt file") 213 | exit(-1) 214 | 215 | dataset_dict = {} 216 | if gt_files: dataset_dict["gt_paths"] = gt_files 217 | if basenames: dataset_dict["basenames"] = basenames 218 | if calib_files: dataset_dict["calib_paths"] = calib_files 219 | if focals: dataset_dict["focals"] = focals 220 | if baselines: dataset_dict["baselines"] = baselines 221 | return dataset_dict 222 | 223 | 224 | def compute_scale_and_shift(prediction, target, mask): 225 | # system matrix: A = [[a_00, a_01], [a_10, a_11]] 226 | a_00 = np.sum(mask * prediction * prediction, axis=(1, 2)) 227 | a_01 = np.sum(mask * prediction, axis=(1, 2)) 228 | a_11 = np.sum(mask, axis=(1, 2)) 229 | 230 | # right hand side: b = [b_0, b_1] 231 | b_0 = np.sum(mask * prediction * target, axis=(1, 2)) 232 | b_1 = np.sum(mask * target, axis=(1, 2)) 233 | 234 | # solution: x = A^-1 . b = [[a_11, -a_01], [-a_10, a_00]] / (a_00 * a_11 - a_01 * a_10) . b 235 | x_0 = np.zeros_like(b_0) 236 | x_1 = np.zeros_like(b_1) 237 | 238 | det = a_00 * a_11 - a_01 * a_01 239 | # A needs to be a positive definite matrix. 240 | valid = det > 0 241 | 242 | x_0[valid] = (a_11[valid] * b_0[valid] - a_01[valid] * b_1[valid]) / det[valid] 243 | x_1[valid] = (-a_01[valid] * b_0[valid] + a_00[valid] * b_1[valid]) / det[valid] 244 | 245 | return x_0, x_1 --------------------------------------------------------------------------------
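
A few usage sketches for the code above follow; they are illustrative only, not part of the repository. First, single-image inference with the `dpt_large` configuration, condensed from `run.py`: the model, transform, and normalization match the `dpt_large` branch there, while the input image path and the checkpoint path (which follows the layout used in `scripts/table2.sh`) are assumptions for the sake of the example.

```python
import cv2
import torch
from torchvision.transforms import Compose

import utils
from midas.dpt_depth import DPTDepthModel
from midas.transforms import Resize, NormalizeImage, PrepareForNet

IMG_PATH = "input/example.jpg"                                      # hypothetical input image
CKPT_PATH = "weights/Table 2/Ft. Virtual Depth/dpt_large_final.pt"  # layout from scripts/table2.sh

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = DPTDepthModel(path=None, backbone="vitl16_384", non_negative=True)
ckpt = torch.load(CKPT_PATH, map_location="cpu")
model.load_state_dict(ckpt.get("model_state_dict", ckpt))  # accept raw or wrapped state dicts, as run.py does
model.to(device).eval()

transform = Compose([
    Resize(384, 384, resize_target=True, keep_aspect_ratio=True,
           ensure_multiple_of=32, resize_method="lower_bound",
           image_interpolation_method=cv2.INTER_CUBIC),
    NormalizeImage(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
    PrepareForNet(),
])

img = utils.read_image(IMG_PATH)                             # RGB, values in [0, 1]
sample = torch.from_numpy(transform({"image": img})["image"]).unsqueeze(0).to(device)
with torch.no_grad():
    prediction = model(sample).squeeze().cpu().numpy()       # relative inverse depth at network resolution

utils.write_depth("example_depth", prediction, bytes=2)      # 16-bit visualisation PNG
```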
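
The core of the virtual-depth generation in `run.py` is the loop that paints ToM pixels with a colour, runs the monocular network, and takes the pixel-wise median over several iterations. Below is a minimal, self-contained sketch of that loop; `depth_fn` is a hypothetical stand-in for the network forward pass and not part of the repository API.

```python
import numpy as np

def inpaint_and_predict(img, mask, cls2mask, depth_fn, iterations=5, seed=0):
    """Paint the masked classes, predict depth, and take the median over iterations."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(iterations):
        painted = img.copy()
        # run.py uses mid-grey when a single iteration is requested, random colours otherwise.
        color = np.array([0.5, 0.5, 0.5]) if iterations == 1 else rng.random(3)
        for cls in cls2mask:
            painted[mask == cls] = color
        preds.append(depth_fn(painted))
    return np.median(np.stack(preds, axis=0), axis=0)

# Toy usage with a dummy "network" that simply averages the colour channels.
img = np.random.rand(64, 64, 3).astype(np.float32)
mask = np.zeros((64, 64), dtype=np.uint8)
mask[16:48, 16:48] = 255                       # pretend this region is a mirror
virtual_depth = inpaint_and_predict(img, mask, cls2mask=[255],
                                    depth_fn=lambda x: x.mean(axis=2))
print(virtual_depth.shape)                     # (64, 64)
```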
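
The `resize_method` options documented in `midas/transforms.py` are easy to misread. With `keep_aspect_ratio=True`, `"lower_bound"` returns an output no smaller than the requested size (the DPT setting in `run.py`), `"upper_bound"` returns one no larger (the `midas_v21` setting), and both snap dimensions to the requested multiple. A quick check, assuming a 640x480 input:

```python
import numpy as np
from midas.transforms import Resize, PrepareForNet

resize = Resize(384, 384, resize_target=False, keep_aspect_ratio=True,
                ensure_multiple_of=32, resize_method="lower_bound")

sample = {"image": np.random.rand(480, 640, 3).astype(np.float32)}
out = PrepareForNet()(resize(sample))
print(out["image"].shape)   # (3, 384, 512): height meets the 384 lower bound, width keeps the aspect ratio
```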
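
`utils.parse_dataset_txt` accepts three line layouts, distinguished only by the number of space-separated fields. A quick demonstration with made-up file names and camera parameters:

```python
from utils import parse_dataset_txt

# <image> <gt>                      -> basenames, gt_paths
# <image> <gt> <calib_xml>          -> ... plus calib_paths
# <image> <gt> <focal> <baseline>   -> ... plus focals and baselines
with open("example_split.txt", "w") as f:
    f.write("scene0/im0.png scene0/gt.npy 1385.0 0.08\n")
    f.write("scene1/im0.png scene1/gt.npy 1385.0 0.08\n")

d = parse_dataset_txt("example_split.txt")
print(d["basenames"], d["focals"], d["baselines"])
```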
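
`utils.read_d` dispatches on the file extension; for 16-bit PNGs it assumes the depth (or disparity) was stored multiplied by `scale_factor` (256 by default). `utils.write_depth`, by contrast, stores a per-image normalised map, so it is meant for visualisation rather than for recovering metric values. A small round trip with made-up values:

```python
import numpy as np
import cv2
from utils import read_d, write_depth

depth = np.arange(1, 17, dtype=np.float32).reshape(4, 4)       # fake metric depth in metres

cv2.imwrite("depth16.png", (depth * 256.0).astype(np.uint16))  # KITTI-style 16-bit encoding
print(read_d("depth16.png")[0, 0])                             # 1.0 -> metric values recovered

write_depth("depth_vis", depth, bytes=2)                       # depth_vis.png, normalised to 0..65535
```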
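
`utils.read_pfm` and `utils.write_pfm` follow the usual PFM convention in which a negative scale in the header marks little-endian data. A greyscale round trip as a sanity check; the file name is arbitrary:

```python
import numpy as np
from utils import read_pfm, write_pfm

disp = np.random.rand(8, 12).astype(np.float32)   # write_pfm requires float32
write_pfm("disp_example.pfm", disp)
data, scale = read_pfm("disp_example.pfm")
print(np.allclose(data, disp), scale)             # True 1.0
```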
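
Finally, `utils.compute_scale_and_shift` solves, per image in a `[B, H, W]` batch, the closed-form least-squares problem min_{s,t} sum(mask * (s * prediction + t - target)^2); this is the standard way to align scale- and shift-invariant monocular predictions with ground truth before computing error metrics. A toy check with synthetic arrays:

```python
import numpy as np
from utils import compute_scale_and_shift

gt = np.random.rand(1, 4, 4).astype(np.float32) + 0.1   # fake ground truth, shape [B, H, W]
pred = 3.0 * gt + 0.5                                    # prediction off by a scale and a shift
mask = np.ones_like(gt)                                  # 1 where the ground truth is valid

scale, shift = compute_scale_and_shift(pred, gt, mask)
aligned = scale[:, None, None] * pred + shift[:, None, None]
print(np.abs(aligned - gt).max())                        # ~0: the affine offset is recovered
```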