├── README.md
├── create_proxy_stereo.py
├── datasets
│   ├── booster
│   │   └── train_stereo.txt
│   ├── dataloader.py
│   ├── msd
│   │   ├── test.txt
│   │   ├── train.txt
│   │   ├── virtual_depth_dpt_large.txt
│   │   └── virtual_depth_midas_v21.txt
│   └── trans10k
│       ├── test.txt
│       ├── train.txt
│       ├── validation.txt
│       ├── virtual_depth_dpt_large.txt
│       └── virtual_depth_midas_v21.txt
├── evaluate_mono.py
├── finetune.py
├── images
│   ├── framework_mono.png
│   └── qualitatives.png
├── loss.py
├── midas
│   ├── base_model.py
│   ├── blocks.py
│   ├── dpt_depth.py
│   ├── midas_net.py
│   ├── midas_net_custom.py
│   ├── transforms.py
│   └── vit.py
├── requirements.txt
├── run.py
├── scripts
│   ├── finetune.sh
│   ├── generate_virtual_depth.sh
│   ├── table2.sh
│   └── table3.sh
└── utils.py
/README.md:
--------------------------------------------------------------------------------
1 |
2 | # Learning Depth Estimation for Transparent and Mirror Surfaces (ICCV 2023)
3 |
4 |
5 |
6 |
7 | :rotating_light: This repository contains download links to our dataset, code snippets, and trained deep models of our work "**Learning Depth Estimation for Transparent and Mirror Surfaces**", [ICCV 2023](https://iccv2023.thecvf.com/)
8 |
9 | by [Alex Costanzino*](https://www.unibo.it/sitoweb/alex.costanzino), [Pierluigi Zama Ramirez*](https://pierlui92.github.io/), [Matteo Poggi*](https://mattpoggi.github.io/), [Fabio Tosi](https://fabiotosi92.github.io/), [Stefano Mattoccia](https://www.unibo.it/sitoweb/stefano.mattoccia), and [Luigi Di Stefano](https://www.unibo.it/sitoweb/luigi.distefano). \* _Equal Contribution_
10 |
11 | University of Bologna
12 |
13 |
14 |
15 |
16 |
17 |
18 |
19 | [Project Page](https://cvlab-unibo.github.io/Depth4ToM/) | [Paper](https://arxiv.org/abs/2307.15052)
20 |
21 |
22 |
23 | ## :bookmark_tabs: Table of Contents
24 |
25 | 1. [Introduction](#clapper-introduction)
26 | 2. [Dataset](#file_cabinet-dataset)
27 | - [Download](#arrow_down-get-your-hands-on-the-data)
28 | 3. [Pretrained Models](#inbox_tray-pretrained-models)
29 | 4. [Code](#memo-code)
30 | 5. [Qualitative Results](#art-qualitative-results)
31 | 6. [Contacts](#envelope-contacts)
32 |
33 |
34 |
35 | ## :clapper: Introduction
36 | Inferring the depth of transparent or mirror (ToM) surfaces represents a hard challenge for sensors, algorithms, and deep networks alike. We propose a simple pipeline for learning to estimate depth properly for such surfaces with neural networks, without requiring any ground-truth annotation. We unveil how to obtain reliable pseudo labels by in-painting ToM objects in images and processing them with a monocular depth estimation model. These labels can be used to fine-tune existing monocular or stereo networks, to let them learn how to deal with ToM surfaces. Experimental results on the Booster dataset show the dramatic improvements enabled by our remarkably simple proposal.
37 |
38 |
39 |
40 |
41 |
42 |
43 |
44 | :fountain_pen: If you find this code useful in your research, please cite:
45 |
46 | ```bibtex
47 | @inproceedings{costanzino2023iccv,
48 | title = {Learning Depth Estimation for Transparent and Mirror Surfaces},
49 | author = {Costanzino, Alex and Zama Ramirez, Pierluigi and Poggi, Matteo and Tosi, Fabio and Mattoccia, Stefano and Di Stefano, Luigi},
50 | booktitle = {The IEEE International Conference on Computer Vision},
51 | note = {ICCV},
52 | year = {2023},
53 | }
54 | ```
55 |
56 | ## :file_cabinet: Dataset
57 |
58 | In our experiments, we employed two datasets featuring transparent or mirror objects: [Trans10K](https://xieenze.github.io/projects/TransLAB/TransLAB.html) and [MSD](https://mhaiyang.github.io/ICCV2019_MirrorNet/index). With our in-painting technique, we obtain virtual depth maps to finetune monocular networks. For the sake of reproducibility, we make Trans10K and MSD available together with the proxy labels used to finetune our models.
59 |
60 | ### :arrow_down: Get Your Hands on the Data
61 | Trans10K and MSD with Virtual Depths. [[Download]](https://1drv.ms/u/s!AgV49D1Z6rmGgZAz2I7tMepfdVrZYQ?e=jbuaJB)
62 |
63 | We also employed the Booster Dataset in our experiments. [[Download]](https://cvlab-unibo.github.io/booster-web/)
64 |
65 | ## :inbox_tray: Pretrained Models
66 |
67 | Here, you can download the weights of the **MiDaS** and **DPT** architectures used to produce the results in Table 2 and Table 3 of our paper. If you just need the best model, use `Table 2/Ft. Virtual Depth/dpt_large_final.pt`.
68 |
69 | To use these weights, please follow these steps:
70 |
71 | 1. Create a folder named `weights` in the project directory.
72 | 2. Download the weights [[Download]](https://1drv.ms/u/s!AgV49D1Z6rmGgZAyTbFLjjTMdgsE_A?e=1xcf4y)
73 | 3. Copy the downloaded weights into the `weights` folder.
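In practice, assuming the archive has already been downloaded and extracted somewhere on disk (the source path below is a placeholder), this amounts to:

```bash
# Create the folder expected by the scripts and copy the extracted checkpoints into it,
# preserving the "Table 2"/"Table 3" subfolder structure of the archive.
mkdir -p weights
cp -r "/path/to/extracted_weights/"* weights/
```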
74 |
75 | ## :memo: Code
76 |
77 |
78 |
79 | **Warning**:
80 | - Please be aware that we will not be releasing the training code for deep stereo models. We provide only our algorithm to obtain proxy depth labels by merging monocular and stereo predictions.
81 | - The code uses `wandb` during training to log results, so make sure you have a wandb account. If you prefer not to use `wandb`, comment out the wandb logging lines in `finetune.py`.
82 |
83 |
84 |
85 |
86 | ### :hammer_and_wrench: Setup Instructions
87 |
88 | **Dependencies**: Ensure that you have installed all the necessary dependencies. The list of dependencies can be found in the `./requirements.txt` file.
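A minimal setup, assuming a recent Python 3 installation (the virtual-environment step is optional and the environment name is just an example), could be:

```bash
# Optionally create an isolated environment, then install the dependencies listed in requirements.txt.
python3 -m venv depth4tom
source depth4tom/bin/activate
pip install -r requirements.txt
```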
89 |
90 |
91 | ### :rocket: Inference Monocular Networks
92 |
93 | The `run.py` script tests monocular networks. It can be used to predict monocular depth maps with pretrained networks, or to apply our in-painting strategy to Base networks to obtain Virtual Depths (see the example invocation after the option list below).
94 |
95 | You can specify the following options:
96 | - `--input_path`: Path to the root directory of the dataset. E.g., _Booster/balanced/train_ if you want to test the model on the training set of Booster.
97 | - `--dataset_txt`: The list of dataset samples. Each line contains the path of an image relative to `input_path`. You can find some examples in the folder _datasets/_. E.g., to run on the training set of Booster use *datasets/booster/train_stereo.txt*
98 | - `--mask_path`: Optional path to the folder containing masks. Each mask should have the same relative path as the corresponding image. When this path is specified, masks are applied to colorize ToM objects.
99 | - `--cls2mask`: IDs referring to ToM objects in masks.
100 | - `--it`: Number of inferences for each image. Used when in-painting with several random colors.
101 | - `--output_path`: Output directory.
102 | - `--output_list`: Save the prediction paths in a txt file.
103 | - `--save_full_res`: Save the predictions at the input resolution. If not specified, predictions are saved at the model output resolution.
104 | - `--model_weights`: Path to the trained weights of the model. If not specified, the Base network weights are loaded from the default paths.
105 | - `--model_type`: Model type. Either `dpt_large` or `midas_v21`.
106 |
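For instance, a hedged example of plain monocular inference on the Booster training set (the dataset location, the checkpoint path under `weights/`, and the output folder are placeholders):

```bash
# Predict full-resolution depth maps with the fine-tuned DPT-Large model on the Booster training images.
python run.py \
    --input_path /path/to/Booster/balanced/train \
    --dataset_txt datasets/booster/train_stereo.txt \
    --model_type dpt_large \
    --model_weights "weights/Table 2/Ft. Virtual Depth/dpt_large_final.pt" \
    --output_path results/booster_train \
    --save_full_res
```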
107 | You can reproduce the results of Table 2 and Table 3 of the paper by running `scripts/table2.sh` and `scripts/table3.sh`.
108 |
109 | If you haven't downloaded the pretrained models yet, you can find the download links in the **Pretrained Models** section above.
110 |
111 | ### :rocket: Train Monocular Networks
112 |
113 | To finetune networks, refer to the example in `scripts/finetune.sh`.
114 |
115 | ### :rocket: Monocular Virtual Depth Generation
116 |
117 | To generate virtual depth from depth networks using our in-painting strategy, refer to the example in `scripts/generate_virtual_depth.sh`.
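Conceptually, this uses the same `run.py` options described above with masks enabled. A sketch of such a call follows; all paths, the class ID, and the number of in-painting iterations are illustrative placeholders, not necessarily the values used in the script:

```bash
# Run inference --it times per image, in-painting the ToM objects (mask class ID 255 here) with random colors.
python run.py \
    --input_path /path/to/Trans10K/train/images \
    --dataset_txt datasets/trans10k/train.txt \
    --mask_path /path/to/Trans10K/train/masks \
    --cls2mask 255 \
    --it 5 \
    --model_type midas_v21 \
    --output_path results/virtual_depth_trans10k
```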
118 |
119 | ### :rocket: Stereo Proxy Depth Generation
120 |
121 | To generate proxy depth maps with our merging strategy for finetuning stereo networks, you can use `create_proxy_stereo.py`.
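For example (the root folders below are placeholders; the flags are those exposed by the script's argument parser):

```bash
# Align monocular predictions to the stereo ones on lambertian pixels (scale and shift),
# then keep monocular depth on ToM pixels and stereo depth elsewhere.
python create_proxy_stereo.py \
    --mono_root /path/to/mono_predictions \
    --stereo_root /path/to/stereo_predictions \
    --mask_root /path/to/segmentation_masks \
    --stereo_ext .png \
    --scale_factor_16bit_stereo 64 \
    --output_root results_merge
```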
122 |
123 | As explained above, we will not release the code for finetuning stereo networks. However, our implementation was based on the official code of [RAFT-Stereo](https://github.com/princeton-vl/RAFT-Stereo) and [CREStereo](https://github.com/megvii-research/CREStereo).
124 |
125 | ## :art: Qualitative Results
126 |
127 | In this section, we present illustrative examples that demonstrate the effectiveness of our proposal.
128 |
129 |
130 |
131 |
132 |
133 | ## :envelope: Contacts
134 |
135 | For questions, please send an email to alex.costanzino@unibo.it, pierluigi.zama@unibo.it, m.poggi@unibo.it, or fabio.tosi5@unibo.it
136 |
137 | ## :pray: Acknowledgements
138 |
139 | We would like to extend our sincere appreciation to the authors of the following projects for making their code available, which we have utilized in our work:
140 | 
141 | - [MiDaS](https://github.com/isl-org/MiDaS), [RAFT-Stereo](https://github.com/princeton-vl/RAFT-Stereo), and [CREStereo](https://github.com/megvii-research/CREStereo), whose code has been instrumental in our experiments.
142 | 
143 | We also deeply appreciate the authors of the competing methods for their helpful responses and for providing their model weights, which greatly aided accurate comparisons.
--------------------------------------------------------------------------------
/create_proxy_stereo.py:
--------------------------------------------------------------------------------
import os
import numpy as np
import cv2
import matplotlib.pyplot as plt
import argparse


# This script assumes the same subfolder structure for each root (mono_root, stereo_root, mask_root).

parser = argparse.ArgumentParser()
parser.add_argument('--mono_root', help="folder with mono predictions")
parser.add_argument('--stereo_root', help="folder with stereo predictions")
parser.add_argument('--stereo_ext', default=".npy", help="stereo extension.")
parser.add_argument('--scale_factor_16bit_stereo', type=float, default=64, help="16bit scale factor used during saving")
parser.add_argument('--mask_root', default="", help="folder with semantic masks")
parser.add_argument('--output_root', default="results_merge", help="output folder for the merged proxy labels")
parser.add_argument('--debug', action="store_true")
args = parser.parse_args()

debug = args.debug
stereo_root = args.stereo_root
mono_root = args.mono_root
mask_root = args.mask_root
output_root = args.output_root
scale_factor_16bit_stereo = args.scale_factor_16bit_stereo
stereo_ext = args.stereo_ext


def compute_scale_and_shift(prediction, target, mask):
    # system matrix: A = [[a_00, a_01], [a_10, a_11]]
    a_00 = np.sum(mask * prediction * prediction, axis=(1, 2))
    a_01 = np.sum(mask * prediction, axis=(1, 2))
    a_11 = np.sum(mask, axis=(1, 2))

    # right hand side: b = [b_0, b_1]
    b_0 = np.sum(mask * prediction * target, axis=(1, 2))
    b_1 = np.sum(mask * target, axis=(1, 2))

    # solution: x = A^-1 . b = [[a_11, -a_01], [-a_10, a_00]] / (a_00 * a_11 - a_01 * a_10) . b
    x_0 = np.zeros_like(b_0)
    x_1 = np.zeros_like(b_1)

    det = a_00 * a_11 - a_01 * a_01
    # A needs to be a positive definite matrix.
    valid = det > 0

    x_0[valid] = (a_11[valid] * b_0[valid] - a_01[valid] * b_1[valid]) / det[valid]
    x_1[valid] = (-a_01[valid] * b_0[valid] + a_00[valid] * b_1[valid]) / det[valid]

    return x_0, x_1


for root, dirs, files in os.walk(mono_root):
    for mono_path in files:
        if mono_path.endswith(".npy"):
            mono_path = os.path.join(root, mono_path)

            stereo_path = mono_path.replace(mono_root, stereo_root).replace("camera_00/", "")
            if "npy" in stereo_ext:
                stereo = np.load(stereo_path)
            elif "png" in stereo_ext:
                stereo_path = stereo_path.replace(".npy", ".png")
                stereo = cv2.imread(stereo_path, -1).astype(np.float32) / scale_factor_16bit_stereo

            # mono_path is already joined with its root by os.walk above.
            mono = np.load(mono_path)
            mono = cv2.resize(mono, (stereo.shape[1], stereo.shape[0]), interpolation=cv2.INTER_CUBIC)

            # Keep only pixels where the stereo prediction is valid.
            valid = (stereo > 0).astype(np.float32)
            mono[valid == 0] = 0

            mask_path = mono_path.replace(mono_root, mask_root).replace(".npy", ".png")
            mask = cv2.imread(mask_path, 0)
            mask_transparent = (mask * valid) > 0
            mask_lambertian = ((1 - mask) * valid) > 0

            # Normalize the monocular prediction, then align it to the stereo prediction with a
            # scale and shift estimated on lambertian (non-ToM) pixels only.
            mono = (mono - np.min(mono[valid > 0])) / (mono[valid > 0].max() - mono[valid > 0].min())
            a, b = compute_scale_and_shift(np.expand_dims(mono, axis=0), np.expand_dims(stereo, axis=0), np.expand_dims(mask_lambertian.astype(np.float32), axis=0))
            mono = mono * a + b

            # Merge: monocular depth on ToM pixels, stereo depth elsewhere.
            merged = np.zeros(stereo.shape)
            merged[mask_transparent] = mono[mask_transparent]
            merged[mask_lambertian] = stereo[mask_lambertian]

            output_path = os.path.join(output_root, os.path.dirname(mono_path).replace(mono_root + "/", ""))
            basename = os.path.basename(mono_path)
            os.makedirs(output_path, exist_ok=True)

            if debug:
                plt.subplot(3,2,1)
                plt.title("mask_seg")
                plt.imshow(cv2.resize((mask*255).astype(np.uint8), None, fx=0.25, fy=0.25, interpolation=cv2.INTER_NEAREST))
                plt.subplot(3,2,2)
                plt.title("mask_trasp")
                plt.imshow(cv2.resize(mask_transparent.astype(np.float32), None, fx=0.25, fy=0.25, interpolation=cv2.INTER_NEAREST))
                plt.subplot(3,2,3)
                plt.title("mask_lamb")
                plt.imshow(cv2.resize(mask_lambertian.astype(np.float32), None, fx=0.25, fy=0.25, interpolation=cv2.INTER_NEAREST))
                plt.subplot(3,2,4)
                plt.title("stereo")
                plt.imshow(cv2.resize(stereo, None, fx=0.25, fy=0.25), vmin=stereo.min(), vmax=stereo.max(), cmap="jet")
                plt.subplot(3,2,5)
                plt.title("mono")
                plt.imshow(cv2.resize(mono, None, fx=0.25, fy=0.25), vmin=stereo.min(), vmax=stereo.max(), cmap="jet")
                plt.subplot(3,2,6)
                plt.title("merged")
                plt.imshow(cv2.resize(merged, None, fx=0.25, fy=0.25), vmin=stereo.min(), vmax=stereo.max(), cmap="jet")
                plt.savefig(os.path.join(output_path, basename.replace(".npy", ".png")))
            else:
                np.save(os.path.join(output_path, basename), merged)
--------------------------------------------------------------------------------
/datasets/booster/train_stereo.txt:
--------------------------------------------------------------------------------
1 | Bathroom/camera_00/im0.png Bathroom/disp_00.npy Bathroom/calib_00-02.xml
2 | Bathroom/camera_00/im1.png Bathroom/disp_00.npy Bathroom/calib_00-02.xml
3 | Bathroom/camera_00/im2.png Bathroom/disp_00.npy Bathroom/calib_00-02.xml
4 | Bedroom/camera_00/im0.png Bedroom/disp_00.npy Bedroom/calib_00-02.xml
5 | Bedroom/camera_00/im1.png Bedroom/disp_00.npy Bedroom/calib_00-02.xml
6 | Bedroom/camera_00/im2.png Bedroom/disp_00.npy Bedroom/calib_00-02.xml
7 | Bottle/camera_00/im0.png Bottle/disp_00.npy Bottle/calib_00-02.xml
8 | Bottle/camera_00/im1.png Bottle/disp_00.npy Bottle/calib_00-02.xml
9 | Bottle1/camera_00/im0.png Bottle1/disp_00.npy Bottle1/calib_00-02.xml
10 | Bottle1/camera_00/im1.png Bottle1/disp_00.npy Bottle1/calib_00-02.xml
11 | Bottle1/camera_00/im2.png Bottle1/disp_00.npy Bottle1/calib_00-02.xml
12 | Bottle1/camera_00/im3.png Bottle1/disp_00.npy Bottle1/calib_00-02.xml
13 | BottledWater/camera_00/im0.png BottledWater/disp_00.npy BottledWater/calib_00-02.xml
14 | BottledWater/camera_00/im1.png BottledWater/disp_00.npy BottledWater/calib_00-02.xml
15 | BottledWater/camera_00/im2.png BottledWater/disp_00.npy BottledWater/calib_00-02.xml
16 | BottledWater/camera_00/im3.png BottledWater/disp_00.npy BottledWater/calib_00-02.xml
17 | BottledWater/camera_00/im4.png BottledWater/disp_00.npy BottledWater/calib_00-02.xml
18 | Bottles1/camera_00/im0.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml
19 | Bottles1/camera_00/im1.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml
20 | Bottles1/camera_00/im2.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml
21 | Bottles1/camera_00/im3.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml
22 | Bottles1/camera_00/im4.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml
23 | Bottles1/camera_00/im5.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml
24 | Bottles1/camera_00/im6.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml
25 | Bottles1/camera_00/im7.png Bottles1/disp_00.npy Bottles1/calib_00-02.xml
26 | Bucket/camera_00/im0.png Bucket/disp_00.npy Bucket/calib_00-02.xml
27 | Bucket/camera_00/im1.png Bucket/disp_00.npy Bucket/calib_00-02.xml
28 | Bucket/camera_00/im2.png Bucket/disp_00.npy Bucket/calib_00-02.xml
29 | Bucket/camera_00/im3.png Bucket/disp_00.npy Bucket/calib_00-02.xml
30 | Bucket/camera_00/im4.png Bucket/disp_00.npy Bucket/calib_00-02.xml
31 | Bucket/camera_00/im5.png Bucket/disp_00.npy Bucket/calib_00-02.xml
32 | Bucket/camera_00/im6.png Bucket/disp_00.npy Bucket/calib_00-02.xml
33 | Canteen/camera_00/im0.png Canteen/disp_00.npy Canteen/calib_00-02.xml
34 | Canteen/camera_00/im1.png Canteen/disp_00.npy Canteen/calib_00-02.xml
35 | Canteen/camera_00/im2.png Canteen/disp_00.npy Canteen/calib_00-02.xml
36 | Canteen/camera_00/im3.png Canteen/disp_00.npy Canteen/calib_00-02.xml
37 | Canteen/camera_00/im4.png Canteen/disp_00.npy Canteen/calib_00-02.xml
38 | Canteen/camera_00/im5.png Canteen/disp_00.npy Canteen/calib_00-02.xml
39 | Canteen/camera_00/im6.png Canteen/disp_00.npy Canteen/calib_00-02.xml
40 | Canteen/camera_00/im7.png Canteen/disp_00.npy Canteen/calib_00-02.xml
41 | Canteen/camera_00/im8.png Canteen/disp_00.npy Canteen/calib_00-02.xml
42 | Canteen/camera_00/im9.png Canteen/disp_00.npy Canteen/calib_00-02.xml
43 | Case/camera_00/im0.png Case/disp_00.npy Case/calib_00-02.xml
44 | Case/camera_00/im10.png Case/disp_00.npy Case/calib_00-02.xml
45 | Case/camera_00/im1.png Case/disp_00.npy Case/calib_00-02.xml
46 | Case/camera_00/im2.png Case/disp_00.npy Case/calib_00-02.xml
47 | Case/camera_00/im3.png Case/disp_00.npy Case/calib_00-02.xml
48 | Case/camera_00/im4.png Case/disp_00.npy Case/calib_00-02.xml
49 | Case/camera_00/im5.png Case/disp_00.npy Case/calib_00-02.xml
50 | Case/camera_00/im6.png Case/disp_00.npy Case/calib_00-02.xml
51 | Case/camera_00/im7.png Case/disp_00.npy Case/calib_00-02.xml
52 | Case/camera_00/im8.png Case/disp_00.npy Case/calib_00-02.xml
53 | Case/camera_00/im9.png Case/disp_00.npy Case/calib_00-02.xml
54 | CashBox/camera_00/im0.png CashBox/disp_00.npy CashBox/calib_00-02.xml
55 | CashBox/camera_00/im1.png CashBox/disp_00.npy CashBox/calib_00-02.xml
56 | CashBox/camera_00/im2.png CashBox/disp_00.npy CashBox/calib_00-02.xml
57 | CashBox/camera_00/im3.png CashBox/disp_00.npy CashBox/calib_00-02.xml
58 | CashBox/camera_00/im4.png CashBox/disp_00.npy CashBox/calib_00-02.xml
59 | CashBox/camera_00/im5.png CashBox/disp_00.npy CashBox/calib_00-02.xml
60 | CashBox/camera_00/im6.png CashBox/disp_00.npy CashBox/calib_00-02.xml
61 | CashBox/camera_00/im7.png CashBox/disp_00.npy CashBox/calib_00-02.xml
62 | CoffeeMaker/camera_00/im0.png CoffeeMaker/disp_00.npy CoffeeMaker/calib_00-02.xml
63 | CoffeeMaker/camera_00/im1.png CoffeeMaker/disp_00.npy CoffeeMaker/calib_00-02.xml
64 | CoffeeMaker/camera_00/im2.png CoffeeMaker/disp_00.npy CoffeeMaker/calib_00-02.xml
65 | Cooker1/camera_00/im0.png Cooker1/disp_00.npy Cooker1/calib_00-02.xml
66 | Cooker1/camera_00/im1.png Cooker1/disp_00.npy Cooker1/calib_00-02.xml
67 | Cooker1/camera_00/im2.png Cooker1/disp_00.npy Cooker1/calib_00-02.xml
68 | Cooker1/camera_00/im3.png Cooker1/disp_00.npy Cooker1/calib_00-02.xml
69 | Cooker1/camera_00/im4.png Cooker1/disp_00.npy Cooker1/calib_00-02.xml
70 | Cooker1/camera_00/im5.png Cooker1/disp_00.npy Cooker1/calib_00-02.xml
71 | Cooker1/camera_00/im6.png Cooker1/disp_00.npy Cooker1/calib_00-02.xml
72 | Cosmetics/camera_00/im0.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml
73 | Cosmetics/camera_00/im1.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml
74 | Cosmetics/camera_00/im2.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml
75 | Cosmetics/camera_00/im3.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml
76 | Cosmetics/camera_00/im4.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml
77 | Cosmetics/camera_00/im5.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml
78 | Cosmetics/camera_00/im6.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml
79 | Cosmetics/camera_00/im7.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml
80 | Cosmetics/camera_00/im8.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml
81 | Cosmetics/camera_00/im9.png Cosmetics/disp_00.npy Cosmetics/calib_00-02.xml
82 | DogHouse/camera_00/im0.png DogHouse/disp_00.npy DogHouse/calib_00-02.xml
83 | DogHouse/camera_00/im1.png DogHouse/disp_00.npy DogHouse/calib_00-02.xml
84 | DogHouse/camera_00/im2.png DogHouse/disp_00.npy DogHouse/calib_00-02.xml
85 | DogHouse/camera_00/im3.png DogHouse/disp_00.npy DogHouse/calib_00-02.xml
86 | Door/camera_00/im0.png Door/disp_00.npy Door/calib_00-02.xml
87 | Door/camera_00/im1.png Door/disp_00.npy Door/calib_00-02.xml
88 | Door/camera_00/im2.png Door/disp_00.npy Door/calib_00-02.xml
89 | Door/camera_00/im3.png Door/disp_00.npy Door/calib_00-02.xml
90 | Door/camera_00/im4.png Door/disp_00.npy Door/calib_00-02.xml
91 | Door/camera_00/im5.png Door/disp_00.npy Door/calib_00-02.xml
92 | Door/camera_00/im6.png Door/disp_00.npy Door/calib_00-02.xml
93 | ExtractorFan/camera_00/im0.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml
94 | ExtractorFan/camera_00/im1.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml
95 | ExtractorFan/camera_00/im2.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml
96 | ExtractorFan/camera_00/im3.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml
97 | ExtractorFan/camera_00/im4.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml
98 | ExtractorFan/camera_00/im5.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml
99 | ExtractorFan/camera_00/im6.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml
100 | ExtractorFan/camera_00/im7.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml
101 | ExtractorFan/camera_00/im8.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml
102 | ExtractorFan/camera_00/im9.png ExtractorFan/disp_00.npy ExtractorFan/calib_00-02.xml
103 | Fridge/camera_00/im0.png Fridge/disp_00.npy Fridge/calib_00-02.xml
104 | Fridge/camera_00/im1.png Fridge/disp_00.npy Fridge/calib_00-02.xml
105 | Fridge/camera_00/im2.png Fridge/disp_00.npy Fridge/calib_00-02.xml
106 | Lunch/camera_00/im0.png Lunch/disp_00.npy Lunch/calib_00-02.xml
107 | Microwave/camera_00/im0.png Microwave/disp_00.npy Microwave/calib_00-02.xml
108 | Microwave/camera_00/im1.png Microwave/disp_00.npy Microwave/calib_00-02.xml
109 | Microwave/camera_00/im2.png Microwave/disp_00.npy Microwave/calib_00-02.xml
110 | Microwave/camera_00/im3.png Microwave/disp_00.npy Microwave/calib_00-02.xml
111 | Microwave/camera_00/im4.png Microwave/disp_00.npy Microwave/calib_00-02.xml
112 | Microwave/camera_00/im5.png Microwave/disp_00.npy Microwave/calib_00-02.xml
113 | Microwave/camera_00/im6.png Microwave/disp_00.npy Microwave/calib_00-02.xml
114 | Mirror/camera_00/im0.png Mirror/disp_00.npy Mirror/calib_00-02.xml
115 | Mirror/camera_00/im1.png Mirror/disp_00.npy Mirror/calib_00-02.xml
116 | Moka/camera_00/im0.png Moka/disp_00.npy Moka/calib_00-02.xml
117 | Moka/camera_00/im1.png Moka/disp_00.npy Moka/calib_00-02.xml
118 | Moka/camera_00/im2.png Moka/disp_00.npy Moka/calib_00-02.xml
119 | Moka/camera_00/im3.png Moka/disp_00.npy Moka/calib_00-02.xml
120 | Moka/camera_00/im4.png Moka/disp_00.npy Moka/calib_00-02.xml
121 | Moka1/camera_00/im0.png Moka1/disp_00.npy Moka1/calib_00-02.xml
122 | Moka1/camera_00/im1.png Moka1/disp_00.npy Moka1/calib_00-02.xml
123 | Moka1/camera_00/im2.png Moka1/disp_00.npy Moka1/calib_00-02.xml
124 | Moka1/camera_00/im3.png Moka1/disp_00.npy Moka1/calib_00-02.xml
125 | Moka1/camera_00/im4.png Moka1/disp_00.npy Moka1/calib_00-02.xml
126 | Moka1/camera_00/im5.png Moka1/disp_00.npy Moka1/calib_00-02.xml
127 | Moka1/camera_00/im6.png Moka1/disp_00.npy Moka1/calib_00-02.xml
128 | Moka1/camera_00/im7.png Moka1/disp_00.npy Moka1/calib_00-02.xml
129 | Moka1/camera_00/im8.png Moka1/disp_00.npy Moka1/calib_00-02.xml
130 | Moka1/camera_00/im9.png Moka1/disp_00.npy Moka1/calib_00-02.xml
131 | Motorcycle/camera_00/im0.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml
132 | Motorcycle/camera_00/im1.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml
133 | Motorcycle/camera_00/im2.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml
134 | Motorcycle/camera_00/im3.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml
135 | Motorcycle/camera_00/im4.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml
136 | Motorcycle/camera_00/im5.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml
137 | Motorcycle/camera_00/im6.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml
138 | Motorcycle/camera_00/im7.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml
139 | Motorcycle/camera_00/im8.png Motorcycle/disp_00.npy Motorcycle/calib_00-02.xml
140 | Mouthwash/camera_00/im0.png Mouthwash/disp_00.npy Mouthwash/calib_00-02.xml
141 | Mouthwash/camera_00/im1.png Mouthwash/disp_00.npy Mouthwash/calib_00-02.xml
142 | Mouthwash/camera_00/im2.png Mouthwash/disp_00.npy Mouthwash/calib_00-02.xml
143 | Mouthwash/camera_00/im3.png Mouthwash/disp_00.npy Mouthwash/calib_00-02.xml
144 | Mouthwash/camera_00/im4.png Mouthwash/disp_00.npy Mouthwash/calib_00-02.xml
145 | Mouthwash/camera_00/im5.png Mouthwash/disp_00.npy Mouthwash/calib_00-02.xml
146 | Mouthwash/camera_00/im6.png Mouthwash/disp_00.npy Mouthwash/calib_00-02.xml
147 | OilCan/camera_00/im0.png OilCan/disp_00.npy OilCan/calib_00-02.xml
148 | OilCan/camera_00/im1.png OilCan/disp_00.npy OilCan/calib_00-02.xml
149 | OilCan/camera_00/im2.png OilCan/disp_00.npy OilCan/calib_00-02.xml
150 | OilCan/camera_00/im3.png OilCan/disp_00.npy OilCan/calib_00-02.xml
151 | OilCan/camera_00/im4.png OilCan/disp_00.npy OilCan/calib_00-02.xml
152 | OilCan/camera_00/im5.png OilCan/disp_00.npy OilCan/calib_00-02.xml
153 | OilCan/camera_00/im6.png OilCan/disp_00.npy OilCan/calib_00-02.xml
154 | OilCan/camera_00/im7.png OilCan/disp_00.npy OilCan/calib_00-02.xml
155 | Oven1/camera_00/im0.png Oven1/disp_00.npy Oven1/calib_00-02.xml
156 | Oven1/camera_00/im1.png Oven1/disp_00.npy Oven1/calib_00-02.xml
157 | Oven1/camera_00/im2.png Oven1/disp_00.npy Oven1/calib_00-02.xml
158 | Oven1/camera_00/im3.png Oven1/disp_00.npy Oven1/calib_00-02.xml
159 | Oven1/camera_00/im4.png Oven1/disp_00.npy Oven1/calib_00-02.xml
160 | Oven1/camera_00/im5.png Oven1/disp_00.npy Oven1/calib_00-02.xml
161 | Oven2/camera_00/im0.png Oven2/disp_00.npy Oven2/calib_00-02.xml
162 | Oven2/camera_00/im1.png Oven2/disp_00.npy Oven2/calib_00-02.xml
163 | Oven2/camera_00/im2.png Oven2/disp_00.npy Oven2/calib_00-02.xml
164 | Oven2/camera_00/im3.png Oven2/disp_00.npy Oven2/calib_00-02.xml
165 | Oven2/camera_00/im4.png Oven2/disp_00.npy Oven2/calib_00-02.xml
166 | Oven2/camera_00/im5.png Oven2/disp_00.npy Oven2/calib_00-02.xml
167 | Pots1/camera_00/im0.png Pots1/disp_00.npy Pots1/calib_00-02.xml
168 | Pots1/camera_00/im1.png Pots1/disp_00.npy Pots1/calib_00-02.xml
169 | Pots1/camera_00/im2.png Pots1/disp_00.npy Pots1/calib_00-02.xml
170 | Pots1/camera_00/im3.png Pots1/disp_00.npy Pots1/calib_00-02.xml
171 | Pots1/camera_00/im4.png Pots1/disp_00.npy Pots1/calib_00-02.xml
172 | Pots1/camera_00/im5.png Pots1/disp_00.npy Pots1/calib_00-02.xml
173 | Shower/camera_00/im0.png Shower/disp_00.npy Shower/calib_00-02.xml
174 | Shower/camera_00/im1.png Shower/disp_00.npy Shower/calib_00-02.xml
175 | Shower/camera_00/im2.png Shower/disp_00.npy Shower/calib_00-02.xml
176 | Shower/camera_00/im3.png Shower/disp_00.npy Shower/calib_00-02.xml
177 | Sink/camera_00/im0.png Sink/disp_00.npy Sink/calib_00-02.xml
178 | Sink/camera_00/im1.png Sink/disp_00.npy Sink/calib_00-02.xml
179 | Sink/camera_00/im2.png Sink/disp_00.npy Sink/calib_00-02.xml
180 | Sink/camera_00/im3.png Sink/disp_00.npy Sink/calib_00-02.xml
181 | Sink/camera_00/im4.png Sink/disp_00.npy Sink/calib_00-02.xml
182 | SoapDishes/camera_00/im0.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml
183 | SoapDishes/camera_00/im1.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml
184 | SoapDishes/camera_00/im2.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml
185 | SoapDishes/camera_00/im3.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml
186 | SoapDishes/camera_00/im4.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml
187 | SoapDishes/camera_00/im5.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml
188 | SoapDishes/camera_00/im6.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml
189 | SoapDishes/camera_00/im7.png SoapDishes/disp_00.npy SoapDishes/calib_00-02.xml
190 | Tablet/camera_00/im0.png Tablet/disp_00.npy Tablet/calib_00-02.xml
191 | Tablet/camera_00/im1.png Tablet/disp_00.npy Tablet/calib_00-02.xml
192 | Tablet/camera_00/im2.png Tablet/disp_00.npy Tablet/calib_00-02.xml
193 | Tablet/camera_00/im3.png Tablet/disp_00.npy Tablet/calib_00-02.xml
194 | Tablet/camera_00/im4.png Tablet/disp_00.npy Tablet/calib_00-02.xml
195 | Tablet/camera_00/im5.png Tablet/disp_00.npy Tablet/calib_00-02.xml
196 | Tablet/camera_00/im6.png Tablet/disp_00.npy Tablet/calib_00-02.xml
197 | Tablet/camera_00/im7.png Tablet/disp_00.npy Tablet/calib_00-02.xml
198 | Tablet/camera_00/im8.png Tablet/disp_00.npy Tablet/calib_00-02.xml
199 | Toilet/camera_00/im0.png Toilet/disp_00.npy Toilet/calib_00-02.xml
200 | Toilet/camera_00/im1.png Toilet/disp_00.npy Toilet/calib_00-02.xml
201 | Toilet/camera_00/im2.png Toilet/disp_00.npy Toilet/calib_00-02.xml
202 | Toilet/camera_00/im3.png Toilet/disp_00.npy Toilet/calib_00-02.xml
203 | Toilet/camera_00/im4.png Toilet/disp_00.npy Toilet/calib_00-02.xml
204 | TV/camera_00/im0.png TV/disp_00.npy TV/calib_00-02.xml
205 | TV/camera_00/im1.png TV/disp_00.npy TV/calib_00-02.xml
206 | TV/camera_00/im2.png TV/disp_00.npy TV/calib_00-02.xml
207 | TV/camera_00/im3.png TV/disp_00.npy TV/calib_00-02.xml
208 | TV1/camera_00/im0.png TV1/disp_00.npy TV1/calib_00-02.xml
209 | TV1/camera_00/im1.png TV1/disp_00.npy TV1/calib_00-02.xml
210 | TV1/camera_00/im2.png TV1/disp_00.npy TV1/calib_00-02.xml
211 | TV1/camera_00/im3.png TV1/disp_00.npy TV1/calib_00-02.xml
212 | TV2/camera_00/im0.png TV2/disp_00.npy TV2/calib_00-02.xml
213 | TV2/camera_00/im1.png TV2/disp_00.npy TV2/calib_00-02.xml
214 | Vodka/camera_00/im0.png Vodka/disp_00.npy Vodka/calib_00-02.xml
215 | Vodka/camera_00/im1.png Vodka/disp_00.npy Vodka/calib_00-02.xml
216 | Vodka/camera_00/im2.png Vodka/disp_00.npy Vodka/calib_00-02.xml
217 | Vodka/camera_00/im3.png Vodka/disp_00.npy Vodka/calib_00-02.xml
218 | Vodka/camera_00/im4.png Vodka/disp_00.npy Vodka/calib_00-02.xml
219 | Vodka/camera_00/im5.png Vodka/disp_00.npy Vodka/calib_00-02.xml
220 | Vodka/camera_00/im6.png Vodka/disp_00.npy Vodka/calib_00-02.xml
221 | Vodka/camera_00/im7.png Vodka/disp_00.npy Vodka/calib_00-02.xml
222 | Washer/camera_00/im0.png Washer/disp_00.npy Washer/calib_00-02.xml
223 | Washer/camera_00/im1.png Washer/disp_00.npy Washer/calib_00-02.xml
224 | Washer/camera_00/im2.png Washer/disp_00.npy Washer/calib_00-02.xml
225 | Washer/camera_00/im3.png Washer/disp_00.npy Washer/calib_00-02.xml
226 | Washer/camera_00/im4.png Washer/disp_00.npy Washer/calib_00-02.xml
227 | Washer/camera_00/im5.png Washer/disp_00.npy Washer/calib_00-02.xml
228 | Washer/camera_00/im6.png Washer/disp_00.npy Washer/calib_00-02.xml
229 |
--------------------------------------------------------------------------------
/datasets/dataloader.py:
--------------------------------------------------------------------------------
import os
import random
import numpy as np
import torch
import cv2
from torch.utils.data import Dataset
from utils import parse_dataset_txt, read_image

###-----[Booster]-----###
rgb_str = "camera_00"
disp_str = "disp_00.npy"
mask_str = "mask_00.png"
mask_c_str = "mask_cat.png"


class Trans10KLoader(Dataset):
    def __init__(self, dataset_dir, dataset_txt, transform):
        self.dataset_dir = dataset_dir
        self.transform = transform
        dataset_dict = parse_dataset_txt(dataset_txt)

        self.images_names = dataset_dict["basenames"]
        self.ground_truth_names = dataset_dict["gt_paths"]

    def __len__(self):
        return len(self.images_names)

    def __getitem__(self, idx):
        rgb_path = os.path.join(self.dataset_dir, self.images_names[idx])
        disp_path = os.path.join(self.dataset_dir, self.ground_truth_names[idx])

        # Read the RGB image and its proxy depth label, resized to the image resolution.
        rgb_image = read_image(rgb_path)  # [0,1] rgb hxwxc image
        ground_truth = np.load(disp_path).astype(np.float32)
        ground_truth = cv2.resize(ground_truth, (rgb_image.shape[1], rgb_image.shape[0]), interpolation=cv2.INTER_NEAREST)

        transformed_dict = self.transform({"image": rgb_image, "depth": ground_truth})
        rgb_image = transformed_dict["image"]
        ground_truth = transformed_dict["depth"]
        rgb_image = torch.from_numpy(rgb_image)
        ground_truth = torch.from_numpy(ground_truth)

        return rgb_image, ground_truth, rgb_path


class MSDLoader(Trans10KLoader):
    pass
--------------------------------------------------------------------------------
/datasets/msd/test.txt:
--------------------------------------------------------------------------------
1 | 5398_512x640.jpg _
2 | 4986_640x512.jpg _
3 | 4996_640x512.jpg _
4 | 586_512x640.jpg _
5 | 5162_512x640.jpg _
6 | 5107_512x640.jpg _
7 | 5291_512x640.jpg _
8 | 5120_512x640.jpg _
9 | 5354_512x640.jpg _
10 | 1860_512x640.jpg _
11 | 3792_640x512.jpg _
12 | 1971_512x640.jpg _
13 | 3316_512x640.jpg _
14 | 5148_640x512.jpg _
15 | 3309_512x640.jpg _
16 | 5304_640x512.jpg _
17 | 119_512x640.jpg _
18 | 1830_640x512.jpg _
19 | 5279_512x640.jpg _
20 | 5248_640x512.jpg _
21 | 3975_512x640.jpg _
22 | 654_512x640.jpg _
23 | 4345_512x640.jpg _
24 | 1777_512x640.jpg _
25 | 5310_640x512.jpg _
26 | 3771_512x640.jpg _
27 | 3423_512x640.jpg _
28 | 1652_512x640.jpg _
29 | 5025_512x640.jpg _
30 | 5439_512x640.jpg _
31 | 2119_512x640.jpg _
32 | 2711_512x640.jpg _
33 | 429_512x640.jpg _
34 | 5119_512x640.jpg _
35 | 2907_512x640.jpg _
36 | 4969_512x640.jpg _
37 | 5131_640x512.jpg _
38 | 4989_512x640.jpg _
39 | 5092_512x640.jpg _
40 | 5496_512x640.jpg _
41 | 3895_512x640.jpg _
42 | 5246_512x640.jpg _
43 | 1852_512x640.jpg _
44 | 3398_512x640.jpg _
45 | 5452_512x640.jpg _
46 | 1734_512x640.jpg _
47 | 5169_512x640.jpg _
48 | 1680_512x640.jpg _
49 | 3658_512x640.jpg _
50 | 4340_512x640.jpg _
51 | 1881_640x512.jpg _
52 | 5341_512x640.jpg _
53 | 1693_512x640.jpg _
54 | 2130_512x640.jpg _
55 | 4376_512x640.jpg _
56 | 5250_512x640.jpg _
57 | 2897_512x640.jpg _
58 | 5237_512x640.jpg _
59 | 2961_512x640.jpg _
60 | 160_640x512.jpg _
61 | 1678_512x640.jpg _
62 | 2763_512x640.jpg _
63 | 4325_512x640.jpg _
64 | 4991_512x640.jpg _
65 | 3328_512x640.jpg _
66 | 1770_512x640.jpg _
67 | 5202_512x640.jpg _
68 | 5242_512x640.jpg _
69 | 1774_512x640.jpg _
70 | 4963_640x512.jpg _
71 | 3242_512x640.jpg _
72 | 1789_512x640.jpg _
73 | 2744_512x640.jpg _
74 | 4393_512x640.jpg _
75 | 5375_512x640.jpg _
76 | 1929_512x640.jpg _
77 | 5265_640x512.jpg _
78 | 5331_512x640.jpg _
79 | 5027_512x640.jpg _
80 | 1932_512x640.jpg _
81 | 195_640x512.jpg _
82 | 1858_512x640.jpg _
83 | 2079_640x512.jpg _
84 | 1828_512x640.jpg _
85 | 5089_640x512.jpg _
86 | 5183_512x640.jpg _
87 | 1983_512x640.jpg _
88 | 1050_512x640.jpg _
89 | 4391_512x640.jpg _
90 | 5097_640x512.jpg _
91 | 5269_512x640.jpg _
92 | 1749_512x640.jpg _
93 | 5520_512x640.jpg _
94 | 5509_512x640.jpg _
95 | 5319_640x512.jpg _
96 | 5069_512x640.jpg _
97 | 5118_640x512.jpg _
98 | 4946_512x640.jpg _
99 | 675_512x640.jpg _
100 | 5112_640x512.jpg _
101 | 5355_512x640.jpg _
102 | 5263_512x640.jpg _
103 | 5026_640x512.jpg _
104 | 3005_512x640.jpg _
105 | 3071_512x640.jpg _
106 | 5535_512x640.jpg _
107 | 1907_640x512.jpg _
108 | 5429_512x640.jpg _
109 | 4971_512x640.jpg _
110 | 1935_640x512.jpg _
111 | 3701_512x640.jpg _
112 | 3205_512x640.jpg _
113 | 1759_512x640.jpg _
114 | 1985_512x640.jpg _
115 | 3229_512x640.jpg _
116 | 3650_512x640.jpg _
117 | 1778_512x640.jpg _
118 | 3260_512x640.jpg _
119 | 5204_640x512.jpg _
120 | 4378_512x640.jpg _
121 | 893_512x640.jpg _
122 | 5011_640x512.jpg _
123 | 5008_640x512.jpg _
124 | 3160_512x640.jpg _
125 | 5143_512x640.jpg _
126 | 5176_512x640.jpg _
127 | 1912_512x640.jpg _
128 | 5363_512x640.jpg _
129 | 5161_512x640.jpg _
130 | 5144_512x640.jpg _
131 | 111_512x640.jpg _
132 | 5288_512x640.jpg _
133 | 4379_512x640.jpg _
134 | 5258_512x640.jpg _
135 | 844_512x640.jpg _
136 | 2103_512x640.jpg _
137 | 4956_640x512.jpg _
138 | 608_512x640.jpg _
139 | 2886_512x640.jpg _
140 | 4045_512x640.jpg _
141 | 3311_512x640.jpg _
142 | 1791_512x640.jpg _
143 | 5067_512x640.jpg _
144 | 5510_512x640.jpg _
145 | 2139_640x512.jpg _
146 | 3693_512x640.jpg _
147 | 5140_640x512.jpg _
148 | 5287_640x512.jpg _
149 | 4385_512x640.jpg _
150 | 3905_512x640.jpg _
151 | 5353_512x640.jpg _
152 | 1846_512x640.jpg _
153 | 3240_512x640.jpg _
154 | 5454_512x640.jpg _
155 | 1768_512x640.jpg _
156 | 3397_512x640.jpg _
157 | 5098_512x640.jpg _
158 | 5301_512x640.jpg _
159 | 1959_512x640.jpg _
160 | 4809_512x640.jpg _
161 | 5396_512x640.jpg _
162 | 5321_640x512.jpg _
163 | 1956_512x640.jpg _
164 | 3696_512x640.jpg _
165 | 3691_512x640.jpg _
166 | 1918_512x640.jpg _
167 | 4982_512x640.jpg _
168 | 5016_512x640.jpg _
169 | 1668_512x640.jpg _
170 | 5524_512x640.jpg _
171 | 3425_512x640.jpg _
172 | 1751_512x640.jpg _
173 | 5028_512x640.jpg _
174 | 4033_512x640.jpg _
175 | 5000_640x512.jpg _
176 | 3702_512x640.jpg _
177 | 4970_512x640.jpg _
178 | 4384_512x640.jpg _
179 | 5224_512x640.jpg _
180 | 3594_512x640.jpg _
181 | 5146_512x640.jpg _
182 | 5175_512x640.jpg _
183 | 4978_512x640.jpg _
184 | 3797_512x640.jpg _
185 | 5328_512x640.jpg _
186 | 5074_640x512.jpg _
187 | 4095_512x640.jpg _
188 | 949_512x640.jpg _
189 | 5132_640x512.jpg _
190 | 4386_512x640.jpg _
191 | 22_512x640.jpg _
192 | 3166_512x640.jpg _
193 | 196_512x640.jpg _
194 | 5268_512x640.jpg _
195 | 1694_512x640.jpg _
196 | 1726_512x640.jpg _
197 | 5428_512x640.jpg _
198 | 5315_512x640.jpg _
199 | 5289_512x640.jpg _
200 | 1954_512x640.jpg _
201 | 5033_640x512.jpg _
202 | 5356_512x640.jpg _
203 | 1986_640x512.jpg _
204 | 2087_512x640.jpg _
205 | 3652_512x640.jpg _
206 | 1762_512x640.jpg _
207 | 4944_512x640.jpg _
208 | 5362_512x640.jpg _
209 | 3491_512x640.jpg _
210 | 5282_640x512.jpg _
211 | 3239_512x640.jpg _
212 | 421_512x640.jpg _
213 | 3428_512x640.jpg _
214 | 5059_512x640.jpg _
215 | 5061_512x640.jpg _
216 | 3543_512x640.jpg _
217 | 5003_640x512.jpg _
218 | 5414_512x640.jpg _
219 | 3453_512x640.jpg _
220 | 5277_512x640.jpg _
221 | 3396_512x640.jpg _
222 | 5332_512x640.jpg _
223 | 2094_512x640.jpg _
224 | 5102_640x512.jpg _
225 | 5049_512x640.jpg _
226 | 5080_512x640.jpg _
227 | 5503_640x512.jpg _
228 | 3909_512x640.jpg _
229 | 5472_512x640.jpg _
230 | 4364_512x640.jpg _
231 | 5506_512x640.jpg _
232 | 5427_512x640.jpg _
233 | 320_512x640.jpg _
234 | 5036_512x640.jpg _
235 | 3329_512x640.jpg _
236 | 5020_640x512.jpg _
237 | 916_512x640.jpg _
238 | 5350_640x512.jpg _
239 | 5511_512x640.jpg _
240 | 2102_512x640.jpg _
241 | 5054_640x512.jpg _
242 | 3478_512x640.jpg _
243 | 5426_512x640.jpg _
244 | 5359_512x640.jpg _
245 | 1934_512x640.jpg _
246 | 4998_640x512.jpg _
247 | 5325_512x640.jpg _
248 | 1702_512x640.jpg _
249 | 2972_512x640.jpg _
250 | 5022_512x640.jpg _
251 | 1683_512x640.jpg _
252 | 4042_512x640.jpg _
253 | 5514_512x640.jpg _
254 | 5139_512x640.jpg _
255 | 5382_512x640.jpg _
256 | 5370_512x640.jpg _
257 | 5243_512x640.jpg _
258 | 5207_640x512.jpg _
259 | 4098_512x640.jpg _
260 | 4363_512x640.jpg _
261 | 5329_512x640.jpg _
262 | 3454_512x640.jpg _
263 | 5115_640x512.jpg _
264 | 1699_512x640.jpg _
265 | 5220_512x640.jpg _
266 | 2667_512x640.jpg _
267 | 5087_512x640.jpg _
268 | 3646_640x512.jpg _
269 | 1654_512x640.jpg _
270 | 3937_512x640.jpg _
271 | 4383_512x640.jpg _
272 | 1833_640x512.jpg _
273 | 5096_512x640.jpg _
274 | 1893_640x512.jpg _
275 | 5105_512x640.jpg _
276 | 5433_512x640.jpg _
277 | 318_512x640.jpg _
278 | 1976_512x640.jpg _
279 | 1923_640x512.jpg _
280 | 336_512x640.jpg _
281 | 3612_512x640.jpg _
282 | 5171_512x640.jpg _
283 | 3116_512x640.jpg _
284 | 5351_512x640.jpg _
285 | 5457_512x640.jpg _
286 | 1036_512x640.jpg _
287 | 5415_512x640.jpg _
288 | 3458_512x640.jpg _
289 | 5058_512x640.jpg _
290 | 5128_640x512.jpg _
291 | 4987_640x512.jpg _
292 | 4974_512x640.jpg _
293 | 1993_512x640.jpg _
294 | 1647_512x640.jpg _
295 | 5376_640x512.jpg _
296 | 5012_640x512.jpg _
297 | 1945_640x512.jpg _
298 | 5208_512x640.jpg _
299 | 4343_512x640.jpg _
300 | 1961_512x640.jpg _
301 | 5401_512x640.jpg _
302 | 1730_512x640.jpg _
303 | 5037_512x640.jpg _
304 | 5244_640x512.jpg _
305 | 3588_512x640.jpg _
306 | 2798_512x640.jpg _
307 | 5213_512x640.jpg _
308 | 5364_640x512.jpg _
309 | 1992_512x640.jpg _
310 | 5151_512x640.jpg _
311 | 5272_512x640.jpg _
312 | 694_512x640.jpg _
313 | 5186_512x640.jpg _
314 | 5369_640x512.jpg _
315 | 1854_512x640.jpg _
316 | 5219_512x640.jpg _
317 | 1994_512x640.jpg _
318 | 5014_512x640.jpg _
319 | 5085_640x512.jpg _
320 | 1937_512x640.jpg _
321 | 1001_512x640.jpg _
322 | 1910_640x512.jpg _
323 | 5007_512x640.jpg _
324 | 2137_512x640.jpg _
325 | 5399_512x640.jpg _
326 | 5109_640x512.jpg _
327 | 5264_512x640.jpg _
328 | 5241_640x512.jpg _
329 | 4099_512x640.jpg _
330 | 5378_640x512.jpg _
331 | 5114_512x640.jpg _
332 | 138_512x640.jpg _
333 | 5333_512x640.jpg _
334 | 5337_640x512.jpg _
335 | 5157_512x640.jpg _
336 | 3514_512x640.jpg _
337 | 5300_512x640.jpg _
338 | 5073_512x640.jpg _
339 | 879_512x640.jpg _
340 | 1696_640x512.jpg _
341 | 1981_512x640.jpg _
342 | 1793_512x640.jpg _
343 | 1973_512x640.jpg _
344 | 4361_512x640.jpg _
345 | 5262_512x640.jpg _
346 | 5344_640x512.jpg _
347 | 5254_512x640.jpg _
348 | 66_512x640.jpg _
349 | 5130_640x512.jpg _
350 | 4374_512x640.jpg _
351 | 5486_512x640.jpg _
352 | 5374_512x640.jpg _
353 | 680_512x640.jpg _
354 | 1914_640x512.jpg _
355 | 2655_512x640.jpg _
356 | 5308_640x512.jpg _
357 | 3325_512x640.jpg _
358 | 1834_640x512.jpg _
359 | 5476_512x640.jpg _
360 | 5113_512x640.jpg _
361 | 5216_512x640.jpg _
362 | 3430_512x640.jpg _
363 | 5348_512x640.jpg _
364 | 4342_512x640.jpg _
365 | 711_512x640.jpg _
366 | 5330_512x640.jpg _
367 | 5448_512x640.jpg _
368 | 1921_640x512.jpg _
369 | 5136_512x640.jpg _
370 | 1810_512x640.jpg _
371 | 3331_512x640.jpg _
372 | 1880_512x640.jpg _
373 | 3400_512x640.jpg _
374 | 2116_512x640.jpg _
375 | 5111_512x640.jpg _
376 | 5134_512x640.jpg _
377 | 5526_512x640.jpg _
378 | 5528_512x640.jpg _
379 | 5029_640x512.jpg _
380 | 1724_512x640.jpg _
381 | 3182_512x640.jpg _
382 | 5394_512x640.jpg _
383 | 2733_512x640.jpg _
384 | 5298_640x512.jpg _
385 | 1700_512x640.jpg _
386 | 4957_512x640.jpg _
387 | 5233_512x640.jpg _
388 | 3455_512x640.jpg _
389 | 2157_512x640.jpg _
390 | 4981_512x640.jpg _
391 | 3157_512x640.jpg _
392 | 4967_512x640.jpg _
393 | 5215_512x640.jpg _
394 | 1952_512x640.jpg _
395 | 5384_512x640.jpg _
396 | 5435_512x640.jpg _
397 | 5274_512x640.jpg _
398 | 1805_512x640.jpg _
399 | 5500_512x640.jpg _
400 | 3878_512x640.jpg _
401 | 1755_512x640.jpg _
402 | 5323_512x640.jpg _
403 | 1926_512x640.jpg _
404 | 1951_512x640.jpg _
405 | 3241_512x640.jpg _
406 | 5459_512x640.jpg _
407 | 5252_512x640.jpg _
408 | 5468_512x640.jpg _
409 | 5086_512x640.jpg _
410 | 809_512x640.jpg _
411 | 5481_512x640.jpg _
412 | 4382_512x640.jpg _
413 | 5464_512x640.jpg _
414 | 5184_512x640.jpg _
415 | 1728_512x640.jpg _
416 | 5091_512x640.jpg _
417 | 385_512x640.jpg _
418 | 3624_512x640.jpg _
419 | 3094_512x640.jpg _
420 | 1864_512x640.jpg _
421 | 1843_640x512.jpg _
422 | 1967_512x640.jpg _
423 | 1972_512x640.jpg _
424 | 5019_512x640.jpg _
425 | 5031_640x512.jpg _
426 | 2689_640x512.jpg _
427 | 5281_512x640.jpg _
428 | 3027_512x640.jpg _
429 | 5397_640x512.jpg _
430 | 1794_512x640.jpg _
431 | 5385_512x640.jpg _
432 | 1824_512x640.jpg _
433 | 5172_512x640.jpg _
434 | 3314_512x640.jpg _
435 | 5240_512x640.jpg _
436 | 4348_512x640.jpg _
437 | 3463_512x640.jpg _
438 | 5106_512x640.jpg _
439 | 3312_512x640.jpg _
440 | 5044_640x512.jpg _
441 | 5187_512x640.jpg _
442 | 4984_640x512.jpg _
443 | 998_512x640.jpg _
444 | 1970_512x640.jpg _
445 | 2142_512x640.jpg _
446 | 1840_512x640.jpg _
447 | 3934_512x640.jpg _
448 | 1820_512x640.jpg _
449 | 3349_512x640.jpg _
450 | 5212_640x512.jpg _
451 | 4965_512x640.jpg _
452 | 3503_512x640.jpg _
453 | 1685_512x640.jpg _
454 | 5226_512x640.jpg _
455 | 3429_512x640.jpg _
456 | 2678_512x640.jpg _
457 | 5297_640x512.jpg _
458 | 4362_512x640.jpg _
459 | 884_512x640.jpg _
460 | 37_512x640.jpg _
461 | 5179_512x640.jpg _
462 | 1656_512x640.jpg _
463 | 5460_512x640.jpg _
464 | 5498_512x640.jpg _
465 | 1731_512x640.jpg _
466 | 1756_512x640.jpg _
467 | 5373_640x512.jpg _
468 | 5104_640x512.jpg _
469 | 3644_512x640.jpg _
470 | 5178_512x640.jpg _
471 | 5478_640x512.jpg _
472 | 3105_512x640.jpg _
473 | 5530_512x640.jpg _
474 | 1842_512x640.jpg _
475 | 5078_512x640.jpg _
476 | 5001_640x512.jpg _
477 | 1965_512x640.jpg _
478 | 4950_640x512.jpg _
479 | 5048_512x640.jpg _
480 | 5234_512x640.jpg _
481 | 5122_512x640.jpg _
482 | 3415_512x640.jpg _
483 | 5479_512x640.jpg _
484 | 3060_512x640.jpg _
485 | 5280_640x512.jpg _
486 | 3616_512x640.jpg _
487 | 1784_512x640.jpg _
488 | 5295_640x512.jpg _
489 | 5010_512x640.jpg _
490 | 4977_512x640.jpg _
491 | 5266_640x512.jpg _
492 | 5056_640x512.jpg _
493 | 639_512x640.jpg _
494 | 5041_640x512.jpg _
495 | 1806_512x640.jpg _
496 | 1982_512x640.jpg _
497 | 3322_512x640.jpg _
498 | 5125_512x640.jpg _
499 | 1917_512x640.jpg _
500 | 3694_512x640.jpg _
501 | 5347_640x512.jpg _
502 | 2950_640x512.jpg _
503 | 4339_512x640.jpg _
504 | 5196_640x512.jpg _
505 | 5209_512x640.jpg _
506 | 5121_512x640.jpg _
507 | 3151_512x640.jpg _
508 | 3193_512x640.jpg _
509 | 5523_512x640.jpg _
510 | 5023_640x512.jpg _
511 | 5361_512x640.jpg _
512 | 4958_512x640.jpg _
513 | 2876_512x640.jpg _
514 | 5532_512x640.jpg _
515 | 5352_512x640.jpg _
516 | 5166_512x640.jpg _
517 | 2831_512x640.jpg _
518 | 1887_640x512.jpg _
519 | 1832_640x512.jpg _
520 | 4346_512x640.jpg _
521 | 5462_640x512.jpg _
522 | 1831_640x512.jpg _
523 | 5062_512x640.jpg _
524 | 5273_512x640.jpg _
525 | 5466_512x640.jpg _
526 | 985_512x640.jpg _
527 | 5365_512x640.jpg _
528 | 3324_512x640.jpg _
529 | 3152_512x640.jpg _
530 | 5193_640x512.jpg _
531 | 1963_512x640.jpg _
532 | 5366_512x640.jpg _
533 | 1938_512x640.jpg _
534 | 1695_512x640.jpg _
535 | 3593_512x640.jpg _
536 | 4994_640x512.jpg _
537 | 5018_512x640.jpg _
538 | 5084_512x640.jpg _
539 | 3332_512x640.jpg _
540 | 1752_512x640.jpg _
541 | 5030_512x640.jpg _
542 | 5339_512x640.jpg _
543 | 1848_512x640.jpg _
544 | 4990_512x640.jpg _
545 | 1906_640x512.jpg _
546 | 5367_512x640.jpg _
547 | 1841_512x640.jpg _
548 | 4365_512x640.jpg _
549 | 5392_640x512.jpg _
550 | 5493_512x640.jpg _
551 | 414_512x640.jpg _
552 | 5299_640x512.jpg _
553 | 5522_512x640.jpg _
554 | 5276_512x640.jpg _
555 | 5453_512x640.jpg _
556 | 5194_640x512.jpg _
557 | 5145_640x512.jpg _
558 | 5190_512x640.jpg _
559 | 1883_640x512.jpg _
560 | 5380_512x640.jpg _
561 | 1979_512x640.jpg _
562 | 101_512x640.jpg _
563 | 4347_512x640.jpg _
564 | 5302_512x640.jpg _
565 | 5004_512x640.jpg _
566 | 852_512x640.jpg _
567 | 672_512x640.jpg _
568 | 3326_512x640.jpg _
569 | 5451_512x640.jpg _
570 | 5173_512x640.jpg _
571 | 5326_512x640.jpg _
572 | 3153_512x640.jpg _
573 | 4945_512x640.jpg _
574 | 5413_512x640.jpg _
575 | 3226_512x640.jpg _
576 | 5013_512x640.jpg _
577 | 1779_512x640.jpg _
578 | 5117_512x640.jpg _
579 | 5286_512x640.jpg _
580 | 5249_512x640.jpg _
581 | 3988_512x640.jpg _
582 | 5153_512x640.jpg _
583 | 5501_512x640.jpg _
584 | 2117_512x640.jpg _
585 | 343_512x640.jpg _
586 | 4389_512x640.jpg _
587 | 1684_512x640.jpg _
588 | 5101_640x512.jpg _
589 | 1869_640x512.jpg _
590 | 322_512x640.jpg _
591 | 5042_640x512.jpg _
592 | 5180_640x512.jpg _
593 | 3318_512x640.jpg _
594 | 663_512x640.jpg _
595 | 5024_512x640.jpg _
596 | 5253_640x512.jpg _
597 | 3417_512x640.jpg _
598 | 5159_512x640.jpg _
599 | 4377_512x640.jpg _
600 | 3457_512x640.jpg _
601 | 1924_512x640.jpg _
602 | 5223_512x640.jpg _
603 | 3505_512x640.jpg _
604 | 5227_640x512.jpg _
605 | 5475_512x640.jpg _
606 | 4976_640x512.jpg _
607 | 1765_512x640.jpg _
608 | 4979_640x512.jpg _
609 | 5346_640x512.jpg _
610 | 387_512x640.jpg _
611 | 5412_512x640.jpg _
612 | 3150_512x640.jpg _
613 | 5185_512x640.jpg _
614 | 4082_512x640.jpg _
615 | 5214_512x640.jpg _
616 | 4358_512x640.jpg _
617 | 5275_512x640.jpg _
618 | 5419_512x640.jpg _
619 | 5090_640x512.jpg _
620 | 3418_512x640.jpg _
621 | 5163_512x640.jpg _
622 | 3602_512x640.jpg _
623 | 727_512x640.jpg _
624 | 4995_512x640.jpg _
625 | 5529_512x640.jpg _
626 | 2755_512x640.jpg _
627 | 1950_512x640.jpg _
628 | 3607_512x640.jpg _
629 | 3317_512x640.jpg _
630 | 4968_512x640.jpg _
631 | 4975_512x640.jpg _
632 | 1837_640x512.jpg _
633 | 5147_512x640.jpg _
634 | 754_512x640.jpg _
635 | 2863_512x640.jpg _
636 | 3721_512x640.jpg _
637 | 2722_512x640.jpg _
638 | 5108_640x512.jpg _
639 | 5127_640x512.jpg _
640 | 5293_512x640.jpg _
641 | 3705_512x640.jpg _
642 | 1787_512x640.jpg _
643 | 5306_512x640.jpg _
644 | 3171_512x640.jpg _
645 | 5038_512x640.jpg _
646 | 3016_512x640.jpg _
647 | 642_512x640.jpg _
648 | 4390_512x640.jpg _
649 | 2078_512x640.jpg _
650 | 5322_640x512.jpg _
651 | 1767_512x640.jpg _
652 | 1657_640x512.jpg _
653 | 5445_640x512.jpg _
654 | 5247_512x640.jpg _
655 | 1677_640x512.jpg _
656 | 4999_512x640.jpg _
657 | 1750_512x640.jpg _
658 | 2115_512x640.jpg _
659 | 2135_640x512.jpg _
660 | 5100_512x640.jpg _
661 | 5123_640x512.jpg _
662 | 5390_512x640.jpg _
663 | 1876_512x640.jpg _
664 | 5349_512x640.jpg _
665 | 5261_640x512.jpg _
666 | 3236_512x640.jpg _
667 | 5338_512x640.jpg _
668 | 1025_512x640.jpg _
669 | 5316_512x640.jpg _
670 | 5471_512x640.jpg _
671 | 4344_512x640.jpg _
672 | 1825_512x640.jpg _
673 | 3038_512x640.jpg _
674 | 5441_512x640.jpg _
675 | 5231_512x640.jpg _
676 | 1757_512x640.jpg _
677 | 5368_640x512.jpg _
678 | 5228_512x640.jpg _
679 | 5198_512x640.jpg _
680 | 3718_512x640.jpg _
681 | 4392_512x640.jpg _
682 | 4973_512x640.jpg _
683 | 598_512x640.jpg _
684 | 5465_512x640.jpg _
685 | 1940_640x512.jpg _
686 | 5284_640x512.jpg _
687 | 3282_512x640.jpg _
688 | 1809_512x640.jpg _
689 | 5296_640x512.jpg _
690 | 1761_512x640.jpg _
691 | 1930_512x640.jpg _
692 | 5446_512x640.jpg _
693 | 634_512x640.jpg _
694 | 184_512x640.jpg _
695 | 5317_512x640.jpg _
696 | 3695_512x640.jpg _
697 | 3424_512x640.jpg _
698 | 2918_512x640.jpg _
699 | 1771_640x512.jpg _
700 | 3833_512x640.jpg _
701 | 5103_512x640.jpg _
702 | 5260_640x512.jpg _
703 | 5307_512x640.jpg _
704 | 5432_512x640.jpg _
705 | 5188_512x640.jpg _
706 | 3438_512x640.jpg _
707 | 4954_512x640.jpg _
708 | 2642_512x640.jpg _
709 | 3371_512x640.jpg _
710 | 4992_512x640.jpg _
711 | 5485_512x640.jpg _
712 | 1861_512x640.jpg _
713 | 5002_512x640.jpg _
714 | 5152_512x640.jpg _
715 | 5055_512x640.jpg _
716 | 5229_512x640.jpg _
717 | 3138_512x640.jpg _
718 | 4962_512x640.jpg _
719 | 3360_512x640.jpg _
720 | 1682_512x640.jpg _
721 | 5232_640x512.jpg _
722 | 5081_512x640.jpg _
723 | 2995_512x640.jpg _
724 | 5324_640x512.jpg _
725 | 5340_640x512.jpg _
726 | 4100_512x640.jpg _
727 | 1753_512x640.jpg _
728 | 251_512x640.jpg _
729 | 3393_512x640.jpg _
730 | 5174_512x640.jpg _
731 | 5537_512x640.jpg _
732 | 5387_640x512.jpg _
733 | 1962_512x640.jpg _
734 | 2700_512x640.jpg _
735 | 5372_512x640.jpg _
736 | 5221_640x512.jpg _
737 | 5071_512x640.jpg _
738 | 3635_512x640.jpg _
739 | 3628_512x640.jpg _
740 | 1729_512x640.jpg _
741 | 1978_512x640.jpg _
742 | 1838_640x512.jpg _
743 | 5492_512x640.jpg _
744 | 1927_512x640.jpg _
745 | 1958_512x640.jpg _
746 | 5006_640x512.jpg _
747 | 5065_640x512.jpg _
748 | 2620_512x640.jpg _
749 | 3704_512x640.jpg _
750 | 1733_512x640.jpg _
751 | 4953_512x640.jpg _
752 | 1989_512x640.jpg _
753 | 1786_512x640.jpg _
754 | 1727_512x640.jpg _
755 | 1915_512x640.jpg _
756 | 5518_512x640.jpg _
757 | 5245_512x640.jpg _
758 | 5165_512x640.jpg _
759 | 5292_512x640.jpg _
760 | 5411_512x640.jpg _
761 | 5051_512x640.jpg _
762 | 5156_512x640.jpg _
763 | 3452_512x640.jpg _
764 | 1782_512x640.jpg _
765 | 5416_512x640.jpg _
766 | 1863_512x640.jpg _
767 | 5095_512x640.jpg _
768 | 281_512x640.jpg _
769 | 1913_640x512.jpg _
770 | 5192_512x640.jpg _
771 | 5531_512x640.jpg _
772 | 3700_512x640.jpg _
773 | 5083_640x512.jpg _
774 | 5043_640x512.jpg _
775 | 3082_512x640.jpg _
776 | 845_512x640.jpg _
777 | 5238_512x640.jpg _
778 | 5197_512x640.jpg _
779 | 3321_512x640.jpg _
780 | 230_512x640.jpg _
781 | 1688_512x640.jpg _
782 | 5129_512x640.jpg _
783 | 5126_512x640.jpg _
784 | 1949_512x640.jpg _
785 | 1975_512x640.jpg _
786 | 5154_512x640.jpg _
787 | 5336_640x512.jpg _
788 | 5005_512x640.jpg _
789 | 3304_512x640.jpg _
790 | 5490_512x640.jpg _
791 | 2929_512x640.jpg _
792 | 3427_512x640.jpg _
793 | 5487_512x640.jpg _
794 | 3320_512x640.jpg _
795 | 789_512x640.jpg _
796 | 5133_512x640.jpg _
797 | 2631_512x640.jpg _
798 | 4983_512x640.jpg _
799 | 2098_512x640.jpg _
800 | 1697_512x640.jpg _
801 | 1928_512x640.jpg _
802 | 3669_512x640.jpg _
803 | 5425_512x640.jpg _
804 | 3873_640x512.jpg _
805 | 1691_512x640.jpg _
806 | 5484_512x640.jpg _
807 | 5124_512x640.jpg _
808 | 1859_512x640.jpg _
809 | 5494_512x640.jpg _
810 | 4980_512x640.jpg _
811 | 5099_640x512.jpg _
812 | 5395_512x640.jpg _
813 | 2085_512x640.jpg _
814 | 2940_512x640.jpg _
815 | 962_640x512.jpg _
816 | 5075_640x512.jpg _
817 | 5189_640x512.jpg _
818 | 5182_640x512.jpg _
819 | 5077_640x512.jpg _
820 | 5512_512x640.jpg _
821 | 1990_512x640.jpg _
822 | 5534_512x640.jpg _
823 | 1775_512x640.jpg _
824 | 603_512x640.jpg _
825 | 5497_512x640.jpg _
826 | 1748_512x640.jpg _
827 | 4951_512x640.jpg _
828 | 3630_512x640.jpg _
829 | 5142_512x640.jpg _
830 | 5488_512x640.jpg _
831 | 5379_640x512.jpg _
832 | 3382_512x640.jpg _
833 | 5057_512x640.jpg _
834 | 1892_512x640.jpg _
835 | 3613_512x640.jpg _
836 | 1811_512x640.jpg _
837 | 5200_640x512.jpg _
838 | 5211_640x512.jpg _
839 | 601_512x640.jpg _
840 | 3449_512x640.jpg _
841 | 1984_512x640.jpg _
842 | 5088_512x640.jpg _
843 | 5206_640x512.jpg _
844 | 509_640x512.jpg _
845 | 3330_512x640.jpg _
846 | 5504_640x512.jpg _
847 | 5311_640x512.jpg _
848 | 3422_512x640.jpg _
849 | 1868_512x640.jpg _
850 | 5327_512x640.jpg _
851 | 5150_640x512.jpg _
852 | 3154_512x640.jpg _
853 | 1964_640x512.jpg _
854 | 1974_512x640.jpg _
855 | 5064_512x640.jpg _
856 | 2112_512x640.jpg _
857 | 5516_512x640.jpg _
858 | 1857_512x640.jpg _
859 | 3127_512x640.jpg _
860 | 5383_512x640.jpg _
861 | 3703_512x640.jpg _
862 | 5393_512x640.jpg _
863 | 1905_640x512.jpg _
864 | 2083_640x512.jpg _
865 | 1687_512x640.jpg _
866 | 1891_512x640.jpg _
867 | 1649_512x640.jpg _
868 | 3814_512x640.jpg _
869 | 3399_512x640.jpg _
870 | 5222_512x640.jpg _
871 | 5533_512x640.jpg _
872 | 5168_640x512.jpg _
873 | 2131_512x640.jpg _
874 | 1980_512x640.jpg _
875 | 3310_512x640.jpg _
876 | 5052_512x640.jpg _
877 | 5060_512x640.jpg _
878 | 3510_512x640.jpg _
879 | 1853_512x640.jpg _
880 | 5141_512x640.jpg _
881 | 3237_512x640.jpg _
882 | 5271_512x640.jpg _
883 | 3404_512x640.jpg _
884 | 1807_512x640.jpg _
885 | 4972_512x640.jpg _
886 | 5155_512x640.jpg _
887 | 1916_640x512.jpg _
888 | 1936_512x640.jpg _
889 | 1845_512x640.jpg _
890 | 4960_640x512.jpg _
891 | 5110_640x512.jpg _
892 | 5164_512x640.jpg _
893 | 5312_512x640.jpg _
894 | 5482_512x640.jpg _
895 | 1741_512x640.jpg _
896 | 5053_512x640.jpg _
897 | 1804_512x640.jpg _
898 | 4388_512x640.jpg _
899 | 1957_512x640.jpg _
900 | 5076_512x640.jpg _
901 | 5400_512x640.jpg _
902 | 3395_512x640.jpg _
903 | 1847_512x640.jpg _
904 | 2128_512x640.jpg _
905 | 3149_512x640.jpg _
906 | 5305_640x512.jpg _
907 | 5239_640x512.jpg _
908 | 5201_640x512.jpg _
909 | 3243_512x640.jpg _
910 | 1043_512x640.jpg _
911 | 3319_512x640.jpg _
912 | 730_512x640.jpg _
913 | 3904_512x640.jpg _
914 | 1835_640x512.jpg _
915 | 5473_512x640.jpg _
916 | 3731_512x640.jpg _
917 | 4387_512x640.jpg _
918 | 5149_512x640.jpg _
919 | 1920_640x512.jpg _
920 | 5342_512x640.jpg _
921 | 5357_640x512.jpg _
922 | 5313_512x640.jpg _
923 | 4966_640x512.jpg _
924 | 1839_512x640.jpg _
925 | 1690_512x640.jpg _
926 | 5217_512x640.jpg _
927 | 3049_512x640.jpg _
928 | 5116_512x640.jpg _
929 | 3459_512x640.jpg _
930 | 5389_512x640.jpg _
931 | 1763_512x640.jpg _
932 | 4357_512x640.jpg _
933 | 5167_512x640.jpg _
934 | 5015_512x640.jpg _
935 | 1851_512x640.jpg _
936 | 1737_512x640.jpg _
937 | 5070_512x640.jpg _
938 | 864_512x640.jpg _
939 | 1953_512x640.jpg _
940 | 1968_512x640.jpg _
941 | 5138_512x640.jpg _
942 | 5158_512x640.jpg _
943 | 1826_512x640.jpg _
944 | 1796_512x640.jpg _
945 | 1872_640x512.jpg _
946 | 5094_640x512.jpg _
947 | 3680_512x640.jpg _
948 | 5218_512x640.jpg _
949 | 5021_512x640.jpg _
950 | 5068_640x512.jpg _
951 | 3629_512x640.jpg _
952 | 5039_512x640.jpg _
953 | 5444_512x640.jpg _
954 | 3293_512x640.jpg _
955 | 5259_512x640.jpg _
956 |
--------------------------------------------------------------------------------
/datasets/trans10k/validation.txt:
--------------------------------------------------------------------------------
1 | 7621.jpg _
2 | 2533.jpg _
3 | 6098.jpg _
4 | 8130.jpg _
5 | 3091.jpg _
6 | 1360.jpg _
7 | 9693.jpg _
8 | 1342.jpg _
9 | 4478.jpg _
10 | 6746.jpg _
11 | 3645.jpg _
12 | 6033.jpg _
13 | 5321.jpg _
14 | 4179.jpg _
15 | 4109.jpg _
16 | 7240.jpg _
17 | 3071.jpg _
18 | 1363.jpg _
19 | 510.jpg _
20 | 675.jpg _
21 | 3265.jpg _
22 | 3947.jpg _
23 | 7272.jpg _
24 | 3671.jpg _
25 | 1620.jpg _
26 | 3859.jpg _
27 | 8475.jpg _
28 | 5237.jpg _
29 | 1629.jpg _
30 | 4910.jpg _
31 | 754.jpg _
32 | 4018.jpg _
33 | 7743.jpg _
34 | 9218.jpg _
35 | 1562.jpg _
36 | 1634.jpg _
37 | 7949.jpg _
38 | 9279.jpg _
39 | 2430.jpg _
40 | 5859.jpg _
41 | 7029.jpg _
42 | 8054.jpg _
43 | 639.jpg _
44 | 8139.jpg _
45 | 5301.jpg _
46 | 1777.jpg _
47 | 6078.jpg _
48 | 1259.jpg _
49 | 3759.jpg _
50 | 6828.jpg _
51 | 3144.jpg _
52 | 1474.jpg _
53 | 2309.jpg _
54 | 3647.jpg _
55 | 677.jpg _
56 | 6330.jpg _
57 | 3321.jpg _
58 | 1316.jpg _
59 | 58.jpg _
60 | 6046.jpg _
61 | 6800.jpg _
62 | 4453.jpg _
63 | 3563.jpg _
64 | 5319.jpg _
65 | 6862.jpg _
66 | 2629.jpg _
67 | 3676.jpg _
68 | 924.jpg _
69 | 8667.jpg _
70 | 5188.jpg _
71 | 6476.jpg _
72 | 10006.jpg _
73 | 8625.jpg _
74 | 6106.jpg _
75 | 2605.jpg _
76 | 9504.jpg _
77 | 10442.jpg _
78 | 2563.jpg _
79 | 8582.jpg _
80 | 7167.jpg _
81 | 4686.jpg _
82 | 2145.jpg _
83 | 8411.jpg _
84 | 2645.jpg _
85 | 5104.jpg _
86 | 3508.jpg _
87 | 634.jpg _
88 | 3897.jpg _
89 | 3103.jpg _
90 | 5403.jpg _
91 | 9775.jpg _
92 | 1467.jpg _
93 | 4246.jpg _
94 | 1300.jpg _
95 | 5663.jpg _
96 | 1501.jpg _
97 | 9591.jpg _
98 | 1906.jpg _
99 | 2448.jpg _
100 | 7077.jpg _
101 | 3233.jpg _
102 | 2819.jpg _
103 | 772.jpg _
104 | 423.jpg _
105 | 6938.jpg _
106 | 4688.jpg _
107 | 1759.jpg _
108 | 2754.jpg _
109 | 4449.jpg _
110 | 8842.jpg _
111 | 8603.jpg _
112 | 1182.jpg _
113 | 1395.jpg _
114 | 8157.jpg _
115 | 9640.jpg _
116 | 10181.jpg _
117 | 2805.jpg _
118 | 5975.jpg _
119 | 5910.jpg _
120 | 942.jpg _
121 | 6325.jpg _
122 | 8795.jpg _
123 | 7911.jpg _
124 | 4586.jpg _
125 | 6625.jpg _
126 | 3665.jpg _
127 | 6739.jpg _
128 | 1810.jpg _
129 | 5953.jpg _
130 | 2893.jpg _
131 | 5889.jpg _
132 | 8925.jpg _
133 | 2406.jpg _
134 | 9113.jpg _
135 | 2147.jpg _
136 | 3057.jpg _
137 | 539.jpg _
138 | 2765.jpg _
139 | 208.jpg _
140 | 5699.jpg _
141 | 1438.jpg _
142 | 9571.jpg _
143 | 8456.jpg _
144 | 4104.jpg _
145 | 2033.jpg _
146 | 1721.jpg _
147 | 1233.jpg _
148 | 6286.jpg _
149 | 7532.jpg _
150 | 2568.jpg _
151 | 10447.jpg _
152 | 6284.jpg _
153 | 4621.jpg _
154 | 1449.jpg _
155 | 8432.jpg _
156 | 7256.jpg _
157 | 5498.jpg _
158 | 5177.jpg _
159 | 2329.jpg _
160 | 3138.jpg _
161 | 6618.jpg _
162 | 6366.jpg _
163 | 9247.jpg _
164 | 4535.jpg _
165 | 1247.jpg _
166 | 10097.jpg _
167 | 361.jpg _
168 | 6455.jpg _
169 | 10000.jpg _
170 | 2569.jpg _
171 | 3843.jpg _
172 | 7280.jpg _
173 | 4658.jpg _
174 | 3801.jpg _
175 | 10114.jpg _
176 | 7332.jpg _
177 | 1459.jpg _
178 | 1535.jpg _
179 | 1368.jpg _
180 | 542.jpg _
181 | 10145.jpg _
182 | 4461.jpg _
183 | 4703.jpg _
184 | 1478.jpg _
185 | 3724.jpg _
186 | 4832.jpg _
187 | 5318.jpg _
188 | 1749.jpg _
189 | 8809.jpg _
190 | 2346.jpg _
191 | 4226.jpg _
192 | 7309.jpg _
193 | 2713.jpg _
194 | 5456.jpg _
195 | 5615.jpg _
196 | 6398.jpg _
197 | 9966.jpg _
198 | 1470.jpg _
199 | 8485.jpg _
200 | 8199.jpg _
201 | 2345.jpg _
202 | 5144.jpg _
203 | 9125.jpg _
204 | 5202.jpg _
205 | 4721.jpg _
206 | 4638.jpg _
207 | 314.jpg _
208 | 2767.jpg _
209 | 10437.jpg _
210 | 8833.jpg _
211 | 3608.jpg _
212 | 4120.jpg _
213 | 10026.jpg _
214 | 7540.jpg _
215 | 8202.jpg _
216 | 6103.jpg _
217 | 4276.jpg _
218 | 6119.jpg _
219 | 4842.jpg _
220 | 3584.jpg _
221 | 4289.jpg _
222 | 2640.jpg _
223 | 9782.jpg _
224 | 2259.jpg _
225 | 7324.jpg _
226 | 2386.jpg _
227 | 10178.jpg _
228 | 5956.jpg _
229 | 166.jpg _
230 | 2409.jpg _
231 | 611.jpg _
232 | 1135.jpg _
233 | 7327.jpg _
234 | 9305.jpg _
235 | 5165.jpg _
236 | 1322.jpg _
237 | 9625.jpg _
238 | 9122.jpg _
239 | 8070.jpg _
240 | 4633.jpg _
241 | 2183.jpg _
242 | 8300.jpg _
243 | 8121.jpg _
244 | 8467.jpg _
245 | 2964.jpg _
246 | 6859.jpg _
247 | 3324.jpg _
248 | 9518.jpg _
249 | 1427.jpg _
250 | 2960.jpg _
251 | 4724.jpg _
252 | 2049.jpg _
253 | 6074.jpg _
254 | 3264.jpg _
255 | 7070.jpg _
256 | 9507.jpg _
257 | 6335.jpg _
258 | 9644.jpg _
259 | 7590.jpg _
260 | 9015.jpg _
261 | 2233.jpg _
262 | 9690.jpg _
263 | 6282.jpg _
264 | 4981.jpg _
265 | 10040.jpg _
266 | 5466.jpg _
267 | 8376.jpg _
268 | 6207.jpg _
269 | 9941.jpg _
270 | 582.jpg _
271 | 9942.jpg _
272 | 6136.jpg _
273 | 4664.jpg _
274 | 2485.jpg _
275 | 10223.jpg _
276 | 7527.jpg _
277 | 9358.jpg _
278 | 4827.jpg _
279 | 5521.jpg _
280 | 8668.jpg _
281 | 7219.jpg _
282 | 9458.jpg _
283 | 2608.jpg _
284 | 8929.jpg _
285 | 988.jpg _
286 | 3629.jpg _
287 | 7415.jpg _
288 | 1920.jpg _
289 | 1623.jpg _
290 | 8388.jpg _
291 | 4225.jpg _
292 | 1926.jpg _
293 | 9282.jpg _
294 | 2331.jpg _
295 | 5632.jpg _
296 | 8209.jpg _
297 | 3024.jpg _
298 | 9225.jpg _
299 | 3692.jpg _
300 | 5260.jpg _
301 | 3666.jpg _
302 | 111.jpg _
303 | 7930.jpg _
304 | 2652.jpg _
305 | 8081.jpg _
306 | 781.jpg _
307 | 7229.jpg _
308 | 1175.jpg _
309 | 7722.jpg _
310 | 8466.jpg _
311 | 4785.jpg _
312 | 6446.jpg _
313 | 4815.jpg _
314 | 9449.jpg _
315 | 9649.jpg _
316 | 7908.jpg _
317 | 10045.jpg _
318 | 5844.jpg _
319 | 9334.jpg _
320 | 10436.jpg _
321 | 6464.jpg _
322 | 2740.jpg _
323 | 5040.jpg _
324 | 3339.jpg _
325 | 3260.jpg _
326 | 4903.jpg _
327 | 8599.jpg _
328 | 10148.jpg _
329 | 8105.jpg _
330 | 10216.jpg _
331 | 6010.jpg _
332 | 1662.jpg _
333 | 688.jpg _
334 | 1700.jpg _
335 | 3790.jpg _
336 | 5865.jpg _
337 | 6430.jpg _
338 | 5007.jpg _
339 | 6920.jpg _
340 | 736.jpg _
341 | 2213.jpg _
342 | 5937.jpg _
343 | 9801.jpg _
344 | 9982.jpg _
345 | 7989.jpg _
346 | 8110.jpg _
347 | 10130.jpg _
348 | 5214.jpg _
349 | 8811.jpg _
350 | 1325.jpg _
351 | 5494.jpg _
352 | 3911.jpg _
353 | 9540.jpg _
354 | 9078.jpg _
355 | 7424.jpg _
356 | 5536.jpg _
357 | 2671.jpg _
358 | 533.jpg _
359 | 10395.jpg _
360 | 5963.jpg _
361 | 6402.jpg _
362 | 818.jpg _
363 | 908.jpg _
364 | 955.jpg _
365 | 6263.jpg _
366 | 6638.jpg _
367 | 2766.jpg _
368 | 6817.jpg _
369 | 6883.jpg _
370 | 8522.jpg _
371 | 8696.jpg _
372 | 3326.jpg _
373 | 6025.jpg _
374 | 835.jpg _
375 | 394.jpg _
376 | 9283.jpg _
377 | 1878.jpg _
378 | 5328.jpg _
379 | 7171.jpg _
380 | 2619.jpg _
381 | 6316.jpg _
382 | 336.jpg _
383 | 3815.jpg _
384 | 1529.jpg _
385 | 7032.jpg _
386 | 7537.jpg _
387 | 7690.jpg _
388 | 6918.jpg _
389 | 9629.jpg _
390 | 3950.jpg _
391 | 3259.jpg _
392 | 3140.jpg _
393 | 3432.jpg _
394 | 6575.jpg _
395 | 1967.jpg _
396 | 81.jpg _
397 | 2830.jpg _
398 | 2002.jpg _
399 | 4804.jpg _
400 | 1958.jpg _
401 | 427.jpg _
402 | 2022.jpg _
403 | 7197.jpg _
404 | 221.jpg _
405 | 140.jpg _
406 | 1514.jpg _
407 | 8992.jpg _
408 | 1956.jpg _
409 | 8891.jpg _
410 | 4691.jpg _
411 | 4569.jpg _
412 | 2611.jpg _
413 | 1205.jpg _
414 | 1612.jpg _
415 | 358.jpg _
416 | 7767.jpg _
417 | 8447.jpg _
418 | 8239.jpg _
419 | 2621.jpg _
420 | 7281.jpg _
421 | 2024.jpg _
422 | 8097.jpg _
423 | 7840.jpg _
424 | 8354.jpg _
425 | 504.jpg _
426 | 8440.jpg _
427 | 7662.jpg _
428 | 10346.jpg _
429 | 1017.jpg _
430 | 7315.jpg _
431 | 5203.jpg _
432 | 9929.jpg _
433 | 7041.jpg _
434 | 565.jpg _
435 | 278.jpg _
436 | 616.jpg _
437 | 4689.jpg _
438 | 6779.jpg _
439 | 3842.jpg _
440 | 3013.jpg _
441 | 4372.jpg _
442 | 1643.jpg _
443 | 5884.jpg _
444 | 5708.jpg _
445 | 8156.jpg _
446 | 8401.jpg _
447 | 884.jpg _
448 | 9750.jpg _
449 | 8936.jpg _
450 | 8865.jpg _
451 | 1122.jpg _
452 | 2179.jpg _
453 | 3447.jpg _
454 | 9857.jpg _
455 | 6690.jpg _
456 | 10327.jpg _
457 | 3275.jpg _
458 | 9494.jpg _
459 | 957.jpg _
460 | 4978.jpg _
461 | 7535.jpg _
462 | 6905.jpg _
463 | 5809.jpg _
464 | 7002.jpg _
465 | 2587.jpg _
466 | 5522.jpg _
467 | 5417.jpg _
468 | 247.jpg _
469 | 6336.jpg _
470 | 7288.jpg _
471 | 4126.jpg _
472 | 3946.jpg _
473 | 8444.jpg _
474 | 6130.jpg _
475 | 8482.jpg _
476 | 7036.jpg _
477 | 5023.jpg _
478 | 8154.jpg _
479 | 5629.jpg _
480 | 9771.jpg _
481 | 1820.jpg _
482 | 7772.jpg _
483 | 7380.jpg _
484 | 8483.jpg _
485 | 4470.jpg _
486 | 1947.jpg _
487 | 8598.jpg _
488 | 6656.jpg _
489 | 1212.jpg _
490 | 87.jpg _
491 | 6742.jpg _
492 | 1250.jpg _
493 | 9089.jpg _
494 | 3201.jpg _
495 | 6169.jpg _
496 | 10020.jpg _
497 | 8677.jpg _
498 | 7634.jpg _
499 | 5736.jpg _
500 | 9698.jpg _
501 | 7665.jpg _
502 | 531.jpg _
503 | 5406.jpg _
504 | 2601.jpg _
505 | 5404.jpg _
506 | 9380.jpg _
507 | 983.jpg _
508 | 9681.jpg _
509 | 5460.jpg _
510 | 9303.jpg _
511 | 7866.jpg _
512 | 6276.jpg _
513 | 8457.jpg _
514 | 7282.jpg _
515 | 2520.jpg _
516 | 7287.jpg _
517 | 5816.jpg _
518 | 5045.jpg _
519 | 7541.jpg _
520 | 3054.jpg _
521 | 8371.jpg _
522 | 7381.jpg _
523 | 1505.jpg _
524 | 8915.jpg _
525 | 3278.jpg _
526 | 2310.jpg _
527 | 10201.jpg _
528 | 3872.jpg _
529 | 8616.jpg _
530 | 6196.jpg _
531 | 973.jpg _
532 | 3444.jpg _
533 | 1121.jpg _
534 | 5733.jpg _
535 | 6657.jpg _
536 | 6901.jpg _
537 | 10472.jpg _
538 | 8841.jpg _
539 | 6655.jpg _
540 | 4228.jpg _
541 | 7900.jpg _
542 | 6993.jpg _
543 | 1606.jpg _
544 | 5349.jpg _
545 | 6448.jpg _
546 | 2000.jpg _
547 | 8787.jpg _
548 | 4350.jpg _
549 | 8651.jpg _
550 | 334.jpg _
551 | 1821.jpg _
552 | 6975.jpg _
553 | 2375.jpg _
554 | 3014.jpg _
555 | 1558.jpg _
556 | 940.jpg _
557 | 6061.jpg _
558 | 3713.jpg _
559 | 548.jpg _
560 | 3512.jpg _
561 | 4753.jpg _
562 | 148.jpg _
563 | 2815.jpg _
564 | 891.jpg _
565 | 8013.jpg _
566 | 3172.jpg _
567 | 1424.jpg _
568 | 8383.jpg _
569 | 6879.jpg _
570 | 4907.jpg _
571 | 7263.jpg _
572 | 1428.jpg _
573 | 8566.jpg _
574 | 9278.jpg _
575 | 189.jpg _
576 | 6356.jpg _
577 | 9173.jpg _
578 | 7750.jpg _
579 | 702.jpg _
580 | 7201.jpg _
581 | 2927.jpg _
582 | 7511.jpg _
583 | 1668.jpg _
584 | 4187.jpg _
585 | 1506.jpg _
586 | 1663.jpg _
587 | 8633.jpg _
588 | 3463.jpg _
589 | 2788.jpg _
590 | 6756.jpg _
591 | 2418.jpg _
592 | 6060.jpg _
593 | 8881.jpg _
594 | 8606.jpg _
595 | 7763.jpg _
596 | 8838.jpg _
597 | 2866.jpg _
598 | 2413.jpg _
599 | 1077.jpg _
600 | 8365.jpg _
601 | 3702.jpg _
602 | 6992.jpg _
603 | 9331.jpg _
604 | 9660.jpg _
605 | 3855.jpg _
606 | 2733.jpg _
607 | 6513.jpg _
608 | 674.jpg _
609 | 96.jpg _
610 | 4733.jpg _
611 | 2419.jpg _
612 | 2129.jpg _
613 | 6254.jpg _
614 | 10269.jpg _
615 | 8897.jpg _
616 | 1635.jpg _
617 | 4137.jpg _
618 | 4821.jpg _
619 | 7152.jpg _
620 | 9823.jpg _
621 | 664.jpg _
622 | 5133.jpg _
623 | 4249.jpg _
624 | 6112.jpg _
625 | 8636.jpg _
626 | 2666.jpg _
627 | 8885.jpg _
628 | 7776.jpg _
629 | 5011.jpg _
630 | 1362.jpg _
631 | 8663.jpg _
632 | 8568.jpg _
633 | 1328.jpg _
634 | 2562.jpg _
635 | 10034.jpg _
636 | 9923.jpg _
637 | 9372.jpg _
638 | 1359.jpg _
639 | 5888.jpg _
640 | 239.jpg _
641 | 5497.jpg _
642 | 1511.jpg _
643 | 5950.jpg _
644 | 1768.jpg _
645 | 5993.jpg _
646 | 10468.jpg _
647 | 10031.jpg _
648 | 812.jpg _
649 | 6784.jpg _
650 | 1461.jpg _
651 | 4098.jpg _
652 | 9557.jpg _
653 | 5035.jpg _
654 | 9727.jpg _
655 | 7706.jpg _
656 | 4549.jpg _
657 | 9605.jpg _
658 | 10150.jpg _
659 | 698.jpg _
660 | 1045.jpg _
661 | 431.jpg _
662 | 2461.jpg _
663 | 1409.jpg _
664 | 2057.jpg _
665 | 8928.jpg _
666 | 5196.jpg _
667 | 4157.jpg _
668 | 1294.jpg _
669 | 3298.jpg _
670 | 2168.jpg _
671 | 1004.jpg _
672 | 6671.jpg _
673 | 5240.jpg _
674 | 8523.jpg _
675 | 2813.jpg _
676 | 6278.jpg _
677 | 9022.jpg _
678 | 2721.jpg _
679 | 3188.jpg _
680 | 8963.jpg _
681 | 8969.jpg _
682 | 8259.jpg _
683 | 5181.jpg _
684 | 10466.jpg _
685 | 1564.jpg _
686 | 7661.jpg _
687 | 6745.jpg _
688 | 2887.jpg _
689 | 6571.jpg _
690 | 6285.jpg _
691 | 4871.jpg _
692 | 3705.jpg _
693 | 5078.jpg _
694 | 1881.jpg _
695 | 5620.jpg _
696 | 1975.jpg _
697 | 5167.jpg _
698 | 2366.jpg _
699 | 3693.jpg _
700 | 465.jpg _
701 | 637.jpg _
702 | 19.jpg _
703 | 1108.jpg _
704 | 8037.jpg _
705 | 7865.jpg _
706 | 2926.jpg _
707 | 8220.jpg _
708 | 9230.jpg _
709 | 9825.jpg _
710 | 1734.jpg _
711 | 149.jpg _
712 | 9663.jpg _
713 | 2948.jpg _
714 | 5936.jpg _
715 | 9321.jpg _
716 | 6108.jpg _
717 | 5815.jpg _
718 | 8621.jpg _
719 | 9035.jpg _
720 | 2139.jpg _
721 | 7425.jpg _
722 | 2987.jpg _
723 | 902.jpg _
724 | 10320.jpg _
725 | 3465.jpg _
726 | 2364.jpg _
727 | 8976.jpg _
728 | 9417.jpg _
729 | 3455.jpg _
730 | 9454.jpg _
731 | 2305.jpg _
732 | 3081.jpg _
733 | 167.jpg _
734 | 811.jpg _
735 | 1874.jpg _
736 | 10225.jpg _
737 | 6840.jpg _
738 | 6574.jpg _
739 | 3187.jpg _
740 | 9730.jpg _
741 | 4323.jpg _
742 | 218.jpg _
743 | 8733.jpg _
744 | 9161.jpg _
745 | 2055.jpg _
746 | 1480.jpg _
747 | 7804.jpg _
748 | 9927.jpg _
749 | 9361.jpg _
750 | 8679.jpg _
751 | 6829.jpg _
752 | 4448.jpg _
753 | 2236.jpg _
754 | 644.jpg _
755 | 953.jpg _
756 | 796.jpg _
757 | 714.jpg _
758 | 8734.jpg _
759 | 4086.jpg _
760 | 5338.jpg _
761 | 1106.jpg _
762 | 2076.jpg _
763 | 7847.jpg _
764 | 3454.jpg _
765 | 2947.jpg _
766 | 2048.jpg _
767 | 563.jpg _
768 | 5824.jpg _
769 | 5954.jpg _
770 | 8507.jpg _
771 | 2769.jpg _
772 | 3141.jpg _
773 | 10300.jpg _
774 | 6948.jpg _
775 | 3131.jpg _
776 | 6081.jpg _
777 | 943.jpg _
778 | 8257.jpg _
779 | 8780.jpg _
780 | 4755.jpg _
781 | 9778.jpg _
782 | 1856.jpg _
783 | 3937.jpg _
784 | 2074.jpg _
785 | 9955.jpg _
786 | 536.jpg _
787 | 10401.jpg _
788 | 1585.jpg _
789 | 7430.jpg _
790 | 4200.jpg _
791 | 8798.jpg _
792 | 8938.jpg _
793 | 9538.jpg _
794 | 6050.jpg _
795 | 3331.jpg _
796 | 10111.jpg _
797 | 10313.jpg _
798 | 7577.jpg _
799 | 10063.jpg _
800 | 7744.jpg _
801 | 734.jpg _
802 | 5505.jpg _
803 | 1094.jpg _
804 | 10237.jpg _
805 | 9244.jpg _
806 | 6246.jpg _
807 | 2833.jpg _
808 | 9183.jpg _
809 | 8057.jpg _
810 | 2755.jpg _
811 | 9561.jpg _
812 | 5097.jpg _
813 | 2662.jpg _
814 | 5842.jpg _
815 | 8775.jpg _
816 | 1884.jpg _
817 | 8452.jpg _
818 | 8067.jpg _
819 | 3309.jpg _
820 | 7297.jpg _
821 | 8645.jpg _
822 | 9831.jpg _
823 | 5446.jpg _
824 | 8476.jpg _
825 | 4737.jpg _
826 | 3945.jpg _
827 | 6163.jpg _
828 | 7880.jpg _
829 | 7964.jpg _
830 | 2992.jpg _
831 | 7780.jpg _
832 | 5079.jpg _
833 | 6107.jpg _
834 | 6483.jpg _
835 | 7369.jpg _
836 | 5411.jpg _
837 | 6809.jpg _
838 | 5679.jpg _
839 | 201.jpg _
840 | 10180.jpg _
841 | 3084.jpg _
842 | 5296.jpg _
843 | 6373.jpg _
844 | 7477.jpg _
845 | 2913.jpg _
846 | 5341.jpg _
847 | 3697.jpg _
848 | 7506.jpg _
849 | 681.jpg _
850 | 1383.jpg _
851 | 10228.jpg _
852 | 586.jpg _
853 | 7769.jpg _
854 | 1883.jpg _
855 | 1984.jpg _
856 | 4201.jpg _
857 | 7939.jpg _
858 | 727.jpg _
859 | 8551.jpg _
860 | 2118.jpg _
861 | 498.jpg _
862 | 7738.jpg _
863 | 4796.jpg _
864 | 3165.jpg _
865 | 248.jpg _
866 | 9967.jpg _
867 | 10009.jpg _
868 | 5118.jpg _
869 | 4995.jpg _
870 | 9074.jpg _
871 | 67.jpg _
872 | 7544.jpg _
873 | 3549.jpg _
874 | 3669.jpg _
875 | 6607.jpg _
876 | 4006.jpg _
877 | 508.jpg _
878 | 3199.jpg _
879 | 6907.jpg _
880 | 9645.jpg _
881 | 782.jpg _
882 | 3204.jpg _
883 | 10256.jpg _
884 | 5565.jpg _
885 | 6494.jpg _
886 | 225.jpg _
887 | 8549.jpg _
888 | 5211.jpg _
889 | 8693.jpg _
890 | 8845.jpg _
891 | 7283.jpg _
892 | 1600.jpg _
893 | 4101.jpg _
894 | 10291.jpg _
895 | 6091.jpg _
896 | 8171.jpg _
897 | 2528.jpg _
898 | 2102.jpg _
899 | 6071.jpg _
900 | 3992.jpg _
901 | 4262.jpg _
902 | 2160.jpg _
903 | 8351.jpg _
904 | 8995.jpg _
905 | 10474.jpg _
906 | 7986.jpg _
907 | 9063.jpg _
908 | 5656.jpg _
909 | 5618.jpg _
910 | 9593.jpg _
911 | 8228.jpg _
912 | 9843.jpg _
913 | 6001.jpg _
914 | 163.jpg _
915 | 3615.jpg _
916 | 5524.jpg _
917 | 110.jpg _
918 | 1545.jpg _
919 | 9848.jpg _
920 | 10032.jpg _
921 | 3292.jpg _
922 | 8997.jpg _
923 | 3354.jpg _
924 | 5730.jpg _
925 | 6668.jpg _
926 | 2581.jpg _
927 | 6665.jpg _
928 | 3716.jpg _
929 | 9050.jpg _
930 | 3916.jpg _
931 | 4367.jpg _
932 | 7124.jpg _
933 | 8608.jpg _
934 | 2427.jpg _
935 | 629.jpg _
936 | 550.jpg _
937 | 3293.jpg _
938 | 9781.jpg _
939 | 9460.jpg _
940 | 6709.jpg _
941 | 8712.jpg _
942 | 7106.jpg _
943 | 2042.jpg _
944 | 7893.jpg _
945 | 2398.jpg _
946 | 1432.jpg _
947 | 1251.jpg _
948 | 753.jpg _
949 | 5005.jpg _
950 | 5519.jpg _
951 | 4757.jpg _
952 | 1666.jpg _
953 | 7793.jpg _
954 | 8745.jpg _
955 | 10095.jpg _
956 | 9118.jpg _
957 | 9679.jpg _
958 | 3717.jpg _
959 | 9553.jpg _
960 | 8075.jpg _
961 | 6933.jpg _
962 | 679.jpg _
963 | 3185.jpg _
964 | 164.jpg _
965 | 2319.jpg _
966 | 9314.jpg _
967 | 2457.jpg _
968 | 3197.jpg _
969 | 1039.jpg _
970 | 5701.jpg _
971 | 6857.jpg _
972 | 5688.jpg _
973 | 8286.jpg _
974 | 8690.jpg _
975 | 1490.jpg _
976 | 3105.jpg _
977 | 9097.jpg _
978 | 3118.jpg _
979 | 2169.jpg _
980 | 9997.jpg _
981 | 6032.jpg _
982 | 6934.jpg _
983 | 4866.jpg _
984 | 7215.jpg _
985 | 7576.jpg _
986 | 10027.jpg _
987 | 6041.jpg _
988 | 7069.jpg _
989 | 10174.jpg _
990 | 5168.jpg _
991 | 7470.jpg _
992 | 8464.jpg _
993 | 2772.jpg _
994 | 1054.jpg _
995 | 1326.jpg _
996 | 2513.jpg _
997 | 7074.jpg _
998 | 10230.jpg _
999 | 2470.jpg _
1000 | 6714.jpg _
1001 |
--------------------------------------------------------------------------------
/evaluate_mono.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import skimage.io
4 | import argparse
5 | import cv2
6 | from tqdm import tqdm
7 | from utils import read_d, parse_dataset_txt, compute_scale_and_shift, read_calib_xml
8 | import threading
9 |
10 | CATEGORIES = ['All', 'ToM', 'Other']
11 | METRICS = ['delta1.25', 'delta1.20', 'delta1.15', 'delta1.10', 'delta1.05', 'mae', 'absrel', 'rmse']
12 |
13 | class evalThread(threading.Thread):
14 | def __init__(self, idxs, gts, preds, focals, baselines, acc, categories, min_depth=1, max_depth=10000, resize_factor=0.25, baseline_factor=1000, median_scale_and_shift=False):
15 | super(evalThread, self).__init__()
16 | self.idxs = idxs
17 | self.gts = gts
18 | self.preds = preds
19 | self.focals = focals
20 | self.baselines = baselines
21 | self.min_depth = min_depth
22 | self.max_depth = max_depth
23 | self.acc = acc
24 | self.categories = categories
25 | self.baseline_factor = baseline_factor
26 | self.median_scale_and_shift = median_scale_and_shift
27 | self.resize_factor = resize_factor
28 |
29 | def run(self):
30 | for idx in self.idxs:
31 | gt = read_d(self.gts[idx], scale_factor=256.)
32 | fx = self.focals[idx]
33 | baseline = self.baselines[idx]
34 | baseline = baseline * self.baseline_factor
35 |
36 | gt = cv2.resize(gt, None, fx=self.resize_factor, fy=self.resize_factor, interpolation=cv2.INTER_NEAREST)
37 | fx = fx * self.resize_factor
38 | gt = gt.astype(np.float32) * self.resize_factor
39 |
40 | # CLIP DEPTH GT
41 | gt[gt > fx * baseline / self.min_depth] = 0 # INVALID IF CLOSER THAN min_depth mm (very high disparity values)
42 | gt[gt < fx * baseline / self.max_depth] = 0 # INVALID IF FARTHER THAN max_depth mm (very small disparity values)
43 |
44 | pred = read_d(self.preds[idx], scale_factor=256.)
45 | pred = cv2.resize(pred, (gt.shape[1], gt.shape[0]), interpolation=cv2.INTER_CUBIC)
46 | pred = (pred - np.min(pred[gt > 0])) / (pred[gt > 0].max() - pred[gt > 0].min())
47 | if self.median_scale_and_shift:
48 | gt_shifted = gt - gt[gt>0].min()
49 | scale = np.median(gt_shifted[gt > 0])/np.median(pred[gt > 0])
50 | pred = pred * scale
51 | shift = np.median(gt[gt > 0] - pred[gt > 0])
52 | pred = pred + shift
53 | else:
54 | scale, shift = compute_scale_and_shift(np.expand_dims(pred, axis=0),
55 | np.expand_dims(gt, axis=0),
56 | np.expand_dims((gt > 0).astype(np.float32), axis=0))
57 | pred = pred * scale + shift
58 |
59 | pred = baseline * fx / pred
60 |
61 | # CLIP PRED TO WORKING RANGE
62 | pred[np.isinf(pred)] = self.max_depth
63 | pred[pred > self.max_depth] = self.max_depth
64 | pred[pred < self.min_depth] = self.min_depth
65 |
66 | # LOAD SEGMENTATION MASK TO EVALUATE ToM AND Other REGIONS SEPARATELY
67 | # (mask_cat.png is stored next to each ground-truth disparity map)
68 |
69 | if len(self.categories) > 1:
70 | seg_mask = skimage.io.imread(self.gts[idx].replace(os.path.basename(self.gts[idx]), 'mask_cat.png'))
71 | seg_mask = cv2.resize(seg_mask, None, fx=self.resize_factor, fy=self.resize_factor, interpolation=cv2.INTER_NEAREST)
72 |
73 | for category in self.categories:
74 | valid = (gt>0).astype(np.float32)
75 |
76 | if category != 'All':
77 | if category == "Other":
78 | mask0 = seg_mask == 0
79 | mask1 = seg_mask == 1
80 | else:
81 | mask0 = seg_mask == 2
82 | mask1 = seg_mask == 3
83 | mask = mask0 | mask1
84 | mask = mask.astype(np.float32)
85 | valid = valid * mask
86 |
87 | if valid.sum() > 0:
88 | metrics = booster_metrics(pred, gt, valid)
89 | for k in METRICS:
90 | self.acc[category][k].append(metrics[k])
91 |
92 |
93 | # Main evaluation function
94 | def booster_metrics(d, gt, valid):
95 | error = np.abs(d-gt)
96 | error[valid==0] = 0
97 |
98 | thresh = np.maximum((d[valid > 0] / gt[valid > 0]), (gt[valid > 0] / d[valid > 0]))
99 | delta3 = (thresh < 1.25).astype(np.float32).mean()
100 | delta4 = (thresh < 1.20).astype(np.float32).mean()
101 | delta5 = (thresh < 1.15).astype(np.float32).mean()
102 | delta6 = (thresh < 1.10).astype(np.float32).mean()
103 | delta7 = (thresh < 1.05).astype(np.float32).mean()
104 |
105 | avgerr = error[valid>0].mean()
106 | abs_rel = (error[valid>0]/gt[valid>0]).mean()
107 |
108 | rms = (d-gt)**2
109 | rms = np.sqrt( rms[valid>0].mean() )
110 |
111 | return {'delta1.25':delta3*100., 'delta1.20':delta4*100.,'delta1.15':delta5*100., 'delta1.10':delta6*100., 'delta1.05':delta7*100., 'mae':avgerr, 'absrel': abs_rel, 'rmse':rms, 'errormap':error*(valid>0)}
112 |
113 |
114 | def eval(gts, preds, focals, baselines, min_depth=1, max_depth=10000, resize_factor=0.25, baseline_factor=1000, median_scale_and_shift=False):
115 | # Check all files OK
116 | for test_img in preds:
117 | if not os.path.exists(test_img):
118 | print("Missing files in the submission")
119 | exit(-1)
120 |
121 | if not os.path.exists(gts[0].replace(os.path.basename(gts[0]), 'mask_cat.png')):
122 | categories = ['All']
123 | else:
124 | categories = CATEGORIES
125 |
126 | # INIT
127 | acc = {}
128 | results = {}
129 | for category in categories:
130 | acc[category] = {}
131 | results[category] = {}
132 | for metric in METRICS:
133 | acc[category][metric] = []
134 | results[category][metric] = []
135 |
136 | num_samples = len(gts)
137 | print("Number of samples", num_samples)
138 | num_workers = 32
139 | threads = []
140 | for i in range(num_workers):
141 | start_idx = num_samples//num_workers * i
142 | if i != num_workers -1:
143 | end_idx = num_samples//num_workers * (i+1)
144 | else:
145 | end_idx = num_samples
146 | idxs = range(start_idx, end_idx)
147 | t = evalThread(idxs, gts, preds, focals, baselines, acc, categories, min_depth, max_depth, resize_factor, baseline_factor, median_scale_and_shift)
148 | threads.append(t)
149 | t.start()
150 |
151 | for t in threads:
152 | t.join()
153 |
154 | for category in categories:
155 | for k in acc[category]:
156 | results[category][k] = np.array(acc[category][k]).mean()
157 |
158 | return results
159 |
160 |
161 | def result2string(result):
162 | result_string = "{:<12}".format("CLASS")
163 | for k in METRICS:
164 | result_string += "{:<12}".format(k)
165 | result_string += "\n"
166 | for cat in CATEGORIES:
167 | if cat in result:
168 | result_string += "{:<12}".format(cat)
169 | for metric in METRICS:
170 | tmp = ""
171 | if metric in result[cat]: tmp = "{:.2f}".format(result[cat][metric])
172 | result_string += "{:<12}".format(tmp)
173 | result_string += "\n"
174 | return result_string
175 |
176 |
177 | if __name__ == "__main__":
178 |
179 | parser = argparse.ArgumentParser()
180 | parser.add_argument('--gt_root',
181 | help='folder with gt'
182 | )
183 | parser.add_argument('--pred_root',
184 | help='folder with predictions'
185 | )
186 | parser.add_argument('--pred_ext',
187 | default=".npy",
188 | help='prediction extension'
189 | )
190 | parser.add_argument('--dataset_txt',
191 | help='txt file with one sample per line: "$basename $gtpath $calib_file", "$basename $gtpath $fx $baseline", or "$basename $gtpath"'
192 | )
193 |
194 | parser.add_argument('--output_path',
195 | default="results.txt",
196 | help='output file'
197 | )
198 | parser.add_argument('--resize_factor',
199 | default=0.25,
200 | type=float,
201 | help='resize factor applied to the gt maps; evaluation is performed at this resized gt resolution'
202 | )
203 | parser.add_argument('--baseline_factor',
204 | default=1000,
205 | type=float,
206 | help='scale baseline using this factor'
207 | )
208 | parser.add_argument('--min_depth',
209 | default=1,
210 | type=float,
211 | help='min depth in millimeters'
212 | )
213 | parser.add_argument('--max_depth',
214 | default=10000,
215 | type=float,
216 | help='max depth in millimeters'
217 | )
218 | parser.add_argument('--median_scale_and_shift',
219 | action="store_true",
220 | help='rescale prediction with median instead of least square scale and shift'
221 | )
222 | args = parser.parse_args()
223 |
224 | # Getting dataset paths
225 | dataset_dict = parse_dataset_txt(args.dataset_txt)
226 |
227 | gt_files = [os.path.join(args.gt_root, f) for f in dataset_dict["gt_paths"]]
228 | basenames = [os.path.join(args.pred_root, os.path.splitext(f)[0] + args.pred_ext) for f in dataset_dict["basenames"]]
229 |
230 | if "calib_paths" in dataset_dict:
231 | focals = []
232 | baselines = []
233 | for calib_path in dataset_dict["calib_paths"]:
234 | fx, baseline = read_calib_xml(os.path.join(args.gt_root, calib_path))
235 | focals.append(fx)
236 | baselines.append(baseline)
237 | elif "focals" in dataset_dict and "baselines" in dataset_dict:
238 | focals = dataset_dict["focals"]
239 | baselines = dataset_dict["baselines"]
240 | else:
241 | print("Missing focals and baselines or calib files")
242 | exit(-1)
243 |
244 | # Evaluation
245 | results = eval(gt_files, basenames, focals, baselines, args.min_depth, args.max_depth, args.resize_factor, args.baseline_factor, args.median_scale_and_shift)
246 |
247 | # Saving results
248 | results_str = result2string(results)
249 | print(results_str)
250 | with open(args.output_path, "w") as fout:
251 | fout.write(results_str)
--------------------------------------------------------------------------------
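The per-image metrics used above can also be exercised in isolation. The following minimal sketch (not part of the repository) calls `booster_metrics` on toy arrays; it assumes the repository root is on `PYTHONPATH` and that the dependencies imported by `evaluate_mono.py` (e.g. `skimage`, `cv2`, `utils`) are installed.

```python
# Minimal sketch: sanity-check booster_metrics on synthetic depth maps.
import numpy as np
from evaluate_mono import booster_metrics  # assumes repo root is on PYTHONPATH

gt = np.full((4, 4), 1000.0, dtype=np.float32)   # ground-truth depth, 1000 mm everywhere
pred = gt * 1.1                                  # prediction with a uniform 10% error
valid = (gt > 0).astype(np.float32)              # evaluate every pixel

metrics = booster_metrics(pred, gt, valid)
print(metrics['absrel'])     # ~0.10
print(metrics['delta1.25'])  # 100.0, since max(pred/gt, gt/pred) = 1.1 < 1.25
print(metrics['mae'])        # ~100 (mm)
```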
/finetune.py:
--------------------------------------------------------------------------------
1 | import os
2 | import torch
3 | import cv2
4 | import argparse
5 | import time
6 |
7 | from tqdm import tqdm, trange
8 | import numpy as np
9 | import matplotlib.pyplot as plt
10 |
11 | from torchvision.transforms import Compose
12 | from torchvision.utils import make_grid
13 | from torch.optim.lr_scheduler import ExponentialLR
14 | from torch.utils.data import DataLoader, ConcatDataset
15 |
16 | import wandb
17 |
18 | from midas.dpt_depth import DPTDepthModel
19 | from midas.midas_net import MidasNet
20 | from midas.midas_net_custom import MidasNet_small
21 | from midas.transforms import Resize, ResizeTrain, NormalizeImage, PrepareForNet, RandomCrop, MirrorSquarePad, ColorAug, RandomHorizontalFlip
22 |
23 | from datasets.dataloader import MSDLoader, Trans10KLoader
24 |
25 | from loss import ScaleAndShiftInvariantLoss, GradientLoss, MSELoss
26 |
27 | def rescale(x, a = 0.0, b = 1.0):
28 | return a + (b - a)*((x - x.min())/(x.max() - x.min()))
29 |
30 |
31 | def run(args):
32 | """Run MonoDepthNN to train on novel depth maps."""
33 |
34 | training_datasets = args.training_datasets
35 | training_datasets_dir = args.training_datasets_dir
36 | training_datasets_txt = args.training_datasets_txt
37 | output_path= os.path.join(args.output_path, args.exp_name)
38 | model_path=args.model_path
39 | model_type=args.model_type
40 |
41 | wandb.init(project = f"finetuning-{model_type}",
42 | name = args.exp_name,
43 | config = {"epochs" : args.epochs,
44 | "batch_size" : args.batch_size,
45 | "model_type" : model_type,
46 | "model_path": model_path,
47 | "training_datasets" : training_datasets,
48 | "training_datasets_dir": training_datasets_dir,
49 | "training_datasets_txt": training_datasets_txt,
50 | })
51 |
52 | # Select device.
53 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
54 | print("Device: %s." % device)
55 |
56 |
57 | #### MODEL
58 | # Load network.
59 | if model_type == "dpt_large": # DPT-Large
60 | model = DPTDepthModel(
61 | path=None,
62 | backbone="vitl16_384",
63 | non_negative=True,
64 | )
65 | net_w, net_h = 384, 384
66 | normalization = NormalizeImage(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
67 | transform = Compose(
68 | [
69 | RandomHorizontalFlip(prob=0.5),
70 | ResizeTrain(
71 | net_w,
72 | net_h,
73 | resize_target=True,
74 | keep_aspect_ratio=True,
75 | ensure_multiple_of=32,
76 | resize_method="lower_bound",
77 | image_interpolation_method=cv2.INTER_CUBIC,
78 | ),
79 | RandomCrop(net_w, net_h),
80 | ColorAug(prob=0.5),
81 | normalization,
82 | PrepareForNet(),
83 | ]
84 | )
85 | elif model_type == "midas_v21":
86 | model = MidasNet(None, non_negative=True)
87 | net_w, net_h = 384, 384
88 | normalization = NormalizeImage(
89 | mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
90 | )
91 | transform = Compose(
92 | [
93 | RandomHorizontalFlip(prob=0.5),
94 | MirrorSquarePad(),
95 | ResizeTrain(
96 | net_w,
97 | net_h,
98 | resize_target=True,
99 | keep_aspect_ratio=False,
100 | ensure_multiple_of=32,
101 | resize_method="upper_bound",
102 | image_interpolation_method=cv2.INTER_CUBIC,
103 | ),
104 | ColorAug(prob=0.5),
105 | normalization,
106 | PrepareForNet(),
107 | ]
108 | )
109 | else:
110 | print(f"model_type '{model_type}' not implemented, use: --model_type dpt_large or midas_v21")
111 | assert False
112 |
113 | reload = torch.load(model_path)
114 | if "model_state_dict" in reload.keys():
115 | checkpoint = reload['model_state_dict']
116 | else:
117 | checkpoint = reload
118 | model.load_state_dict(checkpoint)
119 |
120 | optimizer = torch.optim.NAdam(model.parameters(), lr = 1e-7)
121 | if "optimizer_state_dict" in reload.keys() and args.continue_train:
122 | optimizer.load_state_dict(reload['optimizer_state_dict'])
123 |
124 |
125 | scheduler = ExponentialLR(optimizer, gamma = 0.95)
126 | if "scheduler" in reload.keys() and args.continue_train:
127 | scheduler.load_state_dict(reload['scheduler'])
128 |
129 | ss_loss, grad_loss, mse_loss = ScaleAndShiftInvariantLoss(), GradientLoss(), MSELoss()
130 |
131 | # Un-freeze all layers.
132 | for param in model.parameters():
133 | param.requires_grad = True # False
134 |
135 | # wandb.watch(model, log_freq=100)
136 | model.to(device)
137 |
138 | ### DATASETS
139 | t_datasets = []
140 |
141 | if "trans10k" in training_datasets:
142 | idx = training_datasets.index("trans10k")
143 | train_t10k = Trans10KLoader(training_datasets_dir[idx], training_datasets_txt[idx], transform=transform)
144 | print("Training Samples Trans10K", len(train_t10k))
145 | t_datasets.append(train_t10k)
146 | if "msd" in training_datasets:
147 | idx = training_datasets.index("msd")
148 | train_msd = MSDLoader(training_datasets_dir[idx], training_datasets_txt[idx], transform=transform)
149 | print("Training Samples MSD", len(train_msd))
150 | t_datasets.append(train_msd)
151 |
152 | training_data = ConcatDataset(t_datasets)
153 | dataloader_train = DataLoader(dataset = training_data, batch_size = args.batch_size, shuffle = True, num_workers=8)
154 |
155 | running_time = 0.0
156 | train_step = 0
157 | for e in trange(args.epochs):
158 | start_time_epoch = time.time()
159 |
160 | ###---------------[Training loop]---------------###
161 | print(f"Training phase for epoch {e}: ")
162 |
163 | for img, depth, _ in tqdm(dataloader_train):
164 | if train_step % args.step_save == 0 and train_step != 0:
165 | # Save checkpoint.
166 | torch.save({'epoch': e,
167 | 'model_state_dict': model.state_dict(),
168 | 'optimizer_state_dict': optimizer.state_dict(),
169 | 'scheduler': scheduler.state_dict(),
170 | 'loss': loss,
171 | }, output_path + "/{}_{}.pt".format(model_type, train_step))
172 |
173 | model.train(True) # keep the model in training mode (set every step, so effectively redundant)
174 |
175 | # Turn to tensor and send to device.
176 | sample = img.to(device)
177 | gt = depth.to(device)
178 | optimizer.zero_grad()
179 | prediction = model(sample)
180 |
181 | mask_idx = torch.full(size = prediction.shape, fill_value = 1).to(device)
182 | loss = ss_loss(prediction, gt, mask_idx) + grad_loss(prediction, gt, mask_idx) + mse_loss(prediction, gt, mask_idx)
183 |
184 | if train_step % args.step_log == 0:
185 | wandb.log({"train/batch-wise-loss" : loss.detach().cpu()})
186 | if train_step % args.step_log_images == 0:
187 | vis_rgbs = torch.nn.functional.interpolate(sample, scale_factor=0.25, mode="bilinear")
188 | vis_preds = torch.nn.functional.interpolate(prediction.unsqueeze(1), scale_factor=0.25)
189 | vis_gts = torch.nn.functional.interpolate(gt.unsqueeze(1), scale_factor=0.25)
190 | wandb.log({
191 | "train/rgb": wandb.Image(make_grid(vis_rgbs, nrow = 4)),
192 | "train/prediction": wandb.Image(make_grid(vis_preds, nrow = 4)),
193 | "train/groundtruth" : wandb.Image(make_grid(vis_gts, nrow = 4))
194 | })
195 | if torch.isnan(loss) or torch.isinf(loss):
196 | exit()
197 |
198 | if not torch.isnan(loss) and not torch.isinf(loss):
199 | loss.backward()
200 | optimizer.step()
201 | train_step += 1
202 |
203 | scheduler.step()
204 |
205 | epoch_time = (time.time() - start_time_epoch)
206 | running_time += epoch_time
207 | print(f'Epoch {e} done in {epoch_time} s.')
208 |
209 |
210 | # Save checkpoint.
211 | torch.save({'epoch': e,
212 | 'model_state_dict': model.state_dict(),
213 | 'optimizer_state_dict': optimizer.state_dict(),
214 | 'scheduler': scheduler.state_dict(),
215 | 'loss': loss,
216 | }, output_path + "/{}_{}.pt".format(model_type, train_step))
217 |
218 | # Save final ckpt without optimizer and scheduler
219 | torch.save({'model_state_dict': model.state_dict()}, output_path + "/{}_final.pt".format(model_type))
220 |
221 | if __name__ == "__main__":
222 |
223 | parser = argparse.ArgumentParser()
224 |
225 | parser.add_argument('--exp_name',
226 | default='midas-ft',
227 | )
228 |
229 | # Paths
230 | parser.add_argument('--training_datasets',
231 | nargs='+',
232 | default=['msd', 'trans10k'],
233 | help='training datasets'
234 | )
235 |
236 | parser.add_argument('--training_datasets_dir',
237 | nargs='+',
238 | default=['MSD/', 'Trans10K/'],
239 | help='list of root directories, one for each training dataset'
240 | )
241 |
242 | parser.add_argument('--training_datasets_txt',
243 | nargs='+',
244 | default=['datasets/msd/train.txt', 'datasets/trans10k/train.txt'],
245 | help='list of files for each training dataset'
246 | )
247 |
248 | parser.add_argument('-o', '--output_path',
249 | default='./experiment_models',
250 | help='where to save the model'
251 | )
252 |
253 | # Model specs
254 | parser.add_argument('-m', '--model_path',
255 | default=None,
256 | help='path to the trained weights of model'
257 | )
258 |
259 | parser.add_argument('-t', '--model_type',
260 | default='dpt_large',
261 | help='model type: dpt_large, midas_v21'
262 | )
263 |
264 | # Training params
265 | parser.add_argument('-e', '--epochs',
266 | default=20,
267 | type=int,
268 | help='number of epochs'
269 | )
270 |
271 | parser.add_argument('-bs', '--batch_size',
272 | default=8,
273 | type=int,
274 | help='batch_size'
275 | )
276 |
277 | parser.add_argument('--continue_train',
278 | action="store_true",
279 | help='load optimizer and scheduler state dict'
280 | )
281 |
282 | # Logging params
283 | parser.add_argument('--step_save',
284 | default=5000,
285 | type=int,
286 | help='number of steps to save the model'
287 | )
288 | parser.add_argument('--step_log',
289 | default=10,
290 | type=int,
291 | help='number of steps between loss logging'
292 | )
293 | parser.add_argument('--step_log_images',
294 | default=1000,
295 | type=int,
296 | help='number of steps between image logging'
297 | )
298 |
299 | args = parser.parse_args()
300 | print(args)
301 |
302 | os.makedirs(os.path.join(args.output_path, args.exp_name), exist_ok=True)
303 |
304 | default_models = {
305 | "midas_v21" : "weights/Base/midas_v21-base.pt",
306 | "dpt_large" : "weights/Base/dpt_large-base.pt",
307 | }
308 |
309 | if args.model_path is None:
310 | args.model_path = default_models[args.model_type]
311 |
312 | # Set torch options
313 | torch.backends.cudnn.enabled = True
314 | torch.backends.cudnn.benchmark = True
315 |
316 | # Start fine-tuning.
317 | run(args)
318 |
--------------------------------------------------------------------------------
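Checkpoints written by `finetune.py` store the epoch, model, optimizer, scheduler and last loss under the keys saved in `run()`. A minimal sketch of how such a checkpoint can be reloaded is shown below (not repository code; the checkpoint path is hypothetical).

```python
# Minimal sketch: reload a checkpoint produced by finetune.py for a dpt_large model.
import torch
from midas.dpt_depth import DPTDepthModel

ckpt = torch.load("experiment_models/midas-ft/dpt_large_5000.pt", map_location="cpu")  # hypothetical path

model = DPTDepthModel(path=None, backbone="vitl16_384", non_negative=True)
model.load_state_dict(ckpt["model_state_dict"])  # optimizer/scheduler states are also stored in the same dict
model.eval()
```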
/images/framework_mono.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CVLAB-Unibo/Depth4ToM-code/5de0f869d66edc48b79d2f9f197756e71b342f9a/images/framework_mono.png
--------------------------------------------------------------------------------
/images/qualitatives.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/CVLAB-Unibo/Depth4ToM-code/5de0f869d66edc48b79d2f9f197756e71b342f9a/images/qualitatives.png
--------------------------------------------------------------------------------
/loss.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 |
4 |
5 | def compute_scale_and_shift(prediction, target, mask):
6 | # system matrix: A = [[a_00, a_01], [a_10, a_11]]
7 | a_00 = torch.sum(mask * prediction * prediction, (1, 2))
8 | a_01 = torch.sum(mask * prediction, (1, 2))
9 | a_11 = torch.sum(mask, (1, 2))
10 |
11 | # right hand side: b = [b_0, b_1]
12 | b_0 = torch.sum(mask * prediction * target, (1, 2))
13 | b_1 = torch.sum(mask * target, (1, 2))
14 |
15 | # solution: x = A^-1 . b = [[a_11, -a_01], [-a_10, a_00]] / (a_00 * a_11 - a_01 * a_10) . b
16 | x_0 = torch.zeros_like(b_0)
17 | x_1 = torch.zeros_like(b_1)
18 |
19 | det = a_00 * a_11 - a_01 * a_01
20 | valid = det.nonzero()
21 |
22 | x_0[valid] = (a_11[valid] * b_0[valid] - a_01[valid] * b_1[valid]) / det[valid]
23 | x_1[valid] = (-a_01[valid] * b_0[valid] + a_00[valid] * b_1[valid]) / det[valid]
24 |
25 | return x_0, x_1
26 |
27 |
28 | def reduction_batch_based(image_loss, M):
29 | # average of all valid pixels of the batch
30 |
31 | # avoid division by 0 (if sum(M) = sum(sum(mask)) = 0: sum(image_loss) = 0)
32 | divisor = torch.sum(M)
33 |
34 | if divisor == 0:
35 | return 0
36 | else:
37 | return torch.sum(image_loss) / divisor
38 |
39 |
40 | def reduction_image_based(image_loss, M):
41 | # mean of average of valid pixels of an image
42 |
43 | # avoid division by 0 (if M = sum(mask) = 0: image_loss = 0)
44 | valid = M.nonzero()
45 |
46 | image_loss[valid] = image_loss[valid] / M[valid]
47 |
48 | return torch.mean(image_loss)
49 |
50 |
51 | def mse_loss(prediction, target, mask, reduction=reduction_batch_based):
52 |
53 | M = torch.sum(mask, (1, 2))
54 | res = prediction - target
55 | image_loss = torch.sum(mask * res * res, (1, 2))
56 |
57 | return reduction(image_loss, 2 * M)
58 |
59 |
60 | def gradient_loss(prediction, target, mask, reduction=reduction_batch_based):
61 |
62 | M = torch.sum(mask, (1, 2))
63 |
64 | diff = prediction - target
65 | diff = torch.mul(mask, diff)
66 |
67 | grad_x = torch.abs(diff[:, :, 1:] - diff[:, :, :-1])
68 | mask_x = torch.mul(mask[:, :, 1:], mask[:, :, :-1])
69 | grad_x = torch.mul(mask_x, grad_x)
70 |
71 | grad_y = torch.abs(diff[:, 1:, :] - diff[:, :-1, :])
72 | mask_y = torch.mul(mask[:, 1:, :], mask[:, :-1, :])
73 | grad_y = torch.mul(mask_y, grad_y)
74 |
75 | image_loss = torch.sum(grad_x, (1, 2)) + torch.sum(grad_y, (1, 2))
76 |
77 | return reduction(image_loss, M)
78 |
79 |
80 | class MSELoss(nn.Module):
81 | def __init__(self, reduction='batch-based'):
82 | super().__init__()
83 |
84 | if reduction == 'batch-based':
85 | self.__reduction = reduction_batch_based
86 | else:
87 | self.__reduction = reduction_image_based
88 |
89 | def forward(self, prediction, target, mask):
90 | return mse_loss(prediction, target, mask, reduction=self.__reduction)
91 |
92 |
93 | class GradientLoss(nn.Module):
94 | def __init__(self, scales=4, reduction='batch-based'):
95 | super().__init__()
96 |
97 | if reduction == 'batch-based':
98 | self.__reduction = reduction_batch_based
99 | else:
100 | self.__reduction = reduction_image_based
101 |
102 | self.__scales = scales
103 |
104 | def forward(self, prediction, target, mask):
105 | total = 0
106 |
107 | for scale in range(self.__scales):
108 | step = pow(2, scale)
109 |
110 | total += gradient_loss(prediction[:, ::step, ::step], target[:, ::step, ::step],
111 | mask[:, ::step, ::step], reduction=self.__reduction)
112 |
113 | return total
114 |
115 |
116 | class ScaleAndShiftInvariantLoss(nn.Module):
117 | def __init__(self, alpha=0.5, scales=4, reduction='batch-based'):
118 | super().__init__()
119 |
120 | self.__data_loss = MSELoss(reduction=reduction)
121 | self.__regularization_loss = GradientLoss(scales=scales, reduction=reduction)
122 | self.__alpha = alpha
123 |
124 | self.__prediction_ssi = None
125 |
126 | def forward(self, prediction, target, mask):
127 |
128 | scale, shift = compute_scale_and_shift(prediction, target, mask)
129 | self.__prediction_ssi = scale.view(-1, 1, 1) * prediction + shift.view(-1, 1, 1)
130 |
131 | total = self.__data_loss(self.__prediction_ssi, target, mask)
132 | if self.__alpha > 0:
133 | total += self.__alpha * self.__regularization_loss(self.__prediction_ssi, target, mask)
134 |
135 | return total
136 |
137 | def __get_prediction_ssi(self):
138 | return self.__prediction_ssi
139 |
140 | prediction_ssi = property(__get_prediction_ssi)
--------------------------------------------------------------------------------
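The closed form implemented in `compute_scale_and_shift` is a per-image weighted least-squares fit. With prediction $p$, target $g$ and mask $m$, the scale $s$ and shift $t$ minimize

$$
\min_{s,t}\;\sum_i m_i\,(s\,p_i + t - g_i)^2 ,
$$

and setting the gradient to zero gives the $2\times 2$ normal equations

$$
\begin{pmatrix} \sum_i m_i p_i^2 & \sum_i m_i p_i \\ \sum_i m_i p_i & \sum_i m_i \end{pmatrix}
\begin{pmatrix} s \\ t \end{pmatrix}
=
\begin{pmatrix} \sum_i m_i p_i g_i \\ \sum_i m_i g_i \end{pmatrix},
$$

whose entries are exactly `a_00`, `a_01`, `a_11`, `b_0`, `b_1` in the code. The system is inverted explicitly whenever the determinant $a_{00}a_{11} - a_{01}^2$ is non-zero; otherwise scale and shift stay at zero for that image.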
/midas/base_model.py:
--------------------------------------------------------------------------------
1 | import torch
2 |
3 |
4 | class BaseModel(torch.nn.Module):
5 | def load(self, path):
6 | """Load model from file.
7 |
8 | Args:
9 | path (str): file path
10 | """
11 | parameters = torch.load(path, map_location=torch.device('cpu'))
12 |
13 | if "optimizer" in parameters:
14 | parameters = parameters["model"]
15 |
16 | self.load_state_dict(parameters)
17 |
--------------------------------------------------------------------------------
/midas/blocks.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 |
4 | from .vit import (
5 | _make_pretrained_vitb_rn50_384,
6 | _make_pretrained_vitl16_384,
7 | _make_pretrained_vitb16_384,
8 | forward_vit,
9 | )
10 |
11 | def _make_encoder(backbone, features, use_pretrained, groups=1, expand=False, exportable=True, hooks=None, use_vit_only=False, use_readout="ignore",):
12 | if backbone == "vitl16_384":
13 | pretrained = _make_pretrained_vitl16_384(
14 | use_pretrained, hooks=hooks, use_readout=use_readout
15 | )
16 | scratch = _make_scratch(
17 | [256, 512, 1024, 1024], features, groups=groups, expand=expand
18 | ) # ViT-L/16 - 85.0% Top1 (backbone)
19 | elif backbone == "vitb_rn50_384":
20 | pretrained = _make_pretrained_vitb_rn50_384(
21 | use_pretrained,
22 | hooks=hooks,
23 | use_vit_only=use_vit_only,
24 | use_readout=use_readout,
25 | )
26 | scratch = _make_scratch(
27 | [256, 512, 768, 768], features, groups=groups, expand=expand
28 | ) # ViT-B/16 + ResNet-50 hybrid (backbone)
29 | elif backbone == "vitb16_384":
30 | pretrained = _make_pretrained_vitb16_384(
31 | use_pretrained, hooks=hooks, use_readout=use_readout
32 | )
33 | scratch = _make_scratch(
34 | [96, 192, 384, 768], features, groups=groups, expand=expand
35 | ) # ViT-B/16 - 84.6% Top1 (backbone)
36 | elif backbone == "resnext101_wsl":
37 | pretrained = _make_pretrained_resnext101_wsl(use_pretrained)
38 | scratch = _make_scratch([256, 512, 1024, 2048], features, groups=groups, expand=expand) # resnext101_wsl
39 | elif backbone == "efficientnet_lite3":
40 | pretrained = _make_pretrained_efficientnet_lite3(use_pretrained, exportable=exportable)
41 | scratch = _make_scratch([32, 48, 136, 384], features, groups=groups, expand=expand) # efficientnet_lite3
42 | else:
43 | print(f"Backbone '{backbone}' not implemented")
44 | assert False
45 |
46 | return pretrained, scratch
47 |
48 |
49 | def _make_scratch(in_shape, out_shape, groups=1, expand=False):
50 | scratch = nn.Module()
51 |
52 | out_shape1 = out_shape
53 | out_shape2 = out_shape
54 | out_shape3 = out_shape
55 | out_shape4 = out_shape
56 | if expand==True:
57 | out_shape1 = out_shape
58 | out_shape2 = out_shape*2
59 | out_shape3 = out_shape*4
60 | out_shape4 = out_shape*8
61 |
62 | scratch.layer1_rn = nn.Conv2d(
63 | in_shape[0], out_shape1, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
64 | )
65 | scratch.layer2_rn = nn.Conv2d(
66 | in_shape[1], out_shape2, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
67 | )
68 | scratch.layer3_rn = nn.Conv2d(
69 | in_shape[2], out_shape3, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
70 | )
71 | scratch.layer4_rn = nn.Conv2d(
72 | in_shape[3], out_shape4, kernel_size=3, stride=1, padding=1, bias=False, groups=groups
73 | )
74 |
75 | return scratch
76 |
77 |
78 | def _make_pretrained_efficientnet_lite3(use_pretrained, exportable=False):
79 | efficientnet = torch.hub.load(
80 | "rwightman/gen-efficientnet-pytorch",
81 | "tf_efficientnet_lite3",
82 | pretrained=use_pretrained,
83 | exportable=exportable
84 | )
85 | return _make_efficientnet_backbone(efficientnet)
86 |
87 |
88 | def _make_efficientnet_backbone(effnet):
89 | pretrained = nn.Module()
90 |
91 | pretrained.layer1 = nn.Sequential(
92 | effnet.conv_stem, effnet.bn1, effnet.act1, *effnet.blocks[0:2]
93 | )
94 | pretrained.layer2 = nn.Sequential(*effnet.blocks[2:3])
95 | pretrained.layer3 = nn.Sequential(*effnet.blocks[3:5])
96 | pretrained.layer4 = nn.Sequential(*effnet.blocks[5:9])
97 |
98 | return pretrained
99 |
100 |
101 | def _make_resnet_backbone(resnet):
102 | pretrained = nn.Module()
103 | pretrained.layer1 = nn.Sequential(
104 | resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool, resnet.layer1
105 | )
106 |
107 | pretrained.layer2 = resnet.layer2
108 | pretrained.layer3 = resnet.layer3
109 | pretrained.layer4 = resnet.layer4
110 |
111 | return pretrained
112 |
113 |
114 | def _make_pretrained_resnext101_wsl(use_pretrained):
115 | resnet = torch.hub.load("facebookresearch/WSL-Images", "resnext101_32x8d_wsl")
116 | return _make_resnet_backbone(resnet)
117 |
118 |
119 |
120 | class Interpolate(nn.Module):
121 | """Interpolation module.
122 | """
123 |
124 | def __init__(self, scale_factor, mode, align_corners=False):
125 | """Init.
126 |
127 | Args:
128 | scale_factor (float): scaling
129 | mode (str): interpolation mode
130 | """
131 | super(Interpolate, self).__init__()
132 |
133 | self.interp = nn.functional.interpolate
134 | self.scale_factor = scale_factor
135 | self.mode = mode
136 | self.align_corners = align_corners
137 |
138 | def forward(self, x):
139 | """Forward pass.
140 |
141 | Args:
142 | x (tensor): input
143 |
144 | Returns:
145 | tensor: interpolated data
146 | """
147 |
148 | x = self.interp(
149 | x, scale_factor=self.scale_factor, mode=self.mode, align_corners=self.align_corners
150 | )
151 |
152 | return x
153 |
154 |
155 | class ResidualConvUnit(nn.Module):
156 | """Residual convolution module.
157 | """
158 |
159 | def __init__(self, features):
160 | """Init.
161 |
162 | Args:
163 | features (int): number of features
164 | """
165 | super().__init__()
166 |
167 | self.conv1 = nn.Conv2d(
168 | features, features, kernel_size=3, stride=1, padding=1, bias=True
169 | )
170 |
171 | self.conv2 = nn.Conv2d(
172 | features, features, kernel_size=3, stride=1, padding=1, bias=True
173 | )
174 |
175 | self.relu = nn.ReLU(inplace=True)
176 |
177 | def forward(self, x):
178 | """Forward pass.
179 |
180 | Args:
181 | x (tensor): input
182 |
183 | Returns:
184 | tensor: output
185 | """
186 | out = self.relu(x)
187 | out = self.conv1(out)
188 | out = self.relu(out)
189 | out = self.conv2(out)
190 |
191 | return out + x
192 |
193 |
194 | class FeatureFusionBlock(nn.Module):
195 | """Feature fusion block.
196 | """
197 |
198 | def __init__(self, features):
199 | """Init.
200 |
201 | Args:
202 | features (int): number of features
203 | """
204 | super(FeatureFusionBlock, self).__init__()
205 |
206 | self.resConfUnit1 = ResidualConvUnit(features)
207 | self.resConfUnit2 = ResidualConvUnit(features)
208 |
209 | def forward(self, *xs):
210 | """Forward pass.
211 |
212 | Returns:
213 | tensor: output
214 | """
215 | output = xs[0]
216 |
217 | if len(xs) == 2:
218 | output += self.resConfUnit1(xs[1])
219 |
220 | output = self.resConfUnit2(output)
221 |
222 | output = nn.functional.interpolate(
223 | output, scale_factor=2, mode="bilinear", align_corners=True
224 | )
225 |
226 | return output
227 |
228 |
229 |
230 |
231 | class ResidualConvUnit_custom(nn.Module):
232 | """Residual convolution module.
233 | """
234 |
235 | def __init__(self, features, activation, bn):
236 | """Init.
237 |
238 | Args:
239 | features (int): number of features
240 | """
241 | super().__init__()
242 |
243 | self.bn = bn
244 |
245 | self.groups=1
246 |
247 | self.conv1 = nn.Conv2d(
248 | features, features, kernel_size=3, stride=1, padding=1, bias=True, groups=self.groups
249 | )
250 |
251 | self.conv2 = nn.Conv2d(
252 | features, features, kernel_size=3, stride=1, padding=1, bias=True, groups=self.groups
253 | )
254 |
255 | if self.bn==True:
256 | self.bn1 = nn.BatchNorm2d(features)
257 | self.bn2 = nn.BatchNorm2d(features)
258 |
259 | self.activation = activation
260 |
261 | self.skip_add = nn.quantized.FloatFunctional()
262 |
263 | def forward(self, x):
264 | """Forward pass.
265 |
266 | Args:
267 | x (tensor): input
268 |
269 | Returns:
270 | tensor: output
271 | """
272 |
273 | out = self.activation(x)
274 | out = self.conv1(out)
275 | if self.bn==True:
276 | out = self.bn1(out)
277 |
278 | out = self.activation(out)
279 | out = self.conv2(out)
280 | if self.bn==True:
281 | out = self.bn2(out)
282 |
283 | if self.groups > 1:
284 | out = self.conv_merge(out)
285 |
286 | return self.skip_add.add(out, x)
287 |
288 | # return out + x
289 |
290 |
291 | class FeatureFusionBlock_custom(nn.Module):
292 | """Feature fusion block.
293 | """
294 |
295 | def __init__(self, features, activation, deconv=False, bn=False, expand=False, align_corners=True):
296 | """Init.
297 |
298 | Args:
299 | features (int): number of features
300 | """
301 | super(FeatureFusionBlock_custom, self).__init__()
302 |
303 | self.deconv = deconv
304 | self.align_corners = align_corners
305 |
306 | self.groups=1
307 |
308 | self.expand = expand
309 | out_features = features
310 | if self.expand==True:
311 | out_features = features//2
312 |
313 | self.out_conv = nn.Conv2d(features, out_features, kernel_size=1, stride=1, padding=0, bias=True, groups=1)
314 |
315 | self.resConfUnit1 = ResidualConvUnit_custom(features, activation, bn)
316 | self.resConfUnit2 = ResidualConvUnit_custom(features, activation, bn)
317 |
318 | self.skip_add = nn.quantized.FloatFunctional()
319 |
320 | def forward(self, *xs):
321 | """Forward pass.
322 |
323 | Returns:
324 | tensor: output
325 | """
326 | output = xs[0]
327 |
328 | if len(xs) == 2:
329 | res = self.resConfUnit1(xs[1])
330 | output = self.skip_add.add(output, res)
331 | # output += res
332 |
333 | output = self.resConfUnit2(output)
334 |
335 | output = nn.functional.interpolate(
336 | output, scale_factor=2, mode="bilinear", align_corners=self.align_corners
337 | )
338 |
339 | output = self.out_conv(output)
340 |
341 | return output
342 |
343 |
--------------------------------------------------------------------------------
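A short sketch (not repository code) of how a custom fusion block is used: it merges a coarser decoder path with an encoder skip connection and upsamples the result by a factor of 2, as in the DPT refinenet stages. It assumes the project dependencies are installed (importing `midas.blocks` pulls in `midas.vit` and its `timm` dependency); the tensor shapes are illustrative.

```python
# Minimal sketch: fuse a decoder path with a skip connection and upsample x2.
import torch
import torch.nn as nn
from midas.blocks import FeatureFusionBlock_custom

block = FeatureFusionBlock_custom(features=256, activation=nn.ReLU(False), bn=False)

path_coarse = torch.randn(1, 256, 12, 12)  # current decoder path
skip = torch.randn(1, 256, 12, 12)         # encoder feature at the same scale

out = block(path_coarse, skip)
print(out.shape)  # torch.Size([1, 256, 24, 24]) -- fused, then upsampled by 2
```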
/midas/dpt_depth.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 | import torch.nn.functional as F
4 |
5 | from .base_model import BaseModel
6 | from .blocks import (
7 | FeatureFusionBlock,
8 | FeatureFusionBlock_custom,
9 | Interpolate,
10 | _make_encoder,
11 | forward_vit,
12 | )
13 |
14 |
15 | def _make_fusion_block(features, use_bn):
16 | return FeatureFusionBlock_custom(
17 | features,
18 | nn.ReLU(False),
19 | deconv=False,
20 | bn=use_bn,
21 | expand=False,
22 | align_corners=True,
23 | )
24 |
25 |
26 | class DPT(BaseModel):
27 | def __init__(
28 | self,
29 | head,
30 | features=256,
31 | backbone="vitb_rn50_384",
32 | readout="project",
33 | channels_last=False,
34 | use_bn=False,
35 | ):
36 |
37 | super(DPT, self).__init__()
38 |
39 | self.channels_last = channels_last
40 |
41 | hooks = {
42 | "vitb_rn50_384": [0, 1, 8, 11],
43 | "vitb16_384": [2, 5, 8, 11],
44 | "vitl16_384": [5, 11, 17, 23],
45 | }
46 |
47 | # Instantiate backbone and reassemble blocks
48 | self.pretrained, self.scratch = _make_encoder(
49 | backbone,
50 | features,
51 | False, # Set to True if you want to train from scratch, starting from ImageNet backbone weights
52 | groups=1,
53 | expand=False,
54 | exportable=False,
55 | hooks=hooks[backbone],
56 | use_readout=readout,
57 | )
58 |
59 | self.scratch.refinenet1 = _make_fusion_block(features, use_bn)
60 | self.scratch.refinenet2 = _make_fusion_block(features, use_bn)
61 | self.scratch.refinenet3 = _make_fusion_block(features, use_bn)
62 | self.scratch.refinenet4 = _make_fusion_block(features, use_bn)
63 |
64 | self.scratch.output_conv = head
65 |
66 |
67 | def forward(self, x):
68 | if self.channels_last == True:
69 | x.contiguous(memory_format=torch.channels_last)
70 |
71 | layer_1, layer_2, layer_3, layer_4 = forward_vit(self.pretrained, x)
72 |
73 | layer_1_rn = self.scratch.layer1_rn(layer_1)
74 | layer_2_rn = self.scratch.layer2_rn(layer_2)
75 | layer_3_rn = self.scratch.layer3_rn(layer_3)
76 | layer_4_rn = self.scratch.layer4_rn(layer_4)
77 |
78 | path_4 = self.scratch.refinenet4(layer_4_rn)
79 | path_3 = self.scratch.refinenet3(path_4, layer_3_rn)
80 | path_2 = self.scratch.refinenet2(path_3, layer_2_rn)
81 | path_1 = self.scratch.refinenet1(path_2, layer_1_rn)
82 |
83 | out = self.scratch.output_conv(path_1)
84 |
85 | return out
86 |
87 |
88 | class DPTDepthModel(DPT):
89 | def __init__(self, path=None, non_negative=True, **kwargs):
90 | features = kwargs["features"] if "features" in kwargs else 256
91 |
92 | head = nn.Sequential(
93 | nn.Conv2d(features, features // 2, kernel_size=3, stride=1, padding=1),
94 | Interpolate(scale_factor=2, mode="bilinear", align_corners=True),
95 | nn.Conv2d(features // 2, 32, kernel_size=3, stride=1, padding=1),
96 | nn.ReLU(True),
97 | nn.Conv2d(32, 1, kernel_size=1, stride=1, padding=0),
98 | nn.ReLU(True) if non_negative else nn.Identity(),
99 | nn.Identity(),
100 | )
101 |
102 | super().__init__(head, **kwargs)
103 |
104 | if path is not None:
105 | self.load(path)
106 |
107 | def forward(self, x):
108 | return super().forward(x).squeeze(dim=1)
109 |
110 |
--------------------------------------------------------------------------------
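For orientation, a minimal sketch (not repository code) of a forward pass through `DPTDepthModel`: input is a normalized 384x384 RGB batch, output is an inverse-depth map squeezed over the channel dimension. It assumes the dependencies in requirements.txt are installed; with `path=None` no MiDaS checkpoint is loaded, so the output values are meaningless and only the shapes are of interest.

```python
# Minimal sketch: run DPTDepthModel on a dummy input to inspect tensor shapes.
import torch
from midas.dpt_depth import DPTDepthModel

model = DPTDepthModel(path=None, backbone="vitl16_384", non_negative=True).eval()

x = torch.randn(1, 3, 384, 384)  # one 384x384 RGB image (normalization is done upstream)
with torch.no_grad():
    depth = model(x)             # forward pass, channel dim squeezed in DPTDepthModel.forward
print(depth.shape)               # torch.Size([1, 384, 384])
```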
/midas/midas_net.py:
--------------------------------------------------------------------------------
1 | """MidasNet: Network for monocular depth estimation trained by mixing several datasets.
2 | This file contains code that is adapted from
3 | https://github.com/thomasjpfan/pytorch_refinenet/blob/master/pytorch_refinenet/refinenet/refinenet_4cascade.py
4 | """
5 | import torch
6 | import torch.nn as nn
7 |
8 | from .base_model import BaseModel
9 | from .blocks import FeatureFusionBlock, Interpolate, _make_encoder
10 |
11 |
12 | class MidasNet(BaseModel):
13 | """Network for monocular depth estimation.
14 | """
15 |
16 | def __init__(self, path=None, features=256, non_negative=True):
17 | """Init.
18 |
19 | Args:
20 | path (str, optional): Path to saved model. Defaults to None.
21 | features (int, optional): Number of features. Defaults to 256.
22 | backbone (str, optional): Backbone network for encoder. Fixed to resnext101_wsl here.
23 | """
24 | print("Loading weights: ", path)
25 |
26 | super(MidasNet, self).__init__()
27 |
28 | use_pretrained = False if path is None else True
29 |
30 | self.pretrained, self.scratch = _make_encoder(backbone="resnext101_wsl", features=features, use_pretrained=use_pretrained)
31 |
32 | self.scratch.refinenet4 = FeatureFusionBlock(features)
33 | self.scratch.refinenet3 = FeatureFusionBlock(features)
34 | self.scratch.refinenet2 = FeatureFusionBlock(features)
35 | self.scratch.refinenet1 = FeatureFusionBlock(features)
36 |
37 | self.scratch.output_conv = nn.Sequential(
38 | nn.Conv2d(features, 128, kernel_size=3, stride=1, padding=1),
39 | Interpolate(scale_factor=2, mode="bilinear"),
40 | nn.Conv2d(128, 32, kernel_size=3, stride=1, padding=1),
41 | nn.ReLU(True),
42 | nn.Conv2d(32, 1, kernel_size=1, stride=1, padding=0),
43 | nn.ReLU(True) if non_negative else nn.Identity(),
44 | )
45 |
46 | if path:
47 | self.load(path)
48 |
49 | def forward(self, x):
50 | """Forward pass.
51 |
52 | Args:
53 | x (tensor): input data (image)
54 |
55 | Returns:
56 | tensor: depth
57 | """
58 |
59 | layer_1 = self.pretrained.layer1(x)
60 | layer_2 = self.pretrained.layer2(layer_1)
61 | layer_3 = self.pretrained.layer3(layer_2)
62 | layer_4 = self.pretrained.layer4(layer_3)
63 |
64 | layer_1_rn = self.scratch.layer1_rn(layer_1)
65 | layer_2_rn = self.scratch.layer2_rn(layer_2)
66 | layer_3_rn = self.scratch.layer3_rn(layer_3)
67 | layer_4_rn = self.scratch.layer4_rn(layer_4)
68 |
69 | path_4 = self.scratch.refinenet4(layer_4_rn)
70 | path_3 = self.scratch.refinenet3(path_4, layer_3_rn)
71 | path_2 = self.scratch.refinenet2(path_3, layer_2_rn)
72 | path_1 = self.scratch.refinenet1(path_2, layer_1_rn)
73 |
74 | out = self.scratch.output_conv(path_1)
75 |
76 | return torch.squeeze(out, dim=1)
77 |
--------------------------------------------------------------------------------
/midas/midas_net_custom.py:
--------------------------------------------------------------------------------
1 | """MidashNet: Network for monocular depth estimation trained by mixing several datasets.
2 | This file contains code that is adapted from
3 | https://github.com/thomasjpfan/pytorch_refinenet/blob/master/pytorch_refinenet/refinenet/refinenet_4cascade.py
4 | """
5 | import torch
6 | import torch.nn as nn
7 |
8 | from .base_model import BaseModel
9 | from .blocks import FeatureFusionBlock, FeatureFusionBlock_custom, Interpolate, _make_encoder
10 |
11 |
12 | class MidasNet_small(BaseModel):
13 | """Network for monocular depth estimation.
14 | """
15 |
16 | def __init__(self, path=None, features=64, backbone="efficientnet_lite3", non_negative=True, exportable=True, channels_last=False, align_corners=True,
17 | blocks={'expand': True}):
18 | """Init.
19 |
20 | Args:
21 | path (str, optional): Path to saved model. Defaults to None.
22 | features (int, optional): Number of features. Defaults to 64.
23 | backbone (str, optional): Backbone network for encoder. Defaults to efficientnet_lite3.
24 | """
25 | print("Loading weights: ", path)
26 |
27 | super(MidasNet_small, self).__init__()
28 |
29 | use_pretrained = False if path else True
30 |
31 | self.channels_last = channels_last
32 | self.blocks = blocks
33 | self.backbone = backbone
34 |
35 | self.groups = 1
36 |
37 | features1=features
38 | features2=features
39 | features3=features
40 | features4=features
41 | self.expand = False
42 | if "expand" in self.blocks and self.blocks['expand'] == True:
43 | self.expand = True
44 | features1=features
45 | features2=features*2
46 | features3=features*4
47 | features4=features*8
48 |
49 | self.pretrained, self.scratch = _make_encoder(self.backbone, features, use_pretrained, groups=self.groups, expand=self.expand, exportable=exportable)
50 |
51 | self.scratch.activation = nn.ReLU(False)
52 |
53 | self.scratch.refinenet4 = FeatureFusionBlock_custom(features4, self.scratch.activation, deconv=False, bn=False, expand=self.expand, align_corners=align_corners)
54 | self.scratch.refinenet3 = FeatureFusionBlock_custom(features3, self.scratch.activation, deconv=False, bn=False, expand=self.expand, align_corners=align_corners)
55 | self.scratch.refinenet2 = FeatureFusionBlock_custom(features2, self.scratch.activation, deconv=False, bn=False, expand=self.expand, align_corners=align_corners)
56 | self.scratch.refinenet1 = FeatureFusionBlock_custom(features1, self.scratch.activation, deconv=False, bn=False, align_corners=align_corners)
57 |
58 |
59 | self.scratch.output_conv = nn.Sequential(
60 | nn.Conv2d(features, features//2, kernel_size=3, stride=1, padding=1, groups=self.groups),
61 | Interpolate(scale_factor=2, mode="bilinear"),
62 | nn.Conv2d(features//2, 32, kernel_size=3, stride=1, padding=1),
63 | self.scratch.activation,
64 | nn.Conv2d(32, 1, kernel_size=1, stride=1, padding=0),
65 | nn.ReLU(True) if non_negative else nn.Identity(),
66 | nn.Identity(),
67 | )
68 |
69 | if path:
70 | self.load(path)
71 |
72 |
73 | def forward(self, x):
74 | """Forward pass.
75 |
76 | Args:
77 | x (tensor): input data (image)
78 |
79 | Returns:
80 | tensor: depth
81 | """
82 | if self.channels_last==True:
83 | print("self.channels_last = ", self.channels_last)
84 | x = x.contiguous(memory_format=torch.channels_last)
85 |
86 |
87 | layer_1 = self.pretrained.layer1(x)
88 | layer_2 = self.pretrained.layer2(layer_1)
89 | layer_3 = self.pretrained.layer3(layer_2)
90 | layer_4 = self.pretrained.layer4(layer_3)
91 |
92 | layer_1_rn = self.scratch.layer1_rn(layer_1)
93 | layer_2_rn = self.scratch.layer2_rn(layer_2)
94 | layer_3_rn = self.scratch.layer3_rn(layer_3)
95 | layer_4_rn = self.scratch.layer4_rn(layer_4)
96 |
97 |
98 | path_4 = self.scratch.refinenet4(layer_4_rn)
99 | path_3 = self.scratch.refinenet3(path_4, layer_3_rn)
100 | path_2 = self.scratch.refinenet2(path_3, layer_2_rn)
101 | path_1 = self.scratch.refinenet1(path_2, layer_1_rn)
102 |
103 | out = self.scratch.output_conv(path_1)
104 |
105 | return torch.squeeze(out, dim=1)
106 |
107 |
108 |
109 | def fuse_model(m):
110 | prev_previous_type = nn.Identity()
111 | prev_previous_name = ''
112 | previous_type = nn.Identity()
113 | previous_name = ''
114 | for name, module in m.named_modules():
115 | if prev_previous_type == nn.Conv2d and previous_type == nn.BatchNorm2d and type(module) == nn.ReLU:
116 | # print("FUSED ", prev_previous_name, previous_name, name)
117 | torch.quantization.fuse_modules(m, [prev_previous_name, previous_name, name], inplace=True)
118 | elif prev_previous_type == nn.Conv2d and previous_type == nn.BatchNorm2d:
119 | # print("FUSED ", prev_previous_name, previous_name)
120 | torch.quantization.fuse_modules(m, [prev_previous_name, previous_name], inplace=True)
121 | # elif previous_type == nn.Conv2d and type(module) == nn.ReLU:
122 | # print("FUSED ", previous_name, name)
123 | # torch.quantization.fuse_modules(m, [previous_name, name], inplace=True)
124 |
125 | prev_previous_type = previous_type
126 | prev_previous_name = previous_name
127 | previous_type = type(module)
128 | previous_name = name
--------------------------------------------------------------------------------
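`fuse_model` above merges consecutive Conv2d + BatchNorm2d (+ ReLU) modules in place with `torch.quantization.fuse_modules`, a typical preparation step before quantizing the small model. A hedged sketch of how it could be called (instantiating `MidasNet_small` without a checkpoint may download the EfficientNet-Lite3 encoder weights):

```python
import torch
from midas.midas_net_custom import MidasNet_small, fuse_model

model = MidasNet_small(path=None, exportable=True)
model.eval()  # fusion expects modules in eval mode

# Fuse Conv+BN(+ReLU) sequences found in module order; fused blocks run faster
# and are required for post-training static quantization workflows.
fuse_model(model)
```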
/midas/transforms.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import cv2
3 | import math
4 | import random
5 |
6 |
7 | def apply_min_size(sample, size, image_interpolation_method=cv2.INTER_AREA):
8 | """Rezise the sample to ensure the given size. Keeps aspect ratio.
9 |
10 | Args:
11 | sample (dict): sample
12 | size (tuple): image size
13 |
14 | Returns:
15 | tuple: new size
16 | """
17 | shape = list(sample["disparity"].shape)
18 |
19 | if shape[0] >= size[0] and shape[1] >= size[1]:
20 | return sample
21 |
22 | scale = [0, 0]
23 | scale[0] = size[0] / shape[0]
24 | scale[1] = size[1] / shape[1]
25 |
26 | scale = max(scale)
27 |
28 | shape[0] = math.ceil(scale * shape[0])
29 | shape[1] = math.ceil(scale * shape[1])
30 |
31 | # resize
32 | sample["image"] = cv2.resize(
33 | sample["image"], tuple(shape[::-1]), interpolation=image_interpolation_method
34 | )
35 |
36 | sample["disparity"] = cv2.resize(
37 | sample["disparity"], tuple(shape[::-1]), interpolation=cv2.INTER_NEAREST
38 | )
39 | sample["mask"] = cv2.resize(
40 | sample["mask"].astype(np.float32),
41 | tuple(shape[::-1]),
42 | interpolation=cv2.INTER_NEAREST,
43 | )
44 | sample["mask"] = sample["mask"].astype(bool)
45 |
46 | return tuple(shape)
47 |
48 | class ResizeTrain(object):
49 | """Resize sample to given size (width, height)."""
50 | def __init__(
51 | self,
52 | width,
53 | height,
54 | resize_target=True,
55 | keep_aspect_ratio=False,
56 | ensure_multiple_of=1,
57 | resize_method="lower_bound",
58 | image_interpolation_method=cv2.INTER_AREA,
59 | ):
60 | """Init.
61 |
62 | Args:
63 | width (int): desired output width
64 | height (int): desired output height
65 | resize_target (bool, optional):
66 | True: Resize the full sample (image, mask, target).
67 | False: Resize image only.
68 | Defaults to True.
69 | keep_aspect_ratio (bool, optional):
70 | True: Keep the aspect ratio of the input sample.
71 | Output sample might not have the given width and height, and
72 | resize behaviour depends on the parameter 'resize_method'.
73 | Defaults to False.
74 | ensure_multiple_of (int, optional):
75 | Output width and height are constrained to be a multiple of this parameter.
76 | Defaults to 1.
77 | resize_method (str, optional):
78 | "lower_bound": Output will be at least as large as the given size.
79 | "upper_bound": Output will be at max as large as the given size. (Output size might be smaller than given size.)
80 | "minimal": Scale as least as possible. (Output size might be smaller than given size.)
81 | Defaults to "lower_bound".
82 | """
83 | self.__width = width
84 | self.__height = height
85 |
86 | self.__resize_target = resize_target
87 | self.__keep_aspect_ratio = keep_aspect_ratio
88 | self.__multiple_of = ensure_multiple_of
89 | self.__resize_method = resize_method
90 | self.__image_interpolation_method = image_interpolation_method
91 |
92 | def constrain_to_multiple_of(self, x, min_val=0, max_val=None):
93 | y = (np.round(x / self.__multiple_of) * self.__multiple_of).astype(int)
94 |
95 | if max_val is not None and y > max_val:
96 | y = (np.floor(x / self.__multiple_of) * self.__multiple_of).astype(int)
97 |
98 | if y < min_val:
99 | y = (np.ceil(x / self.__multiple_of) * self.__multiple_of).astype(int)
100 |
101 | return y
102 |
103 | def get_size(self, width, height):
104 | # determine new height and width
105 | scale_height = self.__height / height
106 | scale_width = self.__width / width
107 |
108 | if self.__keep_aspect_ratio:
109 | if self.__resize_method == "lower_bound":
110 | # scale such that output size is lower bound
111 | if scale_width > scale_height:
112 | # fit width
113 | scale_height = scale_width
114 | else:
115 | # fit height
116 | scale_width = scale_height
117 | elif self.__resize_method == "upper_bound":
118 | # scale such that output size is upper bound
119 | if scale_width < scale_height:
120 | # fit width
121 | scale_height = scale_width
122 | else:
123 | # fit height
124 | scale_width = scale_height
125 | elif self.__resize_method == "minimal":
126 | # scale as little as possible
127 | if abs(1 - scale_width) < abs(1 - scale_height):
128 | # fit width
129 | scale_height = scale_width
130 | else:
131 | # fit height
132 | scale_width = scale_height
133 | else:
134 | raise ValueError(
135 | f"resize_method {self.__resize_method} not implemented"
136 | )
137 |
138 | if self.__resize_method == "lower_bound":
139 | new_height = self.constrain_to_multiple_of(
140 | scale_height * height, min_val=self.__height
141 | )
142 | new_width = self.constrain_to_multiple_of(
143 | scale_width * width, min_val=self.__width
144 | )
145 | elif self.__resize_method == "upper_bound":
146 | new_height = self.constrain_to_multiple_of(
147 | scale_height * height, max_val=self.__height
148 | )
149 | new_width = self.constrain_to_multiple_of(
150 | scale_width * width, max_val=self.__width
151 | )
152 | elif self.__resize_method == "minimal":
153 | new_height = self.constrain_to_multiple_of(scale_height * height)
154 | new_width = self.constrain_to_multiple_of(scale_width * width)
155 | else:
156 | raise ValueError(f"resize_method {self.__resize_method} not implemented")
157 |
158 | return (new_width, new_height)
159 |
160 | def __call__(self, sample):
161 | width, height = self.get_size(
162 | sample["image"].shape[1], sample["image"].shape[0]
163 | )
164 |
165 | # resize sample
166 | sample["image"] = cv2.resize(
167 | sample["image"],
168 | (width, height),
169 | interpolation=self.__image_interpolation_method,
170 | )
171 | sample["image"] = np.clip(sample["image"], 0, 1)
172 |
173 | if self.__resize_target:
174 | if "disparity" in sample:
175 | sample["disparity"] = cv2.resize(
176 | sample["disparity"],
177 | (width, height),
178 | interpolation=cv2.INTER_NEAREST,
179 | )
180 |
181 | if "depth" in sample:
182 | sample["depth"] = cv2.resize(
183 | sample["depth"], (width, height), interpolation=cv2.INTER_NEAREST
184 | )
185 |
186 | if "mask" in sample:
187 | sample["mask"] = cv2.resize(
188 | sample["mask"].astype(np.float32),
189 | (width, height),
190 | interpolation=cv2.INTER_NEAREST,
191 | )
192 | sample["mask"] = sample["mask"].astype(bool)
193 | return sample
194 |
195 |
196 |
197 | class Resize(object):
198 | """Resize sample to given size (width, height).
199 | """
200 |
201 | def __init__(
202 | self,
203 | width,
204 | height,
205 | resize_target=True,
206 | keep_aspect_ratio=False,
207 | ensure_multiple_of=1,
208 | resize_method="lower_bound",
209 | image_interpolation_method=cv2.INTER_AREA,
210 | ):
211 | """Init.
212 |
213 | Args:
214 | width (int): desired output width
215 | height (int): desired output height
216 | resize_target (bool, optional):
217 | True: Resize the full sample (image, mask, target).
218 | False: Resize image only.
219 | Defaults to True.
220 | keep_aspect_ratio (bool, optional):
221 | True: Keep the aspect ratio of the input sample.
222 | Output sample might not have the given width and height, and
223 | resize behaviour depends on the parameter 'resize_method'.
224 | Defaults to False.
225 | ensure_multiple_of (int, optional):
226 | Output width and height are constrained to be a multiple of this parameter.
227 | Defaults to 1.
228 | resize_method (str, optional):
229 | "lower_bound": Output will be at least as large as the given size.
230 | "upper_bound": Output will be at max as large as the given size. (Output size might be smaller than given size.)
231 | "minimal": Scale as least as possible. (Output size might be smaller than given size.)
232 | Defaults to "lower_bound".
233 | """
234 | self.__width = width
235 | self.__height = height
236 |
237 | self.__resize_target = resize_target
238 | self.__keep_aspect_ratio = keep_aspect_ratio
239 | self.__multiple_of = ensure_multiple_of
240 | self.__resize_method = resize_method
241 | self.__image_interpolation_method = image_interpolation_method
242 |
243 | def constrain_to_multiple_of(self, x, min_val=0, max_val=None):
244 | y = (np.round(x / self.__multiple_of) * self.__multiple_of).astype(int)
245 |
246 | if max_val is not None and y > max_val:
247 | y = (np.floor(x / self.__multiple_of) * self.__multiple_of).astype(int)
248 |
249 | if y < min_val:
250 | y = (np.ceil(x / self.__multiple_of) * self.__multiple_of).astype(int)
251 |
252 | return y
253 |
254 | def get_size(self, width, height):
255 | # determine new height and width
256 | scale_height = self.__height / height
257 | scale_width = self.__width / width
258 |
259 | if self.__keep_aspect_ratio:
260 | if self.__resize_method == "lower_bound":
261 | # scale such that output size is lower bound
262 | if scale_width > scale_height:
263 | # fit width
264 | scale_height = scale_width
265 | else:
266 | # fit height
267 | scale_width = scale_height
268 | elif self.__resize_method == "upper_bound":
269 | # scale such that output size is upper bound
270 | if scale_width < scale_height:
271 | # fit width
272 | scale_height = scale_width
273 | else:
274 | # fit height
275 | scale_width = scale_height
276 | elif self.__resize_method == "minimal":
277 | # scale as little as possible
278 | if abs(1 - scale_width) < abs(1 - scale_height):
279 | # fit width
280 | scale_height = scale_width
281 | else:
282 | # fit height
283 | scale_width = scale_height
284 | else:
285 | raise ValueError(
286 | f"resize_method {self.__resize_method} not implemented"
287 | )
288 |
289 | if self.__resize_method == "lower_bound":
290 | new_height = self.constrain_to_multiple_of(
291 | scale_height * height, min_val=self.__height
292 | )
293 | new_width = self.constrain_to_multiple_of(
294 | scale_width * width, min_val=self.__width
295 | )
296 | elif self.__resize_method == "upper_bound":
297 | new_height = self.constrain_to_multiple_of(
298 | scale_height * height, max_val=self.__height
299 | )
300 | new_width = self.constrain_to_multiple_of(
301 | scale_width * width, max_val=self.__width
302 | )
303 | elif self.__resize_method == "minimal":
304 | new_height = self.constrain_to_multiple_of(scale_height * height)
305 | new_width = self.constrain_to_multiple_of(scale_width * width)
306 | else:
307 | raise ValueError(f"resize_method {self.__resize_method} not implemented")
308 |
309 | return (new_width, new_height)
310 |
311 | def __call__(self, sample):
312 | width, height = self.get_size(
313 | sample["image"].shape[1], sample["image"].shape[0]
314 | )
315 |
316 | # resize sample
317 | sample["image"] = cv2.resize(
318 | sample["image"],
319 | (width, height),
320 | interpolation=self.__image_interpolation_method,
321 | )
322 | sample["image"] = np.clip(sample["image"], 0, 1)
323 |
324 | if self.__resize_target:
325 | if "disparity" in sample:
326 | sample["disparity"] = cv2.resize(
327 | sample["disparity"],
328 | (width, height),
329 | interpolation=cv2.INTER_NEAREST,
330 | )
331 |
332 | if "depth" in sample:
333 | sample["depth"] = cv2.resize(
334 | sample["depth"], (width, height), interpolation=cv2.INTER_NEAREST
335 | )
336 |
337 | if "mask" in sample:
338 | sample["mask"] = cv2.resize(
339 | sample["mask"].astype(np.float32),
340 | (width, height),
341 | interpolation=cv2.INTER_NEAREST,
342 | )
343 | sample["mask"] = sample["mask"].astype(bool)
344 | return sample
345 |
346 |
347 | class NormalizeImage(object):
348 | """Normlize image by given mean and std.
349 | """
350 |
351 | def __init__(self, mean, std):
352 | self.__mean = mean
353 | self.__std = std
354 |
355 | def __call__(self, sample):
356 | sample["image"] = (sample["image"] - self.__mean) / self.__std
357 |
358 | return sample
359 |
360 |
361 | class PrepareForNet(object):
362 | """Prepare sample for usage as network input.
363 | """
364 |
365 | def __init__(self):
366 | pass
367 |
368 | def __call__(self, sample):
369 | image = np.transpose(sample["image"], (2, 0, 1))
370 | sample["image"] = np.ascontiguousarray(image).astype(np.float32)
371 |
372 | if "mask" in sample:
373 | sample["mask"] = sample["mask"].astype(np.float32)
374 | sample["mask"] = np.ascontiguousarray(sample["mask"])
375 |
376 | if "disparity" in sample:
377 | disparity = sample["disparity"].astype(np.float32)
378 | sample["disparity"] = np.ascontiguousarray(disparity)
379 |
380 | if "depth" in sample:
381 | depth = sample["depth"].astype(np.float32)
382 | sample["depth"] = np.ascontiguousarray(depth)
383 |
384 | return sample
385 |
386 | class RandomCrop(object):
387 | def __init__(self, width, height):
388 | self.__width = width
389 | self.__height = height
390 |
391 | def __call__(self, sample):
392 | h, w = sample["image"].shape[:2]
393 | x = random.randint(0, w - self.__width)
394 | y = random.randint(0, h - self.__height)
395 |
396 | sample["image"] = sample["image"][y : y + self.__height, x : x + self.__width, :]
397 |
398 | if "mask" in sample:
399 | sample["mask"] = sample["mask"][y : y + self.__height, x : x + self.__width]
400 |
401 | if "disparity" in sample:
402 | sample["disparity"] = sample["disparity"][y : y + self.__height, x : x + self.__width]
403 |
404 | if "depth" in sample:
405 | sample["depth"] = sample["depth"][y : y + self.__height, x : x + self.__width]
406 |
407 | return sample
408 |
409 | class MirrorSquarePad(object):
410 | def __call__(self, sample):
411 | h, w = sample["image"].shape[:2]
412 |
413 | if h > w:
414 | new_h = h
415 | new_w = h
416 | else:
417 | new_h = w
418 | new_w = w
419 |
420 | sample["image"] = cv2.copyMakeBorder(sample["image"],
421 | (new_h-h)//2,
422 | (new_h-h) - (new_h-h)//2,
423 | (new_w-w)//2,
424 | (new_w-w) - (new_w-w)//2,
425 | cv2.BORDER_REFLECT_101)
426 |
427 | if "mask" in sample:
428 | sample["mask"] = cv2.copyMakeBorder(sample["mask"],
429 | (new_h-h)//2,
430 | (new_h-h) - (new_h-h)//2,
431 | (new_w-w)//2,
432 | (new_w-w) - (new_w-w)//2,
433 | cv2.BORDER_REFLECT_101)
434 |
435 | if "disparity" in sample:
436 | sample["disparity"] = cv2.copyMakeBorder(sample["disparity"],
437 | (new_h-h)//2,
438 | (new_h-h) - (new_h-h)//2,
439 | (new_w-w)//2,
440 | (new_w-w) - (new_w-w)//2,
441 | cv2.BORDER_REFLECT_101)
442 |
443 | if "depth" in sample:
444 |
445 | sample["depth"] = cv2.copyMakeBorder(sample["depth"],
446 | (new_h-h)//2,
447 | (new_h-h) - (new_h-h)//2,
448 | (new_w-w)//2,
449 | (new_w-w) - (new_w-w)//2,
450 | cv2.BORDER_REFLECT_101)
451 | return sample
452 |
453 |
454 | class RandomHorizontalFlip(object):
455 | def __init__(self, prob):
456 | self.__prob = prob
457 |
458 | def __call__(self, sample):
459 | cond = np.random.uniform(0, 1, 1)
460 | if cond > self.__prob:
461 | # NOTE: np.fliplr returns a negative-stride view, so we copy to get a contiguous array
462 | sample["image"] = np.fliplr(sample["image"] )
463 | sample["image"] = np.copy(sample["image"])
464 |
465 | if "mask" in sample:
466 | sample["mask"] = np.fliplr(sample["mask"] )
467 | sample["mask"] = np.copy(sample["mask"])
468 |
469 | if "disparity" in sample:
470 | sample["disparity"] = np.fliplr(sample["disparity"] )
471 | sample["disparity"] = np.copy(sample["disparity"])
472 |
473 | if "depth" in sample:
474 | sample["depth"] = np.fliplr(sample["depth"] )
475 | sample["depth"] = np.copy(sample["depth"])
476 |
477 | return sample
478 |
479 | class ColorAug(object):
480 | def __init__(self,
481 | gamma_low=0.8,
482 | gamma_high=1.2,
483 | brightness_low=0.5,
484 | brightness_high=1.2,
485 | color_low=0.8,
486 | color_high=1.2,
487 | prob=0.5,
488 | ):
489 | self.__gamma_low = gamma_low
490 | self.__gamma_high = gamma_high
491 | self.__brightness_low = brightness_low
492 | self.__brightness_high = brightness_high
493 | self.__color_low = color_low
494 | self.__color_high = color_high
495 | self.__prob = prob
496 |
497 | def __call__(self, sample):
498 | sample["image"] = np.clip(sample["image"], 0, 1)
499 | if np.random.uniform(0, 1, 1) < self.__prob:
500 | # randomly shift gamma
501 | random_gamma = np.random.uniform(self.__gamma_low, self.__gamma_high)
502 | sample["image"] = sample["image"] ** random_gamma
503 |
504 | # randomly shift brightness
505 | random_brightness = np.random.uniform(self.__brightness_low, self.__brightness_high)
506 | sample["image"] = sample["image"] * random_brightness
507 |
508 | if sample["image"].shape[2] == 3:
509 | # randomly shift color
510 | random_colors = np.random.uniform(self.__color_low, self.__color_high, 3)
511 | sample["image"] *= random_colors
512 |
513 | # saturate
514 | sample["image"] = np.clip(sample["image"], 0, 1)
515 | return sample
--------------------------------------------------------------------------------
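The transforms above are composed into the preprocessing pipeline used at inference time, as `run.py` does further below. A minimal sketch assuming an RGB float image in [0, 1] (e.g. from `utils.read_image`); the random array is only a placeholder:

```python
import numpy as np
from torchvision.transforms import Compose
from midas.transforms import Resize, NormalizeImage, PrepareForNet

transform = Compose([
    Resize(384, 384,
           resize_target=False,
           keep_aspect_ratio=True,
           ensure_multiple_of=32,
           resize_method="lower_bound"),
    NormalizeImage(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
    PrepareForNet(),
])

img = np.random.rand(480, 640, 3).astype(np.float32)  # placeholder RGB image in [0, 1]
out = transform({"image": img})["image"]
# CHW float32 array, e.g. (3, 384, 512) here: both sides >= 384 and multiples of 32.
```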
/midas/vit.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 | import timm
4 | import types
5 | import math
6 | import torch.nn.functional as F
7 |
8 |
9 | class Slice(nn.Module):
10 | def __init__(self, start_index=1):
11 | super(Slice, self).__init__()
12 | self.start_index = start_index
13 |
14 | def forward(self, x):
15 | return x[:, self.start_index :]
16 |
17 |
18 | class AddReadout(nn.Module):
19 | def __init__(self, start_index=1):
20 | super(AddReadout, self).__init__()
21 | self.start_index = start_index
22 |
23 | def forward(self, x):
24 | if self.start_index == 2:
25 | readout = (x[:, 0] + x[:, 1]) / 2
26 | else:
27 | readout = x[:, 0]
28 | return x[:, self.start_index :] + readout.unsqueeze(1)
29 |
30 |
31 | class ProjectReadout(nn.Module):
32 | def __init__(self, in_features, start_index=1):
33 | super(ProjectReadout, self).__init__()
34 | self.start_index = start_index
35 |
36 | self.project = nn.Sequential(nn.Linear(2 * in_features, in_features), nn.GELU())
37 |
38 | def forward(self, x):
39 | readout = x[:, 0].unsqueeze(1).expand_as(x[:, self.start_index :])
40 | features = torch.cat((x[:, self.start_index :], readout), -1)
41 |
42 | return self.project(features)
43 |
44 |
45 | class Transpose(nn.Module):
46 | def __init__(self, dim0, dim1):
47 | super(Transpose, self).__init__()
48 | self.dim0 = dim0
49 | self.dim1 = dim1
50 |
51 | def forward(self, x):
52 | x = x.transpose(self.dim0, self.dim1)
53 | return x
54 |
55 |
56 | def forward_vit(pretrained, x):
57 | b, c, h, w = x.shape
58 |
59 | glob = pretrained.model.forward_flex(x)
60 |
61 | layer_1 = pretrained.activations["1"]
62 | layer_2 = pretrained.activations["2"]
63 | layer_3 = pretrained.activations["3"]
64 | layer_4 = pretrained.activations["4"]
65 |
66 | layer_1 = pretrained.act_postprocess1[0:2](layer_1)
67 | layer_2 = pretrained.act_postprocess2[0:2](layer_2)
68 | layer_3 = pretrained.act_postprocess3[0:2](layer_3)
69 | layer_4 = pretrained.act_postprocess4[0:2](layer_4)
70 |
71 | unflatten = nn.Sequential(
72 | nn.Unflatten(
73 | 2,
74 | torch.Size(
75 | [
76 | h // pretrained.model.patch_size[1],
77 | w // pretrained.model.patch_size[0],
78 | ]
79 | ),
80 | )
81 | )
82 |
83 | if layer_1.ndim == 3:
84 | layer_1 = unflatten(layer_1)
85 | if layer_2.ndim == 3:
86 | layer_2 = unflatten(layer_2)
87 | if layer_3.ndim == 3:
88 | layer_3 = unflatten(layer_3)
89 | if layer_4.ndim == 3:
90 | layer_4 = unflatten(layer_4)
91 |
92 | layer_1 = pretrained.act_postprocess1[3 : len(pretrained.act_postprocess1)](layer_1)
93 | layer_2 = pretrained.act_postprocess2[3 : len(pretrained.act_postprocess2)](layer_2)
94 | layer_3 = pretrained.act_postprocess3[3 : len(pretrained.act_postprocess3)](layer_3)
95 | layer_4 = pretrained.act_postprocess4[3 : len(pretrained.act_postprocess4)](layer_4)
96 |
97 | return layer_1, layer_2, layer_3, layer_4
98 |
99 |
100 | def _resize_pos_embed(self, posemb, gs_h, gs_w):
101 | posemb_tok, posemb_grid = (
102 | posemb[:, : self.start_index],
103 | posemb[0, self.start_index :],
104 | )
105 |
106 | gs_old = int(math.sqrt(len(posemb_grid)))
107 |
108 | posemb_grid = posemb_grid.reshape(1, gs_old, gs_old, -1).permute(0, 3, 1, 2)
109 | posemb_grid = F.interpolate(posemb_grid, size=(gs_h, gs_w), mode="bilinear")
110 | posemb_grid = posemb_grid.permute(0, 2, 3, 1).reshape(1, gs_h * gs_w, -1)
111 |
112 | posemb = torch.cat([posemb_tok, posemb_grid], dim=1)
113 |
114 | return posemb
115 |
116 |
117 | def forward_flex(self, x):
118 | b, c, h, w = x.shape
119 |
120 | pos_embed = self._resize_pos_embed(
121 | self.pos_embed, h // self.patch_size[1], w // self.patch_size[0]
122 | )
123 |
124 | B = x.shape[0]
125 |
126 | if hasattr(self.patch_embed, "backbone"):
127 | x = self.patch_embed.backbone(x)
128 | if isinstance(x, (list, tuple)):
129 | x = x[-1] # last feature if backbone outputs list/tuple of features
130 |
131 | x = self.patch_embed.proj(x).flatten(2).transpose(1, 2)
132 |
133 | if getattr(self, "dist_token", None) is not None:
134 | cls_tokens = self.cls_token.expand(
135 | B, -1, -1
136 | ) # stole cls_tokens impl from Phil Wang, thanks
137 | dist_token = self.dist_token.expand(B, -1, -1)
138 | x = torch.cat((cls_tokens, dist_token, x), dim=1)
139 | else:
140 | cls_tokens = self.cls_token.expand(
141 | B, -1, -1
142 | ) # stole cls_tokens impl from Phil Wang, thanks
143 | x = torch.cat((cls_tokens, x), dim=1)
144 |
145 | x = x + pos_embed
146 | x = self.pos_drop(x)
147 |
148 | for blk in self.blocks:
149 | x = blk(x)
150 |
151 | x = self.norm(x)
152 |
153 | return x
154 |
155 |
156 | activations = {}
157 |
158 |
159 | def get_activation(name):
160 | def hook(model, input, output):
161 | activations[name] = output
162 |
163 | return hook
164 |
165 |
166 | def get_readout_oper(vit_features, features, use_readout, start_index=1):
167 | if use_readout == "ignore":
168 | readout_oper = [Slice(start_index)] * len(features)
169 | elif use_readout == "add":
170 | readout_oper = [AddReadout(start_index)] * len(features)
171 | elif use_readout == "project":
172 | readout_oper = [
173 | ProjectReadout(vit_features, start_index) for out_feat in features
174 | ]
175 | else:
176 | assert (
177 | False
178 | ), "wrong operation for readout token, use_readout can be 'ignore', 'add', or 'project'"
179 |
180 | return readout_oper
181 |
182 |
183 | def _make_vit_b16_backbone(
184 | model,
185 | features=[96, 192, 384, 768],
186 | size=[384, 384],
187 | hooks=[2, 5, 8, 11],
188 | vit_features=768,
189 | use_readout="ignore",
190 | start_index=1,
191 | ):
192 | pretrained = nn.Module()
193 |
194 | pretrained.model = model
195 | pretrained.model.blocks[hooks[0]].register_forward_hook(get_activation("1"))
196 | pretrained.model.blocks[hooks[1]].register_forward_hook(get_activation("2"))
197 | pretrained.model.blocks[hooks[2]].register_forward_hook(get_activation("3"))
198 | pretrained.model.blocks[hooks[3]].register_forward_hook(get_activation("4"))
199 |
200 | pretrained.activations = activations
201 |
202 | readout_oper = get_readout_oper(vit_features, features, use_readout, start_index)
203 |
204 | # 32, 48, 136, 384
205 | pretrained.act_postprocess1 = nn.Sequential(
206 | readout_oper[0],
207 | Transpose(1, 2),
208 | nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])),
209 | nn.Conv2d(
210 | in_channels=vit_features,
211 | out_channels=features[0],
212 | kernel_size=1,
213 | stride=1,
214 | padding=0,
215 | ),
216 | nn.ConvTranspose2d(
217 | in_channels=features[0],
218 | out_channels=features[0],
219 | kernel_size=4,
220 | stride=4,
221 | padding=0,
222 | bias=True,
223 | dilation=1,
224 | groups=1,
225 | ),
226 | )
227 |
228 | pretrained.act_postprocess2 = nn.Sequential(
229 | readout_oper[1],
230 | Transpose(1, 2),
231 | nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])),
232 | nn.Conv2d(
233 | in_channels=vit_features,
234 | out_channels=features[1],
235 | kernel_size=1,
236 | stride=1,
237 | padding=0,
238 | ),
239 | nn.ConvTranspose2d(
240 | in_channels=features[1],
241 | out_channels=features[1],
242 | kernel_size=2,
243 | stride=2,
244 | padding=0,
245 | bias=True,
246 | dilation=1,
247 | groups=1,
248 | ),
249 | )
250 |
251 | pretrained.act_postprocess3 = nn.Sequential(
252 | readout_oper[2],
253 | Transpose(1, 2),
254 | nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])),
255 | nn.Conv2d(
256 | in_channels=vit_features,
257 | out_channels=features[2],
258 | kernel_size=1,
259 | stride=1,
260 | padding=0,
261 | ),
262 | )
263 |
264 | pretrained.act_postprocess4 = nn.Sequential(
265 | readout_oper[3],
266 | Transpose(1, 2),
267 | nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])),
268 | nn.Conv2d(
269 | in_channels=vit_features,
270 | out_channels=features[3],
271 | kernel_size=1,
272 | stride=1,
273 | padding=0,
274 | ),
275 | nn.Conv2d(
276 | in_channels=features[3],
277 | out_channels=features[3],
278 | kernel_size=3,
279 | stride=2,
280 | padding=1,
281 | ),
282 | )
283 |
284 | pretrained.model.start_index = start_index
285 | pretrained.model.patch_size = [16, 16]
286 |
287 | # We inject this function into the VisionTransformer instances so that
288 | # we can use it with interpolated position embeddings without modifying the library source.
289 | pretrained.model.forward_flex = types.MethodType(forward_flex, pretrained.model)
290 | pretrained.model._resize_pos_embed = types.MethodType(
291 | _resize_pos_embed, pretrained.model
292 | )
293 |
294 | return pretrained
295 |
296 |
297 | def _make_pretrained_vitl16_384(pretrained, use_readout="ignore", hooks=None):
298 | model = timm.create_model("vit_large_patch16_384", pretrained=pretrained)
299 |
300 | hooks = [5, 11, 17, 23] if hooks == None else hooks
301 | return _make_vit_b16_backbone(
302 | model,
303 | features=[256, 512, 1024, 1024],
304 | hooks=hooks,
305 | vit_features=1024,
306 | use_readout=use_readout,
307 | )
308 |
309 |
310 | def _make_pretrained_vitb16_384(pretrained, use_readout="ignore", hooks=None):
311 | model = timm.create_model("vit_base_patch16_384", pretrained=pretrained)
312 |
313 | hooks = [2, 5, 8, 11] if hooks == None else hooks
314 | return _make_vit_b16_backbone(
315 | model, features=[96, 192, 384, 768], hooks=hooks, use_readout=use_readout
316 | )
317 |
318 |
319 | def _make_pretrained_deitb16_384(pretrained, use_readout="ignore", hooks=None):
320 | model = timm.create_model("vit_deit_base_patch16_384", pretrained=pretrained)
321 |
322 | hooks = [2, 5, 8, 11] if hooks == None else hooks
323 | return _make_vit_b16_backbone(
324 | model, features=[96, 192, 384, 768], hooks=hooks, use_readout=use_readout
325 | )
326 |
327 |
328 | def _make_pretrained_deitb16_distil_384(pretrained, use_readout="ignore", hooks=None):
329 | model = timm.create_model(
330 | "vit_deit_base_distilled_patch16_384", pretrained=pretrained
331 | )
332 |
333 | hooks = [2, 5, 8, 11] if hooks == None else hooks
334 | return _make_vit_b16_backbone(
335 | model,
336 | features=[96, 192, 384, 768],
337 | hooks=hooks,
338 | use_readout=use_readout,
339 | start_index=2,
340 | )
341 |
342 |
343 | def _make_vit_b_rn50_backbone(
344 | model,
345 | features=[256, 512, 768, 768],
346 | size=[384, 384],
347 | hooks=[0, 1, 8, 11],
348 | vit_features=768,
349 | use_vit_only=False,
350 | use_readout="ignore",
351 | start_index=1,
352 | ):
353 | pretrained = nn.Module()
354 |
355 | pretrained.model = model
356 |
357 | if use_vit_only == True:
358 | pretrained.model.blocks[hooks[0]].register_forward_hook(get_activation("1"))
359 | pretrained.model.blocks[hooks[1]].register_forward_hook(get_activation("2"))
360 | else:
361 | pretrained.model.patch_embed.backbone.stages[0].register_forward_hook(
362 | get_activation("1")
363 | )
364 | pretrained.model.patch_embed.backbone.stages[1].register_forward_hook(
365 | get_activation("2")
366 | )
367 |
368 | pretrained.model.blocks[hooks[2]].register_forward_hook(get_activation("3"))
369 | pretrained.model.blocks[hooks[3]].register_forward_hook(get_activation("4"))
370 |
371 | pretrained.activations = activations
372 |
373 | readout_oper = get_readout_oper(vit_features, features, use_readout, start_index)
374 |
375 | if use_vit_only == True:
376 | pretrained.act_postprocess1 = nn.Sequential(
377 | readout_oper[0],
378 | Transpose(1, 2),
379 | nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])),
380 | nn.Conv2d(
381 | in_channels=vit_features,
382 | out_channels=features[0],
383 | kernel_size=1,
384 | stride=1,
385 | padding=0,
386 | ),
387 | nn.ConvTranspose2d(
388 | in_channels=features[0],
389 | out_channels=features[0],
390 | kernel_size=4,
391 | stride=4,
392 | padding=0,
393 | bias=True,
394 | dilation=1,
395 | groups=1,
396 | ),
397 | )
398 |
399 | pretrained.act_postprocess2 = nn.Sequential(
400 | readout_oper[1],
401 | Transpose(1, 2),
402 | nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])),
403 | nn.Conv2d(
404 | in_channels=vit_features,
405 | out_channels=features[1],
406 | kernel_size=1,
407 | stride=1,
408 | padding=0,
409 | ),
410 | nn.ConvTranspose2d(
411 | in_channels=features[1],
412 | out_channels=features[1],
413 | kernel_size=2,
414 | stride=2,
415 | padding=0,
416 | bias=True,
417 | dilation=1,
418 | groups=1,
419 | ),
420 | )
421 | else:
422 | pretrained.act_postprocess1 = nn.Sequential(
423 | nn.Identity(), nn.Identity(), nn.Identity()
424 | )
425 | pretrained.act_postprocess2 = nn.Sequential(
426 | nn.Identity(), nn.Identity(), nn.Identity()
427 | )
428 |
429 | pretrained.act_postprocess3 = nn.Sequential(
430 | readout_oper[2],
431 | Transpose(1, 2),
432 | nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])),
433 | nn.Conv2d(
434 | in_channels=vit_features,
435 | out_channels=features[2],
436 | kernel_size=1,
437 | stride=1,
438 | padding=0,
439 | ),
440 | )
441 |
442 | pretrained.act_postprocess4 = nn.Sequential(
443 | readout_oper[3],
444 | Transpose(1, 2),
445 | nn.Unflatten(2, torch.Size([size[0] // 16, size[1] // 16])),
446 | nn.Conv2d(
447 | in_channels=vit_features,
448 | out_channels=features[3],
449 | kernel_size=1,
450 | stride=1,
451 | padding=0,
452 | ),
453 | nn.Conv2d(
454 | in_channels=features[3],
455 | out_channels=features[3],
456 | kernel_size=3,
457 | stride=2,
458 | padding=1,
459 | ),
460 | )
461 |
462 | pretrained.model.start_index = start_index
463 | pretrained.model.patch_size = [16, 16]
464 |
465 | # We inject this function into the VisionTransformer instances so that
466 | # we can use it with interpolated position embeddings without modifying the library source.
467 | pretrained.model.forward_flex = types.MethodType(forward_flex, pretrained.model)
468 |
469 | # We inject this function into the VisionTransformer instances so that
470 | # we can use it with interpolated position embeddings without modifying the library source.
471 | pretrained.model._resize_pos_embed = types.MethodType(
472 | _resize_pos_embed, pretrained.model
473 | )
474 |
475 | return pretrained
476 |
477 |
478 | def _make_pretrained_vitb_rn50_384(
479 | pretrained, use_readout="ignore", hooks=None, use_vit_only=False
480 | ):
481 | model = timm.create_model("vit_base_resnet50_384", pretrained=pretrained)
482 |
483 | hooks = [0, 1, 8, 11] if hooks == None else hooks
484 | return _make_vit_b_rn50_backbone(
485 | model,
486 | features=[256, 512, 768, 768],
487 | size=[384, 384],
488 | hooks=hooks,
489 | use_vit_only=use_vit_only,
490 | use_readout=use_readout,
491 | )
492 |
--------------------------------------------------------------------------------
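A minimal sketch of how these builders are used: `_make_pretrained_vitl16_384` wraps a timm ViT with forward hooks on four transformer blocks, and `forward_vit` returns the four feature maps consumed by the DPT decoder (input sides should be multiples of the 16-pixel patch size). `pretrained=False` below avoids downloading ImageNet weights and is for illustration only:

```python
import torch
from midas.vit import _make_pretrained_vitl16_384, forward_vit

backbone = _make_pretrained_vitl16_384(pretrained=False, use_readout="project")

x = torch.randn(1, 3, 384, 384)
with torch.no_grad():
    layer_1, layer_2, layer_3, layer_4 = forward_vit(backbone, x)
# Features at 1/4, 1/8, 1/16 and 1/32 of the input resolution, respectively.
```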
/requirements.txt:
--------------------------------------------------------------------------------
1 | matplotlib==3.6.3
2 | numpy==1.24.2
3 | opencv==4.7.0
4 | timm==0.6.12
5 | torch==1.13.1+cu116
6 | torchaudio==0.13.1+cu116
7 | torchvision==0.14.1+cu116
8 | wandb==0.13.10
--------------------------------------------------------------------------------
/run.py:
--------------------------------------------------------------------------------
1 | """Compute depth maps for images in the input folder.
2 | """
3 | import os
4 | import glob
5 | import torch
6 | import utils
7 | import cv2
8 | import argparse
9 | import numpy as np
10 |
11 | from torchvision.transforms import Compose
12 | from midas.dpt_depth import DPTDepthModel
13 | from midas.midas_net import MidasNet
14 | from midas.midas_net_custom import MidasNet_small
15 | from midas.transforms import Resize, ResizeTrain, NormalizeImage, PrepareForNet, RandomCrop, MirrorSquarePad, ColorAug, RandomHorizontalFlip
16 |
17 | from utils import parse_dataset_txt
18 |
19 | def run(input_path, output_path, dataset_txt, model_path, model_type="large", save_full=False, mask_path="", cls2mask=[], mean=False, it=5, output_list=False):
20 | """Run MonoDepthNN to compute depth maps.
21 |
22 | Args:
23 | input_path (str): path to input folder
24 | output_path (str): path to output folder
25 | model_path (str): path to saved model
26 | """
27 | print("initialize")
28 |
29 | # select device
30 | device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
31 | print("device: %s" % device)
32 |
33 | # load network
34 | if model_type == "dpt_large": # DPT-Large
35 | model = DPTDepthModel(
36 | path=None,
37 | backbone="vitl16_384",
38 | non_negative=True,
39 | )
40 | net_w, net_h = 384, 384
41 | normalization = NormalizeImage(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
42 |
43 | transform = Compose(
44 | [
45 | Resize(
46 | net_w,
47 | net_h,
48 | resize_target=True,
49 | keep_aspect_ratio=True,
50 | ensure_multiple_of=32,
51 | resize_method="lower_bound",
52 | image_interpolation_method=cv2.INTER_CUBIC,
53 | ),
54 | normalization,
55 | PrepareForNet(),
56 | ]
57 | )
58 | elif model_type == "midas_v21":
59 | model = MidasNet(None, non_negative=True)
60 | net_w, net_h = 384, 384
61 | normalization = NormalizeImage(
62 | mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]
63 | )
64 | # Mirror Square Pad and Resize
65 | transform = Compose(
66 | [
67 | Resize(
68 | net_w,
69 | net_h,
70 | resize_target=True,
71 | keep_aspect_ratio=True,
72 | ensure_multiple_of=32,
73 | resize_method="upper_bound",
74 | image_interpolation_method=cv2.INTER_CUBIC,
75 | ),
76 | normalization,
77 | PrepareForNet(),
78 | ]
79 | )
80 | else:
81 | print(f"model_type '{model_type}' not implemented, use: --model_type dpt_large or --model_type midas_v21")
82 | assert False
83 |
84 | checkpoint = torch.load(model_path)
85 |
86 | if 'model_state_dict' in checkpoint.keys():
87 | model.load_state_dict(checkpoint['model_state_dict'])
88 | else:
89 | model.load_state_dict(checkpoint)
90 |
91 | model.eval()
92 | model.to(device)
93 |
94 | # get input
95 | dataset_dict = parse_dataset_txt(dataset_txt)
96 | num_images = len(dataset_dict["basenames"])
97 |
98 | # create output folder
99 | os.makedirs(output_path, exist_ok=True)
100 | if output_list:
101 | fout = open(output_list, "w")
102 |
103 | print("start processing")
104 | np.random.seed(0)
105 | for ind, basename in enumerate(dataset_dict["basenames"]):
106 | img_name = os.path.join(input_path, basename)
107 | print(" processing {} ({}/{})".format(img_name, ind + 1, num_images))
108 | # input
109 | img = utils.read_image(img_name)
110 | if mask_path:
111 | mask_name = img_name.replace(input_path, mask_path).replace(".jpg",".png")
112 | mask = cv2.imread(mask_name, 0)
113 |
114 | preds = []
115 | for _ in range(it):
116 | if mask_path:
117 | if it == 1:
118 | color = np.array([0.5, 0.5, 0.5])
119 | else:
120 | color = np.random.random([3])
121 | for cls in cls2mask:
122 | img[mask == cls] = color
123 |
124 | img_input = transform({"image": img})["image"]
125 | # compute
126 | with torch.no_grad():
127 | sample = torch.from_numpy(img_input).to(device).unsqueeze(0)
128 | prediction = model.forward(sample)
129 |
130 | if save_full:
131 | prediction = (
132 | torch.nn.functional.interpolate(
133 | prediction.unsqueeze(1),
134 | size=img.shape[:2],
135 | mode="bicubic",
136 | align_corners=False,
137 | )
138 | .squeeze()
139 | .cpu()
140 | .numpy()
141 | )
142 | else:
143 | prediction = prediction.squeeze().cpu().numpy()
144 | preds.append(prediction)
145 |
146 | prediction = np.median(np.stack(preds,axis=0), axis=0)
147 |
148 | output_dir = os.path.join(output_path, os.path.dirname(basename))
149 | os.makedirs(output_dir, exist_ok=True)
150 | filename = os.path.join(output_dir, os.path.splitext(os.path.basename(img_name))[0])
151 |
152 | np.save(filename, prediction.astype(np.float32))
153 | if output_list:
154 | fout.write(img_name + " " + filename + ".npy\n")
155 |
156 | utils.write_depth(filename, prediction, bytes=2)
157 |
158 | print("finished")
159 |
160 |
161 | if __name__ == "__main__":
162 | parser = argparse.ArgumentParser()
163 |
164 | parser.add_argument('-i', '--input_path',
165 | default='input',
166 | help='folder with images'
167 | )
168 |
169 | parser.add_argument('--dataset_txt',
170 | default='dataset.txt',
171 | help='dataset txt file',
172 | )
173 |
174 | parser.add_argument('--mask_path',
175 | default='',
176 | help='folder with mask images'
177 | )
178 |
179 | parser.add_argument('--cls2mask',
180 | default=[1],
181 | type=int,
182 | nargs='+',
183 | help='classes to mask'
184 | )
185 |
186 | parser.add_argument('--it',
187 | default=1,
188 | type=int,
189 | help="number of iteration to run midas"
190 | )
191 |
192 | parser.add_argument('-o', '--output_path',
193 | default='output',
194 | help='folder for output images'
195 | )
196 |
197 | parser.add_argument('--output_list',
198 | default='',
199 | help='output list of generated depths as txt file'
200 | )
201 |
202 | parser.add_argument('--save_full_res',
203 | action='store_true',
204 | help='save original resolution'
205 | )
206 |
207 | parser.add_argument('-m', '--model_weights',
208 | default=None,
209 | help='path to the trained weights of model'
210 | )
211 |
212 | parser.add_argument('-t', '--model_type',
213 | default='dpt_large',
214 | help='model type: dpt_large, midas_v21'
215 | )
216 |
217 | args = parser.parse_args()
218 |
219 | default_models = {
220 | "midas_v21": "weights/Base/midas_v21-base.pt",
221 | "dpt_large": "weights/Base/dpt_large-base.pt",
222 | }
223 |
224 | if args.model_weights is None:
225 | args.model_weights = default_models[args.model_type]
226 |
227 | # set torch options
228 | torch.backends.cudnn.enabled = True
229 | torch.backends.cudnn.benchmark = True
230 |
231 | print(args)
232 | # compute depth maps
233 | run(args.input_path, args.output_path, args.dataset_txt, args.model_weights, args.model_type, save_full=args.save_full_res, mask_path=args.mask_path, cls2mask=args.cls2mask, it=args.it, output_list=args.output_list)
234 |
--------------------------------------------------------------------------------
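The per-image loop in `run()` above is the paper's virtual-depth generation step: pixels belonging to the classes in `--cls2mask` are flooded with a random flat color at each of the `--it` iterations, the network is run on every in-painted copy, and the per-pixel median of the predictions is kept. A self-contained sketch of that aggregation, where `predict` is a hypothetical callable wrapping the network forward pass (not a function of this repository):

```python
import numpy as np

def virtual_depth(img, mask, cls2mask, predict, it=5):
    """Median prediction over `it` runs with ToM regions in-painted by flat colors.

    img:  HxWx3 float image in [0, 1]; mask: HxW segmentation map.
    predict: hypothetical callable mapping an HxWx3 image to an HxW depth map.
    """
    preds = []
    for _ in range(it):
        inpainted = img.copy()
        color = np.array([0.5, 0.5, 0.5]) if it == 1 else np.random.random(3)
        for cls in cls2mask:
            inpainted[mask == cls] = color  # flood ToM pixels with a flat color
        preds.append(predict(inpainted))
    return np.median(np.stack(preds, axis=0), axis=0)
```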
/scripts/finetune.sh:
--------------------------------------------------------------------------------
1 | cd ..
2 |
3 | model="dpt_large" # ["midas_v21", "dpt_large"]
4 | output_path=./experiment_models/
5 | dataroot="data"
6 | txtroot="datasets"
7 | exp_name="Ft. Virtual Depth"
8 |
9 | python finetune.py --exp_name "$exp_name" \
10 | --training_datasets trans10k msd \
11 | --training_datasets_dir $dataroot"/Trans10K" $dataroot"/MSD" \
12 | --training_datasets_txt $txtroot"/trans10k/virtual_depth_"$model".txt" $txtroot"/msd/virtual_depth_"$model".txt" \
13 | --output_path $output_path \
14 | --model_type $model
--------------------------------------------------------------------------------
/scripts/generate_virtual_depth.sh:
--------------------------------------------------------------------------------
1 | root="path_to_dataset_root"
2 | cd ..
3 |
4 | model="dpt_large" # ["midas_v21", "dpt_large"]
5 | dataset="Trans10K" # ["Trans10K", "MSD"]
6 | splits="train test validation"
7 | for split in $splits
8 | do
9 | echo $model $dataset $split
10 | input_dir=$root/$dataset/$split/images # path to dataset folder with images
11 | mask_dir=$root/$dataset/$split/masks # path to dataset folder with segmentations, either GT or proxy
12 | exp="base" # experiment tag; must be set before output_dir, which uses it
13 | output_dir=$root"/"$dataset/$split/$model"_proxies"/$exp # output path
14 |
15 | dataset_lower=$(echo $dataset | tr '[:upper:]' '[:lower:]')
16 | dataset_txt="datasets/"$dataset_lower"/"$split".txt" # inference list
17 |
18 | ### define output_list if you want to save the list of the generated virtual depths
19 | output_list="datasets/"$dataset_lower"/"$split"_"$model"_"$exp".txt"
20 | ###
21 |
22 | if [ -f $dataset_txt ]
23 | then
24 | python run.py --model_type $model \
25 | --input_path $input_dir \
26 | --dataset_txt $dataset_txt \
27 | --output_path $output_dir \
28 | --output_list $output_list \
29 | --mask_path $mask_dir \
30 | --it 5 \
31 | --cls2mask 255 # list of class ids in segmentation maps relative to ToM surfaces.
32 | fi
33 | done
--------------------------------------------------------------------------------
/scripts/table2.sh:
--------------------------------------------------------------------------------
1 | cd ..
2 |
3 | ### Change this path ###
4 | dataset_root="/media/data2/Booster/train/balanced"
5 | ########################
6 |
7 | dataset_txt="datasets/booster/train_stereo.txt"
8 |
9 | # RESULTS TABLE 2
10 | for model in "midas_v21" "dpt_large"
11 | do
12 | ## BASE MODEL ###
13 | output_dir="results/Base/"$model
14 | python run.py --model_type $model \
15 | --input_path $dataset_root \
16 | --dataset_txt $dataset_txt \
17 | --output_path $output_dir
18 | result_path="results/table2_base_"$model".txt"
19 | python evaluate_mono.py --gt_root $dataset_root \
20 | --pred_root $output_dir \
21 | --dataset_txt $dataset_txt \
22 | --output_path $result_path
23 |
24 | ## FT. BASE MODEL ###
25 | output_dir="results/Table2/Ft. Base/"$model
26 | model_weights="weights/Table 2/Ft. Base/"$model"_final.pt"
27 | python run.py --model_type $model \
28 | --input_path $dataset_root \
29 | --dataset_txt $dataset_txt \
30 | --output_path "$output_dir" \
31 | --model_weights "$model_weights"
32 | result_path="results/table2_ftbase_"$model".txt"
33 | python evaluate_mono.py --gt_root $dataset_root \
34 | --pred_root "$output_dir" \
35 | --dataset_txt $dataset_txt \
36 | --output_path $result_path
37 |
38 | ## FT. VIRTUAL DEPTH MODEL - OUR ###
39 | output_dir="results/Table2/Ft. Virtual Depth/"$model
40 | model_weights="weights/Table 2/Ft. Virtual Depth/"$model"_final.pt"
41 | python run.py --model_type $model \
42 | --input_path $dataset_root \
43 | --dataset_txt $dataset_txt \
44 | --output_path "$output_dir" \
45 | --model_weights "$model_weights"
46 | result_path="results/table2_ftvirtualdepth_"$model".txt"
47 | python evaluate_mono.py --gt_root $dataset_root \
48 | --pred_root "$output_dir" \
49 | --dataset_txt $dataset_txt \
50 | --output_path $result_path
51 | done
--------------------------------------------------------------------------------
/scripts/table3.sh:
--------------------------------------------------------------------------------
1 | cd ..
2 |
3 | ### Change this path ###
4 | dataset_root="/media/data2/Booster/train/balanced"
5 | ########################
6 |
7 | dataset_txt="datasets/booster/train_stereo.txt"
8 |
9 | # RESULTS TABLE 3
10 | for model in "midas_v21" "dpt_large"
11 | do
12 | ## Ft. Virtual Depth (GT) MODEL ###
13 | output_dir="results/Table3/Ft. Virtual Depth (GT)/"$model
14 | model_weights="weights/Table 3/Ft. Virtual Depth (GT)/"$model"_final.pt"
15 | python run.py --model_type $model \
16 | --input_path $dataset_root \
17 | --dataset_txt $dataset_txt \
18 | --output_path "$output_dir" \
19 | --model_weights "$model_weights"
20 | result_path="results/Table3_ftvirutaldepthgt_"$model".txt"
21 | python evaluate_mono.py --gt_root $dataset_root \
22 | --pred_root "$output_dir" \
23 | --dataset_txt $dataset_txt \
24 | --output_path $result_path
25 |
26 | ## Ft. Virtual Depth (Proxy) MODEL - OUR ###
27 | output_dir="results/Table3/Ft. Virtual Depth (Proxy)/"$model
28 | model_weights="weights/Table 3/Ft. Virtual Depth (Proxy)/"$model"_final.pt"
29 | python run.py --model_type $model \
30 | --input_path $dataset_root \
31 | --dataset_txt $dataset_txt \
32 | --output_path "$output_dir" \
33 | --model_weights "$model_weights"
34 | result_path="results/Table3_ftvirtualdepthproxy_"$model".txt"
35 | python evaluate_mono.py --gt_root $dataset_root \
36 | --pred_root "$output_dir" \
37 | --dataset_txt $dataset_txt \
38 | --output_path $result_path
39 | done
--------------------------------------------------------------------------------
/utils.py:
--------------------------------------------------------------------------------
1 | """
2 | Utils for monoDepth.
3 | """
4 | import sys
5 | import re
6 | import numpy as np
7 | import cv2
8 |
9 | def decode_3_channels(raw, max_depth=1000):
10 | """Carla format to depth
11 | Args:
12 | raw: carla format depth image. Expected in BGR.
13 | max_depth: max depth used during rendering
14 | """
15 | raw = raw.astype(np.float32)
16 | out = raw[:,:,2] + raw[:,:,1] * 256 + raw[:,:,0]*256*256
17 | out = out / (256*256*256 - 1) * max_depth
18 | return out
19 |
20 |
21 | def read_pfm(path):
22 | """Read pfm file.
23 |
24 | Args:
25 | path (str): path to file
26 |
27 | Returns:
28 | tuple: (data, scale)
29 | """
30 | with open(path, "rb") as file:
31 |
32 | color = None
33 | width = None
34 | height = None
35 | scale = None
36 | endian = None
37 |
38 | header = file.readline().rstrip()
39 | if header.decode("ascii") == "PF":
40 | color = True
41 | elif header.decode("ascii") == "Pf":
42 | color = False
43 | else:
44 | raise Exception("Not a PFM file: " + path)
45 |
46 | dim_match = re.match(r"^(\d+)\s(\d+)\s$", file.readline().decode("ascii"))
47 | if dim_match:
48 | width, height = list(map(int, dim_match.groups()))
49 | else:
50 | raise Exception("Malformed PFM header.")
51 |
52 | scale = float(file.readline().decode("ascii").rstrip())
53 | if scale < 0:
54 | # little-endian
55 | endian = "<"
56 | scale = -scale
57 | else:
58 | # big-endian
59 | endian = ">"
60 |
61 | data = np.fromfile(file, endian + "f")
62 | shape = (height, width, 3) if color else (height, width)
63 |
64 | data = np.reshape(data, shape)
65 | data = np.flipud(data)
66 |
67 | return data, scale
68 |
69 |
70 | def write_pfm(path, image, scale=1):
71 | """Write pfm file.
72 |
73 | Args:
74 | path (str): path to file
75 | image (array): data
76 | scale (int, optional): Scale. Defaults to 1.
77 | """
78 |
79 | with open(path, "wb") as file:
80 | color = None
81 |
82 | if image.dtype.name != "float32":
83 | raise Exception("Image dtype must be float32.")
84 |
85 | image = np.flipud(image)
86 |
87 | if len(image.shape) == 3 and image.shape[2] == 3: # color image
88 | color = True
89 | elif (
90 | len(image.shape) == 2 or len(image.shape) == 3 and image.shape[2] == 1
91 | ): # greyscale
92 | color = False
93 | else:
94 | raise Exception("Image must have H x W x 3, H x W x 1 or H x W dimensions.")
95 |
96 | file.write(("PF\n" if color else "Pf\n").encode())
97 | file.write("%d %d\n".encode() % (image.shape[1], image.shape[0]))
98 |
99 | endian = image.dtype.byteorder
100 |
101 | if endian == "<" or endian == "=" and sys.byteorder == "little":
102 | scale = -scale
103 |
104 | file.write("%f\n".encode() % scale)
105 |
106 | image.tofile(file)
107 |
108 |
109 | def read_d(path, scale_factor=256.):
110 | """Read depth or disp Map
111 | Args:
112 | path: path to depth or disp
113 | scale_factor: scale factor used to decode png 16 bit images
114 | """
115 |
116 | if path.endswith("pfm"):
117 | d, _ = read_pfm(path) # read_pfm returns (data, scale)
118 | elif path.endswith("npy"):
119 | d = np.load(path)
120 | elif path.endswith("exr"):
121 | d = cv2.imread(path, cv2.IMREAD_UNCHANGED)
122 | d = d[:,:,0]
123 | elif path.endswith("png"):
124 | d = cv2.imread(path, cv2.IMREAD_UNCHANGED)
125 | if len(d.shape) == 3:
126 | d = decode_3_channels(d)
127 | elif d.dtype == np.uint16:
128 | d = d.astype(np.float32)
129 | d = d / scale_factor
130 | else:
131 | d = cv2.imread(path)[:,:,0]
132 |
133 | return d
134 |
135 | def read_image(path):
136 | """Read image and output RGB image (0-1).
137 |
138 | Args:
139 | path (str): path to file
140 |
141 | Returns:
142 | array: RGB image (0-1)
143 | """
144 | img = cv2.imread(path)
145 |
146 | if img.ndim == 2:
147 | img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
148 |
149 | img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB) / 255.0
150 |
151 | return img
152 |
153 | def write_depth(path, depth, bytes=1):
154 | """Write depth map to pfm and png file.
155 |
156 | Args:
157 | path (str): filepath without extension
158 | depth (array): depth
159 | """
160 |
161 | depth_min = depth.min()
162 | depth_max = depth.max()
163 |
164 | max_val = (2**(8*bytes))-1
165 |
166 | if depth_max - depth_min > np.finfo("float").eps:
167 | out = max_val * (depth - depth_min) / (depth_max - depth_min)
168 | else:
169 | out = np.zeros(depth.shape, dtype=depth.dtype)
170 |
171 | if bytes == 1:
172 | cv2.imwrite(path + ".png", out.astype("uint8"))
173 | elif bytes == 2:
174 | cv2.imwrite(path + ".png", out.astype("uint16"))
175 |
176 |
177 | def read_calib_xml(calib_path, factor_baseline=0.001):
178 | cv_file = cv2.FileStorage(calib_path, cv2.FILE_STORAGE_READ)
179 | calib = cv_file.getNode("proj_matL").mat()[:3,:3]
180 | fx = calib[0,0]
181 | baseline = float(cv_file.getNode("baselineLR").real())*factor_baseline
182 | return fx, baseline
183 |
184 |
185 | def parse_dataset_txt(dataset_txt):
186 | with open(dataset_txt) as data_txt:
187 | gt_files = []
188 | basenames = []
189 | focals = []
190 | baselines = []
191 | calib_files = []
192 |
193 | for line in data_txt:
194 | values = line.split(" ")
195 |
196 | if len(values) == 2:
197 | basenames.append(values[0].strip())
198 | gt_files.append(values[1].strip())
199 |
200 | elif len(values) == 3:
201 | basenames.append(values[0].strip())
202 | gt_files.append(values[1].strip())
203 | calib_files.append(values[2].strip())
204 |
205 | elif len(values) == 4:
206 | basenames.append(values[0].strip())
207 | gt_files.append(values[1].strip())
208 | focals.append(float(values[2].strip()))
209 | baselines.append(float(values[3].strip()))
210 |
211 | else:
212 | print("Wrong format dataset txt file")
213 | exit(-1)
214 |
215 | dataset_dict = {}
216 | if gt_files: dataset_dict["gt_paths"] = gt_files
217 | if basenames: dataset_dict["basenames"] = basenames
218 | if calib_files: dataset_dict["calib_paths"] = calib_files
219 | if focals: dataset_dict["focals"] = focals
220 | if baselines: dataset_dict["baselines"] = baselines
221 | return dataset_dict
222 |
223 |
224 | def compute_scale_and_shift(prediction, target, mask):
225 | # system matrix: A = [[a_00, a_01], [a_10, a_11]]
226 | a_00 = np.sum(mask * prediction * prediction, axis=(1, 2))
227 | a_01 = np.sum(mask * prediction, axis=(1, 2))
228 | a_11 = np.sum(mask, axis=(1, 2))
229 |
230 | # right hand side: b = [b_0, b_1]
231 | b_0 = np.sum(mask * prediction * target, axis=(1, 2))
232 | b_1 = np.sum(mask * target, axis=(1, 2))
233 |
234 | # solution: x = A^-1 . b = [[a_11, -a_01], [-a_10, a_00]] / (a_00 * a_11 - a_01 * a_10) . b
235 | x_0 = np.zeros_like(b_0)
236 | x_1 = np.zeros_like(b_1)
237 |
238 | det = a_00 * a_11 - a_01 * a_01
239 | # A needs to be a positive definite matrix.
240 | valid = det > 0
241 |
242 | x_0[valid] = (a_11[valid] * b_0[valid] - a_01[valid] * b_1[valid]) / det[valid]
243 | x_1[valid] = (-a_01[valid] * b_0[valid] + a_00[valid] * b_1[valid]) / det[valid]
244 |
245 | return x_0, x_1
--------------------------------------------------------------------------------
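`compute_scale_and_shift` above solves, per image in the batch, the 2x2 least-squares system that aligns a scale- and shift-invariant prediction to the target over the valid mask. A quick self-check sketch with synthetic data (run from the repository root so `utils` is importable):

```python
import numpy as np
from utils import compute_scale_and_shift

rng = np.random.default_rng(0)
prediction = rng.random((1, 4, 5)).astype(np.float32)  # batch of one 4x5 map
target = 2.5 * prediction + 0.3                         # known scale and shift
mask = np.ones_like(prediction)                         # every pixel valid

scale, shift = compute_scale_and_shift(prediction, target, mask)
print(scale, shift)  # expected approximately [2.5] and [0.3]
aligned = scale[:, None, None] * prediction + shift[:, None, None]
```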