├── .gitignore ├── LICENSE ├── README.md ├── demos ├── dataset │ ├── dataset_exploration.py │ └── dataset_sparsity.py └── depth_completion.py ├── images ├── 000043 │ ├── 000043.png │ ├── completed.png │ └── lidar.png ├── 006338 │ ├── 006338.png │ ├── completed.png │ └── lidar.png ├── extra_vid.png ├── short_vid.jpg └── show_process.png ├── ip_basic ├── depth_map_utils.py └── vis_utils.py └── requirements.txt /.gitignore: -------------------------------------------------------------------------------- 1 | *.pyc 2 | 3 | outputs 4 | 5 | # PyCharm 6 | .idea 7 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 Jason Ku 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## Image Processing for Basic Depth Completion (IP-Basic) 2 | Depth completion is the task of converting a sparse depth map Dsparse into a dense depth map Ddense. This algorithm was originally created to help visualize 3D object detection results for [AVOD](https://arxiv.org/abs/1712.02294). 3 | 4 | An accurate dense depth map can also benefit 3D object detection or SLAM algorithms that use point cloud input. This method uses an unguided approach (images are ignored, only LIDAR projections are used). Basic depth completion is done with OpenCV and NumPy operations in Python. For more information, please see our paper: [In Defense of Classical Image Processing: Fast Depth Completion on the CPU](https://arxiv.org/abs/1802.00036). 5 | 6 | Please visit [https://github.com/kujason/scene_vis](https://github.com/kujason/scene_vis) for 3D point cloud visualization demos on raw KITTI data. 7 | 8 | If you use this code, we would appreciate if you cite our paper: 9 | [In Defense of Classical Image Processing: Fast Depth Completion on the CPU](https://arxiv.org/abs/1802.00036) 10 | 11 | ``` 12 | @inproceedings{ku2018defense, 13 | title={In Defense of Classical Image Processing: Fast Depth Completion on the CPU}, 14 | author={Ku, Jason and Harakeh, Ali and Waslander, Steven L}, 15 | booktitle={2018 15th Conference on Computer and Robot Vision (CRV)}, 16 | pages={16--22}, 17 | year={2018}, 18 | organization={IEEE} 19 | } 20 | ``` 21 | 22 | ### Videos 23 | Click [here](https://www.youtube.com/watch?v=t_CGGUE2kEM) for a short demo video with comparison of different versions. 24 | 25 | [![Demo Video](images/short_vid.jpg)](https://www.youtube.com/watch?v=t_CGGUE2kEM) 26 | --- 27 | Click [here](https://youtu.be/tYwIcLvfC70) to see point clouds from additional KITTI raw sequences. Note that the structure of smaller or thin objects (e.g. poles, bicycle wheels, pedestrians) are well preserved after depth completion. 28 | 29 | [![Extra Scenes](images/extra_vid.png)](https://youtu.be/tYwIcLvfC70) 30 | --- 31 | Also see an earlier version of the algorithm in action [here (2 top views)](https://www.youtube.com/watch?v=Q1f-s6_yHtw). 32 | 33 | --- 34 | ### Setup 35 | [show_process]: images/show_process.png "Showing Process" 36 | Tested on Ubuntu 16.04 with Python 3.5. 37 | 1. Download and unzip the KITTI depth completion benchmark dataset into `~/Kitti/depth` (only the val_selection_cropped and test data sets are required to run the demo). The folder should look like something the following: 38 | - ~/Kitti/depth 39 | - devkit 40 | - test_depth_completion_anonymous 41 | - test_depth_prediction_anonymous 42 | - train 43 | - val 44 | - val_selection_cropped 45 | 46 | 2. Clone this repo and install Python requirements: 47 | ``` 48 | git clone git@github.com:kujason/ip_basic.git 49 | cd ip_basic 50 | pip3 install -r requirements.txt 51 | ``` 52 | 53 | 3. Run the script: 54 | ```bash 55 | python3 demos/depth_completion.py 56 | ``` 57 | This will run the algorithm on the cropped validation set and save the outputs to a new folder in `demos/outputs`. Refer to the readme in the downloaded devkit to evaluate the results. 58 | 59 | 4. (Optional) Set options in `depth_completion.py` 60 | - To run on the test set, comment the lines below `# Validation set` and uncomment the lines below `# Test set` 61 | - `fill_type`: 62 | - `'fast'` - Version described in the paper 63 | - `'multiscale'` - Multi-scale dilations based on depth, with additional noise removal 64 | - `extrapolate`: 65 | - `True`: Extends depths to the top of the frame and runs a 31x31 full kernel dilation 66 | - `False`: Skips the extension and large dilation 67 | - `save_output`: 68 | - `True` - Saves the output depth maps to disk 69 | - `False` - Shows the filling process. Only works with `fill_type == 'multiscale'` 70 | ![alt text][show_process] 71 | 72 | 5. (Optional) To run the algorithm on a specific CPU (e.g. CPU 0): 73 | ```bash 74 | taskset --cpu-list 0 python3 demos/depth_completion.py 75 | ``` 76 | 77 | ### Results 78 | #### KITTI Test Set Evaluation 79 | | Method | `iRMSE` | `iMAE` | **RMSE** | `MAE` | `Device` | `Runtime` | `FPS` | 80 | |:-------------:|:--------:|:--------:|:-----------:|:----------:|:------------:|:----------:|:-------:| 81 | | NadarayaW | 6.34 | 1.84 | 1852.60 | 416.77 | CPU (1 core) | 0.05 s | 20 | 82 | | SparseConvs | 4.94 | 1.78 | 1601.33 | 481.27 | GPU | **0.01 s** | **100** | 83 | | NN+CNN | **3.25** | **1.29** | 1419.75 | 416.14 | GPU | 0.02 s | 50 | 84 | | **IP-Basic** | 3.75 | **1.29** | **1288.46** | **302.60** | CPU (1 core) | 0.011 s | 90 | 85 | 86 | Table: Comparison of results with other published unguided methods on the [KITTI Depth Completion benchmark](http://www.cvlibs.net/datasets/kitti/eval_depth.php?benchmark=depth_completion). 87 | 88 | ### Versions 89 | Several versions are provided for experimentation. 90 | - `Gaussian (Paper Result, Lowest RMSE)`: Provides lowest RMSE, but adds many additional 3D points to the scene. 91 | - `Bilateral`: Preserves local structure, but large extrapolations make it slower. 92 | - `Gaussian, No Extrapolation`: Fastest version, but adds many additional 3D points to the scene. 93 | - `Bilateral, No Extrapolation (Lowest MAE)`: Preserves local structure, and skips the large extrapolation steps. This is the recommended version for practical applications. 94 | - `Multi-Scale, Bilateral, Noise Removal, No Extrapolation`: Slower version with additional noise removal. See the [above video](https://www.youtube.com/watch?v=t_CGGUE2kEM) for a qualitative comparison. 95 | 96 | #### Timing Comparisons 97 | The table below shows a comparison of timing on an Intel Core i7-7700K for different versions. The Gaussian versions can be run on a single core, while other versions run faster with multiple cores. The bilateral blur version with no extrapolation is recommended for practical applications. 98 | 99 | | Version | Runtime | FPS | 100 | |:---------------------------------------------------------:|:--------:|:---:| 101 | | Gaussian (**Paper Result, Lowest RMSE**) | 0.0111 s | 90 | 102 | | Bilateral | 0.0139 s | 71 | 103 | | Gaussian, No Extrapolation | 0.0075 s | 133 | 104 | | Bilateral, No Extrapolation (**Lowest MAE**) | 0.0115 s | 87 | 105 | | Multi-Scale, Bilateral, Noise Removal, No Extrapolation | 0.0328 s | 30 | 106 | 107 | Table: Timing comparison for different versions. 108 | 109 | ### Examples 110 | Qualitative results from the `Multi-Scale, Bilateral, Noise Removal, No Extrapolation` version on samples from the KITTI object detection benchmark. 111 | #### Cars 112 | [sample_006338]: images/006338/006338.png "Sample 006338" 113 | [lidar_006338]: images/006338/lidar.png "Colourized LIDAR points" 114 | [completed_006338]: images/006338/completed.png "Points after depth completion" 115 | - Image: 116 | ![alt text][sample_006338] 117 | - Colourized LIDAR points (showing ground truth boxes): 118 | ![alt text][lidar_006338] 119 | - After depth completion (showing ground truth boxes): 120 | ![alt text][completed_006338] 121 | 122 | #### People 123 | [sample_000043]: images/000043/000043.png "Sample 000043" 124 | [lidar_000043]: images/000043/lidar.png "Colourized LIDAR points" 125 | [completed_000043]: images/000043/completed.png "Points after depth completion" 126 | - Image: 127 | ![alt text][sample_000043] 128 | - Colourized LIDAR points (showing ground truth boxes): 129 | ![alt text][lidar_000043] 130 | - After depth completion (showing ground truth boxes): 131 | ![alt text][completed_000043] 132 | -------------------------------------------------------------------------------- /demos/dataset/dataset_exploration.py: -------------------------------------------------------------------------------- 1 | import glob 2 | import os 3 | import sys 4 | 5 | import cv2 6 | import matplotlib.pyplot as plt 7 | import numpy as np 8 | 9 | 10 | def main(): 11 | 12 | input_depth_dir = os.path.expanduser( 13 | '~/Kitti/depth/val_selection_cropped/velodyne_raw') 14 | images_to_use = sorted(glob.glob(input_depth_dir + '/*')) 15 | 16 | npy_file_name = 'outputs/all_depths.npy' 17 | all_depths_saved = os.path.exists(npy_file_name) 18 | 19 | if not all_depths_saved: 20 | os.makedirs('./outputs', exist_ok=True) 21 | 22 | # Process depth images 23 | all_depths = np.array([]) 24 | num_images = len(images_to_use) 25 | for i in range(num_images): 26 | 27 | # Print progress 28 | sys.stdout.write('\rProcessing {} / {}'.format(i, num_images - 1)) 29 | sys.stdout.flush() 30 | 31 | depth_image_path = images_to_use[i] 32 | 33 | # Load depth from image 34 | depth_image = cv2.imread(depth_image_path, cv2.IMREAD_ANYDEPTH) 35 | 36 | # Divide by 256 37 | depth_map = depth_image / 256.0 38 | 39 | valid_pixels = (depth_map > 0.0) 40 | all_depths = np.concatenate([all_depths, depth_map[valid_pixels]]) 41 | 42 | np.save(npy_file_name, all_depths) 43 | 44 | else: 45 | # Load from npy 46 | all_depths = np.load(npy_file_name) 47 | 48 | print('') 49 | print('Depths') 50 | print('Min: ', np.amin(all_depths)) 51 | print('Max: ', np.amax(all_depths)) 52 | print('Mean: ', np.mean(all_depths)) 53 | print('Median: ', np.median(all_depths)) 54 | 55 | plt.hist(all_depths, bins=80) 56 | plt.show() 57 | 58 | 59 | if __name__ == '__main__': 60 | main() 61 | -------------------------------------------------------------------------------- /demos/dataset/dataset_sparsity.py: -------------------------------------------------------------------------------- 1 | import glob 2 | import os 3 | import sys 4 | 5 | import cv2 6 | import matplotlib.pyplot as plt 7 | import numpy as np 8 | 9 | 10 | def main(): 11 | 12 | input_depth_dir = os.path.expanduser( 13 | '~/Kitti/depth/val_selection_cropped/velodyne_raw') 14 | 15 | images_to_use = sorted(glob.glob(input_depth_dir + '/*')) 16 | 17 | # Process depth images 18 | num_images = len(images_to_use) 19 | all_sparsities = np.zeros(num_images) 20 | 21 | for i in range(num_images): 22 | 23 | # Print progress 24 | sys.stdout.write('\rProcessing index {} / {}'.format(i, num_images - 1)) 25 | sys.stdout.flush() 26 | 27 | depth_image_path = images_to_use[i] 28 | 29 | # Load depth from image 30 | depth_image = cv2.imread(depth_image_path, cv2.IMREAD_ANYDEPTH) 31 | 32 | # Divide by 256 33 | depth_map = depth_image / 256.0 34 | 35 | num_valid_pixels = len(np.where(depth_map > 0.0)[0]) 36 | num_pixels = depth_image.shape[0] * depth_image.shape[1] 37 | 38 | sparsity = num_valid_pixels / (num_pixels * 2/3) 39 | all_sparsities[i] = sparsity 40 | 41 | print('') 42 | print('Sparsity') 43 | print('Min: ', np.amin(all_sparsities)) 44 | print('Max: ', np.amax(all_sparsities)) 45 | print('Mean: ', np.mean(all_sparsities)) 46 | print('Median: ', np.median(all_sparsities)) 47 | 48 | plt.hist(all_sparsities, bins=20) 49 | plt.show() 50 | 51 | 52 | if __name__ == '__main__': 53 | main() 54 | -------------------------------------------------------------------------------- /demos/depth_completion.py: -------------------------------------------------------------------------------- 1 | import glob 2 | import os 3 | import sys 4 | import time 5 | 6 | import cv2 7 | import numpy as np 8 | import png 9 | 10 | from ip_basic import depth_map_utils 11 | from ip_basic import vis_utils 12 | 13 | 14 | def main(): 15 | """Depth maps are saved to the 'outputs' folder. 16 | """ 17 | 18 | ############################## 19 | # Options 20 | ############################## 21 | # Validation set 22 | input_depth_dir = os.path.expanduser( 23 | '~/Kitti/depth/depth_selection/val_selection_cropped/velodyne_raw') 24 | data_split = 'val' 25 | 26 | # Test set 27 | # input_depth_dir = os.path.expanduser( 28 | # '~/Kitti/depth/depth_selection/test_depth_completion_anonymous/velodyne_raw') 29 | # data_split = 'test' 30 | 31 | # Fast fill with Gaussian blur @90Hz (paper result) 32 | fill_type = 'fast' 33 | extrapolate = True 34 | blur_type = 'gaussian' 35 | 36 | # Fast Fill with bilateral blur, no extrapolation @87Hz (recommended) 37 | # fill_type = 'fast' 38 | # extrapolate = False 39 | # blur_type = 'bilateral' 40 | 41 | # Multi-scale dilations with extra noise removal, no extrapolation @ 30Hz 42 | # fill_type = 'multiscale' 43 | # extrapolate = False 44 | # blur_type = 'bilateral' 45 | 46 | # Save output to disk or show process 47 | save_output = True 48 | 49 | ############################## 50 | # Processing 51 | ############################## 52 | if save_output: 53 | # Save to Disk 54 | show_process = False 55 | save_depth_maps = True 56 | else: 57 | if fill_type == 'fast': 58 | raise ValueError('"fast" fill does not support show_process') 59 | 60 | # Show Process 61 | show_process = True 62 | save_depth_maps = False 63 | 64 | # Create output folder 65 | this_file_path = os.path.dirname(os.path.realpath(__file__)) 66 | outputs_dir = this_file_path + '/outputs' 67 | os.makedirs(outputs_dir, exist_ok=True) 68 | 69 | output_folder_prefix = 'depth_' + data_split 70 | output_list = sorted(os.listdir(outputs_dir)) 71 | if len(output_list) > 0: 72 | split_folders = [folder for folder in output_list 73 | if folder.startswith(output_folder_prefix)] 74 | if len(split_folders) > 0: 75 | last_output_folder = split_folders[-1] 76 | last_output_index = int(last_output_folder.split('_')[-1]) 77 | else: 78 | last_output_index = -1 79 | else: 80 | last_output_index = -1 81 | output_depth_dir = outputs_dir + '/{}_{:03d}'.format( 82 | output_folder_prefix, last_output_index + 1) 83 | 84 | if save_output: 85 | if not os.path.exists(output_depth_dir): 86 | os.makedirs(output_depth_dir) 87 | else: 88 | raise FileExistsError('Already exists!') 89 | print('Output dir:', output_depth_dir) 90 | 91 | # Get images in sorted order 92 | images_to_use = sorted(glob.glob(input_depth_dir + '/*')) 93 | 94 | # Rolling average array of times for time estimation 95 | avg_time_arr_length = 10 96 | last_fill_times = np.repeat([1.0], avg_time_arr_length) 97 | last_total_times = np.repeat([1.0], avg_time_arr_length) 98 | 99 | num_images = len(images_to_use) 100 | for i in range(num_images): 101 | 102 | depth_image_path = images_to_use[i] 103 | 104 | # Calculate average time with last n fill times 105 | avg_fill_time = np.mean(last_fill_times) 106 | avg_total_time = np.mean(last_total_times) 107 | 108 | # Show progress 109 | sys.stdout.write('\rProcessing {} / {}, ' 110 | 'Avg Fill Time: {:.5f}s, ' 111 | 'Avg Total Time: {:.5f}s, ' 112 | 'Est Time Remaining: {:.3f}s'.format( 113 | i, num_images - 1, avg_fill_time, avg_total_time, 114 | avg_total_time * (num_images - i))) 115 | sys.stdout.flush() 116 | 117 | # Start timing 118 | start_total_time = time.time() 119 | 120 | # Load depth projections from uint16 image 121 | depth_image = cv2.imread(depth_image_path, cv2.IMREAD_ANYDEPTH) 122 | projected_depths = np.float32(depth_image / 256.0) 123 | 124 | # Fill in 125 | start_fill_time = time.time() 126 | if fill_type == 'fast': 127 | final_depths = depth_map_utils.fill_in_fast( 128 | projected_depths, extrapolate=extrapolate, blur_type=blur_type) 129 | elif fill_type == 'multiscale': 130 | final_depths, process_dict = depth_map_utils.fill_in_multiscale( 131 | projected_depths, extrapolate=extrapolate, blur_type=blur_type, 132 | show_process=show_process) 133 | else: 134 | raise ValueError('Invalid fill_type {}'.format(fill_type)) 135 | end_fill_time = time.time() 136 | 137 | # Display images from process_dict 138 | if fill_type == 'multiscale' and show_process: 139 | img_size = (570, 165) 140 | 141 | x_start = 80 142 | y_start = 50 143 | x_offset = img_size[0] 144 | y_offset = img_size[1] 145 | x_padding = 0 146 | y_padding = 28 147 | 148 | img_x = x_start 149 | img_y = y_start 150 | max_x = 1900 151 | 152 | row_idx = 0 153 | for key, value in process_dict.items(): 154 | 155 | image_jet = cv2.applyColorMap( 156 | np.uint8(value / np.amax(value) * 255), 157 | cv2.COLORMAP_JET) 158 | vis_utils.cv2_show_image( 159 | key, image_jet, 160 | img_size, (img_x, img_y)) 161 | 162 | img_x += x_offset + x_padding 163 | if (img_x + x_offset + x_padding) > max_x: 164 | img_x = x_start 165 | row_idx += 1 166 | img_y = y_start + row_idx * (y_offset + y_padding) 167 | 168 | # Save process images 169 | cv2.imwrite('process/' + key + '.png', image_jet) 170 | 171 | cv2.waitKey() 172 | 173 | # Save depth images to disk 174 | if save_depth_maps: 175 | depth_image_file_name = os.path.split(depth_image_path)[1] 176 | 177 | # Save depth map to a uint16 png (same format as disparity maps) 178 | file_path = output_depth_dir + '/' + depth_image_file_name 179 | with open(file_path, 'wb') as f: 180 | depth_image = (final_depths * 256).astype(np.uint16) 181 | 182 | # pypng is used because cv2 cannot save uint16 format images 183 | writer = png.Writer(width=depth_image.shape[1], 184 | height=depth_image.shape[0], 185 | bitdepth=16, 186 | greyscale=True) 187 | writer.write(f, depth_image) 188 | 189 | end_total_time = time.time() 190 | 191 | # Update fill times 192 | last_fill_times = np.roll(last_fill_times, -1) 193 | last_fill_times[-1] = end_fill_time - start_fill_time 194 | 195 | # Update total times 196 | last_total_times = np.roll(last_total_times, -1) 197 | last_total_times[-1] = end_total_time - start_total_time 198 | 199 | 200 | if __name__ == "__main__": 201 | main() 202 | -------------------------------------------------------------------------------- /images/000043/000043.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/000043/000043.png -------------------------------------------------------------------------------- /images/000043/completed.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/000043/completed.png -------------------------------------------------------------------------------- /images/000043/lidar.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/000043/lidar.png -------------------------------------------------------------------------------- /images/006338/006338.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/006338/006338.png -------------------------------------------------------------------------------- /images/006338/completed.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/006338/completed.png -------------------------------------------------------------------------------- /images/006338/lidar.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/006338/lidar.png -------------------------------------------------------------------------------- /images/extra_vid.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/extra_vid.png -------------------------------------------------------------------------------- /images/short_vid.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/short_vid.jpg -------------------------------------------------------------------------------- /images/show_process.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/show_process.png -------------------------------------------------------------------------------- /ip_basic/depth_map_utils.py: -------------------------------------------------------------------------------- 1 | import collections 2 | 3 | import cv2 4 | import numpy as np 5 | 6 | # Full kernels 7 | FULL_KERNEL_3 = np.ones((3, 3), np.uint8) 8 | FULL_KERNEL_5 = np.ones((5, 5), np.uint8) 9 | FULL_KERNEL_7 = np.ones((7, 7), np.uint8) 10 | FULL_KERNEL_9 = np.ones((9, 9), np.uint8) 11 | FULL_KERNEL_31 = np.ones((31, 31), np.uint8) 12 | 13 | # 3x3 cross kernel 14 | CROSS_KERNEL_3 = np.asarray( 15 | [ 16 | [0, 1, 0], 17 | [1, 1, 1], 18 | [0, 1, 0], 19 | ], dtype=np.uint8) 20 | 21 | # 5x5 cross kernel 22 | CROSS_KERNEL_5 = np.asarray( 23 | [ 24 | [0, 0, 1, 0, 0], 25 | [0, 0, 1, 0, 0], 26 | [1, 1, 1, 1, 1], 27 | [0, 0, 1, 0, 0], 28 | [0, 0, 1, 0, 0], 29 | ], dtype=np.uint8) 30 | 31 | # 5x5 diamond kernel 32 | DIAMOND_KERNEL_5 = np.array( 33 | [ 34 | [0, 0, 1, 0, 0], 35 | [0, 1, 1, 1, 0], 36 | [1, 1, 1, 1, 1], 37 | [0, 1, 1, 1, 0], 38 | [0, 0, 1, 0, 0], 39 | ], dtype=np.uint8) 40 | 41 | # 7x7 cross kernel 42 | CROSS_KERNEL_7 = np.asarray( 43 | [ 44 | [0, 0, 0, 1, 0, 0, 0], 45 | [0, 0, 0, 1, 0, 0, 0], 46 | [0, 0, 0, 1, 0, 0, 0], 47 | [1, 1, 1, 1, 1, 1, 1], 48 | [0, 0, 0, 1, 0, 0, 0], 49 | [0, 0, 0, 1, 0, 0, 0], 50 | [0, 0, 0, 1, 0, 0, 0], 51 | ], dtype=np.uint8) 52 | 53 | # 7x7 diamond kernel 54 | DIAMOND_KERNEL_7 = np.asarray( 55 | [ 56 | [0, 0, 0, 1, 0, 0, 0], 57 | [0, 0, 1, 1, 1, 0, 0], 58 | [0, 1, 1, 1, 1, 1, 0], 59 | [1, 1, 1, 1, 1, 1, 1], 60 | [0, 1, 1, 1, 1, 1, 0], 61 | [0, 0, 1, 1, 1, 0, 0], 62 | [0, 0, 0, 1, 0, 0, 0], 63 | ], dtype=np.uint8) 64 | 65 | 66 | def fill_in_fast(depth_map, max_depth=100.0, custom_kernel=DIAMOND_KERNEL_5, 67 | extrapolate=False, blur_type='bilateral'): 68 | """Fast, in-place depth completion. 69 | 70 | Args: 71 | depth_map: projected depths 72 | max_depth: max depth value for inversion 73 | custom_kernel: kernel to apply initial dilation 74 | extrapolate: whether to extrapolate by extending depths to top of 75 | the frame, and applying a 31x31 full kernel dilation 76 | blur_type: 77 | 'bilateral' - preserves local structure (recommended) 78 | 'gaussian' - provides lower RMSE 79 | 80 | Returns: 81 | depth_map: dense depth map 82 | """ 83 | 84 | # Invert 85 | valid_pixels = (depth_map > 0.1) 86 | depth_map[valid_pixels] = max_depth - depth_map[valid_pixels] 87 | 88 | # Dilate 89 | depth_map = cv2.dilate(depth_map, custom_kernel) 90 | 91 | # Hole closing 92 | depth_map = cv2.morphologyEx(depth_map, cv2.MORPH_CLOSE, FULL_KERNEL_5) 93 | 94 | # Fill empty spaces with dilated values 95 | empty_pixels = (depth_map < 0.1) 96 | dilated = cv2.dilate(depth_map, FULL_KERNEL_7) 97 | depth_map[empty_pixels] = dilated[empty_pixels] 98 | 99 | # Extend highest pixel to top of image 100 | if extrapolate: 101 | top_row_pixels = np.argmax(depth_map > 0.1, axis=0) 102 | top_pixel_values = depth_map[top_row_pixels, range(depth_map.shape[1])] 103 | 104 | for pixel_col_idx in range(depth_map.shape[1]): 105 | depth_map[0:top_row_pixels[pixel_col_idx], pixel_col_idx] = \ 106 | top_pixel_values[pixel_col_idx] 107 | 108 | # Large Fill 109 | empty_pixels = depth_map < 0.1 110 | dilated = cv2.dilate(depth_map, FULL_KERNEL_31) 111 | depth_map[empty_pixels] = dilated[empty_pixels] 112 | 113 | # Median blur 114 | depth_map = cv2.medianBlur(depth_map, 5) 115 | 116 | # Bilateral or Gaussian blur 117 | if blur_type == 'bilateral': 118 | # Bilateral blur 119 | depth_map = cv2.bilateralFilter(depth_map, 5, 1.5, 2.0) 120 | elif blur_type == 'gaussian': 121 | # Gaussian blur 122 | valid_pixels = (depth_map > 0.1) 123 | blurred = cv2.GaussianBlur(depth_map, (5, 5), 0) 124 | depth_map[valid_pixels] = blurred[valid_pixels] 125 | 126 | # Invert 127 | valid_pixels = (depth_map > 0.1) 128 | depth_map[valid_pixels] = max_depth - depth_map[valid_pixels] 129 | 130 | return depth_map 131 | 132 | 133 | def fill_in_multiscale(depth_map, max_depth=100.0, 134 | dilation_kernel_far=CROSS_KERNEL_3, 135 | dilation_kernel_med=CROSS_KERNEL_5, 136 | dilation_kernel_near=CROSS_KERNEL_7, 137 | extrapolate=False, 138 | blur_type='bilateral', 139 | show_process=False): 140 | """Slower, multi-scale dilation version with additional noise removal that 141 | provides better qualitative results. 142 | 143 | Args: 144 | depth_map: projected depths 145 | max_depth: max depth value for inversion 146 | dilation_kernel_far: dilation kernel to use for 30.0 < depths < 80.0 m 147 | dilation_kernel_med: dilation kernel to use for 15.0 < depths < 30.0 m 148 | dilation_kernel_near: dilation kernel to use for 0.1 < depths < 15.0 m 149 | extrapolate:whether to extrapolate by extending depths to top of 150 | the frame, and applying a 31x31 full kernel dilation 151 | blur_type: 152 | 'gaussian' - provides lower RMSE 153 | 'bilateral' - preserves local structure (recommended) 154 | show_process: saves process images into an OrderedDict 155 | 156 | Returns: 157 | depth_map: dense depth map 158 | process_dict: OrderedDict of process images 159 | """ 160 | 161 | # Convert to float32 162 | depths_in = np.float32(depth_map) 163 | 164 | # Calculate bin masks before inversion 165 | valid_pixels_near = (depths_in > 0.1) & (depths_in <= 15.0) 166 | valid_pixels_med = (depths_in > 15.0) & (depths_in <= 30.0) 167 | valid_pixels_far = (depths_in > 30.0) 168 | 169 | # Invert (and offset) 170 | s1_inverted_depths = np.copy(depths_in) 171 | valid_pixels = (s1_inverted_depths > 0.1) 172 | s1_inverted_depths[valid_pixels] = \ 173 | max_depth - s1_inverted_depths[valid_pixels] 174 | 175 | # Multi-scale dilation 176 | dilated_far = cv2.dilate( 177 | np.multiply(s1_inverted_depths, valid_pixels_far), 178 | dilation_kernel_far) 179 | dilated_med = cv2.dilate( 180 | np.multiply(s1_inverted_depths, valid_pixels_med), 181 | dilation_kernel_med) 182 | dilated_near = cv2.dilate( 183 | np.multiply(s1_inverted_depths, valid_pixels_near), 184 | dilation_kernel_near) 185 | 186 | # Find valid pixels for each binned dilation 187 | valid_pixels_near = (dilated_near > 0.1) 188 | valid_pixels_med = (dilated_med > 0.1) 189 | valid_pixels_far = (dilated_far > 0.1) 190 | 191 | # Combine dilated versions, starting farthest to nearest 192 | s2_dilated_depths = np.copy(s1_inverted_depths) 193 | s2_dilated_depths[valid_pixels_far] = dilated_far[valid_pixels_far] 194 | s2_dilated_depths[valid_pixels_med] = dilated_med[valid_pixels_med] 195 | s2_dilated_depths[valid_pixels_near] = dilated_near[valid_pixels_near] 196 | 197 | # Small hole closure 198 | s3_closed_depths = cv2.morphologyEx( 199 | s2_dilated_depths, cv2.MORPH_CLOSE, FULL_KERNEL_5) 200 | 201 | # Median blur to remove outliers 202 | s4_blurred_depths = np.copy(s3_closed_depths) 203 | blurred = cv2.medianBlur(s3_closed_depths, 5) 204 | valid_pixels = (s3_closed_depths > 0.1) 205 | s4_blurred_depths[valid_pixels] = blurred[valid_pixels] 206 | 207 | # Calculate a top mask 208 | top_mask = np.ones(depths_in.shape, dtype=np.bool) 209 | for pixel_col_idx in range(s4_blurred_depths.shape[1]): 210 | pixel_col = s4_blurred_depths[:, pixel_col_idx] 211 | top_pixel_row = np.argmax(pixel_col > 0.1) 212 | top_mask[0:top_pixel_row, pixel_col_idx] = False 213 | 214 | # Get empty mask 215 | valid_pixels = (s4_blurred_depths > 0.1) 216 | empty_pixels = ~valid_pixels & top_mask 217 | 218 | # Hole fill 219 | dilated = cv2.dilate(s4_blurred_depths, FULL_KERNEL_9) 220 | s5_dilated_depths = np.copy(s4_blurred_depths) 221 | s5_dilated_depths[empty_pixels] = dilated[empty_pixels] 222 | 223 | # Extend highest pixel to top of image or create top mask 224 | s6_extended_depths = np.copy(s5_dilated_depths) 225 | top_mask = np.ones(s5_dilated_depths.shape, dtype=np.bool) 226 | 227 | top_row_pixels = np.argmax(s5_dilated_depths > 0.1, axis=0) 228 | top_pixel_values = s5_dilated_depths[top_row_pixels, 229 | range(s5_dilated_depths.shape[1])] 230 | 231 | for pixel_col_idx in range(s5_dilated_depths.shape[1]): 232 | if extrapolate: 233 | s6_extended_depths[0:top_row_pixels[pixel_col_idx], 234 | pixel_col_idx] = top_pixel_values[pixel_col_idx] 235 | else: 236 | # Create top mask 237 | top_mask[0:top_row_pixels[pixel_col_idx], pixel_col_idx] = False 238 | 239 | # Fill large holes with masked dilations 240 | s7_blurred_depths = np.copy(s6_extended_depths) 241 | for i in range(6): 242 | empty_pixels = (s7_blurred_depths < 0.1) & top_mask 243 | dilated = cv2.dilate(s7_blurred_depths, FULL_KERNEL_5) 244 | s7_blurred_depths[empty_pixels] = dilated[empty_pixels] 245 | 246 | # Median blur 247 | blurred = cv2.medianBlur(s7_blurred_depths, 5) 248 | valid_pixels = (s7_blurred_depths > 0.1) & top_mask 249 | s7_blurred_depths[valid_pixels] = blurred[valid_pixels] 250 | 251 | if blur_type == 'gaussian': 252 | # Gaussian blur 253 | blurred = cv2.GaussianBlur(s7_blurred_depths, (5, 5), 0) 254 | valid_pixels = (s7_blurred_depths > 0.1) & top_mask 255 | s7_blurred_depths[valid_pixels] = blurred[valid_pixels] 256 | elif blur_type == 'bilateral': 257 | # Bilateral blur 258 | blurred = cv2.bilateralFilter(s7_blurred_depths, 5, 0.5, 2.0) 259 | s7_blurred_depths[valid_pixels] = blurred[valid_pixels] 260 | 261 | # Invert (and offset) 262 | s8_inverted_depths = np.copy(s7_blurred_depths) 263 | valid_pixels = np.where(s8_inverted_depths > 0.1) 264 | s8_inverted_depths[valid_pixels] = \ 265 | max_depth - s8_inverted_depths[valid_pixels] 266 | 267 | depths_out = s8_inverted_depths 268 | 269 | process_dict = None 270 | if show_process: 271 | process_dict = collections.OrderedDict() 272 | 273 | process_dict['s0_depths_in'] = depths_in 274 | 275 | process_dict['s1_inverted_depths'] = s1_inverted_depths 276 | process_dict['s2_dilated_depths'] = s2_dilated_depths 277 | process_dict['s3_closed_depths'] = s3_closed_depths 278 | process_dict['s4_blurred_depths'] = s4_blurred_depths 279 | process_dict['s5_combined_depths'] = s5_dilated_depths 280 | process_dict['s6_extended_depths'] = s6_extended_depths 281 | process_dict['s7_blurred_depths'] = s7_blurred_depths 282 | process_dict['s8_inverted_depths'] = s8_inverted_depths 283 | 284 | process_dict['s9_depths_out'] = depths_out 285 | 286 | return depths_out, process_dict 287 | -------------------------------------------------------------------------------- /ip_basic/vis_utils.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | 3 | 4 | def cv2_show_image(window_name, image, 5 | size_wh=None, location_xy=None): 6 | """Helper function for specifying window size and location when 7 | displaying images with cv2. 8 | 9 | Args: 10 | window_name: str window name 11 | image: ndarray image to display 12 | size_wh: window size (w, h) 13 | location_xy: window location (x, y) 14 | """ 15 | 16 | if size_wh is not None: 17 | cv2.namedWindow(window_name, 18 | cv2.WINDOW_KEEPRATIO | cv2.WINDOW_GUI_NORMAL) 19 | cv2.resizeWindow(window_name, *size_wh) 20 | else: 21 | cv2.namedWindow(window_name, cv2.WINDOW_AUTOSIZE) 22 | 23 | if location_xy is not None: 24 | cv2.moveWindow(window_name, *location_xy) 25 | 26 | cv2.imshow(window_name, image) 27 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | matplotlib 2 | numpy 3 | opencv-python 4 | pypng 5 | --------------------------------------------------------------------------------