├── .gitignore
├── LICENSE
├── README.md
├── demos
    ├── dataset
    │   ├── dataset_exploration.py
    │   └── dataset_sparsity.py
    └── depth_completion.py
├── images
    ├── 000043
    │   ├── 000043.png
    │   ├── completed.png
    │   └── lidar.png
    ├── 006338
    │   ├── 006338.png
    │   ├── completed.png
    │   └── lidar.png
    ├── extra_vid.png
    ├── short_vid.jpg
    └── show_process.png
├── ip_basic
    ├── depth_map_utils.py
    └── vis_utils.py
└── requirements.txt


/.gitignore:
--------------------------------------------------------------------------------
1 | *.pyc
2 | 
3 | outputs
4 | 
5 | # PyCharm
6 | .idea
7 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2018 Jason Ku
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | ## Image Processing for Basic Depth Completion (IP-Basic)
  2 | Depth completion is the task of converting a sparse depth map D<sub>sparse</sub> into a dense depth map D<sub>dense</sub>. This algorithm was originally created to help visualize 3D object detection results for [AVOD](https://arxiv.org/abs/1712.02294).
  3 | 
  4 | An accurate dense depth map can also benefit 3D object detection or SLAM algorithms that use point cloud input. This method uses an unguided approach (images are ignored, only LIDAR projections are used). Basic depth completion is done with OpenCV and NumPy operations in Python. For more information, please see our paper: [In Defense of Classical Image Processing: Fast Depth Completion on the CPU](https://arxiv.org/abs/1802.00036).
  5 | 
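As a rough illustration of the kind of operation involved: `fill_in_fast` in `ip_basic/depth_map_utils.py` inverts valid depths (`max_depth - depth`) before dilating. A plausible reading is that dilation takes the neighbourhood maximum, so inverting lets nearer points win when two depths compete for an empty pixel. The sketch below is NumPy-only; `dilate_max_3x3` is a hypothetical stand-in for `cv2.dilate` with a 3x3 full kernel, and `fill_with_inversion` is an illustrative helper, not the repository's implementation.

```python
import numpy as np


def dilate_max_3x3(img):
    """Hypothetical stand-in for cv2.dilate with a 3x3 full kernel:
    each pixel becomes the maximum of its 3x3 neighbourhood."""
    padded = np.pad(img, 1, mode='constant', constant_values=0.0)
    h, w = img.shape
    shifted = [padded[dy:dy + h, dx:dx + w]
               for dy in range(3) for dx in range(3)]
    return np.max(shifted, axis=0)


def fill_with_inversion(sparse, max_depth=100.0, min_valid=0.1):
    """Invert valid depths, dilate, then invert back (illustrative only)."""
    inverted = sparse.copy()
    valid = inverted > min_valid
    inverted[valid] = max_depth - inverted[valid]

    dilated = dilate_max_3x3(inverted)

    filled = dilated > min_valid
    dilated[filled] = max_depth - dilated[filled]
    return dilated


# One empty pixel between a near point (5 m) and a far point (50 m)
sparse = np.array([[5.0, 0.0, 50.0]], dtype=np.float32)

naive = dilate_max_3x3(sparse)          # far depth fills the gap
restored = fill_with_inversion(sparse)  # near depth fills the gap
print(naive, restored)
```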
  6 | Please visit [https://github.com/kujason/scene_vis](https://github.com/kujason/scene_vis) for 3D point cloud visualization demos on raw KITTI data.
  7 | 
  8 | If you use this code, we would appreciate it if you cited our paper:
  9 | [In Defense of Classical Image Processing: Fast Depth Completion on the CPU](https://arxiv.org/abs/1802.00036)
 10 | 
 11 | ```
 12 | @inproceedings{ku2018defense,
 13 |   title={In Defense of Classical Image Processing: Fast Depth Completion on the CPU},
 14 |   author={Ku, Jason and Harakeh, Ali and Waslander, Steven L},
 15 |   booktitle={2018 15th Conference on Computer and Robot Vision (CRV)},
 16 |   pages={16--22},
 17 |   year={2018},
 18 |   organization={IEEE}
 19 | }
 20 | ```
 21 | 
 22 | ### Videos
 23 | Click [here](https://www.youtube.com/watch?v=t_CGGUE2kEM) for a short demo video with comparison of different versions.
 24 | 
 25 | [![Demo Video](images/short_vid.jpg)](https://www.youtube.com/watch?v=t_CGGUE2kEM)
 26 | ---
 27 | Click [here](https://youtu.be/tYwIcLvfC70) to see point clouds from additional KITTI raw sequences. Note that the structure of smaller or thin objects (e.g. poles, bicycle wheels, pedestrians) is well preserved after depth completion.
 28 | 
 29 | [![Extra Scenes](images/extra_vid.png)](https://youtu.be/tYwIcLvfC70)
 30 | ---
 31 | Also see an earlier version of the algorithm in action [here (2 top views)](https://www.youtube.com/watch?v=Q1f-s6_yHtw).
 32 | 
 33 | ---
 34 | ### Setup
 35 | [show_process]: images/show_process.png "Showing Process"
 36 | Tested on Ubuntu 16.04 with Python 3.5.
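 37 | 1. Download and unzip the KITTI depth completion benchmark dataset into `~/Kitti/depth` (only the val_selection_cropped and test data sets are required to run the demo). Note that `demos/depth_completion.py` reads from `~/Kitti/depth/depth_selection/val_selection_cropped/velodyne_raw` by default, so adjust `input_depth_dir` there if your layout differs. The folder should look something like the following: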
 38 |   - ~/Kitti/depth
 39 |     - devkit
 40 |     - test_depth_completion_anonymous
 41 |     - test_depth_prediction_anonymous
 42 |     - train
 43 |     - val
 44 |     - val_selection_cropped
 45 | 
 46 | 2. Clone this repo and install Python requirements:
 47 | ```
 48 | git clone git@github.com:kujason/ip_basic.git
 49 | cd ip_basic
 50 | pip3 install -r requirements.txt
 51 | ```
 52 | 
 53 | 3. Run the script:
 54 | ```bash
 55 | python3 demos/depth_completion.py
 56 | ```
 57 | This will run the algorithm on the cropped validation set and save the outputs to a new folder in `demos/outputs`. Refer to the readme in the downloaded devkit to evaluate the results.
 58 | 
 59 | 4. (Optional) Set options in `depth_completion.py`
 60 | - To run on the test set, comment out the lines below `# Validation set` and uncomment the lines below `# Test set`
 61 | - `fill_type`:
 62 |   - `'fast'` - Version described in the paper
 63 |   - `'multiscale'` - Multi-scale dilations based on depth, with additional noise removal
 64 | - `extrapolate`:
 65 |   - `True` - Extends depths to the top of the frame and runs a 31x31 full kernel dilation
 66 |   - `False` - Skips the extension and large dilation
 67 | - `save_output`:
 68 |   - `True` - Saves the output depth maps to disk
 69 |   - `False` - Shows the filling process. Only works with `fill_type == 'multiscale'`
 70 |   ![alt text][show_process]
 71 | 
 72 | 5. (Optional) To run the algorithm on a specific CPU (e.g. CPU 0):
 73 | ```bash
 74 | taskset --cpu-list 0 python3 demos/depth_completion.py
 75 | ```
 76 | 
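The `extrapolate` option in step 4 extends depths to the top of the frame. A minimal sketch of that column-wise extension is shown below, modelled on the corresponding loop in `fill_in_fast`. The function name `extend_to_top` and the `min_valid` parameter are assumptions made for illustration, and unlike the repository code this version works on a copy.

```python
import numpy as np


def extend_to_top(depth_map, min_valid=0.1):
    """Copy each column's topmost valid depth upwards to row 0.

    Illustrative helper, not part of ip_basic. Columns without any
    valid pixel are left unchanged.
    """
    out = depth_map.copy()

    # argmax on a boolean mask returns the first True row per column
    # (or 0 when the column has no valid pixel)
    top_rows = np.argmax(out > min_valid, axis=0)
    cols = np.arange(out.shape[1])
    top_values = out[top_rows, cols]

    for col in cols:
        out[:top_rows[col], col] = top_values[col]
    return out


depth = np.array([[0.0, 0.0, 0.0],
                  [0.0, 7.0, 0.0],
                  [3.0, 8.0, 0.0]], dtype=np.float32)
extended = extend_to_top(depth)
print(extended)
```

In the repository, this extension is followed by a dilation with a 31x31 full kernel to fill any remaining empty pixels, such as the all-empty third column here.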
 77 | ### Results
 78 | #### KITTI Test Set Evaluation
 79 | |        Method |  `iRMSE` |   `iMAE` |    **RMSE** |      `MAE` |     `Device` |  `Runtime` |   `FPS` |
 80 | |:-------------:|:--------:|:--------:|:-----------:|:----------:|:------------:|:----------:|:-------:|
 81 | |     NadarayaW |     6.34 |     1.84 |     1852.60 |     416.77 | CPU (1 core) |     0.05 s |      20 |
 82 | |   SparseConvs |     4.94 |     1.78 |     1601.33 |     481.27 |          GPU | **0.01 s** | **100** |
 83 | |        NN+CNN | **3.25** | **1.29** |     1419.75 |     416.14 |          GPU |     0.02 s |      50 |
 84 | |  **IP-Basic** |     3.75 | **1.29** | **1288.46** | **302.60** | CPU (1 core) |    0.011 s |      90 |
 85 | 
 86 | Table: Comparison of results with other published unguided methods on the [KITTI Depth Completion benchmark](http://www.cvlibs.net/datasets/kitti/eval_depth.php?benchmark=depth_completion).
 87 | 
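The table's error columns can be sketched as follows. This assumes the usual benchmark conventions (RMSE and MAE in millimetres, iRMSE and iMAE on inverse depth in 1/km, evaluated only where ground truth is available); the official devkit remains the reference, and `depth_errors` is a hypothetical helper, not part of this repository.

```python
import numpy as np


def depth_errors(pred_m, gt_m):
    """Compute error metrics between a dense prediction and sparse
    ground truth, both in metres (illustrative sketch only)."""
    mask = gt_m > 0.0  # assumption: zero marks missing ground truth
    pred = pred_m[mask].astype(np.float64)
    gt = gt_m[mask].astype(np.float64)

    diff_mm = (pred - gt) * 1000.0                  # m -> mm
    inv_diff_km = (1.0 / pred - 1.0 / gt) * 1000.0  # 1/m -> 1/km

    return {
        'iRMSE': np.sqrt(np.mean(inv_diff_km ** 2)),
        'iMAE': np.mean(np.abs(inv_diff_km)),
        'RMSE': np.sqrt(np.mean(diff_mm ** 2)),
        'MAE': np.mean(np.abs(diff_mm)),
    }


gt = np.array([[10.0, 0.0], [20.0, 0.0]])
pred = np.array([[11.0, 5.0], [20.0, 5.0]])
errors = depth_errors(pred, gt)
print(errors)
```

The prediction must be strictly positive wherever ground truth is valid, otherwise the inverse-depth terms are undefined.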
 88 | ### Versions
 89 | Several versions are provided for experimentation.
 90 | - `Gaussian (Paper Result, Lowest RMSE)`: Provides lowest RMSE, but adds many additional 3D points to the scene.
 91 | - `Bilateral`: Preserves local structure, but large extrapolations make it slower.
 92 | - `Gaussian, No Extrapolation`: Fastest version, but adds many additional 3D points to the scene.
 93 | - `Bilateral, No Extrapolation (Lowest MAE)`: Preserves local structure, and skips the large extrapolation steps. This is the recommended version for practical applications.
 94 | - `Multi-Scale, Bilateral, Noise Removal, No Extrapolation`: Slower version with additional noise removal. See the [above video](https://www.youtube.com/watch?v=t_CGGUE2kEM) for a qualitative comparison.
 95 | 
 96 | #### Timing Comparisons
 97 | The table below shows a comparison of timing on an Intel Core i7-7700K for different versions. The Gaussian versions can be run on a single core, while other versions run faster with multiple cores. The bilateral blur version with no extrapolation is recommended for practical applications.
 98 | 
 99 | |                                                   Version |  Runtime | FPS |
100 | |:---------------------------------------------------------:|:--------:|:---:|
101 | |                  Gaussian (**Paper Result, Lowest RMSE**) | 0.0111 s |  90 |
102 | |                                                 Bilateral | 0.0139 s |  71 |
103 | |                                Gaussian, No Extrapolation | 0.0075 s | 133 |
104 | |              Bilateral, No Extrapolation (**Lowest MAE**) | 0.0115 s |  87 |
105 | |   Multi-Scale, Bilateral, Noise Removal, No Extrapolation | 0.0328 s |  30 |
106 | 
107 | Table: Timing comparison for different versions.
108 | 
109 | ### Examples
110 | Qualitative results from the `Multi-Scale, Bilateral, Noise Removal, No Extrapolation` version on samples from the KITTI object detection benchmark.
111 | #### Cars
112 | [sample_006338]: images/006338/006338.png "Sample 006338"
113 | [lidar_006338]: images/006338/lidar.png "Colourized LIDAR points"
114 | [completed_006338]: images/006338/completed.png "Points after depth completion"
115 | - Image:
116 |     ![alt text][sample_006338]
117 | - Colourized LIDAR points (showing ground truth boxes):
118 |     ![alt text][lidar_006338]
119 | - After depth completion (showing ground truth boxes):
120 |     ![alt text][completed_006338]
121 | 
122 | #### People
123 | [sample_000043]: images/000043/000043.png "Sample 000043"
124 | [lidar_000043]: images/000043/lidar.png "Colourized LIDAR points"
125 | [completed_000043]: images/000043/completed.png "Points after depth completion"
126 | - Image:
127 |     ![alt text][sample_000043]
128 | - Colourized LIDAR points (showing ground truth boxes):
129 |     ![alt text][lidar_000043]
130 | - After depth completion (showing ground truth boxes):
131 |     ![alt text][completed_000043]
132 | 


--------------------------------------------------------------------------------
/demos/dataset/dataset_exploration.py:
--------------------------------------------------------------------------------
 1 | import glob
 2 | import os
 3 | import sys
 4 | 
 5 | import cv2
 6 | import matplotlib.pyplot as plt
 7 | import numpy as np
 8 | 
 9 | 
10 | def main():
11 | 
12 |     input_depth_dir = os.path.expanduser(
13 |         '~/Kitti/depth/val_selection_cropped/velodyne_raw')
14 |     images_to_use = sorted(glob.glob(input_depth_dir + '/*'))
15 | 
16 |     npy_file_name = 'outputs/all_depths.npy'
17 |     all_depths_saved = os.path.exists(npy_file_name)
18 | 
19 |     if not all_depths_saved:
20 |         os.makedirs('./outputs', exist_ok=True)
21 | 
22 |         # Process depth images
23 |         all_depths = np.array([])
24 |         num_images = len(images_to_use)
25 |         for i in range(num_images):
26 | 
27 |             # Print progress
28 |             sys.stdout.write('\rProcessing {} / {}'.format(i, num_images - 1))
29 |             sys.stdout.flush()
30 | 
31 |             depth_image_path = images_to_use[i]
32 | 
33 |             # Load depth from image
34 |             depth_image = cv2.imread(depth_image_path, cv2.IMREAD_ANYDEPTH)
35 | 
36 |             # Divide by 256
37 |             depth_map = depth_image / 256.0
38 | 
39 |             valid_pixels = (depth_map > 0.0)
40 |             all_depths = np.concatenate([all_depths, depth_map[valid_pixels]])
41 | 
42 |         np.save(npy_file_name, all_depths)
43 | 
44 |     else:
45 |         # Load from npy
46 |         all_depths = np.load(npy_file_name)
47 | 
48 |     print('')
49 |     print('Depths')
50 |     print('Min:   ', np.amin(all_depths))
51 |     print('Max:   ', np.amax(all_depths))
52 |     print('Mean:  ', np.mean(all_depths))
53 |     print('Median:  ', np.median(all_depths))
54 | 
55 |     plt.hist(all_depths, bins=80)
56 |     plt.show()
57 | 
58 | 
59 | if __name__ == '__main__':
60 |     main()
61 | 
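The script above divides the loaded uint16 image by 256 to obtain depths in metres, and `demos/depth_completion.py` multiplies by 256 before saving. A minimal round-trip sketch of that encoding follows. The helper names `encode_depth` and `decode_depth` are assumptions for illustration, and treating zero as "no measurement" mirrors the `depth_map > 0.0` check above.

```python
import numpy as np


def encode_depth(depth_m):
    """Metres -> uint16 values (truncating), as done before saving."""
    return (depth_m * 256.0).astype(np.uint16)


def decode_depth(depth_png):
    """uint16 values -> metres, as done after loading."""
    return depth_png.astype(np.float32) / 256.0


depth_m = np.array([[0.0, 12.5, 80.3]], dtype=np.float32)
encoded = encode_depth(depth_m)
decoded = decode_depth(encoded)

# Truncation loses less than 1/256 m per pixel
max_error = np.abs(decoded - depth_m).max()
print(encoded, decoded, max_error)
```

With 16 bits, this encoding cannot represent depths of 256 m or more, which is well beyond the `max_depth=100.0` default used in `ip_basic/depth_map_utils.py`.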


--------------------------------------------------------------------------------
/demos/dataset/dataset_sparsity.py:
--------------------------------------------------------------------------------
 1 | import glob
 2 | import os
 3 | import sys
 4 | 
 5 | import cv2
 6 | import matplotlib.pyplot as plt
 7 | import numpy as np
 8 | 
 9 | 
10 | def main():
11 | 
12 |     input_depth_dir = os.path.expanduser(
13 |         '~/Kitti/depth/val_selection_cropped/velodyne_raw')
14 | 
15 |     images_to_use = sorted(glob.glob(input_depth_dir + '/*'))
16 | 
17 |     # Process depth images
18 |     num_images = len(images_to_use)
19 |     all_sparsities = np.zeros(num_images)
20 | 
21 |     for i in range(num_images):
22 | 
23 |         # Print progress
24 |         sys.stdout.write('\rProcessing index {} / {}'.format(i, num_images - 1))
25 |         sys.stdout.flush()
26 | 
27 |         depth_image_path = images_to_use[i]
28 | 
29 |         # Load depth from image
30 |         depth_image = cv2.imread(depth_image_path, cv2.IMREAD_ANYDEPTH)
31 | 
32 |         # Divide by 256
33 |         depth_map = depth_image / 256.0
34 | 
35 |         num_valid_pixels = len(np.where(depth_map > 0.0)[0])
36 |         num_pixels = depth_image.shape[0] * depth_image.shape[1]
37 | 
38 |         sparsity = num_valid_pixels / (num_pixels * 2/3)
39 |         all_sparsities[i] = sparsity
40 | 
41 |     print('')
42 |     print('Sparsity')
43 |     print('Min:   ', np.amin(all_sparsities))
44 |     print('Max:   ', np.amax(all_sparsities))
45 |     print('Mean:  ', np.mean(all_sparsities))
46 |     print('Median:  ', np.median(all_sparsities))
47 | 
48 |     plt.hist(all_sparsities, bins=20)
49 |     plt.show()
50 | 
51 | 
52 | if __name__ == '__main__':
53 |     main()
54 | 
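The per-image statistic computed above can be sketched as a small function. The 2/3 factor is taken from the script; interpreting it as the share of the image that LIDAR returns can cover is an assumption, as are the names `valid_fraction` and `covered_fraction`.

```python
import numpy as np


def valid_fraction(depth_map, covered_fraction=2.0 / 3.0):
    """Fraction of valid (non-zero) depth pixels, relative to the part
    of the image assumed to be covered by LIDAR (illustrative only)."""
    num_valid = np.count_nonzero(depth_map > 0.0)
    return num_valid / (depth_map.size * covered_fraction)


depth_map = np.zeros((3, 4), dtype=np.float32)
depth_map[2, 0] = 10.0
depth_map[2, 3] = 25.0

fraction = valid_fraction(depth_map)
print(fraction)
```

`np.count_nonzero` on the boolean mask gives the same count as `len(np.where(...)[0])` without building the index arrays.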


--------------------------------------------------------------------------------
/demos/depth_completion.py:
--------------------------------------------------------------------------------
  1 | import glob
  2 | import os
  3 | import sys
  4 | import time
  5 | 
  6 | import cv2
  7 | import numpy as np
  8 | import png
  9 | 
 10 | from ip_basic import depth_map_utils
 11 | from ip_basic import vis_utils
 12 | 
 13 | 
 14 | def main():
 15 |     """Depth maps are saved to the 'outputs' folder.
 16 |     """
 17 | 
 18 |     ##############################
 19 |     # Options
 20 |     ##############################
 21 |     # Validation set
 22 |     input_depth_dir = os.path.expanduser(
 23 |         '~/Kitti/depth/depth_selection/val_selection_cropped/velodyne_raw')
 24 |     data_split = 'val'
 25 | 
 26 |     # Test set
 27 |     # input_depth_dir = os.path.expanduser(
 28 |     #     '~/Kitti/depth/depth_selection/test_depth_completion_anonymous/velodyne_raw')
 29 |     # data_split = 'test'
 30 | 
 31 |     # Fast fill with Gaussian blur @90Hz (paper result)
 32 |     fill_type = 'fast'
 33 |     extrapolate = True
 34 |     blur_type = 'gaussian'
 35 | 
 36 |     # Fast Fill with bilateral blur, no extrapolation @87Hz (recommended)
 37 |     # fill_type = 'fast'
 38 |     # extrapolate = False
 39 |     # blur_type = 'bilateral'
 40 | 
 41 |     # Multi-scale dilations with extra noise removal, no extrapolation @ 30Hz
 42 |     # fill_type = 'multiscale'
 43 |     # extrapolate = False
 44 |     # blur_type = 'bilateral'
 45 | 
 46 |     # Save output to disk or show process
 47 |     save_output = True
 48 | 
 49 |     ##############################
 50 |     # Processing
 51 |     ##############################
 52 |     if save_output:
 53 |         # Save to Disk
 54 |         show_process = False
 55 |         save_depth_maps = True
 56 |     else:
 57 |         if fill_type == 'fast':
 58 |             raise ValueError('"fast" fill does not support show_process')
 59 | 
 60 |         # Show Process
 61 |         show_process = True
 62 |         save_depth_maps = False
 63 | 
 64 |     # Create output folder
 65 |     this_file_path = os.path.dirname(os.path.realpath(__file__))
 66 |     outputs_dir = this_file_path + '/outputs'
 67 |     os.makedirs(outputs_dir, exist_ok=True)
 68 | 
 69 |     output_folder_prefix = 'depth_' + data_split
 70 |     output_list = sorted(os.listdir(outputs_dir))
 71 |     if len(output_list) > 0:
 72 |         split_folders = [folder for folder in output_list
 73 |                          if folder.startswith(output_folder_prefix)]
 74 |         if len(split_folders) > 0:
 75 |             last_output_folder = split_folders[-1]
 76 |             last_output_index = int(last_output_folder.split('_')[-1])
 77 |         else:
 78 |             last_output_index = -1
 79 |     else:
 80 |         last_output_index = -1
 81 |     output_depth_dir = outputs_dir + '/{}_{:03d}'.format(
 82 |         output_folder_prefix, last_output_index + 1)
 83 | 
 84 |     if save_output:
 85 |         if not os.path.exists(output_depth_dir):
 86 |             os.makedirs(output_depth_dir)
 87 |         else:
 88 |             raise FileExistsError('Already exists!')
 89 |         print('Output dir:', output_depth_dir)
 90 | 
 91 |     # Get images in sorted order
 92 |     images_to_use = sorted(glob.glob(input_depth_dir + '/*'))
 93 | 
 94 |     # Rolling average array of times for time estimation
 95 |     avg_time_arr_length = 10
 96 |     last_fill_times = np.repeat([1.0], avg_time_arr_length)
 97 |     last_total_times = np.repeat([1.0], avg_time_arr_length)
 98 | 
 99 |     num_images = len(images_to_use)
100 |     for i in range(num_images):
101 | 
102 |         depth_image_path = images_to_use[i]
103 | 
104 |         # Calculate average time with last n fill times
105 |         avg_fill_time = np.mean(last_fill_times)
106 |         avg_total_time = np.mean(last_total_times)
107 | 
108 |         # Show progress
109 |         sys.stdout.write('\rProcessing {} / {}, '
110 |                          'Avg Fill Time: {:.5f}s, '
111 |                          'Avg Total Time: {:.5f}s, '
112 |                          'Est Time Remaining: {:.3f}s'.format(
113 |                              i, num_images - 1, avg_fill_time, avg_total_time,
114 |                              avg_total_time * (num_images - i)))
115 |         sys.stdout.flush()
116 | 
117 |         # Start timing
118 |         start_total_time = time.time()
119 | 
120 |         # Load depth projections from uint16 image
121 |         depth_image = cv2.imread(depth_image_path, cv2.IMREAD_ANYDEPTH)
122 |         projected_depths = np.float32(depth_image / 256.0)
123 | 
124 |         # Fill in
125 |         start_fill_time = time.time()
126 |         if fill_type == 'fast':
127 |             final_depths = depth_map_utils.fill_in_fast(
128 |                 projected_depths, extrapolate=extrapolate, blur_type=blur_type)
129 |         elif fill_type == 'multiscale':
130 |             final_depths, process_dict = depth_map_utils.fill_in_multiscale(
131 |                 projected_depths, extrapolate=extrapolate, blur_type=blur_type,
132 |                 show_process=show_process)
133 |         else:
134 |             raise ValueError('Invalid fill_type {}'.format(fill_type))
135 |         end_fill_time = time.time()
136 | 
137 |         # Display images from process_dict
138 |         if fill_type == 'multiscale' and show_process:
139 |             img_size = (570, 165)
140 | 
141 |             x_start = 80
142 |             y_start = 50
143 |             x_offset = img_size[0]
144 |             y_offset = img_size[1]
145 |             x_padding = 0
146 |             y_padding = 28
147 | 
148 |             img_x = x_start
149 |             img_y = y_start
150 |             max_x = 1900
151 | 
152 |             row_idx = 0
153 |             for key, value in process_dict.items():
154 | 
155 |                 image_jet = cv2.applyColorMap(
156 |                     np.uint8(value / np.amax(value) * 255),
157 |                     cv2.COLORMAP_JET)
158 |                 vis_utils.cv2_show_image(
159 |                     key, image_jet,
160 |                     img_size, (img_x, img_y))
161 | 
162 |                 img_x += x_offset + x_padding
163 |                 if (img_x + x_offset + x_padding) > max_x:
164 |                     img_x = x_start
165 |                     row_idx += 1
166 |                 img_y = y_start + row_idx * (y_offset + y_padding)
167 | 
168 |                 # Save process images
169 |                 cv2.imwrite('process/' + key + '.png', image_jet)
170 | 
171 |             cv2.waitKey()
172 | 
173 |         # Save depth images to disk
174 |         if save_depth_maps:
175 |             depth_image_file_name = os.path.split(depth_image_path)[1]
176 | 
177 |             # Save depth map to a uint16 png (same format as disparity maps)
178 |             file_path = output_depth_dir + '/' + depth_image_file_name
179 |             with open(file_path, 'wb') as f:
180 |                 depth_image = (final_depths * 256).astype(np.uint16)
181 | 
182 |                 # pypng is used to write the 16-bit greyscale png
183 |                 writer = png.Writer(width=depth_image.shape[1],
184 |                                     height=depth_image.shape[0],
185 |                                     bitdepth=16,
186 |                                     greyscale=True)
187 |                 writer.write(f, depth_image)
188 | 
189 |         end_total_time = time.time()
190 | 
191 |         # Update fill times
192 |         last_fill_times = np.roll(last_fill_times, -1)
193 |         last_fill_times[-1] = end_fill_time - start_fill_time
194 | 
195 |         # Update total times
196 |         last_total_times = np.roll(last_total_times, -1)
197 |         last_total_times[-1] = end_total_time - start_total_time
198 | 
199 | 
200 | if __name__ == "__main__":
201 |     main()
202 | 
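The output-folder numbering in the script above (`depth_val_000`, `depth_val_001`, ...) can be sketched in isolation. The helper `next_output_dir_name` is hypothetical; it takes the maximum numeric suffix rather than parsing the last sorted entry, so a stray folder with the same prefix and a non-numeric suffix would not raise a `ValueError`.

```python
def next_output_dir_name(existing_names, prefix):
    """Return the next '<prefix>_NNN' name given existing folder names
    (illustrative helper, not part of the repository)."""
    indices = []
    for name in existing_names:
        if name.startswith(prefix + '_'):
            suffix = name[len(prefix) + 1:]
            # Ignore folders whose suffix is not a plain number
            if suffix.isdigit():
                indices.append(int(suffix))

    next_index = max(indices) + 1 if indices else 0
    return '{}_{:03d}'.format(prefix, next_index)


existing = ['depth_test_000', 'depth_val_000', 'depth_val_001',
            'depth_val_notes']
next_name = next_output_dir_name(existing, 'depth_val')
first_name = next_output_dir_name([], 'depth_val')
print(next_name, first_name)
```

In practice `existing_names` would come from something like `os.listdir(outputs_dir)`, as in the script.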


--------------------------------------------------------------------------------
/images/000043/000043.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/000043/000043.png


--------------------------------------------------------------------------------
/images/000043/completed.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/000043/completed.png


--------------------------------------------------------------------------------
/images/000043/lidar.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/000043/lidar.png


--------------------------------------------------------------------------------
/images/006338/006338.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/006338/006338.png


--------------------------------------------------------------------------------
/images/006338/completed.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/006338/completed.png


--------------------------------------------------------------------------------
/images/006338/lidar.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/006338/lidar.png


--------------------------------------------------------------------------------
/images/extra_vid.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/extra_vid.png


--------------------------------------------------------------------------------
/images/short_vid.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/short_vid.jpg


--------------------------------------------------------------------------------
/images/show_process.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/kujason/ip_basic/b6e6ce9c64e1fee63c2535f9999503cfe7e68066/images/show_process.png


--------------------------------------------------------------------------------
/ip_basic/depth_map_utils.py:
--------------------------------------------------------------------------------
  1 | import collections
  2 | 
  3 | import cv2
  4 | import numpy as np
  5 | 
  6 | # Full kernels
  7 | FULL_KERNEL_3 = np.ones((3, 3), np.uint8)
  8 | FULL_KERNEL_5 = np.ones((5, 5), np.uint8)
  9 | FULL_KERNEL_7 = np.ones((7, 7), np.uint8)
 10 | FULL_KERNEL_9 = np.ones((9, 9), np.uint8)
 11 | FULL_KERNEL_31 = np.ones((31, 31), np.uint8)
 12 | 
 13 | # 3x3 cross kernel
 14 | CROSS_KERNEL_3 = np.asarray(
 15 |     [
 16 |         [0, 1, 0],
 17 |         [1, 1, 1],
 18 |         [0, 1, 0],
 19 |     ], dtype=np.uint8)
 20 | 
 21 | # 5x5 cross kernel
 22 | CROSS_KERNEL_5 = np.asarray(
 23 |     [
 24 |         [0, 0, 1, 0, 0],
 25 |         [0, 0, 1, 0, 0],
 26 |         [1, 1, 1, 1, 1],
 27 |         [0, 0, 1, 0, 0],
 28 |         [0, 0, 1, 0, 0],
 29 |     ], dtype=np.uint8)
 30 | 
 31 | # 5x5 diamond kernel
 32 | DIAMOND_KERNEL_5 = np.array(
 33 |     [
 34 |         [0, 0, 1, 0, 0],
 35 |         [0, 1, 1, 1, 0],
 36 |         [1, 1, 1, 1, 1],
 37 |         [0, 1, 1, 1, 0],
 38 |         [0, 0, 1, 0, 0],
 39 |     ], dtype=np.uint8)
 40 | 
 41 | # 7x7 cross kernel
 42 | CROSS_KERNEL_7 = np.asarray(
 43 |     [
 44 |         [0, 0, 0, 1, 0, 0, 0],
 45 |         [0, 0, 0, 1, 0, 0, 0],
 46 |         [0, 0, 0, 1, 0, 0, 0],
 47 |         [1, 1, 1, 1, 1, 1, 1],
 48 |         [0, 0, 0, 1, 0, 0, 0],
 49 |         [0, 0, 0, 1, 0, 0, 0],
 50 |         [0, 0, 0, 1, 0, 0, 0],
 51 |     ], dtype=np.uint8)
 52 | 
 53 | # 7x7 diamond kernel
 54 | DIAMOND_KERNEL_7 = np.asarray(
 55 |     [
 56 |         [0, 0, 0, 1, 0, 0, 0],
 57 |         [0, 0, 1, 1, 1, 0, 0],
 58 |         [0, 1, 1, 1, 1, 1, 0],
 59 |         [1, 1, 1, 1, 1, 1, 1],
 60 |         [0, 1, 1, 1, 1, 1, 0],
 61 |         [0, 0, 1, 1, 1, 0, 0],
 62 |         [0, 0, 0, 1, 0, 0, 0],
 63 |     ], dtype=np.uint8)
 64 | 
 65 | 
 66 | def fill_in_fast(depth_map, max_depth=100.0, custom_kernel=DIAMOND_KERNEL_5,
 67 |                  extrapolate=False, blur_type='bilateral'):
 68 |     """Fast, in-place depth completion.
 69 | 
 70 |     Args:
 71 |         depth_map: projected depths
 72 |         max_depth: max depth value for inversion
 73 |         custom_kernel: kernel to apply initial dilation
 74 |         extrapolate: whether to extrapolate by extending depths to top of
 75 |             the frame, and applying a 31x31 full kernel dilation
 76 |         blur_type:
 77 |             'bilateral' - preserves local structure (recommended)
 78 |             'gaussian' - provides lower RMSE
 79 | 
 80 |     Returns:
 81 |         depth_map: dense depth map
 82 |     """
 83 | 
 84 |     # Invert
 85 |     valid_pixels = (depth_map > 0.1)
 86 |     depth_map[valid_pixels] = max_depth - depth_map[valid_pixels]
 87 | 
 88 |     # Dilate
 89 |     depth_map = cv2.dilate(depth_map, custom_kernel)
 90 | 
 91 |     # Hole closing
 92 |     depth_map = cv2.morphologyEx(depth_map, cv2.MORPH_CLOSE, FULL_KERNEL_5)
 93 | 
 94 |     # Fill empty spaces with dilated values
 95 |     empty_pixels = (depth_map < 0.1)
 96 |     dilated = cv2.dilate(depth_map, FULL_KERNEL_7)
 97 |     depth_map[empty_pixels] = dilated[empty_pixels]
 98 | 
 99 |     # Extend highest pixel to top of image
100 |     if extrapolate:
101 |         top_row_pixels = np.argmax(depth_map > 0.1, axis=0)
102 |         top_pixel_values = depth_map[top_row_pixels, range(depth_map.shape[1])]
103 | 
104 |         for pixel_col_idx in range(depth_map.shape[1]):
105 |             depth_map[0:top_row_pixels[pixel_col_idx], pixel_col_idx] = \
106 |                 top_pixel_values[pixel_col_idx]
107 | 
108 |         # Large Fill
109 |         empty_pixels = depth_map < 0.1
110 |         dilated = cv2.dilate(depth_map, FULL_KERNEL_31)
111 |         depth_map[empty_pixels] = dilated[empty_pixels]
112 | 
113 |     # Median blur
114 |     depth_map = cv2.medianBlur(depth_map, 5)
115 | 
116 |     # Bilateral or Gaussian blur
117 |     if blur_type == 'bilateral':
118 |         # Bilateral blur
119 |         depth_map = cv2.bilateralFilter(depth_map, 5, 1.5, 2.0)
120 |     elif blur_type == 'gaussian':
121 |         # Gaussian blur
122 |         valid_pixels = (depth_map > 0.1)
123 |         blurred = cv2.GaussianBlur(depth_map, (5, 5), 0)
124 |         depth_map[valid_pixels] = blurred[valid_pixels]
125 | 
126 |     # Invert
127 |     valid_pixels = (depth_map > 0.1)
128 |     depth_map[valid_pixels] = max_depth - depth_map[valid_pixels]
129 | 
130 |     return depth_map
131 | 
132 | 
133 | def fill_in_multiscale(depth_map, max_depth=100.0,
134 |                        dilation_kernel_far=CROSS_KERNEL_3,
135 |                        dilation_kernel_med=CROSS_KERNEL_5,
136 |                        dilation_kernel_near=CROSS_KERNEL_7,
137 |                        extrapolate=False,
138 |                        blur_type='bilateral',
139 |                        show_process=False):
140 |     """Slower, multi-scale dilation version with additional noise removal that
141 |     provides better qualitative results.
142 | 
143 |     Args:
144 |         depth_map: projected depths
145 |         max_depth: max depth value for inversion
146 |         dilation_kernel_far: dilation kernel to use for 30.0 < depths < 80.0 m
147 |         dilation_kernel_med: dilation kernel to use for 15.0 < depths < 30.0 m
148 |         dilation_kernel_near: dilation kernel to use for 0.1 < depths < 15.0 m
149 |         extrapolate: whether to extrapolate by extending depths to top of
150 |             the frame, and applying a 31x31 full kernel dilation
151 |         blur_type:
152 |             'gaussian' - provides lower RMSE
153 |             'bilateral' - preserves local structure (recommended)
154 |         show_process: saves process images into an OrderedDict
155 | 
156 |     Returns:
157 |         depth_map: dense depth map
158 |         process_dict: OrderedDict of process images
159 |     """
160 | 
161 |     # Convert to float32
162 |     depths_in = np.float32(depth_map)
163 | 
164 |     # Calculate bin masks before inversion
165 |     valid_pixels_near = (depths_in > 0.1) & (depths_in <= 15.0)
166 |     valid_pixels_med = (depths_in > 15.0) & (depths_in <= 30.0)
167 |     valid_pixels_far = (depths_in > 30.0)
168 | 
169 |     # Invert (and offset)
170 |     s1_inverted_depths = np.copy(depths_in)
171 |     valid_pixels = (s1_inverted_depths > 0.1)
172 |     s1_inverted_depths[valid_pixels] = \
173 |         max_depth - s1_inverted_depths[valid_pixels]
174 | 
175 |     # Multi-scale dilation
176 |     dilated_far = cv2.dilate(
177 |         np.multiply(s1_inverted_depths, valid_pixels_far),
178 |         dilation_kernel_far)
179 |     dilated_med = cv2.dilate(
180 |         np.multiply(s1_inverted_depths, valid_pixels_med),
181 |         dilation_kernel_med)
182 |     dilated_near = cv2.dilate(
183 |         np.multiply(s1_inverted_depths, valid_pixels_near),
184 |         dilation_kernel_near)
185 | 
186 |     # Find valid pixels for each binned dilation
187 |     valid_pixels_near = (dilated_near > 0.1)
188 |     valid_pixels_med = (dilated_med > 0.1)
189 |     valid_pixels_far = (dilated_far > 0.1)
190 | 
191 |     # Combine dilated versions, starting farthest to nearest
192 |     s2_dilated_depths = np.copy(s1_inverted_depths)
193 |     s2_dilated_depths[valid_pixels_far] = dilated_far[valid_pixels_far]
194 |     s2_dilated_depths[valid_pixels_med] = dilated_med[valid_pixels_med]
195 |     s2_dilated_depths[valid_pixels_near] = dilated_near[valid_pixels_near]
196 | 
197 |     # Small hole closure
198 |     s3_closed_depths = cv2.morphologyEx(
199 |         s2_dilated_depths, cv2.MORPH_CLOSE, FULL_KERNEL_5)
200 | 
201 |     # Median blur to remove outliers
202 |     s4_blurred_depths = np.copy(s3_closed_depths)
203 |     blurred = cv2.medianBlur(s3_closed_depths, 5)
204 |     valid_pixels = (s3_closed_depths > 0.1)
205 |     s4_blurred_depths[valid_pixels] = blurred[valid_pixels]
206 | 
207 |     # Calculate a top mask
208 |     top_mask = np.ones(depths_in.shape, dtype=bool)
209 |     for pixel_col_idx in range(s4_blurred_depths.shape[1]):
210 |         pixel_col = s4_blurred_depths[:, pixel_col_idx]
211 |         top_pixel_row = np.argmax(pixel_col > 0.1)
212 |         top_mask[0:top_pixel_row, pixel_col_idx] = False
213 | 
214 |     # Get empty mask
215 |     valid_pixels = (s4_blurred_depths > 0.1)
216 |     empty_pixels = ~valid_pixels & top_mask
217 | 
218 |     # Hole fill
219 |     dilated = cv2.dilate(s4_blurred_depths, FULL_KERNEL_9)
220 |     s5_dilated_depths = np.copy(s4_blurred_depths)
221 |     s5_dilated_depths[empty_pixels] = dilated[empty_pixels]
222 | 
223 |     # Extend highest pixel to top of image or create top mask
224 |     s6_extended_depths = np.copy(s5_dilated_depths)
225 |     top_mask = np.ones(s5_dilated_depths.shape, dtype=bool)
226 | 
227 |     top_row_pixels = np.argmax(s5_dilated_depths > 0.1, axis=0)
228 |     top_pixel_values = s5_dilated_depths[top_row_pixels,
229 |                                          range(s5_dilated_depths.shape[1])]
230 | 
231 |     for pixel_col_idx in range(s5_dilated_depths.shape[1]):
232 |         if extrapolate:
233 |             s6_extended_depths[0:top_row_pixels[pixel_col_idx],
234 |                                pixel_col_idx] = top_pixel_values[pixel_col_idx]
235 |         else:
236 |             # Create top mask
237 |             top_mask[0:top_row_pixels[pixel_col_idx], pixel_col_idx] = False
238 | 
239 |     # Fill large holes with masked dilations
240 |     s7_blurred_depths = np.copy(s6_extended_depths)
241 |     for i in range(6):
242 |         empty_pixels = (s7_blurred_depths < 0.1) & top_mask
243 |         dilated = cv2.dilate(s7_blurred_depths, FULL_KERNEL_5)
244 |         s7_blurred_depths[empty_pixels] = dilated[empty_pixels]
245 | 
246 |     # Median blur
247 |     blurred = cv2.medianBlur(s7_blurred_depths, 5)
248 |     valid_pixels = (s7_blurred_depths > 0.1) & top_mask
249 |     s7_blurred_depths[valid_pixels] = blurred[valid_pixels]
250 | 
251 |     if blur_type == 'gaussian':
252 |         # Gaussian blur
253 |         blurred = cv2.GaussianBlur(s7_blurred_depths, (5, 5), 0)
254 |         valid_pixels = (s7_blurred_depths > 0.1) & top_mask
255 |         s7_blurred_depths[valid_pixels] = blurred[valid_pixels]
256 |     elif blur_type == 'bilateral':
257 |         # Bilateral blur
258 |         blurred = cv2.bilateralFilter(s7_blurred_depths, 5, 0.5, 2.0)
259 |         s7_blurred_depths[valid_pixels] = blurred[valid_pixels]
260 | 
261 |     # Invert (and offset)
262 |     s8_inverted_depths = np.copy(s7_blurred_depths)
263 |     valid_pixels = np.where(s8_inverted_depths > 0.1)
264 |     s8_inverted_depths[valid_pixels] = \
265 |         max_depth - s8_inverted_depths[valid_pixels]
266 | 
267 |     depths_out = s8_inverted_depths
268 | 
269 |     process_dict = None
270 |     if show_process:
271 |         process_dict = collections.OrderedDict()
272 | 
273 |         process_dict['s0_depths_in'] = depths_in
274 | 
275 |         process_dict['s1_inverted_depths'] = s1_inverted_depths
276 |         process_dict['s2_dilated_depths'] = s2_dilated_depths
277 |         process_dict['s3_closed_depths'] = s3_closed_depths
278 |         process_dict['s4_blurred_depths'] = s4_blurred_depths
279 |         process_dict['s5_combined_depths'] = s5_dilated_depths
280 |         process_dict['s6_extended_depths'] = s6_extended_depths
281 |         process_dict['s7_blurred_depths'] = s7_blurred_depths
282 |         process_dict['s8_inverted_depths'] = s8_inverted_depths
283 | 
284 |         process_dict['s9_depths_out'] = depths_out
285 | 
286 |     return depths_out, process_dict
287 | 
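The function above inverts valid depths (`max_depth - depth`) before dilating and inverts them back at the end. A dilation takes the maximum over its kernel window, so on raw depths the farther value would win wherever two measurements overlap; after inversion the nearer value wins instead. The following is a minimal pure-Python sketch of that idea on a single row. The helpers `invert` and `dilate_1d` are hypothetical names used only for illustration and are not part of `ip_basic`; the `max_depth` of 100.0 and the window radius are assumed values.

```python
def invert(depths, max_depth=100.0, eps=0.1):
    """Map valid depths d to max_depth - d; empty pixels (<= eps) stay 0."""
    return [max_depth - d if d > eps else 0.0 for d in depths]


def dilate_1d(values, radius=1):
    """Max filter over a window of +/- radius (a 1-D stand-in for dilation)."""
    n = len(values)
    return [max(values[max(0, i - radius):min(n, i + radius + 1)])
            for i in range(n)]


# One sparse row: a near point (10 m) and a far point (40 m), zeros are empty.
row = [0.0, 10.0, 0.0, 40.0, 0.0]

# Dilating raw depths lets the far value overwrite the shared empty pixel.
plain = dilate_1d(row)

# Inverting first makes the near value the maximum, so it wins instead.
via_inversion = invert(dilate_1d(invert(row)))

print(plain)          # [10.0, 10.0, 40.0, 40.0, 40.0]
print(via_inversion)  # [10.0, 10.0, 10.0, 40.0, 40.0]
```

The empty pixel between the two measurements (index 2) is filled with 40 m by the plain dilation and with 10 m when the row is inverted first, which matches the intent of preserving closer objects at their edges.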


--------------------------------------------------------------------------------
/ip_basic/vis_utils.py:
--------------------------------------------------------------------------------
 1 | import cv2
 2 | 
 3 | 
 4 | def cv2_show_image(window_name, image,
 5 |                    size_wh=None, location_xy=None):
 6 |     """Helper function for specifying window size and location when
 7 |     displaying images with cv2.
 8 | 
 9 |     Args:
10 |         window_name: str window name
11 |         image: ndarray image to display
12 |         size_wh: window size (w, h)
13 |         location_xy: window location (x, y)
14 |     """
15 | 
16 |     if size_wh is not None:
17 |         cv2.namedWindow(window_name,
18 |                         cv2.WINDOW_KEEPRATIO | cv2.WINDOW_GUI_NORMAL)
19 |         cv2.resizeWindow(window_name, *size_wh)
20 |     else:
21 |         cv2.namedWindow(window_name, cv2.WINDOW_AUTOSIZE)
22 | 
23 |     if location_xy is not None:
24 |         cv2.moveWindow(window_name, *location_xy)
25 | 
26 |     cv2.imshow(window_name, image)
27 | 


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | matplotlib
2 | numpy
3 | opencv-python
4 | pypng
5 | 


--------------------------------------------------------------------------------