├── .gitignore
├── README.md
├── assets
│   └── epic_fields.png
├── demo
│   ├── demo.py
│   ├── demo_ego4d.md
│   ├── dense_point_cloud.sh
│   ├── reconstruct_sparse.sh
│   └── register_dense.sh
├── example_data
│   ├── P04_01_line.json
│   ├── P04_01_line.png
│   ├── P06_09_line.json
│   ├── P06_09_line.png
│   ├── P12_101_line.json
│   ├── P12_101_line.png
│   ├── P28_101.json
│   ├── P28_101
│   │   ├── frame_0000000080.jpg
│   │   ├── frame_0000000085.jpg
│   │   ├── frame_0000000090.jpg
│   │   ├── frame_0000000095.jpg
│   │   ├── frame_0000000100.jpg
│   │   ├── frame_0000000105.jpg
│   │   ├── frame_0000000110.jpg
│   │   └── frame_0000000115.jpg
│   ├── P28_101_line.json
│   ├── P28_101_line.png
│   ├── example_output_gui.jpg
│   └── example_output_line.jpg
├── homography_filter
│   ├── __init__.py
│   ├── argparser.py
│   ├── filter.py
│   └── lib.py
├── input_videos.txt
├── licence.txt
├── reconstruct_sparse.py
├── register_dense.py
├── scripts
│   ├── reconstruct_sparse.sh
│   └── register_dense.sh
├── select_sparse_frames.py
├── tools
│   ├── __init__.py
│   ├── common_functions.py
│   ├── project_3d_line.py
│   ├── visualise_data_open3d.py
│   └── visualize_colmap_open3d.py
└── utils
    ├── __init__.py
    ├── base_type.py
    ├── colmap_utils.py
    ├── hovering
    │   ├── __init__.py
    │   ├── helper.py
    │   ├── hover_open3d.py
    │   └── o3d_line_mesh.py
    └── lib.py
/.gitignore:
--------------------------------------------------------------------------------
1 | **/__pycache__/
2 |
3 | outputs/
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # EPIC Fields: Marrying 3D Geometry and Video Understanding
2 | 
3 |
4 | This repository provides tools and scripts for visualizing and reconstructing the [EPIC FIELDS](https://epic-kitchens.github.io/epic-fields) dataset.
5 |
6 | ## Table of Contents
7 |
8 | 1. [Visualization Code](#visualization-code)
9 |    - [Introduction](#introduction)
10 |    - [Format](#format)
11 |    - [Visualization](#visualisation)
12 | 2. [Reconstruction Pipeline](#reconstruction-pipeline)
13 |    - [Steps for EPIC-KITCHENS Reconstruction](#steps-for-epic-kitchens-reconstruction)
14 |    - [Understanding the Output File Structure](#understanding-the-output-file-structure)
15 | 3. [Reconstruction Pipeline: Quick Demo](#reconstruction-pipeline-quick-demo)
16 | 4. [Additional info](#additional-info)
17 |    - [Credit](#credit)
18 |    - [Citation](#citation)
19 |    - [License](#license)
20 |    - [Contact](#contact)
21 |
22 |
23 |
24 | # Visualization Code
25 | ## Introduction
26 |
27 | This visualisation code is associated with the released EPIC FIELDS dataset. Further details on the dataset and associated preprint are available at:
28 | [https://epic-kitchens.github.io/epic-fields](https://epic-kitchens.github.io/epic-fields)
29 |
30 |
31 | ## Format
32 |
33 | - The `camera` parameters use the COLMAP format, which is the same as the OpenCV format.
34 | - The `images` entry stores the world-to-camera transformation for each frame, represented by a quaternion and a translation.
35 |   - Note: for NeRF usage this needs to be converted to a camera-to-world transformation, possibly also changing the axis convention from (+x, +y, +z) to (+x, -y, -z).
36 | - The `points` entry is part of the COLMAP output. It is kept here for visualisation purposes and potentially for computing the `near`/`far` bounds in NeRF input.
37 | ```
38 | {
39 | "camera": {
40 | "id": 1, "model": "OPENCV", "width": 456, "height": 256,
41 | "params": [fx, fy, cx, cy, k1, k2, p1, p2]
42 | },
43 | "images": {
44 | frame_name: [qw, qx, qy, qz, tx, ty, tz],
45 | ...
46 | },
47 | "points": [
48 | [x, y, z, r, g, b],
49 | ...
50 | ]
51 | }
52 |
53 | example data can be found in `example_data/P28_101.json`
54 | ```
55 |
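As a minimal illustration of this format (an informal sketch, not part of the released tools), the snippet below loads the JSON, converts one world-to-camera pose into a 4x4 camera-to-world matrix, and applies the optional (+x, -y, -z) flip mentioned above:

```python
import json
import numpy as np

def qvec2rotmat(q):
    # COLMAP convention: q = [qw, qx, qy, qz] is the world-to-camera rotation
    w, x, y, z = q
    return np.array([
        [1 - 2*y*y - 2*z*z, 2*x*y - 2*z*w,     2*x*z + 2*y*w],
        [2*x*y + 2*z*w,     1 - 2*x*x - 2*z*z, 2*y*z - 2*x*w],
        [2*x*z - 2*y*w,     2*y*z + 2*x*w,     1 - 2*x*x - 2*y*y]])

data = json.load(open('example_data/P28_101.json'))
fx, fy, cx, cy, k1, k2, p1, p2 = data['camera']['params']

frame, pose = next(iter(data['images'].items()))
w2c = np.eye(4)
w2c[:3, :3] = qvec2rotmat(pose[:4])   # [qw, qx, qy, qz]
w2c[:3, 3] = pose[4:]                 # [tx, ty, tz]

c2w = np.linalg.inv(w2c)              # camera-to-world, as needed for NeRF
c2w[:3, 1:3] *= -1                    # optional axis flip: (+x, +y, +z) -> (+x, -y, -z)

points = np.array(data['points'])     # (N, 6): xyz + rgb, e.g. for near/far bounds
print(frame, c2w, points.shape)
```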
56 | ## Visualisation
57 |
58 | ### Visualise camera poses and pointcloud
59 |
60 | This script requires Open3D and has been tested with Open3D==0.16.1.
61 | ```bash
62 | python tools/visualise_data_open3d.py --json-data example_data/P28_101.json
63 | ```
64 | PS: Press 'h' to see the Open3D help message.
65 |
66 |
67 | An example output is shown in `example_data/example_output_gui.jpg`.
68 |
69 |
70 |
71 | ### Example: Project a 3D line onto epic-kitchens images using camera poses
72 |
73 | ```bash
74 | python tools/project_3d_line.py \
75 | --json-data example_data/P28_101.json \
76 | --line-data example_data/P28_101_line.json \
77 | --frames-root example_data/P28_101/
78 | ```
79 |
80 | An example output is shown in `example_data/example_output_line.jpg`.
81 |
82 |
83 |
84 | To create a 3D line, one option is to download the COLMAP-format data and use the COLMAP GUI to pick 3D points.
85 |
86 |
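Below is an informal sketch of what such a projection involves (the real tool is `tools/project_3d_line.py`; the frame-key format and the `scipy` dependency here are assumptions): the line JSON stores two 3D endpoints, which are transformed into camera coordinates with the per-frame pose and projected with the OPENCV intrinsics.

```python
import json
import cv2
import numpy as np
from scipy.spatial.transform import Rotation  # assumed dependency for this sketch

data = json.load(open('example_data/P28_101.json'))
endpoints = np.array(json.load(open('example_data/P28_101_line.json'))).reshape(2, 3)
fx, fy, cx, cy, k1, k2, p1, p2 = data['camera']['params']

frame = 'frame_0000000080.jpg'   # one of the example frames; adjust if the key format differs
qw, qx, qy, qz, tx, ty, tz = data['images'][frame]
R = Rotation.from_quat([qx, qy, qz, qw]).as_matrix()  # scipy expects [x, y, z, w]

pts_cam = (R @ endpoints.T).T + np.array([tx, ty, tz])   # world -> camera coordinates
K = np.array([[fx, 0, cx], [0, fy, cy], [0, 0, 1]])
uv, _ = cv2.projectPoints(pts_cam, np.zeros(3), np.zeros(3), K, np.array([k1, k2, p1, p2]))
pt_a, pt_b = uv.reshape(-1, 2).astype(int)

img = cv2.imread(f'example_data/P28_101/{frame}')
cv2.line(img, tuple(map(int, pt_a)), tuple(map(int, pt_b)), (0, 0, 255), 2)
cv2.imwrite('line_projected.jpg', img)
```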
87 | ---
88 |
89 |
90 |
91 | # Reconstruction Pipeline
92 |
93 | This section contains the pipeline for the dataset introduced in our paper, "EPIC Fields: Marrying 3D Geometry and Video Understanding." We aim to bridge the domains of 3D geometry and video understanding, leading to innovative advancements in both areas.
94 |
95 | ## Steps for EPIC-KITCHENS Reconstruction
96 |
97 | This section outlines the procedure to achieve the [EPIC-KITCHENS](https://epic-kitchens.github.io) reconstructions using our methodology.
98 |
99 | ### Step 0: Prerequisites and Initial Configuration
100 |
101 | #### 1. Installing COLMAP (preferably with CUDA support)
102 |
103 | To efficiently process and reconstruct the frames, it's recommended to install COLMAP with CUDA support, which accelerates the reconstruction process using NVIDIA GPUs.
104 |
105 | You can download and install COLMAP from their official website. For detailed installation instructions, especially on how to enable CUDA support, refer to the [COLMAP installation guide](https://colmap.github.io/install.html).
106 |
107 | #### 2. Cloning the Repository
108 |
109 | To proceed with the subsequent steps, you'll need to clone the current repository. Run the following commands:
110 |
111 | ```bash
112 | git clone https://github.com/epic-kitchens/epic-fields-code.git
113 | cd epic-fields-code
114 | ```
115 |
116 | #### 3. Downloading Vocabulary Trees
117 | COLMAP utilizes vocabulary trees for efficient image matching. Create a directory called `vocab_bins` and download the required vocabulary tree into it:
118 | ```bash
119 | mkdir vocab_bins
120 | cd vocab_bins
121 | wget https://demuc.de/colmap/vocab_tree_flickr100K_words32K.bin
122 | cd ..
123 | ```
124 | #### 4. Installing `pycolmap` package
125 |
126 | The `pycolmap` package will be used to gather statistics from the model later on. Install it using `pip` (assuming that you've created an environment):
127 |
128 | ```bash
129 | pip install pycolmap
130 | ```
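You can quickly verify that the package is importable (a trivial sanity check; the attribute lookup is guarded in case your build does not expose a version string):

```python
import pycolmap
print(getattr(pycolmap, '__version__', 'pycolmap installed'))
```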
131 | ### Step 1: Downloading Video Frames
132 |
133 | To utilize the EPIC Fields pipeline, the first step is to acquire the necessary video frames. We're particularly interested in the RGB frames from EPIC-KITCHENS. You can download the entire collection from [EPIC-KITCHENS](https://epic-kitchens.github.io).
134 |
135 | For demonstration purposes, we'll guide you through downloading the `P15_12` video RGB frames.
136 |
137 | ##### Demo: Downloading and Extracting `P15_12` Video Frames
138 |
139 | Execute the following shell commands to download and extract the RGB frames:
140 |
141 | ```bash
142 | # Download the tarball
143 | wget https://data.bris.ac.uk/datasets/3h91syskeag572hl6tvuovwv4d/frames_rgb_flow/rgb/train/P15/P15_12.tar
144 |
145 | # Create the desired directory structure
146 | mkdir -p P15/P15_12
147 |
148 | # Extract the frames into the specified directory
149 | tar -xf P15_12.tar -C P15/P15_12
150 | ```
151 | This will place all the .jpg frames inside the P15/P15_12 directory.
152 |
153 | ##### Directory Structure Confirmation
154 |
155 | After downloading and extracting, your directory structure should look like this (which is the [EPIC-KITCHENS](https://epic-kitchens.github.io) format):
156 | ```
157 | /root-directory/
158 | │
159 | └───PXX/
160 |     │
161 |     └───PXX_YY(Y)/
162 |         │   frame_000001.jpg
163 |         │   frame_000002.jpg
164 |         │   ...
165 | ```
166 | For our P15_12 example, this would be:
167 | ```
168 | /root-directory/
169 | │
170 | └───P15/
171 |     │
172 |     └───P15_12/
173 |         │   frame_000001.jpg
174 |         │   frame_000002.jpg
175 |         │   ...
176 | ```
177 |
178 | This structure ensures a consistent format for the pipeline to process the frames effectively.
179 |
180 | ### Step 2: Specifying Videos for Reconstruction
181 |
182 | Update the `input_videos.txt` file in the repository to list the video identifiers you wish to process. For our demo, we put `P15_12` in the file. If you have multiple videos, put each video identifier on a separate line.
183 |
184 |
185 | ### Step 3: Running the Homography-Based Frame Sampling
186 |
187 | Execute the `select_sparse_frames.py` script to perform homography-based sampling of the frames.
188 |
189 | ##### Script Parameters:
190 |
191 | - `--input_videos`: Path to the file containing a list of videos to be processed. Default: `input_videos.txt`
192 | - `--epic_kithens_root`: Directory path to the EPIC-KITCHENS images. Default: `.`
193 | - `--sampled_images_path`: Directory where the sampled image files will be stored. Default: `sampled_frames`
194 | - `--homography_overlap`: Threshold for the homography to sample new frames. A higher value will sample more images. Default: `0.9`
195 | - `--max_concurrent`: Maximum number of concurrent processes. Default: `8`
196 |
197 | ##### Example Usage:
198 |
199 | ```bash
200 | python3 select_sparse_frames.py --input_videos input_videos.txt --epic_kithens_root path_to_epic_images --sampled_images_path path_for_sampled_frames
201 | ```
202 |
203 | ##### Demo: Homography-Based Frame Sampling for `P15_12` Video
204 |
205 | For the demo, using the `P15_12` video you've downloaded into the current directory, run:
206 |
207 | ```bash
208 | python3 select_sparse_frames.py --input_videos input_videos.txt --epic_kithens_root . --sampled_images_path sampled_frames --homography_overlap 0.9 --max_concurrent 8
209 | ```
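The sampled frame list is written per video into the `--sampled_images_path` directory; `reconstruct_sparse.py` later reads it as `<video>_selected_frames.txt`. A quick sanity check on the output (a small sketch using the demo paths above):

```python
# Count how many frames the homography filter kept for the demo video.
video = 'P15_12'
with open(f'sampled_frames/{video}_selected_frames.txt') as fp:
    frames = [line.strip() for line in fp if line.strip()]
print(f'{video}: {len(frames)} frames selected, e.g. {frames[:3]}')
```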
210 |
211 |
212 | ### Step 4: Running the COLMAP Sparse Reconstruction
213 |
214 | Execute the `reconstruct_sparse.py` script to perform sparse reconstruction using COLMAP.
215 |
216 | ##### Script Parameters:
217 |
218 | - `--input_videos`: Path to the file containing a list of videos to be processed. Default: `input_videos.txt`
219 | - `--sparse_reconstuctions_root`: Path to store the sparsely reconstructed models. Default: `colmap_models/sparse`
220 | - `--epic_kithens_root`: Directory path to the EPIC-KITCHENS images. Default: `.`
221 | - `--logs_path`: Path where the log files will be stored. Default: `logs/sparse/out_logs_terminal`
222 | - `--summary_path`: Path where the summary files will be stored. Default: `logs/sparse/out_summary`
223 | - `--sampled_images_path`: Directory where the sampled image files are located. Default: `sampled_frames`
224 | - `--gpu_index`: Index of the GPU to be used. Default: `0`
225 |
226 | ##### Example Usage:
227 | ```bash
228 | python3 reconstruct_sparse.py --input_videos input_videos.txt --sparse_reconstuctions_root colmap_models/sparse --epic_kithens_root path_to_epic_images --logs_path logs/sparse/out_logs_terminal --summary_path logs/sparse/out_summary --sampled_images_path path_for_sampled_frames --gpu_index 0
229 | ```
230 |
231 | #### Demo: Sparse Reconstruction for P15_12 Video
232 | For the demo, using the P15_12 video and the sampled frames in the current directory, run:
233 |
234 | ```bash
235 | python3 reconstruct_sparse.py --input_videos input_videos.txt --sparse_reconstuctions_root colmap_models/sparse --epic_kithens_root . --logs_path logs/sparse/out_logs_terminal --summary_path logs/sparse/out_summary --sampled_images_path sampled_frames --gpu_index 0
236 | ```
237 |
238 | ### Understanding the Output File Structure
239 |
240 | After running the sparse reconstruction demo, you'll notice the following directory hierarchy:
241 | ```
242 | logs/
243 | │
244 | └───sparse/
245 |     │
246 |     ├───out_logs_terminal/
247 |     │   │   P15_12__reconstruct_sparse.out
248 |     │   │   ...
249 |     │
250 |     └───out_summary/
251 |         │   P15_12.out
252 |         │   ...
253 |
254 | ```
255 | #### Sparse Model Directory:
256 | The sparsely reconstructed model for our demo video P15_12 will be found in: ```colmap_models/sparse/P15_12```
257 |
258 | #### Logs Directory:
259 | The "logs" directory provides insights into the sparse reconstruction process:
260 |
261 | - COLMAP Execution Logs (out_logs_terminal): These logs capture details from the COLMAP execution and can be helpful for debugging. For our demo video P15_12, the respective log file would be named something like: ```logs/sparse/out_logs_terminal/P15_12__reconstruct_sparse.out```
262 |
263 | - Sparse Model Summary (out_summary): This directory contains a summary of the sparse model's statistics. For our demo video P15_12, the summary file is ```logs/sparse/out_summary/P15_12.out```
264 | By examining the P15_12.out file, you can gain insight into how well the reconstruction performed for that specific video and how long it took.
265 |
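Beyond the text summary, you can also query the kept sparse model directly with `pycolmap` (a small sketch; the path assumes the default layout above, where the largest model has been renamed to `0`):

```python
import pycolmap

# Default layout from the demo: colmap_models/sparse/<video>/sparse/0
recon = pycolmap.Reconstruction('colmap_models/sparse/P15_12/sparse/0')
print(recon.summary())                          # images, points, track length, ...
print('registered images:', recon.num_images())
print('3D points:', recon.num_points3D())
```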
266 |
267 | ### Step 5: Registering All Frames into the Sparse Model
268 |
269 | For this step, you'll use the `register_dense.py` script. This script registers all the frames with the sparse model, preparing them for a dense reconstruction.
270 |
271 | ##### Script Parameters:
272 |
273 | - `--input_videos`: Path to the file containing a list of videos to be processed. Default: `input_videos.txt`
274 | - `--sparse_reconstuctions_root`: Directory path to the sparsely reconstructed models. Default: `colmap_models/sparse`
275 | - `--dense_reconstuctions_root`: Directory path to the densely registered models. Default: `colmap_models/dense`
276 | - `--epic_kithens_root`: Directory path to the EPIC-KITCHENS images. Default: `.`
277 | - `--logs_path`: Directory where the log files of the dense registration will be stored. Default: `logs/dense/out_logs_terminal`
278 | - `--summary_path`: Directory where the summary files of the dense registration will be stored. Default: `logs/dense/out_summary`
279 | - `--gpu_index`: Index of the GPU to use. Default: `0`
280 |
281 | #### Demo: Registering Frames into Sparse Model for Video `P15_12`
282 |
283 | To demonstrate the registration process using the `register_dense.py` script, let's use the sample video `P15_12` as an example.
284 |
285 | ```bash
286 | python3 register_dense.py --input_videos input_videos.txt --sparse_reconstuctions_root colmap_models/sparse --dense_reconstuctions_root colmap_models/dense --epic_kithens_root . --logs_path logs/dense/out_logs_terminal --summary_path logs/dense/out_summary --gpu_index 0
287 | ```
288 |
289 | Assuming input_videos.txt contains the entry for P15_12, the above command will register all frames from the P15_12 video against the sparse model stored under colmap_models/sparse; the newly registered model will be saved under colmap_models/dense. The logs and summary for this registration process will be saved under the logs/dense/out_logs_terminal and logs/dense/out_summary directories, respectively.
290 |
291 | After executing the command, you can check the log files and summary for insights and statistics on the registration process for the P15_12 video.
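If you prefer a programmatic check over the text logs, the registered model can also be opened with `pycolmap`, mirroring what `demo/demo.py` does in `get_summary()` (a sketch; the exact sub-directory containing `images.bin` under `colmap_models/dense` depends on the script, so adjust the path accordingly):

```python
import os
import pycolmap

video = 'P15_12'
model_path = f'colmap_models/dense/{video}'   # assumed layout; point at the folder holding images.bin
total = len([f for f in os.listdir(f'P15/{video}') if f.endswith('.jpg')])

recon = pycolmap.Reconstruction(model_path)
print(f'{recon.num_images()} of {total} frames registered for {video}')
```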
292 |
293 | # Reconstruction Pipeline: Quick Demo
294 |
295 | Here we provide another demo script, `demo/demo.py`,
296 | which works directly on a video and combines all of the above steps into a single script.
297 |
298 | ```
299 | python demo/demo.py video.mp4
300 | ```
301 |
302 | Please refer to [demo_ego4d.md](demo/demo_ego4d.md) for details.
303 |
304 |
305 |
306 | # Additional info
307 |
308 | ## Credit
309 |
310 | Code prepared by Zhifan Zhu, Ahmad Darkhalil and Vadim Tschernezki.
311 |
312 | ## Citation
313 | If you find this work useful please cite our paper:
314 |
315 | ```
316 | @article{EPICFIELDS2023,
317 | title={{EPIC-FIELDS}: {M}arrying {3D} {G}eometry and {V}ideo {U}nderstanding},
318 | author={Tschernezki, Vadim and Darkhalil, Ahmad and Zhu, Zhifan and Fouhey, David and Laina, Iro and Larlus, Diane and Damen, Dima and Vedaldi, Andrea},
319 | booktitle = {ArXiv},
320 | year = {2023}
321 | }
322 | ```
323 |
324 | Also cite the [EPIC-KITCHENS-100](https://epic-kitchens.github.io) paper where the videos originate:
325 |
326 | ```
327 | @ARTICLE{Damen2022RESCALING,
328 | title={Rescaling Egocentric Vision: Collection, Pipeline and Challenges for EPIC-KITCHENS-100},
329 | author={Damen, Dima and Doughty, Hazel and Farinella, Giovanni Maria and Furnari, Antonino
330 | and Ma, Jian and Kazakos, Evangelos and Moltisanti, Davide and Munro, Jonathan
331 | and Perrett, Toby and Price, Will and Wray, Michael},
332 | journal = {International Journal of Computer Vision (IJCV)},
333 | year = {2022},
334 | volume = {130},
335 | pages = {33–55},
336 | Url = {https://doi.org/10.1007/s11263-021-01531-2}
337 | }
338 | ```
339 | For more information on the project and related research, please visit the [EPIC-Kitchens' EPIC Fields page](https://epic-kitchens.github.io/epic-fields/).
340 |
341 |
342 | ## License
343 | All files in this dataset are copyright by us and published under the
344 | Creative Commons Attribution-NonCommercial 4.0 International License, found
345 | [here](https://creativecommons.org/licenses/by-nc/4.0/).
346 | This means that you must give appropriate credit, provide a link to the license,
347 | and indicate if changes were made. You may do so in any reasonable manner,
348 | but not in any way that suggests the licensor endorses you or your use. You
349 | may not use the material for commercial purposes.
350 |
351 | ## Contact
352 |
353 | For general enquiries regarding this work or related projects, feel free to email us at [uob-epic-kitchens@bristol.ac.uk](mailto:uob-epic-kitchens@bristol.ac.uk).
354 |
355 |
--------------------------------------------------------------------------------
/assets/epic_fields.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/assets/epic_fields.png
--------------------------------------------------------------------------------
/demo/demo.py:
--------------------------------------------------------------------------------
1 | import sys
2 | import os
3 | import os.path as osp
4 | from pathlib import Path
5 | import subprocess
6 | import logging
7 | import pycolmap
8 |
9 |
10 | def parse_args():
11 | import argparse
12 | parser = argparse.ArgumentParser()
13 | parser.add_argument('video_path', type=str)
14 | return parser.parse_args()
15 |
16 |
17 | def setup_logger(name, log_file, level=logging.DEBUG):
18 | """To setup as many loggers as you want"""
19 |
20 | handler = logging.FileHandler(log_file, mode='a')
21 | formatter = logging.Formatter('%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s',
22 | datefmt='%Y-%m-%d %H:%M:%S')
23 | handler.setFormatter(formatter)
24 | logger = logging.getLogger(name)
25 | logger.setLevel(level)
26 | logger.addHandler(handler)
27 |
28 | return logger
29 |
30 |
31 | class PipelineExecutor:
32 | """
33 | The output structure we need is:
34 |
35 | /
36 | pipeline.log
37 | sparse.log
38 | register.log
39 | dense_pcd.log
40 | colmap/
41 | sparse/{max_model_id}/{cameras.bin,points.bin,images.bin}
42 | registered/{cameras.bin,points.bin,images.bin}
43 | dense/dense.ply
44 | """
45 |
46 | def __init__(self,
47 | video_path: str,
48 | out_dir: str,
49 | longside: int = 512,
50 | camera_model: str = 'OPENCV',
51 | make_log_and_dirs=True,
52 | ):
53 | """
54 | Args:
55 | video_path: path to the video
56 | camera_model: See Colmap doc
57 | longside: this controls the frame resolution for the extracted frames
58 | """
59 | self.worker_dir = Path(out_dir)
60 | self.video_file = video_path
61 | self.camera_model = camera_model
62 | self.longside = longside
63 |
64 | self.frames_dir = self.worker_dir / 'frames'
65 | self.homo_path = self.worker_dir / 'homo90.txt'
66 | self.colmap_dir = self.worker_dir / 'colmap'
67 | self.pipeline_log = self.worker_dir / 'pipeline.log'
68 | self.sparse_log = self.worker_dir / 'sparse.log'
69 | self.register_log = self.worker_dir / 'register.log'
70 | self.dense_pcd_log = self.worker_dir / 'dense_pcd.log'
71 |
72 | self.sparse_dir = self.colmap_dir / 'sparse' # generated by colmap
73 | self.register_dir = self.colmap_dir / 'registered'
74 | self.dense_pcd_dir = self.colmap_dir / 'dense'
75 |
76 | if not make_log_and_dirs:
77 | return
78 | os.makedirs(self.worker_dir, exist_ok=True)
79 | self.logger = setup_logger('demo-logger', self.pipeline_log)
80 | self.logger.info("Run start")
81 | assert os.path.exists(self.pipeline_log)
82 |
83 | def extract_frames(self, with_skip=True):
84 | # num_expected_frames = -1
85 | if with_skip and os.path.exists(self.frames_dir) and len(os.listdir(self.frames_dir)) > 0:
86 | print(f'{self.frames_dir} exists and is non-empty, skip')
87 | return
88 |
89 | cmd1 = [
90 | 'ffprobe', '-v', 'error', '-select_streams', 'v:0', '-show_entries', 'stream=width,height', '-of', 'csv=s=x:p=0', self.video_file
91 | ]
92 | # extract output resolution
93 | p = subprocess.Popen(cmd1, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
94 | out, err = p.communicate()
95 | print('Original resolution: ', out)
96 | w, h = out.decode('utf-8').strip().split('x')
97 | h, w = int(h), int(w)
98 | assert w > h
99 | h = h * self.longside // w
100 | w = self.longside
101 |
102 | s = f'{w}x{h}'
103 | os.makedirs(self.frames_dir, exist_ok=True)
104 |
105 | print("Extracting frames... ")
106 | cmd2 = [
107 | 'ffmpeg', '-i', self.video_file, '-q:v', '1', '-vf', 'fps=30', '-s', s, f'{self.frames_dir}/frame_%010d.jpg']
108 | cmd2 = ' '.join(cmd2)
109 | p = subprocess.call(cmd2, shell=True)
110 | self.logger.info(f'Extract frames done')
111 |
112 | def run_homography(self):
113 | self.logger.info(f'Run homography')
114 | cmd = [
115 | 'python', 'homography_filter/filter.py', '--src',
116 | str(self.frames_dir), '--dst_file', str(self.homo_path), '--overlap', '0.9'
117 | ]
118 | print(' '.join(cmd))
119 | if os.path.exists(self.homo_path):
120 | with open(self.homo_path, 'r') as fp:
121 | lines = fp.readlines()
122 | n_lines = len(lines)
123 | print(f'{self.homo_path} with {n_lines}, skip')
124 | self.logger.info(f'{self.homo_path} with {n_lines}, skip')
125 | return
126 | cmd = ' '.join(cmd)
127 | self.logger.info(cmd)
128 | p = subprocess.call(cmd, shell=True)
129 | self.logger.info(f'Homography Done')
130 |
131 | def run_sparse_reconstruct(self, script_path='demo/reconstruct_sparse.sh'):
132 | status = self.get_summary()
133 | if status['num_sparse_models'] > 0:
134 | self.logger.info(f'Found {status["num_sparse_models"]} sparse models, skip sparse reconstruction')
135 | print(f'Found {status["num_sparse_models"]} sparse models, skip sparse reconstruction')
136 | return
137 | self.logger.info(f'Run sparse')
138 | cmd = [
139 | 'bash', script_path,
140 | str(self.worker_dir), str(self.camera_model)
141 | ]
142 | print(' '.join(cmd))
143 | print('Check sparse log at ', self.sparse_log)
144 | self.logger.info(' '.join(cmd))
145 | with open(self.sparse_log, 'w') as sparse_fp:
146 | p = subprocess.run(cmd, stdout=sparse_fp, stderr=sparse_fp)
147 | # out, err = p.communicate()
148 | if p.returncode != 0:
149 | print(f'Error in sparse reconstruction. See {self.sparse_log}')
150 | sys.exit(1)
151 | self.logger.info(f'Done sparse')
152 |
153 | def run_register(self, script_path='demo/register_dense.sh'):
154 | summary = self.get_summary()
155 | max_sparse_ind = summary['max_sparse_ind']
156 | if summary['num_register'] > 0:
157 | print(f'Found {summary["num_register"]} already registered, skipping')
158 | return
159 | self.logger.info(f'Run Register')
160 | cmd = [
161 | 'bash', script_path,
162 | str(self.worker_dir), str(self.camera_model), str(max_sparse_ind)
163 | ]
164 | print(' '.join(cmd))
165 | self.logger.info(' '.join(cmd))
166 | with open(self.register_log, 'w') as register_fp:
167 | p = subprocess.run(cmd, stdout=register_fp, stderr=register_fp)
168 | self.logger.info(f'Done Register')
169 |
170 | def run_dense_pcd(self, script_path='demo/dense_point_cloud.sh'):
171 | summary = self.get_summary()
172 | max_sparse_ind = summary['max_sparse_ind']
173 | if os.path.exists(self.dense_pcd_dir / 'fused.ply'):
174 | print(f'fused.ply already exists in {self.dense_pcd_dir}, skipping')
175 | return
176 | self.logger.info(f'Run Dense PCD (patch stereo)')
177 | cmd = [
178 | 'bash', script_path,
179 | str(self.worker_dir), str(max_sparse_ind)
180 | ]
181 | print(' '.join(cmd))
182 | self.logger.info(' '.join(cmd))
183 | with open(self.dense_pcd_log, 'w') as dense_pcd_fp:
184 | p = subprocess.run(cmd, stdout=dense_pcd_fp, stderr=dense_pcd_fp)
185 | self.logger.info(f'Done Dense PCD')
186 |
187 | def execute(self):
188 | self.extract_frames()
189 | self.run_homography()
190 | if not osp.exists(self.homo_path):
191 | print(f'{self.homo_path} does not exist after homography, abort')
192 | return
193 | self.run_sparse_reconstruct()
194 | if not self.get_summary()['num_sparse_models'] > 0:
195 | print(f"num_sparse_models <= 0 after sparse reconstruction, abort")
196 | return
197 | self.run_register()
198 | self.run_dense_pcd()
199 |
200 | def get_summary(self) -> dict:
201 | """
202 | N-frames, N-homo, N-sparse-models, max_sparse_ind, N-sparse-images, N-register
203 | """
204 | info = dict(
205 | video=self.video_file,
206 | num_frames=-1, num_homo=-1, num_sparse_models=-1,
207 | max_sparse_ind=-1, num_sparse_images=-1, num_register=-1
208 | )
209 | info['num_frames'] = len(os.listdir(self.frames_dir))
210 | if not os.path.exists(self.homo_path):
211 | return info
212 |
213 | with open(self.homo_path) as fp:
214 | info['num_homo'] = len(fp.readlines())
215 |
216 | if not osp.exists(self.sparse_dir):
217 | return info
218 |
219 | info['num_sparse_models'] = len(os.listdir(self.sparse_dir))
220 | for mod in os.listdir(self.sparse_dir):
221 | mod_path = osp.join(self.sparse_dir, mod)
222 | recon = pycolmap.Reconstruction(mod_path)
223 | num_images = recon.num_images()
224 | if num_images > info['num_sparse_images']:
225 | info['num_sparse_images'] = num_images
226 | info['max_sparse_ind'] = mod # str
227 |
228 | reg_path = osp.join(self.register_dir)
229 | if not osp.exists(osp.join(reg_path, 'images.bin')):
230 | return info
231 | recon = pycolmap.Reconstruction(reg_path)
232 | num_images = recon.num_images()
233 | info['num_register'] = num_images
234 |
235 | return info
236 |
237 | if __name__ == '__main__':
238 | args = parse_args()
239 | executor = PipelineExecutor(
240 | args.video_path, out_dir='outputs/demo/',
241 | longside=512)
242 | executor.execute()
--------------------------------------------------------------------------------
/demo/demo_ego4d.md:
--------------------------------------------------------------------------------
1 | # Reconstruction Pipeline: Demo on Ego4D
2 |
3 | The script `demo/demo.py` works directly on a video.
4 |
5 | Assume the environment is setup as described in [Step 0](/README.md#step-0-prerequisites-and-initial-configuration),
6 | and the video file is named `video.mp4`.
7 | Run the demo with:
8 |
9 | ```
10 | python demo/demo.py video.mp4
11 | ```
12 |
13 | You will find the results in `outputs/demo/colmap/`:
14 | the file `outputs/demo/colmap/registered/images.bin` stores (nearly) all camera poses;
15 | the file `outputs/demo/colmap/dense/fused.ply` stores the dense point cloud of the scene.
16 | There are also log files `outputs/demo/*.log` to monitor the progress.
17 |
18 | You can now inspect (visualise) the results using:
19 | ```
20 | # Tested with open3d==0.16.0
21 | python3 tools/visualize_colmap_open3d.py \
22 | --model outputs/demo/colmap/registered \
23 | --pcd-path outputs/demo/colmap/dense/fused.ply
24 | ```
25 | Note that `outputs/demo/colmap/registered/images.bin` might be slow to load. In practice, we visualise only the key-frames:
26 | ```
27 | python3 tools/visualize_colmap_open3d.py \
28 | --model outputs/demo/colmap/sparse/0 \
29 | --pcd-path outputs/demo/colmap/dense/fused.ply
30 | # Note: See colmap doc for what `sparse/0` exactly means.
31 | ```
32 |
33 | ### What does this `demo/demo.py` do?
34 |
35 | Specifically, `demo/demo.py` does the following, sequentially:
36 | - Extract frames using `ffmpeg` with a long side of 512 px. This is analogous to Steps 1 & 2 in [Reconstruction Pipeline](/README.md#reconstruction-pipeline).
37 | - Compute important frames via homography. This corresponds to Step 3 above.
38 | - Perform the _sparse reconstruction_. This corresponds to Step 4 above.
39 | - at the end of this step, you should inspect the sparse result to make sure it makes sense.
40 | - Perform the _dense frame registration_. This corresponds to Step 5 above.
41 | - at the end of this, you will have all the camera poses.
42 | - Compute a dense point cloud using COLMAP's patch_match_stereo. This gives you the pretty dense point cloud you see in the teaser image.
43 |
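The same pipeline can also be driven from Python rather than the command line (a minimal sketch of the same call the CLI makes; run it from the repository root so the relative `demo/*.sh` paths resolve):

```python
from demo.demo import PipelineExecutor

executor = PipelineExecutor('video.mp4', out_dir='outputs/demo/', longside=512)
executor.execute()             # frames -> homography -> sparse -> register -> dense point cloud
print(executor.get_summary())  # counts of frames, key-frames, sparse models, registered images
```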
44 | ### Example: Ego4D videos
45 |
46 | We demo this script on the following two Ego4D videos:
47 | - Task: Cooking — 10 minutes. Ego4d uid = `id18f5c2be-cb79-46fa-8ff1-e03b7e26c986`. Demo output on Youtube: https://youtu.be/GfBsLnZoFGs
48 | - The total processing time for this video is 4 hours.
49 | - As a sanity check, the file `homo90.txt` after the homography step contains *1522* frames.
50 | - Task: Construction — 35 minutes of decorating and refurbishment. Ego4d uid =`a2dd8a8f-835f-4068-be78-99d38ad99625`. Demo output on Youtube: https://youtu.be/EZlayZIwNgQ
51 | - The processing time for this video breaks down as follows:
52 | - Extract frames: 5 mins
53 | - Homography filter: 1 hour
54 | - Sparse reconstruction: **20 hours**
55 | - Dense register: 1.5 hours
56 | - Dense Point-cloud generation: 2 hours
57 |
58 | ### Tips for running the demo script
59 |
60 | We rely on COLMAP, but no tool is perfect. In case of failure, check:
61 | - If the resulting point cloud is not geometrically correct, e.g. the ground is clearly not flat, try re-running from the sparse reconstruction step.
62 | COLMAP has some stochastic behaviour when choosing the initial view pair.
63 | - If the above still fails, try increasing the `--overlap` in the homography filter to e.g. 0.95. This will increase the number of important frames, at the cost of a longer running time during sparse reconstruction.
64 |
65 |
66 | ### Visualise a video of camera poses
67 |
68 | To produce a video of the camera poses and trajectory over time (see e.g. the YouTube videos above), follow the steps below:
69 |
70 | - Visualise the result again with the Open3D GUI:
71 |   `python3 tools/visualize_colmap_open3d.py --model outputs/demo/colmap/sparse/0 --pcd-path outputs/demo/colmap/dense/fused.ply`
72 |
73 | - In the Open3D GUI, press `Ctrl-C` (Linux) / `Cmd-C` (Mac) to copy the current view status to the system clipboard. Paste it (`Ctrl-V`/`Cmd-V`) into any editor and save the file as `outputs/demo/view.json`.
74 |
75 | - Run the following script to produce the video:
76 |   `python utils/hovering/hover_open3d.py --model outputs/demo/colmap/registered --pcd-path outputs/demo/colmap/dense/fused.ply --view-path outputs/demo/view.json`
77 |
78 |   The produced video is at `outputs/hovering/out.mp4`.
79 |
80 |
81 |
--------------------------------------------------------------------------------
/demo/dense_point_cloud.sh:
--------------------------------------------------------------------------------
1 |
2 | WORK_DIR=$1
3 | SPARSE_INDEX=$2
4 |
5 | IMG_PATH=$WORK_DIR/frames
6 | INPUT_PATH=$WORK_DIR/colmap/sparse/$SPARSE_INDEX
7 | OUTPUT_PATH=$WORK_DIR/colmap/dense
8 |
9 | OLD_DIR=$(pwd)
10 |
11 | mkdir -p $OUTPUT_PATH
12 |
13 | colmap image_undistorter \
14 | --image_path $IMG_PATH \
15 | --input_path $INPUT_PATH \
16 | --output_path $OUTPUT_PATH \
17 | --output_type COLMAP \
18 | --max_image_size 1000 \
19 |
20 | cd $OUTPUT_PATH
21 |
22 | colmap patch_match_stereo \
23 | --workspace_path . \
24 | --workspace_format COLMAP \
25 | --PatchMatchStereo.max_image_size=1000 \
26 | --PatchMatchStereo.gpu_index=0,1 \
27 | --PatchMatchStereo.cache_size=32 \
28 | --PatchMatchStereo.geom_consistency false \
29 |
30 | colmap stereo_fusion \
31 | --workspace_path . \
32 | --workspace_format COLMAP \
33 | --input_type photometric \
34 | --output_type PLY \
35 | --output_path ./fused.ply \
36 |
37 | # For geometric consistency, do the following lines instead
38 | # colmap patch_match_stereo \
39 | # --workspace_path . \
40 | # --workspace_format COLMAP \
41 | # --PatchMatchStereo.max_image_size=1000 \
42 | # --PatchMatchStereo.gpu_index=0,1 \
43 | # --PatchMatchStereo.cache_size=32 \
44 | # --PatchMatchStereo.geom_consistency true \
45 |
46 | # colmap stereo_fusion \
47 | # --workspace_path . \
48 | # --workspace_format COLMAP \
49 | # --input_type geometric \
50 | # --output_type PLY \
51 | # --output_path ./fused.ply \
52 |
53 | cd $OLD_DIR
--------------------------------------------------------------------------------
/demo/reconstruct_sparse.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | start=`date +%s`
3 |
4 | WORK_DIR=$1
5 | CAMERA_MODEL=$2 # OPENCV or OPENCV_FISHEYE
6 | GPU_IDX=0
7 |
8 | IMGS_DIR=$WORK_DIR/frames
9 | OUT_DIR=${WORK_DIR}/colmap
10 |
11 | DB_PATH=${OUT_DIR}/database.db
12 | SPARSE_DIR=${OUT_DIR}/sparse
13 |
14 | mkdir -p ${OUT_DIR}
15 | mkdir -p ${SPARSE_DIR}
16 |
17 | #SIMPLE_PINHOLE
18 | colmap feature_extractor \
19 | --database_path ${DB_PATH} \
20 | --ImageReader.camera_model $CAMERA_MODEL \
21 | --image_list_path $WORK_DIR/homo90.txt \
22 | --ImageReader.single_camera 1 \
23 | --SiftExtraction.use_gpu 1 \
24 | --SiftExtraction.gpu_index $GPU_IDX \
25 | --image_path $IMGS_DIR \
26 |
27 | colmap sequential_matcher \
28 | --database_path ${DB_PATH} \
29 | --SiftMatching.use_gpu 1 \
30 | --SequentialMatching.loop_detection 1 \
31 | --SiftMatching.gpu_index $GPU_IDX \
32 | --SequentialMatching.vocab_tree_path vocab_bins/vocab_tree_flickr100K_words32K.bin \
33 |
34 | colmap mapper \
35 | --database_path ${DB_PATH} \
36 | --image_path $IMGS_DIR \
37 | --output_path ${SPARSE_DIR} \
38 | --image_list_path $WORK_DIR/homo90.txt \
39 | #--Mapper.ba_global_use_pba 1 \
40 | #--Mapper.ba_global_pba_gpu_index 0 1 \
41 |
42 |
43 | end=`date +%s`
44 |
45 | runtime=$(((end-start)/60))
46 | echo "$runtime minutes"
47 |
--------------------------------------------------------------------------------
/demo/register_dense.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | start=`date +%s`
3 |
4 | GPU_IDX=0
5 |
6 | WORK_DIR=$1
7 | CAMERA_MODEL=$2
8 | MAX_SPARSE_IND=$3
9 | IMGS_DIR=$WORK_DIR/frames
10 | OUT_DIR=${WORK_DIR}/colmap
11 |
12 | DB_PATH=${OUT_DIR}/database.db
13 | SPARSE_DIR=${OUT_DIR}/sparse
14 |
15 | REG_DIR=${OUT_DIR}/registered
16 | mkdir -p $REG_DIR
17 |
18 | VIDEOUID=`basename $WORK_DIR`
19 | REG_DB_PATH=${OUT_DIR}/reg${VIDEOUID}.db
20 | echo $VIDEOUID $REG_DB_PATH
21 | rm -f $REG_DB_PATH $REG_DB_PATH-shm $REG_DB_PATH-wal
22 | cp $DB_PATH $REG_DB_PATH
23 |
24 | colmap feature_extractor \
25 | --database_path ${REG_DB_PATH} \
26 | --ImageReader.camera_model $CAMERA_MODEL \
27 | --ImageReader.single_camera 1 \
28 | --ImageReader.existing_camera_id 1 \
29 | --SiftExtraction.use_gpu 1 \
30 | --SiftExtraction.gpu_index $GPU_IDX \
31 | --image_path $IMGS_DIR
32 |
33 | colmap sequential_matcher \
34 | --database_path ${REG_DB_PATH} \
35 | --SiftMatching.use_gpu 1 \
36 | --SequentialMatching.loop_detection 1 \
37 | --SiftMatching.gpu_index $GPU_IDX \
38 | --SequentialMatching.vocab_tree_path vocab_bins/vocab_tree_flickr100K_words32K.bin \
39 |
40 | colmap image_registrator \
41 | --database_path $REG_DB_PATH \
42 | --input_path $SPARSE_DIR/$MAX_SPARSE_IND \
43 | --output_path $REG_DIR \
44 |
45 | # Release space after successful registration
46 | if [ -e $REG_DIR/images.bin ]; then
47 | rm -f $REG_DB_PATH $REG_DB_PATH-shm $REG_DB_PATH-wal
48 | fi
49 |
50 | end_reg=`date +%s`
51 |
52 | runtime=$(((end_reg-start)/60))
53 | echo "$runtime minutes"
54 |
55 |
--------------------------------------------------------------------------------
/example_data/P04_01_line.json:
--------------------------------------------------------------------------------
1 | [
2 | -3.028, 4.60835, 3.67792,
3 | 0.199998, 0.291596, 5.56575
4 | ]
5 |
--------------------------------------------------------------------------------
/example_data/P04_01_line.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P04_01_line.png
--------------------------------------------------------------------------------
/example_data/P06_09_line.json:
--------------------------------------------------------------------------------
1 | [
2 | 11.5486, 1.00723, 3.13634,
3 | -2.84154, 0.720368, 6.66926
4 | ]
--------------------------------------------------------------------------------
/example_data/P06_09_line.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P06_09_line.png
--------------------------------------------------------------------------------
/example_data/P12_101_line.json:
--------------------------------------------------------------------------------
1 | [
2 | 2.44827, 0.0581669, 8.20895,
3 | -7.4244, 3.82762, 7.32
4 | ]
--------------------------------------------------------------------------------
/example_data/P12_101_line.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P12_101_line.png
--------------------------------------------------------------------------------
/example_data/P28_101/frame_0000000080.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000080.jpg
--------------------------------------------------------------------------------
/example_data/P28_101/frame_0000000085.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000085.jpg
--------------------------------------------------------------------------------
/example_data/P28_101/frame_0000000090.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000090.jpg
--------------------------------------------------------------------------------
/example_data/P28_101/frame_0000000095.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000095.jpg
--------------------------------------------------------------------------------
/example_data/P28_101/frame_0000000100.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000100.jpg
--------------------------------------------------------------------------------
/example_data/P28_101/frame_0000000105.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000105.jpg
--------------------------------------------------------------------------------
/example_data/P28_101/frame_0000000110.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000110.jpg
--------------------------------------------------------------------------------
/example_data/P28_101/frame_0000000115.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000115.jpg
--------------------------------------------------------------------------------
/example_data/P28_101_line.json:
--------------------------------------------------------------------------------
1 | [
2 | -2.49927, -0.543869, 2.57086,
3 | 3.32875, -2.17165, 2.4229
4 | ]
--------------------------------------------------------------------------------
/example_data/P28_101_line.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101_line.png
--------------------------------------------------------------------------------
/example_data/example_output_gui.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/example_output_gui.jpg
--------------------------------------------------------------------------------
/example_data/example_output_line.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/example_output_line.jpg
--------------------------------------------------------------------------------
/homography_filter/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/homography_filter/__init__.py
--------------------------------------------------------------------------------
/homography_filter/argparser.py:
--------------------------------------------------------------------------------
1 |
2 | import argparse
3 |
4 |
5 | def parse_args():
6 | parser = argparse.ArgumentParser()
7 | parser.add_argument(
8 | "--src",
9 | type=str,
10 | )
11 | parser.add_argument(
12 | "--dst_file",
13 | type=str,
14 | )
15 | parser.add_argument(
16 | "--overlap",
17 | default=0.9,
18 | type=float,
19 | )
20 | parser.add_argument(
21 | "--frame_range_min",
22 | default=0,
23 | type=int,
24 | )
25 | parser.add_argument(
26 | "--frame_range_max",
27 | default=None,
28 | type=int,
29 | )
30 | parser.add_argument(
31 | "--filtering_scale",
32 | default=1,
33 | type=int,
34 | )
35 | parser.add_argument(
36 | '-f',
37 | type=str,
38 | default=None
39 | )
40 | args = parser.parse_args()
41 | return args
42 |
--------------------------------------------------------------------------------
/homography_filter/filter.py:
--------------------------------------------------------------------------------
1 |
2 | import os
3 | from glob import glob
4 | import numpy as np
5 | from matplotlib import pyplot as plt
6 | from collections import defaultdict
7 | import time
8 |
9 | from lib import *
10 | from argparser import parse_args
11 | import cv2
12 |
13 |
14 | def make_homography_loader(args):
15 |
16 | images = Images(args.src, scale=args.filtering_scale)
17 | print(f'Found {len(images.imreader.fpaths)} images.')
18 | features = Features(images)
19 | matches = Matches(features)
20 | homographies = Homographies(images, features, matches)
21 |
22 | return homographies
23 |
24 |
25 | def save(fpaths_filtered, args):
26 | imreader = ImageReader(src=args.src)
27 | dir_dst = args.dir_dst
28 | dir_images = os.path.join(dir_dst, 'images')
29 | extract_frames(dir_images, fpaths_filtered, imreader)
30 | save_as_video(os.path.join(dir_dst, 'video'), fpaths_filtered, imreader)
31 |
32 |
33 | if __name__ == '__main__':
34 |
35 | # set filtering to deterministic mode
36 | cv2.setRNGSeed(0)
37 | args = parse_args()
38 | homographies = make_homography_loader(args)
39 | graph = calc_graph(homographies, **vars(args))
40 | fpaths_filtered = graph2fpaths(graph)
41 | lines = [os.path.basename(v)+'\n' for v in fpaths_filtered]
42 | dir_name = os.path.dirname(args.dst_file)
43 | if dir_name and not os.path.exists(dir_name):
44 | os.makedirs(dir_name)
45 | with open(args.dst_file, 'w') as fp:
46 | fp.writelines(lines)
--------------------------------------------------------------------------------
/homography_filter/lib.py:
--------------------------------------------------------------------------------
1 |
2 | import cv2 as cv
3 | import numpy as np
4 | from matplotlib import pyplot as plt
5 | from collections import defaultdict
6 | import sys
7 | import os
8 | import shutil
9 | from glob import glob
10 |
11 |
12 | if '-f' in sys.argv:
13 | from tqdm.notebook import tqdm
14 | else:
15 | from tqdm import tqdm
16 |
17 |
18 | class Images:
19 | def __init__(self, src, load_grey=True, scale=1):
20 | self.images = {}
21 | self.im_size = None
22 | self.src = src
23 | self.scale = scale
24 | if load_grey:
25 | self.imreader = ImageReader(src, scale=scale, cv_flag=cv.IMREAD_GRAYSCALE)
26 | else:
27 | self.imreader = ImageReader(src, scale=scale)
28 |
29 | def __getitem__(self, k):
30 | if k not in self.images:
31 | im = self.imreader[k]
32 | self.images[k] = im
33 | self.im_size = self.images[k].shape[:2]
34 | return self.images[k]
35 |
36 |
37 | class Features:
38 | def __init__(self, images):
39 | self.features = {}
40 | self.images = images
41 | self.sift = cv.SIFT_create()
42 |
43 | def __getitem__(self, k):
44 | if k not in self.features:
45 | im = self.images[k]
46 | kp, des = self.sift.detectAndCompute(im, None)
47 | self.features[k] = (kp, des)
48 | return self.features[k]
49 |
50 |
51 | class Matches:
52 | def __init__(self, features):
53 |
54 | FLANN_INDEX_KDTREE = 1
55 | index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5)
56 | search_params = dict(checks=50)
57 | self.features = features
58 | self.matcher = cv.FlannBasedMatcher(index_params, search_params)
59 | self.matches = {}
60 | self.for_panorama_stitching = False
61 |
62 | def __getitem__(self, k):
63 | if k not in self.matches:
64 | (kp1, des1) = self.features[k[0]]
65 | (kp2, des2) = self.features[k[1]]
66 | if len(kp1) > 8:
67 | try:
68 | matches = self.matcher.knnMatch(des1, des2, k=2)
69 | except cv.error as e:
70 | print('NOTE: Too few keypoints for matching, skip.')
71 | matches = zip([], [])
72 | else:
73 | matches = zip([], [])
74 | # store all the good matches as per Lowe's ratio test.
75 | good = []
76 | for m, n in matches:
77 | if m.distance < 0.7 * n.distance:
78 | good.append(m)
79 | self.matches[k] = good
80 |
81 | return self.matches[k]
82 |
83 |
84 | class Homographies:
85 | def __init__(self, images, features, matches):
86 | self.matches = matches
87 | self.homographies = {}
88 | self.images = images
89 | self.features = features
90 | self.warps = {}
91 | self.min_match_count = 10
92 | self.images_rgb = ImageReader(src=self.images.src, scale=self.images.scale)
93 |
94 | def __getitem__(self, k):
95 | good = self.matches[k]
96 | kp1, _ = self.features[k[0]]
97 | kp2, _ = self.features[k[1]]
98 | img2 = self.images[k[1]]
99 | if k not in self.homographies:
100 | if len(good) > self.min_match_count:
101 | src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(
102 | -1, 1, 2
103 | )
104 | dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(
105 | -1, 1, 2
106 | )
107 | M, mask = cv.findHomography(src_pts, dst_pts, cv.RANSAC, 5.0)
108 | self.homographies[k] = (M, mask)
109 | else:
110 | # print( "Not enough matches are found - {}/{}".format(len(good), self.min_match_count) )
111 | matchesMask = None
112 | self.homographies[k] = (None, None)
113 | return self.homographies[k]
114 |
115 | def calc_overlap(self, *k, vis=False, is_debug=False, with_warp=False, draw_matches=True):
116 | img1 = self.images_rgb[k[0]].copy()
117 | img2 = self.images_rgb[k[1]].copy()
118 | kp1, _ = self.features[k[0]]
119 | kp2, _ = self.features[k[1]]
120 | good = self.matches[k]
121 | h, w, c = img1.shape
122 | M, mask = self[k]
123 |
124 | if M is None:
125 | return 0, [], np.zeros([h, w * 2])
126 |
127 | matchesMask = mask.ravel().tolist()
128 |
129 | pts = np.float32([[0, 0], [0, h - 1], [w - 1, h - 1], [w - 1, 0]]).reshape(
130 | -1, 1, 2
131 | )
132 | dst = cv.perspectiveTransform(pts, M)
133 |
134 | img2 = cv.polylines(img2, [np.int32(dst)], True, 255, 3, cv.LINE_AA)
135 |
136 | if with_warp:
137 | self.warps[k] = img2
138 | draw_params = dict(
139 | matchColor=(0, 255, 0), # draw matches in green color
140 | singlePointColor=None,
141 | matchesMask=matchesMask, # draw only inliers
142 | flags=2,
143 | )
144 |
145 | if is_debug:
146 | if draw_matches:
147 | im_matches = cv.drawMatches(img1, kp1, img2, kp2, good, None, **draw_params)
148 | else:
149 | im_matches = img2
150 | if vis:
151 | plt.imshow(im_matches, "gray"), plt.show()
152 | # plt.imshow(img3, "gray"), plt.show()
153 | else:
154 | im_matches = img2
155 |
156 | image_area = self.images.im_size[0] * self.images.im_size[1]
157 | polygon = dst.copy()[:, 0]
158 | polygon = bound_polygon(polygon, im_size=self.images.im_size)
159 | overlap = polygon_area(polygon[:, 1], polygon[:, 0]) / image_area
160 |
161 | return overlap, good, im_matches
162 |
163 | def calc_graph(
164 | homographies,
165 | return_im_matches=False,
166 | overlap=0.9,
167 | frame_range_min=0,
168 | frame_range_max=None,
169 | is_debug=False,
170 | clear_cache=True,
171 | **kwargs,
172 | ):
173 |
174 | fpaths = homographies.images.imreader.fpaths
175 | print(overlap)
176 | graph = {'im_matches': {}, 'fpaths': {}}
177 | if frame_range_max is None:
178 | frame_range_max = len(fpaths)
179 | i = frame_range_min
180 | j = i + 1
181 | pbar = tqdm(total=frame_range_max - frame_range_min - 1)
182 | while i < frame_range_max - 1 and j < frame_range_max:
183 | j = i + 1
184 | while j < frame_range_max:
185 | pbar.update(1)
186 | overlap_ij, matches, im_matches = homographies.calc_overlap(
187 | fpaths[i],
188 | fpaths[j],
189 | vis=False,
190 | is_debug=is_debug,
191 | )
192 | if overlap_ij < overlap:
193 | if is_debug:
194 | graph['im_matches'][i, j] = im_matches
195 | graph['fpaths'][i, j] = [fpaths[i], fpaths[j]]
196 | if clear_cache:
197 | i_ = i
198 | pi = fpaths[i_]
199 | del homographies.images.images[pi]
200 | del homographies.features.features[pi]
201 | for j_ in range(i_+1, j+1):
202 | pj = fpaths[j_]
203 | del homographies.homographies[(pi, pj)]
204 | del homographies.matches.matches[(pi, pj)]
205 | del homographies.images.images[pj]
206 | del homographies.features.features[pj]
207 | i = j
208 | break
209 | j += 1
210 | pbar.close()
211 | return graph
212 |
213 |
214 | def graph2fpaths(graph):
215 | fpaths = list(graph['fpaths'].values())
216 | first_fpath = fpaths[0][0]
217 | graph = graph['fpaths']
218 | paths = [first_fpath] + [fpath_pair[1] for fpath_pair in graph.values()]
219 | return paths
220 |
221 |
222 | def bound_polygon(polygon, im_size):
223 | # approximate for now instead of line clipping
224 | polygon[:, 0] = np.clip(polygon[:, 0], 0, im_size[1])
225 | polygon[:, 1] = np.clip(polygon[:, 1], 0, im_size[0])
226 | return polygon
227 |
228 |
229 | def polygon_area(x,y):
230 | return 0.5*np.abs(np.dot(x,np.roll(y,1))-np.dot(y,np.roll(x,1)))
231 |
232 |
233 | def write_mp4(name, frames, fps=10):
234 | import imageio
235 | imageio.mimwrite(name + ".mp4", frames, "mp4", fps=fps)
236 |
237 |
238 | def save_as_video(dst, fpaths, imreader):
239 | frames = []
240 | for fp in tqdm(fpaths):
241 | frames += [imreader[fp]]
242 | write_mp4(dst, frames)
243 |
244 |
245 | def extract_frames(dir_dst, fpaths, imreader):
246 | for k in fpaths:
247 | imreader.save(k, dir_dst)
248 |
249 |
250 | # imreader
251 |
252 | import io
253 | def tar2bytearr(tar_member):
254 | return np.asarray(
255 | bytearray(
256 | tar_member.read()
257 | ),
258 | dtype=np.uint8
259 | )
260 |
261 | import shutil
262 |
263 | import tarfile
264 | class ImageReader:
265 | def __init__(self, src, scale=1, cv_flag=cv.IMREAD_UNCHANGED):
266 | # src can be directory or tar file
267 |
268 | self.scale = scale  # honour the requested downscale factor (was hard-coded to 1)
269 | self.cv_flag = cv_flag
270 |
271 | if os.path.isdir(src):
272 | self.src_type = 'dir'
273 | self.fpaths = sorted(glob(os.path.join(src, '*.jpg')))
274 | elif os.path.isfile(src) and os.path.splitext(src)[1] == '.tar':
275 | self.tar = tarfile.open(src)
276 | self.src_type = 'tar'
277 | self.fpaths = sorted([x for x in self.tar.getnames() if 'frame_' in x and '.jpg' in x])
278 | else:
279 | print('Source has unknown format.')
280 | exit()
281 |
282 | def __getitem__(self, k):
283 | if self.src_type == 'dir':
284 |
285 | im = cv.imread(k, self.cv_flag)
286 | elif self.src_type == 'tar':
287 | member = self.tar.getmember(k)
288 | fileobj = self.tar.extractfile(member)  # avoid shadowing the tarfile module
289 | byte_array = tar2bytearr(fileobj)
290 | im = cv.imdecode(byte_array, self.cv_flag)
291 | if self.scale != 1:
292 | im = cv.resize(
293 | im, dsize=(im.shape[1] // self.scale, im.shape[0] // self.scale)  # dsize is (width, height)
294 | )
295 | if self.cv_flag != cv.IMREAD_GRAYSCALE:
296 | im = im[..., [2, 1, 0]]
297 | return im
298 |
299 | def save(self, k, dst):
300 | fn = os.path.split(k)[-1]
301 | if self.src_type == 'dir':
302 | shutil.copy(k, os.path.join(dst, fn))
303 | elif self.src_type == 'tar':
304 | self.tar.extract(self.tar.getmember(k), dst)
305 |
306 |
307 | # test
308 | def test():
309 | reader_args = {'scale': 2, 'cv_flag': cv.IMREAD_GRAYSCALE}
310 | reader_args = {'scale': 2}
311 |
312 | src = '/work/vadim/datasets/visor/2v6cgv1x04ol22qp9rm9x2j6a7/' + \
313 | 'EPIC-KITCHENS-frames/tar/P28_05.tar'
314 | imreader1 = ImageReader(src=src, **reader_args)
315 | fpaths1 = imreader1.fpaths
316 |
317 | reader_args = {'scale': 2}
318 |
319 | video_id = 'P28_05'
320 | src = f'/work/vadim/datasets/visor/2v6cgv1x04ol22qp9rm9x2j6a7/EPIC-KITCHENS-frames/rgb_frames/{video_id}'
321 | imreader2 = ImageReader(src=src, **reader_args)
322 | fpaths2 = imreader2.fpaths
323 |
324 | for i in range(0, len(fpaths1), 1000):
325 | print((imreader1[fpaths1[i]] == imreader2[fpaths2[i]]).all())
--------------------------------------------------------------------------------
/input_videos.txt:
--------------------------------------------------------------------------------
1 | P15_12
--------------------------------------------------------------------------------
/licence.txt:
--------------------------------------------------------------------------------
1 | All files in this dataset are copyright by us and published under the
2 | Creative Commons Attribution-NonCommercial 4.0 International License, found
3 | at https://creativecommons.org/licenses/by-nc/4.0/.
4 | This means that you must give appropriate credit, provide a link to the license,
5 | and indicate if changes were made. You may do so in any reasonable manner,
6 | but not in any way that suggests the licensor endorses you or your use. You
7 | may not use the material for commercial purposes.
8 |
--------------------------------------------------------------------------------
/reconstruct_sparse.py:
--------------------------------------------------------------------------------
1 | import subprocess
2 | import shutil
3 | import os
4 | import time
5 | import glob
6 | import argparse
7 | import pycolmap
8 | from utils.lib import *
9 | # Function to parse command-line arguments
10 | def parse_args():
11 | parser = argparse.ArgumentParser(description='COLMAP Reconstruction Script')
12 | parser.add_argument('--input_videos', type=str, default='input_videos.txt',
13 | help='A file with the list of videos to be processed in all stages')
14 | parser.add_argument('--sparse_reconstuctions_root', type=str, default='colmap_models/sparse',
15 | help='Path to the sparsely reconstructed models.')
16 | parser.add_argument('--epic_kithens_root', type=str, default='.',
17 | help='Path to epic kitchens images.')
18 | parser.add_argument('--logs_path', type=str, default='logs/sparse/out_logs_terminal',
19 | help='Path to store the log files.')
20 | parser.add_argument('--summary_path', type=str, default='logs/sparse/out_summary',
21 | help='Path to store the summary files.')
22 | parser.add_argument('--sampled_images_path', type=str, default='sampled_frames',
23 | help='Path to the directory containing sampled image files.')
24 | parser.add_argument('--gpu_index', type=int, default=0,
25 | help='Index of the GPU to use.')
26 |
27 | return parser.parse_args()
28 |
29 |
30 | args = parse_args()
31 |
32 | gpu_index = args.gpu_index
33 |
34 | videos_list = read_lines_from_file(args.input_videos)
35 | videos_list = sorted(videos_list)
36 | print('GPU: %d' % (gpu_index))
37 | os.makedirs(args.logs_path, exist_ok=True)
38 | os.makedirs(args.summary_path, exist_ok=True)
39 | os.makedirs(args.sparse_reconstuctions_root, exist_ok=True)
40 |
41 | i = 0
42 | for video in videos_list:
43 | pre = video.split('_')[0]
44 | if (not os.path.exists(os.path.join(args.sparse_reconstuctions_root, '%s' % video))):
45 | # check the number of images in this video
46 | with open(os.path.join(args.sampled_images_path, '%s_selected_frames.txt' % (video)), 'r') as f:
47 | lines = f.readlines()
48 | num_lines = len(lines)
49 | #print(f'The file {video} contains {num_lines} lines.')
50 | if num_lines < 100000: # skip videos with too many frames; reconstruction would take days
51 | print('Processing: ', video, '(',num_lines, 'images )')
52 | start_time = time.time()
53 |
54 | # Define the path to the shell script
55 | script_path = 'scripts/reconstruct_sparse.sh'
56 |
57 | # Create a unique copy of the script
58 | script_copy_path = video + '_' + str(os.getpid()) + '_' + os.path.basename(script_path)
59 | shutil.copy(script_path, script_copy_path)
60 |
61 | # Output file
62 | output_file_path = os.path.join(args.logs_path, script_copy_path.replace('.sh', '.out'))
63 |
64 |
65 | # Define the command to execute the script
66 | command = ["bash", script_copy_path, video, args.sparse_reconstuctions_root, args.epic_kithens_root, args.sampled_images_path, args.summary_path, str(gpu_index)]
67 | # Open the output file in write mode
68 | with open(output_file_path, 'w') as output_file:
69 | # Run the command and capture its output in real time
70 | process = subprocess.Popen(command, stdout=output_file, stderr=subprocess.PIPE, text=True)
71 | while True:
72 | output = process.stderr.readline()
73 | if output == '' and process.poll() is not None:
74 | break
75 | if output:
76 | output_file.write(output)
77 | output_file.flush()
78 |
79 | # Once the script has finished running, you can delete the copy of the script
80 | os.remove(script_copy_path)
81 |
82 | # If multiple models were reconstructed, keep the one with the largest number of images and rename it to 0
83 | reg_images = keep_model_with_largest_images(os.path.join(args.sparse_reconstuctions_root,video,'sparse'))
84 | if reg_images > 0:
85 | print(f"Registered_images/total_images: {reg_images}/{num_lines} = {round(reg_images/num_lines*100)}%")
86 | else:
87 | print('The video reconstruction failed! No reconstruction file was found.')
88 |
89 |
90 |
91 |
92 | print("Execution time: %s minutes" % round((time.time() - start_time)/60, 0))
93 | print('-----------------------------------------------------------')
94 |
95 | i += 1
96 |
97 |
--------------------------------------------------------------------------------
/register_dense.py:
--------------------------------------------------------------------------------
1 | import subprocess
2 | import shutil
3 | import os
4 | import time
5 | import glob
6 | import argparse
7 | import pycolmap
8 | from utils.lib import *
9 | # Function to parse command-line arguments
10 | def parse_args():
11 | parser = argparse.ArgumentParser(description='COLMAP Reconstruction Script')
12 | parser.add_argument('--input_videos', type=str, default='input_videos.txt',
13 | help='A file listing the videos to be processed in all stages')
14 | parser.add_argument('--sparse_reconstuctions_root', type=str, default='colmap_models/sparse',
15 | help='Path to the sparsely reconstructed models.')
16 | parser.add_argument('--dense_reconstuctions_root', type=str, default='colmap_models/dense',
17 | help='Path to the densely registered models.')
18 | parser.add_argument('--epic_kithens_root', type=str, default='.',
19 | help='Path to epic kitchens images.')
20 | parser.add_argument('--logs_path', type=str, default='logs/dense/out_logs_terminal',
21 | help='Path to store the log files.')
22 | parser.add_argument('--summary_path', type=str, default='logs/dense/out_summary',
23 | help='Path to store the summary files.')
24 | parser.add_argument('--gpu_index', type=int, default=0,
25 | help='Index of the GPU to use.')
26 |
27 | return parser.parse_args()
28 |
29 |
30 | args = parse_args()
31 |
32 | gpu_index = args.gpu_index
33 |
34 | videos_list = read_lines_from_file(args.input_videos)
35 | videos_list = sorted(videos_list)
36 | print('GPU: %d' % (gpu_index))
37 | os.makedirs(args.logs_path, exist_ok=True)
38 | os.makedirs(args.summary_path, exist_ok=True)
39 | os.makedirs(args.sparse_reconstuctions_root, exist_ok=True)
40 | os.makedirs(args.dense_reconstuctions_root, exist_ok=True)
41 |
42 |
43 | i = 0
44 | for video in videos_list:
45 | pre = video.split('_')[0]
46 | if (not os.path.exists(os.path.join(args.dense_reconstuctions_root, '%s' % video))):
47 | # check the number of images in this video
48 | num_lines = len(glob.glob(os.path.join(args.epic_kithens_root,pre,video,'*.jpg')))
49 |
50 | print('Processing: ', video, '(',num_lines, 'images )')
51 | start_time = time.time()
52 |
53 | # Define the path to the shell script
54 | script_path = 'scripts/register_dense.sh'
55 |
56 | # Create a unique copy of the script
57 | script_copy_path = video + '_' + str(os.getpid()) + '_' + os.path.basename(script_path)
58 | shutil.copy(script_path, script_copy_path)
59 |
60 | # Output file
61 | output_file_path = os.path.join(args.logs_path, script_copy_path.replace('.sh', '.out'))
62 |
63 |
64 | # Define the command to execute the script
65 | command = ["bash", script_copy_path, video, args.sparse_reconstuctions_root, args.dense_reconstuctions_root, args.epic_kithens_root, args.summary_path, str(gpu_index)]
66 | # Open the output file in write mode
67 | with open(output_file_path, 'w') as output_file:
68 | # Run the command and capture its output in real time
69 | process = subprocess.Popen(command, stdout=output_file, stderr=subprocess.PIPE, text=True)
70 | while True:
71 | output = process.stderr.readline()
72 | if output == '' and process.poll() is not None:
73 | break
74 | if output:
75 | output_file.write(output)
76 | output_file.flush()
77 |
78 | # Once the script has finished running, you can delete the copy of the script
79 | os.remove(script_copy_path)
80 |
81 |
82 | reg_images = get_num_images(os.path.join(args.dense_reconstuctions_root,video))
83 | if reg_images > 0:
84 | print(f"Registered_images/total_images: {reg_images}/{num_lines} = {round(reg_images/num_lines*100)}%")
85 | else:
86 | print('The video reconstruction failed! No COLMAP files were found.')
87 |
88 |
89 |
90 |
91 | print("Execution time: %s minutes" % round((time.time() - start_time)/60, 0))
92 | print('-----------------------------------------------------------')
93 |
94 | i += 1
95 |
96 |
--------------------------------------------------------------------------------
/scripts/reconstruct_sparse.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 | start=`date +%s`
3 |
4 | VIDEO=$1 #i.e. P02_14
5 | SPARSE_PATH=$2 # path to save the sparse models
6 | IMAGES_ROOT=$3 # root of epic kitchens images
7 | SAMPLED_IMAGES=$4 # path of the sampled images to be used for reconstruction
8 | LOGS=$5 # to save the output logs
9 | GPU_IDX=$6 # i.e. 0
10 |
11 | PRE=$(echo "$VIDEO" | cut -d'_' -f1)
12 | #cat $0 > "${LOGS}/$VIDEO.out"
13 | mkdir ${SPARSE_PATH}/${VIDEO}
14 | mkdir ${SPARSE_PATH}/${VIDEO}/sparse
15 |
16 | colmap feature_extractor \
17 | --database_path ${VIDEO}_database.db \
18 | --ImageReader.camera_model OPENCV \
19 | --image_list_path ${SAMPLED_IMAGES}/${VIDEO}_selected_frames.txt \
20 | --ImageReader.single_camera 1 \
21 | --SiftExtraction.use_gpu 1 \
22 | --SiftExtraction.gpu_index $GPU_IDX \
23 | --image_path ${IMAGES_ROOT}/${PRE}/${VIDEO} \
24 |
25 | colmap sequential_matcher \
26 | --database_path ${VIDEO}_database.db \
27 | --SiftMatching.use_gpu 1 \
28 | --SequentialMatching.loop_detection 1 \
29 | --SiftMatching.gpu_index $GPU_IDX \
30 | --SequentialMatching.vocab_tree_path vocab_bins/vocab_tree_flickr100K_words32K.bin \
31 |
32 | colmap mapper \
33 | --database_path ${VIDEO}_database.db \
34 | --image_path ${IMAGES_ROOT}/${PRE}/${VIDEO} \
35 | --output_path ${SPARSE_PATH}/${VIDEO}/sparse \
36 | --image_list_path ${SAMPLED_IMAGES}/${VIDEO}_selected_frames.txt \
37 |
38 |
39 | #echo "----------------------------------------------------------------------SUMMARY----------------------------------------------------------------------">> "${LOGS}/$VIDEO.out"
40 | colmap model_analyzer --path ${SPARSE_PATH}/${VIDEO}/sparse/0/ > "${LOGS}/$VIDEO.out"
41 |
42 | end=`date +%s`
43 | runtime=$(((end-start)/60))
44 | echo "$runtime minutes">> "${LOGS}/$VIDEO.out"
45 | mv ${VIDEO}_database.db ${SPARSE_PATH}/${VIDEO}/database.db #move the database
46 |
--------------------------------------------------------------------------------
/scripts/register_dense.sh:
--------------------------------------------------------------------------------
1 | start=`date +%s`
2 |
3 | VIDEO=$1 #i.e. P02_14
4 | SPARSE_PATH=$2 # path to save the sparse models
5 | DENSE_PATH=$3 # path to save the dense models
6 | IMAGES_ROOT=$4 # root of epic kitchens images
7 | LOGS=$5 # to save the output logs
8 | GPU_IDX=$6 # i.e. 0
9 |
10 | PRE=$(echo "$VIDEO" | cut -d'_' -f1)
11 |
12 | cp ${SPARSE_PATH}/${VIDEO}/database.db ${VIDEO}_database.db #copy the database from the sparse model
13 | mkdir ${DENSE_PATH}/${VIDEO}
14 |
15 | colmap feature_extractor \
16 | --database_path ${VIDEO}_database.db \
17 | --ImageReader.camera_model OPENCV \
18 | --ImageReader.single_camera 1 \
19 | --ImageReader.existing_camera_id 1 \
20 | --SiftExtraction.use_gpu 1 \
21 | --SiftExtraction.gpu_index $GPU_IDX \
22 | --image_path ${IMAGES_ROOT}/${PRE}/${VIDEO} \
23 |
24 |
25 |
26 | colmap sequential_matcher \
27 | --database_path ${VIDEO}_database.db \
28 | --SiftMatching.use_gpu 1 \
29 | --SequentialMatching.loop_detection 1 \
30 | --SiftMatching.gpu_index $GPU_IDX \
31 | --SequentialMatching.vocab_tree_path vocab_bins/vocab_tree_flickr100K_words32K.bin \
32 |
33 |
34 | colmap image_registrator \
35 | --database_path ${VIDEO}_database.db \
36 | --input_path ${SPARSE_PATH}/${VIDEO}/sparse/0 \
37 | --output_path ${DENSE_PATH}/${VIDEO} \
38 |
39 |
40 | colmap model_analyzer --path ${DENSE_PATH}/${VIDEO} > "${LOGS}/$VIDEO.out"
41 |
42 | end_reg=`date +%s`
43 |
44 | runtime=$(((end_reg-start)/60))
45 | echo "$runtime minutes (registration time)">> "${LOGS}/$VIDEO.out"
46 |
47 | rm ${VIDEO}_database.db # remove the database since it is large; keep it if your use case needs it
48 |
--------------------------------------------------------------------------------
/select_sparse_frames.py:
--------------------------------------------------------------------------------
1 | import subprocess
2 | import concurrent.futures
3 | import glob
4 | import os
5 | import argparse
6 | from utils.lib import *
7 | # Function to parse command-line arguments
8 | def parse_args():
9 | parser = argparse.ArgumentParser(description='COLMAP Reconstruction Script')
10 | parser.add_argument('--input_videos', type=str, default='input_videos.txt',
11 | help='A file listing the videos to be processed in all stages')
12 | parser.add_argument('--epic_kithens_root', type=str, default='.',
13 | help='Path to epic kitchens images.')
14 | parser.add_argument('--sampled_images_path', type=str, default='sampled_frames',
15 | help='Path to the directory containing sampled image files.')
16 | parser.add_argument('--homography_overlap', type=float, default=0.9,
17 | help='Homography overlap threshold for sampling new frames; a higher value samples more images')
18 | parser.add_argument('--max_concurrent', type=int, default=8,
19 | help='Max number of concurrent processes')
20 | return parser.parse_args()
21 |
22 |
23 |
24 |
25 | def main():
26 | args = parse_args()
27 |
28 | videos = read_lines_from_file(args.input_videos)
29 | epic_root = args.epic_kithens_root
30 | params_list = []
31 | for video in videos:
32 | video_pre = video.split('_')[0]
33 | for folder in sorted(glob.glob(os.path.join(epic_root,video_pre+'/*'))):
34 | video = folder.split('/')[-1]
35 | if video in videos:
36 | print(video)
37 | added_run = ['--src', folder, '--dst_file', '%s/%s_selected_frames.txt'%(args.sampled_images_path,video), '--overlap', str(args.homography_overlap)]
38 | if not added_run in params_list:
39 | params_list.append(added_run)
40 |
41 | if params_list:
42 | max_concurrent = args.max_concurrent
43 | # Create a process pool executor with a maximum of K processes
44 | executor = concurrent.futures.ProcessPoolExecutor(max_workers=max_concurrent)
45 |
46 | # Submit the tasks to the executor
47 | results = []
48 | for i in range(len(params_list)):
49 | future = executor.submit(run_script, 'homography_filter/filter.py', params_list[i])
50 | results.append(future)
51 |
52 | # Wait for all tasks to complete
53 | for r in concurrent.futures.as_completed(results):
54 | try:
55 | r.result()
56 | except Exception as e:
57 | print(f"Error occurred: {e}")
58 |
59 | # Shut down the executor
60 | executor.shutdown()
61 |
62 |
63 | if __name__ == '__main__':
64 | main()
--------------------------------------------------------------------------------
/tools/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/tools/__init__.py
--------------------------------------------------------------------------------
/tools/common_functions.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 |
3 |
4 | """ Source: see COLMAP """
5 | def qvec2rotmat(qvec):
6 | return np.array([
7 | [1 - 2 * qvec[2]**2 - 2 * qvec[3]**2,
8 | 2 * qvec[1] * qvec[2] - 2 * qvec[0] * qvec[3],
9 | 2 * qvec[3] * qvec[1] + 2 * qvec[0] * qvec[2]],
10 | [2 * qvec[1] * qvec[2] + 2 * qvec[0] * qvec[3],
11 | 1 - 2 * qvec[1]**2 - 2 * qvec[3]**2,
12 | 2 * qvec[2] * qvec[3] - 2 * qvec[0] * qvec[1]],
13 | [2 * qvec[3] * qvec[1] - 2 * qvec[0] * qvec[2],
14 | 2 * qvec[2] * qvec[3] + 2 * qvec[0] * qvec[1],
15 | 1 - 2 * qvec[1]**2 - 2 * qvec[2]**2]])
16 |
17 |
18 |
19 | def get_c2w(img_data: list) -> np.ndarray:
20 | """
21 | Args:
22 | img_data: list, [qvec, tvec] of w2c
23 |
24 | Returns:
25 | c2w: np.ndarray, 4x4 camera-to-world matrix
26 | """
27 | w2c = np.eye(4)
28 | w2c[:3, :3] = qvec2rotmat(img_data[:4])
29 | w2c[:3, -1] = img_data[4:7]
30 | c2w = np.linalg.inv(w2c)
31 | return c2w
32 |
--------------------------------------------------------------------------------
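A minimal sketch of how `get_c2w` is used; the pose values below are placeholders in the `[qw, qx, qy, qz, tx, ty, tz]` world-to-camera convention stored per frame:

```python
import numpy as np
from tools.common_functions import get_c2w

# Placeholder world-to-camera pose for a single frame.
pose_w2c = [1.0, 0.0, 0.0, 0.0, 0.5, -0.2, 1.3]

c2w = get_c2w(pose_w2c)      # 4x4 camera-to-world matrix
cam_centre = c2w[:3, 3]      # camera position in world coordinates
print(np.round(c2w, 3))
print('camera centre:', cam_centre)
```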
/tools/project_3d_line.py:
--------------------------------------------------------------------------------
1 | from typing import List, Dict
2 | import argparse
3 | import json
4 | import os
5 | import re
6 | import os.path as osp
7 | import tqdm
8 | import numpy as np
9 |
10 | import cv2
11 | from PIL import Image
12 |
13 | from tools.common_functions import qvec2rotmat
14 |
15 |
16 | class Line:
17 | """ An infinite 3D line to denote Annotated Line """
18 |
19 | def __init__(self, line_ends: np.ndarray):
20 | """
21 | Args:
22 | line_ends: (2, 3)
23 | points annotated using some GUI, denoting points along the desired line
24 | """
25 | st, ed = line_ends
26 | self.vc = (st + ed) / 2
27 | self.dir = ed - st
28 | self.v0 = st
29 | self.v1 = ed
30 |
31 | def __repr__(self) -> str:
32 | return f'vc: {str(self.vc)} \ndir: {str(self.dir)}'
33 |
34 | def check_single_point(self,
35 | point: np.ndarray,
36 | radius: float) -> bool:
37 | """
38 | point-to-line = (|(p-v_0)x(p-v_1)|)/(|v_1 - v_0|)
39 |
40 | Args:
41 | point: (3,) array of point
42 | radius: threshold for checking inside
43 | """
44 | area2 = np.linalg.norm(np.cross(point - self.v0, point - self.v1))
45 | base_len = np.linalg.norm(self.v1 - self.v0)
46 | d = area2 / base_len
47 | return bool(d < radius)
48 |
49 | def check_points(self,
50 | points: np.ndarray,
51 | diameter: float) -> np.ndarray:
52 | """
53 | Args:
54 | points: (N, 3) array of points
55 | diameter: threshold for checking inside
56 |
57 | Returns:
58 | (N,) bool array
59 | """
60 | area2 = np.linalg.norm(np.cross(points - self.v0, points - self.v1), axis=1)
61 | base_len = np.linalg.norm(self.v1 - self.v0)
62 | d = area2 / base_len
63 | return d < diameter
64 |
65 |
66 | def line_rectangle_check(cen, dir, rect,
67 | eps=1e-6):
68 | """
69 | Args:
70 | cen, dir: (2,) float
71 | rect: Tuple (xmin, ymin, xmax, ymax)
72 |
73 | Returns:
74 | num_intersect: int
75 | inters: (num_intersect, 2) float
76 | """
77 | x1, y1 = cen
78 | u1, v1 = dir
79 | xmin, ymin, xmax, ymax = rect
80 | rect_loop = np.asarray([
81 | [xmin, ymin], [xmax, ymin], [xmax, ymax], [xmin, ymax],
82 | [xmin, ymin]
83 | ], dtype=np.float32)
84 | x2, y2 = rect_loop[:4, 0], rect_loop[:4, 1]
85 | u2 = rect_loop[1:, 0] - rect_loop[:-1, 0]
86 | v2 = rect_loop[1:, 1] - rect_loop[:-1, 1]
87 |
88 | t2 = (v1*x1 - u1*y1) - (v1*x2 - u1*y2)
89 | divisor = (v1*u2 - v2*u1)
90 | cond = np.abs(divisor) > eps
91 |
92 | t2[~cond] = -1
93 | t2[cond] = t2[cond] / divisor[cond]
94 |
95 | keep = (t2 >= 0) & (t2 <= 1)
96 | num_intersect = np.sum(keep)
97 | uv = np.stack([u2, v2], 1)
98 | inters = rect_loop[:4, :] + t2[:, None] * uv
99 | inters = inters[keep, :]
100 | return num_intersect, inters
101 |
102 |
103 | def project_line_image(line: Line,
104 | pose_data: list,
105 | camera: dict):
106 | """ Project a 3D line using camera pose and intrinsics
107 |
108 | This implementation ignores distortion.
109 |
110 | Args:
111 | line:
112 | -vc: (3,) float
113 | -dir: (3,) float
114 | pose_data: stores camera pose
115 | [qw, qx, qy, qz, tx, ty, tz, frame_name]
116 | camera: dict, stores intrinsics
117 | -width,
118 | -height
119 | -params (8,) fx, fy, cx, cy, k1, k2, p1, p2
120 |
121 | Returns:
122 | (st, ed): (2,) float
123 | """
124 | cen, dir = line.vc, line.dir
125 | rot_w2c = qvec2rotmat(pose_data[:4])
126 | tvec = np.asarray(pose_data[4:7])
127 | # Represent as column vector
128 | cen = rot_w2c @ cen + tvec
129 | dir = rot_w2c @ dir
130 | width, height = camera['width'], camera['height']
131 | fx, fy, cx, cy, k1, k2, p1, p2 = camera['params']
132 |
133 | cen_uv = cen[:2] / cen[2]
134 | cen_uv = cen_uv * np.array([fx, fy]) + np.array([cx, cy])
135 | dir_uv = ((dir + cen)[:2] / (dir + cen)[2]) - (cen[:2] / cen[2])
136 | dir_uv = dir_uv * np.array([fx, fy])
137 | dir_uv = dir_uv / np.linalg.norm(dir_uv)
138 |
139 | line2d = None
140 | num_inters, inters = line_rectangle_check(
141 | cen_uv, dir_uv, (0, 0, width, height))
142 | if num_inters == 2:
143 | line2d = (inters[0], inters[1])
144 | return line2d
145 |
146 |
147 | class LineProjector:
148 |
149 | COLORS = dict(yellow=(255, 255, 0),)
150 |
151 | def __init__(self,
152 | camera: Dict,
153 | images: Dict[str, List],
154 | line: Line):
155 | """
156 | Args:
157 | camera: dict, camera info
158 | images: dict of
159 | frame_name: [qw, qx, qy, qz, tx, ty, tz] in **w2c**
160 | """
161 | self.camera = camera
162 | self.images = images
163 | self.line = line
164 | self.line_color = self.COLORS['yellow']
165 |
166 | def project_frame(self, frame_name: str, frames_root: str) -> np.ndarray:
167 | """ Project a line onto a frame
168 |
169 | Args:
170 | frame_name: str. epic-kitchens frame file name, e.g. 'frame_0000000080.jpg'
171 | frames_root: str.
172 | f'{frames_root}/{frame_name}' is the path to the epic-kitchens frame
173 |
174 | Returns:
175 | img: (H, W, 3) np.uint8
176 | """
177 | pose_data = self.images[frame_name]
178 | img_path = osp.join(frames_root, frame_name)
179 | img = np.asarray(Image.open(img_path))
180 | line_2d = project_line_image(self.line, pose_data, self.camera)
181 | if line_2d is None:
182 | return img
183 | img = cv2.line(
184 | img, np.int32(line_2d[0]), np.int32(line_2d[1]),
185 | color=self.line_color, thickness=2, lineType=cv2.LINE_AA)
186 |
187 | return img
188 |
189 | def write_mp4(self,
190 | frames_root: str,
191 | fps=5,
192 | out_name='line_output'):
193 | """ Write mp4 file that has line projected on the image frames
194 |
195 | Args:
196 | frames_root: str.
197 | f'{frame_root}/frame_{frame_idx:010d}.jpg' is the path to the epic-kitchens frame
198 | """
199 | out_dir = os.path.join('./outputs/', out_name)
200 | os.makedirs(out_dir, exist_ok=True)
201 | fmt = os.path.join(out_dir, '{}')
202 |
203 | frames_on_disk = set(os.listdir(frames_root))
204 | frame_names = set(self.images.keys())
205 | if len(frames_on_disk) < len(frame_names):
206 | print(f"Showing {len(frames_on_disk)} / {len(frame_names)} frames")
207 | frame_names = frame_names.intersection(frames_on_disk)
208 | frame_names = sorted(list(frame_names))
209 | for frame_name in tqdm.tqdm(frame_names):
210 | img = self.project_frame(frame_name, frames_root)
211 | frame_number = re.search(r'\d{10,}', frame_name)[0]
212 | cv2.putText(img, frame_number,
213 | (self.camera['width']//4, self.camera['height'] * 31 // 32),
214 | cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA)
215 | Image.fromarray(img).save(fmt.format(frame_name))
216 |
217 | from moviepy import editor
218 | clip = editor.ImageSequenceClip(sequence=out_dir, fps=fps)
219 | clip.write_videofile(f'./outputs/{out_name}-fps{fps}.mp4')
220 |
221 |
222 | if __name__ == '__main__':
223 | parser = argparse.ArgumentParser()
224 | parser.add_argument('--json-data', type=str, required=True)
225 | parser.add_argument('--line-data', type=str, required=True)
226 | parser.add_argument('--frames-root', type=str, required=True)
227 | parser.add_argument('--out-name', type=str, default="line_output")
228 | parser.add_argument('--fps', type=int, default=5)
229 | args = parser.parse_args()
230 |
231 | with open(args.json_data) as f:
232 | model = json.load(f)
233 | camera = model['camera']
234 | images = model['images']
235 |
236 | with open(args.line_data) as f:
237 | line = json.load(f)
238 | line = np.asarray(line).reshape(2, 3)
239 | line = Line(line)
240 |
241 | runner = LineProjector(camera, images, line)
242 | runner.write_mp4(
243 | frames_root=args.frames_root, fps=args.fps, out_name=args.out_name)
244 |
--------------------------------------------------------------------------------
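The `--line-data` file read above (and by `tools/visualise_data_open3d.py`) holds two 3D endpoints that get reshaped to `(2, 3)`. A minimal sketch for writing such a file, with placeholder coordinates:

```python
import json
import numpy as np

# Two placeholder endpoints of the annotated line, in the reconstruction's world frame.
line_ends = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.5, 0.2]])

# A flat list of 6 floats is enough, since the loaders call .reshape(2, 3).
with open('my_line.json', 'w') as f:
    json.dump(line_ends.reshape(-1).tolist(), f)
```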
/tools/visualise_data_open3d.py:
--------------------------------------------------------------------------------
1 | import open3d as o3d
2 | import numpy as np
3 | from argparse import ArgumentParser
4 | import json
5 |
6 | from tools.common_functions import get_c2w
7 |
8 | """ Visualize poses and point-cloud stored in json file."""
9 |
10 | def parse_args():
11 | parser = ArgumentParser()
12 | parser.add_argument('--json-data', help='path to json data', required=True)
13 | parser.add_argument('--line-data', help='path to line data', default=None)
14 | parser.add_argument(
15 | '--num-display-poses', type=int, default=500,
16 | help='display at most num-display-poses (evenly sampled) to avoid creating too many poses')
17 | parser.add_argument('--frustum-size', type=float, default=0.1)
18 | return parser.parse_args()
19 |
20 |
21 | def get_frustum(c2w: np.ndarray,
22 | sz=0.2,
23 | camera_height=None,
24 | camera_width=None,
25 | frustum_color=[1, 0, 0]) -> o3d.geometry.LineSet:
26 | """
27 | Args:
28 | c2w: np.ndarray, 4x4 camera-to-world matrix
29 | sz: float, size (width) of the frustum
30 | Returns:
31 | frustum: o3d.geometry.LineSet
32 | """
33 | cen = [0, 0, 0]
34 | wid = sz
35 | if camera_height is not None and camera_width is not None:
36 | hei = wid * camera_height / camera_width
37 | else:
38 | hei = wid
39 | tl = [wid, hei, sz]
40 | tr = [-wid, hei, sz]
41 | br = [-wid, -hei, sz]
42 | bl = [wid, -hei, sz]
43 | points = np.float32([cen, tl, tr, br, bl])
44 | lines = [
45 | [0, 1], [0, 2], [0, 3], [0, 4],
46 | [1, 2], [2, 3], [3, 4], [4, 1],]
47 | frustum = o3d.geometry.LineSet()
48 | frustum.points = o3d.utility.Vector3dVector(points)
49 | frustum.lines = o3d.utility.Vector2iVector(lines)
50 | frustum.colors = o3d.utility.Vector3dVector([np.asarray([1, 0, 0])])
51 | frustum.paint_uniform_color(frustum_color)
52 |
53 | frustum = frustum.transform(c2w)
54 | return frustum
55 |
56 |
57 | if __name__ == "__main__":
58 | args = parse_args()
59 | frustum_size = args.frustum_size
60 |
61 | vis = o3d.visualization.Visualizer()
62 | vis.create_window()
63 | with open(args.json_data, 'r') as f:
64 | model = json.load(f)
65 |
66 | """ Points """
67 | points = model['points']
68 | pcd_np = [v[:3] for v in points]
69 | pcd_rgb = [np.asarray(v[3:6]) / 255 for v in points]
70 | pcd = o3d.geometry.PointCloud()
71 | pcd.points = o3d.utility.Vector3dVector(pcd_np)
72 | pcd.colors = o3d.utility.Vector3dVector(pcd_rgb)
73 | vis.add_geometry(pcd, reset_bounding_box=True)
74 |
75 | """ Camear Poses """
76 | camera = model['camera']
77 | cam_h, cam_w = camera['height'], camera['width']
78 | c2w_list = [get_c2w(img) for img in model['images'].values()]
79 | c2w_sel_inds = np.linspace(0, len(c2w_list)-1, args.num_display_poses).astype(int)
80 | c2w_sel = [c2w_list[i] for i in c2w_sel_inds]
81 | frustums = [
82 | get_frustum(c2w, sz=frustum_size, camera_height=cam_h, camera_width=cam_w)
83 | for c2w in c2w_sel
84 | ]
85 | for frustum in frustums:
86 | vis.add_geometry(frustum, reset_bounding_box=True)
87 |
88 | """ Optional: Line """
89 | if args.line_data is not None:
90 | line_set = o3d.geometry.LineSet()
91 | with open(args.line_data, 'r') as f:
92 | line_points = np.asarray(json.load(f)).reshape(2, 3)
93 | vc = line_points.mean(axis=0)
94 | dir = line_points[1] - line_points[0]
95 | lst = vc + 2 * dir
96 | led = vc - 2 * dir
97 | lines = [lst, led]
98 | line_set.points = o3d.utility.Vector3dVector(lines)
99 | line_set.lines = o3d.utility.Vector2iVector([[0, 1]])
100 | vis.add_geometry(line_set, reset_bounding_box=True)
101 |
102 | control = vis.get_view_control()
103 | control.set_front([1, 1, 1])
104 | control.set_lookat([0, 0, 0])
105 | control.set_up([0, 0, 1])
106 | control.set_zoom(1)
107 |
108 | vis.run()
109 | vis.destroy_window()
110 |
--------------------------------------------------------------------------------
/tools/visualize_colmap_open3d.py:
--------------------------------------------------------------------------------
1 | import open3d as o3d
2 | import numpy as np
3 | from argparse import ArgumentParser
4 | from utils.base_type import ColmapModel
5 | from tools.visualise_data_open3d import get_c2w, get_frustum
6 |
7 | """TODO
8 | 1. Frustum, on/off
9 | 2. Line (saved in json)
10 | """
11 |
12 | def parse_args():
13 | parser = ArgumentParser()
14 | parser.add_argument('--model', help="path to directory containing images.bin", required=True)
15 | parser.add_argument('--pcd-path', help="path to fused.ply", default=None)
16 | parser.add_argument('--show-mesh-frame', default=False)
17 | parser.add_argument('--specify-frame-name', default=None)
18 | parser.add_argument(
19 | '--num-display-poses', type=int, default=500,
20 | help='display at most num-display-poses (evenly sampled) to avoid creating too many poses')
21 | return parser.parse_args()
22 |
23 | if __name__ == "__main__":
24 | args = parse_args()
25 |
26 | model_path = args.model
27 | mod = ColmapModel(args.model)
28 | if args.pcd_path is not None:
29 | pcd = o3d.io.read_point_cloud(args.pcd_path)
30 | else:
31 | pcd_np = np.asarray([v.xyz for v in mod.points.values()])
32 | pcd_rgb = np.asarray([v.rgb / 255 for v in mod.points.values()])
33 | # Remove too far points from GUI -- usually noise
34 | pcd_np_center = np.mean(pcd_np, axis=0)
35 | pcd_ind = np.linalg.norm(pcd_np - pcd_np_center, axis=1) < 500
36 | pcd_np, pcd_rgb = pcd_np[pcd_ind], pcd_rgb[pcd_ind]
37 |
38 | pcd = o3d.geometry.PointCloud()
39 | pcd.points = o3d.utility.Vector3dVector(pcd_np)
40 | pcd.colors = o3d.utility.Vector3dVector(pcd_rgb)
41 |
42 | mesh_frame = o3d.geometry.TriangleMesh.create_coordinate_frame(
43 | size=1.0, origin=[0, 0, 0])
44 |
45 | vis = o3d.visualization.Visualizer()
46 | vis.create_window()
47 | vis.add_geometry(pcd, reset_bounding_box=True)
48 | if args.show_mesh_frame:
49 | vis.add_geometry(mesh_frame, reset_bounding_box=True)
50 |
51 | frustum_size = 0.1
52 | camera = mod.camera
53 | cam_h, cam_w = camera.height, camera.width
54 | """ Camear Poses """
55 | if args.specify_frame_name is not None:
56 | qvec, tvec = [
57 | (v.qvec, v.tvec) for k, v in mod.images.items() if v.name == args.specify_frame_name][0]
58 | img_data = [qvec[0], qvec[1], qvec[2], qvec[3], tvec[0], tvec[1], tvec[2]]
59 | c2w = get_c2w(img_data)
60 | frustum = get_frustum(c2w, sz=frustum_size, camera_height=cam_h, camera_width=cam_w)
61 | vis.add_geometry(frustum, reset_bounding_box=True)
62 | else:
63 | qtvecs = [list(v.qvec) + list(v.tvec) for v in mod.images.values()]
64 | qtvecs = [qtvecs[i]
65 | for i in np.linspace(0, len(qtvecs)-1, args.num_display_poses).astype(int)]
66 | c2w_list = [get_c2w(img) for img in qtvecs]
67 | for c2w in c2w_list:
68 | frustum = get_frustum(c2w, sz=frustum_size, camera_height=cam_h, camera_width=cam_w)
69 | vis.add_geometry(frustum, reset_bounding_box=True)
70 |
71 | control = vis.get_view_control()
72 | control.set_front([1, 1, 1])
73 | control.set_lookat([0, 0, 0])
74 | control.set_up([0, 0, 1])
75 | control.set_zoom(1.0)
76 |
77 | vis.run()
78 | vis.destroy_window()
79 |
--------------------------------------------------------------------------------
/utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/utils/__init__.py
--------------------------------------------------------------------------------
/utils/base_type.py:
--------------------------------------------------------------------------------
1 | from typing import List
2 | import json
3 | from functools import cached_property
4 | from utils.colmap_utils import (
5 | read_cameras_binary, read_points3d_binary,
6 | read_images_binary, BaseImage)
7 | from utils.colmap_utils import Image as ColmapImage
8 |
9 |
10 |
11 | class ColmapModel:
12 |
13 | """
14 | NOTE: this class shares common code with line_check.LineChecker;
15 | consider reusing that code.
16 | """
17 | def __init__(self, model_dir: str):
18 |
19 | def _as_list(path, func):
20 | return func(path)
21 |
22 | cameras = _as_list(
23 | f'{model_dir}/cameras.bin', read_cameras_binary)
24 | if len(cameras) != 1:
25 | print("Found more than one camera!")
26 | self.camera = cameras[1]
27 | self.points = _as_list(
28 | f'{model_dir}/points3D.bin', read_points3d_binary)
29 | self.images = _as_list(
30 | f'{model_dir}/images.bin', read_images_binary)
31 |
32 | def __repr__(self) -> str:
33 | return f'{self.num_images} images - {self.num_points} points'
34 |
35 | @property
36 | def example_data(self):
37 | ki = list(self.images.keys())[0]
38 | img = self.images[ki]
39 | kp = list(self.points.keys())[0]
40 | point = self.points[kp]
41 | return img, point
42 |
43 | @cached_property
44 | def ordered_image_ids(self):
45 | return sorted(self.images.keys(), key=lambda x: self.images[x].name)
46 |
47 | @property
48 | def num_points(self):
49 | return len(self.points)
50 |
51 | @property
52 | def num_images(self):
53 | return len(self.images)
54 |
55 | @property
56 | def ordered_images(self) -> List[BaseImage]:
57 | return [self.images[i] for i in self.ordered_image_ids]
58 |
59 | def get_image_by_id(self, image_id: int):
60 | return self.images[image_id]
61 |
62 |
63 | class JsonColmapModel:
64 | def __init__(self, json_path_or_dict):
65 | if isinstance(json_path_or_dict, str):
66 | with open(json_path_or_dict) as f:
67 | model = json.load(f)
68 | elif isinstance(json_path_or_dict, dict):
69 | model = json_path_or_dict
70 | self.camera = model['camera']
71 | self.points = model['points']
72 | self.images = [
73 | model['images'][k] + [k] for k in sorted(model['images'].keys())
74 | ] # qw, qx, qy, qz, tx, ty, tz, frame_name
75 |
76 | @property
77 | def ordered_image_ids(self):
78 | return list(range(len(self.images)))
79 |
80 | @property
81 | def ordered_images(self) -> List[ColmapImage]:
82 | return [self.get_image_by_id(i) for i in self.ordered_image_ids]
83 |
84 | def get_image_by_id(self, image_id: int) -> ColmapImage:
85 | img_info = self.images[image_id]
86 | cimg = ColmapImage(
87 | id=image_id, qvec=img_info[:4], tvec=img_info[4:7], camera_id=0,
88 | name=img_info[7], xys=[], point3D_ids=[])
89 | return cimg
90 |
--------------------------------------------------------------------------------
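A minimal sketch of loading the released per-video JSON through `JsonColmapModel`; the path assumes the example file shipped in `example_data/`:

```python
from utils.base_type import JsonColmapModel

model = JsonColmapModel('example_data/P28_101.json')

print(model.camera['width'], model.camera['height'])
for img in model.ordered_images[:3]:
    # Each entry is a COLMAP-style Image; qvec/tvec are world-to-camera.
    print(img.name, img.qvec, img.tvec)
```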
/utils/colmap_utils.py:
--------------------------------------------------------------------------------
1 | # Copyright (c) 2018, ETH Zurich and UNC Chapel Hill.
2 | # All rights reserved.
3 | #
4 | # Redistribution and use in source and binary forms, with or without
5 | # modification, are permitted provided that the following conditions are met:
6 | #
7 | # * Redistributions of source code must retain the above copyright
8 | # notice, this list of conditions and the following disclaimer.
9 | #
10 | # * Redistributions in binary form must reproduce the above copyright
11 | # notice, this list of conditions and the following disclaimer in the
12 | # documentation and/or other materials provided with the distribution.
13 | #
14 | # * Neither the name of ETH Zurich and UNC Chapel Hill nor the names of
15 | # its contributors may be used to endorse or promote products derived
16 | # from this software without specific prior written permission.
17 | #
18 | # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
19 | # AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
20 | # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
21 | # ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE
22 | # LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
23 | # CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
24 | # SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
25 | # INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
26 | # CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
27 | # ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
28 | # POSSIBILITY OF SUCH DAMAGE.
29 | #
30 | # Author: Johannes L. Schoenberger (jsch at inf.ethz.ch)
31 |
32 | import os
33 | import sys
34 | import collections
35 | import numpy as np
36 | import struct
37 |
38 |
39 | CameraModel = collections.namedtuple(
40 | "CameraModel", ["model_id", "model_name", "num_params"])
41 | Camera = collections.namedtuple(
42 | "Camera", ["id", "model", "width", "height", "params"])
43 | BaseImage = collections.namedtuple(
44 | "Image", ["id", "qvec", "tvec", "camera_id", "name", "xys", "point3D_ids"])
45 | Point3D = collections.namedtuple(
46 | "Point3D", ["id", "xyz", "rgb", "error", "image_ids", "point2D_idxs"])
47 |
48 | class Image(BaseImage):
49 | def qvec2rotmat(self):
50 | return qvec2rotmat(self.qvec)
51 |
52 |
53 | CAMERA_MODELS = {
54 | CameraModel(model_id=0, model_name="SIMPLE_PINHOLE", num_params=3),
55 | CameraModel(model_id=1, model_name="PINHOLE", num_params=4),
56 | CameraModel(model_id=2, model_name="SIMPLE_RADIAL", num_params=4),
57 | CameraModel(model_id=3, model_name="RADIAL", num_params=5),
58 | CameraModel(model_id=4, model_name="OPENCV", num_params=8),
59 | CameraModel(model_id=5, model_name="OPENCV_FISHEYE", num_params=8),
60 | CameraModel(model_id=6, model_name="FULL_OPENCV", num_params=12),
61 | CameraModel(model_id=7, model_name="FOV", num_params=5),
62 | CameraModel(model_id=8, model_name="SIMPLE_RADIAL_FISHEYE", num_params=4),
63 | CameraModel(model_id=9, model_name="RADIAL_FISHEYE", num_params=5),
64 | CameraModel(model_id=10, model_name="THIN_PRISM_FISHEYE", num_params=12)
65 | }
66 | CAMERA_MODEL_IDS = dict([(camera_model.model_id, camera_model) \
67 | for camera_model in CAMERA_MODELS])
68 |
69 |
70 | def read_next_bytes(fid, num_bytes, format_char_sequence, endian_character="<"):
71 | """Read and unpack the next bytes from a binary file.
72 | :param fid:
73 | :param num_bytes: Sum of combination of {2, 4, 8}, e.g. 2, 6, 16, 30, etc.
74 | :param format_char_sequence: List of {c, e, f, d, h, H, i, I, l, L, q, Q}.
75 | :param endian_character: Any of {@, =, <, >, !}
76 | :return: Tuple of read and unpacked values.
77 | """
78 | data = fid.read(num_bytes)
79 | return struct.unpack(endian_character + format_char_sequence, data)
80 |
81 |
82 | def read_cameras_text(path):
83 | """
84 | see: src/base/reconstruction.cc
85 | void Reconstruction::WriteCamerasText(const std::string& path)
86 | void Reconstruction::ReadCamerasText(const std::string& path)
87 | """
88 | cameras = {}
89 | with open(path, "r") as fid:
90 | while True:
91 | line = fid.readline()
92 | if not line:
93 | break
94 | line = line.strip()
95 | if len(line) > 0 and line[0] != "#":
96 | elems = line.split()
97 | camera_id = int(elems[0])
98 | model = elems[1]
99 | width = int(elems[2])
100 | height = int(elems[3])
101 | params = np.array(tuple(map(float, elems[4:])))
102 | cameras[camera_id] = Camera(id=camera_id, model=model,
103 | width=width, height=height,
104 | params=params)
105 | return cameras
106 |
107 |
108 | def read_cameras_binary(path_to_model_file):
109 | """
110 | see: src/base/reconstruction.cc
111 | void Reconstruction::WriteCamerasBinary(const std::string& path)
112 | void Reconstruction::ReadCamerasBinary(const std::string& path)
113 | """
114 | cameras = {}
115 | with open(path_to_model_file, "rb") as fid:
116 | num_cameras = read_next_bytes(fid, 8, "Q")[0]
117 | for camera_line_index in range(num_cameras):
118 | camera_properties = read_next_bytes(
119 | fid, num_bytes=24, format_char_sequence="iiQQ")
120 | camera_id = camera_properties[0]
121 | model_id = camera_properties[1]
122 | model_name = CAMERA_MODEL_IDS[camera_properties[1]].model_name
123 | width = camera_properties[2]
124 | height = camera_properties[3]
125 | num_params = CAMERA_MODEL_IDS[model_id].num_params
126 | params = read_next_bytes(fid, num_bytes=8*num_params,
127 | format_char_sequence="d"*num_params)
128 | cameras[camera_id] = Camera(id=camera_id,
129 | model=model_name,
130 | width=width,
131 | height=height,
132 | params=np.array(params))
133 | assert len(cameras) == num_cameras
134 | return cameras
135 |
136 |
137 | def read_images_text(path):
138 | """
139 | see: src/base/reconstruction.cc
140 | void Reconstruction::ReadImagesText(const std::string& path)
141 | void Reconstruction::WriteImagesText(const std::string& path)
142 | """
143 | images = {}
144 | with open(path, "r") as fid:
145 | while True:
146 | line = fid.readline()
147 | if not line:
148 | break
149 | line = line.strip()
150 | if len(line) > 0 and line[0] != "#":
151 | elems = line.split()
152 | image_id = int(elems[0])
153 | qvec = np.array(tuple(map(float, elems[1:5])))
154 | tvec = np.array(tuple(map(float, elems[5:8])))
155 | camera_id = int(elems[8])
156 | image_name = elems[9]
157 | elems = fid.readline().split()
158 | xys = np.column_stack([tuple(map(float, elems[0::3])),
159 | tuple(map(float, elems[1::3]))])
160 | point3D_ids = np.array(tuple(map(int, elems[2::3])))
161 | images[image_id] = Image(
162 | id=image_id, qvec=qvec, tvec=tvec,
163 | camera_id=camera_id, name=image_name,
164 | xys=xys, point3D_ids=point3D_ids)
165 | return images
166 |
167 |
168 | def read_images_binary(path_to_model_file):
169 | """
170 | see: src/base/reconstruction.cc
171 | void Reconstruction::ReadImagesBinary(const std::string& path)
172 | void Reconstruction::WriteImagesBinary(const std::string& path)
173 | """
174 | images = {}
175 | with open(path_to_model_file, "rb") as fid:
176 | num_reg_images = read_next_bytes(fid, 8, "Q")[0]
177 | for image_index in range(num_reg_images):
178 | binary_image_properties = read_next_bytes(
179 | fid, num_bytes=64, format_char_sequence="idddddddi")
180 | image_id = binary_image_properties[0]
181 | qvec = np.array(binary_image_properties[1:5])
182 | tvec = np.array(binary_image_properties[5:8])
183 | camera_id = binary_image_properties[8]
184 | image_name = ""
185 | current_char = read_next_bytes(fid, 1, "c")[0]
186 | while current_char != b"\x00": # look for the ASCII 0 entry
187 | image_name += current_char.decode("utf-8")
188 | current_char = read_next_bytes(fid, 1, "c")[0]
189 | num_points2D = read_next_bytes(fid, num_bytes=8,
190 | format_char_sequence="Q")[0]
191 | x_y_id_s = read_next_bytes(fid, num_bytes=24*num_points2D,
192 | format_char_sequence="ddq"*num_points2D)
193 | xys = np.column_stack([tuple(map(float, x_y_id_s[0::3])),
194 | tuple(map(float, x_y_id_s[1::3]))])
195 | point3D_ids = np.array(tuple(map(int, x_y_id_s[2::3])))
196 | images[image_id] = Image(
197 | id=image_id, qvec=qvec, tvec=tvec,
198 | camera_id=camera_id, name=image_name,
199 | xys=xys, point3D_ids=point3D_ids)
200 | return images
201 |
202 |
203 | def read_points3D_text(path):
204 | """
205 | see: src/base/reconstruction.cc
206 | void Reconstruction::ReadPoints3DText(const std::string& path)
207 | void Reconstruction::WritePoints3DText(const std::string& path)
208 | """
209 | points3D = {}
210 | with open(path, "r") as fid:
211 | while True:
212 | line = fid.readline()
213 | if not line:
214 | break
215 | line = line.strip()
216 | if len(line) > 0 and line[0] != "#":
217 | elems = line.split()
218 | point3D_id = int(elems[0])
219 | xyz = np.array(tuple(map(float, elems[1:4])))
220 | rgb = np.array(tuple(map(int, elems[4:7])))
221 | error = float(elems[7])
222 | image_ids = np.array(tuple(map(int, elems[8::2])))
223 | point2D_idxs = np.array(tuple(map(int, elems[9::2])))
224 | points3D[point3D_id] = Point3D(id=point3D_id, xyz=xyz, rgb=rgb,
225 | error=error, image_ids=image_ids,
226 | point2D_idxs=point2D_idxs)
227 | return points3D
228 |
229 |
230 | def read_points3d_binary(path_to_model_file):
231 | """
232 | see: src/base/reconstruction.cc
233 | void Reconstruction::ReadPoints3DBinary(const std::string& path)
234 | void Reconstruction::WritePoints3DBinary(const std::string& path)
235 | """
236 | points3D = {}
237 | with open(path_to_model_file, "rb") as fid:
238 | num_points = read_next_bytes(fid, 8, "Q")[0]
239 | for point_line_index in range(num_points):
240 | binary_point_line_properties = read_next_bytes(
241 | fid, num_bytes=43, format_char_sequence="QdddBBBd")
242 | point3D_id = binary_point_line_properties[0]
243 | xyz = np.array(binary_point_line_properties[1:4])
244 | rgb = np.array(binary_point_line_properties[4:7])
245 | error = np.array(binary_point_line_properties[7])
246 | track_length = read_next_bytes(
247 | fid, num_bytes=8, format_char_sequence="Q")[0]
248 | track_elems = read_next_bytes(
249 | fid, num_bytes=8*track_length,
250 | format_char_sequence="ii"*track_length)
251 | image_ids = np.array(tuple(map(int, track_elems[0::2])))
252 | point2D_idxs = np.array(tuple(map(int, track_elems[1::2])))
253 | points3D[point3D_id] = Point3D(
254 | id=point3D_id, xyz=xyz, rgb=rgb,
255 | error=error, image_ids=image_ids,
256 | point2D_idxs=point2D_idxs)
257 | return points3D
258 |
259 |
260 | def read_model(path, ext):
261 | if ext == ".txt":
262 | cameras = read_cameras_text(os.path.join(path, "cameras" + ext))
263 | images = read_images_text(os.path.join(path, "images" + ext))
264 | points3D = read_points3D_text(os.path.join(path, "points3D") + ext)
265 | else:
266 | cameras = read_cameras_binary(os.path.join(path, "cameras" + ext))
267 | images = read_images_binary(os.path.join(path, "images" + ext))
268 | points3D = read_points3d_binary(os.path.join(path, "points3D") + ext)
269 | return cameras, images, points3D
270 |
271 |
272 | def qvec2rotmat(qvec):
273 | return np.array([
274 | [1 - 2 * qvec[2]**2 - 2 * qvec[3]**2,
275 | 2 * qvec[1] * qvec[2] - 2 * qvec[0] * qvec[3],
276 | 2 * qvec[3] * qvec[1] + 2 * qvec[0] * qvec[2]],
277 | [2 * qvec[1] * qvec[2] + 2 * qvec[0] * qvec[3],
278 | 1 - 2 * qvec[1]**2 - 2 * qvec[3]**2,
279 | 2 * qvec[2] * qvec[3] - 2 * qvec[0] * qvec[1]],
280 | [2 * qvec[3] * qvec[1] - 2 * qvec[0] * qvec[2],
281 | 2 * qvec[2] * qvec[3] + 2 * qvec[0] * qvec[1],
282 | 1 - 2 * qvec[1]**2 - 2 * qvec[2]**2]])
283 |
284 |
285 | def rotmat2qvec(R):
286 | Rxx, Ryx, Rzx, Rxy, Ryy, Rzy, Rxz, Ryz, Rzz = R.flat
287 | K = np.array([
288 | [Rxx - Ryy - Rzz, 0, 0, 0],
289 | [Ryx + Rxy, Ryy - Rxx - Rzz, 0, 0],
290 | [Rzx + Rxz, Rzy + Ryz, Rzz - Rxx - Ryy, 0],
291 | [Ryz - Rzy, Rzx - Rxz, Rxy - Ryx, Rxx + Ryy + Rzz]]) / 3.0
292 | eigvals, eigvecs = np.linalg.eigh(K)
293 | qvec = eigvecs[[3, 0, 1, 2], np.argmax(eigvals)]
294 | if qvec[0] < 0:
295 | qvec *= -1
296 | return qvec
--------------------------------------------------------------------------------
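A minimal sketch of reading a binary COLMAP model with the helpers above; the model directory is a placeholder for wherever a sparse reconstruction was written:

```python
from utils.colmap_utils import read_model

# Placeholder directory containing cameras.bin, images.bin and points3D.bin.
model_dir = 'colmap_models/sparse/P28_101/sparse/0'

cameras, images, points3D = read_model(model_dir, ext='.bin')
print(len(cameras), 'cameras,', len(images), 'images,', len(points3D), 'points')

camera = next(iter(cameras.values()))
print(camera.model, camera.width, camera.height, camera.params)
```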
/utils/hovering/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/utils/hovering/__init__.py
--------------------------------------------------------------------------------
/utils/hovering/helper.py:
--------------------------------------------------------------------------------
1 | from typing import List
2 | import os
3 | import numpy as np
4 | from PIL import Image
5 | import open3d as o3d
6 | import matplotlib.pyplot as plt
7 | from open3d.visualization import rendering
8 |
9 |
10 | from utils.hovering.o3d_line_mesh import LineMesh
11 |
12 |
13 | class Helper:
14 | base_colors = {
15 | 'white': [1, 1, 1, 0.8],
16 | 'red': [1, 0, 0, 1],
17 | 'blue': [0, 0, 1,1],
18 | 'green': [0, 1, 0,1],
19 | 'yellow': [1, 1, 0,1],
20 | 'purple': [0.2, 0.2, 0.8, 1]
21 | }
22 |
23 | def __init__(self, point_size):
24 | self.point_size = point_size
25 |
26 | def material(self, color: str, shader="defaultUnlit") -> rendering.MaterialRecord:
27 | """
28 | Args:
29 | shader: e.g.'defaultUnlit', 'defaultLit', 'depth', 'normal'
30 | see Open3D: cpp/open3d/visualization/rendering/filament/FilamentScene.cpp#L1109
31 | """
32 | material = rendering.MaterialRecord()
33 | material.shader = shader
34 | material.base_color = self.base_colors[color]
35 | material.point_size = self.point_size
36 | return material
37 |
38 | def get_cam_pos(c2w: np.ndarray) -> np.ndarray:
39 | """ Get camera position in world coordinate system
40 | """
41 | cen = np.float32([0, 0, 0, 1])
42 | pos = c2w @ cen
43 | return pos[:3]
44 |
45 |
46 | # def get_frustum(c2w: np.ndarray,
47 | # sz=0.2,
48 | # camera_height=None,
49 | # camera_width=None,
50 | # frustum_color=[1, 0, 0]) -> o3d.geometry.LineSet:
51 | # """
52 | # Args:
53 | # c2w: np.ndarray, 4x4 camera-to-world matrix
54 | # sz: float, size (width) of the frustum
55 | # Returns:
56 | # frustum: o3d.geometry.TriangleMesh
57 | # """
58 | # cen = [0, 0, 0]
59 | # wid = sz
60 | # if camera_height is not None and camera_width is not None:
61 | # hei = wid * camera_height / camera_width
62 | # else:
63 | # hei = wid
64 | # tl = [wid, hei, sz]
65 | # tr = [-wid, hei, sz]
66 | # br = [-wid, -hei, sz]
67 | # bl = [wid, -hei, sz]
68 | # points = np.float32([cen, tl, tr, br, bl])
69 | # lines = [
70 | # [0, 1], [0, 2], [0, 3], [0, 4],
71 | # [1, 2], [2, 3], [3, 4], [4, 1],]
72 | # frustum = o3d.geometry.LineSet()
73 | # frustum.points = o3d.utility.Vector3dVector(points)
74 | # frustum.lines = o3d.utility.Vector2iVector(lines)
75 | # frustum.colors = o3d.utility.Vector3dVector([np.asarray([1, 0, 0])])
76 | # frustum.paint_uniform_color(frustum_color)
77 |
78 | # frustum = frustum.transform(c2w)
79 | # return frustum
80 |
81 |
82 | def get_trajectory(pos_history,
83 | num_line=6,
84 | line_radius=0.15
85 | ) -> o3d.geometry.TriangleMesh:
86 | """ pos_history: absolute position history
87 | """
88 | pos_history = np.asarray(pos_history)[-num_line:]
89 | colors = [0, 0, 0.6]
90 | line_mesh = LineMesh(
91 | points=pos_history,
92 | colors=colors, radius=line_radius)
93 | line_mesh.merge_cylinder_segments()
94 | path = line_mesh.cylinder_segments[0]
95 | return path
96 |
97 |
98 | def get_pretty_trajectory(pos_history,
99 | num_line=6,
100 | line_radius=0.15,
101 | darkness=1.0,
102 | ) -> List[o3d.geometry.TriangleMesh]:
103 | """ pos_history: absolute position history
104 | """
105 | def generate_jet_colors(n, darkness=0.6):
106 | cmap = plt.get_cmap('jet')
107 | norm = plt.Normalize(vmin=0, vmax=n-1)
108 | colors = cmap(norm(np.arange(n)))
109 | # Convert RGBA to RGB
110 | colors_rgb = []
111 | for color in colors:
112 | colors_rgb.append(color[:3] * darkness)
113 |
114 | return colors_rgb
115 |
116 | pos_history = np.asarray(pos_history)[-num_line:]
117 | colors = generate_jet_colors(len(pos_history), darkness)
118 | line_mesh = LineMesh(
119 | points=pos_history,
120 | colors=colors, radius=line_radius)
121 | return line_mesh.cylinder_segments
122 |
123 |
124 | """ Obtain Viewpoint from Open3D GUI """
125 | def parse_o3d_gui_view_status(status: dict, render: rendering.OffscreenRenderer):
126 | """ Parse open3d GUI's view status and convert to OffscreenRenderer format.
127 | This normalises the front vector and computes the eye position from lookat, front and zoom.
128 |
129 |
130 | Args:
131 | status: Ctrl-C output from Open3D GUI
132 | render: OffscreenRenderer
133 | Output:
134 | params for render.setup_camera(fov, lookat, eye, up)
135 | """
136 | cam_info = status['trajectory'][0]
137 | fov = cam_info['field_of_view']
138 | lookat = np.asarray(cam_info['lookat'])
139 | front = np.asarray(cam_info['front'])
140 | front = front / np.linalg.norm(front)
141 | up = np.asarray(cam_info['up'])
142 | zoom = cam_info['zoom']
143 | """
144 | See Open3D/cpp/open3d/visualization/visualizer/ViewControl.cpp#L243:
145 | void ViewControl::SetProjectionParameters()
146 | """
147 | right = np.cross(up, front) / np.linalg.norm(np.cross(up, front))
148 | view_ratio = zoom * render.scene.bounding_box.get_max_extent()
149 | distance = view_ratio / np.tan(fov * 0.5 / 180.0 * np.pi)
150 | eye = lookat + front * distance
151 | return fov, lookat, eye, up
152 |
153 |
154 | def set_offscreen_as_gui(render: rendering.OffscreenRenderer, status: dict):
155 | """ Set offscreen renderer as GUI's view status
156 | """
157 | fov, lookat, eye, up = parse_o3d_gui_view_status(status, render)
158 | render.setup_camera(fov, lookat, eye, up)
--------------------------------------------------------------------------------
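A minimal sketch (with placeholder file paths) of how `set_offscreen_as_gui` is intended to be used: paste the Ctrl-C output of the Open3D GUI into a JSON file, then replay that viewpoint in an `OffscreenRenderer`:

```python
import json
import open3d as o3d
from open3d.visualization import rendering
from utils.hovering.helper import set_offscreen_as_gui

render = rendering.OffscreenRenderer(1920, 1080)
# Add some geometry first: the scene bounding box is used for the zoom computation.
pcd = o3d.io.read_point_cloud('fused.ply')          # placeholder point cloud
render.scene.add_geometry('pcd', pcd, rendering.MaterialRecord())

with open('viewstatus.json') as f:                   # Ctrl-C output from the Open3D GUI
    status = json.load(f)
set_offscreen_as_gui(render, status)

img = render.render_to_image()
o3d.io.write_image('view.png', img)
```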
/utils/hovering/hover_open3d.py:
--------------------------------------------------------------------------------
1 | from argparse import ArgumentParser
2 | import os
3 | import glob
4 | import numpy as np
5 | from PIL import Image
6 | from tqdm import tqdm
7 | import json
8 | import cv2
9 | import open3d as o3d
10 | from open3d.visualization import rendering
11 |
12 | from utils.base_type import ColmapModel
13 | from utils.hovering.helper import (
14 | Helper,
15 | get_cam_pos,
16 | get_trajectory, get_pretty_trajectory, set_offscreen_as_gui
17 | )
18 | from tools.visualise_data_open3d import get_c2w, get_frustum
19 |
20 | from moviepy import editor
21 | from PIL import ImageDraw, ImageFont
22 |
23 |
24 | TRAJECTORY_LINE_RADIUS = 0.01
25 |
26 |
27 | def parse_args():
28 | parser = ArgumentParser()
29 | parser.add_argument('--model', help="path to directory containing images.bin", required=True)
30 | parser.add_argument('--pcd-path', help="path to fused.ply", default=None)
31 | parser.add_argument('--view-path', type=str, required=True,
32 | help='path to the view file, copy-paste from open3d gui.')
33 | parser.add_argument('--out_dir', type=str, default='outputs/hovering/')
34 | args = parser.parse_args()
35 | return args
36 |
37 |
38 | class HoverRunner:
39 |
40 | fov = None
41 | lookat = None
42 | front = None
43 | up = None
44 |
45 | background_color = [1, 1, 1, 1.0]
46 |
47 | def __init__(self, out_size: str = 'big'):
48 | if out_size == 'big':
49 | out_size = (1920, 1080)
50 | else:
51 | out_size = (640, 480)
52 | self.render = rendering.OffscreenRenderer(*out_size)
53 |
54 | def setup(self,
55 | model: ColmapModel,
56 | pcd_path: str,
57 | viewstatus_path: str,
58 | out_dir: str,
59 | img_x0: int = 0,
60 | img_y0: int = 0,
61 | frustum_size: float = 0.2,
62 | frustum_line_width: float = 5):
63 | """
64 | Args:
65 | model:
66 | viewstatus_path:
67 | path to viewstatus.json, CTRL-c output from Open3D gui
68 | out_dir:
69 | e.g. 'P34_104_out'
70 | """
71 | self.model = model
72 | if pcd_path is not None:
73 | pcd = o3d.io.read_point_cloud(pcd_path)
74 | else:
75 | pcd_np = np.asarray([v.xyz for v in model.points.values()])
76 | pcd_rgb = np.asarray([v.rgb / 255 for v in model.points.values()])
77 | pcd = o3d.geometry.PointCloud()
78 | pcd.points = o3d.utility.Vector3dVector(pcd_np)
79 | pcd.colors = o3d.utility.Vector3dVector(pcd_rgb)
80 | self.transformed_pcd = pcd
81 |
82 | self.viewstatus_path = viewstatus_path
83 | self.out_dir = out_dir
84 |
85 | # Render Layout params
86 | # img_x0/img_y0: int. The top-left corner of the display image
87 | self.img_x0 = img_x0
88 | self.img_y0 = img_y0
89 | self.rgb_monitor_height = 456
90 | self.rgb_monitor_width = 456
91 | self.frustum_size = frustum_size
92 | self.frustum_line_width = frustum_line_width
93 | self.text_loc = (450, 1000)
94 |
95 | def test_single_frame(self,
96 | psize,
97 | img_index:int =None,
98 | clear_geometry: bool =True,
99 | lay_rgb_img: bool =True,
100 | sun_light: bool =False,
101 | show_first_frustum: bool =True,
102 | ):
103 | """
104 | Args:
105 | psize: point size,
106 | probing a good point size is a bit tricky but very important!
107 | img_index: int. I.e. Frame number
108 | """
109 | pcd = self.transformed_pcd
110 |
111 | if clear_geometry:
112 | self.render.scene.clear_geometry()
113 |
114 | # Get materials
115 | helper = Helper(point_size=psize)
116 | white = helper.material('white')
117 | red = helper.material('red', shader='unlitLine')
118 | red.line_width = self.frustum_line_width
119 | self.helper = helper
120 |
121 | # put on pcd
122 | self.render.scene.add_geometry('pcd', pcd, white)
123 | with open(self.viewstatus_path) as f:
124 | viewstatus = json.load(f)
125 | set_offscreen_as_gui(self.render, viewstatus)
126 |
127 | # now put frustum on canvas
128 | if img_index is None:
129 | img_index = 0
130 | c_image = self.model.ordered_images[img_index]
131 | c2w = get_c2w(list(c_image.qvec) + list(c_image.tvec))
132 | frustum = get_frustum(
133 | c2w=c2w, sz=self.frustum_size,
134 | camera_height=self.rgb_monitor_height,
135 | camera_width=self.rgb_monitor_width)
136 | if show_first_frustum:
137 | self.render.scene.add_geometry('first_frustum', frustum, red)
138 | self.render.scene.set_background(self.background_color)
139 |
140 | if sun_light:
141 | self.render.scene.scene.set_sun_light(
142 | [0.707, 0.0, -.707], [1.0, 1.0, 1.0], 75000)
143 | self.render.scene.scene.enable_sun_light(True)
144 | else:
145 | self.render.scene.set_lighting(
146 | rendering.Open3DScene.NO_SHADOWS, (0, 0, 0))
147 | self.render.scene.show_axes(False)
148 |
149 | img_buf = self.render.render_to_image()
150 | img = np.asarray(img_buf)
151 | test_img = self.model.read_rgb_from_name(c_image.name)
152 | test_img = cv2.resize(
153 | test_img, (self.rgb_monitor_width, self.rgb_monitor_height))
154 | if lay_rgb_img:
155 | img[-self.rgb_monitor_height:,
156 | -self.rgb_monitor_width:] = test_img
157 |
158 | img_pil = Image.fromarray(img)
159 | I1 = ImageDraw.Draw(img_pil)
160 | myFont = ImageFont.truetype('FreeMono.ttf', 65)
161 | bbox = (
162 | img.shape[1] - self.rgb_monitor_width,
163 | img.shape[0] - self.rgb_monitor_height,
164 | img.shape[1],
165 | img.shape[0])
166 | # print(bbox)
167 | text = "Frame %d" % img_index
168 | I1.text(self.text_loc, text, font=myFont, fill =(0, 0, 0))
169 | I1.rectangle(bbox, outline='red', width=5)
170 | img = np.asarray(img_pil)
171 | return img
172 |
173 | def run_all(self, step, traj_len=10):
174 | """
175 | Args:
176 | step: int. Render every `step` frames
177 | traj_len: int. Number of trajectory lines to show
178 | """
179 | render = self.render
180 | os.makedirs(self.out_dir, exist_ok=True)
181 | out_fmt = os.path.join(self.out_dir, '%010d.jpg')
182 | red_m = self.helper.material('red', shader='unlitLine')
183 | red_m.line_width = self.frustum_line_width
184 | white_m = self.helper.material('white')
185 |
186 | render.scene.remove_geometry('first_frustum')
187 |
188 | myFont = ImageFont.truetype('FreeMono.ttf', 65)
189 | bbox = (1464, 624, 1920, 1080)
190 |
191 | pos_history = []
192 | num_images = self.model.num_images
193 | for frame_idx in tqdm(range(0, num_images, step), total=num_images//step):
194 | c_image = self.model.ordered_images[frame_idx]
195 | frame_rgb = self.model.read_rgb_from_name(c_image.name)
196 | frame_rgb = cv2.resize(
197 | frame_rgb, (self.rgb_monitor_width, self.rgb_monitor_height))
198 | c2w = get_c2w(list(c_image.qvec) + list(c_image.tvec))
199 | frustum = get_frustum(
200 | c2w=c2w, sz=self.frustum_size,
201 | camera_height=self.rgb_monitor_height,
202 | camera_width=self.rgb_monitor_width)
203 | pos_history.append(get_cam_pos(c2w))
204 |
205 | if len(pos_history) > 2:
206 | # lines = get_pretty_trajectory(
207 | traj = get_trajectory(
208 | pos_history, num_line=traj_len,
209 | line_radius=TRAJECTORY_LINE_RADIUS)
210 | if render.scene.has_geometry('traj'):
211 | render.scene.remove_geometry('traj')
212 | render.scene.add_geometry('traj', traj, white_m)
213 | render.scene.add_geometry('frustum', frustum, red_m)
214 |
215 | img = render.render_to_image()
216 | img = np.asarray(img)
217 | img[-self.rgb_monitor_height:,
218 | -self.rgb_monitor_width:] = frame_rgb
219 | img_pil = Image.fromarray(img)
220 |
221 | I1 = ImageDraw.Draw(img_pil)
222 | text = "Frame %d" % frame_idx
223 |             I1.text(self.text_loc, text, font=myFont, fill=(0, 0, 0))
224 | I1.rectangle(bbox, outline='red', width=5)
225 | img_pil.save(out_fmt % frame_idx)
226 |
227 | render.scene.remove_geometry('frustum')
228 |
229 | # Gen output
230 | video_fps = 20
231 | print("Generating video...")
232 | seq = sorted(glob.glob(os.path.join(self.out_dir, '*.jpg')))
233 | clip = editor.ImageSequenceClip(seq, fps=video_fps)
234 | clip.write_videofile(os.path.join(self.out_dir, 'out.mp4'))
235 |
236 |
237 | if __name__ == '__main__':
238 | args = parse_args()
239 | model = ColmapModel(args.model)
240 | model.read_rgb_from_name = \
241 | lambda name: np.asarray(Image.open(f"outputs/demo/frames/{name}"))
242 | runner = HoverRunner()
243 | runner.setup(
244 | model,
245 | pcd_path=args.pcd_path,
246 | viewstatus_path=args.view_path,
247 | out_dir=args.out_dir,
248 | frustum_size=1,
249 | frustum_line_width=1)
250 | runner.test_single_frame(0.1)
251 | runner.run_all(step=3, traj_len=10)
252 |
--------------------------------------------------------------------------------
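The hovering renderer above turns COLMAP's world-to-camera poses into camera-to-world matrices via a `get_c2w` helper before building frustums. The snippet below is a minimal, hypothetical sketch of that conversion, not the repository's implementation; it only assumes COLMAP's `[qw, qx, qy, qz, tx, ty, tz]` world-to-camera convention.

```
import numpy as np

def qvec_to_rotmat(qvec):
    """Convert a COLMAP quaternion (qw, qx, qy, qz) into a 3x3 rotation matrix."""
    w, x, y, z = qvec
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)]])

def c2w_from_colmap(pose):
    """pose: [qw, qx, qy, qz, tx, ty, tz], world-to-camera. Returns a 4x4 camera-to-world matrix."""
    R_w2c = qvec_to_rotmat(pose[:4])
    t_w2c = np.asarray(pose[4:], dtype=float)
    c2w = np.eye(4)
    c2w[:3, :3] = R_w2c.T               # the inverse of a rotation is its transpose
    c2w[:3, 3] = -R_w2c.T @ t_w2c       # camera centre in world coordinates
    return c2w
```

The translation column of such a `c2w` matrix is the camera centre, which is presumably what the trajectory history in `run_all` is built from.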
/utils/hovering/o3d_line_mesh.py:
--------------------------------------------------------------------------------
1 | """Module which creates mesh lines from a line set
2 | Open3D relies on glLineWidth to set the line width of a LineSet.
3 | However, this method is now deprecated and not fully supported in newer OpenGL versions.
4 | See:
5 | Open3D Github Pull Request - https://github.com/intel-isl/Open3D/pull/738
6 | Other Framework Issues - https://github.com/openframeworks/openFrameworks/issues/3460
7 |
8 | This module aims to solve this by converting a line into a triangular mesh (which has thickness)
9 | The basic idea is to create a cylinder for each line segment, translate it, and then rotate it.
10 |
11 | License: MIT
12 |
13 | """
14 | import numpy as np
15 | import open3d as o3d
16 |
17 |
18 | def align_vector_to_another(a=np.array([0, 0, 1]), b=np.array([1, 0, 0])):
19 | """
20 |     Aligns vector a to vector b with an axis-angle rotation (a and b must not be anti-parallel)
21 | """
22 | if np.array_equal(a, b):
23 | return None, None
24 | axis_ = np.cross(a, b)
25 | axis_ = axis_ / np.linalg.norm(axis_)
26 |     angle = np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))
27 |
28 | return axis_, angle
29 |
30 |
31 | def normalized(a, axis=-1, order=2):
32 | """Normalizes a numpy array of points"""
33 | l2 = np.atleast_1d(np.linalg.norm(a, order, axis))
34 | l2[l2 == 0] = 1
35 | return a / np.expand_dims(l2, axis), l2
36 |
37 |
38 | class LineMesh(object):
39 | def __init__(self, points, lines=None, colors=[0, 1, 0], radius=0.15):
40 | """Creates a line represented as sequence of cylinder triangular meshes
41 |
42 | Arguments:
43 |             points {ndarray} -- Numpy array of points Nx3.
44 |
45 | Keyword Arguments:
46 | lines {list[list] or None} -- List of point index pairs denoting line segments. If None, implicit lines from ordered pairwise points. (default: {None})
47 | colors {list} -- list of colors, or single color of the line (default: {[0, 1, 0]})
48 | radius {float} -- radius of cylinder (default: {0.15})
49 | """
50 | self.points = np.array(points)
51 | self.lines = np.array(
52 | lines) if lines is not None else self.lines_from_ordered_points(self.points)
53 | self.colors = np.array(colors)
54 | self.radius = radius
55 | self.cylinder_segments = []
56 |
57 | self.create_line_mesh()
58 |
59 | @staticmethod
60 | def lines_from_ordered_points(points):
61 | lines = [[i, i + 1] for i in range(0, points.shape[0] - 1, 1)]
62 | return np.array(lines)
63 |
64 | def create_line_mesh(self):
65 | first_points = self.points[self.lines[:, 0], :]
66 | second_points = self.points[self.lines[:, 1], :]
67 | line_segments = second_points - first_points
68 | line_segments_unit, line_lengths = normalized(line_segments)
69 |
70 | z_axis = np.array([0, 0, 1])
71 | # Create triangular mesh cylinder segments of line
72 | for i in range(line_segments_unit.shape[0]):
73 | line_segment = line_segments_unit[i, :]
74 | line_length = line_lengths[i]
75 |             # get axis-angle rotation to align cylinder with line segment
76 | axis, angle = align_vector_to_another(z_axis, line_segment)
77 | # Get translation vector
78 | translation = first_points[i, :] + line_segment * line_length * 0.5
79 | # create cylinder and apply transformations
80 | cylinder_segment = o3d.geometry.TriangleMesh.create_cylinder(
81 | self.radius, line_length)
82 | cylinder_segment = cylinder_segment.translate(
83 | translation, relative=False)
84 | if axis is not None:
85 | axis_a = axis * angle
86 | rot = o3d.geometry.get_rotation_matrix_from_axis_angle(axis_a)
87 | cylinder_segment = cylinder_segment.rotate(
88 | R=rot, center=cylinder_segment.get_center())
89 | # cylinder_segment = cylinder_segment.rotate(
90 | # axis_a, center=True, type=o3d.geometry.RotationType.AxisAngle)
91 | # color cylinder
92 | color = self.colors if self.colors.ndim == 1 else self.colors[i, :]
93 | cylinder_segment.paint_uniform_color(color)
94 |
95 | self.cylinder_segments.append(cylinder_segment)
96 |
97 | def merge_cylinder_segments(self):
98 |         """Merges all cylinder segments into a single triangle mesh."""
99 | vertices_list = [np.asarray(mesh.vertices) for mesh in self.cylinder_segments]
100 | triangles_list = [np.asarray(mesh.triangles) for mesh in self.cylinder_segments]
101 | triangles_offset = np.cumsum([v.shape[0] for v in vertices_list])
102 | triangles_offset = np.insert(triangles_offset, 0, 0)[:-1]
103 |
104 | vertices = np.vstack(vertices_list)
105 | triangles = np.vstack([triangle + offset for triangle, offset in zip(triangles_list, triangles_offset)])
106 |
107 |         merged_mesh = o3d.geometry.TriangleMesh(o3d.utility.Vector3dVector(vertices),
108 |                                                 o3d.utility.Vector3iVector(triangles))
109 | color = self.colors if self.colors.ndim == 1 else self.colors[0]
110 | merged_mesh.paint_uniform_color(color)
111 | self.cylinder_segments = [merged_mesh]
112 |
113 | def add_line(self, vis):
114 | """Adds this line to the visualizer"""
115 | for cylinder in self.cylinder_segments:
116 | vis.add_geometry(cylinder)
117 |
118 | def remove_line(self, vis):
119 | """Removes this line from the visualizer"""
120 | for cylinder in self.cylinder_segments:
121 | vis.remove_geometry(cylinder)
122 |
123 |
124 | def main():
125 | print("Demonstrating LineMesh vs LineSet")
126 | # Create Line Set
127 | points = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1],
128 | [0, 1, 1], [1, 1, 1]]
129 | lines = [[0, 1], [0, 2], [1, 3], [2, 3], [4, 5], [4, 6], [5, 7], [6, 7],
130 | [0, 4], [1, 5], [2, 6], [3, 7]]
131 | colors = [[1, 0, 0] for i in range(len(lines))]
132 |
133 | line_set = o3d.geometry.LineSet()
134 | line_set.points = o3d.utility.Vector3dVector(points)
135 | line_set.lines = o3d.utility.Vector2iVector(lines)
136 | line_set.colors = o3d.utility.Vector3dVector(colors)
137 |
138 | # Create Line Mesh 1
139 | points = np.array(points) + [0, 0, 2]
140 | line_mesh1 = LineMesh(points, lines, colors, radius=0.02)
141 | line_mesh1_geoms = line_mesh1.cylinder_segments
142 |
143 |     # Create Line Mesh 2
144 | points = np.array(points) + [0, 2, 0]
145 | line_mesh2 = LineMesh(points, radius=0.03)
146 | line_mesh2_geoms = line_mesh2.cylinder_segments
147 |
148 | o3d.visualization.draw_geometries(
149 | [line_set, *line_mesh1_geoms, *line_mesh2_geoms])
150 |
151 |
152 | if __name__ == "__main__":
153 | main()
154 |
155 |
--------------------------------------------------------------------------------
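`LineMesh` above replaces thin `LineSet` lines with cylinder meshes so that line thickness survives on newer OpenGL backends. Below is a small usage sketch (hypothetical, assuming the repository root is on `PYTHONPATH`): build a poly-line, merge the cylinders into a single mesh, and hand it to the visualizer in one call.

```
import numpy as np
import open3d as o3d
from utils.hovering.o3d_line_mesh import LineMesh

# A zig-zag poly-line; with lines=None, consecutive points are connected implicitly.
points = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]], dtype=float)
line_mesh = LineMesh(points, colors=[1, 0, 0], radius=0.02)
line_mesh.merge_cylinder_segments()            # collapse all cylinders into one TriangleMesh
thick_line = line_mesh.cylinder_segments[0]
o3d.visualization.draw_geometries([thick_line])
```

Merging is mainly a convenience: a single TriangleMesh can be added to a scene with one `add_geometry` call instead of one per segment.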
/utils/lib.py:
--------------------------------------------------------------------------------
1 | import pycolmap
2 | import shutil
3 | import os
4 | import glob
5 | import subprocess
6 |
7 | def get_num_images(model_path):
8 | reconstruction = pycolmap.Reconstruction(model_path)
9 | num_images = reconstruction.num_images()
10 | return num_images
11 |
12 | def read_lines_from_file(filename):
13 | """
14 | Read lines from a txt file and return them as a list.
15 |
16 | :param filename: Name of the file to read from.
17 | :return: List of lines from the file.
18 | """
19 | with open(filename, 'r') as file:
20 | lines = file.readlines()
21 |
22 |     # Strip surrounding whitespace, including trailing newlines
23 | return [line.strip() for line in lines]
24 |
25 | def keep_model_with_largest_images(reconstruction_path):
26 |     all_models = sorted(glob.glob(os.path.join(reconstruction_path, '*')))
27 |     try:
28 |         max_images = get_num_images(all_models[0])
29 |     except Exception:
30 |         return 0
31 |     selected_model = all_models[0]
32 |     if len(all_models) > 1:
33 |         for model in all_models:
34 |             num_images = get_num_images(model)
35 |             if num_images > max_images:
36 |                 max_images = num_images
37 |                 selected_model = model
38 | 
39 |     for model in all_models:
40 |         if model != selected_model:
41 |             shutil.rmtree(model)
42 |     os.rename(selected_model, os.path.join(reconstruction_path, '0'))
43 |     return max_images
44 |
45 | # Worker function executed in each process: runs a python script with the given arguments
46 | def run_script(script_path, arg):
47 | cmd = ['python3', script_path] + arg
48 | print(cmd)
49 | subprocess.call(cmd)
--------------------------------------------------------------------------------
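
The comment above `run_script` suggests it is meant to be invoked once per worker process. Below is a hedged sketch of that pattern using `multiprocessing.Pool`; the `--video` flag is illustrative only and not taken from the repository's argument parsers, and `input_videos.txt` is assumed to list one video id per line.

```
from multiprocessing import Pool

from utils.lib import read_lines_from_file, run_script

if __name__ == '__main__':
    videos = read_lines_from_file('input_videos.txt')
    # '--video' is a hypothetical flag used here purely for illustration.
    jobs = [('reconstruct_sparse.py', ['--video', vid]) for vid in videos]
    with Pool(processes=4) as pool:
        pool.starmap(run_script, jobs)
```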