├── .gitignore ├── README.md ├── assets └── epic_fields.png ├── demo ├── demo.py ├── demo_ego4d.md ├── dense_point_cloud.sh ├── reconstruct_sparse.sh └── register_dense.sh ├── example_data ├── P04_01_line.json ├── P04_01_line.png ├── P06_09_line.json ├── P06_09_line.png ├── P12_101_line.json ├── P12_101_line.png ├── P28_101.json ├── P28_101 │ ├── frame_0000000080.jpg │ ├── frame_0000000085.jpg │ ├── frame_0000000090.jpg │ ├── frame_0000000095.jpg │ ├── frame_0000000100.jpg │ ├── frame_0000000105.jpg │ ├── frame_0000000110.jpg │ └── frame_0000000115.jpg ├── P28_101_line.json ├── P28_101_line.png ├── example_output_gui.jpg └── example_output_line.jpg ├── homography_filter ├── __init__.py ├── argparser.py ├── filter.py └── lib.py ├── input_videos.txt ├── licence.txt ├── reconstruct_sparse.py ├── register_dense.py ├── scripts ├── reconstruct_sparse.sh └── register_dense.sh ├── select_sparse_frames.py ├── tools ├── __init__.py ├── common_functions.py ├── project_3d_line.py ├── visualise_data_open3d.py └── visualize_colmap_open3d.py └── utils ├── __init__.py ├── base_type.py ├── colmap_utils.py ├── hovering ├── __init__.py ├── helper.py ├── hover_open3d.py └── o3d_line_mesh.py └── lib.py /.gitignore: -------------------------------------------------------------------------------- 1 | **/__pycache__/ 2 | 3 | outputs/ -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # EPIC Fields: Marrying 3D Geometry and Video Understanding 2 | ![EPIC Fields Overview](assets/epic_fields.png?raw=true) 3 | 4 | This repository provides tools and scripts for visualizing and reconstructing the [EPIC FIELDS](https://epic-kitchens.github.io/epic-fields) dataset. 5 | 6 | ## Table of Contents 7 | 8 | 1. [Visualization Code](#visualization-code) 9 | - [Introduction](#introduction) 10 | - [Format](#format) 11 | - [Visualization](#visualisation) 12 | 2. [Reconstruction Pipeline](#reconstruction-pipeline) 13 | - [Steps for EPIC-KITCHENS Reconstruction](#steps-for-epic-kitchens-reconstruction) 14 | - [Understanding the Output File Structure](#understanding-the-output-file-structure) 15 | 3. [Reconstruction Pipeline: Quick Demo](#reconstruction-pipeline-quick-demo) 16 | 4. [Additional info](#additional-info) 17 | - [Credit](#credit) 18 | - [Citation](#citation) 19 | - [License](#license) 20 | - [Contact](#contact) 21 | 22 | 23 | 24 | # Visualization Code 25 | ## Introduction 26 | 27 | This visualisation code is associated with the released EPIC FIELDS dataset. Further details on the dataset and associated preprint are available at: 28 | [https://epic-kitchens.github.io/epic-fields](https://epic-kitchens.github.io/epic-fields) 29 | 30 | 31 | ## Format 32 | 33 | - The `camera` parameters use the COLMAP format, which is the same as the OpenCV format. 34 | - The `images` stores the world-to-camera transformation, represented by quaternion and translation. 35 | - Note: for NeRF usage this needs to be converted to camera-to-world transformation and possibly changing (+x, +y, +z) to (+x, -y, -z) 36 | - The `points` is part of COLMAP output. It's kept here for visualisation purpose and potentially for computing the `near`/`far` bounds in NeRF input. 37 | ``` 38 | { 39 | "camera": { 40 | "id": 1, "model": "OPENCV", "width": 456, "height": 256, 41 | "params": [fx, fy, cx, cy, k1, k2, p1, p2] 42 | }, 43 | "images": { 44 | frame_name: [qw, qx, qy, qz, tx, ty, tz], 45 | ... 
46 | }, 47 | "points": [ 48 | [x, y, z, r, g, b], 49 | ... 50 | ] 51 | } 52 | 53 | example data can be found in `example_data/P28_101.json` 54 | ``` 55 | 56 | ## Visualisation 57 | 58 | ### Visualise camera poses and point cloud 59 | 60 | This script requires Open3D and has been tested with Open3D==0.16.1. 61 | ```bash 62 | python tools/visualise_data_open3d.py --json-data example_data/P28_101.json 63 | ``` 64 | PS: Press 'h' to see the Open3D help message. 65 | 66 |
67 | Click to see the example output 68 | ![gui](example_data/example_output_gui.jpg) 69 |
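If you want to consume the JSON directly (e.g. to prepare NeRF-style input as mentioned in the Format notes above), the following is a minimal sketch (not part of the released tools, assuming only `numpy`) of reading the fields and converting one world-to-camera pose into a camera-to-world matrix:

```python
# Sketch only: parse the EPIC Fields JSON described in the Format section and
# convert a world-to-camera pose [qw, qx, qy, qz, tx, ty, tz] to camera-to-world.
import json
import numpy as np

def qvec_to_rotmat(qw, qx, qy, qz):
    # Rotation matrix from a unit quaternion in COLMAP's [qw, qx, qy, qz] order.
    return np.array([
        [1 - 2*qy**2 - 2*qz**2, 2*qx*qy - 2*qz*qw,     2*qx*qz + 2*qy*qw],
        [2*qx*qy + 2*qz*qw,     1 - 2*qx**2 - 2*qz**2, 2*qy*qz - 2*qx*qw],
        [2*qx*qz - 2*qy*qw,     2*qy*qz + 2*qx*qw,     1 - 2*qx**2 - 2*qy**2]])

with open('example_data/P28_101.json') as fp:
    model = json.load(fp)

fx, fy, cx, cy = model['camera']['params'][:4]     # OPENCV intrinsics (distortion follows)
frame, pose = next(iter(model['images'].items()))  # one (frame_name, [qw .. tz]) entry
w2c = np.eye(4)
w2c[:3, :3] = qvec_to_rotmat(*pose[:4])
w2c[:3, 3] = pose[4:]
c2w = np.linalg.inv(w2c)                           # camera-to-world, e.g. for NeRF input
print(frame, c2w[:3, 3])                           # camera centre in world coordinates
```

The axis-convention flip mentioned above ((+x, +y, +z) to (+x, -y, -z)) is a separate, framework-specific step and is not applied here.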
70 | 71 | ### Example: Project a 3D line onto EPIC-KITCHENS images using camera poses 72 | 73 | ```bash 74 | python tools/project_3d_line.py \ 75 | --json-data example_data/P28_101.json \ 76 | --line-data example_data/P28_101_line.json \ 77 | --frames-root example_data/P28_101/ 78 | ``` 79 |
80 | Click to see the example output 81 | ![line](example_data/example_output_line.jpg) 82 |
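Conceptually, the script maps each 3D endpoint through the stored world-to-camera pose and the pinhole intrinsics. The sketch below illustrates that mapping; it is a simplification (lens distortion is ignored, the frame key is assumed to match the example frames, and it reuses the `qvec_to_rotmat` helper from the previous sketch). See `tools/project_3d_line.py` for the actual implementation:

```python
# Sketch only: project the two 3D line endpoints into one frame of P28_101.
import json
import numpy as np

with open('example_data/P28_101.json') as fp:
    model = json.load(fp)
with open('example_data/P28_101_line.json') as fp:
    endpoints = np.array(json.load(fp), dtype=float).reshape(2, 3)  # [x1 y1 z1], [x2 y2 z2]

fx, fy, cx, cy = model['camera']['params'][:4]
frame = 'frame_0000000080.jpg'              # assumed key; adjust to a frame present in the JSON
qw, qx, qy, qz, tx, ty, tz = model['images'][frame]
R = qvec_to_rotmat(qw, qx, qy, qz)          # helper defined in the previous sketch
t = np.array([tx, ty, tz])

for X in endpoints:
    x_cam = R @ X + t                       # world -> camera coordinates
    u = fx * x_cam[0] / x_cam[2] + cx       # pinhole projection (distortion ignored)
    v = fy * x_cam[1] / x_cam[2] + cy
    print(frame, round(u, 1), round(v, 1))  # pixel coordinates in the 456x256 image
```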
83 | 84 | To draw a 3D line, one option is to download the COLMAP format data and use COLMAP GUI to click on points. 85 | 86 | 87 | --- 88 | 89 | 90 | 91 | # Reconstruction Pipeline 92 | 93 | This section contains the pipeline for the dataset introduced in our paper, "EPIC Fields: Marrying 3D Geometry and Video Understanding." We aim to bridge the domains of 3D geometry and video understanding, leading to innovative advancements in both areas. 94 | 95 | ## Steps for EPIC-KITCHENS Reconstruction 96 | 97 | This section outlines the procedure to achieve the [EPIC-KITCHENS](https://epic-kitchens.github.io) reconstructions using our methodology. 98 | 99 | ### Step 0: Prerequisites and Initial Configuration 100 | 101 | #### 1. Installing COLMAP (preferably with CUDA support) 102 | 103 | To efficiently process and reconstruct the frames, it's recommended to install COLMAP with CUDA support, which accelerates the reconstruction process using NVIDIA GPUs. 104 | 105 | You can download and install COLMAP from their official website. For detailed installation instructions, especially on how to enable CUDA support, refer to the [COLMAP installation guide](https://colmap.github.io/install.html). 106 | 107 | #### 2. Cloning the Repository 108 | 109 | To proceed with the subsequent steps, you'll need to clone the current repository. Run the following commands: 110 | 111 | ```bash 112 | git clone https://github.com/epic-kitchens/epic-fields-code.git 113 | cd epic-fields-code 114 | ``` 115 | 116 | #### 3. Downloading Vocabulary Trees 117 | COLMAP utilizes vocabulary trees for efficient image matching. Create a directory called vocab_bins and download the required Vocabulary Trees into this directory: 118 | ```bash 119 | mkdir vocab_bins 120 | cd vocab_bins 121 | wget https://demuc.de/colmap/vocab_tree_flickr100K_words32K.bin 122 | cd .. 123 | ``` 124 | #### 4. Installing `pycolmap` package 125 | 126 | The `pycolmap` package will be used to gather statistics from the model later on. Install it using `pip` (assuming that you've created an environment): 127 | 128 | ```bash 129 | pip install pycolmap 130 | ``` 131 | ### Step 1: Downloading Video Frames 132 | 133 | To utilize the EPIC Fields pipeline, the first step is to acquire the necessary video frames. We're particularly interested in the RGB frames from EPIC-KITCHENS. You can download the entire collection from [EPIC-KITCHENS](https://epic-kitchens.github.io). 134 | 135 | For demonstration purposes, we'll guide you through downloading the `P15_12` video RGB frames. 136 | 137 | ##### Demo: Downloading and Extracting `P15_12` Video Frames 138 | 139 | Execute the following shell commands to download and extract the RGB frames: 140 | 141 | ```bash 142 | # Download the tarball 143 | wget https://data.bris.ac.uk/datasets/3h91syskeag572hl6tvuovwv4d/frames_rgb_flow/rgb/train/P15/P15_12.tar 144 | 145 | # Create the desired directory structure 146 | mkdir -p P15/P15_12 147 | 148 | # Extract the frames into the specified directory 149 | tar -xf P15_12.tar -C P15/P15_12 150 | ``` 151 | This will place all the .jpg frames inside the P15/P15_12 directory. 152 | 153 | ##### Directory Structure Confirmation 154 | 155 | After downloading and extracting, your directory structure should look like this (which is [EPIC-KITCHENS](https://epic-kitchens.github.io) format : 156 | ``` 157 | /root-directory/ 158 | │ 159 | └───PXX/ 160 | │ 161 | └───PXX_YY(Y)/ 162 | │ frame_000001.jpg 163 | │ frame_000002.jpg 164 | │ ... 
165 | ``` 166 | For our P15_12 example, this would be: 167 | ``` 168 | /root-directory/ 169 | │ 170 | └───P15/ 171 | │ 172 | └───P15_12/ 173 | │ frame_000001.jpg 174 | │ frame_000002.jpg 175 | │ ... 176 | ``` 177 | 178 | This structure ensures a consistent format for the pipeline to process the frames effectively. 179 | 180 | ### Step 2: Specifying Videos for Reconstruction 181 | 182 | Update the `input_videos.txt` file in the repository to list the video identifiers you wish to process. In our demo example, we put P15_12 in the file. If you have multiple files, please ensure each video identifier is on a separate line. 183 | 184 | 185 | ### Step 3: Running the Homography-Based Frame Sampling 186 | 187 | Execute the `select_sparse_frames.py` script to perform homography-based sampling of the frames. 188 | 189 | ##### Script Parameters: 190 | 191 | - `--input_videos`: Path to the file containing a list of videos to be processed. Default: `input_videos.txt` 192 | - `--epic_kithens_root`: Directory path to the EPIC-KITCHENS images. Default: `.` 193 | - `--sampled_images_path`: Directory where the sampled image files will be stored. Default: `sampled_frames` 194 | - `--homography_overlap`: Threshold for the homography to sample new frames. A higher value will sample more images. Default: `0.9` 195 | - `--max_concurrent`: Maximum number of concurrent processes. Default: `8` 196 | 197 | ##### Example Usage: 198 | 199 | ```bash 200 | python3 select_sparse_frames.py --input_videos input_videos.txt --epic_kithens_root path_to_epic_images --sampled_images_path path_for_sampled_frames 201 | ``` 202 | 203 | ##### Demo: Homography-Based Frame Sampling for `P15_12` Video 204 | 205 | For the demo, using the `P15_12` video you've downloaded into the current directory, run: 206 | 207 | ```bash 208 | python3 select_sparse_frames.py --input_videos input_videos.txt --epic_kithens_root . --sampled_images_path sampled_frames --homography_overlap 0.9 --max_concurrent 8 209 | ``` 210 | 211 | 212 | ### Step 4: Running the COLMAP Sparse Reconstruction 213 | 214 | Execute the `reconstruct_sparse.py` script to perform sparse reconstruction using COLMAP. 215 | 216 | ##### Script Parameters: 217 | 218 | - `--input_videos`: Path to the file containing a list of videos to be processed. Default: `input_videos.txt` 219 | - `--sparse_reconstuctions_root`: Path to store the sparsely reconstructed models. Default: `colmap_models/sparse` 220 | - `--epic_kithens_root`: Directory path to the EPIC-KITCHENS images. Default: `.` 221 | - `--logs_path`: Path where the log files will be stored. Default: `logs/sparse/out_logs_terminal` 222 | - `--summary_path`: Path where the summary files will be stored. Default: `logs/sparse/out_summary` 223 | - `--sampled_images_path`: Directory where the sampled image files are located. Default: `sampled_frames` 224 | - `--gpu_index`: Index of the GPU to be used. 
Default: `0` 225 | 226 | ##### Example Usage: 227 | ```bash 228 | python3 reconstruct_sparse.py --input_videos input_videos.txt --sparse_reconstuctions_root colmap_models/sparse --epic_kithens_root path_to_epic_images --logs_path logs/sparse/out_logs_terminal --summary_path logs/sparse/out_summary --sampled_images_path path_for_sampled_frames --gpu_index 0 229 | ``` 230 | 231 | #### Demo: Sparse Reconstruction for P15_12 Video 232 | For the demo, using the P15_12 video and the sampled frames in the current directory, run: 233 | 234 | ```bash 235 | python3 reconstruct_sparse.py --input_videos input_videos.txt --sparse_reconstuctions_root colmap_models/sparse --epic_kithens_root . --logs_path logs/sparse/out_logs_terminal --summary_path logs/sparse/out_summary --sampled_images_path sampled_frames --gpu_index 0 236 | ``` 237 | 238 | ### Understanding the Output File Structure 239 | 240 | After running the sparse reconstruction demo, you'll notice the following directory hierarchy: 241 | ``` 242 | logs/ 243 | │ 244 | └───sparse/ 245 | │ 246 | ├───out_logs_terminal/ 247 | │ │ P15_12__reconstruct_sparse.out 248 | │ │ ... 249 | │ 250 | └───out_summary/ 251 | │ P15_12.out 252 | │ ... 253 | 254 | ``` 255 | #### Sparse Model Directory: 256 | The sparsely reconstructed model for our demo video P15_12 will be found in: ```colmap_models/sparse/P15_12``` 257 | 258 | #### Logs Directory: 259 | The "logs" directory provides insights into the sparse reconstruction process: 260 | 261 | - COLMAP Execution Logs (out_logs_terminal): These logs capture details from the COLMAP execution and can be helpful for debugging. For our demo video P15_12, the respective log file would be named something like: ```logs/sparse/out_logs_terminal/P15_12__reconstruct_sparse.out``` 262 | 263 | - Sparse Model Summary (out_summary): This directory contains a summary of the sparse model's statistics. For our demo video P15_12, the summary file is ```logs/sparse/out_summary/P15_12.out``` 264 | By examining the P15_12.out file, you can gain insights into how well the reconstruction process performed for that specific video and the excution time. 265 | 266 | 267 | ### Step 5: Registering All Frames into the Sparse Model 268 | 269 | For this step, you'll use the `register_dense.py` script. This script registers all the frames with the sparse model, preparing them for a dense reconstruction. 270 | 271 | ##### Script Parameters: 272 | 273 | - `--input_videos`: Path to the file containing a list of videos to be processed. Default: `input_videos.txt` 274 | - `--sparse_reconstuctions_root`: Directory path to the sparsely reconstructed models. Default: `colmap_models/sparse` 275 | - `--dense_reconstuctions_root`: Directory path to the densely registered models. Default: `colmap_models/dense` 276 | - `--epic_kithens_root`: Directory path to the EPIC-KITCHENS images. Default: `.` 277 | - `--logs_path`: Directory where the log files of the dense registration will be stored. Default: `logs/dense/out_logs_terminal` 278 | - `--summary_path`: Directory where the summary files of the dense registration will be stored. Default: `logs/dense/out_summary` 279 | - `--gpu_index`: Index of the GPU to use. Default: `0` 280 | 281 | #### Demo: Registering Frames into Sparse Model for Video `P15_12` 282 | 283 | To demonstrate the registration process using the `register_dense.py` script, let's use the sample video `P15_12` as an example. 
284 | 285 | ```bash 286 | python3 register_dense.py --input_videos input_videos.txt --sparse_reconstuctions_root colmap_models/sparse --dense_reconstuctions_root colmap_models/dense --epic_kithens_root . --logs_path logs/dense/out_logs_terminal --summary_path logs/dense/out_summary --gpu_index 0 287 | ``` 288 | 289 | Assuming input_videos.txt contains the entry for P15_12, the above command will register all frames from the P15_12 video with the sparse model stored under colmap_models/sparse, the new registered model will be saved under colmap_models/dense. The logs and summary for this registration process will be saved under the logs/dense/out_logs_terminal and logs/dense/out_summary directories, respectively. 290 | 291 | After executing the command, you can check the log files and summary for insights and statistics on the registration process for the P15_12 video. 292 | 293 | # Reconstruction Pipeline: Quick Demo 294 | 295 | Here we provide another demo script `demo/demo.py` 296 | that works on a video directly, summarising all above steps into one file. 297 | 298 | ``` 299 | python demo/demo.py video.mp4 300 | ``` 301 | 302 | Please refer to [demo_ego4d.md](demo/demo_ego4d.md) for details. 303 | 304 | 305 | 306 | # Additional info 307 | 308 | ## Credit 309 | 310 | Code prepared by Zhifan Zhu, Ahmad Darkhalil and Vadim Tschernezki. 311 | 312 | ## Citation 313 | If you find this work useful please cite our paper: 314 | 315 | ``` 316 | @article{EPICFIELDS2023, 317 | title={{EPIC-FIELDS}: {M}arrying {3D} {G}eometry and {V}ideo {U}nderstanding}, 318 | author={Tschernezki, Vadim and Darkhalil, Ahmad and Zhu, Zhifan and Fouhey, David and Larina, Iro and Larlus, Diane and Damen, Dima and Vedaldi, Andrea}, 319 | booktitle = {ArXiv}, 320 | year = {2023} 321 | } 322 | ``` 323 | 324 | Also cite the [EPIC-KITCHENS-100](https://epic-kitchens.github.io) paper where the videos originate: 325 | 326 | ``` 327 | @ARTICLE{Damen2022RESCALING, 328 | title={Rescaling Egocentric Vision: Collection, Pipeline and Challenges for EPIC-KITCHENS-100}, 329 | author={Damen, Dima and Doughty, Hazel and Farinella, Giovanni Maria and and Furnari, Antonino 330 | and Ma, Jian and Kazakos, Evangelos and Moltisanti, Davide and Munro, Jonathan 331 | and Perrett, Toby and Price, Will and Wray, Michael}, 332 | journal = {International Journal of Computer Vision (IJCV)}, 333 | year = {2022}, 334 | volume = {130}, 335 | pages = {33–55}, 336 | Url = {https://doi.org/10.1007/s11263-021-01531-2} 337 | } 338 | ``` 339 | For more information on the project and related research, please visit the [EPIC-Kitchens' EPIC Fields page](https://epic-kitchens.github.io/epic-fields/). 340 | 341 | 342 | ## License 343 | All files in this dataset are copyright by us and published under the 344 | Creative Commons Attribution-NonCommerial 4.0 International License, found 345 | [here](https://creativecommons.org/licenses/by-nc/4.0/). 346 | This means that you must give appropriate credit, provide a link to the license, 347 | and indicate if changes were made. You may do so in any reasonable manner, 348 | but not in any way that suggests the licensor endorses you or your use. You 349 | may not use the material for commercial purposes. 350 | 351 | ## Contact 352 | 353 | For general enquiries regarding this work or related projects, feel free to email us at [uob-epic-kitchens@bristol.ac.uk](mailto:uob-epic-kitchens@bristol.ac.uk). 
354 | 355 | -------------------------------------------------------------------------------- /assets/epic_fields.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/assets/epic_fields.png -------------------------------------------------------------------------------- /demo/demo.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import os 3 | import os.path as osp 4 | from pathlib import Path 5 | import subprocess 6 | import logging 7 | import pycolmap 8 | 9 | 10 | def parse_args(): 11 | import argparse 12 | parser = argparse.ArgumentParser() 13 | parser.add_argument('video_path', type=str) 14 | return parser.parse_args() 15 | 16 | 17 | def setup_logger(name, log_file, level=logging.DEBUG): 18 | """To setup as many loggers as you want""" 19 | 20 | handler = logging.FileHandler(log_file, mode='a') 21 | formatter = logging.Formatter('%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s', 22 | datefmt='%Y-%m-%d %H:%M:%S') 23 | handler.setFormatter(formatter) 24 | logger = logging.getLogger(name) 25 | logger.setLevel(level) 26 | logger.addHandler(handler) 27 | 28 | return logger 29 | 30 | 31 | class PipelineExecutor: 32 | """ 33 | Output structure we need are: 34 | 35 | / 36 | pipeline.log 37 | sparse.log 38 | register.log 39 | dense_pcd.log 40 | colmap/ 41 | sparse/{max_model_id}/{cameras.bin,points.bin,images.bin} 42 | registered/{cameras.bin,points.bin,images.bin} 43 | dense/dense.ply 44 | """ 45 | 46 | def __init__(self, 47 | video_path: str, 48 | out_dir: str, 49 | longside: int = 512, 50 | camera_model: str = 'OPENCV', 51 | make_log_and_dirs=True, 52 | ): 53 | """ 54 | Args: 55 | video_path: path to the video 56 | camera_model: See Colmap doc 57 | longside: this controls the frame resolution for the extracted frames 58 | """ 59 | self.worker_dir = Path(out_dir) 60 | self.video_file = video_path 61 | self.camera_model = camera_model 62 | self.longside = longside 63 | 64 | self.frames_dir = self.worker_dir / 'frames' 65 | self.homo_path = self.worker_dir / 'homo90.txt' 66 | self.colmap_dir = self.worker_dir / 'colmap' 67 | self.pipeline_log = self.worker_dir / 'pipeline.log' 68 | self.sparse_log = self.worker_dir / 'sparse.log' 69 | self.register_log = self.worker_dir / 'register.log' 70 | self.dense_pcd_log = self.worker_dir / 'dense_pcd.log' 71 | 72 | self.sparse_dir = self.colmap_dir / 'sparse' # generated by colmap 73 | self.register_dir = self.colmap_dir / 'registered' 74 | self.dense_pcd_dir = self.colmap_dir / 'dense' 75 | 76 | if not make_log_and_dirs: 77 | return 78 | os.makedirs(self.worker_dir, exist_ok=True) 79 | self.logger = setup_logger('demo-logger', self.pipeline_log) 80 | self.logger.info("Run start") 81 | assert os.path.exists(self.pipeline_log) 82 | 83 | def extract_frames(self, with_skip=True): 84 | # num_expected_frames = -1 85 | if with_skip and os.path.exists(self.frames_dir) and len(os.listdir(self.frames_dir)) > 0: 86 | print(f'{self.frames_dir} exist and is non-empty, skip') 87 | return 88 | 89 | cmd1 = [ 90 | 'ffprobe', '-v', 'error', '-select_streams', 'v:0', '-show_entries', 'stream=width,height', '-of', 'csv=s=x:p=0', self.video_file 91 | ] 92 | # extract output resolution 93 | p = subprocess.Popen(cmd1, stdout=subprocess.PIPE, stderr=subprocess.PIPE) 94 | out, err = p.communicate() 95 | print('Original resolution: ', out) 96 | w, h = 
out.decode('utf-8').strip().split('x') 97 | h, w = int(h), int(w) 98 | assert w > h 99 | h = h * self.longside // w 100 | w = self.longside 101 | 102 | s = f'{w}x{h}' 103 | os.makedirs(self.frames_dir, exist_ok=True) 104 | 105 | print("Extracting frames... ") 106 | cmd2 = [ 107 | 'ffmpeg', '-i', self.video_file, '-q:v', '1', '-vf', 'fps=30', '-s', s, f'{self.frames_dir}/frame_%010d.jpg'] 108 | cmd2 = ' '.join(cmd2) 109 | p = subprocess.call(cmd2, shell=True) 110 | self.logger.info(f'Extract frames done') 111 | 112 | def run_homography(self): 113 | self.logger.info(f'Run homography') 114 | cmd = [ 115 | 'python', 'homography_filter/filter.py', '--src', 116 | str(self.frames_dir), '--dst_file', str(self.homo_path), '--overlap', '0.9' 117 | ] 118 | print(' '.join(cmd)) 119 | if os.path.exists(self.homo_path): 120 | with open(self.homo_path, 'r') as fp: 121 | lines = fp.readlines() 122 | n_lines = len(lines) 123 | print(f'{self.homo_path} with {n_lines}, skip') 124 | self.logger.info(f'{self.homo_path} with {n_lines}, skip') 125 | return 126 | cmd = ' '.join(cmd) 127 | self.logger.info(cmd) 128 | p = subprocess.call(cmd, shell=True) 129 | self.logger.info(f'Homography Done') 130 | 131 | def run_sparse_reconstruct(self, script_path='demo/reconstruct_sparse.sh'): 132 | status = self.get_summary() 133 | if status['num_sparse_models'] > 0: 134 | self.logger.info(f'Found {status["num_sparse_models"]} sparse models, skip sparse reconstruction()') 135 | print(f'Found {status["num_sparse_models"]} sparse models, skip sparse reconstruction()') 136 | return 137 | self.logger.info(f'Run sparse') 138 | cmd = [ 139 | 'bash', script_path, 140 | str(self.worker_dir), str(self.camera_model) 141 | ] 142 | print(' '.join(cmd)) 143 | print('Check sparse log at ', self.sparse_log) 144 | self.logger.info(' '.join(cmd)) 145 | with open(self.sparse_log, 'w') as sparse_fp: 146 | p = subprocess.run(cmd, stdout=sparse_fp, stderr=sparse_fp) 147 | # out, err = p.communicate() 148 | if p.returncode != 0: 149 | print(f'Error in sparse reconstruction. 
See {self.sparse_log}') 150 | sys.exit(1) 151 | self.logger.info(f'Done sparse') 152 | 153 | def run_register(self, script_path='demo/register_dense.sh'): 154 | summary = self.get_summary() 155 | max_sparse_ind = summary['max_sparse_ind'] 156 | if summary['num_register'] > 0: 157 | print(f'Found {summary["num_register"]} already registered, skiping') 158 | return 159 | self.logger.info(f'Run Register') 160 | cmd = [ 161 | 'bash', script_path, 162 | str(self.worker_dir), str(self.camera_model), str(max_sparse_ind) 163 | ] 164 | print(' '.join(cmd)) 165 | self.logger.info(' '.join(cmd)) 166 | with open(self.register_log, 'w') as register_fp: 167 | p = subprocess.run(cmd, stdout=register_fp, stderr=register_fp) 168 | self.logger.info(f'Done Register') 169 | 170 | def run_dense_pcd(self, script_path='demo/dense_point_cloud.sh'): 171 | summary = self.get_summary() 172 | max_sparse_ind = summary['max_sparse_ind'] 173 | if os.path.exists(self.dense_pcd_dir / 'fused.ply'): 174 | print(f'fused.ply already exist in {self.dense_pcd_dir}, skiping') 175 | return 176 | self.logger.info(f'Run Dense PCD (patch stereo)') 177 | cmd = [ 178 | 'bash', script_path, 179 | str(self.worker_dir), str(max_sparse_ind) 180 | ] 181 | print(' '.join(cmd)) 182 | self.logger.info(' '.join(cmd)) 183 | with open(self.dense_pcd_log, 'w') as dense_pcd_fp: 184 | p = subprocess.run(cmd, stdout=dense_pcd_fp, stderr=dense_pcd_fp) 185 | self.logger.info(f'Done Dense PCD') 186 | 187 | def execute(self): 188 | self.extract_frames() 189 | self.run_homography() 190 | if not osp.exists(self.homo_path): 191 | print(f'{self.homo_path} not exist after homography, abort') 192 | return 193 | self.run_sparse_reconstruct() 194 | if not self.get_summary()['num_sparse_models'] > 0: 195 | print(f"num_sparse_models <= 0 after sparse reconstruction, abort") 196 | return 197 | self.run_register() 198 | self.run_dense_pcd() 199 | 200 | def get_summary(self) -> dict: 201 | """ 202 | N-frames, N-homo, N-sparse-models, max_sparse_ind, N-sparse-images, N-register 203 | """ 204 | info = dict( 205 | video=self.video_file, 206 | num_frames=-1, num_homo=-1, num_sparse_models=-1, 207 | max_sparse_ind=-1, num_sparse_images=-1, num_register=-1 208 | ) 209 | info['num_frames'] = len(os.listdir(self.frames_dir)) 210 | if not os.path.exists(self.homo_path): 211 | return info 212 | 213 | with open(self.homo_path) as fp: 214 | info['num_homo'] = len(fp.readlines()) 215 | 216 | if not osp.exists(self.sparse_dir): 217 | return info 218 | 219 | info['num_sparse_models'] = len(os.listdir(self.sparse_dir)) 220 | for mod in os.listdir(self.sparse_dir): 221 | mod_path = osp.join(self.sparse_dir, mod) 222 | recon = pycolmap.Reconstruction(mod_path) 223 | num_images = recon.num_images() 224 | if num_images > info['num_sparse_images']: 225 | info['num_sparse_images'] = num_images 226 | info['max_sparse_ind'] = mod # str 227 | 228 | reg_path = osp.join(self.register_dir) 229 | if not osp.exists(osp.join(reg_path, 'images.bin')): 230 | return info 231 | recon = pycolmap.Reconstruction(reg_path) 232 | num_images = recon.num_images() 233 | info['num_register'] = num_images 234 | 235 | return info 236 | 237 | if __name__ == '__main__': 238 | args = parse_args() 239 | executor = PipelineExecutor( 240 | args.video_path, out_dir='outputs/demo/', 241 | longside=512) 242 | executor.execute() -------------------------------------------------------------------------------- /demo/demo_ego4d.md: -------------------------------------------------------------------------------- 1 | # 
Reconstruction Pipeline: Demo on Ego4D 2 | 3 | This `demo/demo.py` works on a video directly. 4 | 5 | Assume the environment is set up as described in [Step 0](/README.md#step-0-prerequisites-and-initial-configuration), 6 | and the video file is named `video.mp4`. 7 | Run the demo with: 8 | 9 | ``` 10 | python demo/demo.py video.mp4 11 | ``` 12 | 13 | You will find the results in `outputs/demo/colmap/`: 14 | the file `outputs/demo/colmap/registered/images.bin` stores (nearly) all camera poses; 15 | the file `outputs/demo/colmap/dense/fused.ply` stores the dense point cloud of the scene. 16 | There are also log files `outputs/demo/*.log` to monitor the progress. 17 | 18 | You should now inspect (visualise) the results using: 19 | ``` 20 | # Tested with open3d==0.16.0 21 | python3 tools/visualize_colmap_open3d.py \ 22 | --model outputs/demo/colmap/registered \ 23 | --pcd-path outputs/demo/colmap/dense/fused.ply 24 | ``` 25 | Note that `outputs/demo/colmap/registered/images.bin` might be slow to load. In practice, we visualise the key-frames: 26 | ``` 27 | python3 tools/visualize_colmap_open3d.py \ 28 | --model outputs/demo/colmap/sparse/0 \ 29 | --pcd-path outputs/demo/colmap/dense/fused.ply 30 | # Note: See the COLMAP documentation for what `sparse/0` means exactly. 31 | ``` 32 | 33 | ### What does this `demo/demo.py` do? 34 | 35 | Specifically, the `demo/demo.py` file does the following sequentially: 36 | - Extract frames using `ffmpeg` with the long side resized to 512 px. This is analogous to Steps 1 & 2 in [Reconstruction Pipeline](/README.md#reconstruction-pipeline). 37 | - Compute important frames via homography. This corresponds to Step 3 above. 38 | - Perform the _sparse reconstruction_. This corresponds to Step 4 above. 39 | - at the end of this step, you should inspect the sparse result to make sure it makes sense. 40 | - Perform the _dense frame registration_. This corresponds to Step 5 above. 41 | - at the end of this, you will have all the camera poses. 42 | - Compute the dense point cloud using COLMAP's patch_match_stereo. This gives you the pretty dense point cloud you see in the teaser image. 43 | 44 | ### Example: Ego4D videos 45 | 46 | We demo this script on the following two Ego4D videos: 47 | - Task: Cooking — 10 minutes. Ego4D uid = `id18f5c2be-cb79-46fa-8ff1-e03b7e26c986`. Demo output on YouTube: https://youtu.be/GfBsLnZoFGs 48 | - The pipeline running time for this video is 4 hours. 49 | - As a sanity check, the file `homo90.txt` after the homography step contains *1522* frames. 50 | - Task: Construction — 35 minutes of decorating and refurbishment. Ego4D uid = `a2dd8a8f-835f-4068-be78-99d38ad99625`. Demo output on YouTube: https://youtu.be/EZlayZIwNgQ 51 | - The pipeline running time for this video breaks down as follows: 52 | - Extract frames: 5 mins 53 | - Homography filter: 1 hour 54 | - Sparse reconstruction: **20 hours** 55 | - Dense register: 1.5 hours 56 | - Dense point-cloud generation: 2 hours 57 | 58 | ### Tips for running the demo script 59 | 60 | We rely on COLMAP, but no tool is perfect. In case of failure, check: 61 | - If the resulting point cloud is not geometrically correct, e.g. the ground is clearly not flat, try to re-run from the sparse reconstruction step. 62 | COLMAP has some stochastic behaviour when choosing the initial views. 63 | - If the above fails again, try increasing the `--overlap` in the homography filter to e.g. 0.95. This will increase the number of important frames, at the cost of increasing the running time during sparse reconstruction.
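When debugging a run, it also helps to check the intermediate outputs programmatically. The snippet below is a small sketch (using the same `pycolmap` calls that `demo/demo.py` uses internally, and assuming the default `outputs/demo/` output directory) that counts the selected key-frames and the registered images:

```python
# Sketch only: sanity-check the demo outputs with pycolmap, as demo/demo.py does internally.
import pycolmap

with open('outputs/demo/homo90.txt') as fp:
    print('key-frames after homography:', len(fp.readlines()))

recon = pycolmap.Reconstruction('outputs/demo/colmap/registered')
print('registered images:', recon.num_images())  # should approach the total frame count
```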
64 | 65 | 66 | ### Visualise a video of camera poses 67 | 68 | To produce a video of the camera poses and trajectory over time (see e.g. the YouTube videos above), follow the steps below: 69 |
70 | Click to see steps 71 |
1. Visualise the result again with the Open3D GUI:
   `python3 tools/visualize_colmap_open3d.py --model outputs/demo/colmap/sparse/0 --pcd-path outputs/demo/colmap/dense/fused.ply`
2. In the Open3D GUI, press Ctrl-C (Linux) / Cmd-C (Mac) to copy the view to the system clipboard. Go to any editor, press Ctrl-V/Cmd-V to paste the view status, and save the file to `outputs/demo/view.json`.
3. Run the following script to produce the video:
   `python utils/hovering/hover_open3d.py --model outputs/demo/colmap/registered --pcd-path outputs/demo/colmap/dense/fused.ply --view-path outputs/demo/view.json`
   The produced video is at `outputs/hovering/out.mp4`.
80 |
81 | -------------------------------------------------------------------------------- /demo/dense_point_cloud.sh: -------------------------------------------------------------------------------- 1 | 2 | WORK_DIR=$1 3 | SPARSE_INDEX=$2 4 | 5 | IMG_PATH=$WORK_DIR/frames 6 | INPUT_PATH=$WORK_DIR/colmap/sparse/$SPARSE_INDEX 7 | OUTPUT_PATH=$WORK_DIR/colmap/dense 8 | 9 | OLD_DIR=$(pwd) 10 | 11 | mkdir -p $OUTPUT_PATH 12 | 13 | colmap image_undistorter \ 14 | --image_path $IMG_PATH \ 15 | --input_path $INPUT_PATH \ 16 | --output_path $OUTPUT_PATH \ 17 | --output_type COLMAP \ 18 | --max_image_size 1000 \ 19 | 20 | cd $OUTPUT_PATH 21 | 22 | colmap patch_match_stereo \ 23 | --workspace_path . \ 24 | --workspace_format COLMAP \ 25 | --PatchMatchStereo.max_image_size=1000 \ 26 | --PatchMatchStereo.gpu_index=0,1 \ 27 | --PatchMatchStereo.cache_size=32 \ 28 | --PatchMatchStereo.geom_consistency false \ 29 | 30 | colmap stereo_fusion \ 31 | --workspace_path . \ 32 | --workspace_format COLMAP \ 33 | --input_type photometric \ 34 | --output_type PLY \ 35 | --output_path ./fused.ply \ 36 | 37 | # For geometric consistency, do the following lines instead 38 | # colmap patch_match_stereo \ 39 | # --workspace_path . \ 40 | # --workspace_format COLMAP \ 41 | # --PatchMatchStereo.max_image_size=1000 \ 42 | # --PatchMatchStereo.gpu_index=0,1 \ 43 | # --PatchMatchStereo.cache_size=32 \ 44 | # --PatchMatchStereo.geom_consistency false \ 45 | 46 | # colmap stereo_fusion \ 47 | # --workspace_path . \ 48 | # --workspace_format COLMAP \ 49 | # --input_type photometric \ 50 | # --output_type PLY \ 51 | # --output_path ./fused.ply \ 52 | 53 | cd $OLD_DIR -------------------------------------------------------------------------------- /demo/reconstruct_sparse.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | start=`date +%s` 3 | 4 | WORK_DIR=$1 5 | CAMERA_MODEL=$2 # OPENCV or OPENCV_FISHEYE 6 | GPU_IDX=0 7 | 8 | IMGS_DIR=$WORK_DIR/frames 9 | OUT_DIR=${WORK_DIR}/colmap 10 | 11 | DB_PATH=${OUT_DIR}/database.db 12 | SPARSE_DIR=${OUT_DIR}/sparse 13 | 14 | mkdir -p ${OUT_DIR} 15 | mkdir -p ${SPARSE_DIR} 16 | 17 | #SIMPLE_PINHOLE 18 | colmap feature_extractor \ 19 | --database_path ${DB_PATH} \ 20 | --ImageReader.camera_model $CAMERA_MODEL \ 21 | --image_list_path $WORK_DIR/homo90.txt \ 22 | --ImageReader.single_camera 1 \ 23 | --SiftExtraction.use_gpu 1 \ 24 | --SiftExtraction.gpu_index $GPU_IDX \ 25 | --image_path $IMGS_DIR \ 26 | 27 | colmap sequential_matcher \ 28 | --database_path ${DB_PATH} \ 29 | --SiftMatching.use_gpu 1 \ 30 | --SequentialMatching.loop_detection 1 \ 31 | --SiftMatching.gpu_index $GPU_IDX \ 32 | --SequentialMatching.vocab_tree_path vocab_bins/vocab_tree_flickr100K_words32K.bin \ 33 | 34 | colmap mapper \ 35 | --database_path ${DB_PATH} \ 36 | --image_path $IMGS_DIR \ 37 | --output_path ${SPARSE_DIR} \ 38 | --image_list_path $WORK_DIR/homo90.txt \ 39 | #--Mapper.ba_global_use_pba 1 \ 40 | #--Mapper.ba_global_pba_gpu_index 0 1 \ 41 | 42 | 43 | end=`date +%s` 44 | 45 | runtime=$(((end-start)/60)) 46 | echo "$runtime minutes" 47 | -------------------------------------------------------------------------------- /demo/register_dense.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | start=`date +%s` 3 | 4 | GPU_IDX=0 5 | 6 | WORK_DIR=$1 7 | CAMERA_MODEL=$2 8 | MAX_SPARSE_IND=$3 9 | IMGS_DIR=$WORK_DIR/frames 10 | OUT_DIR=${WORK_DIR}/colmap 11 | 12 | DB_PATH=${OUT_DIR}/database.db 13 | 
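# Registration re-uses the key-frame database built during the sparse stage:
# the database is copied, SIFT features are extracted for *all* frames into the copy,
# frames are matched sequentially (with loop detection against the vocabulary tree),
# and image_registrator then adds the new frames to the selected sparse model,
# writing the result to colmap/registered.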
SPARSE_DIR=${OUT_DIR}/sparse 14 | 15 | REG_DIR=${OUT_DIR}/registered 16 | mkdir -p $REG_DIR 17 | 18 | VIDEOUID=`basename $WORK_DIR` 19 | REG_DB_PATH=${OUT_DIR}/reg${VIDEOUID}.db 20 | echo $VIDOEUID $REG_DB_PATH 21 | rm -f $REG_DB_PATH $REG_DB_PATH-shm $REG_DB_PATH-wal 22 | cp $DB_PATH $REG_DB_PATH 23 | 24 | colmap feature_extractor \ 25 | --database_path ${REG_DB_PATH} \ 26 | --ImageReader.camera_model $CAMERA_MODEL \ 27 | --ImageReader.single_camera 1 \ 28 | --ImageReader.existing_camera_id 1 \ 29 | --SiftExtraction.use_gpu 1 \ 30 | --SiftExtraction.gpu_index $GPU_IDX \ 31 | --image_path $IMGS_DIR 32 | 33 | colmap sequential_matcher \ 34 | --database_path ${REG_DB_PATH} \ 35 | --SiftMatching.use_gpu 1 \ 36 | --SequentialMatching.loop_detection 1 \ 37 | --SiftMatching.gpu_index $GPU_IDX \ 38 | --SequentialMatching.vocab_tree_path vocab_bins/vocab_tree_flickr100K_words32K.bin \ 39 | 40 | colmap image_registrator \ 41 | --database_path $REG_DB_PATH \ 42 | --input_path $SPARSE_DIR/$MAX_SPARSE_IND \ 43 | --output_path $REG_DIR \ 44 | 45 | # Release space after successful registration 46 | if [ -e $REG_DIR/images.bin ]; then 47 | rm -f $REG_DB_PATH $REG_DB_PATH-shm $REG_DB_PATH-wal 48 | fi 49 | 50 | end_reg=`date +%s` 51 | 52 | runtime=$(((end_reg-start)/60)) 53 | echo "$runtime minutes" 54 | 55 | -------------------------------------------------------------------------------- /example_data/P04_01_line.json: -------------------------------------------------------------------------------- 1 | [ 2 | -3.028, 4.60835, 3.67792, 3 | 0.199998, 0.291596, 5.56575 4 | ] 5 | -------------------------------------------------------------------------------- /example_data/P04_01_line.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P04_01_line.png -------------------------------------------------------------------------------- /example_data/P06_09_line.json: -------------------------------------------------------------------------------- 1 | [ 2 | 11.5486, 1.00723, 3.13634, 3 | -2.84154, 0.720368, 6.66926 4 | ] -------------------------------------------------------------------------------- /example_data/P06_09_line.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P06_09_line.png -------------------------------------------------------------------------------- /example_data/P12_101_line.json: -------------------------------------------------------------------------------- 1 | [ 2 | 2.44827, 0.0581669, 8.20895, 3 | -7.4244, 3.82762, 7.32 4 | ] -------------------------------------------------------------------------------- /example_data/P12_101_line.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P12_101_line.png -------------------------------------------------------------------------------- /example_data/P28_101/frame_0000000080.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000080.jpg -------------------------------------------------------------------------------- 
/example_data/P28_101/frame_0000000085.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000085.jpg -------------------------------------------------------------------------------- /example_data/P28_101/frame_0000000090.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000090.jpg -------------------------------------------------------------------------------- /example_data/P28_101/frame_0000000095.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000095.jpg -------------------------------------------------------------------------------- /example_data/P28_101/frame_0000000100.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000100.jpg -------------------------------------------------------------------------------- /example_data/P28_101/frame_0000000105.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000105.jpg -------------------------------------------------------------------------------- /example_data/P28_101/frame_0000000110.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000110.jpg -------------------------------------------------------------------------------- /example_data/P28_101/frame_0000000115.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101/frame_0000000115.jpg -------------------------------------------------------------------------------- /example_data/P28_101_line.json: -------------------------------------------------------------------------------- 1 | [ 2 | -2.49927, -0.543869, 2.57086, 3 | 3.32875, -2.17165, 2.4229 4 | ] -------------------------------------------------------------------------------- /example_data/P28_101_line.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/P28_101_line.png -------------------------------------------------------------------------------- /example_data/example_output_gui.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/example_output_gui.jpg -------------------------------------------------------------------------------- /example_data/example_output_line.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/example_data/example_output_line.jpg -------------------------------------------------------------------------------- /homography_filter/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/homography_filter/__init__.py -------------------------------------------------------------------------------- /homography_filter/argparser.py: -------------------------------------------------------------------------------- 1 | 2 | import argparse 3 | 4 | 5 | def parse_args(): 6 | parser = argparse.ArgumentParser() 7 | parser.add_argument( 8 | "--src", 9 | type=str, 10 | ) 11 | parser.add_argument( 12 | "--dst_file", 13 | type=str, 14 | ) 15 | parser.add_argument( 16 | "--overlap", 17 | default=0.9, 18 | type=float, 19 | ) 20 | parser.add_argument( 21 | "--frame_range_min", 22 | default=0, 23 | type=int, 24 | ) 25 | parser.add_argument( 26 | "--frame_range_max", 27 | default=None, 28 | type=int, 29 | ) 30 | parser.add_argument( 31 | "--filtering_scale", 32 | default=1, 33 | type=int, 34 | ) 35 | parser.add_argument( 36 | '-f', 37 | type=str, 38 | default=None 39 | ) 40 | args = parser.parse_args() 41 | return args 42 | -------------------------------------------------------------------------------- /homography_filter/filter.py: -------------------------------------------------------------------------------- 1 | 2 | import os 3 | from glob import glob 4 | import numpy as np 5 | from matplotlib import pyplot as plt 6 | from collections import defaultdict 7 | import time 8 | 9 | from lib import * 10 | from argparser import parse_args 11 | import cv2 12 | 13 | 14 | def make_homography_loader(args): 15 | 16 | images = Images(args.src, scale=args.filtering_scale) 17 | print(f'Found {len(images.imreader.fpaths)} images.') 18 | features = Features(images) 19 | matches = Matches(features) 20 | homographies = Homographies(images, features, matches) 21 | 22 | return homographies 23 | 24 | 25 | def save(fpaths_filtered, args): 26 | imreader = ImageReader(src=args.src) 27 | dir_dst = args.dir_dst 28 | dir_images = os.path.join(dir_dst, 'images') 29 | extract_frames(dir_images, fpaths_filtered, imreader) 30 | save_as_video(os.path.join(dir_dst, 'video'), fpaths_filtered, imreader) 31 | 32 | 33 | if __name__ == '__main__': 34 | 35 | # set filtering to deterministic mode 36 | cv2.setRNGSeed(0) 37 | args = parse_args() 38 | homographies = make_homography_loader(args) 39 | graph = calc_graph(homographies, **vars(args)) 40 | fpaths_filtered = graph2fpaths(graph) 41 | lines = [os.path.basename(v)+'\n' for v in fpaths_filtered] 42 | dir_name = os.path.dirname(args.dst_file) 43 | if not os.path.exists(dir_name): 44 | os.makedirs(dir_name) 45 | with open(args.dst_file, 'w') as fp: 46 | fp.writelines(lines) -------------------------------------------------------------------------------- /homography_filter/lib.py: -------------------------------------------------------------------------------- 1 | 2 | import cv2 as cv 3 | import numpy as np 4 | from matplotlib import pyplot as plt 5 | from collections import defaultdict 6 | import sys 7 | import os 8 | import shutil 9 | from glob import glob 10 | 11 | 12 | if '-f' in sys.argv: 13 | from tqdm.notebook import tqdm 14 | else: 15 | from tqdm import tqdm 16 | 17 | 18 | class Images: 19 | def __init__(self, src, load_grey=True, 
scale=1): 20 | self.images = {} 21 | self.im_size = None 22 | self.src = src 23 | self.scale = scale 24 | if load_grey: 25 | self.imreader = ImageReader(src, scale=scale, cv_flag=cv.IMREAD_GRAYSCALE) 26 | else: 27 | self.imreader = ImageReader(src, scale=scale) 28 | 29 | def __getitem__(self, k): 30 | if k not in self.images: 31 | im = self.imreader[k] 32 | self.images[k] = im 33 | self.im_size = self.images[k].shape[:2] 34 | return self.images[k] 35 | 36 | 37 | class Features: 38 | def __init__(self, images): 39 | self.features = {} 40 | self.images = images 41 | self.sift = cv.SIFT_create() 42 | 43 | def __getitem__(self, k): 44 | if k not in self.features: 45 | im = self.images[k] 46 | kp, des = self.sift.detectAndCompute(im, None) 47 | self.features[k] = (kp, des) 48 | return self.features[k] 49 | 50 | 51 | class Matches: 52 | def __init__(self, features): 53 | 54 | FLANN_INDEX_KDTREE = 1 55 | index_params = dict(algorithm=FLANN_INDEX_KDTREE, trees=5) 56 | search_params = dict(checks=50) 57 | self.features = features 58 | self.matcher = cv.FlannBasedMatcher(index_params, search_params) 59 | self.matches = {} 60 | self.for_panorama_stitching = False 61 | 62 | def __getitem__(self, k): 63 | if k not in self.matches: 64 | (kp1, des1) = self.features[k[0]] 65 | (kp2, des2) = self.features[k[1]] 66 | if len(kp1) > 8: 67 | try: 68 | matches = self.matcher.knnMatch(des1, des2, k=2) 69 | except cv.error as e: 70 | print('NOTE: Too few keypoints for matching, skip.') 71 | matches = zip([], []) 72 | else: 73 | matches = zip([], []) 74 | # store all the good matches as per Lowe's ratio test. 75 | good = [] 76 | for m, n in matches: 77 | if m.distance < 0.7 * n.distance: 78 | good.append(m) 79 | self.matches[k] = good 80 | 81 | return self.matches[k] 82 | 83 | 84 | class Homographies: 85 | def __init__(self, images, features, matches): 86 | self.matches = matches 87 | self.homographies = {} 88 | self.images = images 89 | self.features = features 90 | self.warps = {} 91 | self.min_match_count = 10 92 | self.images_rgb = ImageReader(src=self.images.src, scale=self.images.scale) 93 | 94 | def __getitem__(self, k): 95 | good = self.matches[k] 96 | kp1, _ = self.features[k[0]] 97 | kp2, _ = self.features[k[1]] 98 | img2 = self.images[k[1]] 99 | if k not in self.homographies: 100 | if len(good) > self.min_match_count: 101 | src_pts = np.float32([kp1[m.queryIdx].pt for m in good]).reshape( 102 | -1, 1, 2 103 | ) 104 | dst_pts = np.float32([kp2[m.trainIdx].pt for m in good]).reshape( 105 | -1, 1, 2 106 | ) 107 | M, mask = cv.findHomography(src_pts, dst_pts, cv.RANSAC, 5.0) 108 | self.homographies[k] = (M, mask) 109 | else: 110 | # print( "Not enough matches are found - {}/{}".format(len(good), self.min_match_count) ) 111 | matchesMask = None 112 | self.homographies[k] = (None, None) 113 | return self.homographies[k] 114 | 115 | def calc_overlap(self, *k, vis=False, is_debug=False, with_warp=False, draw_matches=True): 116 | img1 = self.images_rgb[k[0]].copy() 117 | img2 = self.images_rgb[k[1]].copy() 118 | kp1, _ = self.features[k[0]] 119 | kp2, _ = self.features[k[1]] 120 | good = self.matches[k] 121 | h, w, c = img1.shape 122 | M, mask = self[k] 123 | 124 | if M is None: 125 | return 0, [], np.zeros([h, w * 2]) 126 | 127 | matchesMask = mask.ravel().tolist() 128 | 129 | pts = np.float32([[0, 0], [0, h - 1], [w - 1, h - 1], [w - 1, 0]]).reshape( 130 | -1, 1, 2 131 | ) 132 | dst = cv.perspectiveTransform(pts, M) 133 | 134 | img2 = cv.polylines(img2, [np.int32(dst)], True, 255, 3, cv.LINE_AA) 135 | 136 | if 
with_warp: 137 | self.warps[k] = img2 138 | draw_params = dict( 139 | matchColor=(0, 255, 0), # draw matches in green color 140 | singlePointColor=None, 141 | matchesMask=matchesMask, # draw only inliers 142 | flags=2, 143 | ) 144 | 145 | if is_debug: 146 | if draw_matches: 147 | im_matches = cv.drawMatches(img1, kp1, img2, kp2, good, None, **draw_params) 148 | else: 149 | im_matches = img2 150 | if vis: 151 | plt.imshow(im_matches, "gray"), plt.show() 152 | # plt.imshow(img3, "gray"), plt.show() 153 | else: 154 | im_matches = img2 155 | 156 | image_area = self.images.im_size[0] * self.images.im_size[1] 157 | polygon = dst.copy()[:, 0] 158 | polygon = bound_polygon(polygon, im_size=self.images.im_size) 159 | overlap = polygon_area(polygon[:, 1], polygon[:, 0]) / image_area 160 | 161 | return overlap, good, im_matches 162 | 163 | def calc_graph( 164 | homographies, 165 | return_im_matches=False, 166 | overlap=0.9, 167 | frame_range_min=0, 168 | frame_range_max=None, 169 | is_debug=False, 170 | clear_cache=True, 171 | **kwargs, 172 | ): 173 | 174 | fpaths = homographies.images.imreader.fpaths 175 | print(overlap) 176 | graph = {'im_matches': {}, 'fpaths': {}} 177 | if frame_range_max is None: 178 | frame_range_max = len(fpaths) 179 | i = frame_range_min 180 | j = i + 1 181 | pbar = tqdm(total=frame_range_max - frame_range_min - 1) 182 | while i < frame_range_max - 1 and j < frame_range_max: 183 | j = i + 1 184 | while j < frame_range_max: 185 | pbar.update(1) 186 | overlap_ij, matches, im_matches = homographies.calc_overlap( 187 | fpaths[i], 188 | fpaths[j], 189 | vis=False, 190 | is_debug=is_debug, 191 | ) 192 | if overlap_ij < overlap: 193 | if is_debug: 194 | graph['im_matches'][i, j] = im_matches 195 | graph['fpaths'][i, j] = [fpaths[i], fpaths[j]] 196 | if clear_cache: 197 | i_ = i 198 | pi = fpaths[i_] 199 | del homographies.images.images[pi] 200 | del homographies.features.features[pi] 201 | for j_ in range(i_+1, j+1): 202 | pj = fpaths[j_] 203 | del homographies.homographies[(pi, pj)] 204 | del homographies.matches.matches[(pi, pj)] 205 | del homographies.images.images[pj] 206 | del homographies.features.features[pj] 207 | i = j 208 | break 209 | j += 1 210 | pbar.close() 211 | return graph 212 | 213 | 214 | def graph2fpaths(graph): 215 | fpaths = list(graph['fpaths'].values()) 216 | first_fpath = fpaths[0][0] 217 | graph = graph['fpaths'] 218 | paths = [first_fpath] + [fpath_pair[1] for fpath_pair in graph.values()] 219 | return paths 220 | 221 | 222 | def bound_polygon(polygon, im_size): 223 | # approximate for now instead of line clipping 224 | polygon[:, 0] = np.clip(polygon[:, 0], 0, im_size[1]) 225 | polygon[:, 1] = np.clip(polygon[:, 1], 0, im_size[0]) 226 | return polygon 227 | 228 | 229 | def polygon_area(x,y): 230 | return 0.5*np.abs(np.dot(x,np.roll(y,1))-np.dot(y,np.roll(x,1))) 231 | 232 | 233 | def write_mp4(name, frames, fps=10): 234 | import imageio 235 | imageio.mimwrite(name + ".mp4", frames, "mp4", fps=fps) 236 | 237 | 238 | def save_as_video(dst, fpaths, imreader): 239 | frames = [] 240 | for fp in tqdm(fpaths): 241 | frames += [imreader[fp]] 242 | write_mp4(dst, frames) 243 | 244 | 245 | def extract_frames(dir_dst, fpaths, imreader): 246 | for k in fpaths: 247 | imreader.save(k, dir_dst) 248 | 249 | 250 | # imreader 251 | 252 | import io 253 | def tar2bytearr(tar_member): 254 | return np.asarray( 255 | bytearray( 256 | tar_member.read() 257 | ), 258 | dtype=np.uint8 259 | ) 260 | 261 | import shutil 262 | 263 | import tarfile 264 | class ImageReader: 265 | def 
__init__(self, src, scale=1, cv_flag=cv.IMREAD_UNCHANGED): 266 | # src can be directory or tar file 267 | 268 | self.scale = 1 269 | self.cv_flag = cv_flag 270 | 271 | if os.path.isdir(src): 272 | self.src_type = 'dir' 273 | self.fpaths = sorted(glob(os.path.join(src, '*.jpg'))) 274 | elif os.path.isfile(src) and os.path.splitext(src)[1] == '.tar': 275 | self.tar = tarfile.open(src) 276 | self.src_type = 'tar' 277 | self.fpaths = sorted([x for x in self.tar.getnames() if 'frame_' in x and '.jpg' in x]) 278 | else: 279 | print('Source has unknown format.') 280 | exit() 281 | 282 | def __getitem__(self, k): 283 | if self.src_type == 'dir': 284 | 285 | im = cv.imread(k, self.cv_flag) 286 | elif self.src_type == 'tar': 287 | member = self.tar.getmember(k) 288 | tarfile = self.tar.extractfile(member) 289 | byte_array = tar2bytearr(tarfile) 290 | im = cv.imdecode(byte_array, self.cv_flag) 291 | if self.scale != 1: 292 | im = cv.resize( 293 | im, dsize=[im.shape[0] // self.scale, im.shape[1] // self.scale] 294 | ) 295 | if self.cv_flag != cv.IMREAD_GRAYSCALE: 296 | im = im[..., [2, 1, 0]] 297 | return im 298 | 299 | def save(self, k, dst): 300 | fn = os.path.split(k)[-1] 301 | if self.src_type == 'dir': 302 | shutil.copy(k, os.path.join(dst, fn)) 303 | elif self.src_type == 'tar': 304 | self.tar.extract(self.tar.getmember(k), dst) 305 | 306 | 307 | # test 308 | def test(): 309 | reader_args = {'scale': 2, 'cv_flag': cv.IMREAD_GRAYSCALE} 310 | reader_args = {'scale': 2} 311 | 312 | src = '/work/vadim/datasets/visor/2v6cgv1x04ol22qp9rm9x2j6a7/' + \ 313 | 'EPIC-KITCHENS-frames/tar/P28_05.tar' 314 | imreader1 = ImageReader(src=src, **reader_args) 315 | fpaths1 = imreader1.fpaths 316 | 317 | reader_args = {'scale': 2} 318 | 319 | video_id = 'P28_05' 320 | src = f'/work/vadim/datasets/visor/2v6cgv1x04ol22qp9rm9x2j6a7/EPIC-KITCHENS-frames/rgb_frames/{video_id}' 321 | imreader2 = ImageReader(src=src, **reader_args) 322 | fpaths2 = imreader2.fpaths 323 | 324 | for i in range(0, len(fpaths1), 1000): 325 | print((imreader1[fpaths1[i]] == imreader2[fpaths2[i]]).all()) -------------------------------------------------------------------------------- /input_videos.txt: -------------------------------------------------------------------------------- 1 | P15_12 -------------------------------------------------------------------------------- /licence.txt: -------------------------------------------------------------------------------- 1 | All files in this dataset are copyright by us and published under the 2 | Creative Commons Attribution-NonCommerial 4.0 International License, found 3 | at https://creativecommons.org/licenses/by-nc/4.0/. 4 | This means that you must give appropriate credit, provide a link to the license, 5 | and indicate if changes were made. You may do so in any reasonable manner, 6 | but not in any way that suggests the licensor endorses you or your use. You 7 | may not use the material for commercial purposes. 
8 | -------------------------------------------------------------------------------- /reconstruct_sparse.py: -------------------------------------------------------------------------------- 1 | import subprocess 2 | import shutil 3 | import os 4 | import time 5 | import glob 6 | import argparse 7 | import pycolmap 8 | from utils.lib import * 9 | # Function to parse command-line arguments 10 | def parse_args(): 11 | parser = argparse.ArgumentParser(description='COLMAP Reconstruction Script') 12 | parser.add_argument('--input_videos', type=str, default='input_videos.txt', 13 | help='A file with list of vidoes to be processed in all stages') 14 | parser.add_argument('--sparse_reconstuctions_root', type=str, default='colmap_models/sparse', 15 | help='Path to the sparsely reconstructed models.') 16 | parser.add_argument('--epic_kithens_root', type=str, default='.', 17 | help='Path to epic kitchens images.') 18 | parser.add_argument('--logs_path', type=str, default='logs/sparse/out_logs_terminal', 19 | help='Path to store the log files.') 20 | parser.add_argument('--summary_path', type=str, default='logs/sparse/out_summary', 21 | help='Path to store the summary files.') 22 | parser.add_argument('--sampled_images_path', type=str, default='sampled_frames', 23 | help='Path to the directory containing sampled image files.') 24 | parser.add_argument('--gpu_index', type=int, default=0, 25 | help='Index of the GPU to use.') 26 | 27 | return parser.parse_args() 28 | 29 | 30 | args = parse_args() 31 | 32 | gpu_index = args.gpu_index 33 | 34 | videos_list = read_lines_from_file(args.input_videos) 35 | videos_list = sorted(videos_list) 36 | print('GPU: %d' % (gpu_index)) 37 | os.makedirs(args.logs_path, exist_ok=True) 38 | os.makedirs(args.summary_path, exist_ok=True) 39 | os.makedirs(args.sparse_reconstuctions_root, exist_ok=True) 40 | 41 | i = 0 42 | for video in videos_list: 43 | pre = video.split('_')[0] 44 | if (not os.path.exists(os.path.join(args.sparse_reconstuctions_root, '%s' % video))): 45 | # check the number of images in this video 46 | with open(os.path.join(args.sampled_images_path, '%s_selected_frames.txt' % (video)), 'r') as f: 47 | lines = f.readlines() 48 | num_lines = len(lines) 49 | #print(f'The file {video} contains {num_lines} lines.') 50 | if num_lines < 100000: #it's too large, so it would take days! 
51 | print('Processing: ', video, '(',num_lines, 'images )') 52 | start_time = time.time() 53 | 54 | # Define the path to the shell script 55 | script_path = 'scripts/reconstruct_sparse.sh' 56 | 57 | # Create a unique copy of the script 58 | script_copy_path = video + '_' + str(os.getpid()) + '_' + os.path.basename(script_path) 59 | shutil.copy(script_path, script_copy_path) 60 | 61 | # Output file 62 | output_file_path = os.path.join(args.logs_path, script_copy_path.replace('.sh', '.out')) 63 | 64 | 65 | # Define the command to execute the script 66 | command = ["bash", script_copy_path, video,args.sparse_reconstuctions_root,args.epic_kithens_root,args.sampled_images_path,args.summary_path,str(gpu_index)] 67 | # Open the output file in write mode 68 | with open(output_file_path, 'w') as output_file: 69 | # Run the command and capture its output in real time 70 | process = subprocess.Popen(command, stdout=output_file, stderr=subprocess.PIPE, text=True) 71 | while True: 72 | output = process.stderr.readline() 73 | if output == '' and process.poll() is not None: 74 | break 75 | if output: 76 | output_file.write(output) 77 | output_file.flush() 78 | 79 | # Once the script has finished running, you can delete the copy of the script 80 | os.remove(script_copy_path) 81 | 82 | #In case of having multiple models, will keep the one with largest number of images and rename it as 0 83 | reg_images = keep_model_with_largest_images(os.path.join(args.sparse_reconstuctions_root,video,'sparse')) 84 | if reg_images > 0: 85 | print(f"Registered_images/total_images: {reg_images}/{num_lines} = {round(reg_images/num_lines*100)}%") 86 | else: 87 | print('The video reconstruction fails!! no reconstruction file is found!') 88 | 89 | 90 | 91 | 92 | print("Execution time: %s minutes" % round((time.time() - start_time)/60, 0)) 93 | print('-----------------------------------------------------------') 94 | 95 | i += 1 96 | 97 | -------------------------------------------------------------------------------- /register_dense.py: -------------------------------------------------------------------------------- 1 | import subprocess 2 | import shutil 3 | import os 4 | import time 5 | import glob 6 | import argparse 7 | import pycolmap 8 | from utils.lib import * 9 | # Function to parse command-line arguments 10 | def parse_args(): 11 | parser = argparse.ArgumentParser(description='COLMAP Reconstruction Script') 12 | parser.add_argument('--input_videos', type=str, default='input_videos.txt', 13 | help='A file with list of vidoes to be processed in all stages') 14 | parser.add_argument('--sparse_reconstuctions_root', type=str, default='colmap_models/sparse', 15 | help='Path to the sparsely reconstructed models.') 16 | parser.add_argument('--dense_reconstuctions_root', type=str, default='colmap_models/dense', 17 | help='Path to the densely registered models.') 18 | parser.add_argument('--epic_kithens_root', type=str, default='.', 19 | help='Path to epic kitchens images.') 20 | parser.add_argument('--logs_path', type=str, default='logs/dense/out_logs_terminal', 21 | help='Path to store the log files.') 22 | parser.add_argument('--summary_path', type=str, default='logs/dense/out_summary', 23 | help='Path to store the summary files.') 24 | parser.add_argument('--gpu_index', type=int, default=0, 25 | help='Index of the GPU to use.') 26 | 27 | return parser.parse_args() 28 | 29 | 30 | args = parse_args() 31 | 32 | gpu_index = args.gpu_index 33 | 34 | videos_list = read_lines_from_file(args.input_videos) 35 | videos_list = 
sorted(videos_list) 36 | print('GPU: %d' % (gpu_index)) 37 | os.makedirs(args.logs_path, exist_ok=True) 38 | os.makedirs(args.summary_path, exist_ok=True) 39 | os.makedirs(args.sparse_reconstuctions_root, exist_ok=True) 40 | os.makedirs(args.dense_reconstuctions_root, exist_ok=True) 41 | 42 | 43 | i = 0 44 | for video in videos_list: 45 | pre = video.split('_')[0] 46 | if (not os.path.exists(os.path.join(args.dense_reconstuctions_root, '%s' % video))): 47 | # check the number of images in this video 48 | num_lines = len(glob.glob(os.path.join(args.epic_kithens_root,pre,video,'*.jpg'))) 49 | 50 | print('Processing: ', video, '(',num_lines, 'images )') 51 | start_time = time.time() 52 | 53 | # Define the path to the shell script 54 | script_path = 'scripts/register_dense.sh' 55 | 56 | # Create a unique copy of the script 57 | script_copy_path = video + '_' + str(os.getpid()) + '_' + os.path.basename(script_path) 58 | shutil.copy(script_path, script_copy_path) 59 | 60 | # Output file 61 | output_file_path = os.path.join(args.logs_path, script_copy_path.replace('.sh', '.out')) 62 | 63 | 64 | # Define the command to execute the script 65 | command = ["bash", script_copy_path, video,args.sparse_reconstuctions_root,args.dense_reconstuctions_root,args.epic_kithens_root,args.summary_path,str(gpu_index)] 66 | # Open the output file in write mode 67 | with open(output_file_path, 'w') as output_file: 68 | # Run the command and capture its output in real time 69 | process = subprocess.Popen(command, stdout=output_file, stderr=subprocess.PIPE, text=True) 70 | while True: 71 | output = process.stderr.readline() 72 | if output == '' and process.poll() is not None: 73 | break 74 | if output: 75 | output_file.write(output) 76 | output_file.flush() 77 | 78 | # Once the script has finished running, you can delete the copy of the script 79 | os.remove(script_copy_path) 80 | 81 | 82 | reg_images = get_num_images(os.path.join(args.dense_reconstuctions_root,video)) 83 | if reg_images > 0: 84 | print(f"Registered_images/total_images: {reg_images}/{num_lines} = {round(reg_images/num_lines*100)}%") 85 | else: 86 | print('The video reconstruction fails!! no colmap files are found!') 87 | 88 | 89 | 90 | 91 | print("Execution time: %s minutes" % round((time.time() - start_time)/60, 0)) 92 | print('-----------------------------------------------------------') 93 | 94 | i += 1 95 | 96 | -------------------------------------------------------------------------------- /scripts/reconstruct_sparse.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | start=`date +%s` 3 | 4 | VIDEO=$1 #i.e. P02_14 5 | SPARSE_PATH=$2 # path to save the sparse models 6 | IMAGES_ROOT=$3 # root of epic kitchens images 7 | SAMPLED_IMAGES=$4 # path of the sampeld images to be used for reconstruction 8 | LOGS=$5 # to save the output logs 9 | GPU_IDX=$6 # i.e. 
0 10 | 11 | PRE=$(echo "$VIDEO" | cut -d'_' -f1) 12 | #cat $0 > "${LOGS}/$VIDEO.out" 13 | mkdir ${SPARSE_PATH}/${VIDEO} 14 | mkdir ${SPARSE_PATH}/${VIDEO}/sparse 15 | 16 | colmap feature_extractor \ 17 | --database_path ${VIDEO}_database.db \ 18 | --ImageReader.camera_model OPENCV \ 19 | --image_list_path ${SAMPLED_IMAGES}/${VIDEO}_selected_frames.txt \ 20 | --ImageReader.single_camera 1 \ 21 | --SiftExtraction.use_gpu 1 \ 22 | --SiftExtraction.gpu_index $GPU_IDX \ 23 | --image_path ${IMAGES_ROOT}/${PRE}/${VIDEO} \ 24 | 25 | colmap sequential_matcher \ 26 | --database_path ${VIDEO}_database.db \ 27 | --SiftMatching.use_gpu 1 \ 28 | --SequentialMatching.loop_detection 1 \ 29 | --SiftMatching.gpu_index $GPU_IDX \ 30 | --SequentialMatching.vocab_tree_path vocab_bins/vocab_tree_flickr100K_words32K.bin \ 31 | 32 | colmap mapper \ 33 | --database_path ${VIDEO}_database.db \ 34 | --image_path ${PRE}/${VIDEO} \ 35 | --output_path ${SPARSE_PATH}/${VIDEO}/sparse \ 36 | --image_list_path ${SAMPLED_IMAGES}/${VIDEO}_selected_frames.txt \ 37 | 38 | 39 | #echo "----------------------------------------------------------------------SUMMARY----------------------------------------------------------------------">> "${LOGS}/$VIDEO.out" 40 | colmap model_analyzer --path ${SPARSE_PATH}/${VIDEO}/sparse/0/ > "${LOGS}/$VIDEO.out" 41 | 42 | end=`date +%s` 43 | runtime=$(((end-start)/60)) 44 | echo "$runtime minutes">> "${LOGS}/$VIDEO.out" 45 | mv ${VIDEO}_database.db ${SPARSE_PATH}/${VIDEO}/database.db #move the database 46 | -------------------------------------------------------------------------------- /scripts/register_dense.sh: -------------------------------------------------------------------------------- 1 | start=`date +%s` 2 | 3 | VIDEO=$1 #i.e. P02_14 4 | SPARSE_PATH=$2 # path to save the sparse models 5 | DENSE_PATH=$3 # path to save the sparse models 6 | IMAGES_ROOT=$4 # root of epic kitchens images 7 | LOGS=$5 # to save the output logs 8 | GPU_IDX=$6 # i.e. 
0 9 | 10 | PRE=$(echo "$VIDEO" | cut -d'_' -f1) 11 | 12 | cp ${SPARSE_PATH}/${VIDEO}/database.db ${VIDEO}_database.db #move the database from the sparse model 13 | mkdir ${DENSE_PATH}/${VIDEO} 14 | 15 | colmap feature_extractor \ 16 | --database_path ${VIDEO}_database.db \ 17 | --ImageReader.camera_model OPENCV \ 18 | --ImageReader.single_camera 1 \ 19 | --ImageReader.existing_camera_id 1 \ 20 | --SiftExtraction.use_gpu 1 \ 21 | --SiftExtraction.gpu_index $GPU_IDX \ 22 | --image_path ${IMAGES_ROOT}/${PRE}/${VIDEO} \ 23 | 24 | 25 | 26 | colmap sequential_matcher \ 27 | --database_path ${VIDEO}_database.db \ 28 | --SiftMatching.use_gpu 1 \ 29 | --SequentialMatching.loop_detection 1 \ 30 | --SiftMatching.gpu_index $GPU_IDX \ 31 | --SequentialMatching.vocab_tree_path vocab_bins/vocab_tree_flickr100K_words32K.bin \ 32 | 33 | 34 | colmap image_registrator \ 35 | --database_path ${VIDEO}_database.db \ 36 | --input_path ${SPARSE_PATH}/${VIDEO}/sparse/0 \ 37 | --output_path ${DENSE_PATH}/${VIDEO} \ 38 | 39 | 40 | colmap model_analyzer --path ${DENSE_PATH}/${VIDEO} > "${LOGS}/$VIDEO.out" 41 | 42 | end_reg=`date +%s` 43 | 44 | runtime=$(((end_reg-start)/60)) 45 | echo "$runtime minutes (registration time)">> "${LOGS}/$VIDEO.out" 46 | 47 | rm ${VIDEO}_database.db #remove the database since it's too large, you can keep it upon your usecase 48 | -------------------------------------------------------------------------------- /select_sparse_frames.py: -------------------------------------------------------------------------------- 1 | import subprocess 2 | import concurrent.futures 3 | import glob 4 | import os 5 | import argparse 6 | from utils.lib import * 7 | # Function to parse command-line arguments 8 | def parse_args(): 9 | parser = argparse.ArgumentParser(description='COLMAP Reconstruction Script') 10 | parser.add_argument('--input_videos', type=str, default='input_videos.txt', 11 | help='A file with list of vidoes to be processed in all stages') 12 | parser.add_argument('--epic_kithens_root', type=str, default='.', 13 | help='Path to epic kitchens images.') 14 | parser.add_argument('--sampled_images_path', type=str, default='sampled_frames', 15 | help='Path to the directory containing sampled image files.') 16 | parser.add_argument('--homography_overlap', type=float, default=0.9, 17 | help='Threshold of the homography to sample new frames, higher value samples more images') 18 | parser.add_argument('--max_concurrent', type=int, default=8, 19 | help='Max number of concurrent processes') 20 | return parser.parse_args() 21 | 22 | 23 | 24 | 25 | def main(): 26 | args = parse_args() 27 | 28 | videos = read_lines_from_file(args.input_videos) 29 | epic_root = args.epic_kithens_root 30 | params_list = [] 31 | for video in videos: 32 | video_pre = video.split('_')[0] 33 | for folder in sorted(glob.glob(os.path.join(epic_root,video_pre+'/*'))): 34 | video = folder.split('/')[-1] 35 | if video in videos: 36 | print(video) 37 | added_run = ['--src', folder, '--dst_file', '%s/%s_selected_frames.txt'%(args.sampled_images_path,video), '--overlap', str(args.homography_overlap)] 38 | if not added_run in params_list: 39 | params_list.append(added_run) 40 | 41 | if params_list: 42 | max_concurrent = args.max_concurrent 43 | # Create a process pool executor with a maximum of K processes 44 | executor = concurrent.futures.ProcessPoolExecutor(max_workers=max_concurrent) 45 | 46 | # Submit the tasks to the executor 47 | results = [] 48 | for i in range(len(params_list)): 49 | future = executor.submit(run_script, 
'homography_filter/filter.py', params_list[i % len(params_list)]) 50 | results.append(future) 51 | 52 | # Wait for all tasks to complete 53 | for r in concurrent.futures.as_completed(results): 54 | try: 55 | r.result() 56 | except Exception as e: 57 | print(f"Error occurred: {e}") 58 | 59 | # Shut down the executor 60 | executor.shutdown() 61 | 62 | 63 | if __name__ == '__main__': 64 | main() -------------------------------------------------------------------------------- /tools/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/tools/__init__.py -------------------------------------------------------------------------------- /tools/common_functions.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | """ Source: see COLMAP """ 5 | def qvec2rotmat(qvec): 6 | return np.array([ 7 | [1 - 2 * qvec[2]**2 - 2 * qvec[3]**2, 8 | 2 * qvec[1] * qvec[2] - 2 * qvec[0] * qvec[3], 9 | 2 * qvec[3] * qvec[1] + 2 * qvec[0] * qvec[2]], 10 | [2 * qvec[1] * qvec[2] + 2 * qvec[0] * qvec[3], 11 | 1 - 2 * qvec[1]**2 - 2 * qvec[3]**2, 12 | 2 * qvec[2] * qvec[3] - 2 * qvec[0] * qvec[1]], 13 | [2 * qvec[3] * qvec[1] - 2 * qvec[0] * qvec[2], 14 | 2 * qvec[2] * qvec[3] + 2 * qvec[0] * qvec[1], 15 | 1 - 2 * qvec[1]**2 - 2 * qvec[2]**2]]) 16 | 17 | 18 | 19 | def get_c2w(img_data: list) -> np.ndarray: 20 | """ 21 | Args: 22 | img_data: list, [qvec, tvec] of w2c 23 | 24 | Returns: 25 | c2w: np.ndarray, 4x4 camera-to-world matrix 26 | """ 27 | w2c = np.eye(4) 28 | w2c[:3, :3] = qvec2rotmat(img_data[:4]) 29 | w2c[:3, -1] = img_data[4:7] 30 | c2w = np.linalg.inv(w2c) 31 | return c2w 32 | -------------------------------------------------------------------------------- /tools/project_3d_line.py: -------------------------------------------------------------------------------- 1 | from typing import List, Dict 2 | import argparse 3 | import json 4 | import os 5 | import re 6 | import os.path as osp 7 | import tqdm 8 | import numpy as np 9 | 10 | import cv2 11 | from PIL import Image 12 | 13 | from tools.common_functions import qvec2rotmat 14 | 15 | 16 | class Line: 17 | """ An infinite 3D line to denote Annotated Line """ 18 | 19 | def __init__(self, line_ends: np.ndarray): 20 | """ 21 | Args: 22 | line_ends: (2, 3) 23 | points annotated using some GUI, denoting points along the desired line 24 | """ 25 | st, ed = line_ends 26 | self.vc = (st + ed) / 2 27 | self.dir = ed - st 28 | self.v0 = st 29 | self.v1 = ed 30 | 31 | def __repr__(self) -> str: 32 | return f'vc: {str(self.vc)} \ndir: {str(self.dir)}' 33 | 34 | def check_single_point(self, 35 | point: np.ndarray, 36 | radius: float) -> bool: 37 | """ 38 | point-to-line = (|(p-v_0)x(p-v_1)|)/(|v_1 - v_0|) 39 | 40 | Args: 41 | point: (3,) array of point 42 | radius: threshold for checking inside 43 | """ 44 | area2 = np.linalg.norm(np.cross(point - self.v0, point - self.v1)) 45 | base_len = np.linalg.norm(self.v1 - self.v0) 46 | d = area2 / base_len 47 | return True if d < radius else False 48 | 49 | def check_points(self, 50 | points: np.ndarray, 51 | diameter: float) -> np.ndarray: 52 | """ 53 | Args: 54 | points: (N, 3) array of points 55 | diameter: threshold for checking inside 56 | 57 | Returns: 58 | (N,) bool array 59 | """ 60 | area2 = np.linalg.norm(np.cross(points - self.v0, points - self.v1), axis=1) 61 | base_len = np.linalg.norm(self.v1 - self.v0) 62 | 
d = area2 / base_len 63 | return d < diameter 64 | 65 | 66 | def line_rectangle_check(cen, dir, rect, 67 | eps=1e-6): 68 | """ 69 | Args: 70 | cen, dir: (2,) float 71 | rect: Tuple (xmin, ymin, xmax, ymax) 72 | 73 | Returns: 74 | num_intersect: int 75 | inters: (num_intersect, 2) float 76 | """ 77 | x1, y1 = cen 78 | u1, v1 = dir 79 | xmin, ymin, xmax, ymax = rect 80 | rect_loop = np.asarray([ 81 | [xmin, ymin], [xmax, ymin], [xmax, ymax], [xmin, ymax], 82 | [xmin, ymin] 83 | ], dtype=np.float32) 84 | x2, y2 = rect_loop[:4, 0], rect_loop[:4, 1] 85 | u2 = rect_loop[1:, 0] - rect_loop[:-1, 0] 86 | v2 = rect_loop[1:, 1] - rect_loop[:-1, 1] 87 | 88 | t2 = (v1*x1 - u1*y1) - (v1*x2 - u1*y2) 89 | divisor = (v1*u2 - v2*u1) 90 | cond = np.abs(divisor) > eps 91 | 92 | t2[~cond] = -1 93 | t2[cond] = t2[cond] / divisor[cond] 94 | 95 | keep = (t2 >= 0) & (t2 <= 1) 96 | num_intersect = np.sum(keep) 97 | uv = np.stack([u2, v2], 1) 98 | inters = rect_loop[:4, :] + t2[:, None] * uv 99 | inters = inters[keep, :] 100 | return num_intersect, inters 101 | 102 | 103 | def project_line_image(line: Line, 104 | pose_data: list, 105 | camera: dict): 106 | """ Project a 3D line using camera pose and intrinsics 107 | 108 | This implementation ignores distortion. 109 | 110 | Args: 111 | line: 112 | -vc: (3,) float 113 | -dir: (3,) float 114 | pose_data: stores camera pose 115 | [qw, qx, qy, qz, tx, ty, tz, frame_name] 116 | camera: dict, stores intrinsics 117 | -width, 118 | -height 119 | -params (8,) fx, fy, cx, cy, k1, k2, p1, p2 120 | 121 | Returns: 122 | (st, ed): (2,) float 123 | """ 124 | cen, dir = line.vc, line.dir 125 | rot_w2c = qvec2rotmat(pose_data[:4]) 126 | tvec = np.asarray(pose_data[4:7]) 127 | # Represent as column vector 128 | cen = rot_w2c @ cen + tvec 129 | dir = rot_w2c @ dir 130 | width, height = camera['width'], camera['height'] 131 | fx, fy, cx, cy, k1, k2, p1, p2 = camera['params'] 132 | 133 | cen_uv = cen[:2] / cen[2] 134 | cen_uv = cen_uv * np.array([fx, fy]) + np.array([cx, cy]) 135 | dir_uv = ((dir + cen)[:2] / (dir + cen)[2]) - (cen[:2] / cen[2]) 136 | dir_uv = dir_uv * np.array([fx, fy]) 137 | dir_uv = dir_uv / np.linalg.norm(dir_uv) 138 | 139 | line2d = None 140 | num_inters, inters = line_rectangle_check( 141 | cen_uv, dir_uv, (0, 0, width, height)) 142 | if num_inters == 2: 143 | line2d = (inters[0], inters[1]) 144 | return line2d 145 | 146 | 147 | class LineProjector: 148 | 149 | COLORS = dict(yellow=(255, 255, 0),) 150 | 151 | def __init__(self, 152 | camera: Dict, 153 | images: Dict[str, List], 154 | line: Line): 155 | """ 156 | Args: 157 | camera: dict, camera info 158 | images: dict of 159 | frame_name: [qw, qx, qy, qz, tx, ty, tz] in **w2c** 160 | """ 161 | self.camera = camera 162 | self.images = images 163 | self.line = line 164 | self.line_color = self.COLORS['yellow'] 165 | 166 | def project_frame(self, frame_name: str, frames_root: str) -> np.ndarray: 167 | """ Project a line onto a frame 168 | 169 | Args: 170 | frame_idx: int. epic frame index 171 | frames_root: str. 
172 | f'{frame_root}/frame_{frame_idx:010d}.jpg' is the path to the epic-kitchens frame 173 | 174 | Returns: 175 | img: (H, W, 3) np.uint8 176 | """ 177 | pose_data = self.images[frame_name] 178 | img_path = osp.join(frames_root, frame_name) 179 | img = np.asarray(Image.open(img_path)) 180 | line_2d = project_line_image(self.line, pose_data, self.camera) 181 | if line_2d is None: 182 | return img 183 | img = cv2.line( 184 | img, np.int32(line_2d[0]), np.int32(line_2d[1]), 185 | color=self.line_color, thickness=2, lineType=cv2.LINE_AA) 186 | 187 | return img 188 | 189 | def write_mp4(self, 190 | frames_root: str, 191 | fps=5, 192 | out_name='line_output'): 193 | """ Write mp4 file that has line projected on the image frames 194 | 195 | Args: 196 | frames_root: str. 197 | f'{frame_root}/frame_{frame_idx:010d}.jpg' is the path to the epic-kitchens frame 198 | """ 199 | out_dir = os.path.join('./outputs/', out_name) 200 | os.makedirs(out_dir, exist_ok=True) 201 | fmt = os.path.join(out_dir, '{}') 202 | 203 | frames_on_disk = set(os.listdir(frames_root)) 204 | frame_names = set(self.images.keys()) 205 | if len(frames_on_disk) < len(frame_names): 206 | print(f"Showing {len(frames_on_disk)} / {len(frame_names)} frames") 207 | frame_names = frame_names.intersection(frames_on_disk) 208 | frame_names = sorted(list(frame_names)) 209 | for frame_name in tqdm.tqdm(frame_names): 210 | img = self.project_frame(frame_name, frames_root) 211 | frame_number = re.search('\d{10,}', frame_name)[0] 212 | cv2.putText(img, frame_number, 213 | (self.camera['width']//4, self.camera['height'] * 31 // 32), 214 | cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2, cv2.LINE_AA) 215 | Image.fromarray(img).save(fmt.format(frame_name)) 216 | 217 | from moviepy import editor 218 | clip = editor.ImageSequenceClip(sequence=out_dir, fps=fps) 219 | clip.write_videofile(f'./outputs/{out_name}-fps{fps}.mp4') 220 | 221 | 222 | if __name__ == '__main__': 223 | parser = argparse.ArgumentParser() 224 | parser.add_argument('--json-data', type=str, required=True) 225 | parser.add_argument('--line-data', type=str, required=True) 226 | parser.add_argument('--frames-root', type=str, required=True) 227 | parser.add_argument('--out-name', type=str, default="line_output") 228 | parser.add_argument('--fps', type=int, default=5) 229 | args = parser.parse_args() 230 | 231 | with open(args.json_data) as f: 232 | model = json.load(f) 233 | camera = model['camera'] 234 | images = model['images'] 235 | 236 | with open(args.line_data) as f: 237 | line = json.load(f) 238 | line = np.asarray(line).reshape(2, 3) 239 | line = Line(line) 240 | 241 | runner = LineProjector(camera, images, line) 242 | runner.write_mp4( 243 | frames_root=args.frames_root, fps=args.fps, out_name=args.out_name) 244 | -------------------------------------------------------------------------------- /tools/visualise_data_open3d.py: -------------------------------------------------------------------------------- 1 | import open3d as o3d 2 | import numpy as np 3 | from argparse import ArgumentParser 4 | import json 5 | 6 | from tools.common_functions import get_c2w 7 | 8 | """ Visualize poses and point-cloud stored in json file.""" 9 | 10 | def parse_args(): 11 | parser = ArgumentParser() 12 | parser.add_argument('--json-data', help='path to json data', required=True) 13 | parser.add_argument('--line-data', help='path to line data', default=None) 14 | parser.add_argument( 15 | '--num-display-poses', type=int, default=500, 16 | help='randomly display num-display-poses to avoid 
creating too many poses') 17 | parser.add_argument('--frustum-size', type=float, default=0.1) 18 | return parser.parse_args() 19 | 20 | 21 | def get_frustum(c2w: np.ndarray, 22 | sz=0.2, 23 | camera_height=None, 24 | camera_width=None, 25 | frustum_color=[1, 0, 0]) -> o3d.geometry.LineSet: 26 | """ 27 | Args: 28 | c2w: np.ndarray, 4x4 camera-to-world matrix 29 | sz: float, size (width) of the frustum 30 | Returns: 31 | frustum: o3d.geometry.TriangleMesh 32 | """ 33 | cen = [0, 0, 0] 34 | wid = sz 35 | if camera_height is not None and camera_width is not None: 36 | hei = wid * camera_height / camera_width 37 | else: 38 | hei = wid 39 | tl = [wid, hei, sz] 40 | tr = [-wid, hei, sz] 41 | br = [-wid, -hei, sz] 42 | bl = [wid, -hei, sz] 43 | points = np.float32([cen, tl, tr, br, bl]) 44 | lines = [ 45 | [0, 1], [0, 2], [0, 3], [0, 4], 46 | [1, 2], [2, 3], [3, 4], [4, 1],] 47 | frustum = o3d.geometry.LineSet() 48 | frustum.points = o3d.utility.Vector3dVector(points) 49 | frustum.lines = o3d.utility.Vector2iVector(lines) 50 | frustum.colors = o3d.utility.Vector3dVector([np.asarray([1, 0, 0])]) 51 | frustum.paint_uniform_color(frustum_color) 52 | 53 | frustum = frustum.transform(c2w) 54 | return frustum 55 | 56 | 57 | if __name__ == "__main__": 58 | args = parse_args() 59 | frustum_size = args.frustum_size 60 | 61 | vis = o3d.visualization.Visualizer() 62 | vis.create_window() 63 | with open(args.json_data, 'r') as f: 64 | model = json.load(f) 65 | 66 | """ Points """ 67 | points = model['points'] 68 | pcd_np = [v[:3] for v in points] 69 | pcd_rgb = [np.asarray(v[3:6]) / 255 for v in points] 70 | pcd = o3d.geometry.PointCloud() 71 | pcd.points = o3d.utility.Vector3dVector(pcd_np) 72 | pcd.colors = o3d.utility.Vector3dVector(pcd_rgb) 73 | vis.add_geometry(pcd, reset_bounding_box=True) 74 | 75 | """ Camear Poses """ 76 | camera = model['camera'] 77 | cam_h, cam_w = camera['height'], camera['width'] 78 | c2w_list = [get_c2w(img) for img in model['images'].values()] 79 | c2w_sel_inds = np.linspace(0, len(c2w_list)-1, args.num_display_poses).astype(int) 80 | c2w_sel = [c2w_list[i] for i in c2w_sel_inds] 81 | frustums = [ 82 | get_frustum(c2w, sz=frustum_size, camera_height=cam_h, camera_width=cam_w) 83 | for c2w in c2w_sel 84 | ] 85 | for frustum in frustums: 86 | vis.add_geometry(frustum, reset_bounding_box=True) 87 | 88 | """ Optional: Line """ 89 | if args.line_data is not None: 90 | line_set = o3d.geometry.LineSet() 91 | with open(args.line_data, 'r') as f: 92 | line_points = np.asarray(json.load(f)).reshape(2, 3) 93 | vc = line_points.mean(axis=0) 94 | dir = line_points[1] - line_points[0] 95 | lst = vc + 2 * dir 96 | led = vc - 2 * dir 97 | lines = [lst, led] 98 | line_set.points = o3d.utility.Vector3dVector(lines) 99 | line_set.lines = o3d.utility.Vector2iVector([[0, 1]]) 100 | vis.add_geometry(line_set, reset_bounding_box=True) 101 | 102 | control = vis.get_view_control() 103 | control.set_front([1, 1, 1]) 104 | control.set_lookat([0, 0, 0]) 105 | control.set_up([0, 0, 1]) 106 | control.set_zoom(1) 107 | 108 | vis.run() 109 | vis.destroy_window() 110 | -------------------------------------------------------------------------------- /tools/visualize_colmap_open3d.py: -------------------------------------------------------------------------------- 1 | import open3d as o3d 2 | import numpy as np 3 | from argparse import ArgumentParser 4 | from utils.base_type import ColmapModel 5 | from tools.visualise_data_open3d import get_c2w, get_frustum 6 | 7 | """TODO 8 | 1. Frustum, on/off 9 | 2. 
Line (saved in json) 10 | """ 11 | 12 | def parse_args(): 13 | parser = ArgumentParser() 14 | parser.add_argument('--model', help="path to direcctory containing images.bin", required=True) 15 | parser.add_argument('--pcd-path', help="path to fused.ply", default=None) 16 | parser.add_argument('--show-mesh-frame', default=False) 17 | parser.add_argument('--specify-frame-name', default=None) 18 | parser.add_argument( 19 | '--num-display-poses', type=int, default=500, 20 | help='randomly display num-display-poses to avoid creating too many poses') 21 | return parser.parse_args() 22 | 23 | if __name__ == "__main__": 24 | args = parse_args() 25 | 26 | model_path = args.model 27 | mod = ColmapModel(args.model) 28 | if args.pcd_path is not None: 29 | pcd = o3d.io.read_point_cloud(args.pcd_path) 30 | else: 31 | pcd_np = np.asarray([v.xyz for v in mod.points.values()]) 32 | pcd_rgb = np.asarray([v.rgb / 255 for v in mod.points.values()]) 33 | # Remove too far points from GUI -- usually noise 34 | pcd_np_center = np.mean(pcd_np, axis=0) 35 | pcd_ind = np.linalg.norm(pcd_np - pcd_np_center, axis=1) < 500 36 | pcd_np, pcd_rgb = pcd_np[pcd_ind], pcd_rgb[pcd_ind] 37 | 38 | pcd = o3d.geometry.PointCloud() 39 | pcd.points = o3d.utility.Vector3dVector(pcd_np) 40 | pcd.colors = o3d.utility.Vector3dVector(pcd_rgb) 41 | 42 | mesh_frame = o3d.geometry.TriangleMesh.create_coordinate_frame( 43 | size=1.0, origin=[0, 0, 0]) 44 | 45 | vis = o3d.visualization.Visualizer() 46 | vis.create_window() 47 | vis.add_geometry(pcd, reset_bounding_box=True) 48 | if args.show_mesh_frame: 49 | vis.add_geometry(mesh_frame, reset_bounding_box=True) 50 | 51 | frustum_size = 0.1 52 | camera = mod.camera 53 | cam_h, cam_w = camera.height, camera.width 54 | """ Camear Poses """ 55 | if args.specify_frame_name is not None: 56 | qvec, tvec = [ 57 | (v.qvec, v.tvec) for k, v in mod.images.items() if v.name == args.specify_frame_name][0] 58 | img_data = [qvec[0], qvec[1], qvec[2], qvec[3], tvec[0], tvec[1], tvec[2]] 59 | c2w = get_c2w(img_data) 60 | frustum = get_frustum(c2w, sz=frustum_size, camera_height=cam_h, camera_width=cam_w) 61 | vis.add_geometry(frustum, reset_bounding_box=True) 62 | else: 63 | qtvecs = [list(v.qvec) + list(v.tvec) for v in mod.images.values()] 64 | qtvecs = [qtvecs[i] 65 | for i in np.linspace(0, len(qtvecs)-1, args.num_display_poses).astype(int)] 66 | c2w_list = [get_c2w(img) for img in qtvecs] 67 | for c2w in c2w_list: 68 | frustum = get_frustum(c2w, sz=frustum_size, camera_height=cam_h, camera_width=cam_w) 69 | vis.add_geometry(frustum, reset_bounding_box=True) 70 | 71 | control = vis.get_view_control() 72 | control.set_front([1, 1, 1]) 73 | control.set_lookat([0, 0, 0]) 74 | control.set_up([0, 0, 1]) 75 | control.set_zoom(1.0) 76 | 77 | vis.run() 78 | vis.destroy_window() 79 | -------------------------------------------------------------------------------- /utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/utils/__init__.py -------------------------------------------------------------------------------- /utils/base_type.py: -------------------------------------------------------------------------------- 1 | from typing import List 2 | import json 3 | from functools import cached_property 4 | from utils.colmap_utils import ( 5 | read_cameras_binary, read_points3d_binary, 6 | read_images_binary, BaseImage) 7 | from utils.colmap_utils import Image as 
ColmapImage 8 | 9 | 10 | 11 | class ColmapModel: 12 | 13 | """ 14 | NOTE: this class shares commons codes with line_check.LineChecker, 15 | reuse these codes? 16 | """ 17 | def __init__(self, model_dir: str): 18 | 19 | def _as_list(path, func): 20 | return func(path) 21 | 22 | cameras = _as_list( 23 | f'{model_dir}/cameras.bin', read_cameras_binary) 24 | if len(cameras) != 1: 25 | print("Found more than one camera!") 26 | self.camera = cameras[1] 27 | self.points = _as_list( 28 | f'{model_dir}/points3D.bin', read_points3d_binary) 29 | self.images = _as_list( 30 | f'{model_dir}/images.bin', read_images_binary) 31 | 32 | def __repr__(self) -> str: 33 | return f'{self.num_images} images - {self.num_points} points' 34 | 35 | @property 36 | def example_data(self): 37 | ki = list(self.images.keys())[0] 38 | img = self.images[ki] 39 | kp = list(self.points.keys())[0] 40 | point = self.points[kp] 41 | return img, point 42 | 43 | @cached_property 44 | def ordered_image_ids(self): 45 | return sorted(self.images.keys(), key=lambda x: self.images[x].name) 46 | 47 | @property 48 | def num_points(self): 49 | return len(self.points) 50 | 51 | @property 52 | def num_images(self): 53 | return len(self.images) 54 | 55 | @property 56 | def ordered_images(self) -> List[BaseImage]: 57 | return [self.images[i] for i in self.ordered_image_ids] 58 | 59 | def get_image_by_id(self, image_id: int): 60 | return self.images[image_id] 61 | 62 | 63 | class JsonColmapModel: 64 | def __init__(self, json_path_or_dict): 65 | if isinstance(json_path_or_dict, str): 66 | with open(json_path_or_dict) as f: 67 | model = json.load(f) 68 | elif isinstance(json_path_or_dict, dict): 69 | model = json_path_or_dict 70 | self.camera = model['camera'] 71 | self.points = model['points'] 72 | self.images = [ 73 | model['images'][k] + [k] for k in sorted(model['images'].keys()) 74 | ] # qw, qx, qy, qz, tx, ty, tz, frame_name 75 | 76 | @property 77 | def ordered_image_ids(self): 78 | return list(range(len(self.images))) 79 | 80 | @property 81 | def ordered_images(self) -> List[ColmapImage]: 82 | return [self.get_image_by_id(i) for i in self.ordered_image_ids] 83 | 84 | def get_image_by_id(self, image_id: int) -> ColmapImage: 85 | img_info = self.images[image_id] 86 | cimg = ColmapImage( 87 | id=image_id, qvec=img_info[:4], tvec=img_info[4:7], camera_id=0, 88 | name=img_info[7], xys=[], point3D_ids=[]) 89 | return cimg 90 | -------------------------------------------------------------------------------- /utils/colmap_utils.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) 2018, ETH Zurich and UNC Chapel Hill. 2 | # All rights reserved. 3 | # 4 | # Redistribution and use in source and binary forms, with or without 5 | # modification, are permitted provided that the following conditions are met: 6 | # 7 | # * Redistributions of source code must retain the above copyright 8 | # notice, this list of conditions and the following disclaimer. 9 | # 10 | # * Redistributions in binary form must reproduce the above copyright 11 | # notice, this list of conditions and the following disclaimer in the 12 | # documentation and/or other materials provided with the distribution. 13 | # 14 | # * Neither the name of ETH Zurich and UNC Chapel Hill nor the names of 15 | # its contributors may be used to endorse or promote products derived 16 | # from this software without specific prior written permission. 
17 | # 18 | # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 19 | # AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 20 | # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 21 | # ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE 22 | # LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR 23 | # CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF 24 | # SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS 25 | # INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN 26 | # CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) 27 | # ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 28 | # POSSIBILITY OF SUCH DAMAGE. 29 | # 30 | # Author: Johannes L. Schoenberger (jsch at inf.ethz.ch) 31 | 32 | import os 33 | import sys 34 | import collections 35 | import numpy as np 36 | import struct 37 | 38 | 39 | CameraModel = collections.namedtuple( 40 | "CameraModel", ["model_id", "model_name", "num_params"]) 41 | Camera = collections.namedtuple( 42 | "Camera", ["id", "model", "width", "height", "params"]) 43 | BaseImage = collections.namedtuple( 44 | "Image", ["id", "qvec", "tvec", "camera_id", "name", "xys", "point3D_ids"]) 45 | Point3D = collections.namedtuple( 46 | "Point3D", ["id", "xyz", "rgb", "error", "image_ids", "point2D_idxs"]) 47 | 48 | class Image(BaseImage): 49 | def qvec2rotmat(self): 50 | return qvec2rotmat(self.qvec) 51 | 52 | 53 | CAMERA_MODELS = { 54 | CameraModel(model_id=0, model_name="SIMPLE_PINHOLE", num_params=3), 55 | CameraModel(model_id=1, model_name="PINHOLE", num_params=4), 56 | CameraModel(model_id=2, model_name="SIMPLE_RADIAL", num_params=4), 57 | CameraModel(model_id=3, model_name="RADIAL", num_params=5), 58 | CameraModel(model_id=4, model_name="OPENCV", num_params=8), 59 | CameraModel(model_id=5, model_name="OPENCV_FISHEYE", num_params=8), 60 | CameraModel(model_id=6, model_name="FULL_OPENCV", num_params=12), 61 | CameraModel(model_id=7, model_name="FOV", num_params=5), 62 | CameraModel(model_id=8, model_name="SIMPLE_RADIAL_FISHEYE", num_params=4), 63 | CameraModel(model_id=9, model_name="RADIAL_FISHEYE", num_params=5), 64 | CameraModel(model_id=10, model_name="THIN_PRISM_FISHEYE", num_params=12) 65 | } 66 | CAMERA_MODEL_IDS = dict([(camera_model.model_id, camera_model) \ 67 | for camera_model in CAMERA_MODELS]) 68 | 69 | 70 | def read_next_bytes(fid, num_bytes, format_char_sequence, endian_character="<"): 71 | """Read and unpack the next bytes from a binary file. 72 | :param fid: 73 | :param num_bytes: Sum of combination of {2, 4, 8}, e.g. 2, 6, 16, 30, etc. 74 | :param format_char_sequence: List of {c, e, f, d, h, H, i, I, l, L, q, Q}. 75 | :param endian_character: Any of {@, =, <, >, !} 76 | :return: Tuple of read and unpacked values. 
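    Example (formats as used later in this file): read_next_bytes(fid, 8, "Q")
    reads one little-endian uint64, e.g. the table length that prefixes the
    cameras/images/points sections; read_next_bytes(fid, 24, "iiQQ") reads a
    camera header (camera_id, model_id, width, height).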
77 | """ 78 | data = fid.read(num_bytes) 79 | return struct.unpack(endian_character + format_char_sequence, data) 80 | 81 | 82 | def read_cameras_text(path): 83 | """ 84 | see: src/base/reconstruction.cc 85 | void Reconstruction::WriteCamerasText(const std::string& path) 86 | void Reconstruction::ReadCamerasText(const std::string& path) 87 | """ 88 | cameras = {} 89 | with open(path, "r") as fid: 90 | while True: 91 | line = fid.readline() 92 | if not line: 93 | break 94 | line = line.strip() 95 | if len(line) > 0 and line[0] != "#": 96 | elems = line.split() 97 | camera_id = int(elems[0]) 98 | model = elems[1] 99 | width = int(elems[2]) 100 | height = int(elems[3]) 101 | params = np.array(tuple(map(float, elems[4:]))) 102 | cameras[camera_id] = Camera(id=camera_id, model=model, 103 | width=width, height=height, 104 | params=params) 105 | return cameras 106 | 107 | 108 | def read_cameras_binary(path_to_model_file): 109 | """ 110 | see: src/base/reconstruction.cc 111 | void Reconstruction::WriteCamerasBinary(const std::string& path) 112 | void Reconstruction::ReadCamerasBinary(const std::string& path) 113 | """ 114 | cameras = {} 115 | with open(path_to_model_file, "rb") as fid: 116 | num_cameras = read_next_bytes(fid, 8, "Q")[0] 117 | for camera_line_index in range(num_cameras): 118 | camera_properties = read_next_bytes( 119 | fid, num_bytes=24, format_char_sequence="iiQQ") 120 | camera_id = camera_properties[0] 121 | model_id = camera_properties[1] 122 | model_name = CAMERA_MODEL_IDS[camera_properties[1]].model_name 123 | width = camera_properties[2] 124 | height = camera_properties[3] 125 | num_params = CAMERA_MODEL_IDS[model_id].num_params 126 | params = read_next_bytes(fid, num_bytes=8*num_params, 127 | format_char_sequence="d"*num_params) 128 | cameras[camera_id] = Camera(id=camera_id, 129 | model=model_name, 130 | width=width, 131 | height=height, 132 | params=np.array(params)) 133 | assert len(cameras) == num_cameras 134 | return cameras 135 | 136 | 137 | def read_images_text(path): 138 | """ 139 | see: src/base/reconstruction.cc 140 | void Reconstruction::ReadImagesText(const std::string& path) 141 | void Reconstruction::WriteImagesText(const std::string& path) 142 | """ 143 | images = {} 144 | with open(path, "r") as fid: 145 | while True: 146 | line = fid.readline() 147 | if not line: 148 | break 149 | line = line.strip() 150 | if len(line) > 0 and line[0] != "#": 151 | elems = line.split() 152 | image_id = int(elems[0]) 153 | qvec = np.array(tuple(map(float, elems[1:5]))) 154 | tvec = np.array(tuple(map(float, elems[5:8]))) 155 | camera_id = int(elems[8]) 156 | image_name = elems[9] 157 | elems = fid.readline().split() 158 | xys = np.column_stack([tuple(map(float, elems[0::3])), 159 | tuple(map(float, elems[1::3]))]) 160 | point3D_ids = np.array(tuple(map(int, elems[2::3]))) 161 | images[image_id] = Image( 162 | id=image_id, qvec=qvec, tvec=tvec, 163 | camera_id=camera_id, name=image_name, 164 | xys=xys, point3D_ids=point3D_ids) 165 | return images 166 | 167 | 168 | def read_images_binary(path_to_model_file): 169 | """ 170 | see: src/base/reconstruction.cc 171 | void Reconstruction::ReadImagesBinary(const std::string& path) 172 | void Reconstruction::WriteImagesBinary(const std::string& path) 173 | """ 174 | images = {} 175 | with open(path_to_model_file, "rb") as fid: 176 | num_reg_images = read_next_bytes(fid, 8, "Q")[0] 177 | for image_index in range(num_reg_images): 178 | binary_image_properties = read_next_bytes( 179 | fid, num_bytes=64, format_char_sequence="idddddddi") 
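            # 64-byte image header, unpacked by index below:
            # int32 image_id, 7 x float64 (qw, qx, qy, qz, tx, ty, tz), int32 camera_id.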
180 | image_id = binary_image_properties[0] 181 | qvec = np.array(binary_image_properties[1:5]) 182 | tvec = np.array(binary_image_properties[5:8]) 183 | camera_id = binary_image_properties[8] 184 | image_name = "" 185 | current_char = read_next_bytes(fid, 1, "c")[0] 186 | while current_char != b"\x00": # look for the ASCII 0 entry 187 | image_name += current_char.decode("utf-8") 188 | current_char = read_next_bytes(fid, 1, "c")[0] 189 | num_points2D = read_next_bytes(fid, num_bytes=8, 190 | format_char_sequence="Q")[0] 191 | x_y_id_s = read_next_bytes(fid, num_bytes=24*num_points2D, 192 | format_char_sequence="ddq"*num_points2D) 193 | xys = np.column_stack([tuple(map(float, x_y_id_s[0::3])), 194 | tuple(map(float, x_y_id_s[1::3]))]) 195 | point3D_ids = np.array(tuple(map(int, x_y_id_s[2::3]))) 196 | images[image_id] = Image( 197 | id=image_id, qvec=qvec, tvec=tvec, 198 | camera_id=camera_id, name=image_name, 199 | xys=xys, point3D_ids=point3D_ids) 200 | return images 201 | 202 | 203 | def read_points3D_text(path): 204 | """ 205 | see: src/base/reconstruction.cc 206 | void Reconstruction::ReadPoints3DText(const std::string& path) 207 | void Reconstruction::WritePoints3DText(const std::string& path) 208 | """ 209 | points3D = {} 210 | with open(path, "r") as fid: 211 | while True: 212 | line = fid.readline() 213 | if not line: 214 | break 215 | line = line.strip() 216 | if len(line) > 0 and line[0] != "#": 217 | elems = line.split() 218 | point3D_id = int(elems[0]) 219 | xyz = np.array(tuple(map(float, elems[1:4]))) 220 | rgb = np.array(tuple(map(int, elems[4:7]))) 221 | error = float(elems[7]) 222 | image_ids = np.array(tuple(map(int, elems[8::2]))) 223 | point2D_idxs = np.array(tuple(map(int, elems[9::2]))) 224 | points3D[point3D_id] = Point3D(id=point3D_id, xyz=xyz, rgb=rgb, 225 | error=error, image_ids=image_ids, 226 | point2D_idxs=point2D_idxs) 227 | return points3D 228 | 229 | 230 | def read_points3d_binary(path_to_model_file): 231 | """ 232 | see: src/base/reconstruction.cc 233 | void Reconstruction::ReadPoints3DBinary(const std::string& path) 234 | void Reconstruction::WritePoints3DBinary(const std::string& path) 235 | """ 236 | points3D = {} 237 | with open(path_to_model_file, "rb") as fid: 238 | num_points = read_next_bytes(fid, 8, "Q")[0] 239 | for point_line_index in range(num_points): 240 | binary_point_line_properties = read_next_bytes( 241 | fid, num_bytes=43, format_char_sequence="QdddBBBd") 242 | point3D_id = binary_point_line_properties[0] 243 | xyz = np.array(binary_point_line_properties[1:4]) 244 | rgb = np.array(binary_point_line_properties[4:7]) 245 | error = np.array(binary_point_line_properties[7]) 246 | track_length = read_next_bytes( 247 | fid, num_bytes=8, format_char_sequence="Q")[0] 248 | track_elems = read_next_bytes( 249 | fid, num_bytes=8*track_length, 250 | format_char_sequence="ii"*track_length) 251 | image_ids = np.array(tuple(map(int, track_elems[0::2]))) 252 | point2D_idxs = np.array(tuple(map(int, track_elems[1::2]))) 253 | points3D[point3D_id] = Point3D( 254 | id=point3D_id, xyz=xyz, rgb=rgb, 255 | error=error, image_ids=image_ids, 256 | point2D_idxs=point2D_idxs) 257 | return points3D 258 | 259 | 260 | def read_model(path, ext): 261 | if ext == ".txt": 262 | cameras = read_cameras_text(os.path.join(path, "cameras" + ext)) 263 | images = read_images_text(os.path.join(path, "images" + ext)) 264 | points3D = read_points3D_text(os.path.join(path, "points3D") + ext) 265 | else: 266 | cameras = read_cameras_binary(os.path.join(path, "cameras" + ext)) 267 | 
images = read_images_binary(os.path.join(path, "images" + ext)) 268 | points3D = read_points3d_binary(os.path.join(path, "points3D") + ext) 269 | return cameras, images, points3D 270 | 271 | 272 | def qvec2rotmat(qvec): 273 | return np.array([ 274 | [1 - 2 * qvec[2]**2 - 2 * qvec[3]**2, 275 | 2 * qvec[1] * qvec[2] - 2 * qvec[0] * qvec[3], 276 | 2 * qvec[3] * qvec[1] + 2 * qvec[0] * qvec[2]], 277 | [2 * qvec[1] * qvec[2] + 2 * qvec[0] * qvec[3], 278 | 1 - 2 * qvec[1]**2 - 2 * qvec[3]**2, 279 | 2 * qvec[2] * qvec[3] - 2 * qvec[0] * qvec[1]], 280 | [2 * qvec[3] * qvec[1] - 2 * qvec[0] * qvec[2], 281 | 2 * qvec[2] * qvec[3] + 2 * qvec[0] * qvec[1], 282 | 1 - 2 * qvec[1]**2 - 2 * qvec[2]**2]]) 283 | 284 | 285 | def rotmat2qvec(R): 286 | Rxx, Ryx, Rzx, Rxy, Ryy, Rzy, Rxz, Ryz, Rzz = R.flat 287 | K = np.array([ 288 | [Rxx - Ryy - Rzz, 0, 0, 0], 289 | [Ryx + Rxy, Ryy - Rxx - Rzz, 0, 0], 290 | [Rzx + Rxz, Rzy + Ryz, Rzz - Rxx - Ryy, 0], 291 | [Ryz - Rzy, Rzx - Rxz, Rxy - Ryx, Rxx + Ryy + Rzz]]) / 3.0 292 | eigvals, eigvecs = np.linalg.eigh(K) 293 | qvec = eigvecs[[3, 0, 1, 2], np.argmax(eigvals)] 294 | if qvec[0] < 0: 295 | qvec *= -1 296 | return qvec -------------------------------------------------------------------------------- /utils/hovering/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/epic-kitchens/epic-fields-code/e5cd4fdfbbf130a9f6266dcb536f17bb52136f64/utils/hovering/__init__.py -------------------------------------------------------------------------------- /utils/hovering/helper.py: -------------------------------------------------------------------------------- 1 | from typing import List 2 | import os 3 | import numpy as np 4 | from PIL import Image 5 | import open3d as o3d 6 | import matplotlib.pyplot as plt 7 | from open3d.visualization import rendering 8 | 9 | 10 | from utils.hovering.o3d_line_mesh import LineMesh 11 | 12 | 13 | class Helper: 14 | base_colors = { 15 | 'white': [1, 1, 1, 0.8], 16 | 'red': [1, 0, 0, 1], 17 | 'blue': [0, 0, 1,1], 18 | 'green': [0, 1, 0,1], 19 | 'yellow': [1, 1, 0,1], 20 | 'purple': [0.2, 0.2, 0.8, 1] 21 | } 22 | 23 | def __init__(self, point_size): 24 | self.point_size = point_size 25 | 26 | def material(self, color: str, shader="defaultUnlit") -> rendering.MaterialRecord: 27 | """ 28 | Args: 29 | shader: e.g.'defaultUnlit', 'defaultLit', 'depth', 'normal' 30 | see Open3D: cpp/open3d/visualization/rendering/filament/FilamentScene.cpp#L1109 31 | """ 32 | material = rendering.MaterialRecord() 33 | material.shader = shader 34 | material.base_color = self.base_colors[color] 35 | material.point_size = self.point_size 36 | return material 37 | 38 | def get_cam_pos(c2w: np.ndarray) -> np.ndarray: 39 | """ Get camera position in world coordinate system 40 | """ 41 | cen = np.float32([0, 0, 0, 1]) 42 | pos = c2w @ cen 43 | return pos[:3] 44 | 45 | 46 | # def get_frustum(c2w: np.ndarray, 47 | # sz=0.2, 48 | # camera_height=None, 49 | # camera_width=None, 50 | # frustum_color=[1, 0, 0]) -> o3d.geometry.LineSet: 51 | # """ 52 | # Args: 53 | # c2w: np.ndarray, 4x4 camera-to-world matrix 54 | # sz: float, size (width) of the frustum 55 | # Returns: 56 | # frustum: o3d.geometry.TriangleMesh 57 | # """ 58 | # cen = [0, 0, 0] 59 | # wid = sz 60 | # if camera_height is not None and camera_width is not None: 61 | # hei = wid * camera_height / camera_width 62 | # else: 63 | # hei = wid 64 | # tl = [wid, hei, sz] 65 | # tr = [-wid, hei, sz] 66 | # br = [-wid, -hei, sz] 67 | # bl = [wid, 
-hei, sz] 68 | # points = np.float32([cen, tl, tr, br, bl]) 69 | # lines = [ 70 | # [0, 1], [0, 2], [0, 3], [0, 4], 71 | # [1, 2], [2, 3], [3, 4], [4, 1],] 72 | # frustum = o3d.geometry.LineSet() 73 | # frustum.points = o3d.utility.Vector3dVector(points) 74 | # frustum.lines = o3d.utility.Vector2iVector(lines) 75 | # frustum.colors = o3d.utility.Vector3dVector([np.asarray([1, 0, 0])]) 76 | # frustum.paint_uniform_color(frustum_color) 77 | 78 | # frustum = frustum.transform(c2w) 79 | # return frustum 80 | 81 | 82 | def get_trajectory(pos_history, 83 | num_line=6, 84 | line_radius=0.15 85 | ) -> o3d.geometry.TriangleMesh: 86 | """ pos_history: absolute position history 87 | """ 88 | pos_history = np.asarray(pos_history)[-num_line:] 89 | colors = [0, 0, 0.6] 90 | line_mesh = LineMesh( 91 | points=pos_history, 92 | colors=colors, radius=line_radius) 93 | line_mesh.merge_cylinder_segments() 94 | path = line_mesh.cylinder_segments[0] 95 | return path 96 | 97 | 98 | def get_pretty_trajectory(pos_history, 99 | num_line=6, 100 | line_radius=0.15, 101 | darkness=1.0, 102 | ) -> List[o3d.geometry.TriangleMesh]: 103 | """ pos_history: absolute position history 104 | """ 105 | def generate_jet_colors(n, darkness=0.6): 106 | cmap = plt.get_cmap('jet') 107 | norm = plt.Normalize(vmin=0, vmax=n-1) 108 | colors = cmap(norm(np.arange(n))) 109 | # Convert RGBA to RGB 110 | colors_rgb = [] 111 | for color in colors: 112 | colors_rgb.append(color[:3] * darkness) 113 | 114 | return colors_rgb 115 | 116 | pos_history = np.asarray(pos_history)[-num_line:] 117 | colors = generate_jet_colors(len(pos_history), darkness) 118 | line_mesh = LineMesh( 119 | points=pos_history, 120 | colors=colors, radius=line_radius) 121 | return line_mesh.cylinder_segments 122 | 123 | 124 | """ Obtain Viewpoint from Open3D GUI """ 125 | def parse_o3d_gui_view_status(status: dict, render: rendering.OffscreenRenderer): 126 | """ Parse open3d GUI's view status and convert to OffscreenRenderer format. 
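    Only status['trajectory'][0] is read, and only the keys used below, e.g.
    (illustrative values): {"trajectory": [{"field_of_view": 60.0,
    "lookat": [0, 0, 0], "front": [1, 1, 1], "up": [0, 0, 1], "zoom": 0.7}]};
    any other keys in the copied JSON are ignored.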
127 | This will do the normalisation of front and compute eye vector (updated version of front) 128 | 129 | 130 | Args: 131 | status: Ctrl-C output from Open3D GUI 132 | render: OffscreenRenderer 133 | Output: 134 | params for render.setup_camera(fov, lookat, eye, up) 135 | """ 136 | cam_info = status['trajectory'][0] 137 | fov = cam_info['field_of_view'] 138 | lookat = np.asarray(cam_info['lookat']) 139 | front = np.asarray(cam_info['front']) 140 | front = front / np.linalg.norm(front) 141 | up = np.asarray(cam_info['up']) 142 | zoom = cam_info['zoom'] 143 | """ 144 | See Open3D/cpp/open3d/visualization/visualizer/ViewControl.cpp#L243: 145 | void ViewControl::SetProjectionParameters() 146 | """ 147 | right = np.cross(up, front) / np.linalg.norm(np.cross(up, front)) 148 | view_ratio = zoom * render.scene.bounding_box.get_max_extent() 149 | distance = view_ratio / np.tan(fov * 0.5 / 180.0 * np.pi) 150 | eye = lookat + front * distance 151 | return fov, lookat, eye, up 152 | 153 | 154 | def set_offscreen_as_gui(render: rendering.OffscreenRenderer, status: dict): 155 | """ Set offscreen renderer as GUI's view status 156 | """ 157 | fov, lookat, eye, up = parse_o3d_gui_view_status(status, render) 158 | render.setup_camera(fov, lookat, eye, up) -------------------------------------------------------------------------------- /utils/hovering/hover_open3d.py: -------------------------------------------------------------------------------- 1 | from argparse import ArgumentParser 2 | import os 3 | import glob 4 | import numpy as np 5 | from PIL import Image 6 | from tqdm import tqdm 7 | import json 8 | import cv2 9 | import open3d as o3d 10 | from open3d.visualization import rendering 11 | 12 | from utils.base_type import ColmapModel 13 | from utils.hovering.helper import ( 14 | Helper, 15 | get_cam_pos, 16 | get_trajectory, get_pretty_trajectory, set_offscreen_as_gui 17 | ) 18 | from tools.visualise_data_open3d import get_c2w, get_frustum 19 | 20 | from moviepy import editor 21 | from PIL import ImageDraw, ImageFont 22 | 23 | 24 | TRAJECTORY_LINE_RADIUS = 0.01 25 | 26 | 27 | def parse_args(): 28 | parser = ArgumentParser() 29 | parser.add_argument('--model', help="path to direcctory containing images.bin", required=True) 30 | parser.add_argument('--pcd-path', help="path to fused.ply", default=None) 31 | parser.add_argument('--view-path', type=str, required=True, 32 | help='path to the view file, copy-paste from open3d gui.') 33 | parser.add_argument('--out_dir', type=str, default='outputs/hovering/') 34 | args = parser.parse_args() 35 | return args 36 | 37 | 38 | class HoverRunner: 39 | 40 | fov = None 41 | lookat = None 42 | front = None 43 | up = None 44 | 45 | background_color = [1, 1, 1, 1.0] 46 | 47 | def __init__(self, out_size: str = 'big'): 48 | if out_size == 'big': 49 | out_size = (1920, 1080) 50 | else: 51 | out_size = (640, 480) 52 | self.render = rendering.OffscreenRenderer(*out_size) 53 | 54 | def setup(self, 55 | model: ColmapModel, 56 | pcd_path: str, 57 | viewstatus_path: str, 58 | out_dir: str, 59 | img_x0: int = 0, 60 | img_y0: int = 0, 61 | frustum_size: float = 0.2, 62 | frustum_line_width: float = 5): 63 | """ 64 | Args: 65 | model: 66 | viewstatus_path: 67 | path to viewstatus.json, CTRL-c output from Open3D gui 68 | out_dir: 69 | e.g. 
'P34_104_out' 70 | """ 71 | self.model = model 72 | if pcd_path is not None: 73 | pcd = o3d.io.read_point_cloud(args.pcd_path) 74 | else: 75 | pcd_np = np.asarray([v.xyz for v in model.points.values()]) 76 | pcd_rgb = np.asarray([v.rgb / 255 for v in model.points.values()]) 77 | pcd = o3d.geometry.PointCloud() 78 | pcd.points = o3d.utility.Vector3dVector(pcd_np) 79 | pcd.colors = o3d.utility.Vector3dVector(pcd_rgb) 80 | self.transformed_pcd = pcd 81 | 82 | self.viewstatus_path = viewstatus_path 83 | self.out_dir = out_dir 84 | 85 | # Render Layout params 86 | # img_x0/img_y0: int. The top-left corner of the display image 87 | self.img_x0 = img_x0 88 | self.img_y0 = img_y0 89 | self.rgb_monitor_height = 456 90 | self.rgb_monitor_width = 456 91 | self.frustum_size = frustum_size 92 | self.frustum_line_width = frustum_line_width 93 | self.text_loc = (450, 1000) 94 | 95 | def test_single_frame(self, 96 | psize, 97 | img_index:int =None, 98 | clear_geometry: bool =True, 99 | lay_rgb_img: bool =True, 100 | sun_light: bool =False, 101 | show_first_frustum: bool =True, 102 | ): 103 | """ 104 | Args: 105 | psize: point size, 106 | probing a good point size is a bit tricky but very important! 107 | img_index: int. I.e. Frame number 108 | """ 109 | pcd = self.transformed_pcd 110 | 111 | if clear_geometry: 112 | self.render.scene.clear_geometry() 113 | 114 | # Get materials 115 | helper = Helper(point_size=psize) 116 | white = helper.material('white') 117 | red = helper.material('red', shader='unlitLine') 118 | red.line_width = self.frustum_line_width 119 | self.helper = helper 120 | 121 | # put on pcd 122 | self.render.scene.add_geometry('pcd', pcd, white) 123 | with open(self.viewstatus_path) as f: 124 | viewstatus = json.load(f) 125 | set_offscreen_as_gui(self.render, viewstatus) 126 | 127 | # now put frustum on canvas 128 | if img_index is None: 129 | img_index = 0 130 | c_image = self.model.ordered_images[img_index] 131 | c2w = get_c2w(list(c_image.qvec) + list(c_image.tvec)) 132 | frustum = get_frustum( 133 | c2w=c2w, sz=self.frustum_size, 134 | camera_height=self.rgb_monitor_height, 135 | camera_width=self.rgb_monitor_width) 136 | if show_first_frustum: 137 | self.render.scene.add_geometry('first_frustum', frustum, red) 138 | self.render.scene.set_background(self.background_color) 139 | 140 | if sun_light: 141 | self.render.scene.scene.set_sun_light( 142 | [0.707, 0.0, -.707], [1.0, 1.0, 1.0], 75000) 143 | self.render.scene.scene.enable_sun_light(True) 144 | else: 145 | self.render.scene.set_lighting( 146 | rendering.Open3DScene.NO_SHADOWS, (0, 0, 0)) 147 | self.render.scene.show_axes(False) 148 | 149 | img_buf = self.render.render_to_image() 150 | img = np.asarray(img_buf) 151 | test_img = self.model.read_rgb_from_name(c_image.name) 152 | test_img = cv2.resize( 153 | test_img, (self.rgb_monitor_width, self.rgb_monitor_height)) 154 | if lay_rgb_img: 155 | img[-self.rgb_monitor_height:, 156 | -self.rgb_monitor_width:] = test_img 157 | 158 | img_pil = Image.fromarray(img) 159 | I1 = ImageDraw.Draw(img_pil) 160 | myFont = ImageFont.truetype('FreeMono.ttf', 65) 161 | bbox = ( 162 | img.shape[1] - self.rgb_monitor_width, 163 | img.shape[0] - self.rgb_monitor_height, 164 | img.shape[1], 165 | img.shape[0]) 166 | # print(bbox) 167 | text = "Frame %d" % img_index 168 | I1.text(self.text_loc, text, font=myFont, fill =(0, 0, 0)) 169 | I1.rectangle(bbox, outline='red', width=5) 170 | img = np.asarray(img_pil) 171 | return img 172 | 173 | def run_all(self, step, traj_len=10): 174 | """ 175 | Args: 176 | 
step: int. Render every `step` frames 177 | traj_len: int. Number of trajectory lines to show 178 | """ 179 | render = self.render 180 | os.makedirs(self.out_dir, exist_ok=True) 181 | out_fmt = os.path.join(self.out_dir, '%010d.jpg') 182 | red_m = self.helper.material('red', shader='unlitLine') 183 | red_m.line_width = self.frustum_line_width 184 | white_m = self.helper.material('white') 185 | 186 | render.scene.remove_geometry('first_frustum') 187 | 188 | myFont = ImageFont.truetype('FreeMono.ttf', 65) 189 | bbox = (1464, 624, 1920, 1080) 190 | 191 | pos_history = [] 192 | num_images = self.model.num_images 193 | for frame_idx in tqdm(range(0, num_images, step), total=num_images//step): 194 | c_image = self.model.ordered_images[frame_idx] 195 | frame_rgb = self.model.read_rgb_from_name(c_image.name) 196 | frame_rgb = cv2.resize( 197 | frame_rgb, (self.rgb_monitor_width, self.rgb_monitor_height)) 198 | c2w = get_c2w(list(c_image.qvec) + list(c_image.tvec)) 199 | frustum = get_frustum( 200 | c2w=c2w, sz=self.frustum_size, 201 | camera_height=self.rgb_monitor_height, 202 | camera_width=self.rgb_monitor_width) 203 | pos_history.append(get_cam_pos(c2w)) 204 | 205 | if len(pos_history) > 2: 206 | # lines = get_pretty_trajectory( 207 | traj = get_trajectory( 208 | pos_history, num_line=traj_len, 209 | line_radius=TRAJECTORY_LINE_RADIUS) 210 | if render.scene.has_geometry('traj'): 211 | render.scene.remove_geometry('traj') 212 | render.scene.add_geometry('traj', traj, white_m) 213 | render.scene.add_geometry('frustum', frustum, red_m) 214 | 215 | img = render.render_to_image() 216 | img = np.asarray(img) 217 | img[-self.rgb_monitor_height:, 218 | -self.rgb_monitor_width:] = frame_rgb 219 | img_pil = Image.fromarray(img) 220 | 221 | I1 = ImageDraw.Draw(img_pil) 222 | text = "Frame %d" % frame_idx 223 | I1.text(self.text_loc, text, font=myFont, fill =(0, 0, 0)) 224 | I1.rectangle(bbox, outline='red', width=5) 225 | img_pil.save(out_fmt % frame_idx) 226 | 227 | render.scene.remove_geometry('frustum') 228 | 229 | # Gen output 230 | video_fps = 20 231 | print("Generating video...") 232 | seq = sorted(glob.glob(os.path.join(self.out_dir, '*.jpg'))) 233 | clip = editor.ImageSequenceClip(seq, fps=video_fps) 234 | clip.write_videofile(os.path.join(self.out_dir, 'out.mp4')) 235 | 236 | 237 | if __name__ == '__main__': 238 | args = parse_args() 239 | model = ColmapModel(args.model) 240 | model.read_rgb_from_name = \ 241 | lambda name: np.asarray(Image.open(f"outputs/demo/frames/{name}")) 242 | runner = HoverRunner() 243 | runner.setup( 244 | model, 245 | pcd_path=args.pcd_path, 246 | viewstatus_path=args.view_path, 247 | out_dir=args.out_dir, 248 | frustum_size=1, 249 | frustum_line_width=1) 250 | runner.test_single_frame(0.1) 251 | runner.run_all(step=3, traj_len=10) 252 | -------------------------------------------------------------------------------- /utils/hovering/o3d_line_mesh.py: -------------------------------------------------------------------------------- 1 | """Module which creates mesh lines from a line set 2 | Open3D relies upon using glLineWidth to set line width on a LineSet 3 | However, this method is now deprecated and not fully supporeted in newer OpenGL versions 4 | See: 5 | Open3D Github Pull Request - https://github.com/intel-isl/Open3D/pull/738 6 | Other Framework Issues - https://github.com/openframeworks/openFrameworks/issues/3460 7 | 8 | This module aims to solve this by converting a line into a triangular mesh (which has thickness) 9 | The basic idea is to create a cylinder for 
/utils/hovering/o3d_line_mesh.py:
--------------------------------------------------------------------------------
1 | """Module which creates mesh lines from a line set
2 | Open3D relies upon using glLineWidth to set line width on a LineSet
3 | However, this method is now deprecated and not fully supported in newer OpenGL versions
4 | See:
5 |     Open3D Github Pull Request - https://github.com/intel-isl/Open3D/pull/738
6 |     Other Framework Issues - https://github.com/openframeworks/openFrameworks/issues/3460
7 | 
8 | This module aims to solve this by converting a line into a triangular mesh (which has thickness)
9 | The basic idea is to create a cylinder for each line segment, translate it, and then rotate it.
10 | 
11 | License: MIT
12 | 
13 | """
14 | import numpy as np
15 | import open3d as o3d
16 | 
17 | 
18 | def align_vector_to_another(a=np.array([0, 0, 1]), b=np.array([1, 0, 0])):
19 |     """
20 |     Aligns vector a to vector b with axis angle rotation
21 |     """
22 |     if np.array_equal(a, b):
23 |         return None, None
24 |     axis_ = np.cross(a, b)
25 |     axis_ = axis_ / np.linalg.norm(axis_)
26 |     angle = np.arccos(np.dot(a, b))
27 | 
28 |     return axis_, angle
29 | 
30 | 
31 | def normalized(a, axis=-1, order=2):
32 |     """Normalizes a numpy array of points"""
33 |     l2 = np.atleast_1d(np.linalg.norm(a, order, axis))
34 |     l2[l2 == 0] = 1
35 |     return a / np.expand_dims(l2, axis), l2
36 | 
37 | 
38 | class LineMesh(object):
39 |     def __init__(self, points, lines=None, colors=[0, 1, 0], radius=0.15):
40 |         """Creates a line represented as a sequence of cylinder triangular meshes
41 | 
42 |         Arguments:
43 |             points {ndarray} -- Numpy array of points Nx3.
44 | 
45 |         Keyword Arguments:
46 |             lines {list[list] or None} -- List of point index pairs denoting line segments. If None, implicit lines from ordered pairwise points. (default: {None})
47 |             colors {list} -- list of colors, or single color of the line (default: {[0, 1, 0]})
48 |             radius {float} -- radius of cylinder (default: {0.15})
49 |         """
50 |         self.points = np.array(points)
51 |         self.lines = np.array(
52 |             lines) if lines is not None else self.lines_from_ordered_points(self.points)
53 |         self.colors = np.array(colors)
54 |         self.radius = radius
55 |         self.cylinder_segments = []
56 | 
57 |         self.create_line_mesh()
58 | 
59 |     @staticmethod
60 |     def lines_from_ordered_points(points):
61 |         lines = [[i, i + 1] for i in range(0, points.shape[0] - 1, 1)]
62 |         return np.array(lines)
63 | 
64 |     def create_line_mesh(self):
65 |         first_points = self.points[self.lines[:, 0], :]
66 |         second_points = self.points[self.lines[:, 1], :]
67 |         line_segments = second_points - first_points
68 |         line_segments_unit, line_lengths = normalized(line_segments)
69 | 
70 |         z_axis = np.array([0, 0, 1])
71 |         # Create triangular mesh cylinder segments of line
72 |         for i in range(line_segments_unit.shape[0]):
73 |             line_segment = line_segments_unit[i, :]
74 |             line_length = line_lengths[i]
75 |             # get axis angle rotation to align cylinder with line segment
76 |             axis, angle = align_vector_to_another(z_axis, line_segment)
77 |             # Get translation vector
78 |             translation = first_points[i, :] + line_segment * line_length * 0.5
79 |             # create cylinder and apply transformations
80 |             cylinder_segment = o3d.geometry.TriangleMesh.create_cylinder(
81 |                 self.radius, line_length)
82 |             cylinder_segment = cylinder_segment.translate(
83 |                 translation, relative=False)
84 |             if axis is not None:
85 |                 axis_a = axis * angle
86 |                 rot = o3d.geometry.get_rotation_matrix_from_axis_angle(axis_a)
87 |                 cylinder_segment = cylinder_segment.rotate(
88 |                     R=rot, center=cylinder_segment.get_center())
89 |                 # cylinder_segment = cylinder_segment.rotate(
90 |                 #   axis_a, center=True, type=o3d.geometry.RotationType.AxisAngle)
91 |             # color cylinder
92 |             color = self.colors if self.colors.ndim == 1 else self.colors[i, :]
93 |             cylinder_segment.paint_uniform_color(color)
94 | 
95 |             self.cylinder_segments.append(cylinder_segment)
96 | 
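    # The per-segment cylinders built above can be added to Open3D one by one,
    # but collapsing them into a single TriangleMesh (merge_cylinder_segments
    # below) keeps each polyline down to one geometry, which is much cheaper to
    # add, update and remove than hundreds of small meshes.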
97 |     def merge_cylinder_segments(self):
98 | 
99 |         vertices_list = [np.asarray(mesh.vertices) for mesh in self.cylinder_segments]
100 |         triangles_list = [np.asarray(mesh.triangles) for mesh in self.cylinder_segments]
101 |         triangles_offset = np.cumsum([v.shape[0] for v in vertices_list])
102 |         triangles_offset = np.insert(triangles_offset, 0, 0)[:-1]
103 | 
104 |         vertices = np.vstack(vertices_list)
105 |         triangles = np.vstack([triangle + offset for triangle, offset in zip(triangles_list, triangles_offset)])
106 | 
107 |         merged_mesh = o3d.geometry.TriangleMesh(o3d.utility.Vector3dVector(vertices),
108 |                                                 o3d.utility.Vector3iVector(triangles))
109 |         color = self.colors if self.colors.ndim == 1 else self.colors[0]
110 |         merged_mesh.paint_uniform_color(color)
111 |         self.cylinder_segments = [merged_mesh]
112 | 
113 |     def add_line(self, vis):
114 |         """Adds this line to the visualizer"""
115 |         for cylinder in self.cylinder_segments:
116 |             vis.add_geometry(cylinder)
117 | 
118 |     def remove_line(self, vis):
119 |         """Removes this line from the visualizer"""
120 |         for cylinder in self.cylinder_segments:
121 |             vis.remove_geometry(cylinder)
122 | 
123 | 
124 | def main():
125 |     print("Demonstrating LineMesh vs LineSet")
126 |     # Create Line Set
127 |     points = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1], [1, 0, 1],
128 |               [0, 1, 1], [1, 1, 1]]
129 |     lines = [[0, 1], [0, 2], [1, 3], [2, 3], [4, 5], [4, 6], [5, 7], [6, 7],
130 |              [0, 4], [1, 5], [2, 6], [3, 7]]
131 |     colors = [[1, 0, 0] for i in range(len(lines))]
132 | 
133 |     line_set = o3d.geometry.LineSet()
134 |     line_set.points = o3d.utility.Vector3dVector(points)
135 |     line_set.lines = o3d.utility.Vector2iVector(lines)
136 |     line_set.colors = o3d.utility.Vector3dVector(colors)
137 | 
138 |     # Create Line Mesh 1
139 |     points = np.array(points) + [0, 0, 2]
140 |     line_mesh1 = LineMesh(points, lines, colors, radius=0.02)
141 |     line_mesh1_geoms = line_mesh1.cylinder_segments
142 | 
143 |     # Create Line Mesh 2
144 |     points = np.array(points) + [0, 2, 0]
145 |     line_mesh2 = LineMesh(points, radius=0.03)
146 |     line_mesh2_geoms = line_mesh2.cylinder_segments
147 | 
148 |     o3d.visualization.draw_geometries(
149 |         [line_set, *line_mesh1_geoms, *line_mesh2_geoms])
150 | 
151 | 
152 | if __name__ == "__main__":
153 |     main()
154 | 
155 | 
--------------------------------------------------------------------------------
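
`LineMesh` is what gives rendered trajectories real thickness: every segment of a polyline becomes a cylinder with an actual radius instead of a one-pixel GL line. A minimal usage sketch with made-up positions (the hovering renderer builds its own trajectory geometry through its helpers, which are not shown in this dump):

```python
import numpy as np
import open3d as o3d

from utils.hovering.o3d_line_mesh import LineMesh

# Hypothetical camera centres along a short path; replace with real positions.
positions = np.array([
    [0.0, 0.0, 0.0],
    [0.2, 0.0, 0.1],
    [0.4, 0.1, 0.2],
    [0.6, 0.1, 0.2],
])

# Consecutive points become cylinder segments with a visible thickness.
traj = LineMesh(positions, colors=[1.0, 1.0, 1.0], radius=0.01)
traj.merge_cylinder_segments()  # collapse the segments into one TriangleMesh
o3d.visualization.draw_geometries(traj.cylinder_segments)
```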
/utils/lib.py:
--------------------------------------------------------------------------------
1 | import pycolmap
2 | import shutil
3 | import os
4 | import glob
5 | import subprocess
6 | 
7 | def get_num_images(model_path):
8 |     reconstruction = pycolmap.Reconstruction(model_path)
9 |     num_images = reconstruction.num_images()
10 |     return num_images
11 | 
12 | def read_lines_from_file(filename):
13 |     """
14 |     Read lines from a txt file and return them as a list.
15 | 
16 |     :param filename: Name of the file to read from.
17 |     :return: List of lines from the file.
18 |     """
19 |     with open(filename, 'r') as file:
20 |         lines = file.readlines()
21 | 
22 |     # Strip any trailing newline characters
23 |     return [line.strip() for line in lines]
24 | 
25 | def keep_model_with_largest_images(reconstruction_path):
26 |     all_models = sorted(glob.glob(os.path.join(reconstruction_path, '*')))
27 |     try:
28 |         max_images = get_num_images(all_models[0])
29 |     except Exception:
30 |         return 0
31 |     selected_model = all_models[0]
32 |     if len(all_models) > 1:
33 |         for model in all_models:
34 |             num_images = get_num_images(model)
35 |             if num_images > max_images:
36 |                 max_images = num_images
37 |                 selected_model = model
38 | 
39 |     for model in all_models:
40 |         if model != selected_model:
41 |             shutil.rmtree(model)
42 |     os.rename(selected_model, os.path.join(reconstruction_path, '0'))
43 |     return max_images
44 | 
45 | # Define the function to execute in each process
46 | def run_script(script_path, arg):
47 |     cmd = ['python3', script_path] + arg
48 |     print(cmd)
49 |     subprocess.call(cmd)
--------------------------------------------------------------------------------
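
`run_script` above is meant to be executed once per worker process: it simply shells out to `python3`, so a small driver can fan the reconstruction scripts out over many videos with a process pool. A minimal sketch of such a driver, assuming each line of `input_videos.txt` is passed to `reconstruct_sparse.py` as a single command-line argument (check that script's argparser for the flags it actually expects):

```python
from multiprocessing import Pool

from utils.lib import read_lines_from_file, run_script

if __name__ == '__main__':
    videos = read_lines_from_file('input_videos.txt')
    # One (script, argv) pair per video; run_script expects argv as a list.
    jobs = [('reconstruct_sparse.py', [video]) for video in videos]
    with Pool(processes=4) as pool:
        pool.starmap(run_script, jobs)
```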