├── README.md ├── __init__.py ├── configs ├── box.yaml ├── nerf_fern.yaml ├── nerf_lego.yaml ├── nerf_lego_highres.yaml ├── nerf_materials.yaml ├── nerf_materials_highres.yaml ├── points_surface.yaml ├── sphere.yaml ├── sphere_surface.yaml ├── torus_surface.yaml ├── train_box.yaml ├── train_sphere.yaml └── volsdf_surface.yaml ├── data ├── box_0.npy ├── box_0.png ├── box_1.npy ├── box_1.png ├── box_2.npy ├── box_2.png ├── box_3.npy ├── box_3.png ├── bridge_pointcloud.npz └── bunny_pointcloud.npz ├── data_utils.py ├── dataset.py ├── environment.yml ├── images └── .ignore ├── implicit.py ├── losses.py ├── ray_utils.py ├── render_functions.py ├── renderer.py ├── requirements.txt ├── sampler.py ├── surface_rendering_main.py ├── ta_images ├── color.png ├── depth.png ├── grid.png ├── part_1.gif ├── part_2.gif ├── part_2_after_training_0.png ├── part_2_after_training_1.png ├── part_2_after_training_2.png ├── part_2_after_training_3.png ├── part_2_before_training_0.png ├── part_2_before_training_1.png ├── part_2_before_training_2.png ├── part_2_before_training_3.png ├── part_3.gif ├── part_5.gif ├── part_6.gif ├── part_6_input.gif ├── part_7.gif ├── part_7_geometry.gif ├── rays.png ├── sample_points.png └── transmittance.png ├── transmittance_calculation ├── a3_transmittance.pdf ├── figure1.png └── main.tex └── volume_rendering_main.py /README.md: -------------------------------------------------------------------------------- 1 | Assignment 3 : Neural Volume Rendering and Surface Rendering 2 | =================================== 3 | Goals: In this assignment, you will setup a differentiable rendering pipeline and implement neural volume/surface rendering techniques like NeRF and VolSDF. 4 | 5 | ## Table of Contents 6 | - [Setup](#setup) 7 | - [A. Neural Volume Rendering (80 points)](#a-neural-volume-rendering-80-points) 8 | - [0. Transmittance Calculation (10)](#0-transmittance-calculation-10-points) 9 | - [1. Differentiable Volume Rendering (30)](#1-differentiable-volume-rendering) 10 | - [2. Optimizing a Basic Implicit Volume (10)](#2-optimizing-a-basic-implicit-volume) 11 | - [3. Optimizing a Neural Radiance Field (NeRF) (20)](#3-optimizing-a-neural-radiance-field-nerf-20-points) 12 | - [4. NeRF Extras (10 + 10 Extra)](#4-nerf-extras-choose-one-more-than-one-is-extra-credit) 13 | - [B. Neural Surface Rendering (50 points)](#b-neural-surface-rendering-50-points) 14 | - [5. Sphere Tracing (10)](#5-sphere-tracing-10-points) 15 | - [6. Optimizing a Neural SDF (15)](#6-optimizing-a-neural-sdf-15-points) 16 | - [7. VolSDF (15)](#7-volsdf-15-points) 17 | - [8. Neural Surface Extras (10 + 20 Extra)](#8-neural-surface-extras-choose-one-more-than-one-is-extra-credit) 18 | 19 | 20 | 21 | ## Setup 22 | 23 | ### Environment Setup 24 | You can use the python environment you've set up for past assignments, but if you're starting fresh, please follow the instructions from Assignment 1 to get an environment with `torch` and `pytorch3d` up and running. This assignment needs a few additional packages, that can be installed with - 25 | ```bash 26 | pip install -r requirements.txt 27 | ``` 28 | 29 | ### Data 30 | 31 | Most of the data for this assignment is provided in the github repo under `data/`. One of the assets (materials scene for Q4.1) is large, so you can download and unzip it as follows - 32 | ```bash 33 | sudo apt install git-lfs 34 | git lfs install 35 | 36 | git clone https://huggingface.co/datasets/learning3dvision/nerf_materials 37 | cd nerf_materials 38 | unzip materials.zip -d 39 | ``` 40 | # A. 
Neural Volume Rendering (80 points) 41 | 42 | ## 0. Transmittance Calculation (10 points) 43 | Transmittance calculation is a core part of implementing volume rendering. Your first task is to compute the transmittance of a ray going through a non-homogeneous medium (shown in the image below). 44 | Please compute the transmittance in `transmittance_calculation/a3_transmittance.pdf` and submit the result on your assignment website. You can either hand-write the result or edit the tex file and show a screenshot on your webpage, as long as it is readable by the TAs. 45 | 46 | ![Transmittance computation](transmittance_calculation/figure1.png) 47 | 48 | ## 1. Differentiable Volume Rendering 49 | 50 | In the emission-absorption (EA) model described in class, volumes are typically described by their *appearance* (e.g. emission) and *geometry* (absorption) at *every point* in 3D space. For part 1 of the assignment, you will implement a ***Differentiable Renderer*** for EA volumes, which you will use in parts 2 and 3. Differentiable renderers are extremely useful for 3D learning problems --- one reason is that they allow you to optimize scene parameters (i.e. perform inverse rendering) from image supervision only! 51 | 52 | ## 1.1. Familiarize yourself with the code structure 53 | 54 | There are four major components of our differentiable volume rendering pipeline: 55 | 56 | * ***The camera***: `pytorch3d.CameraBase` 57 | * ***The scene***: `SDFVolume` in `implicit.py` 58 | * ***The sampling routine***: `StratifiedSampler` in `sampler.py` 59 | * ***The renderer***: `VolumeRenderer` in `renderer.py` 60 | 61 | `StratifiedSampler` provides a method for sampling multiple points along a ray traveling through the scene (also known as *raymarching*). Together, a sampler and a renderer describe a rendering pipeline. Like traditional graphics pipelines, this rendering procedure is independent of the scene and camera. 62 | 63 | The scene, sampler, and renderer are all packaged together under the `Model` class in `volume_rendering_main.py`. In particular, the `Model`'s forward method invokes a `VolumeRenderer` instance with a sampling strategy and a volume as input. 64 | 65 | Also, take a look at the `RayBundle` class in `ray_utils.py`, which provides a convenient wrapper around several per-ray inputs to the volume rendering procedure. 66 | 67 | ## 1.2. Outline of tasks 68 | 69 | In order to perform rendering, you will implement the following routines: 70 | 71 | 1. **Ray sampling from cameras**: you will fill out methods in `ray_utils.py` to generate world space rays from a particular camera. 72 | 2. **Point sampling along rays**: you will fill out the `StratifiedSampler` class to generate sample points along each world space ray. 73 | 3. **Rendering**: you will fill out the `VolumeRenderer` class to *evaluate* a volume function at each sample point along a ray, and aggregate these evaluations to perform rendering. 74 | 75 | ## 1.3. Ray sampling (5 points) 76 | 77 | Take a look at the `render_images` function in `volume_rendering_main.py`. It loops through a set of cameras, generates rays for each pixel on a camera, and renders these rays using a `Model` instance. 78 | 79 | ### Implementation 80 | 81 | Your first task is to implement: 82 | 83 | 1. `get_pixels_from_image` in `ray_utils.py` and 84 | 2. `get_rays_from_pixels` in `ray_utils.py` 85 | 86 | which are used in `render_images`: 87 | 88 | ```python 89 | xy_grid = get_pixels_from_image(image_size, camera) # TODO: implement in ray_utils.py 90 | ray_bundle = get_rays_from_pixels(xy_grid, camera) # TODO: implement in ray_utils.py 91 | ``` 92 | 93 | The `get_pixels_from_image` method generates pixel coordinates ranging from `[-1, 1]` for each pixel in an image. The `get_rays_from_pixels` method generates rays for each pixel by mapping from a camera's *Normalized Device Coordinate (NDC) Space* into world space.
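For reference, a minimal sketch of what these two functions might look like is shown below. Treat it as an assumed outline rather than a reference solution: the sign conventions, the grid ordering, and the exact behavior of the PyTorch3D calls (`camera.unproject_points`, `camera.get_camera_center`) should be checked against the starter code in `ray_utils.py`.

```python
import torch

def get_pixels_from_image(image_size, camera):
    # One (x, y) coordinate per pixel, mapped to the NDC-style range [-1, 1].
    W, H = image_size[0], image_size[1]
    x = torch.linspace(-1.0, 1.0, W)
    y = torch.linspace(-1.0, 1.0, H)
    x_grid = x.view(1, W).expand(H, W)
    y_grid = y.view(H, 1).expand(H, W)
    return torch.stack([x_grid, y_grid], dim=-1).reshape(-1, 2)

def get_rays_from_pixels(xy_grid, camera):
    # Lift pixel coordinates onto the image plane (depth 1), unproject them to
    # world space, and form rays from the camera center through those points.
    ndc_points = torch.cat([xy_grid, torch.ones_like(xy_grid[..., :1])], dim=-1)
    world_points = camera.unproject_points(ndc_points)   # assumed to accept NDC xy + depth
    rays_o = camera.get_camera_center().expand(world_points.shape[0], 3)
    rays_d = torch.nn.functional.normalize(world_points - rays_o, dim=-1)
    return rays_o, rays_d                                 # wrap these in a RayBundle
```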
94 | 95 | ### Visualization 96 | 97 | You can run the code for part 1 with: 98 | 99 | ```bash 100 | # mkdir images (uncomment when running for the first time) 101 | python volume_rendering_main.py --config-name=box 102 | ``` 103 | 104 | Once you have implemented these methods, verify that your output matches the TA output by visualizing both `xy_grid` and `rays` with the `vis_grid` and `vis_rays` functions in the `render_images` function in `volume_rendering_main.py`. **By default, the above command will crash and return an error**. However, it should reach your visualization code before it does. The outputs of grid/ray visualization should look like this: 105 | 106 | ![Grid](ta_images/grid.png) ![Rays](ta_images/rays.png) 107 | 108 | ## 1.4. Point sampling (5 points) 109 | 110 | ### Implementation 111 | 112 | Your next task is to fill out `StratifiedSampler` in `sampler.py`. Implement the forward method (a sketch follows this list), which: 113 | 114 | 1. Generates a set of distances between `near` and `far` and 115 | 2. Uses these distances to sample points offset from ray origins (`RayBundle.origins`) along ray directions (`RayBundle.directions`). 116 | 3. Stores the sample points and distances in `RayBundle.sample_points` and `RayBundle.sample_lengths`, respectively.
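A minimal sketch of this stratified sampling step is given below, assuming `origins` and `directions` have shape `(num_rays, 3)` and that `n_pts_per_ray`, `min_depth`, and `max_depth` come from the sampler config; adapt the shapes and the `_replace` call to your own `RayBundle` conventions.

```python
import torch

def stratified_sample(ray_bundle, n_pts_per_ray, min_depth, max_depth):
    """Sketch of StratifiedSampler.forward; shapes and conventions are assumptions."""
    device = ray_bundle.origins.device
    num_rays = ray_bundle.origins.shape[0]

    # 1) Split [min_depth, max_depth] into n_pts_per_ray bins and draw one
    #    uniform sample per bin (stratified sampling along the ray).
    bins = torch.linspace(min_depth, max_depth, n_pts_per_ray + 1, device=device)
    lower, upper = bins[:-1], bins[1:]
    z_vals = lower + (upper - lower) * torch.rand(num_rays, n_pts_per_ray, device=device)

    # 2) Offset points from the ray origins along the ray directions: o + t * d.
    sample_points = (
        ray_bundle.origins[:, None, :] + z_vals[..., None] * ray_bundle.directions[:, None, :]
    )  # (num_rays, n_pts_per_ray, 3)

    # 3) Store both back on the bundle; how sample_lengths is shaped/broadcast
    #    is up to your RayBundle conventions (here: one depth per sample point).
    return ray_bundle._replace(sample_points=sample_points, sample_lengths=z_vals[..., None])
```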
117 | 118 | ### Visualization 119 | 120 | Once you have done this, use the `render_points` method in `render_functions.py` in order to visualize the point samples from the first camera. They should look like this: 121 | 122 | ![Sample points](ta_images/sample_points.png) 123 | 124 | ## 1.5. Volume rendering (20 points) 125 | 126 | Finally, we can implement volume rendering! With the `configs/box.yaml` configuration, we provide you with an `SDFVolume` instance describing a box. You can check out the code for this function in `implicit.py`, which converts a signed distance function into a volume. If you want, you can even implement your own `SDFVolume` variants by creating a new signed distance function class and adding it to `sdf_dict` in `implicit.py`. Take a look at [this great web page](https://iquilezles.org/articles/distfunctions/) for formulas for various simple and complex SDFs. 127 | 128 | 129 | ### Implementation 130 | 131 | You will implement: 132 | 133 | 1. `VolumeRenderer._compute_weights` and 134 | 2. `VolumeRenderer._aggregate`. 135 | 3. You will also modify the `VolumeRenderer.forward` method to render a depth map in addition to color from a volume. 136 | 137 | From each volume evaluation you will get both a volume density and a color: 138 | 139 | ```python 140 | # Call implicit function with sample points 141 | implicit_output = implicit_fn(cur_ray_bundle) 142 | density = implicit_output['density'] 143 | feature = implicit_output['feature'] 144 | ``` 145 | 146 | You'll then use the following equation to render color along a ray: 147 | 148 | ![Equation](ta_images/color.png) 149 | 150 | where `σ` is the density, `Δt` is the length of the current ray segment, and `L_e` is the color: 151 | 152 | ![Transmittance](ta_images/transmittance.png) 153 | 154 | Compute the weights `T * (1 - exp(-σ * Δt))` in `VolumeRenderer._compute_weights`, and perform the summation in `VolumeRenderer._aggregate`. Note that for the first segment `T = 1`. 155 | 156 | Use the weights and the aggregation function to render both *color* and *depth* (the per-sample depths are stored in `RayBundle.sample_lengths`).
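As a rough guide, these two pieces might look like the sketch below, assuming `density` and `deltas` both have shape `(num_rays, n_pts_per_ray, 1)`; the exact shapes, and where `deltas` is computed, depend on how your `VolumeRenderer` is organized.

```python
import torch

def compute_weights(deltas, density, eps=1e-10):
    # Per-segment opacity: 1 - exp(-sigma * delta_t).
    alpha = 1.0 - torch.exp(-density * deltas)                    # (num_rays, n_pts, 1)

    # Transmittance T_i = prod_{j < i} exp(-sigma_j * delta_j); the leading ones
    # enforce T = 1 for the first segment (exclusive cumulative product).
    trans = torch.cumprod(1.0 - alpha + eps, dim=-2)
    trans = torch.cat([torch.ones_like(trans[..., :1, :]), trans[..., :-1, :]], dim=-2)

    return trans * alpha                                           # weights w_i

def aggregate(weights, rays_feature):
    # Weighted sum over samples; works for color (3 channels) or depth (1 channel).
    return torch.sum(weights * rays_feature, dim=-2)
```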
157 | 158 | ### Visualization 159 | 160 | By default, your results will be written out to `images/part_1.gif`. Provide a visualization of the depth in your write-up. Note that the depth should be normalized by its maximum value. 161 | 162 | ![Spiral Rendering of Part 1](ta_images/part_1.gif) ![Depth](ta_images/depth.png) 163 | 164 | 165 | ## 2. Optimizing a basic implicit volume 166 | 167 | ## 2.1. Random ray sampling (5 points) 168 | 169 | Since you have now implemented a differentiable volume renderer, we can use it to optimize the parameters of a volume! We have provided a basic training loop in the `train` method in `volume_rendering_main.py`. 170 | 171 | Depending on how many sample points we take for each ray, volume rendering can consume a lot of memory on the GPU (especially during the backward pass of gradient descent). Because of this, it usually makes sense to sample a subset of rays from a full image for each training iteration. In order to do this, implement the `get_random_pixels_from_image` method in `ray_utils.py`, invoked here: 172 | 173 | ```python 174 | xy_grid = get_random_pixels_from_image(cfg.training.batch_size, image_size, camera) # TODO: implement in ray_utils.py 175 | ``` 176 | 177 | ## 2.2. Loss and training (5 points) 178 | Replace the loss in `train` 179 | 180 | ```python 181 | loss = None 182 | ``` 183 | 184 | with the mean squared error between the predicted colors and the ground truth colors `rgb_gt`. 185 | 186 | Once you've done this, you can train a model with 187 | 188 | ```bash 189 | python volume_rendering_main.py --config-name=train_box 190 | ``` 191 | 192 | This will optimize the position and side lengths of a box, given a few ground truth images with known camera poses (in the `data` folder). Report the center of the box and the side lengths of the box after training, rounded to the nearest `1/100`. 193 | 194 | ## 2.3. Visualization 195 | 196 | The code renders a spiral sequence of the optimized volume in `images/part_2.gif`. Compare this gif to the one below, and attach it in your write-up: 197 | 198 | ![Spiral Rendering of Part 2](ta_images/part_2.gif) 199 | 200 | 201 | ## 3. Optimizing a Neural Radiance Field (NeRF) (20 points) 202 | In this part, you will implement an implicit volume as a Multi-Layer Perceptron (MLP) in the `NeuralRadianceField` class in `implicit.py`. This MLP should map 3D position to volume density and color. Specifically: 203 | 204 | 1. Your MLP should take in a `RayBundle` object in its forward method, and produce color and density for each sample point in the `RayBundle`. 205 | 2. You should also fill out the loss in `train_nerf` in the `volume_rendering_main.py` file. 206 | 207 | You will then use this implicit volume to optimize a scene from a set of RGB images. We have implemented data loading, training, and checkpointing for you, but this part will still require you to do a bit more legwork than for Parts 1 and 2. You will have to write the code for the MLP yourself --- feel free to reference the NeRF paper, though you should not directly copy code from an external repository. 208 | 209 | ## Implementation 210 | 211 | Here are a few things to note (a sketch follows this list): 212 | 213 | 1. For now, your NeRF MLP does not need to handle *view dependence*, and can solely depend on 3D position. 214 | 2. You should use the `ReLU` activation to map the first network output to density (to ensure that density is non-negative). 215 | 3. You should use the `Sigmoid` activation to map the remaining raw network outputs to color. 216 | 4. You can use *Positional Encoding* of the input to the network to achieve higher quality. We provide an implementation of positional encoding in the `HarmonicEmbedding` class in `implicit.py`.
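A minimal position-only version of `NeuralRadianceField` might look like the sketch below. The harmonic-embedding and hidden-width settings reuse the config fields from `configs/nerf_lego.yaml`, and the output dictionary keys follow the `density`/`feature` convention used by `SDFVolume`; the trunk depth, the plain `torch.nn.Sequential` (instead of the provided `MLPWithInputSkips`), and the flattened output shapes are assumptions for you to adapt.

```python
import torch
from implicit import HarmonicEmbedding  # provided in the starter code; in practice this class lives in implicit.py

class NeuralRadianceField(torch.nn.Module):
    def __init__(self, cfg):
        super().__init__()
        self.harmonic_embedding_xyz = HarmonicEmbedding(3, cfg.n_harmonic_functions_xyz)
        embed_dim = self.harmonic_embedding_xyz.output_dim

        hidden = cfg.n_hidden_neurons_xyz
        # Simple trunk: positional encoding -> a few ReLU layers.
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(embed_dim, hidden), torch.nn.ReLU(True),
            torch.nn.Linear(hidden, hidden), torch.nn.ReLU(True),
            torch.nn.Linear(hidden, hidden), torch.nn.ReLU(True),
        )
        self.density_head = torch.nn.Linear(hidden, 1)   # ReLU keeps density non-negative
        self.color_head = torch.nn.Linear(hidden, 3)     # Sigmoid keeps colors in [0, 1]

    def forward(self, ray_bundle):
        points = ray_bundle.sample_points.view(-1, 3)
        features = self.mlp(self.harmonic_embedding_xyz(points))
        return {
            'density': torch.relu(self.density_head(features)),
            'feature': torch.sigmoid(self.color_head(features)),
        }
```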
217 | 218 | ## Visualization 219 | You can train a NeRF on the lego bulldozer dataset with 220 | 221 | ```bash 222 | python volume_rendering_main.py --config-name=nerf_lego 223 | ``` 224 | 225 | This will create a NeRF with the `NeuralRadianceField` class in `implicit.py`, and use it as the `implicit_fn` in `VolumeRenderer`. It will also train a NeRF for 250 epochs on 128x128 images. 226 | 227 | Feel free to modify the experimental settings in `configs/nerf_lego.yaml` --- though the current settings should allow you to train a NeRF on low-resolution inputs in a reasonable amount of time. After training, a spiral rendering will be written to `images/part_3.gif`. Report your results. It should look something like this: 228 | 229 | ![Spiral Rendering of Part 3](ta_images/part_3.gif) 230 | 231 | ## 4. NeRF Extras (CHOOSE ONE! More than one is extra credit) 232 | 233 | ### 4.1 View Dependence (10 points) 234 | 235 | Add view dependence to your NeRF model! Specifically, make it so that emission can vary with viewing direction. You can read NeRF or other papers for how to do this effectively --- if you're not careful, your network may overfit to the training images. Discuss the trade-offs between increased view dependence and generalization quality. 236 | 237 | While you may use the lego scene to test your code, please employ the materials scene to show the results of your method on your webpage (experimental settings can be found in `nerf_materials.yaml` and `nerf_materials_highres.yaml`). 238 | 239 | If you haven't done so already, make sure to download and unzip the `nerf_materials` dataset as described in the setup section. 240 | 241 | ### 4.2 Coarse/Fine Sampling (10 points) 242 | 243 | NeRF employs two networks: a coarse network and a fine network. During the coarse pass, it uses the coarse network to get an estimate of geometry, and during the fine pass it uses these geometry estimates to sample points more effectively for the fine network. Implement this strategy and discuss the trade-offs (speed / quality). 244 | 245 | # B. Neural Surface Rendering (50 points) 246 | 247 | ## 5. Sphere Tracing (10 points) 248 | 249 | In this part you will implement sphere tracing for rendering an SDF, and use this implementation to render a simple torus. You will need to implement the `sphere_tracing` function in `renderer.py`. This function should return two outputs: (`points`, `mask`), where the `points` Tensor indicates the intersection point for each ray with the surface, and `mask` is a boolean Tensor indicating which rays intersected the surface.
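A minimal sketch of the sphere tracing loop is shown below. The argument list, the `near`/`far`/`max_iters` defaults (taken here from `configs/torus_surface.yaml`), and the convergence threshold are assumptions to adapt to the actual `sphere_tracing` signature in `renderer.py`.

```python
import torch

def sphere_tracing(implicit_fn, origins, directions, near=0.0, far=5.0, max_iters=64, eps=1e-5):
    # March each ray forward by the SDF value at its current point. Because the
    # SDF is a lower bound on the distance to the surface, this step is "safe".
    # Assumes `directions` are unit length; origins/directions are (num_rays, 3).
    t = torch.full_like(origins[..., :1], near)            # current distance along each ray
    points = origins + t * directions

    for _ in range(max_iters):
        sdf = implicit_fn.get_distance(points).view_as(t)  # signed distance at current points
        t = t + sdf                                        # converged rays step by ~0
        points = origins + t * directions

    # A ray hit the surface if its SDF is (nearly) zero and it stayed inside [near, far].
    mask = (implicit_fn.get_distance(points).view_as(t) < eps) & (t < far)
    return points, mask
```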
250 | 251 | You can run the code for part 5 with: 252 | ```bash 253 | # mkdir images (uncomment when running for the first time) 254 | python -m surface_rendering_main --config-name=torus_surface 255 | ``` 256 | 257 | This should save `part_5.gif` in the `images` folder. Please include this in your submission along with a short writeup describing your implementation. 258 | 259 | ![Torus](ta_images/part_5.gif) 260 | 261 | ## 6. Optimizing a Neural SDF (15 points) 262 | 263 | In this part, you will implement an MLP architecture for a neural SDF, and train this neural SDF on point cloud data. You will do this by training the network to output a zero value at the observed points. To encourage the network to learn an SDF instead of an arbitrary function, we will use an 'eikonal' regularization which enforces the gradients of the predictions to behave in a certain way (search lecture slides for hints). 264 | 265 | In this part you need to: 266 | 267 | * **Implement an MLP to predict distance**: You should populate the `NeuralSurface` class in `implicit.py`. For this part, you need to define an MLP that helps you predict a distance for any input point. More concretely, you would need to define some MLP(s) in the `__init__` function, and use these to implement the `get_distance` function for this class. Hint: you can use a similar MLP to what you used to predict density in Part A, but remember that density and distance have different possible ranges! 268 | 269 | * **Implement Eikonal Constraint as a Loss**: Define the `eikonal_loss` in `losses.py`. 270 | 271 | After this, you should be able to train a NeuralSurface representation by: 272 | ```bash 273 | python -m surface_rendering_main --config-name=points_surface 274 | ``` 275 | 276 | This should save `part_6_input.gif` and `part_6.gif` in the `images` folder. The former visualizes the input point cloud used for training, and the latter shows your prediction, which you should include on the webpage along with brief descriptions of your MLP and eikonal loss. You might need to tune hyperparameters (e.g. number of layers, epochs, weight of regularization, etc.) for good results. 277 | 278 | ![Bunny geometry](ta_images/part_6.gif) 279 | 280 | ## 7. VolSDF (15 points) 281 | 282 | In this part, you will implement a function converting SDF -> volume density and extend the `NeuralSurface` class to predict color. 283 | 284 | * **Color Prediction**: Extend the `NeuralSurface` class to predict per-point color. You may need to define a new MLP (or just a few new layers, depending on how you implemented the distance MLP in Q6). You should then implement the `get_color` and `get_distance_color` functions. 285 | 286 | * **SDF to Density**: Read section 3.1 of the [VolSDF Paper](https://arxiv.org/pdf/2106.12052.pdf) and implement their formula converting signed distance to density in the `sdf_to_density` function in `renderer.py`. In your write-up, give an intuitive explanation of what the parameters `alpha` and `beta` are doing here. Also, answer the following questions: 287 | 1. How does high `beta` bias your learned SDF? What about low `beta`? 288 | 2. Would an SDF be easier to train with volume rendering and low `beta` or high `beta`? Why? 289 | 3. Would you be more likely to learn an accurate surface with high `beta` or low `beta`? Why? 290 | 291 | After implementing these, train an SDF on the lego bulldozer model with 292 | 293 | ```bash 294 | python -m surface_rendering_main --config-name=volsdf_surface 295 | ``` 296 | 297 | This will save `part_7_geometry.gif` and `part_7.gif`. Experiment with hyper-parameters and attach your best results on your webpage. Comment on the settings you chose, and why they seem to work well. 298 | 299 | ![Bulldozer geometry](ta_images/part_7_geometry.gif) ![Bulldozer color](ta_images/part_7.gif) 300 | 301 | 302 | ## 8. Neural Surface Extras (CHOOSE ONE! More than one is extra credit) 303 | 304 | ### 8.1. Render a Large Scene with Sphere Tracing (10 points) 305 | In Q5, you rendered a (lonely) Torus, but the power of Sphere Tracing lies in the fact that it can render complex scenes efficiently. To observe this, try defining a ‘scene’ with many (> 20) primitives (e.g. Sphere, Torus, or another SDF from [this website](https://iquilezles.org/articles/distfunctions/)) at different locations. See Lecture 2 for the equations describing the ‘composed’ SDF of multiple primitives. You can then define a new class in `implicit.py` that instantiates a complex scene with many primitives, and modify the code for Q5 to render this scene instead of a simple torus. 306 | 307 | ### 8.2 Fewer Training Views (10 points) 308 | In Q7, we relied on 100 training views for a single scene. A benefit of using surface representations, however, is that the geometry is better regularized and can in principle be inferred from fewer views. Experiment with using fewer training views (say 20) -- you can do this by changing [train_idx in data loader](https://github.com/learning3d/assignment3/blob/main/dataset.py#L123) to use a smaller random subset of indices. You should also compare the VolSDF solution to a NeRF solution learned using similar views. 309 | 310 | ### 8.3 Alternate SDF to Density Conversions (10 points) 311 | In Q7, we used the equations from the [VolSDF Paper](https://arxiv.org/pdf/2106.12052.pdf) to convert SDF to density. You should try and compare alternate ways of doing this, e.g. the ‘naive’ solution from the [NeuS paper](https://arxiv.org/pdf/2106.10689.pdf), or any other ways that you might want to propose! 312 | -------------------------------------------------------------------------------- /__init__.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the BSD-style license found in the 5 | # LICENSE file in the root directory of this source tree. 
6 | -------------------------------------------------------------------------------- /configs/box.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: render 4 | 5 | data: 6 | image_size: [256, 256] 7 | 8 | renderer: 9 | type: volume 10 | chunk_size: 32768 11 | 12 | sampler: 13 | type: stratified 14 | n_pts_per_ray: 64 15 | min_depth: 0.0 16 | max_depth: 5.0 17 | 18 | implicit_function: 19 | type: sdf_volume 20 | 21 | sdf: 22 | type: box 23 | 24 | side_lengths: 25 | val: [1.75, 1.75, 1.75] 26 | opt: False 27 | 28 | center: 29 | val: [0.0, 0.0, 0.0] 30 | opt: True 31 | 32 | feature: 33 | rainbow: True 34 | val: [1.0, 1.0, 1.0] 35 | opt: False 36 | 37 | alpha: 38 | val: 1.0 39 | opt: False 40 | 41 | beta: 42 | val: 0.05 43 | opt: False 44 | 45 | -------------------------------------------------------------------------------- /configs/nerf_fern.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train_nerf 4 | 5 | training: 6 | num_epochs: 10000 7 | batch_size: 4096 8 | lr: 0.0005 9 | 10 | lr_scheduler_step_size: 1000 11 | lr_scheduler_gamma: 0.9 12 | 13 | checkpoint_path: ./checkpoints 14 | checkpoint_interval: 1000 15 | resume: True 16 | 17 | render_interval: 10 18 | 19 | data: 20 | image_size: [252, 189] 21 | dataset_name: fern 22 | 23 | renderer: 24 | type: volume 25 | chunk_size: 32768 26 | 27 | sampler: 28 | type: stratified 29 | n_pts_per_ray: 64 30 | 31 | min_depth: 1.2 32 | max_depth: 6.28 33 | 34 | implicit_function: 35 | type: nerf 36 | 37 | n_harmonic_functions_xyz: 6 38 | n_harmonic_functions_dir: 2 39 | n_hidden_neurons_xyz: 128 40 | n_hidden_neurons_dir: 64 41 | density_noise_std: 0.0 42 | n_layers_xyz: 6 43 | append_xyz: [4] 44 | -------------------------------------------------------------------------------- /configs/nerf_lego.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train_nerf 4 | 5 | training: 6 | num_epochs: 250 7 | batch_size: 1024 8 | lr: 0.0005 9 | 10 | lr_scheduler_step_size: 50 11 | lr_scheduler_gamma: 0.8 12 | 13 | checkpoint_path: ./checkpoints 14 | checkpoint_interval: 50 15 | resume: True 16 | 17 | render_interval: 10 18 | 19 | data: 20 | image_size: [128, 128] 21 | dataset_name: lego 22 | 23 | renderer: 24 | type: volume 25 | chunk_size: 32768 26 | white_background: False 27 | 28 | sampler: 29 | type: stratified 30 | n_pts_per_ray: 128 31 | 32 | min_depth: 2.0 33 | max_depth: 6.0 34 | 35 | implicit_function: 36 | type: nerf 37 | 38 | n_harmonic_functions_xyz: 6 39 | n_harmonic_functions_dir: 2 40 | n_hidden_neurons_xyz: 128 41 | n_hidden_neurons_dir: 64 42 | density_noise_std: 0.0 43 | n_layers_xyz: 6 44 | append_xyz: [3] 45 | 46 | 47 | -------------------------------------------------------------------------------- /configs/nerf_lego_highres.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train_nerf 4 | 5 | training: 6 | num_epochs: 250 7 | batch_size: 1024 8 | lr: 0.0005 9 | 10 | lr_scheduler_step_size: 50 11 | lr_scheduler_gamma: 0.8 12 | 13 | checkpoint_path: ./checkpoints 14 | checkpoint_interval: 50 15 | resume: True 16 | 17 | render_interval: 10 18 | 19 | data: 20 | image_size: [400, 400] 21 | dataset_name: lego 22 | 23 | renderer: 24 | type: volume 25 | chunk_size: 32768 26 | white_background: False 27 | 28 | sampler: 29 | type: stratified 30 | n_pts_per_ray: 128 31 | 32 | min_depth: 2.0 33 | 
max_depth: 6.0 34 | 35 | implicit_function: 36 | type: nerf 37 | 38 | n_harmonic_functions_xyz: 6 39 | n_harmonic_functions_dir: 2 40 | n_hidden_neurons_xyz: 128 41 | n_hidden_neurons_dir: 64 42 | density_noise_std: 0.0 43 | n_layers_xyz: 6 44 | append_xyz: [3] 45 | 46 | 47 | 48 | -------------------------------------------------------------------------------- /configs/nerf_materials.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train_nerf 4 | 5 | training: 6 | num_epochs: 250 7 | batch_size: 1024 8 | lr: 0.0001 9 | 10 | lr_scheduler_step_size: 50 11 | lr_scheduler_gamma: 0.8 12 | 13 | checkpoint_path: ./checkpoints 14 | checkpoint_interval: 50 15 | resume: True 16 | 17 | render_interval: 10 18 | 19 | data: 20 | image_size: [128, 128] 21 | dataset_name: materials 22 | 23 | renderer: 24 | type: volume 25 | chunk_size: 32768 26 | white_background: False 27 | 28 | sampler: 29 | type: stratified 30 | n_pts_per_ray: 128 31 | 32 | min_depth: 1.0 33 | max_depth: 7.0 34 | 35 | implicit_function: 36 | type: nerf 37 | 38 | n_harmonic_functions_xyz: 6 39 | n_harmonic_functions_dir: 2 40 | n_hidden_neurons_xyz: 128 41 | n_hidden_neurons_dir: 64 42 | density_noise_std: 0.0 43 | n_layers_xyz: 6 44 | append_xyz: [3] 45 | 46 | -------------------------------------------------------------------------------- /configs/nerf_materials_highres.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train_nerf 4 | 5 | training: 6 | num_epochs: 250 7 | batch_size: 1024 8 | lr: 0.0001 9 | 10 | lr_scheduler_step_size: 50 11 | lr_scheduler_gamma: 0.8 12 | 13 | checkpoint_path: ./checkpoints 14 | checkpoint_interval: 50 15 | resume: True 16 | 17 | render_interval: 10 18 | 19 | data: 20 | image_size: [400, 400] 21 | dataset_name: materials 22 | 23 | renderer: 24 | type: volume 25 | chunk_size: 32768 26 | white_background: False 27 | 28 | sampler: 29 | type: stratified 30 | n_pts_per_ray: 128 31 | 32 | min_depth: 1.0 33 | max_depth: 7.0 34 | 35 | implicit_function: 36 | type: nerf 37 | 38 | n_harmonic_functions_xyz: 6 39 | n_harmonic_functions_dir: 2 40 | n_hidden_neurons_xyz: 128 41 | n_hidden_neurons_dir: 64 42 | density_noise_std: 0.0 43 | n_layers_xyz: 6 44 | append_xyz: [3] 45 | 46 | -------------------------------------------------------------------------------- /configs/points_surface.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train_points 4 | 5 | data: 6 | image_size: [256, 256] 7 | point_cloud_path: data/bunny_pointcloud.npz 8 | 9 | renderer: 10 | type: sphere_tracing 11 | chunk_size: 8192 12 | near: 0.0 13 | far: 5.0 14 | max_iters: 64 15 | 16 | sampler: 17 | type: stratified 18 | n_pts_per_ray: 19 | min_depth: 20 | max_depth: 21 | 22 | training: 23 | num_epochs: 5000 24 | pretrain_iters: 250 25 | batch_size: 4096 26 | lr: 0.0001 27 | 28 | lr_scheduler_step_size: 50 29 | lr_scheduler_gamma: 0.8 30 | 31 | checkpoint_path: ./points_checkpoint 32 | checkpoint_interval: 100 33 | resume: True 34 | 35 | render_interval: 500 36 | 37 | inter_weight: 0.1 38 | eikonal_weight: 0.02 39 | bounds: [[-4, -4, -4], [4, 4, 4]] 40 | 41 | implicit_function: 42 | type: neural_surface 43 | 44 | n_harmonic_functions_xyz: 4 45 | 46 | n_layers_distance: 6 47 | n_hidden_neurons_distance: 128 48 | append_distance: [] 49 | 50 | n_layers_color: 2 51 | n_hidden_neurons_color: 128 52 | append_color: [] 53 | 
-------------------------------------------------------------------------------- /configs/sphere.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: render 4 | 5 | data: 6 | image_size: [256, 256] 7 | 8 | renderer: 9 | type: volume 10 | chunk_size: 32768 11 | 12 | sampler: 13 | type: stratified 14 | n_pts_per_ray: 64 15 | min_depth: 0.0 16 | max_depth: 5.0 17 | 18 | implicit_function: 19 | type: sdf_volume 20 | 21 | sdf: 22 | type: sphere 23 | 24 | radius: 25 | val: 1.0 26 | opt: False 27 | 28 | center: 29 | val: [0.0, 0.0, 0.0] 30 | opt: False 31 | 32 | feature: 33 | rainbow: True 34 | val: [1.0, 1.0, 1.0] 35 | opt: False 36 | 37 | alpha: 38 | val: 1.0 39 | opt: False 40 | 41 | beta: 42 | val: 0.05 43 | opt: False -------------------------------------------------------------------------------- /configs/sphere_surface.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: render 4 | 5 | data: 6 | image_size: [256, 256] 7 | 8 | renderer: 9 | type: sphere_tracing 10 | chunk_size: 8192 11 | near: 0.0 12 | far: 5.0 13 | max_iters: 64 14 | 15 | sampler: 16 | type: stratified 17 | n_pts_per_ray: 18 | min_depth: 19 | max_depth: 20 | 21 | implicit_function: 22 | type: sdf_surface 23 | 24 | sdf: 25 | type: sphere 26 | 27 | center: 28 | val: [0.0, 0.0, 0.0] 29 | opt: True 30 | 31 | radius: 32 | val: 1.0 33 | opt: False 34 | 35 | feature: 36 | rainbow: True 37 | val: [1.0, 1.0, 1.0] 38 | opt: False -------------------------------------------------------------------------------- /configs/torus_surface.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: render 4 | 5 | data: 6 | image_size: [256, 256] 7 | 8 | renderer: 9 | type: sphere_tracing 10 | chunk_size: 8192 11 | near: 0.0 12 | far: 5.0 13 | max_iters: 64 14 | 15 | sampler: 16 | type: stratified 17 | n_pts_per_ray: 18 | min_depth: 19 | max_depth: 20 | 21 | implicit_function: 22 | type: sdf_surface 23 | 24 | sdf: 25 | type: torus 26 | 27 | center: 28 | val: [0.0, 0.0, 0.0] 29 | opt: True 30 | 31 | radii: 32 | val: [1.0, 0.25] 33 | opt: False 34 | 35 | feature: 36 | rainbow: True 37 | val: [1.0, 1.0, 1.0] 38 | opt: False 39 | -------------------------------------------------------------------------------- /configs/train_box.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train 4 | 5 | training: 6 | num_epochs: 1000 7 | batch_size: 4096 8 | lr: 0.0005 9 | 10 | data: 11 | image_size: [256, 256] 12 | 13 | cameras: 14 | cam0: 15 | focal: 1.0 16 | eye: [-2.5, 0.0, 0.0] 17 | principal_point: [0.0, 0.0] 18 | scene_center: [0.0, 0.0, 0.0] 19 | up: [0.0, 1.0, 0.0] 20 | 21 | image: "data/box_0.npy" 22 | 23 | cam1: 24 | focal: 1.0 25 | eye: [-1.0, 0.0, -2.2] 26 | principal_point: [0.0, 0.0] 27 | scene_center: [0.0, 0.0, 0.0] 28 | up: [0.0, 1.0, 0.0] 29 | 30 | image: "data/box_1.npy" 31 | 32 | cam2: 33 | focal: 1.0 34 | eye: [0.0, 0.0, -2.5] 35 | principal_point: [0.0, 0.0] 36 | scene_center: [0.0, 0.0, 0.0] 37 | up: [0.0, 1.0, 0.0] 38 | 39 | image: "data/box_2.npy" 40 | 41 | cam3: 42 | focal: 1.0 43 | eye: [1.0, 0.0, -2.2] 44 | principal_point: [0.0, 0.0] 45 | scene_center: [0.0, 0.0, 0.0] 46 | up: [0.0, 1.0, 0.0] 47 | 48 | image: "data/box_3.npy" 49 | 50 | renderer: 51 | type: volume 52 | chunk_size: 32768 53 | 54 | sampler: 55 | type: stratified 56 | n_pts_per_ray: 64 57 | min_depth: 0.0 58 | max_depth: 5.0 59 | 60 | 
implicit_function: 61 | type: sdf_volume 62 | 63 | sdf: 64 | type: box 65 | 66 | side_lengths: 67 | val: [1.5, 1.5, 1.5] 68 | opt: True 69 | 70 | center: 71 | val: [0.0, 0.0, 0.0] 72 | opt: True 73 | 74 | feature: 75 | rainbow: True 76 | val: [1.0, 1.0, 1.0] 77 | opt: False 78 | 79 | alpha: 80 | val: 1.0 81 | opt: False 82 | 83 | beta: 84 | val: 0.05 85 | opt: False -------------------------------------------------------------------------------- /configs/train_sphere.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train 4 | 5 | training: 6 | num_epochs: 250 7 | batch_size: 1024 8 | lr: 0.0005 9 | 10 | data: 11 | image_size: [256, 256] 12 | 13 | cameras: 14 | cam1: 15 | focal: 1.0 16 | eye: [0.0, 0.0, -3.0] 17 | principal_point: [0.0, 0.0] 18 | scene_center: [0.0, 0.0, 0.0] 19 | up: [0.0, 1.0, 0.0] 20 | 21 | cam2: 22 | focal: 1.0 23 | eye: [2.0, 0.0, -2.0] 24 | principal_point: [0.0, 0.0] 25 | scene_center: [0.0, 0.0, 0.0] 26 | up: [0.0, 1.0, 0.0] 27 | 28 | cam3: 29 | focal: 1.0 30 | eye: [3.0, 0.0, 0.0] 31 | principal_point: [0.0, 0.0] 32 | scene_center: [0.0, 0.0, 0.0] 33 | up: [0.0, 1.0, 0.0] 34 | 35 | renderer: 36 | type: volume 37 | chunk_size: 32768 38 | 39 | sampler: 40 | type: stratified 41 | n_pts_per_ray: 64 42 | min_depth: 0.0 43 | max_depth: 5.0 44 | 45 | implicit_function: 46 | type: sdf_volume 47 | 48 | sdf: 49 | type: sphere 50 | 51 | radius: 52 | val: 1.0 53 | opt: False 54 | 55 | center: 56 | val: [0.3, 0.2, 0.0] 57 | opt: True 58 | 59 | feature: 60 | val: [0.3, 0.1, 0.1] 61 | opt: True 62 | 63 | alpha: 64 | val: 1.0 65 | opt: False 66 | 67 | beta: 68 | val: 0.05 69 | opt: False -------------------------------------------------------------------------------- /configs/volsdf_surface.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train_images 4 | 5 | data: 6 | image_size: [128, 128] 7 | dataset_name: lego 8 | 9 | renderer: 10 | type: volume_sdf 11 | chunk_size: 32768 12 | white_background: False 13 | 14 | alpha: 10.0 15 | beta: 0.05 16 | 17 | relighting_function: 18 | type: none 19 | 20 | sampler: 21 | type: stratified 22 | n_pts_per_ray: 128 23 | 24 | min_depth: 2.0 25 | max_depth: 6.0 26 | 27 | training: 28 | num_epochs: 250 29 | pretrain_iters: 1000 30 | batch_size: 1024 31 | lr: 0.0005 32 | 33 | lr_scheduler_step_size: 50 34 | lr_scheduler_gamma: 0.8 35 | 36 | checkpoint_path: ./volsdf_checkpoint 37 | checkpoint_interval: 50 38 | resume: True 39 | 40 | render_interval: 10 41 | 42 | inter_weight: 0.1 43 | eikonal_weight: 0.02 44 | bounds: [[-4, -4, -4], [4, 4, 4]] 45 | 46 | implicit_function: 47 | type: neural_surface 48 | 49 | n_harmonic_functions_xyz: 6 50 | 51 | n_layers_distance: 6 52 | n_hidden_neurons_distance: 128 53 | append_distance: [] 54 | 55 | n_layers_color: 2 56 | n_hidden_neurons_color: 128 57 | append_color: [] 58 | -------------------------------------------------------------------------------- /data/box_0.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_0.npy -------------------------------------------------------------------------------- /data/box_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_0.png 
-------------------------------------------------------------------------------- /data/box_1.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_1.npy -------------------------------------------------------------------------------- /data/box_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_1.png -------------------------------------------------------------------------------- /data/box_2.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_2.npy -------------------------------------------------------------------------------- /data/box_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_2.png -------------------------------------------------------------------------------- /data/box_3.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_3.npy -------------------------------------------------------------------------------- /data/box_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_3.png -------------------------------------------------------------------------------- /data/bridge_pointcloud.npz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/bridge_pointcloud.npz -------------------------------------------------------------------------------- /data/bunny_pointcloud.npz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/bunny_pointcloud.npz -------------------------------------------------------------------------------- /data_utils.py: -------------------------------------------------------------------------------- 1 | from pytorch3d.renderer import ( 2 | PerspectiveCameras, 3 | look_at_view_transform 4 | ) 5 | 6 | import numpy as np 7 | import torch 8 | 9 | 10 | # Basic data loading 11 | def dataset_from_config( 12 | cfg, 13 | ): 14 | dataset = [] 15 | 16 | for cam_idx, cam_key in enumerate(cfg.cameras.keys()): 17 | cam_cfg = cfg.cameras[cam_key] 18 | 19 | # Create camera parameters 20 | R, T = look_at_view_transform( 21 | eye=(cam_cfg.eye,), 22 | at=(cam_cfg.scene_center,), 23 | up=(cam_cfg.up,), 24 | ) 25 | focal = torch.tensor([cam_cfg.focal])[None] 26 | principal_point = torch.tensor(cam_cfg.principal_point)[None] 27 | 28 | # Assemble the dataset 29 | image = None 30 | if 'image' in cam_cfg and cam_cfg.image is not None: 31 | image = torch.tensor(np.load(cam_cfg.image))[None] 32 | 33 | dataset.append( 34 | { 35 | "image": image, 36 | "camera": PerspectiveCameras( 37 | focal_length=focal, 38 | principal_point=principal_point, 39 | R=R, 40 | T=T, 41 | ), 42 | "camera_idx": cam_idx, 43 | 
} 44 | ) 45 | 46 | return dataset 47 | 48 | 49 | # Spiral cameras looking at the origin 50 | def create_surround_cameras(radius, n_poses=20, up=(0.0, 1.0, 0.0), focal_length=1.0): 51 | cameras = [] 52 | 53 | for theta in np.linspace(0, 2 * np.pi, n_poses + 1)[:-1]: 54 | 55 | if np.abs(up[1]) > 0: 56 | eye = [np.cos(theta + np.pi / 2) * radius, 0, -np.sin(theta + np.pi / 2) * radius] 57 | else: 58 | eye = [np.cos(theta + np.pi / 2) * radius, np.sin(theta + np.pi / 2) * radius, 2.0] 59 | 60 | R, T = look_at_view_transform( 61 | eye=(eye,), 62 | at=([0.0, 0.0, 0.0],), 63 | up=(up,), 64 | ) 65 | 66 | cameras.append( 67 | PerspectiveCameras( 68 | focal_length=torch.tensor([focal_length])[None], 69 | principal_point=torch.tensor([0.0, 0.0])[None], 70 | R=R, 71 | T=T, 72 | ) 73 | ) 74 | 75 | return cameras 76 | 77 | 78 | def vis_grid(xy_grid, image_size): 79 | xy_vis = (xy_grid + 1) / 2.001 80 | xy_vis = torch.cat([xy_vis, torch.zeros_like(xy_vis[..., :1])], -1) 81 | xy_vis = xy_vis.view(image_size[1], image_size[0], 3) 82 | xy_vis = np.array(xy_vis.detach().cpu()) 83 | 84 | return xy_vis 85 | 86 | 87 | def vis_rays(ray_bundle, image_size): 88 | rays = torch.abs(ray_bundle.directions) 89 | rays = rays.view(image_size[1], image_size[0], 3) 90 | rays = np.array(rays.detach().cpu()) 91 | 92 | return rays -------------------------------------------------------------------------------- /dataset.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the BSD-style license found in the 5 | # LICENSE file in the root directory of this source tree. 6 | 7 | import os 8 | from typing import List, Optional, Tuple 9 | 10 | import numpy as np 11 | import requests 12 | import torch 13 | from PIL import Image 14 | from pytorch3d.renderer import PerspectiveCameras 15 | from torch.utils.data import Dataset 16 | 17 | import matplotlib.pyplot as plt 18 | 19 | 20 | DEFAULT_DATA_ROOT = os.path.join( 21 | os.path.dirname(os.path.realpath(__file__)), "data" 22 | ) 23 | 24 | DEFAULT_URL_ROOT = "https://dl.fbaipublicfiles.com/pytorch3d_nerf_data" 25 | 26 | ALL_DATASETS = ("lego", "fern", "pt3logo", "materials") 27 | 28 | 29 | def trivial_collate(batch): 30 | """ 31 | A trivial collate function that merely returns the uncollated batch. 32 | """ 33 | return batch 34 | 35 | 36 | class ListDataset(Dataset): 37 | """ 38 | A simple dataset made of a list of entries. 39 | """ 40 | 41 | def __init__(self, entries: List) -> None: 42 | """ 43 | Args: 44 | entries: The list of dataset entries. 45 | """ 46 | self._entries = entries 47 | 48 | def __len__( 49 | self, 50 | ) -> int: 51 | return len(self._entries) 52 | 53 | def __getitem__(self, index): 54 | return self._entries[index] 55 | 56 | 57 | def get_nerf_datasets( 58 | dataset_name: str, # 'lego | fern' 59 | image_size: Tuple[int, int], 60 | data_root: str = DEFAULT_DATA_ROOT, 61 | autodownload: bool = True, 62 | ) -> Tuple[Dataset, Dataset, Dataset]: 63 | """ 64 | Obtains the training and validation dataset object for a dataset specified 65 | with the `dataset_name` argument. 66 | 67 | Args: 68 | dataset_name: The name of the dataset to load. 69 | image_size: A tuple (height, width) denoting the sizes of the loaded dataset images. 70 | data_root: The root folder at which the data is stored. 71 | autodownload: Auto-download the dataset files in case they are missing. 
72 | 73 | Returns: 74 | train_dataset: The training dataset object. 75 | val_dataset: The validation dataset object. 76 | test_dataset: The testing dataset object. 77 | """ 78 | 79 | if dataset_name not in ALL_DATASETS: 80 | raise ValueError(f"'{dataset_name}'' does not refer to a known dataset.") 81 | 82 | print(f"Loading dataset {dataset_name}, image size={str(image_size)} ...") 83 | 84 | cameras_path = os.path.join(data_root, dataset_name + ".pth") 85 | image_path = cameras_path.replace(".pth", ".png") 86 | 87 | if autodownload and any(not os.path.isfile(p) for p in (cameras_path, image_path)): 88 | # Automatically download the data files if missing. 89 | download_data((dataset_name,), data_root=data_root) 90 | 91 | train_data = torch.load(cameras_path) 92 | n_cameras = train_data["cameras"]["R"].shape[0] 93 | 94 | _image_max_image_pixels = Image.MAX_IMAGE_PIXELS 95 | Image.MAX_IMAGE_PIXELS = None # The dataset image is very large ... 96 | images = torch.FloatTensor(np.array(Image.open(image_path))) / 255.0 97 | images = torch.stack(torch.chunk(images, n_cameras, dim=0))[..., :3] 98 | Image.MAX_IMAGE_PIXELS = _image_max_image_pixels 99 | 100 | scale_factors = [s_new / s for s, s_new in zip(images.shape[1:3], image_size)] 101 | 102 | if abs(scale_factors[0] - scale_factors[1]) > 1e-3: 103 | raise ValueError( 104 | "Non-isotropic scaling is not allowed. Consider changing the 'image_size' argument." 105 | ) 106 | scale_factor = sum(scale_factors) * 0.5 107 | 108 | if scale_factor != 1.0: 109 | print(f"Rescaling dataset (factor={scale_factor})") 110 | images = torch.nn.functional.interpolate( 111 | images.permute(0, 3, 1, 2), 112 | size=tuple(image_size), 113 | mode="bilinear", 114 | ).permute(0, 2, 3, 1) 115 | 116 | cameras = [ 117 | PerspectiveCameras( 118 | **{k: v[cami][None] for k, v in train_data["cameras"].items()} 119 | ).to("cpu") 120 | for cami in range(n_cameras) 121 | ] 122 | 123 | train_idx, val_idx, test_idx = train_data["split"] 124 | 125 | train_dataset, val_dataset, test_dataset = [ 126 | ListDataset( 127 | [ 128 | {"image": images[i], "camera": cameras[i], "camera_idx": int(i)} 129 | for i in idx 130 | ] 131 | ) 132 | for idx in [train_idx, val_idx, test_idx] 133 | ] 134 | 135 | return train_dataset, val_dataset, test_dataset 136 | 137 | 138 | def download_data( 139 | dataset_names: Optional[List[str]] = None, 140 | data_root: str = DEFAULT_DATA_ROOT, 141 | url_root: str = DEFAULT_URL_ROOT, 142 | ) -> None: 143 | """ 144 | Downloads the relevant dataset files. 145 | 146 | Args: 147 | dataset_names: A list of the names of datasets to download. If `None`, 148 | downloads all available datasets. 
149 | """ 150 | 151 | if dataset_names is None: 152 | dataset_names = ALL_DATASETS 153 | 154 | os.makedirs(data_root, exist_ok=True) 155 | 156 | for dataset_name in dataset_names: 157 | cameras_file = dataset_name + ".pth" 158 | images_file = cameras_file.replace(".pth", ".png") 159 | license_file = cameras_file.replace(".pth", "_license.txt") 160 | 161 | for fl in (cameras_file, images_file, license_file): 162 | local_fl = os.path.join(data_root, fl) 163 | remote_fl = os.path.join(url_root, fl) 164 | 165 | print(f"Downloading dataset {dataset_name} from {remote_fl} to {local_fl}.") 166 | 167 | r = requests.get(remote_fl) 168 | 169 | with open(local_fl, "wb") as f: 170 | f.write(r.content) 171 | 172 | 173 | -------------------------------------------------------------------------------- /environment.yml: -------------------------------------------------------------------------------- 1 | name: l3d 2 | channels: 3 | - pyg 4 | - pytorch 5 | - pytorch3d 6 | - conda-forge 7 | - fvcore 8 | - iopath 9 | - bottler 10 | - defaults 11 | dependencies: 12 | - cudatoolkit=11.0 13 | - python=3.9 14 | - pip 15 | - pytorch 16 | - pytorch3d=0.6.1 17 | - torchvision 18 | - fvcore 19 | - iopath 20 | - nvidiacub 21 | - pip: 22 | - hydra-core 23 | - Pillow 24 | - plotly 25 | - requests 26 | - imageio 27 | - matplotlib 28 | - numpy 29 | - PyMCubes 30 | - tqdm 31 | - visdom 32 | -------------------------------------------------------------------------------- /images/.ignore: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/images/.ignore -------------------------------------------------------------------------------- /implicit.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | from torch import autograd 4 | 5 | from ray_utils import RayBundle 6 | 7 | 8 | # Sphere SDF class 9 | class SphereSDF(torch.nn.Module): 10 | def __init__( 11 | self, 12 | cfg 13 | ): 14 | super().__init__() 15 | 16 | self.radius = torch.nn.Parameter( 17 | torch.tensor(cfg.radius.val).float(), requires_grad=cfg.radius.opt 18 | ) 19 | self.center = torch.nn.Parameter( 20 | torch.tensor(cfg.center.val).float().unsqueeze(0), requires_grad=cfg.center.opt 21 | ) 22 | 23 | def forward(self, points): 24 | points = points.view(-1, 3) 25 | 26 | return torch.linalg.norm( 27 | points - self.center, 28 | dim=-1, 29 | keepdim=True 30 | ) - self.radius 31 | 32 | 33 | # Box SDF class 34 | class BoxSDF(torch.nn.Module): 35 | def __init__( 36 | self, 37 | cfg 38 | ): 39 | super().__init__() 40 | 41 | self.center = torch.nn.Parameter( 42 | torch.tensor(cfg.center.val).float().unsqueeze(0), requires_grad=cfg.center.opt 43 | ) 44 | self.side_lengths = torch.nn.Parameter( 45 | torch.tensor(cfg.side_lengths.val).float().unsqueeze(0), requires_grad=cfg.side_lengths.opt 46 | ) 47 | 48 | def forward(self, points): 49 | points = points.view(-1, 3) 50 | diff = torch.abs(points - self.center) - self.side_lengths / 2.0 51 | 52 | signed_distance = torch.linalg.norm( 53 | torch.maximum(diff, torch.zeros_like(diff)), 54 | dim=-1 55 | ) + torch.minimum(torch.max(diff, dim=-1)[0], torch.zeros_like(diff[..., 0])) 56 | 57 | return signed_distance.unsqueeze(-1) 58 | 59 | # Torus SDF class 60 | class TorusSDF(torch.nn.Module): 61 | def __init__( 62 | self, 63 | cfg 64 | ): 65 | super().__init__() 66 | 67 | self.center = torch.nn.Parameter( 68 | 
torch.tensor(cfg.center.val).float().unsqueeze(0), requires_grad=cfg.center.opt 69 | ) 70 | self.radii = torch.nn.Parameter( 71 | torch.tensor(cfg.radii.val).float().unsqueeze(0), requires_grad=cfg.radii.opt 72 | ) 73 | 74 | def forward(self, points): 75 | points = points.view(-1, 3) 76 | diff = points - self.center 77 | q = torch.stack( 78 | [ 79 | torch.linalg.norm(diff[..., :2], dim=-1) - self.radii[..., 0], 80 | diff[..., -1], 81 | ], 82 | dim=-1 83 | ) 84 | return (torch.linalg.norm(q, dim=-1) - self.radii[..., 1]).unsqueeze(-1) 85 | 86 | sdf_dict = { 87 | 'sphere': SphereSDF, 88 | 'box': BoxSDF, 89 | 'torus': TorusSDF, 90 | } 91 | 92 | 93 | # Converts SDF into density/feature volume 94 | class SDFVolume(torch.nn.Module): 95 | def __init__( 96 | self, 97 | cfg 98 | ): 99 | super().__init__() 100 | 101 | self.sdf = sdf_dict[cfg.sdf.type]( 102 | cfg.sdf 103 | ) 104 | 105 | self.rainbow = cfg.feature.rainbow if 'rainbow' in cfg.feature else False 106 | self.feature = torch.nn.Parameter( 107 | torch.ones_like(torch.tensor(cfg.feature.val).float().unsqueeze(0)), requires_grad=cfg.feature.opt 108 | ) 109 | 110 | self.alpha = torch.nn.Parameter( 111 | torch.tensor(cfg.alpha.val).float(), requires_grad=cfg.alpha.opt 112 | ) 113 | self.beta = torch.nn.Parameter( 114 | torch.tensor(cfg.beta.val).float(), requires_grad=cfg.beta.opt 115 | ) 116 | 117 | def _sdf_to_density(self, signed_distance): 118 | # Convert signed distance to density with alpha, beta parameters 119 | return torch.where( 120 | signed_distance > 0, 121 | 0.5 * torch.exp(-signed_distance / self.beta), 122 | 1 - 0.5 * torch.exp(signed_distance / self.beta), 123 | ) * self.alpha 124 | 125 | def forward(self, ray_bundle): 126 | sample_points = ray_bundle.sample_points.view(-1, 3) 127 | depth_values = ray_bundle.sample_lengths[..., 0] 128 | deltas = torch.cat( 129 | ( 130 | depth_values[..., 1:] - depth_values[..., :-1], 131 | 1e10 * torch.ones_like(depth_values[..., :1]), 132 | ), 133 | dim=-1, 134 | ).view(-1, 1) 135 | 136 | # Transform SDF to density 137 | signed_distance = self.sdf(ray_bundle.sample_points) 138 | density = self._sdf_to_density(signed_distance) 139 | 140 | # Outputs 141 | if self.rainbow: 142 | base_color = torch.clamp( 143 | torch.abs(sample_points - self.sdf.center), 144 | 0.02, 145 | 0.98 146 | ) 147 | else: 148 | base_color = 1.0 149 | 150 | out = { 151 | 'density': -torch.log(1.0 - density) / deltas, 152 | 'feature': base_color * self.feature * density.new_ones(sample_points.shape[0], 1) 153 | } 154 | 155 | return out 156 | 157 | 158 | # Converts SDF into density/feature volume 159 | class SDFSurface(torch.nn.Module): 160 | def __init__( 161 | self, 162 | cfg 163 | ): 164 | super().__init__() 165 | 166 | self.sdf = sdf_dict[cfg.sdf.type]( 167 | cfg.sdf 168 | ) 169 | self.rainbow = cfg.feature.rainbow if 'rainbow' in cfg.feature else False 170 | self.feature = torch.nn.Parameter( 171 | torch.ones_like(torch.tensor(cfg.feature.val).float().unsqueeze(0)), requires_grad=cfg.feature.opt 172 | ) 173 | 174 | def get_distance(self, points): 175 | points = points.view(-1, 3) 176 | return self.sdf(points) 177 | 178 | def get_color(self, points): 179 | points = points.view(-1, 3) 180 | 181 | # Outputs 182 | if self.rainbow: 183 | base_color = torch.clamp( 184 | torch.abs(points - self.sdf.center), 185 | 0.02, 186 | 0.98 187 | ) 188 | else: 189 | base_color = 1.0 190 | 191 | return base_color * self.feature * points.new_ones(points.shape[0], 1) 192 | 193 | def forward(self, points): 194 | return 
self.get_distance(points) 195 | 196 | class HarmonicEmbedding(torch.nn.Module): 197 | def __init__( 198 | self, 199 | in_channels: int = 3, 200 | n_harmonic_functions: int = 6, 201 | omega0: float = 1.0, 202 | logspace: bool = True, 203 | include_input: bool = True, 204 | ) -> None: 205 | super().__init__() 206 | 207 | if logspace: 208 | frequencies = 2.0 ** torch.arange( 209 | n_harmonic_functions, 210 | dtype=torch.float32, 211 | ) 212 | else: 213 | frequencies = torch.linspace( 214 | 1.0, 215 | 2.0 ** (n_harmonic_functions - 1), 216 | n_harmonic_functions, 217 | dtype=torch.float32, 218 | ) 219 | 220 | self.register_buffer("_frequencies", omega0 * frequencies, persistent=False) 221 | self.include_input = include_input 222 | self.output_dim = n_harmonic_functions * 2 * in_channels 223 | 224 | if self.include_input: 225 | self.output_dim += in_channels 226 | 227 | def forward(self, x: torch.Tensor): 228 | embed = (x[..., None] * self._frequencies).view(*x.shape[:-1], -1) 229 | 230 | if self.include_input: 231 | return torch.cat((embed.sin(), embed.cos(), x), dim=-1) 232 | else: 233 | return torch.cat((embed.sin(), embed.cos()), dim=-1) 234 | 235 | 236 | class LinearWithRepeat(torch.nn.Linear): 237 | def forward(self, input): 238 | n1 = input[0].shape[-1] 239 | output1 = F.linear(input[0], self.weight[:, :n1], self.bias) 240 | output2 = F.linear(input[1], self.weight[:, n1:], None) 241 | return output1 + output2.unsqueeze(-2) 242 | 243 | 244 | class MLPWithInputSkips(torch.nn.Module): 245 | def __init__( 246 | self, 247 | n_layers: int, 248 | input_dim: int, 249 | output_dim: int, 250 | skip_dim: int, 251 | hidden_dim: int, 252 | input_skips, 253 | ): 254 | super().__init__() 255 | 256 | layers = [] 257 | 258 | for layeri in range(n_layers): 259 | if layeri == 0: 260 | dimin = input_dim 261 | dimout = hidden_dim 262 | elif layeri in input_skips: 263 | dimin = hidden_dim + skip_dim 264 | dimout = hidden_dim 265 | else: 266 | dimin = hidden_dim 267 | dimout = hidden_dim 268 | 269 | linear = torch.nn.Linear(dimin, dimout) 270 | layers.append(torch.nn.Sequential(linear, torch.nn.ReLU(True))) 271 | 272 | self.mlp = torch.nn.ModuleList(layers) 273 | self._input_skips = set(input_skips) 274 | 275 | def forward(self, x: torch.Tensor, z: torch.Tensor) -> torch.Tensor: 276 | y = x 277 | 278 | for li, layer in enumerate(self.mlp): 279 | if li in self._input_skips: 280 | y = torch.cat((y, z), dim=-1) 281 | 282 | y = layer(y) 283 | 284 | return y 285 | 286 | 287 | # TODO (Q3.1): Implement NeRF MLP 288 | class NeuralRadianceField(torch.nn.Module): 289 | def __init__( 290 | self, 291 | cfg, 292 | ): 293 | super().__init__() 294 | 295 | self.harmonic_embedding_xyz = HarmonicEmbedding(3, cfg.n_harmonic_functions_xyz) 296 | self.harmonic_embedding_dir = HarmonicEmbedding(3, cfg.n_harmonic_functions_dir) 297 | 298 | embedding_dim_xyz = self.harmonic_embedding_xyz.output_dim 299 | embedding_dim_dir = self.harmonic_embedding_dir.output_dim 300 | 301 | pass 302 | 303 | 304 | class NeuralSurface(torch.nn.Module): 305 | def __init__( 306 | self, 307 | cfg, 308 | ): 309 | super().__init__() 310 | # TODO (Q6): Implement Neural Surface MLP to output per-point SDF 311 | # TODO (Q7): Implement Neural Surface MLP to output per-point color 312 | 313 | def get_distance( 314 | self, 315 | points 316 | ): 317 | ''' 318 | TODO: Q6 319 | Output: 320 | distance: N X 1 Tensor, where N is number of input points 321 | ''' 322 | points = points.view(-1, 3) 323 | pass 324 | 325 | def get_color( 326 | self, 327 | points 328 | ): 
329 | ''' 330 | TODO: Q7 331 | Output: 332 | distance: N X 3 Tensor, where N is number of input points 333 | ''' 334 | points = points.view(-1, 3) 335 | pass 336 | 337 | def get_distance_color( 338 | self, 339 | points 340 | ): 341 | ''' 342 | TODO: Q7 343 | Output: 344 | distance, points: N X 1, N X 3 Tensors, where N is number of input points 345 | You may just implement this by independent calls to get_distance, get_color 346 | but, depending on your MLP implementation, it maybe more efficient to share some computation 347 | ''' 348 | 349 | def forward(self, points): 350 | return self.get_distance(points) 351 | 352 | def get_distance_and_gradient( 353 | self, 354 | points 355 | ): 356 | has_grad = torch.is_grad_enabled() 357 | points = points.view(-1, 3) 358 | 359 | # Calculate gradient with respect to points 360 | with torch.enable_grad(): 361 | points = points.requires_grad_(True) 362 | distance = self.get_distance(points) 363 | gradient = autograd.grad( 364 | distance, 365 | points, 366 | torch.ones_like(distance, device=points.device), 367 | create_graph=has_grad, 368 | retain_graph=has_grad, 369 | only_inputs=True 370 | )[0] 371 | 372 | return distance, gradient 373 | 374 | 375 | implicit_dict = { 376 | 'sdf_volume': SDFVolume, 377 | 'nerf': NeuralRadianceField, 378 | 'sdf_surface': SDFSurface, 379 | 'neural_surface': NeuralSurface, 380 | } 381 | -------------------------------------------------------------------------------- /losses.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | 4 | def eikonal_loss(gradients): 5 | # TODO (Q6): Implement eikonal loss 6 | pass 7 | 8 | def sphere_loss(signed_distance, points, radius=1.0): 9 | return torch.square(signed_distance[..., 0] - (torch.norm(points, dim=-1) - radius)).mean() 10 | 11 | def get_random_points(num_points, bounds, device): 12 | min_bound = torch.tensor(bounds[0], device=device).unsqueeze(0) 13 | max_bound = torch.tensor(bounds[1], device=device).unsqueeze(0) 14 | 15 | return torch.rand((num_points, 3), device=device) * (max_bound - min_bound) + min_bound 16 | 17 | def select_random_points(points, n_points): 18 | points_sub = points[torch.randperm(points.shape[0])] 19 | return points_sub.reshape(-1, 3)[:n_points] 20 | -------------------------------------------------------------------------------- /ray_utils.py: -------------------------------------------------------------------------------- 1 | import math 2 | from typing import List, NamedTuple 3 | 4 | import torch 5 | import torch.nn.functional as F 6 | from pytorch3d.renderer.cameras import CamerasBase 7 | 8 | 9 | # Convenience class wrapping several ray inputs: 10 | # 1) Origins -- ray origins 11 | # 2) Directions -- ray directions 12 | # 3) Sample points -- sample points along ray direction from ray origin 13 | # 4) Sample lengths -- distance of sample points from ray origin 14 | 15 | class RayBundle(object): 16 | def __init__( 17 | self, 18 | origins, 19 | directions, 20 | sample_points, 21 | sample_lengths, 22 | ): 23 | self.origins = origins 24 | self.directions = directions 25 | self.sample_points = sample_points 26 | self.sample_lengths = sample_lengths 27 | 28 | def __getitem__(self, idx): 29 | return RayBundle( 30 | self.origins[idx], 31 | self.directions[idx], 32 | self.sample_points[idx], 33 | self.sample_lengths[idx], 34 | ) 35 | 36 | @property 37 | def shape(self): 38 | return self.origins.shape[:-1] 39 | 40 | @property 41 | def sample_shape(self): 42 | return 
self.sample_points.shape[:-1] 43 | 44 | def reshape(self, *args): 45 | return RayBundle( 46 | self.origins.reshape(*args, 3), 47 | self.directions.reshape(*args, 3), 48 | self.sample_points.reshape(*args, self.sample_points.shape[-2], 3), 49 | self.sample_lengths.reshape(*args, self.sample_lengths.shape[-2], 3), 50 | ) 51 | 52 | def view(self, *args): 53 | return RayBundle( 54 | self.origins.view(*args, 3), 55 | self.directions.view(*args, 3), 56 | self.sample_points.view(*args, self.sample_points.shape[-2], 3), 57 | self.sample_lengths.view(*args, self.sample_lengths.shape[-2], 3), 58 | ) 59 | 60 | def _replace(self, **kwargs): 61 | for key in kwargs.keys(): 62 | setattr(self, key, kwargs[key]) 63 | 64 | return self 65 | 66 | 67 | # Sample image colors from pixel values 68 | def sample_images_at_xy( 69 | images: torch.Tensor, 70 | xy_grid: torch.Tensor, 71 | ): 72 | batch_size = images.shape[0] 73 | spatial_size = images.shape[1:-1] 74 | 75 | xy_grid = -xy_grid.view(batch_size, -1, 1, 2) 76 | 77 | images_sampled = torch.nn.functional.grid_sample( 78 | images.permute(0, 3, 1, 2), 79 | xy_grid, 80 | align_corners=True, 81 | mode="bilinear", 82 | ) 83 | 84 | return images_sampled.permute(0, 2, 3, 1).view(-1, images.shape[-1]) 85 | 86 | 87 | # Generate pixel coordinates from in NDC space (from [-1, 1]) 88 | def get_pixels_from_image(image_size, camera): 89 | W, H = image_size[0], image_size[1] 90 | 91 | # TODO (Q1.3): Generate pixel coordinates from [0, W] in x and [0, H] in y 92 | pass 93 | 94 | # TODO (Q1.3): Convert to the range [-1, 1] in both x and y 95 | pass 96 | 97 | # Create grid of coordinates 98 | xy_grid = torch.stack( 99 | tuple( reversed( torch.meshgrid(y, x) ) ), 100 | dim=-1, 101 | ).view(W * H, 2) 102 | 103 | return -xy_grid 104 | 105 | 106 | # Random subsampling of pixels from an image 107 | def get_random_pixels_from_image(n_pixels, image_size, camera): 108 | xy_grid = get_pixels_from_image(image_size, camera) 109 | 110 | # TODO (Q2.1): Random subsampling of pixel coordinaters 111 | pass 112 | 113 | # Return 114 | return xy_grid_sub.reshape(-1, 2)[:n_pixels] 115 | 116 | 117 | # Get rays from pixel values 118 | def get_rays_from_pixels(xy_grid, image_size, camera): 119 | W, H = image_size[0], image_size[1] 120 | 121 | # TODO (Q1.3): Map pixels to points on the image plane at Z=1 122 | pass 123 | 124 | ndc_points = torch.cat( 125 | [ 126 | ndc_points, 127 | torch.ones_like(ndc_points[..., -1:]) 128 | ], 129 | dim=-1 130 | ) 131 | 132 | # TODO (Q1.3): Use camera.unproject to get world space points from NDC space points 133 | pass 134 | 135 | # TODO (Q1.3): Get ray origins from camera center 136 | pass 137 | 138 | # TODO (Q1.3): Get ray directions as image_plane_points - rays_o 139 | pass 140 | 141 | # Create and return RayBundle 142 | return RayBundle( 143 | rays_o, 144 | rays_d, 145 | torch.zeros_like(rays_o).unsqueeze(1), 146 | torch.zeros_like(rays_o).unsqueeze(1), 147 | ) -------------------------------------------------------------------------------- /render_functions.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import sys 4 | import datetime 5 | import time 6 | import math 7 | import json 8 | import torch 9 | 10 | import numpy as np 11 | from PIL import Image 12 | 13 | import matplotlib.pyplot as plt 14 | import pytorch3d 15 | import torch 16 | 17 | from pytorch3d.renderer import look_at_view_transform 18 | from pytorch3d.renderer import OpenGLPerspectiveCameras 19 | from pytorch3d.renderer 
import ( 20 | AlphaCompositor, 21 | RasterizationSettings, 22 | MeshRenderer, 23 | MeshRasterizer, 24 | PointsRasterizationSettings, 25 | PointsRenderer, 26 | PointsRasterizer, 27 | HardPhongShader, 28 | ) 29 | 30 | import mcubes 31 | 32 | def get_device(): 33 | """ 34 | Checks if GPU is available and returns device accordingly. 35 | """ 36 | if torch.cuda.is_available(): 37 | device = torch.device("cuda:0") 38 | else: 39 | device = torch.device("cpu") 40 | return device 41 | 42 | def get_points_renderer( 43 | image_size=512, device=None, radius=0.01, background_color=(1, 1, 1) 44 | ): 45 | """ 46 | Returns a Pytorch3D renderer for point clouds. 47 | 48 | Args: 49 | image_size (int): The rendered image size. 50 | device (torch.device): The torch device to use (CPU or GPU). If not specified, 51 | will automatically use GPU if available, otherwise CPU. 52 | radius (float): The radius of the rendered point in NDC. 53 | background_color (tuple): The background color of the rendered image. 54 | 55 | Returns: 56 | PointsRenderer. 57 | """ 58 | if device is None: 59 | if torch.cuda.is_available(): 60 | device = torch.device("cuda:0") 61 | else: 62 | device = torch.device("cpu") 63 | raster_settings = PointsRasterizationSettings(image_size=image_size, radius=radius,) 64 | renderer = PointsRenderer( 65 | rasterizer=PointsRasterizer(raster_settings=raster_settings), 66 | compositor=AlphaCompositor(background_color=background_color), 67 | ) 68 | return renderer 69 | 70 | 71 | def render_points(filename, points, image_size=256, color=[0.7, 0.7, 1], device=None): 72 | # The device tells us whether we are rendering with GPU or CPU. The rendering will 73 | # be *much* faster if you have a CUDA-enabled NVIDIA GPU. However, your code will 74 | # still run fine on a CPU. 75 | # The default is to run on CPU, so if you do not have a GPU, you do not need to 76 | # worry about specifying the device in all of these functions. 77 | if device is None: 78 | device = get_device() 79 | 80 | # Get the renderer. 81 | points_renderer = get_points_renderer(image_size=256,radius=0.01) 82 | 83 | # Get the vertices, faces, and textures. 84 | # vertices, faces = load_cow_mesh(cow_path) 85 | # vertices = vertices.unsqueeze(0) # (N_v, 3) -> (1, N_v, 3) 86 | # faces = faces.unsqueeze(0) # (N_f, 3) -> (1, N_f, 3) 87 | textures = torch.ones(points.size()).to(device)*0.5 # (1, N_v, 3) 88 | rgb = textures * torch.tensor(color).to(device) # (1, N_v, 3) 89 | 90 | point_cloud = pytorch3d.structures.pointclouds.Pointclouds( 91 | points=points, features=rgb 92 | ) 93 | 94 | R, T = look_at_view_transform(10.0, 10.0, 96) 95 | 96 | 97 | # Prepare the camera: 98 | cameras = OpenGLPerspectiveCameras( 99 | R=R,T=T, device=device 100 | ) 101 | 102 | rend = points_renderer(point_cloud.extend(2), cameras=cameras) 103 | 104 | 105 | # Place a point light in front of the cow. 106 | # lights = pytorch3d.renderer.PointLights(location=[[0.0, 1.0, -2.0]], device=device) 107 | 108 | # rend = renderer(mesh, cameras=cameras, lights=lights) 109 | rend = rend.detach().cpu().numpy()[0, ..., :3] # (B, H, W, 4) -> (H, W, 3) 110 | plt.imsave(filename, rend) 111 | 112 | # The .cpu moves the tensor to GPU (if needed). 113 | return rend 114 | 115 | def render_points_with_save( 116 | points, 117 | cameras, 118 | image_size, 119 | save=False, 120 | file_prefix='', 121 | color=[0.7, 0.7, 1] 122 | ): 123 | device = points.device 124 | if device is None: 125 | device = get_device() 126 | 127 | # Get the renderer. 
128 | points_renderer = get_points_renderer(image_size=image_size[0], radius=0.01) 129 | 130 | textures = torch.ones(points.size()).to(device) # (1, N_v, 3) 131 | rgb = textures * torch.tensor(color).to(device) # (1, N_v, 3) 132 | 133 | point_cloud = pytorch3d.structures.pointclouds.Pointclouds( 134 | points=points, features=rgb 135 | ) 136 | 137 | all_images = [] 138 | with torch.no_grad(): 139 | torch.cuda.empty_cache() 140 | for cam_idx in range(len(cameras)): 141 | image = points_renderer(point_cloud, cameras=cameras[cam_idx].to(device)) 142 | image = image[0,:,:,:3].detach().cpu().numpy() 143 | all_images.append(image) 144 | 145 | # Save 146 | if save: 147 | plt.imsave( 148 | f'{file_prefix}_{cam_idx}.png', 149 | image 150 | ) 151 | 152 | return all_images 153 | 154 | def get_mesh_renderer(image_size=512, lights=None, device=None): 155 | """ 156 | Returns a Pytorch3D Mesh Renderer. 157 | Args: 158 | image_size (int): The rendered image size. 159 | lights: A default Pytorch3D lights object. 160 | device (torch.device): The torch device to use (CPU or GPU). If not specified, 161 | will automatically use GPU if available, otherwise CPU. 162 | """ 163 | if device is None: 164 | if torch.cuda.is_available(): 165 | device = torch.device("cuda:0") 166 | else: 167 | device = torch.device("cpu") 168 | raster_settings = RasterizationSettings( 169 | image_size=image_size, blur_radius=0.0, faces_per_pixel=1, 170 | ) 171 | renderer = MeshRenderer( 172 | rasterizer=MeshRasterizer(raster_settings=raster_settings), 173 | shader=HardPhongShader(device=device, lights=lights), 174 | ) 175 | return renderer 176 | 177 | 178 | def implicit_to_mesh(implicit_fn, scale=0.5, grid_size=128, device='cpu', color=[0.7, 0.7, 1], chunk_size=262144, thresh=0): 179 | Xs = torch.linspace(-1*scale, scale, grid_size+1).to(device) 180 | Ys = torch.linspace(-1*scale, scale, grid_size+1).to(device) 181 | Zs = torch.linspace(-1*scale, scale, grid_size+1).to(device) 182 | grid = torch.stack(torch.meshgrid(Xs, Ys, Zs), dim=-1) 183 | 184 | grid = grid.view(-1, 3) 185 | num_points = grid.shape[0] 186 | sdfs = torch.zeros(num_points) 187 | 188 | with torch.no_grad(): 189 | for chunk_start in range(0, num_points, chunk_size): 190 | torch.cuda.empty_cache() 191 | chunk_end = min(num_points, chunk_start+chunk_size) 192 | sdfs[chunk_start:chunk_end] = implicit_fn.get_distance(grid[chunk_start:chunk_end,:]).view(-1) 193 | 194 | sdfs = sdfs.view(grid_size+1, grid_size+1, grid_size+1) 195 | 196 | vertices, triangles = mcubes.marching_cubes(sdfs.cpu().numpy(), thresh) 197 | # normalize to [-scale, scale] 198 | vertices = (vertices/grid_size - 0.5)*2*scale 199 | 200 | vertices = torch.from_numpy(vertices).unsqueeze(0).float() 201 | faces = torch.from_numpy(triangles.astype(np.int64)).unsqueeze(0) 202 | 203 | textures = torch.ones_like(vertices) # (1, N_v, 3) 204 | textures = textures * torch.tensor(color) # (1, N_v, 3) 205 | mesh = pytorch3d.structures.Meshes( 206 | verts=vertices, 207 | faces=faces, 208 | textures=pytorch3d.renderer.TexturesVertex(textures), 209 | ) 210 | mesh = mesh.to(device) 211 | return mesh 212 | 213 | 214 | def render_geometry( 215 | model, 216 | cameras, 217 | image_size, 218 | save=False, 219 | thresh=0., 220 | file_prefix='' 221 | ): 222 | device = list(model.parameters())[0].device 223 | lights = pytorch3d.renderer.PointLights(location=[[0, 0, -3]], device=device) 224 | mesh_renderer = get_mesh_renderer(image_size=image_size[0], lights=lights, device=device) 225 | 226 | mesh = 
implicit_to_mesh(model.implicit_fn, scale=3, device=device, thresh=thresh) 227 | all_images = [] 228 | with torch.no_grad(): 229 | torch.cuda.empty_cache() 230 | for cam_idx in range(len(cameras)): 231 | image = mesh_renderer(mesh, cameras=cameras[cam_idx].to(device)) 232 | image = image[0,:,:,:3].detach().cpu().numpy() 233 | all_images.append(image) 234 | 235 | # Save 236 | if save: 237 | plt.imsave( 238 | f'{file_prefix}_{cam_idx}.png', 239 | image 240 | ) 241 | 242 | return all_images -------------------------------------------------------------------------------- /renderer.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | from typing import List, Optional, Tuple 4 | from pytorch3d.renderer.cameras import CamerasBase 5 | 6 | 7 | # Volume renderer which integrates color and density along rays 8 | # according to the equations defined in [Mildenhall et al. 2020] 9 | class VolumeRenderer(torch.nn.Module): 10 | def __init__( 11 | self, 12 | cfg 13 | ): 14 | super().__init__() 15 | 16 | self._chunk_size = cfg.chunk_size 17 | self._white_background = cfg.white_background if 'white_background' in cfg else False 18 | 19 | def _compute_weights( 20 | self, 21 | deltas, 22 | rays_density: torch.Tensor, 23 | eps: float = 1e-10 24 | ): 25 | # TODO (1.5): Compute transmittance using the equation described in the README 26 | pass 27 | 28 | # TODO (1.5): Compute weight used for rendering from transmittance and alpha 29 | return weights 30 | 31 | def _aggregate( 32 | self, 33 | weights: torch.Tensor, 34 | rays_feature: torch.Tensor 35 | ): 36 | # TODO (1.5): Aggregate (weighted sum of) features using weights 37 | pass 38 | 39 | return feature 40 | 41 | def forward( 42 | self, 43 | sampler, 44 | implicit_fn, 45 | ray_bundle, 46 | ): 47 | B = ray_bundle.shape[0] 48 | 49 | # Process the chunks of rays. 50 | chunk_outputs = [] 51 | 52 | for chunk_start in range(0, B, self._chunk_size): 53 | cur_ray_bundle = ray_bundle[chunk_start:chunk_start+self._chunk_size] 54 | 55 | # Sample points along the ray 56 | cur_ray_bundle = sampler(cur_ray_bundle) 57 | n_pts = cur_ray_bundle.sample_shape[1] 58 | 59 | # Call implicit function with sample points 60 | implicit_output = implicit_fn(cur_ray_bundle) 61 | density = implicit_output['density'] 62 | feature = implicit_output['feature'] 63 | 64 | # Compute length of each ray segment 65 | depth_values = cur_ray_bundle.sample_lengths[..., 0] 66 | deltas = torch.cat( 67 | ( 68 | depth_values[..., 1:] - depth_values[..., :-1], 69 | 1e10 * torch.ones_like(depth_values[..., :1]), 70 | ), 71 | dim=-1, 72 | )[..., None] 73 | 74 | # Compute aggregation weights 75 | weights = self._compute_weights( 76 | deltas.view(-1, n_pts, 1), 77 | density.view(-1, n_pts, 1) 78 | ) 79 | 80 | # TODO (1.5): Render (color) features using weights 81 | pass 82 | 83 | # TODO (1.5): Render depth map 84 | pass 85 | 86 | # Return 87 | cur_out = { 88 | 'feature': feature, 89 | 'depth': depth, 90 | } 91 | 92 | chunk_outputs.append(cur_out) 93 | 94 | # Concatenate chunk outputs 95 | out = { 96 | k: torch.cat( 97 | [chunk_out[k] for chunk_out in chunk_outputs], 98 | dim=0 99 | ) for k in chunk_outputs[0].keys() 100 | } 101 | 102 | return out 103 | 104 | 105 | # Volume renderer which integrates color and density along rays 106 | # according to the equations defined in [Mildenhall et al. 
2020] 107 | class SphereTracingRenderer(torch.nn.Module): 108 | def __init__( 109 | self, 110 | cfg 111 | ): 112 | super().__init__() 113 | 114 | self._chunk_size = cfg.chunk_size 115 | self.near = cfg.near 116 | self.far = cfg.far 117 | self.max_iters = cfg.max_iters 118 | 119 | def sphere_tracing( 120 | self, 121 | implicit_fn, 122 | origins, # Nx3 123 | directions, # Nx3 124 | ): 125 | ''' 126 | Input: 127 | implicit_fn: a module that computes a SDF at a query point 128 | origins: N_rays X 3 129 | directions: N_rays X 3 130 | Output: 131 | points: N_rays X 3 points indicating ray-surface intersections. For rays that do not intersect the surface, 132 | the point can be arbitrary. 133 | mask: N_rays X 1 (boolean tensor) denoting which of the input rays intersect the surface. 134 | ''' 135 | # TODO (Q5): Implement sphere tracing 136 | # 1) Iteratively update points and distance to the closest surface 137 | # in order to compute intersection points of rays with the implicit surface 138 | # 2) Maintain a mask with the same batch dimension as the ray origins, 139 | # indicating which points hit the surface, and which do not 140 | pass 141 | 142 | def forward( 143 | self, 144 | sampler, 145 | implicit_fn, 146 | ray_bundle, 147 | light_dir=None 148 | ): 149 | B = ray_bundle.shape[0] 150 | 151 | # Process the chunks of rays. 152 | chunk_outputs = [] 153 | 154 | for chunk_start in range(0, B, self._chunk_size): 155 | cur_ray_bundle = ray_bundle[chunk_start:chunk_start+self._chunk_size] 156 | points, mask = self.sphere_tracing( 157 | implicit_fn, 158 | cur_ray_bundle.origins, 159 | cur_ray_bundle.directions 160 | ) 161 | mask = mask.repeat(1,3) 162 | isect_points = points[mask].view(-1, 3) 163 | 164 | # Get color from implicit function with intersection points 165 | isect_color = implicit_fn.get_color(isect_points) 166 | 167 | # Return 168 | color = torch.zeros_like(cur_ray_bundle.origins) 169 | color[mask] = isect_color.view(-1) 170 | 171 | cur_out = { 172 | 'color': color.view(-1, 3), 173 | } 174 | 175 | chunk_outputs.append(cur_out) 176 | 177 | # Concatenate chunk outputs 178 | out = { 179 | k: torch.cat( 180 | [chunk_out[k] for chunk_out in chunk_outputs], 181 | dim=0 182 | ) for k in chunk_outputs[0].keys() 183 | } 184 | 185 | return out 186 | 187 | 188 | def sdf_to_density(signed_distance, alpha, beta): 189 | # TODO (Q7): Convert signed distance to density with alpha, beta parameters 190 | pass 191 | 192 | class VolumeSDFRenderer(VolumeRenderer): 193 | def __init__( 194 | self, 195 | cfg 196 | ): 197 | super().__init__(cfg) 198 | 199 | self._chunk_size = cfg.chunk_size 200 | self._white_background = cfg.white_background if 'white_background' in cfg else False 201 | self.alpha = cfg.alpha 202 | self.beta = cfg.beta 203 | 204 | self.cfg = cfg 205 | 206 | def forward( 207 | self, 208 | sampler, 209 | implicit_fn, 210 | ray_bundle, 211 | light_dir=None 212 | ): 213 | B = ray_bundle.shape[0] 214 | 215 | # Process the chunks of rays. 
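        # Hint (one possible approach, not the reference solution): the Q7 `sdf_to_density`
        # stub above, and the "convert SDF to density" TODO in the loop below, can mirror the
        # Laplace-CDF conversion already implemented in `SDFVolume._sdf_to_density` (implicit.py):
        #
        #     def sdf_to_density(signed_distance, alpha, beta):
        #         return alpha * torch.where(
        #             signed_distance > 0,
        #             0.5 * torch.exp(-signed_distance / beta),
        #             1 - 0.5 * torch.exp(signed_distance / beta),
        #         )
        #
        # Here `alpha` scales the density and `beta` controls how sharply it falls off across
        # the surface; this sketch assumes the same convention as the SDFVolume code in implicit.py.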
216 | chunk_outputs = [] 217 | 218 | for chunk_start in range(0, B, self._chunk_size): 219 | cur_ray_bundle = ray_bundle[chunk_start:chunk_start+self._chunk_size] 220 | 221 | # Sample points along the ray 222 | cur_ray_bundle = sampler(cur_ray_bundle) 223 | n_pts = cur_ray_bundle.sample_shape[1] 224 | 225 | # Call implicit function with sample points 226 | distance, color = implicit_fn.get_distance_color(cur_ray_bundle.sample_points) 227 | density = None # TODO (Q7): convert SDF to density 228 | 229 | # Compute length of each ray segment 230 | depth_values = cur_ray_bundle.sample_lengths[..., 0] 231 | deltas = torch.cat( 232 | ( 233 | depth_values[..., 1:] - depth_values[..., :-1], 234 | 1e10 * torch.ones_like(depth_values[..., :1]), 235 | ), 236 | dim=-1, 237 | )[..., None] 238 | 239 | # Compute aggregation weights 240 | weights = self._compute_weights( 241 | deltas.view(-1, n_pts, 1), 242 | density.view(-1, n_pts, 1) 243 | ) 244 | 245 | geometry_color = torch.zeros_like(color) 246 | 247 | # Compute color 248 | color = self._aggregate( 249 | weights, 250 | color.view(-1, n_pts, color.shape[-1]) 251 | ) 252 | 253 | # Return 254 | cur_out = { 255 | 'color': color, 256 | "geometry": geometry_color 257 | } 258 | 259 | chunk_outputs.append(cur_out) 260 | 261 | # Concatenate chunk outputs 262 | out = { 263 | k: torch.cat( 264 | [chunk_out[k] for chunk_out in chunk_outputs], 265 | dim=0 266 | ) for k in chunk_outputs[0].keys() 267 | } 268 | 269 | return out 270 | 271 | 272 | renderer_dict = { 273 | 'volume': VolumeRenderer, 274 | 'sphere_tracing': SphereTracingRenderer, 275 | 'volume_sdf': VolumeSDFRenderer 276 | } 277 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | torchvision 2 | hydra-core 3 | Pillow 4 | plotly 5 | requests 6 | imageio 7 | matplotlib 8 | numpy<2.0.0 9 | PyMCubes 10 | tqdm 11 | visdom -------------------------------------------------------------------------------- /sampler.py: -------------------------------------------------------------------------------- 1 | import math 2 | from typing import List 3 | 4 | import torch 5 | from ray_utils import RayBundle 6 | from pytorch3d.renderer.cameras import CamerasBase 7 | 8 | 9 | # Sampler which implements stratified (uniform) point sampling along rays 10 | class StratifiedRaysampler(torch.nn.Module): 11 | def __init__( 12 | self, 13 | cfg 14 | ): 15 | super().__init__() 16 | 17 | self.n_pts_per_ray = cfg.n_pts_per_ray 18 | self.min_depth = cfg.min_depth 19 | self.max_depth = cfg.max_depth 20 | 21 | def forward( 22 | self, 23 | ray_bundle, 24 | ): 25 | # TODO (Q1.4): Compute z values for self.n_pts_per_ray points uniformly sampled between [near, far] 26 | z_vals = None 27 | 28 | # TODO (Q1.4): Sample points from z values 29 | sample_points = None 30 | 31 | # Return 32 | return ray_bundle._replace( 33 | sample_points=sample_points, 34 | sample_lengths=z_vals * torch.ones_like(sample_points[..., :1]), 35 | ) 36 | 37 | 38 | sampler_dict = { 39 | 'stratified': StratifiedRaysampler 40 | } -------------------------------------------------------------------------------- /surface_rendering_main.py: -------------------------------------------------------------------------------- 1 | import os 2 | import warnings 3 | 4 | import hydra 5 | import numpy as np 6 | import torch 7 | import tqdm 8 | import imageio 9 | 10 | from omegaconf import DictConfig 11 | from PIL import Image 12 | from pytorch3d.renderer 
import ( 13 | PerspectiveCameras, 14 | look_at_view_transform 15 | ) 16 | import matplotlib.pyplot as plt 17 | 18 | from sampler import sampler_dict 19 | from implicit import implicit_dict 20 | from renderer import renderer_dict 21 | from losses import eikonal_loss, sphere_loss, get_random_points, select_random_points 22 | 23 | from ray_utils import ( 24 | sample_images_at_xy, 25 | get_pixels_from_image, 26 | get_random_pixels_from_image, 27 | get_rays_from_pixels 28 | ) 29 | from data_utils import ( 30 | dataset_from_config, 31 | create_surround_cameras, 32 | vis_grid, 33 | vis_rays, 34 | ) 35 | from dataset import ( 36 | get_nerf_datasets, 37 | trivial_collate, 38 | ) 39 | from render_functions import render_geometry 40 | from render_functions import render_points_with_save 41 | 42 | 43 | # Model class containing: 44 | # 1) Implicit function defining the scene 45 | # 2) Sampling scheme which generates sample points along rays 46 | # 3) Renderer which can render an implicit function given a sampling scheme 47 | 48 | class Model(torch.nn.Module): 49 | def __init__( 50 | self, 51 | cfg 52 | ): 53 | super().__init__() 54 | 55 | # Get implicit function from config 56 | self.implicit_fn = implicit_dict[cfg.implicit_function.type]( 57 | cfg.implicit_function 58 | ) 59 | 60 | # Point sampling (raymarching) scheme 61 | self.sampler = sampler_dict[cfg.sampler.type]( 62 | cfg.sampler 63 | ) 64 | 65 | # Initialize implicit renderer 66 | self.renderer = renderer_dict[cfg.renderer.type]( 67 | cfg.renderer 68 | ) 69 | 70 | def forward( 71 | self, 72 | ray_bundle, 73 | light_dir=None 74 | ): 75 | # Call renderer with 76 | # a) Implicit function 77 | # b) Sampling routine 78 | 79 | return self.renderer( 80 | self.sampler, 81 | self.implicit_fn, 82 | ray_bundle, 83 | light_dir 84 | ) 85 | 86 | 87 | def render_images( 88 | model, 89 | cameras, 90 | image_size, 91 | save=False, 92 | file_prefix='', 93 | lights=None, 94 | feat='color' 95 | ): 96 | all_images = [] 97 | device = list(model.parameters())[0].device 98 | 99 | for cam_idx, camera in enumerate(cameras): 100 | print(f'Rendering image {cam_idx}') 101 | 102 | with torch.no_grad(): 103 | torch.cuda.empty_cache() 104 | 105 | # Get rays 106 | camera = camera.to(device) 107 | light_dir = None 108 | # We assume the object is placed at the origin 109 | origin = torch.tensor([0.0, 0.0, 0.0], device=device) 110 | light_location = None if lights is None else lights[cam_idx].location.to(device) 111 | if lights is not None: 112 | light_dir = None #TODO: Use light location and origin to compute light direction 113 | light_dir = torch.nn.functional.normalize(light_dir, dim=-1).view(-1, 3) 114 | xy_grid = get_pixels_from_image(image_size, camera) 115 | ray_bundle = get_rays_from_pixels(xy_grid, image_size, camera) 116 | 117 | # Run model forward 118 | out = model(ray_bundle, light_dir) 119 | 120 | # Return rendered features (colors) 121 | image = np.array( 122 | out[feat].view( 123 | image_size[1], image_size[0], 3 124 | ).detach().cpu() 125 | ) 126 | all_images.append(image) 127 | 128 | # Save 129 | if save: 130 | plt.imsave( 131 | f'{file_prefix}_{cam_idx}.png', 132 | image 133 | ) 134 | 135 | return all_images 136 | 137 | 138 | def render( 139 | cfg, 140 | ): 141 | # Create model 142 | model = Model(cfg) 143 | model = model.cuda(); model.eval() 144 | 145 | # Render spiral 146 | cameras = create_surround_cameras(3.0, n_poses=20, up=(0.0, 0.0, 1.0)) 147 | all_images = render_images( 148 | model, cameras, cfg.data.image_size 149 | ) 150 | 
imageio.mimsave('images/part_5.gif', [np.uint8(im * 255) for im in all_images],loop=0) 151 | 152 | 153 | def create_model(cfg): 154 | # Create model 155 | model = Model(cfg) 156 | model.cuda(); model.train() 157 | 158 | # Load checkpoints 159 | optimizer_state_dict = None 160 | start_epoch = 0 161 | 162 | checkpoint_path = os.path.join( 163 | hydra.utils.get_original_cwd(), 164 | cfg.training.checkpoint_path 165 | ) 166 | 167 | if len(cfg.training.checkpoint_path) > 0: 168 | # Make the root of the experiment directory. 169 | checkpoint_dir = os.path.split(checkpoint_path)[0] 170 | os.makedirs(checkpoint_dir, exist_ok=True) 171 | 172 | # Resume training if requested. 173 | if cfg.training.resume and os.path.isfile(checkpoint_path): 174 | print(f"Resuming from checkpoint {checkpoint_path}.") 175 | loaded_data = torch.load(checkpoint_path) 176 | model.load_state_dict(loaded_data["model"]) 177 | start_epoch = loaded_data["epoch"] 178 | 179 | print(f" => resuming from epoch {start_epoch}.") 180 | optimizer_state_dict = loaded_data["optimizer"] 181 | 182 | # Initialize the optimizer. 183 | optimizer = torch.optim.Adam( 184 | model.parameters(), 185 | lr=cfg.training.lr, 186 | ) 187 | 188 | # Load the optimizer state dict in case we are resuming. 189 | if optimizer_state_dict is not None: 190 | optimizer.load_state_dict(optimizer_state_dict) 191 | optimizer.last_epoch = start_epoch 192 | 193 | # The learning rate scheduling is implemented with LambdaLR PyTorch scheduler. 194 | def lr_lambda(epoch): 195 | return cfg.training.lr_scheduler_gamma ** ( 196 | epoch / cfg.training.lr_scheduler_step_size 197 | ) 198 | 199 | lr_scheduler = torch.optim.lr_scheduler.LambdaLR( 200 | optimizer, lr_lambda, last_epoch=start_epoch - 1, verbose=False 201 | ) 202 | 203 | return model, optimizer, lr_scheduler, start_epoch, checkpoint_path 204 | 205 | 206 | def train_points( 207 | cfg 208 | ): 209 | # Create model 210 | model, optimizer, lr_scheduler, start_epoch, checkpoint_path = create_model(cfg) 211 | 212 | # Pretrain SDF 213 | pretrain_sdf(cfg, model) 214 | 215 | # Load pointcloud 216 | point_cloud = np.load(cfg.data.point_cloud_path) 217 | all_points = torch.Tensor(point_cloud["verts"][::2]).cuda().view(-1, 3) 218 | all_points = all_points - torch.mean(all_points, dim=0).unsqueeze(0) 219 | 220 | point_images = render_points_with_save( 221 | all_points.unsqueeze(0), create_surround_cameras(3.0, n_poses=20, up=(0.0, 1.0, 0.0), focal_length=2.0), 222 | cfg.data.image_size, file_prefix='points' 223 | ) 224 | imageio.mimsave('images/part_6_input.gif', [np.uint8(im * 255) for im in point_images], loop=0) 225 | 226 | # Run the main training loop. 
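    # Hint (common defaults, not the reference solution): the Q6 TODOs inside the loop below
    # are often implemented as
    #     loss = torch.abs(distances).mean()        # SDF should vanish on point-cloud samples
    # and, in losses.eikonal_loss(gradients),
    #     return torch.square(torch.linalg.norm(gradients, dim=-1) - 1.0).mean()
    # i.e. an IGR-style surface-fitting term plus the unit-gradient-norm eikonal regularizer.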
227 | for epoch in range(0, cfg.training.num_epochs): 228 | t_range = tqdm.tqdm(range(0, all_points.shape[0], cfg.training.batch_size)) 229 | 230 | for idx in t_range: 231 | # Select random points from pointcloud 232 | points = select_random_points(all_points, cfg.training.batch_size) 233 | 234 | # Get distances and enforce point cloud loss 235 | distances, gradients = model.implicit_fn.get_distance_and_gradient(points) 236 | loss = None # TODO (Q6): Point cloud SDF loss on distances 237 | point_loss = loss 238 | 239 | # Sample random points in bounding box 240 | eikonal_points = get_random_points( 241 | cfg.training.batch_size, cfg.training.bounds, 'cuda' 242 | ) 243 | 244 | # Get sdf gradients and enforce eikonal loss 245 | eikonal_distances, eikonal_gradients = model.implicit_fn.get_distance_and_gradient(eikonal_points) 246 | loss += torch.exp(-1e2 * torch.abs(eikonal_distances)).mean() * cfg.training.inter_weight 247 | loss += eikonal_loss(eikonal_gradients) * cfg.training.eikonal_weight # TODO (Q6): Implement eikonal loss 248 | 249 | # Take the training step. 250 | optimizer.zero_grad() 251 | loss.backward() 252 | optimizer.step() 253 | 254 | t_range.set_description(f'Epoch: {epoch:04d}, Loss: {point_loss:.06f}') 255 | t_range.refresh() 256 | 257 | # Checkpoint. 258 | if ( 259 | epoch % cfg.training.checkpoint_interval == 0 260 | and len(cfg.training.checkpoint_path) > 0 261 | and epoch > 0 262 | ): 263 | print(f"Storing checkpoint {checkpoint_path}.") 264 | 265 | data_to_store = { 266 | "model": model.state_dict(), 267 | "optimizer": optimizer.state_dict(), 268 | "epoch": epoch, 269 | } 270 | 271 | torch.save(data_to_store, checkpoint_path) 272 | 273 | # Render 274 | if ( 275 | epoch % cfg.training.render_interval == 0 276 | and epoch > 0 277 | ): 278 | try: 279 | test_images = render_geometry( 280 | model, create_surround_cameras(3.0, n_poses=20, up=(0.0, 1.0, 0.0), focal_length=2.0), 281 | cfg.data.image_size, file_prefix='eikonal', thresh=0.002, 282 | ) 283 | imageio.mimsave('images/part_6.gif', [np.uint8(im * 255) for im in test_images], loop=0) 284 | except Exception as e: 285 | print("Empty mesh") 286 | pass 287 | 288 | 289 | def pretrain_sdf( 290 | cfg, 291 | model 292 | ): 293 | optimizer = torch.optim.Adam( 294 | model.parameters(), 295 | lr=cfg.training.lr, 296 | ) 297 | 298 | # Run the main training loop. 299 | for iter in range(0, cfg.training.pretrain_iters): 300 | points = get_random_points( 301 | cfg.training.batch_size, cfg.training.bounds, 'cuda' 302 | ) 303 | 304 | # Run model forward 305 | distances = model.implicit_fn.get_distance(points) 306 | loss = sphere_loss(distances, points, 1.0) 307 | 308 | # Take the training step 309 | optimizer.zero_grad() 310 | loss.backward() 311 | optimizer.step() 312 | 313 | 314 | def train_images( 315 | cfg 316 | ): 317 | # Create model 318 | model, optimizer, lr_scheduler, start_epoch, checkpoint_path = create_model(cfg) 319 | 320 | # Load the training/validation data. 321 | train_dataset, val_dataset, _ = get_nerf_datasets( 322 | dataset_name=cfg.data.dataset_name, 323 | image_size=[cfg.data.image_size[1], cfg.data.image_size[0]], 324 | ) 325 | 326 | train_dataloader = torch.utils.data.DataLoader( 327 | train_dataset, 328 | batch_size=1, 329 | shuffle=True, 330 | num_workers=0, 331 | collate_fn=lambda batch: batch, 332 | ) 333 | 334 | # Pretrain SDF 335 | pretrain_sdf(cfg, model) 336 | 337 | # Run the main training loop. 
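    # Hint (a sketch only; the attribute names below are placeholders, not part of the skeleton):
    # this Q7 loop renders through `implicit_fn.get_distance_color(points)`. If NeuralSurface
    # uses one shared trunk with separate distance and color heads, the trunk only needs to be
    # evaluated once, roughly:
    #     feats = self.trunk(self.embed_xyz(points))
    #     return self.distance_head(feats), torch.sigmoid(self.color_head(feats))
    # which matches the note in implicit.py that sharing computation may be more efficient than
    # separate get_distance / get_color calls.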
338 | for epoch in range(start_epoch, cfg.training.num_epochs): 339 | t_range = tqdm.tqdm(enumerate(train_dataloader)) 340 | 341 | for iteration, batch in t_range: 342 | image, camera, camera_idx = batch[0].values() 343 | image = image.cuda().unsqueeze(0) 344 | camera = camera.cuda() 345 | 346 | # Sample rays 347 | xy_grid = get_random_pixels_from_image( 348 | cfg.training.batch_size, cfg.data.image_size, camera 349 | ) 350 | ray_bundle = get_rays_from_pixels( 351 | xy_grid, cfg.data.image_size, camera 352 | ) 353 | rgb_gt = sample_images_at_xy(image, xy_grid) 354 | 355 | # Run model forward 356 | out = model(ray_bundle) 357 | 358 | # Color loss 359 | loss = torch.mean(torch.square(rgb_gt - out['color'])) 360 | image_loss = loss 361 | 362 | # Sample random points in bounding box 363 | eikonal_points = get_random_points( 364 | cfg.training.batch_size, cfg.training.bounds, 'cuda' 365 | ) 366 | 367 | # Get sdf gradients and enforce eikonal loss 368 | eikonal_distances, eikonal_gradients = model.implicit_fn.get_distance_and_gradient(eikonal_points) 369 | loss += torch.exp(-1e2 * torch.abs(eikonal_distances)).mean() * cfg.training.inter_weight 370 | loss += eikonal_loss(eikonal_gradients) * cfg.training.eikonal_weight # TODO (2): Implement eikonal loss 371 | 372 | # Take the training step. 373 | optimizer.zero_grad() 374 | loss.backward() 375 | optimizer.step() 376 | 377 | t_range.set_description(f'Epoch: {epoch:04d}, Loss: {image_loss:.06f}') 378 | t_range.refresh() 379 | 380 | # Adjust the learning rate. 381 | lr_scheduler.step() 382 | 383 | # Checkpoint. 384 | if ( 385 | epoch % cfg.training.checkpoint_interval == 0 386 | and len(cfg.training.checkpoint_path) > 0 387 | and epoch > 0 388 | ): 389 | print(f"Storing checkpoint {checkpoint_path}.") 390 | 391 | data_to_store = { 392 | "model": model.state_dict(), 393 | "optimizer": optimizer.state_dict(), 394 | "epoch": epoch, 395 | } 396 | 397 | torch.save(data_to_store, checkpoint_path) 398 | 399 | # Render 400 | if ( 401 | epoch % cfg.training.render_interval == 0 402 | and epoch > 0 403 | ): 404 | test_images = render_images( 405 | model, create_surround_cameras(4.0, n_poses=20, up=(0.0, 0.0, 1.0), focal_length=2.0), 406 | cfg.data.image_size, file_prefix='volsdf' 407 | ) 408 | imageio.mimsave('images/part_7.gif', [np.uint8(im * 255) for im in test_images], loop=0) 409 | 410 | try: 411 | test_images = render_geometry( 412 | model, create_surround_cameras(4.0, n_poses=20, up=(0.0, 0.0, 1.0), focal_length=2.0), 413 | cfg.data.image_size, file_prefix='volsdf_geometry' 414 | ) 415 | imageio.mimsave('images/part_7_geometry.gif', [np.uint8(im * 255) for im in test_images], loop=0) 416 | except Exception as e: 417 | print("Empty mesh") 418 | pass 419 | 420 | @hydra.main(config_path='configs', config_name='torus') 421 | def main(cfg: DictConfig): 422 | os.chdir(hydra.utils.get_original_cwd()) 423 | 424 | if cfg.type == 'render': 425 | render(cfg) 426 | elif cfg.type == 'train_points': 427 | train_points(cfg) 428 | elif cfg.type == 'train_images': 429 | train_images(cfg) 430 | 431 | 432 | if __name__ == "__main__": 433 | main() 434 | 435 | 436 | -------------------------------------------------------------------------------- /ta_images/color.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/color.png -------------------------------------------------------------------------------- /ta_images/depth.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/depth.png -------------------------------------------------------------------------------- /ta_images/grid.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/grid.png -------------------------------------------------------------------------------- /ta_images/part_1.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_1.gif -------------------------------------------------------------------------------- /ta_images/part_2.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2.gif -------------------------------------------------------------------------------- /ta_images/part_2_after_training_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_after_training_0.png -------------------------------------------------------------------------------- /ta_images/part_2_after_training_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_after_training_1.png -------------------------------------------------------------------------------- /ta_images/part_2_after_training_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_after_training_2.png -------------------------------------------------------------------------------- /ta_images/part_2_after_training_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_after_training_3.png -------------------------------------------------------------------------------- /ta_images/part_2_before_training_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_before_training_0.png -------------------------------------------------------------------------------- /ta_images/part_2_before_training_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_before_training_1.png -------------------------------------------------------------------------------- /ta_images/part_2_before_training_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_before_training_2.png 
-------------------------------------------------------------------------------- /ta_images/part_2_before_training_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_before_training_3.png -------------------------------------------------------------------------------- /ta_images/part_3.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_3.gif -------------------------------------------------------------------------------- /ta_images/part_5.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_5.gif -------------------------------------------------------------------------------- /ta_images/part_6.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_6.gif -------------------------------------------------------------------------------- /ta_images/part_6_input.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_6_input.gif -------------------------------------------------------------------------------- /ta_images/part_7.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_7.gif -------------------------------------------------------------------------------- /ta_images/part_7_geometry.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_7_geometry.gif -------------------------------------------------------------------------------- /ta_images/rays.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/rays.png -------------------------------------------------------------------------------- /ta_images/sample_points.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/sample_points.png -------------------------------------------------------------------------------- /ta_images/transmittance.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/transmittance.png -------------------------------------------------------------------------------- /transmittance_calculation/a3_transmittance.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/transmittance_calculation/a3_transmittance.pdf 
-------------------------------------------------------------------------------- /transmittance_calculation/figure1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/transmittance_calculation/figure1.png -------------------------------------------------------------------------------- /transmittance_calculation/main.tex: -------------------------------------------------------------------------------- 1 | \documentclass[11pt,addpoints,answers]{exam} 2 | \usepackage[margin=1in]{geometry} 3 | \usepackage{graphicx} 4 | \usepackage[svgname]{xcolor} 5 | \usepackage{url} 6 | \usepackage{datetime} 7 | \usepackage{color} 8 | \usepackage[many]{tcolorbox} 9 | \usepackage{hyperref} 10 | 11 | \newcommand{\courseNum}{\href{https://learning3d.github.io/index.html}{16825}} 12 | \newcommand{\courseName}{\href{https://learning3d.github.io/index.html}{Learning for 3D Vision}} 13 | \newcommand{\courseSem}{\href{https://learning3d.github.io/index.html}{Spring 2024}} 14 | \newcommand{\courseUrl}{\url{https://piazza.com/cmu/spring2024/16825}} 15 | \newcommand{\hwNum}{Problem Set 3} 16 | \newcommand{\hwTopic}{Volume Rendering} 17 | \newcommand{\hwName}{\hwNum: \hwTopic} 18 | \newcommand{\outDate}{Feb. 21, 2024} 19 | \newcommand{\dueDate}{Mar. 13, 2024 11:59 PM} 20 | \newcommand{\instructorName}{Shubham Tulsiani} 21 | \newcommand{\taNames}{Anurag Ghosh, Ayush Jain, Bharath Raj, Ruihan Gao, Shun Iwase} 22 | 23 | \lhead{\hwName} 24 | \rhead{\courseNum} 25 | \cfoot{\thepage{} of \numpages{}} 26 | 27 | \title{\textsc{\hwName}} % Title 28 | 29 | 30 | \author{} 31 | 32 | \date{} 33 | 34 | 35 | %%%%%%%%%%%%%%%%%%%%%%%%%% 36 | % Document configuration % 37 | %%%%%%%%%%%%%%%%%%%%%%%%%% 38 | 39 | % Don't display a date in the title and remove the white space 40 | \predate{} 41 | \postdate{} 42 | \date{} 43 | 44 | %%%%%%%%%%%%%%%%%% 45 | % Begin Document % 46 | %%%%%%%%%%%%%%%%%% 47 | 48 | 49 | \begin{document} 50 | 51 | \section*{} 52 | \begin{center} 53 | \textsc{\LARGE \hwNum} \\ 54 | \vspace{1em} 55 | \textsc{\large \courseNum{} \courseName{} (\courseSem)} \\ 56 | \courseUrl\\ 57 | \vspace{1em} 58 | OUT: \outDate \\ 59 | DUE: \dueDate \\ 60 | Instructor: \instructorName \\ 61 | TAs: \taNames 62 | \end{center} 63 | 64 | 65 | % Default to visible (but empty) solution box. 66 | \newtcolorbox[]{studentsolution}[1][]{% 67 | breakable, 68 | enhanced, 69 | colback=white, 70 | title=Solution, 71 | #1 72 | } 73 | 74 | \begin{questions} 75 | \question \textbf{[10 pts]} 76 | \begin{figure}[h] 77 | \centering 78 | \includegraphics[width=\textwidth]{figure1.png} 79 | \caption{A ray through a non-homogeneous medium. The medium is composed of 3 segments ($y1y2$, $y2y3$, $y3y4$). Each segment has a different absorption coefficient, shown as $\sigma_1, \sigma_2, \sigma_3$ in the figure. The length of each segment is also annotated in the figure (1m means 1 meter).} 80 | \label{fig:q1} 81 | \end{figure} 82 | 83 | As shown in Figure~\ref{fig:q1}, we observe a ray going through a non-homogeneous medium. 84 | Please compute the following transmittance: 85 | \begin{itemize} 86 | \item $T(y1, y2)$ 87 | \item $T(y2, y4)$ 88 | \item $T(x, y4)$ 89 | \item $T(x, y3)$ 90 | \end{itemize} 91 | 92 | 93 | 94 | \begin{tcolorbox}[fit,height=20cm, width=\textwidth, blank, borderline={0.5pt}{-2pt},halign=left, valign=center, nobeforeafter] 95 | 96 | 97 | % \begin{studentsolution} 98 | % Write your solution here. 
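% Hint (standard emission--absorption definition, not the required solution itself): recall that
%   $T(a, b) = \exp\left(-\int_a^b \sigma(t)\,dt\right)$,
% so for piecewise-constant media the integral reduces to $\sum_i \sigma_i \ell_i$ over the
% segments between $a$ and $b$, and transmittance composes multiplicatively, e.g.
%   $T(x, y4) = T(x, y2)\,T(y2, y4)$.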
99 | % \end{studentsolution} 100 | 101 | \end{tcolorbox} 102 | \end{questions} 103 | \end{document} -------------------------------------------------------------------------------- /volume_rendering_main.py: -------------------------------------------------------------------------------- 1 | import os 2 | import warnings 3 | 4 | import hydra 5 | import numpy as np 6 | import torch 7 | import tqdm 8 | import imageio 9 | 10 | from omegaconf import DictConfig 11 | from PIL import Image 12 | from pytorch3d.renderer import ( 13 | PerspectiveCameras, 14 | look_at_view_transform 15 | ) 16 | import matplotlib.pyplot as plt 17 | 18 | from implicit import implicit_dict 19 | from sampler import sampler_dict 20 | from renderer import renderer_dict 21 | from ray_utils import ( 22 | sample_images_at_xy, 23 | get_pixels_from_image, 24 | get_random_pixels_from_image, 25 | get_rays_from_pixels 26 | ) 27 | from data_utils import ( 28 | dataset_from_config, 29 | create_surround_cameras, 30 | vis_grid, 31 | vis_rays, 32 | ) 33 | from dataset import ( 34 | get_nerf_datasets, 35 | trivial_collate, 36 | ) 37 | 38 | 39 | # Model class containing: 40 | # 1) Implicit volume defining the scene 41 | # 2) Sampling scheme which generates sample points along rays 42 | # 3) Renderer which can render an implicit volume given a sampling scheme 43 | 44 | class Model(torch.nn.Module): 45 | def __init__( 46 | self, 47 | cfg 48 | ): 49 | super().__init__() 50 | 51 | # Get implicit function from config 52 | self.implicit_fn = implicit_dict[cfg.implicit_function.type]( 53 | cfg.implicit_function 54 | ) 55 | 56 | # Point sampling (raymarching) scheme 57 | self.sampler = sampler_dict[cfg.sampler.type]( 58 | cfg.sampler 59 | ) 60 | 61 | # Initialize volume renderer 62 | self.renderer = renderer_dict[cfg.renderer.type]( 63 | cfg.renderer 64 | ) 65 | 66 | def forward( 67 | self, 68 | ray_bundle 69 | ): 70 | # Call renderer with 71 | # a) Implicit volume 72 | # b) Sampling routine 73 | 74 | return self.renderer( 75 | self.sampler, 76 | self.implicit_fn, 77 | ray_bundle 78 | ) 79 | 80 | 81 | def render_images( 82 | model, 83 | cameras, 84 | image_size, 85 | save=False, 86 | file_prefix='' 87 | ): 88 | all_images = [] 89 | device = list(model.parameters())[0].device 90 | 91 | for cam_idx, camera in enumerate(cameras): 92 | print(f'Rendering image {cam_idx}') 93 | 94 | torch.cuda.empty_cache() 95 | camera = camera.to(device) 96 | xy_grid = get_pixels_from_image(image_size, camera) # TODO (Q1.3): implement in ray_utils.py 97 | ray_bundle = get_rays_from_pixels(xy_grid, image_size, camera) # TODO (Q1.3): implement in ray_utils.py 98 | 99 | # TODO (Q1.3): Visualize xy grid using vis_grid 100 | if cam_idx == 0 and file_prefix == '': 101 | pass 102 | 103 | # TODO (Q1.3): Visualize rays using vis_rays 104 | if cam_idx == 0 and file_prefix == '': 105 | pass 106 | 107 | # TODO (Q1.4): Implement point sampling along rays in sampler.py 108 | pass 109 | 110 | # TODO (Q1.4): Visualize sample points as point cloud 111 | if cam_idx == 0 and file_prefix == '': 112 | pass 113 | 114 | # TODO (Q1.5): Implement rendering in renderer.py 115 | out = model(ray_bundle) 116 | 117 | # Return rendered features (colors) 118 | image = np.array( 119 | out['feature'].view( 120 | image_size[1], image_size[0], 3 121 | ).detach().cpu() 122 | ) 123 | all_images.append(image) 124 | 125 | # TODO (Q1.5): Visualize depth 126 | if cam_idx == 2 and file_prefix == '': 127 | pass 128 | 129 | # Save 130 | if save: 131 | plt.imsave( 132 | f'{file_prefix}_{cam_idx}.png', 133 | 
image 134 | ) 135 | 136 | return all_images 137 | 138 | 139 | def render( 140 | cfg, 141 | ): 142 | # Create model 143 | model = Model(cfg) 144 | model = model.cuda(); model.eval() 145 | 146 | # Render spiral 147 | cameras = create_surround_cameras(3.0, n_poses=20) 148 | all_images = render_images( 149 | model, cameras, cfg.data.image_size 150 | ) 151 | imageio.mimsave('images/part_1.gif', [np.uint8(im * 255) for im in all_images], loop=0) 152 | 153 | 154 | def train( 155 | cfg 156 | ): 157 | # Create model 158 | model = Model(cfg) 159 | model = model.cuda(); model.train() 160 | 161 | # Create dataset 162 | train_dataset = dataset_from_config(cfg.data) 163 | train_dataloader = torch.utils.data.DataLoader( 164 | train_dataset, 165 | batch_size=1, 166 | shuffle=True, 167 | num_workers=0, 168 | collate_fn=lambda batch: batch, 169 | ) 170 | image_size = cfg.data.image_size 171 | 172 | # Create optimizer 173 | optimizer = torch.optim.Adam( 174 | model.parameters(), 175 | lr=cfg.training.lr 176 | ) 177 | 178 | # Render images before training 179 | cameras = [item['camera'] for item in train_dataset] 180 | render_images( 181 | model, cameras, image_size, 182 | save=True, file_prefix='images/part_2_before_training' 183 | ) 184 | 185 | # Train 186 | t_range = tqdm.tqdm(range(cfg.training.num_epochs)) 187 | 188 | for epoch in t_range: 189 | for iteration, batch in enumerate(train_dataloader): 190 | image, camera, camera_idx = batch[0].values() 191 | image = image.cuda() 192 | camera = camera.cuda() 193 | 194 | # Sample rays 195 | xy_grid = get_random_pixels_from_image(cfg.training.batch_size, image_size, camera) # TODO (Q2.1): implement in ray_utils.py 196 | ray_bundle = get_rays_from_pixels(xy_grid, image_size, camera) 197 | rgb_gt = sample_images_at_xy(image, xy_grid) 198 | 199 | # Run model forward 200 | out = model(ray_bundle) 201 | 202 | # TODO (Q2.2): Calculate loss 203 | loss = None 204 | 205 | # Backprop 206 | optimizer.zero_grad() 207 | loss.backward() 208 | optimizer.step() 209 | 210 | if (epoch % 10) == 0: 211 | t_range.set_description(f'Epoch: {epoch:04d}, Loss: {loss:.06f}') 212 | t_range.refresh() 213 | 214 | # Print center and side lengths 215 | print("Box center:", tuple(np.array(model.implicit_fn.sdf.center.data.detach().cpu()).tolist()[0])) 216 | print("Box side lengths:", tuple(np.array(model.implicit_fn.sdf.side_lengths.data.detach().cpu()).tolist()[0])) 217 | 218 | # Render images after training 219 | render_images( 220 | model, cameras, image_size, 221 | save=True, file_prefix='images/part_2_after_training' 222 | ) 223 | all_images = render_images( 224 | model, create_surround_cameras(3.0, n_poses=20), image_size, file_prefix='part_2' 225 | ) 226 | imageio.mimsave('images/part_2.gif', [np.uint8(im * 255) for im in all_images], loop=0) 227 | 228 | 229 | def create_model(cfg): 230 | # Create model 231 | model = Model(cfg) 232 | model.cuda(); model.train() 233 | 234 | # Load checkpoints 235 | optimizer_state_dict = None 236 | start_epoch = 0 237 | 238 | checkpoint_path = os.path.join( 239 | hydra.utils.get_original_cwd(), 240 | cfg.training.checkpoint_path 241 | ) 242 | 243 | if len(cfg.training.checkpoint_path) > 0: 244 | # Make the root of the experiment directory. 245 | checkpoint_dir = os.path.split(checkpoint_path)[0] 246 | os.makedirs(checkpoint_dir, exist_ok=True) 247 | 248 | # Resume training if requested. 
249 | if cfg.training.resume and os.path.isfile(checkpoint_path): 250 | print(f"Resuming from checkpoint {checkpoint_path}.") 251 | loaded_data = torch.load(checkpoint_path) 252 | model.load_state_dict(loaded_data["model"]) 253 | start_epoch = loaded_data["epoch"] 254 | 255 | print(f" => resuming from epoch {start_epoch}.") 256 | optimizer_state_dict = loaded_data["optimizer"] 257 | 258 | # Initialize the optimizer. 259 | optimizer = torch.optim.Adam( 260 | model.parameters(), 261 | lr=cfg.training.lr, 262 | ) 263 | 264 | # Load the optimizer state dict in case we are resuming. 265 | if optimizer_state_dict is not None: 266 | optimizer.load_state_dict(optimizer_state_dict) 267 | optimizer.last_epoch = start_epoch 268 | 269 | # The learning rate scheduling is implemented with LambdaLR PyTorch scheduler. 270 | def lr_lambda(epoch): 271 | return cfg.training.lr_scheduler_gamma ** ( 272 | epoch / cfg.training.lr_scheduler_step_size 273 | ) 274 | 275 | lr_scheduler = torch.optim.lr_scheduler.LambdaLR( 276 | optimizer, lr_lambda, last_epoch=start_epoch - 1, verbose=False 277 | ) 278 | 279 | return model, optimizer, lr_scheduler, start_epoch, checkpoint_path 280 | 281 | def train_nerf( 282 | cfg 283 | ): 284 | # Create model 285 | model, optimizer, lr_scheduler, start_epoch, checkpoint_path = create_model(cfg) 286 | 287 | # Load the training/validation data. 288 | train_dataset, val_dataset, _ = get_nerf_datasets( 289 | dataset_name=cfg.data.dataset_name, 290 | image_size=[cfg.data.image_size[1], cfg.data.image_size[0]], 291 | ) 292 | 293 | train_dataloader = torch.utils.data.DataLoader( 294 | train_dataset, 295 | batch_size=1, 296 | shuffle=True, 297 | num_workers=0, 298 | collate_fn=trivial_collate, 299 | ) 300 | 301 | # Run the main training loop. 302 | for epoch in range(start_epoch, cfg.training.num_epochs): 303 | t_range = tqdm.tqdm(enumerate(train_dataloader)) 304 | 305 | for iteration, batch in t_range: 306 | image, camera, camera_idx = batch[0].values() 307 | image = image.cuda().unsqueeze(0) 308 | camera = camera.cuda() 309 | 310 | # Sample rays 311 | xy_grid = get_random_pixels_from_image( 312 | cfg.training.batch_size, cfg.data.image_size, camera 313 | ) 314 | ray_bundle = get_rays_from_pixels( 315 | xy_grid, cfg.data.image_size, camera 316 | ) 317 | rgb_gt = sample_images_at_xy(image, xy_grid) 318 | 319 | # Run model forward 320 | out = model(ray_bundle) 321 | 322 | # TODO (Q3.1): Calculate loss 323 | loss = None 324 | 325 | # Take the training step. 326 | optimizer.zero_grad() 327 | loss.backward() 328 | optimizer.step() 329 | 330 | t_range.set_description(f'Epoch: {epoch:04d}, Loss: {loss:.06f}') 331 | t_range.refresh() 332 | 333 | # Adjust the learning rate. 334 | lr_scheduler.step() 335 | 336 | # Checkpoint. 
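        # Hint (one common choice, not the reference solution): the photometric loss left as a
        # TODO above (Q3.1 here, and similarly Q2.2 in `train`) is usually a mean-squared error
        # between the sampled ground-truth colors and the rendered features, e.g.
        #     loss = torch.mean(torch.square(rgb_gt - out['feature']))
        # mirroring the color loss already implemented in surface_rendering_main.py.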
337 | if ( 338 | epoch % cfg.training.checkpoint_interval == 0 339 | and len(cfg.training.checkpoint_path) > 0 340 | and epoch > 0 341 | ): 342 | print(f"Storing checkpoint {checkpoint_path}.") 343 | 344 | data_to_store = { 345 | "model": model.state_dict(), 346 | "optimizer": optimizer.state_dict(), 347 | "epoch": epoch, 348 | } 349 | 350 | torch.save(data_to_store, checkpoint_path) 351 | 352 | # Render 353 | if ( 354 | epoch % cfg.training.render_interval == 0 355 | and epoch > 0 356 | ): 357 | with torch.no_grad(): 358 | test_images = render_images( 359 | model, create_surround_cameras(4.0, n_poses=20, up=(0.0, 0.0, 1.0), focal_length=2.0), 360 | cfg.data.image_size, file_prefix='nerf' 361 | ) 362 | imageio.mimsave('images/part_3.gif', [np.uint8(im * 255) for im in test_images], loop=0) 363 | 364 | 365 | @hydra.main(config_path='./configs', config_name='sphere') 366 | def main(cfg: DictConfig): 367 | os.chdir(hydra.utils.get_original_cwd()) 368 | 369 | if cfg.type == 'render': 370 | render(cfg) 371 | elif cfg.type == 'train': 372 | train(cfg) 373 | elif cfg.type == 'train_nerf': 374 | train_nerf(cfg) 375 | 376 | 377 | if __name__ == "__main__": 378 | main() 379 | 380 | --------------------------------------------------------------------------------