├── README.md ├── __init__.py ├── configs ├── box.yaml ├── nerf_fern.yaml ├── nerf_lego.yaml ├── nerf_lego_highres.yaml ├── nerf_materials.yaml ├── nerf_materials_highres.yaml ├── points_surface.yaml ├── sphere.yaml ├── sphere_surface.yaml ├── torus_surface.yaml ├── train_box.yaml ├── train_sphere.yaml └── volsdf_surface.yaml ├── data ├── box_0.npy ├── box_0.png ├── box_1.npy ├── box_1.png ├── box_2.npy ├── box_2.png ├── box_3.npy ├── box_3.png ├── bridge_pointcloud.npz └── bunny_pointcloud.npz ├── data_utils.py ├── dataset.py ├── environment.yml ├── images └── .ignore ├── implicit.py ├── losses.py ├── ray_utils.py ├── render_functions.py ├── renderer.py ├── requirements.txt ├── sampler.py ├── surface_rendering_main.py ├── ta_images ├── color.png ├── depth.png ├── grid.png ├── part_1.gif ├── part_2.gif ├── part_2_after_training_0.png ├── part_2_after_training_1.png ├── part_2_after_training_2.png ├── part_2_after_training_3.png ├── part_2_before_training_0.png ├── part_2_before_training_1.png ├── part_2_before_training_2.png ├── part_2_before_training_3.png ├── part_3.gif ├── part_5.gif ├── part_6.gif ├── part_6_input.gif ├── part_7.gif ├── part_7_geometry.gif ├── rays.png ├── sample_points.png └── transmittance.png ├── transmittance_calculation ├── a3_transmittance.pdf ├── figure1.png └── main.tex └── volume_rendering_main.py /README.md: -------------------------------------------------------------------------------- 1 | Assignment 3 : Neural Volume Rendering and Surface Rendering 2 | =================================== 3 | Goals: In this assignment, you will setup a differentiable rendering pipeline and implement neural volume/surface rendering techniques like NeRF and VolSDF. 4 | 5 | ## Table of Contents 6 | - [Setup](#setup) 7 | - [A. Neural Volume Rendering (80 points)](#a-neural-volume-rendering-80-points) 8 | - [0. Transmittance Calculation (10)](#0-transmittance-calculation-10-points) 9 | - [1. Differentiable Volume Rendering (30)](#1-differentiable-volume-rendering) 10 | - [2. Optimizing a Basic Implicit Volume (10)](#2-optimizing-a-basic-implicit-volume) 11 | - [3. Optimizing a Neural Radiance Field (NeRF) (20)](#3-optimizing-a-neural-radiance-field-nerf-20-points) 12 | - [4. NeRF Extras (10 + 10 Extra)](#4-nerf-extras-choose-one-more-than-one-is-extra-credit) 13 | - [B. Neural Surface Rendering (50 points)](#b-neural-surface-rendering-50-points) 14 | - [5. Sphere Tracing (10)](#5-sphere-tracing-10-points) 15 | - [6. Optimizing a Neural SDF (15)](#6-optimizing-a-neural-sdf-15-points) 16 | - [7. VolSDF (15)](#7-volsdf-15-points) 17 | - [8. Neural Surface Extras (10 + 20 Extra)](#8-neural-surface-extras-choose-one-more-than-one-is-extra-credit) 18 | 19 | 20 | 21 | ## Setup 22 | 23 | ### Environment Setup 24 | You can use the python environment you've set up for past assignments, but if you're starting fresh, please follow the instructions from Assignment 1 to get an environment with `torch` and `pytorch3d` up and running. This assignment needs a few additional packages, that can be installed with - 25 | ```bash 26 | pip install -r requirements.txt 27 | ``` 28 | 29 | ### Data 30 | 31 | Most of the data for this assignment is provided in the github repo under `data/`. One of the assets (materials scene for Q4.1) is large, so you can download and unzip it as follows - 32 | ```bash 33 | sudo apt install git-lfs 34 | git lfs install 35 | 36 | git clone https://huggingface.co/datasets/learning3dvision/nerf_materials 37 | cd nerf_materials 38 | unzip materials.zip -d 39 | ``` 40 | # A. 
Neural Volume Rendering (80 points) 41 | 42 | ## 0. Transmittance Calculation (10 points) 43 | Transmittance calculation is a core part of implementing volume rendering. Your first task is to compute the transmittance of a ray going through a non-homogeneous medium (shown in the image below). 44 | Please compute the transmittance in `transmittance_calculation/a3_transmittance.pdf` and submit the result on your assignment website. You can either hand-write the result or edit the tex file and show a screenshot on your webpage, as long as it is readable by the TAs. 45 | 46 | ![Transmittance computation](transmittance_calculation/figure1.png) 47 | 48 | ## 1. Differentiable Volume Rendering 49 | 50 | In the emission-absorption (EA) model described in class, volumes are typically described by their *appearance* (e.g. emission) and *geometry* (absorption) at *every point* in 3D space. For part 1 of the assignment, you will implement a ***Differentiable Renderer*** for EA volumes, which you will use in parts 2 and 3. Differentiable renderers are extremely useful for 3D learning problems --- one reason is that they allow you to optimize scene parameters (i.e. perform inverse rendering) from image supervision only! 51 | 52 | ## 1.1. Familiarize yourself with the code structure 53 | 54 | There are four major components of our differentiable volume rendering pipeline: 55 | 56 | * ***The camera***: `pytorch3d.CameraBase` 57 | * ***The scene***: `SDFVolume` in `implicit.py` 58 | * ***The sampling routine***: `StratifiedSampler` in `sampler.py` 59 | * ***The renderer***: `VolumeRenderer` in `renderer.py` 60 | 61 | `StratifiedSampler` provides a method for sampling multiple points along a ray traveling through the scene (also known as *raymarching*). Together, a sampler and a renderer describe a rendering pipeline. Like traditional graphics pipelines, this rendering procedure is independent of the scene and camera. 62 | 63 | The scene, sampler, and renderer are all packaged together under the `Model` class in `volume_rendering_main.py`. In particular, the `Model`'s forward method invokes a `VolumeRenderer` instance with a sampling strategy and a volume as input. 64 | 65 | Also, take a look at the `RayBundle` class in `ray_utils.py`, which provides a convenient wrapper around several per-ray inputs to the volume rendering procedure. 66 | 67 | ## 1.2. Outline of tasks 68 | 69 | In order to perform rendering, you will implement the following routines: 70 | 71 | 1. **Ray sampling from cameras**: you will fill out methods in `ray_utils.py` to generate world space rays from a particular camera. 72 | 2. **Point sampling along rays**: you will fill out the `StratifiedSampler` class to generate sample points along each world space ray. 73 | 3. **Rendering**: you will fill out the `VolumeRenderer` class to *evaluate* a volume function at each sample point along a ray, and aggregate these evaluations to perform rendering. 74 | 75 | ## 1.3. Ray sampling (5 points) 76 | 77 | Take a look at the `render_images` function in `volume_rendering_main.py`. It loops through a set of cameras, generates rays for each pixel on a camera, and renders these rays using a `Model` instance. 78 | 79 | ### Implementation 80 | 81 | Your first task is to implement: 82 | 83 | 1. `get_pixels_from_image` in `ray_utils.py` and 84 | 2. `get_rays_from_pixels` in `ray_utils.py` 85 | 86 | which are used in `render_images`: 87 | 88 | ```python 89 | xy_grid = get_pixels_from_image(image_size, camera) # TODO: implement in ray_utils.py 90 | ray_bundle = get_rays_from_pixels(xy_grid, camera) # TODO: implement in ray_utils.py 91 | ``` 92 | 93 | The `get_pixels_from_image` method generates pixel coordinates ranging from `[-1, 1]` for each pixel in an image. The `get_rays_from_pixels` method generates rays for each pixel by mapping from a camera's *Normalized Device Coordinate (NDC) Space* into world space.
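For reference, a minimal sketch of what these two functions might look like is shown below. Treat it as an assumed outline rather than a reference solution: the sign conventions, the grid ordering, and the exact behavior of the PyTorch3D calls (`camera.unproject_points`, `camera.get_camera_center`) should be checked against the starter code in `ray_utils.py`.

```python
import torch

def get_pixels_from_image(image_size, camera):
    # One (x, y) coordinate per pixel, mapped to the NDC-style range [-1, 1].
    W, H = image_size[0], image_size[1]
    x = torch.linspace(-1.0, 1.0, W)
    y = torch.linspace(-1.0, 1.0, H)
    x_grid = x.view(1, W).expand(H, W)
    y_grid = y.view(H, 1).expand(H, W)
    return torch.stack([x_grid, y_grid], dim=-1).reshape(-1, 2)

def get_rays_from_pixels(xy_grid, camera):
    # Lift pixel coordinates onto the image plane (depth 1), unproject them to
    # world space, and form rays from the camera center through those points.
    ndc_points = torch.cat([xy_grid, torch.ones_like(xy_grid[..., :1])], dim=-1)
    world_points = camera.unproject_points(ndc_points)   # assumed to accept NDC xy + depth
    rays_o = camera.get_camera_center().expand(world_points.shape[0], 3)
    rays_d = torch.nn.functional.normalize(world_points - rays_o, dim=-1)
    return rays_o, rays_d                                 # wrap these in a RayBundle
```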
94 | 95 | ### Visualization 96 | 97 | You can run the code for part 1 with: 98 | 99 | ```bash 100 | # mkdir images (uncomment when running for the first time) 101 | python volume_rendering_main.py --config-name=box 102 | ``` 103 | 104 | Once you have implemented these methods, verify that your output matches the TA output by visualizing both `xy_grid` and `rays` with the `vis_grid` and `vis_rays` functions in the `render_images` function in `volume_rendering_main.py`. **By default, the above command will crash and return an error**. However, it should reach your visualization code before it does. The outputs of grid/ray visualization should look like this: 105 | 106 | ![Grid](ta_images/grid.png) ![Rays](ta_images/rays.png) 107 | 108 | ## 1.4. Point sampling (5 points) 109 | 110 | ### Implementation 111 | 112 | Your next task is to fill out `StratifiedSampler` in `sampler.py`. Implement the forward method (a sketch follows this list), which: 113 | 114 | 1. Generates a set of distances between `near` and `far` and 115 | 2. Uses these distances to sample points offset from ray origins (`RayBundle.origins`) along ray directions (`RayBundle.directions`). 116 | 3. Stores the sample points and distances in `RayBundle.sample_points` and `RayBundle.sample_lengths`, respectively.
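A minimal sketch of this stratified sampling step is given below, assuming `origins` and `directions` have shape `(num_rays, 3)` and that `n_pts_per_ray`, `min_depth`, and `max_depth` come from the sampler config; adapt the shapes and the `_replace` call to your own `RayBundle` conventions.

```python
import torch

def stratified_sample(ray_bundle, n_pts_per_ray, min_depth, max_depth):
    """Sketch of StratifiedSampler.forward; shapes and conventions are assumptions."""
    device = ray_bundle.origins.device
    num_rays = ray_bundle.origins.shape[0]

    # 1) Split [min_depth, max_depth] into n_pts_per_ray bins and draw one
    #    uniform sample per bin (stratified sampling along the ray).
    bins = torch.linspace(min_depth, max_depth, n_pts_per_ray + 1, device=device)
    lower, upper = bins[:-1], bins[1:]
    z_vals = lower + (upper - lower) * torch.rand(num_rays, n_pts_per_ray, device=device)

    # 2) Offset points from the ray origins along the ray directions: o + t * d.
    sample_points = (
        ray_bundle.origins[:, None, :] + z_vals[..., None] * ray_bundle.directions[:, None, :]
    )  # (num_rays, n_pts_per_ray, 3)

    # 3) Store both back on the bundle; how sample_lengths is shaped/broadcast
    #    is up to your RayBundle conventions (here: one depth per sample point).
    return ray_bundle._replace(sample_points=sample_points, sample_lengths=z_vals[..., None])
```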
117 | 118 | ### Visualization 119 | 120 | Once you have done this, use the `render_points` method in `render_functions.py` in order to visualize the point samples from the first camera. They should look like this: 121 | 122 | ![Sample points](ta_images/sample_points.png) 123 | 124 | ## 1.5. Volume rendering (20 points) 125 | 126 | Finally, we can implement volume rendering! With the `configs/box.yaml` configuration, we provide you with an `SDFVolume` instance describing a box. You can check out the code for this function in `implicit.py`, which converts a signed distance function into a volume. If you want, you can even implement your own `SDFVolume` variants by creating a new signed distance function class and adding it to `sdf_dict` in `implicit.py`. Take a look at [this great web page](https://iquilezles.org/articles/distfunctions/) for formulas for various simple and complex SDFs. 127 | 128 | 129 | ### Implementation 130 | 131 | You will implement: 132 | 133 | 1. `VolumeRenderer._compute_weights` and 134 | 2. `VolumeRenderer._aggregate`. 135 | 3. You will also modify the `VolumeRenderer.forward` method to render a depth map in addition to color from a volume. 136 | 137 | From each volume evaluation you will get both a volume density and a color: 138 | 139 | ```python 140 | # Call implicit function with sample points 141 | implicit_output = implicit_fn(cur_ray_bundle) 142 | density = implicit_output['density'] 143 | feature = implicit_output['feature'] 144 | ``` 145 | 146 | You'll then use the following equation to render color along a ray: 147 | 148 | ![Equation](ta_images/color.png) 149 | 150 | where `σ` is the density, `Δt` is the length of the current ray segment, and `L_e` is the color: 151 | 152 | ![Transmittance](ta_images/transmittance.png) 153 | 154 | Compute the weights `T * (1 - exp(-σ * Δt))` in `VolumeRenderer._compute_weights`, and perform the summation in `VolumeRenderer._aggregate`. Note that for the first segment `T = 1`. 155 | 156 | Use the weights and the aggregation function to render both *color* and *depth* (the per-sample depths are stored in `RayBundle.sample_lengths`).
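As a rough guide, these two pieces might look like the sketch below, assuming `density` and `deltas` both have shape `(num_rays, n_pts_per_ray, 1)`; the exact shapes, and where `deltas` is computed, depend on how your `VolumeRenderer` is organized.

```python
import torch

def compute_weights(deltas, density, eps=1e-10):
    # Per-segment opacity: 1 - exp(-sigma * delta_t).
    alpha = 1.0 - torch.exp(-density * deltas)                    # (num_rays, n_pts, 1)

    # Transmittance T_i = prod_{j < i} exp(-sigma_j * delta_j); the leading ones
    # enforce T = 1 for the first segment (exclusive cumulative product).
    trans = torch.cumprod(1.0 - alpha + eps, dim=-2)
    trans = torch.cat([torch.ones_like(trans[..., :1, :]), trans[..., :-1, :]], dim=-2)

    return trans * alpha                                           # weights w_i

def aggregate(weights, rays_feature):
    # Weighted sum over samples; works for color (3 channels) or depth (1 channel).
    return torch.sum(weights * rays_feature, dim=-2)
```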
157 | 158 | ### Visualization 159 | 160 | By default, your results will be written out to `images/part_1.gif`. Provide a visualization of the depth in your write-up. Note that the depth should be normalized by its maximum value. 161 | 162 | ![Spiral Rendering of Part 1](ta_images/part_1.gif) ![Depth](ta_images/depth.png) 163 | 164 | 165 | ## 2. Optimizing a basic implicit volume 166 | 167 | ## 2.1. Random ray sampling (5 points) 168 | 169 | Since you have now implemented a differentiable volume renderer, we can use it to optimize the parameters of a volume! We have provided a basic training loop in the `train` method in `volume_rendering_main.py`. 170 | 171 | Depending on how many sample points we take for each ray, volume rendering can consume a lot of memory on the GPU (especially during the backward pass of gradient descent). Because of this, it usually makes sense to sample a subset of rays from a full image for each training iteration. In order to do this, implement the `get_random_pixels_from_image` method in `ray_utils.py`, invoked here: 172 | 173 | ```python 174 | xy_grid = get_random_pixels_from_image(cfg.training.batch_size, image_size, camera) # TODO: implement in ray_utils.py 175 | ``` 176 | 177 | ## 2.2. Loss and training (5 points) 178 | Replace the loss in `train` 179 | 180 | ```python 181 | loss = None 182 | ``` 183 | 184 | with the mean squared error between the predicted colors and the ground truth colors `rgb_gt`. 185 | 186 | Once you've done this, you can train a model with 187 | 188 | ```bash 189 | python volume_rendering_main.py --config-name=train_box 190 | ``` 191 | 192 | This will optimize the position and side lengths of a box, given a few ground truth images with known camera poses (in the `data` folder). Report the center of the box and the side lengths of the box after training, rounded to the nearest `1/100`. 193 | 194 | ## 2.3. Visualization 195 | 196 | The code renders a spiral sequence of the optimized volume in `images/part_2.gif`. Compare this gif to the one below, and attach it in your write-up: 197 | 198 | ![Spiral Rendering of Part 2](ta_images/part_2.gif) 199 | 200 | 201 | ## 3. Optimizing a Neural Radiance Field (NeRF) (20 points) 202 | In this part, you will implement an implicit volume as a Multi-Layer Perceptron (MLP) in the `NeuralRadianceField` class in `implicit.py`. This MLP should map 3D position to volume density and color. Specifically: 203 | 204 | 1. Your MLP should take in a `RayBundle` object in its forward method, and produce color and density for each sample point in the `RayBundle`. 205 | 2. You should also fill out the loss in `train_nerf` in the `volume_rendering_main.py` file. 206 | 207 | You will then use this implicit volume to optimize a scene from a set of RGB images. We have implemented data loading, training, and checkpointing for you, but this part will still require you to do a bit more legwork than for Parts 1 and 2. You will have to write the code for the MLP yourself --- feel free to reference the NeRF paper, though you should not directly copy code from an external repository. 208 | 209 | ## Implementation 210 | 211 | Here are a few things to note (a sketch follows this list): 212 | 213 | 1. For now, your NeRF MLP does not need to handle *view dependence*, and can solely depend on 3D position. 214 | 2. You should use the `ReLU` activation to map the first network output to density (to ensure that density is non-negative). 215 | 3. You should use the `Sigmoid` activation to map the remaining raw network outputs to color. 216 | 4. You can use *Positional Encoding* of the input to the network to achieve higher quality. We provide an implementation of positional encoding in the `HarmonicEmbedding` class in `implicit.py`.
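A minimal position-only version of `NeuralRadianceField` might look like the sketch below. The harmonic-embedding and hidden-width settings reuse the config fields from `configs/nerf_lego.yaml`, and the output dictionary keys follow the `density`/`feature` convention used by `SDFVolume`; the trunk depth, the plain `torch.nn.Sequential` (instead of the provided `MLPWithInputSkips`), and the flattened output shapes are assumptions for you to adapt.

```python
import torch
from implicit import HarmonicEmbedding  # provided in the starter code; in practice this class lives in implicit.py

class NeuralRadianceField(torch.nn.Module):
    def __init__(self, cfg):
        super().__init__()
        self.harmonic_embedding_xyz = HarmonicEmbedding(3, cfg.n_harmonic_functions_xyz)
        embed_dim = self.harmonic_embedding_xyz.output_dim

        hidden = cfg.n_hidden_neurons_xyz
        # Simple trunk: positional encoding -> a few ReLU layers.
        self.mlp = torch.nn.Sequential(
            torch.nn.Linear(embed_dim, hidden), torch.nn.ReLU(True),
            torch.nn.Linear(hidden, hidden), torch.nn.ReLU(True),
            torch.nn.Linear(hidden, hidden), torch.nn.ReLU(True),
        )
        self.density_head = torch.nn.Linear(hidden, 1)   # ReLU keeps density non-negative
        self.color_head = torch.nn.Linear(hidden, 3)     # Sigmoid keeps colors in [0, 1]

    def forward(self, ray_bundle):
        points = ray_bundle.sample_points.view(-1, 3)
        features = self.mlp(self.harmonic_embedding_xyz(points))
        return {
            'density': torch.relu(self.density_head(features)),
            'feature': torch.sigmoid(self.color_head(features)),
        }
```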
217 | 218 | ## Visualization 219 | You can train a NeRF on the lego bulldozer dataset with 220 | 221 | ```bash 222 | python volume_rendering_main.py --config-name=nerf_lego 223 | ``` 224 | 225 | This will create a NeRF with the `NeuralRadianceField` class in `implicit.py`, and use it as the `implicit_fn` in `VolumeRenderer`. It will also train a NeRF for 250 epochs on 128x128 images. 226 | 227 | Feel free to modify the experimental settings in `configs/nerf_lego.yaml` --- though the current settings should allow you to train a NeRF on low-resolution inputs in a reasonable amount of time. After training, a spiral rendering will be written to `images/part_3.gif`. Report your results. It should look something like this: 228 | 229 | ![Spiral Rendering of Part 3](ta_images/part_3.gif) 230 | 231 | ## 4. NeRF Extras (CHOOSE ONE! More than one is extra credit) 232 | 233 | ### 4.1 View Dependence (10 points) 234 | 235 | Add view dependence to your NeRF model! Specifically, make it so that emission can vary with viewing direction. You can read NeRF or other papers for how to do this effectively --- if you're not careful, your network may overfit to the training images. Discuss the trade-offs between increased view dependence and generalization quality. 236 | 237 | While you may use the lego scene to test your code, please employ the materials scene to show the results of your method on your webpage (experimental settings can be found in `nerf_materials.yaml` and `nerf_materials_highres.yaml`). 238 | 239 | If you haven't done so already, make sure to download and unzip the `nerf_materials` dataset as described in the setup section. 240 | 241 | ### 4.2 Coarse/Fine Sampling (10 points) 242 | 243 | NeRF employs two networks: a coarse network and a fine network. During the coarse pass, it uses the coarse network to get an estimate of geometry, and during the fine pass it uses these geometry estimates to sample points more effectively for the fine network. Implement this strategy and discuss the trade-offs (speed / quality). 244 | 245 | # B. Neural Surface Rendering (50 points) 246 | 247 | ## 5. Sphere Tracing (10 points) 248 | 249 | In this part you will implement sphere tracing for rendering an SDF, and use this implementation to render a simple torus. You will need to implement the `sphere_tracing` function in `renderer.py`. This function should return two outputs: (`points`, `mask`), where the `points` Tensor indicates the intersection point for each ray with the surface, and `mask` is a boolean Tensor indicating which rays intersected the surface.
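A minimal sketch of the sphere tracing loop is shown below. The argument list, the `near`/`far`/`max_iters` defaults (taken here from `configs/torus_surface.yaml`), and the convergence threshold are assumptions to adapt to the actual `sphere_tracing` signature in `renderer.py`.

```python
import torch

def sphere_tracing(implicit_fn, origins, directions, near=0.0, far=5.0, max_iters=64, eps=1e-5):
    # March each ray forward by the SDF value at its current point. Because the
    # SDF is a lower bound on the distance to the surface, this step is "safe".
    # Assumes `directions` are unit length; origins/directions are (num_rays, 3).
    t = torch.full_like(origins[..., :1], near)            # current distance along each ray
    points = origins + t * directions

    for _ in range(max_iters):
        sdf = implicit_fn.get_distance(points).view_as(t)  # signed distance at current points
        t = t + sdf                                        # converged rays step by ~0
        points = origins + t * directions

    # A ray hit the surface if its SDF is (nearly) zero and it stayed inside [near, far].
    mask = (implicit_fn.get_distance(points).view_as(t) < eps) & (t < far)
    return points, mask
```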
250 | 251 | You can run the code for part 5 with: 252 | ```bash 253 | # mkdir images (uncomment when running for the first time) 254 | python -m surface_rendering_main --config-name=torus_surface 255 | ``` 256 | 257 | This should save `part_5.gif` in the `images` folder. Please include this in your submission along with a short writeup describing your implementation. 258 | 259 | ![Torus](ta_images/part_5.gif) 260 | 261 | ## 6. Optimizing a Neural SDF (15 points) 262 | 263 | In this part, you will implement an MLP architecture for a neural SDF, and train this neural SDF on point cloud data. You will do this by training the network to output a zero value at the observed points. To encourage the network to learn an SDF instead of an arbitrary function, we will use an 'eikonal' regularization which enforces the gradients of the predictions to behave in a certain way (search lecture slides for hints). 264 | 265 | In this part you need to: 266 | 267 | * **Implement an MLP to predict distance**: You should populate the `NeuralSurface` class in `implicit.py`. For this part, you need to define an MLP that helps you predict a distance for any input point. More concretely, you would need to define some MLP(s) in the `__init__` function, and use these to implement the `get_distance` function for this class. Hint: you can use a similar MLP to what you used to predict density in Part A, but remember that density and distance have different possible ranges! 268 | 269 | * **Implement Eikonal Constraint as a Loss**: Define the `eikonal_loss` in `losses.py`. 270 | 271 | After this, you should be able to train a NeuralSurface representation by: 272 | ```bash 273 | python -m surface_rendering_main --config-name=points_surface 274 | ``` 275 | 276 | This should save `part_6_input.gif` and `part_6.gif` in the `images` folder. The former visualizes the input point cloud used for training, and the latter shows your prediction, which you should include on the webpage along with brief descriptions of your MLP and eikonal loss. You might need to tune hyperparameters (e.g. number of layers, epochs, weight of regularization, etc.) for good results. 277 | 278 | ![Bunny geometry](ta_images/part_6.gif) 279 | 280 | ## 7. VolSDF (15 points) 281 | 282 | In this part, you will implement a function converting SDF -> volume density and extend the `NeuralSurface` class to predict color. 283 | 284 | * **Color Prediction**: Extend the `NeuralSurface` class to predict per-point color. You may need to define a new MLP (or just a few new layers, depending on how you implemented the distance MLP in Q6). You should then implement the `get_color` and `get_distance_color` functions. 285 | 286 | * **SDF to Density**: Read section 3.1 of the [VolSDF Paper](https://arxiv.org/pdf/2106.12052.pdf) and implement their formula converting signed distance to density in the `sdf_to_density` function in `renderer.py`. In your write-up, give an intuitive explanation of what the parameters `alpha` and `beta` are doing here. Also, answer the following questions: 287 | 1. How does high `beta` bias your learned SDF? What about low `beta`? 288 | 2. Would an SDF be easier to train with volume rendering and low `beta` or high `beta`? Why? 289 | 3. Would you be more likely to learn an accurate surface with high `beta` or low `beta`? Why? 290 | 291 | After implementing these, train an SDF on the lego bulldozer model with 292 | 293 | ```bash 294 | python -m surface_rendering_main --config-name=volsdf_surface 295 | ``` 296 | 297 | This will save `part_7_geometry.gif` and `part_7.gif`. Experiment with hyper-parameters and attach your best results on your webpage. Comment on the settings you chose, and why they seem to work well. 298 | 299 | ![Bulldozer geometry](ta_images/part_7_geometry.gif) ![Bulldozer color](ta_images/part_7.gif) 300 | 301 | 302 | ## 8. Neural Surface Extras (CHOOSE ONE! More than one is extra credit) 303 | 304 | ### 8.1. Render a Large Scene with Sphere Tracing (10 points) 305 | In Q5, you rendered a (lonely) Torus, but the power of Sphere Tracing lies in the fact that it can render complex scenes efficiently. To observe this, try defining a ‘scene’ with many (> 20) primitives (e.g. Sphere, Torus, or another SDF from [this website](https://iquilezles.org/articles/distfunctions/)) at different locations. See Lecture 2 for the equations describing the ‘composed’ SDF of multiple primitives. You can then define a new class in `implicit.py` that instantiates a complex scene with many primitives, and modify the code for Q5 to render this scene instead of a simple torus. 306 | 307 | ### 8.2 Fewer Training Views (10 points) 308 | In Q7, we relied on 100 training views for a single scene. A benefit of using surface representations, however, is that the geometry is better regularized and can in principle be inferred from fewer views. Experiment with using fewer training views (say 20) -- you can do this by changing [train_idx in data loader](https://github.com/learning3d/assignment3/blob/main/dataset.py#L123) to use a smaller random subset of indices. You should also compare the VolSDF solution to a NeRF solution learned using similar views. 309 | 310 | ### 8.3 Alternate SDF to Density Conversions (10 points) 311 | In Q7, we used the equations from the [VolSDF Paper](https://arxiv.org/pdf/2106.12052.pdf) to convert SDF to density. You should try and compare alternate ways of doing this, e.g. the ‘naive’ solution from the [NeuS paper](https://arxiv.org/pdf/2106.10689.pdf), or any other ways that you might want to propose! 312 | -------------------------------------------------------------------------------- /__init__.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the BSD-style license found in the 5 | # LICENSE file in the root directory of this source tree. 
6 | -------------------------------------------------------------------------------- /configs/box.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: render 4 | 5 | data: 6 | image_size: [256, 256] 7 | 8 | renderer: 9 | type: volume 10 | chunk_size: 32768 11 | 12 | sampler: 13 | type: stratified 14 | n_pts_per_ray: 64 15 | min_depth: 0.0 16 | max_depth: 5.0 17 | 18 | implicit_function: 19 | type: sdf_volume 20 | 21 | sdf: 22 | type: box 23 | 24 | side_lengths: 25 | val: [1.75, 1.75, 1.75] 26 | opt: False 27 | 28 | center: 29 | val: [0.0, 0.0, 0.0] 30 | opt: True 31 | 32 | feature: 33 | rainbow: True 34 | val: [1.0, 1.0, 1.0] 35 | opt: False 36 | 37 | alpha: 38 | val: 1.0 39 | opt: False 40 | 41 | beta: 42 | val: 0.05 43 | opt: False 44 | 45 | -------------------------------------------------------------------------------- /configs/nerf_fern.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train_nerf 4 | 5 | training: 6 | num_epochs: 10000 7 | batch_size: 4096 8 | lr: 0.0005 9 | 10 | lr_scheduler_step_size: 1000 11 | lr_scheduler_gamma: 0.9 12 | 13 | checkpoint_path: ./checkpoints 14 | checkpoint_interval: 1000 15 | resume: True 16 | 17 | render_interval: 10 18 | 19 | data: 20 | image_size: [252, 189] 21 | dataset_name: fern 22 | 23 | renderer: 24 | type: volume 25 | chunk_size: 32768 26 | 27 | sampler: 28 | type: stratified 29 | n_pts_per_ray: 64 30 | 31 | min_depth: 1.2 32 | max_depth: 6.28 33 | 34 | implicit_function: 35 | type: nerf 36 | 37 | n_harmonic_functions_xyz: 6 38 | n_harmonic_functions_dir: 2 39 | n_hidden_neurons_xyz: 128 40 | n_hidden_neurons_dir: 64 41 | density_noise_std: 0.0 42 | n_layers_xyz: 6 43 | append_xyz: [4] 44 | -------------------------------------------------------------------------------- /configs/nerf_lego.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train_nerf 4 | 5 | training: 6 | num_epochs: 250 7 | batch_size: 1024 8 | lr: 0.0005 9 | 10 | lr_scheduler_step_size: 50 11 | lr_scheduler_gamma: 0.8 12 | 13 | checkpoint_path: ./checkpoints 14 | checkpoint_interval: 50 15 | resume: True 16 | 17 | render_interval: 10 18 | 19 | data: 20 | image_size: [128, 128] 21 | dataset_name: lego 22 | 23 | renderer: 24 | type: volume 25 | chunk_size: 32768 26 | white_background: False 27 | 28 | sampler: 29 | type: stratified 30 | n_pts_per_ray: 128 31 | 32 | min_depth: 2.0 33 | max_depth: 6.0 34 | 35 | implicit_function: 36 | type: nerf 37 | 38 | n_harmonic_functions_xyz: 6 39 | n_harmonic_functions_dir: 2 40 | n_hidden_neurons_xyz: 128 41 | n_hidden_neurons_dir: 64 42 | density_noise_std: 0.0 43 | n_layers_xyz: 6 44 | append_xyz: [3] 45 | 46 | 47 | -------------------------------------------------------------------------------- /configs/nerf_lego_highres.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train_nerf 4 | 5 | training: 6 | num_epochs: 250 7 | batch_size: 1024 8 | lr: 0.0005 9 | 10 | lr_scheduler_step_size: 50 11 | lr_scheduler_gamma: 0.8 12 | 13 | checkpoint_path: ./checkpoints 14 | checkpoint_interval: 50 15 | resume: True 16 | 17 | render_interval: 10 18 | 19 | data: 20 | image_size: [400, 400] 21 | dataset_name: lego 22 | 23 | renderer: 24 | type: volume 25 | chunk_size: 32768 26 | white_background: False 27 | 28 | sampler: 29 | type: stratified 30 | n_pts_per_ray: 128 31 | 32 | min_depth: 2.0 33 | 
max_depth: 6.0 34 | 35 | implicit_function: 36 | type: nerf 37 | 38 | n_harmonic_functions_xyz: 6 39 | n_harmonic_functions_dir: 2 40 | n_hidden_neurons_xyz: 128 41 | n_hidden_neurons_dir: 64 42 | density_noise_std: 0.0 43 | n_layers_xyz: 6 44 | append_xyz: [3] 45 | 46 | 47 | 48 | -------------------------------------------------------------------------------- /configs/nerf_materials.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train_nerf 4 | 5 | training: 6 | num_epochs: 250 7 | batch_size: 1024 8 | lr: 0.0001 9 | 10 | lr_scheduler_step_size: 50 11 | lr_scheduler_gamma: 0.8 12 | 13 | checkpoint_path: ./checkpoints 14 | checkpoint_interval: 50 15 | resume: True 16 | 17 | render_interval: 10 18 | 19 | data: 20 | image_size: [128, 128] 21 | dataset_name: materials 22 | 23 | renderer: 24 | type: volume 25 | chunk_size: 32768 26 | white_background: False 27 | 28 | sampler: 29 | type: stratified 30 | n_pts_per_ray: 128 31 | 32 | min_depth: 1.0 33 | max_depth: 7.0 34 | 35 | implicit_function: 36 | type: nerf 37 | 38 | n_harmonic_functions_xyz: 6 39 | n_harmonic_functions_dir: 2 40 | n_hidden_neurons_xyz: 128 41 | n_hidden_neurons_dir: 64 42 | density_noise_std: 0.0 43 | n_layers_xyz: 6 44 | append_xyz: [3] 45 | 46 | -------------------------------------------------------------------------------- /configs/nerf_materials_highres.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train_nerf 4 | 5 | training: 6 | num_epochs: 250 7 | batch_size: 1024 8 | lr: 0.0001 9 | 10 | lr_scheduler_step_size: 50 11 | lr_scheduler_gamma: 0.8 12 | 13 | checkpoint_path: ./checkpoints 14 | checkpoint_interval: 50 15 | resume: True 16 | 17 | render_interval: 10 18 | 19 | data: 20 | image_size: [400, 400] 21 | dataset_name: materials 22 | 23 | renderer: 24 | type: volume 25 | chunk_size: 32768 26 | white_background: False 27 | 28 | sampler: 29 | type: stratified 30 | n_pts_per_ray: 128 31 | 32 | min_depth: 1.0 33 | max_depth: 7.0 34 | 35 | implicit_function: 36 | type: nerf 37 | 38 | n_harmonic_functions_xyz: 6 39 | n_harmonic_functions_dir: 2 40 | n_hidden_neurons_xyz: 128 41 | n_hidden_neurons_dir: 64 42 | density_noise_std: 0.0 43 | n_layers_xyz: 6 44 | append_xyz: [3] 45 | 46 | -------------------------------------------------------------------------------- /configs/points_surface.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train_points 4 | 5 | data: 6 | image_size: [256, 256] 7 | point_cloud_path: data/bunny_pointcloud.npz 8 | 9 | renderer: 10 | type: sphere_tracing 11 | chunk_size: 8192 12 | near: 0.0 13 | far: 5.0 14 | max_iters: 64 15 | 16 | sampler: 17 | type: stratified 18 | n_pts_per_ray: 19 | min_depth: 20 | max_depth: 21 | 22 | training: 23 | num_epochs: 5000 24 | pretrain_iters: 250 25 | batch_size: 4096 26 | lr: 0.0001 27 | 28 | lr_scheduler_step_size: 50 29 | lr_scheduler_gamma: 0.8 30 | 31 | checkpoint_path: ./points_checkpoint 32 | checkpoint_interval: 100 33 | resume: True 34 | 35 | render_interval: 500 36 | 37 | inter_weight: 0.1 38 | eikonal_weight: 0.02 39 | bounds: [[-4, -4, -4], [4, 4, 4]] 40 | 41 | implicit_function: 42 | type: neural_surface 43 | 44 | n_harmonic_functions_xyz: 4 45 | 46 | n_layers_distance: 6 47 | n_hidden_neurons_distance: 128 48 | append_distance: [] 49 | 50 | n_layers_color: 2 51 | n_hidden_neurons_color: 128 52 | append_color: [] 53 | 
-------------------------------------------------------------------------------- /configs/sphere.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: render 4 | 5 | data: 6 | image_size: [256, 256] 7 | 8 | renderer: 9 | type: volume 10 | chunk_size: 32768 11 | 12 | sampler: 13 | type: stratified 14 | n_pts_per_ray: 64 15 | min_depth: 0.0 16 | max_depth: 5.0 17 | 18 | implicit_function: 19 | type: sdf_volume 20 | 21 | sdf: 22 | type: sphere 23 | 24 | radius: 25 | val: 1.0 26 | opt: False 27 | 28 | center: 29 | val: [0.0, 0.0, 0.0] 30 | opt: False 31 | 32 | feature: 33 | rainbow: True 34 | val: [1.0, 1.0, 1.0] 35 | opt: False 36 | 37 | alpha: 38 | val: 1.0 39 | opt: False 40 | 41 | beta: 42 | val: 0.05 43 | opt: False -------------------------------------------------------------------------------- /configs/sphere_surface.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: render 4 | 5 | data: 6 | image_size: [256, 256] 7 | 8 | renderer: 9 | type: sphere_tracing 10 | chunk_size: 8192 11 | near: 0.0 12 | far: 5.0 13 | max_iters: 64 14 | 15 | sampler: 16 | type: stratified 17 | n_pts_per_ray: 18 | min_depth: 19 | max_depth: 20 | 21 | implicit_function: 22 | type: sdf_surface 23 | 24 | sdf: 25 | type: sphere 26 | 27 | center: 28 | val: [0.0, 0.0, 0.0] 29 | opt: True 30 | 31 | radius: 32 | val: 1.0 33 | opt: False 34 | 35 | feature: 36 | rainbow: True 37 | val: [1.0, 1.0, 1.0] 38 | opt: False -------------------------------------------------------------------------------- /configs/torus_surface.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: render 4 | 5 | data: 6 | image_size: [256, 256] 7 | 8 | renderer: 9 | type: sphere_tracing 10 | chunk_size: 8192 11 | near: 0.0 12 | far: 5.0 13 | max_iters: 64 14 | 15 | sampler: 16 | type: stratified 17 | n_pts_per_ray: 18 | min_depth: 19 | max_depth: 20 | 21 | implicit_function: 22 | type: sdf_surface 23 | 24 | sdf: 25 | type: torus 26 | 27 | center: 28 | val: [0.0, 0.0, 0.0] 29 | opt: True 30 | 31 | radii: 32 | val: [1.0, 0.25] 33 | opt: False 34 | 35 | feature: 36 | rainbow: True 37 | val: [1.0, 1.0, 1.0] 38 | opt: False 39 | -------------------------------------------------------------------------------- /configs/train_box.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train 4 | 5 | training: 6 | num_epochs: 1000 7 | batch_size: 4096 8 | lr: 0.0005 9 | 10 | data: 11 | image_size: [256, 256] 12 | 13 | cameras: 14 | cam0: 15 | focal: 1.0 16 | eye: [-2.5, 0.0, 0.0] 17 | principal_point: [0.0, 0.0] 18 | scene_center: [0.0, 0.0, 0.0] 19 | up: [0.0, 1.0, 0.0] 20 | 21 | image: "data/box_0.npy" 22 | 23 | cam1: 24 | focal: 1.0 25 | eye: [-1.0, 0.0, -2.2] 26 | principal_point: [0.0, 0.0] 27 | scene_center: [0.0, 0.0, 0.0] 28 | up: [0.0, 1.0, 0.0] 29 | 30 | image: "data/box_1.npy" 31 | 32 | cam2: 33 | focal: 1.0 34 | eye: [0.0, 0.0, -2.5] 35 | principal_point: [0.0, 0.0] 36 | scene_center: [0.0, 0.0, 0.0] 37 | up: [0.0, 1.0, 0.0] 38 | 39 | image: "data/box_2.npy" 40 | 41 | cam3: 42 | focal: 1.0 43 | eye: [1.0, 0.0, -2.2] 44 | principal_point: [0.0, 0.0] 45 | scene_center: [0.0, 0.0, 0.0] 46 | up: [0.0, 1.0, 0.0] 47 | 48 | image: "data/box_3.npy" 49 | 50 | renderer: 51 | type: volume 52 | chunk_size: 32768 53 | 54 | sampler: 55 | type: stratified 56 | n_pts_per_ray: 64 57 | min_depth: 0.0 58 | max_depth: 5.0 59 | 60 | 
implicit_function: 61 | type: sdf_volume 62 | 63 | sdf: 64 | type: box 65 | 66 | side_lengths: 67 | val: [1.5, 1.5, 1.5] 68 | opt: True 69 | 70 | center: 71 | val: [0.0, 0.0, 0.0] 72 | opt: True 73 | 74 | feature: 75 | rainbow: True 76 | val: [1.0, 1.0, 1.0] 77 | opt: False 78 | 79 | alpha: 80 | val: 1.0 81 | opt: False 82 | 83 | beta: 84 | val: 0.05 85 | opt: False -------------------------------------------------------------------------------- /configs/train_sphere.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train 4 | 5 | training: 6 | num_epochs: 250 7 | batch_size: 1024 8 | lr: 0.0005 9 | 10 | data: 11 | image_size: [256, 256] 12 | 13 | cameras: 14 | cam1: 15 | focal: 1.0 16 | eye: [0.0, 0.0, -3.0] 17 | principal_point: [0.0, 0.0] 18 | scene_center: [0.0, 0.0, 0.0] 19 | up: [0.0, 1.0, 0.0] 20 | 21 | cam2: 22 | focal: 1.0 23 | eye: [2.0, 0.0, -2.0] 24 | principal_point: [0.0, 0.0] 25 | scene_center: [0.0, 0.0, 0.0] 26 | up: [0.0, 1.0, 0.0] 27 | 28 | cam3: 29 | focal: 1.0 30 | eye: [3.0, 0.0, 0.0] 31 | principal_point: [0.0, 0.0] 32 | scene_center: [0.0, 0.0, 0.0] 33 | up: [0.0, 1.0, 0.0] 34 | 35 | renderer: 36 | type: volume 37 | chunk_size: 32768 38 | 39 | sampler: 40 | type: stratified 41 | n_pts_per_ray: 64 42 | min_depth: 0.0 43 | max_depth: 5.0 44 | 45 | implicit_function: 46 | type: sdf_volume 47 | 48 | sdf: 49 | type: sphere 50 | 51 | radius: 52 | val: 1.0 53 | opt: False 54 | 55 | center: 56 | val: [0.3, 0.2, 0.0] 57 | opt: True 58 | 59 | feature: 60 | val: [0.3, 0.1, 0.1] 61 | opt: True 62 | 63 | alpha: 64 | val: 1.0 65 | opt: False 66 | 67 | beta: 68 | val: 0.05 69 | opt: False -------------------------------------------------------------------------------- /configs/volsdf_surface.yaml: -------------------------------------------------------------------------------- 1 | seed: 1 2 | 3 | type: train_images 4 | 5 | data: 6 | image_size: [128, 128] 7 | dataset_name: lego 8 | 9 | renderer: 10 | type: volume_sdf 11 | chunk_size: 32768 12 | white_background: False 13 | 14 | alpha: 10.0 15 | beta: 0.05 16 | 17 | relighting_function: 18 | type: none 19 | 20 | sampler: 21 | type: stratified 22 | n_pts_per_ray: 128 23 | 24 | min_depth: 2.0 25 | max_depth: 6.0 26 | 27 | training: 28 | num_epochs: 250 29 | pretrain_iters: 1000 30 | batch_size: 1024 31 | lr: 0.0005 32 | 33 | lr_scheduler_step_size: 50 34 | lr_scheduler_gamma: 0.8 35 | 36 | checkpoint_path: ./volsdf_checkpoint 37 | checkpoint_interval: 50 38 | resume: True 39 | 40 | render_interval: 10 41 | 42 | inter_weight: 0.1 43 | eikonal_weight: 0.02 44 | bounds: [[-4, -4, -4], [4, 4, 4]] 45 | 46 | implicit_function: 47 | type: neural_surface 48 | 49 | n_harmonic_functions_xyz: 6 50 | 51 | n_layers_distance: 6 52 | n_hidden_neurons_distance: 128 53 | append_distance: [] 54 | 55 | n_layers_color: 2 56 | n_hidden_neurons_color: 128 57 | append_color: [] 58 | -------------------------------------------------------------------------------- /data/box_0.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_0.npy -------------------------------------------------------------------------------- /data/box_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_0.png 
-------------------------------------------------------------------------------- /data/box_1.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_1.npy -------------------------------------------------------------------------------- /data/box_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_1.png -------------------------------------------------------------------------------- /data/box_2.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_2.npy -------------------------------------------------------------------------------- /data/box_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_2.png -------------------------------------------------------------------------------- /data/box_3.npy: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_3.npy -------------------------------------------------------------------------------- /data/box_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/box_3.png -------------------------------------------------------------------------------- /data/bridge_pointcloud.npz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/bridge_pointcloud.npz -------------------------------------------------------------------------------- /data/bunny_pointcloud.npz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/data/bunny_pointcloud.npz -------------------------------------------------------------------------------- /data_utils.py: -------------------------------------------------------------------------------- 1 | from pytorch3d.renderer import ( 2 | PerspectiveCameras, 3 | look_at_view_transform 4 | ) 5 | 6 | import numpy as np 7 | import torch 8 | 9 | 10 | # Basic data loading 11 | def dataset_from_config( 12 | cfg, 13 | ): 14 | dataset = [] 15 | 16 | for cam_idx, cam_key in enumerate(cfg.cameras.keys()): 17 | cam_cfg = cfg.cameras[cam_key] 18 | 19 | # Create camera parameters 20 | R, T = look_at_view_transform( 21 | eye=(cam_cfg.eye,), 22 | at=(cam_cfg.scene_center,), 23 | up=(cam_cfg.up,), 24 | ) 25 | focal = torch.tensor([cam_cfg.focal])[None] 26 | principal_point = torch.tensor(cam_cfg.principal_point)[None] 27 | 28 | # Assemble the dataset 29 | image = None 30 | if 'image' in cam_cfg and cam_cfg.image is not None: 31 | image = torch.tensor(np.load(cam_cfg.image))[None] 32 | 33 | dataset.append( 34 | { 35 | "image": image, 36 | "camera": PerspectiveCameras( 37 | focal_length=focal, 38 | principal_point=principal_point, 39 | R=R, 40 | T=T, 41 | ), 42 | "camera_idx": cam_idx, 43 | 
} 44 | ) 45 | 46 | return dataset 47 | 48 | 49 | # Spiral cameras looking at the origin 50 | def create_surround_cameras(radius, n_poses=20, up=(0.0, 1.0, 0.0), focal_length=1.0): 51 | cameras = [] 52 | 53 | for theta in np.linspace(0, 2 * np.pi, n_poses + 1)[:-1]: 54 | 55 | if np.abs(up[1]) > 0: 56 | eye = [np.cos(theta + np.pi / 2) * radius, 0, -np.sin(theta + np.pi / 2) * radius] 57 | else: 58 | eye = [np.cos(theta + np.pi / 2) * radius, np.sin(theta + np.pi / 2) * radius, 2.0] 59 | 60 | R, T = look_at_view_transform( 61 | eye=(eye,), 62 | at=([0.0, 0.0, 0.0],), 63 | up=(up,), 64 | ) 65 | 66 | cameras.append( 67 | PerspectiveCameras( 68 | focal_length=torch.tensor([focal_length])[None], 69 | principal_point=torch.tensor([0.0, 0.0])[None], 70 | R=R, 71 | T=T, 72 | ) 73 | ) 74 | 75 | return cameras 76 | 77 | 78 | def vis_grid(xy_grid, image_size): 79 | xy_vis = (xy_grid + 1) / 2.001 80 | xy_vis = torch.cat([xy_vis, torch.zeros_like(xy_vis[..., :1])], -1) 81 | xy_vis = xy_vis.view(image_size[1], image_size[0], 3) 82 | xy_vis = np.array(xy_vis.detach().cpu()) 83 | 84 | return xy_vis 85 | 86 | 87 | def vis_rays(ray_bundle, image_size): 88 | rays = torch.abs(ray_bundle.directions) 89 | rays = rays.view(image_size[1], image_size[0], 3) 90 | rays = np.array(rays.detach().cpu()) 91 | 92 | return rays -------------------------------------------------------------------------------- /dataset.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the BSD-style license found in the 5 | # LICENSE file in the root directory of this source tree. 6 | 7 | import os 8 | from typing import List, Optional, Tuple 9 | 10 | import numpy as np 11 | import requests 12 | import torch 13 | from PIL import Image 14 | from pytorch3d.renderer import PerspectiveCameras 15 | from torch.utils.data import Dataset 16 | 17 | import matplotlib.pyplot as plt 18 | 19 | 20 | DEFAULT_DATA_ROOT = os.path.join( 21 | os.path.dirname(os.path.realpath(__file__)), "data" 22 | ) 23 | 24 | DEFAULT_URL_ROOT = "https://dl.fbaipublicfiles.com/pytorch3d_nerf_data" 25 | 26 | ALL_DATASETS = ("lego", "fern", "pt3logo", "materials") 27 | 28 | 29 | def trivial_collate(batch): 30 | """ 31 | A trivial collate function that merely returns the uncollated batch. 32 | """ 33 | return batch 34 | 35 | 36 | class ListDataset(Dataset): 37 | """ 38 | A simple dataset made of a list of entries. 39 | """ 40 | 41 | def __init__(self, entries: List) -> None: 42 | """ 43 | Args: 44 | entries: The list of dataset entries. 45 | """ 46 | self._entries = entries 47 | 48 | def __len__( 49 | self, 50 | ) -> int: 51 | return len(self._entries) 52 | 53 | def __getitem__(self, index): 54 | return self._entries[index] 55 | 56 | 57 | def get_nerf_datasets( 58 | dataset_name: str, # 'lego | fern' 59 | image_size: Tuple[int, int], 60 | data_root: str = DEFAULT_DATA_ROOT, 61 | autodownload: bool = True, 62 | ) -> Tuple[Dataset, Dataset, Dataset]: 63 | """ 64 | Obtains the training and validation dataset object for a dataset specified 65 | with the `dataset_name` argument. 66 | 67 | Args: 68 | dataset_name: The name of the dataset to load. 69 | image_size: A tuple (height, width) denoting the sizes of the loaded dataset images. 70 | data_root: The root folder at which the data is stored. 71 | autodownload: Auto-download the dataset files in case they are missing. 
72 | 73 | Returns: 74 | train_dataset: The training dataset object. 75 | val_dataset: The validation dataset object. 76 | test_dataset: The testing dataset object. 77 | """ 78 | 79 | if dataset_name not in ALL_DATASETS: 80 | raise ValueError(f"'{dataset_name}'' does not refer to a known dataset.") 81 | 82 | print(f"Loading dataset {dataset_name}, image size={str(image_size)} ...") 83 | 84 | cameras_path = os.path.join(data_root, dataset_name + ".pth") 85 | image_path = cameras_path.replace(".pth", ".png") 86 | 87 | if autodownload and any(not os.path.isfile(p) for p in (cameras_path, image_path)): 88 | # Automatically download the data files if missing. 89 | download_data((dataset_name,), data_root=data_root) 90 | 91 | train_data = torch.load(cameras_path) 92 | n_cameras = train_data["cameras"]["R"].shape[0] 93 | 94 | _image_max_image_pixels = Image.MAX_IMAGE_PIXELS 95 | Image.MAX_IMAGE_PIXELS = None # The dataset image is very large ... 96 | images = torch.FloatTensor(np.array(Image.open(image_path))) / 255.0 97 | images = torch.stack(torch.chunk(images, n_cameras, dim=0))[..., :3] 98 | Image.MAX_IMAGE_PIXELS = _image_max_image_pixels 99 | 100 | scale_factors = [s_new / s for s, s_new in zip(images.shape[1:3], image_size)] 101 | 102 | if abs(scale_factors[0] - scale_factors[1]) > 1e-3: 103 | raise ValueError( 104 | "Non-isotropic scaling is not allowed. Consider changing the 'image_size' argument." 105 | ) 106 | scale_factor = sum(scale_factors) * 0.5 107 | 108 | if scale_factor != 1.0: 109 | print(f"Rescaling dataset (factor={scale_factor})") 110 | images = torch.nn.functional.interpolate( 111 | images.permute(0, 3, 1, 2), 112 | size=tuple(image_size), 113 | mode="bilinear", 114 | ).permute(0, 2, 3, 1) 115 | 116 | cameras = [ 117 | PerspectiveCameras( 118 | **{k: v[cami][None] for k, v in train_data["cameras"].items()} 119 | ).to("cpu") 120 | for cami in range(n_cameras) 121 | ] 122 | 123 | train_idx, val_idx, test_idx = train_data["split"] 124 | 125 | train_dataset, val_dataset, test_dataset = [ 126 | ListDataset( 127 | [ 128 | {"image": images[i], "camera": cameras[i], "camera_idx": int(i)} 129 | for i in idx 130 | ] 131 | ) 132 | for idx in [train_idx, val_idx, test_idx] 133 | ] 134 | 135 | return train_dataset, val_dataset, test_dataset 136 | 137 | 138 | def download_data( 139 | dataset_names: Optional[List[str]] = None, 140 | data_root: str = DEFAULT_DATA_ROOT, 141 | url_root: str = DEFAULT_URL_ROOT, 142 | ) -> None: 143 | """ 144 | Downloads the relevant dataset files. 145 | 146 | Args: 147 | dataset_names: A list of the names of datasets to download. If `None`, 148 | downloads all available datasets. 
149 | """ 150 | 151 | if dataset_names is None: 152 | dataset_names = ALL_DATASETS 153 | 154 | os.makedirs(data_root, exist_ok=True) 155 | 156 | for dataset_name in dataset_names: 157 | cameras_file = dataset_name + ".pth" 158 | images_file = cameras_file.replace(".pth", ".png") 159 | license_file = cameras_file.replace(".pth", "_license.txt") 160 | 161 | for fl in (cameras_file, images_file, license_file): 162 | local_fl = os.path.join(data_root, fl) 163 | remote_fl = os.path.join(url_root, fl) 164 | 165 | print(f"Downloading dataset {dataset_name} from {remote_fl} to {local_fl}.") 166 | 167 | r = requests.get(remote_fl) 168 | 169 | with open(local_fl, "wb") as f: 170 | f.write(r.content) 171 | 172 | 173 | -------------------------------------------------------------------------------- /environment.yml: -------------------------------------------------------------------------------- 1 | name: l3d 2 | channels: 3 | - pyg 4 | - pytorch 5 | - pytorch3d 6 | - conda-forge 7 | - fvcore 8 | - iopath 9 | - bottler 10 | - defaults 11 | dependencies: 12 | - cudatoolkit=11.0 13 | - python=3.9 14 | - pip 15 | - pytorch 16 | - pytorch3d=0.6.1 17 | - torchvision 18 | - fvcore 19 | - iopath 20 | - nvidiacub 21 | - pip: 22 | - hydra-core 23 | - Pillow 24 | - plotly 25 | - requests 26 | - imageio 27 | - matplotlib 28 | - numpy 29 | - PyMCubes 30 | - tqdm 31 | - visdom 32 | -------------------------------------------------------------------------------- /images/.ignore: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/images/.ignore -------------------------------------------------------------------------------- /implicit.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | from torch import autograd 4 | 5 | from ray_utils import RayBundle 6 | 7 | 8 | # Sphere SDF class 9 | class SphereSDF(torch.nn.Module): 10 | def __init__( 11 | self, 12 | cfg 13 | ): 14 | super().__init__() 15 | 16 | self.radius = torch.nn.Parameter( 17 | torch.tensor(cfg.radius.val).float(), requires_grad=cfg.radius.opt 18 | ) 19 | self.center = torch.nn.Parameter( 20 | torch.tensor(cfg.center.val).float().unsqueeze(0), requires_grad=cfg.center.opt 21 | ) 22 | 23 | def forward(self, points): 24 | points = points.view(-1, 3) 25 | 26 | return torch.linalg.norm( 27 | points - self.center, 28 | dim=-1, 29 | keepdim=True 30 | ) - self.radius 31 | 32 | 33 | # Box SDF class 34 | class BoxSDF(torch.nn.Module): 35 | def __init__( 36 | self, 37 | cfg 38 | ): 39 | super().__init__() 40 | 41 | self.center = torch.nn.Parameter( 42 | torch.tensor(cfg.center.val).float().unsqueeze(0), requires_grad=cfg.center.opt 43 | ) 44 | self.side_lengths = torch.nn.Parameter( 45 | torch.tensor(cfg.side_lengths.val).float().unsqueeze(0), requires_grad=cfg.side_lengths.opt 46 | ) 47 | 48 | def forward(self, points): 49 | points = points.view(-1, 3) 50 | diff = torch.abs(points - self.center) - self.side_lengths / 2.0 51 | 52 | signed_distance = torch.linalg.norm( 53 | torch.maximum(diff, torch.zeros_like(diff)), 54 | dim=-1 55 | ) + torch.minimum(torch.max(diff, dim=-1)[0], torch.zeros_like(diff[..., 0])) 56 | 57 | return signed_distance.unsqueeze(-1) 58 | 59 | # Torus SDF class 60 | class TorusSDF(torch.nn.Module): 61 | def __init__( 62 | self, 63 | cfg 64 | ): 65 | super().__init__() 66 | 67 | self.center = torch.nn.Parameter( 68 | 
torch.tensor(cfg.center.val).float().unsqueeze(0), requires_grad=cfg.center.opt 69 | ) 70 | self.radii = torch.nn.Parameter( 71 | torch.tensor(cfg.radii.val).float().unsqueeze(0), requires_grad=cfg.radii.opt 72 | ) 73 | 74 | def forward(self, points): 75 | points = points.view(-1, 3) 76 | diff = points - self.center 77 | q = torch.stack( 78 | [ 79 | torch.linalg.norm(diff[..., :2], dim=-1) - self.radii[..., 0], 80 | diff[..., -1], 81 | ], 82 | dim=-1 83 | ) 84 | return (torch.linalg.norm(q, dim=-1) - self.radii[..., 1]).unsqueeze(-1) 85 | 86 | sdf_dict = { 87 | 'sphere': SphereSDF, 88 | 'box': BoxSDF, 89 | 'torus': TorusSDF, 90 | } 91 | 92 | 93 | # Converts SDF into density/feature volume 94 | class SDFVolume(torch.nn.Module): 95 | def __init__( 96 | self, 97 | cfg 98 | ): 99 | super().__init__() 100 | 101 | self.sdf = sdf_dict[cfg.sdf.type]( 102 | cfg.sdf 103 | ) 104 | 105 | self.rainbow = cfg.feature.rainbow if 'rainbow' in cfg.feature else False 106 | self.feature = torch.nn.Parameter( 107 | torch.ones_like(torch.tensor(cfg.feature.val).float().unsqueeze(0)), requires_grad=cfg.feature.opt 108 | ) 109 | 110 | self.alpha = torch.nn.Parameter( 111 | torch.tensor(cfg.alpha.val).float(), requires_grad=cfg.alpha.opt 112 | ) 113 | self.beta = torch.nn.Parameter( 114 | torch.tensor(cfg.beta.val).float(), requires_grad=cfg.beta.opt 115 | ) 116 | 117 | def _sdf_to_density(self, signed_distance): 118 | # Convert signed distance to density with alpha, beta parameters 119 | return torch.where( 120 | signed_distance > 0, 121 | 0.5 * torch.exp(-signed_distance / self.beta), 122 | 1 - 0.5 * torch.exp(signed_distance / self.beta), 123 | ) * self.alpha 124 | 125 | def forward(self, ray_bundle): 126 | sample_points = ray_bundle.sample_points.view(-1, 3) 127 | depth_values = ray_bundle.sample_lengths[..., 0] 128 | deltas = torch.cat( 129 | ( 130 | depth_values[..., 1:] - depth_values[..., :-1], 131 | 1e10 * torch.ones_like(depth_values[..., :1]), 132 | ), 133 | dim=-1, 134 | ).view(-1, 1) 135 | 136 | # Transform SDF to density 137 | signed_distance = self.sdf(ray_bundle.sample_points) 138 | density = self._sdf_to_density(signed_distance) 139 | 140 | # Outputs 141 | if self.rainbow: 142 | base_color = torch.clamp( 143 | torch.abs(sample_points - self.sdf.center), 144 | 0.02, 145 | 0.98 146 | ) 147 | else: 148 | base_color = 1.0 149 | 150 | out = { 151 | 'density': -torch.log(1.0 - density) / deltas, 152 | 'feature': base_color * self.feature * density.new_ones(sample_points.shape[0], 1) 153 | } 154 | 155 | return out 156 | 157 | 158 | # Converts SDF into density/feature volume 159 | class SDFSurface(torch.nn.Module): 160 | def __init__( 161 | self, 162 | cfg 163 | ): 164 | super().__init__() 165 | 166 | self.sdf = sdf_dict[cfg.sdf.type]( 167 | cfg.sdf 168 | ) 169 | self.rainbow = cfg.feature.rainbow if 'rainbow' in cfg.feature else False 170 | self.feature = torch.nn.Parameter( 171 | torch.ones_like(torch.tensor(cfg.feature.val).float().unsqueeze(0)), requires_grad=cfg.feature.opt 172 | ) 173 | 174 | def get_distance(self, points): 175 | points = points.view(-1, 3) 176 | return self.sdf(points) 177 | 178 | def get_color(self, points): 179 | points = points.view(-1, 3) 180 | 181 | # Outputs 182 | if self.rainbow: 183 | base_color = torch.clamp( 184 | torch.abs(points - self.sdf.center), 185 | 0.02, 186 | 0.98 187 | ) 188 | else: 189 | base_color = 1.0 190 | 191 | return base_color * self.feature * points.new_ones(points.shape[0], 1) 192 | 193 | def forward(self, points): 194 | return 
self.get_distance(points) 195 | 196 | class HarmonicEmbedding(torch.nn.Module): 197 | def __init__( 198 | self, 199 | in_channels: int = 3, 200 | n_harmonic_functions: int = 6, 201 | omega0: float = 1.0, 202 | logspace: bool = True, 203 | include_input: bool = True, 204 | ) -> None: 205 | super().__init__() 206 | 207 | if logspace: 208 | frequencies = 2.0 ** torch.arange( 209 | n_harmonic_functions, 210 | dtype=torch.float32, 211 | ) 212 | else: 213 | frequencies = torch.linspace( 214 | 1.0, 215 | 2.0 ** (n_harmonic_functions - 1), 216 | n_harmonic_functions, 217 | dtype=torch.float32, 218 | ) 219 | 220 | self.register_buffer("_frequencies", omega0 * frequencies, persistent=False) 221 | self.include_input = include_input 222 | self.output_dim = n_harmonic_functions * 2 * in_channels 223 | 224 | if self.include_input: 225 | self.output_dim += in_channels 226 | 227 | def forward(self, x: torch.Tensor): 228 | embed = (x[..., None] * self._frequencies).view(*x.shape[:-1], -1) 229 | 230 | if self.include_input: 231 | return torch.cat((embed.sin(), embed.cos(), x), dim=-1) 232 | else: 233 | return torch.cat((embed.sin(), embed.cos()), dim=-1) 234 | 235 | 236 | class LinearWithRepeat(torch.nn.Linear): 237 | def forward(self, input): 238 | n1 = input[0].shape[-1] 239 | output1 = F.linear(input[0], self.weight[:, :n1], self.bias) 240 | output2 = F.linear(input[1], self.weight[:, n1:], None) 241 | return output1 + output2.unsqueeze(-2) 242 | 243 | 244 | class MLPWithInputSkips(torch.nn.Module): 245 | def __init__( 246 | self, 247 | n_layers: int, 248 | input_dim: int, 249 | output_dim: int, 250 | skip_dim: int, 251 | hidden_dim: int, 252 | input_skips, 253 | ): 254 | super().__init__() 255 | 256 | layers = [] 257 | 258 | for layeri in range(n_layers): 259 | if layeri == 0: 260 | dimin = input_dim 261 | dimout = hidden_dim 262 | elif layeri in input_skips: 263 | dimin = hidden_dim + skip_dim 264 | dimout = hidden_dim 265 | else: 266 | dimin = hidden_dim 267 | dimout = hidden_dim 268 | 269 | linear = torch.nn.Linear(dimin, dimout) 270 | layers.append(torch.nn.Sequential(linear, torch.nn.ReLU(True))) 271 | 272 | self.mlp = torch.nn.ModuleList(layers) 273 | self._input_skips = set(input_skips) 274 | 275 | def forward(self, x: torch.Tensor, z: torch.Tensor) -> torch.Tensor: 276 | y = x 277 | 278 | for li, layer in enumerate(self.mlp): 279 | if li in self._input_skips: 280 | y = torch.cat((y, z), dim=-1) 281 | 282 | y = layer(y) 283 | 284 | return y 285 | 286 | 287 | # TODO (Q3.1): Implement NeRF MLP 288 | class NeuralRadianceField(torch.nn.Module): 289 | def __init__( 290 | self, 291 | cfg, 292 | ): 293 | super().__init__() 294 | 295 | self.harmonic_embedding_xyz = HarmonicEmbedding(3, cfg.n_harmonic_functions_xyz) 296 | self.harmonic_embedding_dir = HarmonicEmbedding(3, cfg.n_harmonic_functions_dir) 297 | 298 | embedding_dim_xyz = self.harmonic_embedding_xyz.output_dim 299 | embedding_dim_dir = self.harmonic_embedding_dir.output_dim 300 | 301 | pass 302 | 303 | 304 | class NeuralSurface(torch.nn.Module): 305 | def __init__( 306 | self, 307 | cfg, 308 | ): 309 | super().__init__() 310 | # TODO (Q6): Implement Neural Surface MLP to output per-point SDF 311 | # TODO (Q7): Implement Neural Surface MLP to output per-point color 312 | 313 | def get_distance( 314 | self, 315 | points 316 | ): 317 | ''' 318 | TODO: Q6 319 | Output: 320 | distance: N X 1 Tensor, where N is number of input points 321 | ''' 322 | points = points.view(-1, 3) 323 | pass 324 | 325 | def get_color( 326 | self, 327 | points 328 | ): 
329 | ''' 330 | TODO: Q7 331 | Output: 332 | distance: N X 3 Tensor, where N is number of input points 333 | ''' 334 | points = points.view(-1, 3) 335 | pass 336 | 337 | def get_distance_color( 338 | self, 339 | points 340 | ): 341 | ''' 342 | TODO: Q7 343 | Output: 344 | distance, points: N X 1, N X 3 Tensors, where N is number of input points 345 | You may just implement this by independent calls to get_distance, get_color 346 | but, depending on your MLP implementation, it maybe more efficient to share some computation 347 | ''' 348 | 349 | def forward(self, points): 350 | return self.get_distance(points) 351 | 352 | def get_distance_and_gradient( 353 | self, 354 | points 355 | ): 356 | has_grad = torch.is_grad_enabled() 357 | points = points.view(-1, 3) 358 | 359 | # Calculate gradient with respect to points 360 | with torch.enable_grad(): 361 | points = points.requires_grad_(True) 362 | distance = self.get_distance(points) 363 | gradient = autograd.grad( 364 | distance, 365 | points, 366 | torch.ones_like(distance, device=points.device), 367 | create_graph=has_grad, 368 | retain_graph=has_grad, 369 | only_inputs=True 370 | )[0] 371 | 372 | return distance, gradient 373 | 374 | 375 | implicit_dict = { 376 | 'sdf_volume': SDFVolume, 377 | 'nerf': NeuralRadianceField, 378 | 'sdf_surface': SDFSurface, 379 | 'neural_surface': NeuralSurface, 380 | } 381 | -------------------------------------------------------------------------------- /losses.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | 4 | def eikonal_loss(gradients): 5 | # TODO (Q6): Implement eikonal loss 6 | pass 7 | 8 | def sphere_loss(signed_distance, points, radius=1.0): 9 | return torch.square(signed_distance[..., 0] - (torch.norm(points, dim=-1) - radius)).mean() 10 | 11 | def get_random_points(num_points, bounds, device): 12 | min_bound = torch.tensor(bounds[0], device=device).unsqueeze(0) 13 | max_bound = torch.tensor(bounds[1], device=device).unsqueeze(0) 14 | 15 | return torch.rand((num_points, 3), device=device) * (max_bound - min_bound) + min_bound 16 | 17 | def select_random_points(points, n_points): 18 | points_sub = points[torch.randperm(points.shape[0])] 19 | return points_sub.reshape(-1, 3)[:n_points] 20 | -------------------------------------------------------------------------------- /ray_utils.py: -------------------------------------------------------------------------------- 1 | import math 2 | from typing import List, NamedTuple 3 | 4 | import torch 5 | import torch.nn.functional as F 6 | from pytorch3d.renderer.cameras import CamerasBase 7 | 8 | 9 | # Convenience class wrapping several ray inputs: 10 | # 1) Origins -- ray origins 11 | # 2) Directions -- ray directions 12 | # 3) Sample points -- sample points along ray direction from ray origin 13 | # 4) Sample lengths -- distance of sample points from ray origin 14 | 15 | class RayBundle(object): 16 | def __init__( 17 | self, 18 | origins, 19 | directions, 20 | sample_points, 21 | sample_lengths, 22 | ): 23 | self.origins = origins 24 | self.directions = directions 25 | self.sample_points = sample_points 26 | self.sample_lengths = sample_lengths 27 | 28 | def __getitem__(self, idx): 29 | return RayBundle( 30 | self.origins[idx], 31 | self.directions[idx], 32 | self.sample_points[idx], 33 | self.sample_lengths[idx], 34 | ) 35 | 36 | @property 37 | def shape(self): 38 | return self.origins.shape[:-1] 39 | 40 | @property 41 | def sample_shape(self): 42 | return 
self.sample_points.shape[:-1] 43 | 44 | def reshape(self, *args): 45 | return RayBundle( 46 | self.origins.reshape(*args, 3), 47 | self.directions.reshape(*args, 3), 48 | self.sample_points.reshape(*args, self.sample_points.shape[-2], 3), 49 | self.sample_lengths.reshape(*args, self.sample_lengths.shape[-2], 3), 50 | ) 51 | 52 | def view(self, *args): 53 | return RayBundle( 54 | self.origins.view(*args, 3), 55 | self.directions.view(*args, 3), 56 | self.sample_points.view(*args, self.sample_points.shape[-2], 3), 57 | self.sample_lengths.view(*args, self.sample_lengths.shape[-2], 3), 58 | ) 59 | 60 | def _replace(self, **kwargs): 61 | for key in kwargs.keys(): 62 | setattr(self, key, kwargs[key]) 63 | 64 | return self 65 | 66 | 67 | # Sample image colors from pixel values 68 | def sample_images_at_xy( 69 | images: torch.Tensor, 70 | xy_grid: torch.Tensor, 71 | ): 72 | batch_size = images.shape[0] 73 | spatial_size = images.shape[1:-1] 74 | 75 | xy_grid = -xy_grid.view(batch_size, -1, 1, 2) 76 | 77 | images_sampled = torch.nn.functional.grid_sample( 78 | images.permute(0, 3, 1, 2), 79 | xy_grid, 80 | align_corners=True, 81 | mode="bilinear", 82 | ) 83 | 84 | return images_sampled.permute(0, 2, 3, 1).view(-1, images.shape[-1]) 85 | 86 | 87 | # Generate pixel coordinates from in NDC space (from [-1, 1]) 88 | def get_pixels_from_image(image_size, camera): 89 | W, H = image_size[0], image_size[1] 90 | 91 | # TODO (Q1.3): Generate pixel coordinates from [0, W] in x and [0, H] in y 92 | pass 93 | 94 | # TODO (Q1.3): Convert to the range [-1, 1] in both x and y 95 | pass 96 | 97 | # Create grid of coordinates 98 | xy_grid = torch.stack( 99 | tuple( reversed( torch.meshgrid(y, x) ) ), 100 | dim=-1, 101 | ).view(W * H, 2) 102 | 103 | return -xy_grid 104 | 105 | 106 | # Random subsampling of pixels from an image 107 | def get_random_pixels_from_image(n_pixels, image_size, camera): 108 | xy_grid = get_pixels_from_image(image_size, camera) 109 | 110 | # TODO (Q2.1): Random subsampling of pixel coordinaters 111 | pass 112 | 113 | # Return 114 | return xy_grid_sub.reshape(-1, 2)[:n_pixels] 115 | 116 | 117 | # Get rays from pixel values 118 | def get_rays_from_pixels(xy_grid, image_size, camera): 119 | W, H = image_size[0], image_size[1] 120 | 121 | # TODO (Q1.3): Map pixels to points on the image plane at Z=1 122 | pass 123 | 124 | ndc_points = torch.cat( 125 | [ 126 | ndc_points, 127 | torch.ones_like(ndc_points[..., -1:]) 128 | ], 129 | dim=-1 130 | ) 131 | 132 | # TODO (Q1.3): Use camera.unproject to get world space points from NDC space points 133 | pass 134 | 135 | # TODO (Q1.3): Get ray origins from camera center 136 | pass 137 | 138 | # TODO (Q1.3): Get ray directions as image_plane_points - rays_o 139 | pass 140 | 141 | # Create and return RayBundle 142 | return RayBundle( 143 | rays_o, 144 | rays_d, 145 | torch.zeros_like(rays_o).unsqueeze(1), 146 | torch.zeros_like(rays_o).unsqueeze(1), 147 | ) -------------------------------------------------------------------------------- /render_functions.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | import sys 4 | import datetime 5 | import time 6 | import math 7 | import json 8 | import torch 9 | 10 | import numpy as np 11 | from PIL import Image 12 | 13 | import matplotlib.pyplot as plt 14 | import pytorch3d 15 | import torch 16 | 17 | from pytorch3d.renderer import look_at_view_transform 18 | from pytorch3d.renderer import OpenGLPerspectiveCameras 19 | from pytorch3d.renderer 
import ( 20 | AlphaCompositor, 21 | RasterizationSettings, 22 | MeshRenderer, 23 | MeshRasterizer, 24 | PointsRasterizationSettings, 25 | PointsRenderer, 26 | PointsRasterizer, 27 | HardPhongShader, 28 | ) 29 | 30 | import mcubes 31 | 32 | def get_device(): 33 | """ 34 | Checks if GPU is available and returns device accordingly. 35 | """ 36 | if torch.cuda.is_available(): 37 | device = torch.device("cuda:0") 38 | else: 39 | device = torch.device("cpu") 40 | return device 41 | 42 | def get_points_renderer( 43 | image_size=512, device=None, radius=0.01, background_color=(1, 1, 1) 44 | ): 45 | """ 46 | Returns a Pytorch3D renderer for point clouds. 47 | 48 | Args: 49 | image_size (int): The rendered image size. 50 | device (torch.device): The torch device to use (CPU or GPU). If not specified, 51 | will automatically use GPU if available, otherwise CPU. 52 | radius (float): The radius of the rendered point in NDC. 53 | background_color (tuple): The background color of the rendered image. 54 | 55 | Returns: 56 | PointsRenderer. 57 | """ 58 | if device is None: 59 | if torch.cuda.is_available(): 60 | device = torch.device("cuda:0") 61 | else: 62 | device = torch.device("cpu") 63 | raster_settings = PointsRasterizationSettings(image_size=image_size, radius=radius,) 64 | renderer = PointsRenderer( 65 | rasterizer=PointsRasterizer(raster_settings=raster_settings), 66 | compositor=AlphaCompositor(background_color=background_color), 67 | ) 68 | return renderer 69 | 70 | 71 | def render_points(filename, points, image_size=256, color=[0.7, 0.7, 1], device=None): 72 | # The device tells us whether we are rendering with GPU or CPU. The rendering will 73 | # be *much* faster if you have a CUDA-enabled NVIDIA GPU. However, your code will 74 | # still run fine on a CPU. 75 | # The default is to run on CPU, so if you do not have a GPU, you do not need to 76 | # worry about specifying the device in all of these functions. 77 | if device is None: 78 | device = get_device() 79 | 80 | # Get the renderer. 81 | points_renderer = get_points_renderer(image_size=256,radius=0.01) 82 | 83 | # Get the vertices, faces, and textures. 84 | # vertices, faces = load_cow_mesh(cow_path) 85 | # vertices = vertices.unsqueeze(0) # (N_v, 3) -> (1, N_v, 3) 86 | # faces = faces.unsqueeze(0) # (N_f, 3) -> (1, N_f, 3) 87 | textures = torch.ones(points.size()).to(device)*0.5 # (1, N_v, 3) 88 | rgb = textures * torch.tensor(color).to(device) # (1, N_v, 3) 89 | 90 | point_cloud = pytorch3d.structures.pointclouds.Pointclouds( 91 | points=points, features=rgb 92 | ) 93 | 94 | R, T = look_at_view_transform(10.0, 10.0, 96) 95 | 96 | 97 | # Prepare the camera: 98 | cameras = OpenGLPerspectiveCameras( 99 | R=R,T=T, device=device 100 | ) 101 | 102 | rend = points_renderer(point_cloud.extend(2), cameras=cameras) 103 | 104 | 105 | # Place a point light in front of the cow. 106 | # lights = pytorch3d.renderer.PointLights(location=[[0.0, 1.0, -2.0]], device=device) 107 | 108 | # rend = renderer(mesh, cameras=cameras, lights=lights) 109 | rend = rend.detach().cpu().numpy()[0, ..., :3] # (B, H, W, 4) -> (H, W, 3) 110 | plt.imsave(filename, rend) 111 | 112 | # The .cpu moves the tensor to GPU (if needed). 113 | return rend 114 | 115 | def render_points_with_save( 116 | points, 117 | cameras, 118 | image_size, 119 | save=False, 120 | file_prefix='', 121 | color=[0.7, 0.7, 1] 122 | ): 123 | device = points.device 124 | if device is None: 125 | device = get_device() 126 | 127 | # Get the renderer. 
128 | points_renderer = get_points_renderer(image_size=image_size[0], radius=0.01) 129 | 130 | textures = torch.ones(points.size()).to(device) # (1, N_v, 3) 131 | rgb = textures * torch.tensor(color).to(device) # (1, N_v, 3) 132 | 133 | point_cloud = pytorch3d.structures.pointclouds.Pointclouds( 134 | points=points, features=rgb 135 | ) 136 | 137 | all_images = [] 138 | with torch.no_grad(): 139 | torch.cuda.empty_cache() 140 | for cam_idx in range(len(cameras)): 141 | image = points_renderer(point_cloud, cameras=cameras[cam_idx].to(device)) 142 | image = image[0,:,:,:3].detach().cpu().numpy() 143 | all_images.append(image) 144 | 145 | # Save 146 | if save: 147 | plt.imsave( 148 | f'{file_prefix}_{cam_idx}.png', 149 | image 150 | ) 151 | 152 | return all_images 153 | 154 | def get_mesh_renderer(image_size=512, lights=None, device=None): 155 | """ 156 | Returns a Pytorch3D Mesh Renderer. 157 | Args: 158 | image_size (int): The rendered image size. 159 | lights: A default Pytorch3D lights object. 160 | device (torch.device): The torch device to use (CPU or GPU). If not specified, 161 | will automatically use GPU if available, otherwise CPU. 162 | """ 163 | if device is None: 164 | if torch.cuda.is_available(): 165 | device = torch.device("cuda:0") 166 | else: 167 | device = torch.device("cpu") 168 | raster_settings = RasterizationSettings( 169 | image_size=image_size, blur_radius=0.0, faces_per_pixel=1, 170 | ) 171 | renderer = MeshRenderer( 172 | rasterizer=MeshRasterizer(raster_settings=raster_settings), 173 | shader=HardPhongShader(device=device, lights=lights), 174 | ) 175 | return renderer 176 | 177 | 178 | def implicit_to_mesh(implicit_fn, scale=0.5, grid_size=128, device='cpu', color=[0.7, 0.7, 1], chunk_size=262144, thresh=0): 179 | Xs = torch.linspace(-1*scale, scale, grid_size+1).to(device) 180 | Ys = torch.linspace(-1*scale, scale, grid_size+1).to(device) 181 | Zs = torch.linspace(-1*scale, scale, grid_size+1).to(device) 182 | grid = torch.stack(torch.meshgrid(Xs, Ys, Zs), dim=-1) 183 | 184 | grid = grid.view(-1, 3) 185 | num_points = grid.shape[0] 186 | sdfs = torch.zeros(num_points) 187 | 188 | with torch.no_grad(): 189 | for chunk_start in range(0, num_points, chunk_size): 190 | torch.cuda.empty_cache() 191 | chunk_end = min(num_points, chunk_start+chunk_size) 192 | sdfs[chunk_start:chunk_end] = implicit_fn.get_distance(grid[chunk_start:chunk_end,:]).view(-1) 193 | 194 | sdfs = sdfs.view(grid_size+1, grid_size+1, grid_size+1) 195 | 196 | vertices, triangles = mcubes.marching_cubes(sdfs.cpu().numpy(), thresh) 197 | # normalize to [-scale, scale] 198 | vertices = (vertices/grid_size - 0.5)*2*scale 199 | 200 | vertices = torch.from_numpy(vertices).unsqueeze(0).float() 201 | faces = torch.from_numpy(triangles.astype(np.int64)).unsqueeze(0) 202 | 203 | textures = torch.ones_like(vertices) # (1, N_v, 3) 204 | textures = textures * torch.tensor(color) # (1, N_v, 3) 205 | mesh = pytorch3d.structures.Meshes( 206 | verts=vertices, 207 | faces=faces, 208 | textures=pytorch3d.renderer.TexturesVertex(textures), 209 | ) 210 | mesh = mesh.to(device) 211 | return mesh 212 | 213 | 214 | def render_geometry( 215 | model, 216 | cameras, 217 | image_size, 218 | save=False, 219 | thresh=0., 220 | file_prefix='' 221 | ): 222 | device = list(model.parameters())[0].device 223 | lights = pytorch3d.renderer.PointLights(location=[[0, 0, -3]], device=device) 224 | mesh_renderer = get_mesh_renderer(image_size=image_size[0], lights=lights, device=device) 225 | 226 | mesh = 
implicit_to_mesh(model.implicit_fn, scale=3, device=device, thresh=thresh) 227 | all_images = [] 228 | with torch.no_grad(): 229 | torch.cuda.empty_cache() 230 | for cam_idx in range(len(cameras)): 231 | image = mesh_renderer(mesh, cameras=cameras[cam_idx].to(device)) 232 | image = image[0,:,:,:3].detach().cpu().numpy() 233 | all_images.append(image) 234 | 235 | # Save 236 | if save: 237 | plt.imsave( 238 | f'{file_prefix}_{cam_idx}.png', 239 | image 240 | ) 241 | 242 | return all_images -------------------------------------------------------------------------------- /renderer.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | from typing import List, Optional, Tuple 4 | from pytorch3d.renderer.cameras import CamerasBase 5 | 6 | 7 | # Volume renderer which integrates color and density along rays 8 | # according to the equations defined in [Mildenhall et al. 2020] 9 | class VolumeRenderer(torch.nn.Module): 10 | def __init__( 11 | self, 12 | cfg 13 | ): 14 | super().__init__() 15 | 16 | self._chunk_size = cfg.chunk_size 17 | self._white_background = cfg.white_background if 'white_background' in cfg else False 18 | 19 | def _compute_weights( 20 | self, 21 | deltas, 22 | rays_density: torch.Tensor, 23 | eps: float = 1e-10 24 | ): 25 | # TODO (1.5): Compute transmittance using the equation described in the README 26 | pass 27 | 28 | # TODO (1.5): Compute weight used for rendering from transmittance and alpha 29 | return weights 30 | 31 | def _aggregate( 32 | self, 33 | weights: torch.Tensor, 34 | rays_feature: torch.Tensor 35 | ): 36 | # TODO (1.5): Aggregate (weighted sum of) features using weights 37 | pass 38 | 39 | return feature 40 | 41 | def forward( 42 | self, 43 | sampler, 44 | implicit_fn, 45 | ray_bundle, 46 | ): 47 | B = ray_bundle.shape[0] 48 | 49 | # Process the chunks of rays. 50 | chunk_outputs = [] 51 | 52 | for chunk_start in range(0, B, self._chunk_size): 53 | cur_ray_bundle = ray_bundle[chunk_start:chunk_start+self._chunk_size] 54 | 55 | # Sample points along the ray 56 | cur_ray_bundle = sampler(cur_ray_bundle) 57 | n_pts = cur_ray_bundle.sample_shape[1] 58 | 59 | # Call implicit function with sample points 60 | implicit_output = implicit_fn(cur_ray_bundle) 61 | density = implicit_output['density'] 62 | feature = implicit_output['feature'] 63 | 64 | # Compute length of each ray segment 65 | depth_values = cur_ray_bundle.sample_lengths[..., 0] 66 | deltas = torch.cat( 67 | ( 68 | depth_values[..., 1:] - depth_values[..., :-1], 69 | 1e10 * torch.ones_like(depth_values[..., :1]), 70 | ), 71 | dim=-1, 72 | )[..., None] 73 | 74 | # Compute aggregation weights 75 | weights = self._compute_weights( 76 | deltas.view(-1, n_pts, 1), 77 | density.view(-1, n_pts, 1) 78 | ) 79 | 80 | # TODO (1.5): Render (color) features using weights 81 | pass 82 | 83 | # TODO (1.5): Render depth map 84 | pass 85 | 86 | # Return 87 | cur_out = { 88 | 'feature': feature, 89 | 'depth': depth, 90 | } 91 | 92 | chunk_outputs.append(cur_out) 93 | 94 | # Concatenate chunk outputs 95 | out = { 96 | k: torch.cat( 97 | [chunk_out[k] for chunk_out in chunk_outputs], 98 | dim=0 99 | ) for k in chunk_outputs[0].keys() 100 | } 101 | 102 | return out 103 | 104 | 105 | # Volume renderer which integrates color and density along rays 106 | # according to the equations defined in [Mildenhall et al. 
2020] 107 | class SphereTracingRenderer(torch.nn.Module): 108 | def __init__( 109 | self, 110 | cfg 111 | ): 112 | super().__init__() 113 | 114 | self._chunk_size = cfg.chunk_size 115 | self.near = cfg.near 116 | self.far = cfg.far 117 | self.max_iters = cfg.max_iters 118 | 119 | def sphere_tracing( 120 | self, 121 | implicit_fn, 122 | origins, # Nx3 123 | directions, # Nx3 124 | ): 125 | ''' 126 | Input: 127 | implicit_fn: a module that computes a SDF at a query point 128 | origins: N_rays X 3 129 | directions: N_rays X 3 130 | Output: 131 | points: N_rays X 3 points indicating ray-surface intersections. For rays that do not intersect the surface, 132 | the point can be arbitrary. 133 | mask: N_rays X 1 (boolean tensor) denoting which of the input rays intersect the surface. 134 | ''' 135 | # TODO (Q5): Implement sphere tracing 136 | # 1) Iteratively update points and distance to the closest surface 137 | # in order to compute intersection points of rays with the implicit surface 138 | # 2) Maintain a mask with the same batch dimension as the ray origins, 139 | # indicating which points hit the surface, and which do not 140 | pass 141 | 142 | def forward( 143 | self, 144 | sampler, 145 | implicit_fn, 146 | ray_bundle, 147 | light_dir=None 148 | ): 149 | B = ray_bundle.shape[0] 150 | 151 | # Process the chunks of rays. 152 | chunk_outputs = [] 153 | 154 | for chunk_start in range(0, B, self._chunk_size): 155 | cur_ray_bundle = ray_bundle[chunk_start:chunk_start+self._chunk_size] 156 | points, mask = self.sphere_tracing( 157 | implicit_fn, 158 | cur_ray_bundle.origins, 159 | cur_ray_bundle.directions 160 | ) 161 | mask = mask.repeat(1,3) 162 | isect_points = points[mask].view(-1, 3) 163 | 164 | # Get color from implicit function with intersection points 165 | isect_color = implicit_fn.get_color(isect_points) 166 | 167 | # Return 168 | color = torch.zeros_like(cur_ray_bundle.origins) 169 | color[mask] = isect_color.view(-1) 170 | 171 | cur_out = { 172 | 'color': color.view(-1, 3), 173 | } 174 | 175 | chunk_outputs.append(cur_out) 176 | 177 | # Concatenate chunk outputs 178 | out = { 179 | k: torch.cat( 180 | [chunk_out[k] for chunk_out in chunk_outputs], 181 | dim=0 182 | ) for k in chunk_outputs[0].keys() 183 | } 184 | 185 | return out 186 | 187 | 188 | def sdf_to_density(signed_distance, alpha, beta): 189 | # TODO (Q7): Convert signed distance to density with alpha, beta parameters 190 | pass 191 | 192 | class VolumeSDFRenderer(VolumeRenderer): 193 | def __init__( 194 | self, 195 | cfg 196 | ): 197 | super().__init__(cfg) 198 | 199 | self._chunk_size = cfg.chunk_size 200 | self._white_background = cfg.white_background if 'white_background' in cfg else False 201 | self.alpha = cfg.alpha 202 | self.beta = cfg.beta 203 | 204 | self.cfg = cfg 205 | 206 | def forward( 207 | self, 208 | sampler, 209 | implicit_fn, 210 | ray_bundle, 211 | light_dir=None 212 | ): 213 | B = ray_bundle.shape[0] 214 | 215 | # Process the chunks of rays. 
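        # Hint (one possible approach, not the reference solution): the Q7 `sdf_to_density`
        # stub above, and the "convert SDF to density" TODO in the loop below, can mirror the
        # Laplace-CDF conversion already implemented in `SDFVolume._sdf_to_density` (implicit.py):
        #
        #     def sdf_to_density(signed_distance, alpha, beta):
        #         return alpha * torch.where(
        #             signed_distance > 0,
        #             0.5 * torch.exp(-signed_distance / beta),
        #             1 - 0.5 * torch.exp(signed_distance / beta),
        #         )
        #
        # Here `alpha` scales the density and `beta` controls how sharply it falls off across
        # the surface; this sketch assumes the same convention as the SDFVolume code in implicit.py.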
216 | chunk_outputs = [] 217 | 218 | for chunk_start in range(0, B, self._chunk_size): 219 | cur_ray_bundle = ray_bundle[chunk_start:chunk_start+self._chunk_size] 220 | 221 | # Sample points along the ray 222 | cur_ray_bundle = sampler(cur_ray_bundle) 223 | n_pts = cur_ray_bundle.sample_shape[1] 224 | 225 | # Call implicit function with sample points 226 | distance, color = implicit_fn.get_distance_color(cur_ray_bundle.sample_points) 227 | density = None # TODO (Q7): convert SDF to density 228 | 229 | # Compute length of each ray segment 230 | depth_values = cur_ray_bundle.sample_lengths[..., 0] 231 | deltas = torch.cat( 232 | ( 233 | depth_values[..., 1:] - depth_values[..., :-1], 234 | 1e10 * torch.ones_like(depth_values[..., :1]), 235 | ), 236 | dim=-1, 237 | )[..., None] 238 | 239 | # Compute aggregation weights 240 | weights = self._compute_weights( 241 | deltas.view(-1, n_pts, 1), 242 | density.view(-1, n_pts, 1) 243 | ) 244 | 245 | geometry_color = torch.zeros_like(color) 246 | 247 | # Compute color 248 | color = self._aggregate( 249 | weights, 250 | color.view(-1, n_pts, color.shape[-1]) 251 | ) 252 | 253 | # Return 254 | cur_out = { 255 | 'color': color, 256 | "geometry": geometry_color 257 | } 258 | 259 | chunk_outputs.append(cur_out) 260 | 261 | # Concatenate chunk outputs 262 | out = { 263 | k: torch.cat( 264 | [chunk_out[k] for chunk_out in chunk_outputs], 265 | dim=0 266 | ) for k in chunk_outputs[0].keys() 267 | } 268 | 269 | return out 270 | 271 | 272 | renderer_dict = { 273 | 'volume': VolumeRenderer, 274 | 'sphere_tracing': SphereTracingRenderer, 275 | 'volume_sdf': VolumeSDFRenderer 276 | } 277 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | torchvision 2 | hydra-core 3 | Pillow 4 | plotly 5 | requests 6 | imageio 7 | matplotlib 8 | numpy<2.0.0 9 | PyMCubes 10 | tqdm 11 | visdom -------------------------------------------------------------------------------- /sampler.py: -------------------------------------------------------------------------------- 1 | import math 2 | from typing import List 3 | 4 | import torch 5 | from ray_utils import RayBundle 6 | from pytorch3d.renderer.cameras import CamerasBase 7 | 8 | 9 | # Sampler which implements stratified (uniform) point sampling along rays 10 | class StratifiedRaysampler(torch.nn.Module): 11 | def __init__( 12 | self, 13 | cfg 14 | ): 15 | super().__init__() 16 | 17 | self.n_pts_per_ray = cfg.n_pts_per_ray 18 | self.min_depth = cfg.min_depth 19 | self.max_depth = cfg.max_depth 20 | 21 | def forward( 22 | self, 23 | ray_bundle, 24 | ): 25 | # TODO (Q1.4): Compute z values for self.n_pts_per_ray points uniformly sampled between [near, far] 26 | z_vals = None 27 | 28 | # TODO (Q1.4): Sample points from z values 29 | sample_points = None 30 | 31 | # Return 32 | return ray_bundle._replace( 33 | sample_points=sample_points, 34 | sample_lengths=z_vals * torch.ones_like(sample_points[..., :1]), 35 | ) 36 | 37 | 38 | sampler_dict = { 39 | 'stratified': StratifiedRaysampler 40 | } -------------------------------------------------------------------------------- /surface_rendering_main.py: -------------------------------------------------------------------------------- 1 | import os 2 | import warnings 3 | 4 | import hydra 5 | import numpy as np 6 | import torch 7 | import tqdm 8 | import imageio 9 | 10 | from omegaconf import DictConfig 11 | from PIL import Image 12 | from pytorch3d.renderer 
import ( 13 | PerspectiveCameras, 14 | look_at_view_transform 15 | ) 16 | import matplotlib.pyplot as plt 17 | 18 | from sampler import sampler_dict 19 | from implicit import implicit_dict 20 | from renderer import renderer_dict 21 | from losses import eikonal_loss, sphere_loss, get_random_points, select_random_points 22 | 23 | from ray_utils import ( 24 | sample_images_at_xy, 25 | get_pixels_from_image, 26 | get_random_pixels_from_image, 27 | get_rays_from_pixels 28 | ) 29 | from data_utils import ( 30 | dataset_from_config, 31 | create_surround_cameras, 32 | vis_grid, 33 | vis_rays, 34 | ) 35 | from dataset import ( 36 | get_nerf_datasets, 37 | trivial_collate, 38 | ) 39 | from render_functions import render_geometry 40 | from render_functions import render_points_with_save 41 | 42 | 43 | # Model class containing: 44 | # 1) Implicit function defining the scene 45 | # 2) Sampling scheme which generates sample points along rays 46 | # 3) Renderer which can render an implicit function given a sampling scheme 47 | 48 | class Model(torch.nn.Module): 49 | def __init__( 50 | self, 51 | cfg 52 | ): 53 | super().__init__() 54 | 55 | # Get implicit function from config 56 | self.implicit_fn = implicit_dict[cfg.implicit_function.type]( 57 | cfg.implicit_function 58 | ) 59 | 60 | # Point sampling (raymarching) scheme 61 | self.sampler = sampler_dict[cfg.sampler.type]( 62 | cfg.sampler 63 | ) 64 | 65 | # Initialize implicit renderer 66 | self.renderer = renderer_dict[cfg.renderer.type]( 67 | cfg.renderer 68 | ) 69 | 70 | def forward( 71 | self, 72 | ray_bundle, 73 | light_dir=None 74 | ): 75 | # Call renderer with 76 | # a) Implicit function 77 | # b) Sampling routine 78 | 79 | return self.renderer( 80 | self.sampler, 81 | self.implicit_fn, 82 | ray_bundle, 83 | light_dir 84 | ) 85 | 86 | 87 | def render_images( 88 | model, 89 | cameras, 90 | image_size, 91 | save=False, 92 | file_prefix='', 93 | lights=None, 94 | feat='color' 95 | ): 96 | all_images = [] 97 | device = list(model.parameters())[0].device 98 | 99 | for cam_idx, camera in enumerate(cameras): 100 | print(f'Rendering image {cam_idx}') 101 | 102 | with torch.no_grad(): 103 | torch.cuda.empty_cache() 104 | 105 | # Get rays 106 | camera = camera.to(device) 107 | light_dir = None 108 | # We assume the object is placed at the origin 109 | origin = torch.tensor([0.0, 0.0, 0.0], device=device) 110 | light_location = None if lights is None else lights[cam_idx].location.to(device) 111 | if lights is not None: 112 | light_dir = None #TODO: Use light location and origin to compute light direction 113 | light_dir = torch.nn.functional.normalize(light_dir, dim=-1).view(-1, 3) 114 | xy_grid = get_pixels_from_image(image_size, camera) 115 | ray_bundle = get_rays_from_pixels(xy_grid, image_size, camera) 116 | 117 | # Run model forward 118 | out = model(ray_bundle, light_dir) 119 | 120 | # Return rendered features (colors) 121 | image = np.array( 122 | out[feat].view( 123 | image_size[1], image_size[0], 3 124 | ).detach().cpu() 125 | ) 126 | all_images.append(image) 127 | 128 | # Save 129 | if save: 130 | plt.imsave( 131 | f'{file_prefix}_{cam_idx}.png', 132 | image 133 | ) 134 | 135 | return all_images 136 | 137 | 138 | def render( 139 | cfg, 140 | ): 141 | # Create model 142 | model = Model(cfg) 143 | model = model.cuda(); model.eval() 144 | 145 | # Render spiral 146 | cameras = create_surround_cameras(3.0, n_poses=20, up=(0.0, 0.0, 1.0)) 147 | all_images = render_images( 148 | model, cameras, cfg.data.image_size 149 | ) 150 | 
imageio.mimsave('images/part_5.gif', [np.uint8(im * 255) for im in all_images],loop=0) 151 | 152 | 153 | def create_model(cfg): 154 | # Create model 155 | model = Model(cfg) 156 | model.cuda(); model.train() 157 | 158 | # Load checkpoints 159 | optimizer_state_dict = None 160 | start_epoch = 0 161 | 162 | checkpoint_path = os.path.join( 163 | hydra.utils.get_original_cwd(), 164 | cfg.training.checkpoint_path 165 | ) 166 | 167 | if len(cfg.training.checkpoint_path) > 0: 168 | # Make the root of the experiment directory. 169 | checkpoint_dir = os.path.split(checkpoint_path)[0] 170 | os.makedirs(checkpoint_dir, exist_ok=True) 171 | 172 | # Resume training if requested. 173 | if cfg.training.resume and os.path.isfile(checkpoint_path): 174 | print(f"Resuming from checkpoint {checkpoint_path}.") 175 | loaded_data = torch.load(checkpoint_path) 176 | model.load_state_dict(loaded_data["model"]) 177 | start_epoch = loaded_data["epoch"] 178 | 179 | print(f" => resuming from epoch {start_epoch}.") 180 | optimizer_state_dict = loaded_data["optimizer"] 181 | 182 | # Initialize the optimizer. 183 | optimizer = torch.optim.Adam( 184 | model.parameters(), 185 | lr=cfg.training.lr, 186 | ) 187 | 188 | # Load the optimizer state dict in case we are resuming. 189 | if optimizer_state_dict is not None: 190 | optimizer.load_state_dict(optimizer_state_dict) 191 | optimizer.last_epoch = start_epoch 192 | 193 | # The learning rate scheduling is implemented with LambdaLR PyTorch scheduler. 194 | def lr_lambda(epoch): 195 | return cfg.training.lr_scheduler_gamma ** ( 196 | epoch / cfg.training.lr_scheduler_step_size 197 | ) 198 | 199 | lr_scheduler = torch.optim.lr_scheduler.LambdaLR( 200 | optimizer, lr_lambda, last_epoch=start_epoch - 1, verbose=False 201 | ) 202 | 203 | return model, optimizer, lr_scheduler, start_epoch, checkpoint_path 204 | 205 | 206 | def train_points( 207 | cfg 208 | ): 209 | # Create model 210 | model, optimizer, lr_scheduler, start_epoch, checkpoint_path = create_model(cfg) 211 | 212 | # Pretrain SDF 213 | pretrain_sdf(cfg, model) 214 | 215 | # Load pointcloud 216 | point_cloud = np.load(cfg.data.point_cloud_path) 217 | all_points = torch.Tensor(point_cloud["verts"][::2]).cuda().view(-1, 3) 218 | all_points = all_points - torch.mean(all_points, dim=0).unsqueeze(0) 219 | 220 | point_images = render_points_with_save( 221 | all_points.unsqueeze(0), create_surround_cameras(3.0, n_poses=20, up=(0.0, 1.0, 0.0), focal_length=2.0), 222 | cfg.data.image_size, file_prefix='points' 223 | ) 224 | imageio.mimsave('images/part_6_input.gif', [np.uint8(im * 255) for im in point_images], loop=0) 225 | 226 | # Run the main training loop. 
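    # Hint (common defaults, not the reference solution): the Q6 TODOs inside the loop below
    # are often implemented as
    #     loss = torch.abs(distances).mean()        # SDF should vanish on point-cloud samples
    # and, in losses.eikonal_loss(gradients),
    #     return torch.square(torch.linalg.norm(gradients, dim=-1) - 1.0).mean()
    # i.e. an IGR-style surface-fitting term plus the unit-gradient-norm eikonal regularizer.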
227 | for epoch in range(0, cfg.training.num_epochs): 228 | t_range = tqdm.tqdm(range(0, all_points.shape[0], cfg.training.batch_size)) 229 | 230 | for idx in t_range: 231 | # Select random points from pointcloud 232 | points = select_random_points(all_points, cfg.training.batch_size) 233 | 234 | # Get distances and enforce point cloud loss 235 | distances, gradients = model.implicit_fn.get_distance_and_gradient(points) 236 | loss = None # TODO (Q6): Point cloud SDF loss on distances 237 | point_loss = loss 238 | 239 | # Sample random points in bounding box 240 | eikonal_points = get_random_points( 241 | cfg.training.batch_size, cfg.training.bounds, 'cuda' 242 | ) 243 | 244 | # Get sdf gradients and enforce eikonal loss 245 | eikonal_distances, eikonal_gradients = model.implicit_fn.get_distance_and_gradient(eikonal_points) 246 | loss += torch.exp(-1e2 * torch.abs(eikonal_distances)).mean() * cfg.training.inter_weight 247 | loss += eikonal_loss(eikonal_gradients) * cfg.training.eikonal_weight # TODO (Q6): Implement eikonal loss 248 | 249 | # Take the training step. 250 | optimizer.zero_grad() 251 | loss.backward() 252 | optimizer.step() 253 | 254 | t_range.set_description(f'Epoch: {epoch:04d}, Loss: {point_loss:.06f}') 255 | t_range.refresh() 256 | 257 | # Checkpoint. 258 | if ( 259 | epoch % cfg.training.checkpoint_interval == 0 260 | and len(cfg.training.checkpoint_path) > 0 261 | and epoch > 0 262 | ): 263 | print(f"Storing checkpoint {checkpoint_path}.") 264 | 265 | data_to_store = { 266 | "model": model.state_dict(), 267 | "optimizer": optimizer.state_dict(), 268 | "epoch": epoch, 269 | } 270 | 271 | torch.save(data_to_store, checkpoint_path) 272 | 273 | # Render 274 | if ( 275 | epoch % cfg.training.render_interval == 0 276 | and epoch > 0 277 | ): 278 | try: 279 | test_images = render_geometry( 280 | model, create_surround_cameras(3.0, n_poses=20, up=(0.0, 1.0, 0.0), focal_length=2.0), 281 | cfg.data.image_size, file_prefix='eikonal', thresh=0.002, 282 | ) 283 | imageio.mimsave('images/part_6.gif', [np.uint8(im * 255) for im in test_images], loop=0) 284 | except Exception as e: 285 | print("Empty mesh") 286 | pass 287 | 288 | 289 | def pretrain_sdf( 290 | cfg, 291 | model 292 | ): 293 | optimizer = torch.optim.Adam( 294 | model.parameters(), 295 | lr=cfg.training.lr, 296 | ) 297 | 298 | # Run the main training loop. 299 | for iter in range(0, cfg.training.pretrain_iters): 300 | points = get_random_points( 301 | cfg.training.batch_size, cfg.training.bounds, 'cuda' 302 | ) 303 | 304 | # Run model forward 305 | distances = model.implicit_fn.get_distance(points) 306 | loss = sphere_loss(distances, points, 1.0) 307 | 308 | # Take the training step 309 | optimizer.zero_grad() 310 | loss.backward() 311 | optimizer.step() 312 | 313 | 314 | def train_images( 315 | cfg 316 | ): 317 | # Create model 318 | model, optimizer, lr_scheduler, start_epoch, checkpoint_path = create_model(cfg) 319 | 320 | # Load the training/validation data. 321 | train_dataset, val_dataset, _ = get_nerf_datasets( 322 | dataset_name=cfg.data.dataset_name, 323 | image_size=[cfg.data.image_size[1], cfg.data.image_size[0]], 324 | ) 325 | 326 | train_dataloader = torch.utils.data.DataLoader( 327 | train_dataset, 328 | batch_size=1, 329 | shuffle=True, 330 | num_workers=0, 331 | collate_fn=lambda batch: batch, 332 | ) 333 | 334 | # Pretrain SDF 335 | pretrain_sdf(cfg, model) 336 | 337 | # Run the main training loop. 
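    # Hint (a sketch only; the attribute names below are placeholders, not part of the skeleton):
    # this Q7 loop renders through `implicit_fn.get_distance_color(points)`. If NeuralSurface
    # uses one shared trunk with separate distance and color heads, the trunk only needs to be
    # evaluated once, roughly:
    #     feats = self.trunk(self.embed_xyz(points))
    #     return self.distance_head(feats), torch.sigmoid(self.color_head(feats))
    # which matches the note in implicit.py that sharing computation may be more efficient than
    # separate get_distance / get_color calls.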
338 | for epoch in range(start_epoch, cfg.training.num_epochs): 339 | t_range = tqdm.tqdm(enumerate(train_dataloader)) 340 | 341 | for iteration, batch in t_range: 342 | image, camera, camera_idx = batch[0].values() 343 | image = image.cuda().unsqueeze(0) 344 | camera = camera.cuda() 345 | 346 | # Sample rays 347 | xy_grid = get_random_pixels_from_image( 348 | cfg.training.batch_size, cfg.data.image_size, camera 349 | ) 350 | ray_bundle = get_rays_from_pixels( 351 | xy_grid, cfg.data.image_size, camera 352 | ) 353 | rgb_gt = sample_images_at_xy(image, xy_grid) 354 | 355 | # Run model forward 356 | out = model(ray_bundle) 357 | 358 | # Color loss 359 | loss = torch.mean(torch.square(rgb_gt - out['color'])) 360 | image_loss = loss 361 | 362 | # Sample random points in bounding box 363 | eikonal_points = get_random_points( 364 | cfg.training.batch_size, cfg.training.bounds, 'cuda' 365 | ) 366 | 367 | # Get sdf gradients and enforce eikonal loss 368 | eikonal_distances, eikonal_gradients = model.implicit_fn.get_distance_and_gradient(eikonal_points) 369 | loss += torch.exp(-1e2 * torch.abs(eikonal_distances)).mean() * cfg.training.inter_weight 370 | loss += eikonal_loss(eikonal_gradients) * cfg.training.eikonal_weight # TODO (2): Implement eikonal loss 371 | 372 | # Take the training step. 373 | optimizer.zero_grad() 374 | loss.backward() 375 | optimizer.step() 376 | 377 | t_range.set_description(f'Epoch: {epoch:04d}, Loss: {image_loss:.06f}') 378 | t_range.refresh() 379 | 380 | # Adjust the learning rate. 381 | lr_scheduler.step() 382 | 383 | # Checkpoint. 384 | if ( 385 | epoch % cfg.training.checkpoint_interval == 0 386 | and len(cfg.training.checkpoint_path) > 0 387 | and epoch > 0 388 | ): 389 | print(f"Storing checkpoint {checkpoint_path}.") 390 | 391 | data_to_store = { 392 | "model": model.state_dict(), 393 | "optimizer": optimizer.state_dict(), 394 | "epoch": epoch, 395 | } 396 | 397 | torch.save(data_to_store, checkpoint_path) 398 | 399 | # Render 400 | if ( 401 | epoch % cfg.training.render_interval == 0 402 | and epoch > 0 403 | ): 404 | test_images = render_images( 405 | model, create_surround_cameras(4.0, n_poses=20, up=(0.0, 0.0, 1.0), focal_length=2.0), 406 | cfg.data.image_size, file_prefix='volsdf' 407 | ) 408 | imageio.mimsave('images/part_7.gif', [np.uint8(im * 255) for im in test_images], loop=0) 409 | 410 | try: 411 | test_images = render_geometry( 412 | model, create_surround_cameras(4.0, n_poses=20, up=(0.0, 0.0, 1.0), focal_length=2.0), 413 | cfg.data.image_size, file_prefix='volsdf_geometry' 414 | ) 415 | imageio.mimsave('images/part_7_geometry.gif', [np.uint8(im * 255) for im in test_images], loop=0) 416 | except Exception as e: 417 | print("Empty mesh") 418 | pass 419 | 420 | @hydra.main(config_path='configs', config_name='torus') 421 | def main(cfg: DictConfig): 422 | os.chdir(hydra.utils.get_original_cwd()) 423 | 424 | if cfg.type == 'render': 425 | render(cfg) 426 | elif cfg.type == 'train_points': 427 | train_points(cfg) 428 | elif cfg.type == 'train_images': 429 | train_images(cfg) 430 | 431 | 432 | if __name__ == "__main__": 433 | main() 434 | 435 | 436 | -------------------------------------------------------------------------------- /ta_images/color.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/color.png -------------------------------------------------------------------------------- /ta_images/depth.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/depth.png -------------------------------------------------------------------------------- /ta_images/grid.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/grid.png -------------------------------------------------------------------------------- /ta_images/part_1.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_1.gif -------------------------------------------------------------------------------- /ta_images/part_2.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2.gif -------------------------------------------------------------------------------- /ta_images/part_2_after_training_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_after_training_0.png -------------------------------------------------------------------------------- /ta_images/part_2_after_training_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_after_training_1.png -------------------------------------------------------------------------------- /ta_images/part_2_after_training_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_after_training_2.png -------------------------------------------------------------------------------- /ta_images/part_2_after_training_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_after_training_3.png -------------------------------------------------------------------------------- /ta_images/part_2_before_training_0.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_before_training_0.png -------------------------------------------------------------------------------- /ta_images/part_2_before_training_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_before_training_1.png -------------------------------------------------------------------------------- /ta_images/part_2_before_training_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_before_training_2.png 
-------------------------------------------------------------------------------- /ta_images/part_2_before_training_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_2_before_training_3.png -------------------------------------------------------------------------------- /ta_images/part_3.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_3.gif -------------------------------------------------------------------------------- /ta_images/part_5.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_5.gif -------------------------------------------------------------------------------- /ta_images/part_6.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_6.gif -------------------------------------------------------------------------------- /ta_images/part_6_input.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_6_input.gif -------------------------------------------------------------------------------- /ta_images/part_7.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_7.gif -------------------------------------------------------------------------------- /ta_images/part_7_geometry.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/part_7_geometry.gif -------------------------------------------------------------------------------- /ta_images/rays.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/rays.png -------------------------------------------------------------------------------- /ta_images/sample_points.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/sample_points.png -------------------------------------------------------------------------------- /ta_images/transmittance.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/ta_images/transmittance.png -------------------------------------------------------------------------------- /transmittance_calculation/a3_transmittance.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/transmittance_calculation/a3_transmittance.pdf 
-------------------------------------------------------------------------------- /transmittance_calculation/figure1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/learning3d/assignment3/7f46cc446f966fb21710f75ea02c26e3bb581a2a/transmittance_calculation/figure1.png -------------------------------------------------------------------------------- /transmittance_calculation/main.tex: -------------------------------------------------------------------------------- 1 | \documentclass[11pt,addpoints,answers]{exam} 2 | \usepackage[margin=1in]{geometry} 3 | \usepackage{graphicx} 4 | \usepackage[svgname]{xcolor} 5 | \usepackage{url} 6 | \usepackage{datetime} 7 | \usepackage{color} 8 | \usepackage[many]{tcolorbox} 9 | \usepackage{hyperref} 10 | 11 | \newcommand{\courseNum}{\href{https://learning3d.github.io/index.html}{16825}} 12 | \newcommand{\courseName}{\href{https://learning3d.github.io/index.html}{Learning for 3D Vision}} 13 | \newcommand{\courseSem}{\href{https://learning3d.github.io/index.html}{Spring 2024}} 14 | \newcommand{\courseUrl}{\url{https://piazza.com/cmu/spring2024/16825}} 15 | \newcommand{\hwNum}{Problem Set 3} 16 | \newcommand{\hwTopic}{Volume Rendering} 17 | \newcommand{\hwName}{\hwNum: \hwTopic} 18 | \newcommand{\outDate}{Feb. 21, 2024} 19 | \newcommand{\dueDate}{Mar. 13, 2024 11:59 PM} 20 | \newcommand{\instructorName}{Shubham Tulsiani} 21 | \newcommand{\taNames}{Anurag Ghosh, Ayush Jain, Bharath Raj, Ruihan Gao, Shun Iwase} 22 | 23 | \lhead{\hwName} 24 | \rhead{\courseNum} 25 | \cfoot{\thepage{} of \numpages{}} 26 | 27 | \title{\textsc{\hwName}} % Title 28 | 29 | 30 | \author{} 31 | 32 | \date{} 33 | 34 | 35 | %%%%%%%%%%%%%%%%%%%%%%%%%% 36 | % Document configuration % 37 | %%%%%%%%%%%%%%%%%%%%%%%%%% 38 | 39 | % Don't display a date in the title and remove the white space 40 | \predate{} 41 | \postdate{} 42 | \date{} 43 | 44 | %%%%%%%%%%%%%%%%%% 45 | % Begin Document % 46 | %%%%%%%%%%%%%%%%%% 47 | 48 | 49 | \begin{document} 50 | 51 | \section*{} 52 | \begin{center} 53 | \textsc{\LARGE \hwNum} \\ 54 | \vspace{1em} 55 | \textsc{\large \courseNum{} \courseName{} (\courseSem)} \\ 56 | \courseUrl\\ 57 | \vspace{1em} 58 | OUT: \outDate \\ 59 | DUE: \dueDate \\ 60 | Instructor: \instructorName \\ 61 | TAs: \taNames 62 | \end{center} 63 | 64 | 65 | % Default to visible (but empty) solution box. 66 | \newtcolorbox[]{studentsolution}[1][]{% 67 | breakable, 68 | enhanced, 69 | colback=white, 70 | title=Solution, 71 | #1 72 | } 73 | 74 | \begin{questions} 75 | \question \textbf{[10 pts]} 76 | \begin{figure}[h] 77 | \centering 78 | \includegraphics[width=\textwidth]{figure1.png} 79 | \caption{A ray through a non-homogeneous medium. The medium is composed of 3 segments ($y1y2$, $y2y3$, $y3y4$). Each segment has a different absorption coefficient, shown as $\sigma_1, \sigma_2, \sigma_3$ in the figure. The length of each segment is also annotated in the figure (1m means 1 meter).} 80 | \label{fig:q1} 81 | \end{figure} 82 | 83 | As shown in Figure~\ref{fig:q1}, we observe a ray going through a non-homogeneous medium. 84 | Please compute the following transmittance: 85 | \begin{itemize} 86 | \item $T(y1, y2)$ 87 | \item $T(y2, y4)$ 88 | \item $T(x, y4)$ 89 | \item $T(x, y3)$ 90 | \end{itemize} 91 | 92 | 93 | 94 | \begin{tcolorbox}[fit,height=20cm, width=\textwidth, blank, borderline={0.5pt}{-2pt},halign=left, valign=center, nobeforeafter] 95 | 96 | 97 | % \begin{studentsolution} 98 | % Write your solution here. 
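% Hint (standard emission--absorption definition, not the required solution itself): recall that
%   $T(a, b) = \exp\left(-\int_a^b \sigma(t)\,dt\right)$,
% so for piecewise-constant media the integral reduces to $\sum_i \sigma_i \ell_i$ over the
% segments between $a$ and $b$, and transmittance composes multiplicatively, e.g.
%   $T(x, y4) = T(x, y2)\,T(y2, y4)$.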
99 | % \end{studentsolution} 100 | 101 | \end{tcolorbox} 102 | \end{questions} 103 | \end{document} -------------------------------------------------------------------------------- /volume_rendering_main.py: -------------------------------------------------------------------------------- 1 | import os 2 | import warnings 3 | 4 | import hydra 5 | import numpy as np 6 | import torch 7 | import tqdm 8 | import imageio 9 | 10 | from omegaconf import DictConfig 11 | from PIL import Image 12 | from pytorch3d.renderer import ( 13 | PerspectiveCameras, 14 | look_at_view_transform 15 | ) 16 | import matplotlib.pyplot as plt 17 | 18 | from implicit import implicit_dict 19 | from sampler import sampler_dict 20 | from renderer import renderer_dict 21 | from ray_utils import ( 22 | sample_images_at_xy, 23 | get_pixels_from_image, 24 | get_random_pixels_from_image, 25 | get_rays_from_pixels 26 | ) 27 | from data_utils import ( 28 | dataset_from_config, 29 | create_surround_cameras, 30 | vis_grid, 31 | vis_rays, 32 | ) 33 | from dataset import ( 34 | get_nerf_datasets, 35 | trivial_collate, 36 | ) 37 | 38 | 39 | # Model class containing: 40 | # 1) Implicit volume defining the scene 41 | # 2) Sampling scheme which generates sample points along rays 42 | # 3) Renderer which can render an implicit volume given a sampling scheme 43 | 44 | class Model(torch.nn.Module): 45 | def __init__( 46 | self, 47 | cfg 48 | ): 49 | super().__init__() 50 | 51 | # Get implicit function from config 52 | self.implicit_fn = implicit_dict[cfg.implicit_function.type]( 53 | cfg.implicit_function 54 | ) 55 | 56 | # Point sampling (raymarching) scheme 57 | self.sampler = sampler_dict[cfg.sampler.type]( 58 | cfg.sampler 59 | ) 60 | 61 | # Initialize volume renderer 62 | self.renderer = renderer_dict[cfg.renderer.type]( 63 | cfg.renderer 64 | ) 65 | 66 | def forward( 67 | self, 68 | ray_bundle 69 | ): 70 | # Call renderer with 71 | # a) Implicit volume 72 | # b) Sampling routine 73 | 74 | return self.renderer( 75 | self.sampler, 76 | self.implicit_fn, 77 | ray_bundle 78 | ) 79 | 80 | 81 | def render_images( 82 | model, 83 | cameras, 84 | image_size, 85 | save=False, 86 | file_prefix='' 87 | ): 88 | all_images = [] 89 | device = list(model.parameters())[0].device 90 | 91 | for cam_idx, camera in enumerate(cameras): 92 | print(f'Rendering image {cam_idx}') 93 | 94 | torch.cuda.empty_cache() 95 | camera = camera.to(device) 96 | xy_grid = get_pixels_from_image(image_size, camera) # TODO (Q1.3): implement in ray_utils.py 97 | ray_bundle = get_rays_from_pixels(xy_grid, image_size, camera) # TODO (Q1.3): implement in ray_utils.py 98 | 99 | # TODO (Q1.3): Visualize xy grid using vis_grid 100 | if cam_idx == 0 and file_prefix == '': 101 | pass 102 | 103 | # TODO (Q1.3): Visualize rays using vis_rays 104 | if cam_idx == 0 and file_prefix == '': 105 | pass 106 | 107 | # TODO (Q1.4): Implement point sampling along rays in sampler.py 108 | pass 109 | 110 | # TODO (Q1.4): Visualize sample points as point cloud 111 | if cam_idx == 0 and file_prefix == '': 112 | pass 113 | 114 | # TODO (Q1.5): Implement rendering in renderer.py 115 | out = model(ray_bundle) 116 | 117 | # Return rendered features (colors) 118 | image = np.array( 119 | out['feature'].view( 120 | image_size[1], image_size[0], 3 121 | ).detach().cpu() 122 | ) 123 | all_images.append(image) 124 | 125 | # TODO (Q1.5): Visualize depth 126 | if cam_idx == 2 and file_prefix == '': 127 | pass 128 | 129 | # Save 130 | if save: 131 | plt.imsave( 132 | f'{file_prefix}_{cam_idx}.png', 133 | 
image 134 | ) 135 | 136 | return all_images 137 | 138 | 139 | def render( 140 | cfg, 141 | ): 142 | # Create model 143 | model = Model(cfg) 144 | model = model.cuda(); model.eval() 145 | 146 | # Render spiral 147 | cameras = create_surround_cameras(3.0, n_poses=20) 148 | all_images = render_images( 149 | model, cameras, cfg.data.image_size 150 | ) 151 | imageio.mimsave('images/part_1.gif', [np.uint8(im * 255) for im in all_images], loop=0) 152 | 153 | 154 | def train( 155 | cfg 156 | ): 157 | # Create model 158 | model = Model(cfg) 159 | model = model.cuda(); model.train() 160 | 161 | # Create dataset 162 | train_dataset = dataset_from_config(cfg.data) 163 | train_dataloader = torch.utils.data.DataLoader( 164 | train_dataset, 165 | batch_size=1, 166 | shuffle=True, 167 | num_workers=0, 168 | collate_fn=lambda batch: batch, 169 | ) 170 | image_size = cfg.data.image_size 171 | 172 | # Create optimizer 173 | optimizer = torch.optim.Adam( 174 | model.parameters(), 175 | lr=cfg.training.lr 176 | ) 177 | 178 | # Render images before training 179 | cameras = [item['camera'] for item in train_dataset] 180 | render_images( 181 | model, cameras, image_size, 182 | save=True, file_prefix='images/part_2_before_training' 183 | ) 184 | 185 | # Train 186 | t_range = tqdm.tqdm(range(cfg.training.num_epochs)) 187 | 188 | for epoch in t_range: 189 | for iteration, batch in enumerate(train_dataloader): 190 | image, camera, camera_idx = batch[0].values() 191 | image = image.cuda() 192 | camera = camera.cuda() 193 | 194 | # Sample rays 195 | xy_grid = get_random_pixels_from_image(cfg.training.batch_size, image_size, camera) # TODO (Q2.1): implement in ray_utils.py 196 | ray_bundle = get_rays_from_pixels(xy_grid, image_size, camera) 197 | rgb_gt = sample_images_at_xy(image, xy_grid) 198 | 199 | # Run model forward 200 | out = model(ray_bundle) 201 | 202 | # TODO (Q2.2): Calculate loss 203 | loss = None 204 | 205 | # Backprop 206 | optimizer.zero_grad() 207 | loss.backward() 208 | optimizer.step() 209 | 210 | if (epoch % 10) == 0: 211 | t_range.set_description(f'Epoch: {epoch:04d}, Loss: {loss:.06f}') 212 | t_range.refresh() 213 | 214 | # Print center and side lengths 215 | print("Box center:", tuple(np.array(model.implicit_fn.sdf.center.data.detach().cpu()).tolist()[0])) 216 | print("Box side lengths:", tuple(np.array(model.implicit_fn.sdf.side_lengths.data.detach().cpu()).tolist()[0])) 217 | 218 | # Render images after training 219 | render_images( 220 | model, cameras, image_size, 221 | save=True, file_prefix='images/part_2_after_training' 222 | ) 223 | all_images = render_images( 224 | model, create_surround_cameras(3.0, n_poses=20), image_size, file_prefix='part_2' 225 | ) 226 | imageio.mimsave('images/part_2.gif', [np.uint8(im * 255) for im in all_images], loop=0) 227 | 228 | 229 | def create_model(cfg): 230 | # Create model 231 | model = Model(cfg) 232 | model.cuda(); model.train() 233 | 234 | # Load checkpoints 235 | optimizer_state_dict = None 236 | start_epoch = 0 237 | 238 | checkpoint_path = os.path.join( 239 | hydra.utils.get_original_cwd(), 240 | cfg.training.checkpoint_path 241 | ) 242 | 243 | if len(cfg.training.checkpoint_path) > 0: 244 | # Make the root of the experiment directory. 245 | checkpoint_dir = os.path.split(checkpoint_path)[0] 246 | os.makedirs(checkpoint_dir, exist_ok=True) 247 | 248 | # Resume training if requested. 
249 | if cfg.training.resume and os.path.isfile(checkpoint_path): 250 | print(f"Resuming from checkpoint {checkpoint_path}.") 251 | loaded_data = torch.load(checkpoint_path) 252 | model.load_state_dict(loaded_data["model"]) 253 | start_epoch = loaded_data["epoch"] 254 | 255 | print(f" => resuming from epoch {start_epoch}.") 256 | optimizer_state_dict = loaded_data["optimizer"] 257 | 258 | # Initialize the optimizer. 259 | optimizer = torch.optim.Adam( 260 | model.parameters(), 261 | lr=cfg.training.lr, 262 | ) 263 | 264 | # Load the optimizer state dict in case we are resuming. 265 | if optimizer_state_dict is not None: 266 | optimizer.load_state_dict(optimizer_state_dict) 267 | optimizer.last_epoch = start_epoch 268 | 269 | # The learning rate scheduling is implemented with LambdaLR PyTorch scheduler. 270 | def lr_lambda(epoch): 271 | return cfg.training.lr_scheduler_gamma ** ( 272 | epoch / cfg.training.lr_scheduler_step_size 273 | ) 274 | 275 | lr_scheduler = torch.optim.lr_scheduler.LambdaLR( 276 | optimizer, lr_lambda, last_epoch=start_epoch - 1, verbose=False 277 | ) 278 | 279 | return model, optimizer, lr_scheduler, start_epoch, checkpoint_path 280 | 281 | def train_nerf( 282 | cfg 283 | ): 284 | # Create model 285 | model, optimizer, lr_scheduler, start_epoch, checkpoint_path = create_model(cfg) 286 | 287 | # Load the training/validation data. 288 | train_dataset, val_dataset, _ = get_nerf_datasets( 289 | dataset_name=cfg.data.dataset_name, 290 | image_size=[cfg.data.image_size[1], cfg.data.image_size[0]], 291 | ) 292 | 293 | train_dataloader = torch.utils.data.DataLoader( 294 | train_dataset, 295 | batch_size=1, 296 | shuffle=True, 297 | num_workers=0, 298 | collate_fn=trivial_collate, 299 | ) 300 | 301 | # Run the main training loop. 302 | for epoch in range(start_epoch, cfg.training.num_epochs): 303 | t_range = tqdm.tqdm(enumerate(train_dataloader)) 304 | 305 | for iteration, batch in t_range: 306 | image, camera, camera_idx = batch[0].values() 307 | image = image.cuda().unsqueeze(0) 308 | camera = camera.cuda() 309 | 310 | # Sample rays 311 | xy_grid = get_random_pixels_from_image( 312 | cfg.training.batch_size, cfg.data.image_size, camera 313 | ) 314 | ray_bundle = get_rays_from_pixels( 315 | xy_grid, cfg.data.image_size, camera 316 | ) 317 | rgb_gt = sample_images_at_xy(image, xy_grid) 318 | 319 | # Run model forward 320 | out = model(ray_bundle) 321 | 322 | # TODO (Q3.1): Calculate loss 323 | loss = None 324 | 325 | # Take the training step. 326 | optimizer.zero_grad() 327 | loss.backward() 328 | optimizer.step() 329 | 330 | t_range.set_description(f'Epoch: {epoch:04d}, Loss: {loss:.06f}') 331 | t_range.refresh() 332 | 333 | # Adjust the learning rate. 334 | lr_scheduler.step() 335 | 336 | # Checkpoint. 
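        # Hint (one common choice, not the reference solution): the photometric loss left as a
        # TODO above (Q3.1 here, and similarly Q2.2 in `train`) is usually a mean-squared error
        # between the sampled ground-truth colors and the rendered features, e.g.
        #     loss = torch.mean(torch.square(rgb_gt - out['feature']))
        # mirroring the color loss already implemented in surface_rendering_main.py.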
337 | if ( 338 | epoch % cfg.training.checkpoint_interval == 0 339 | and len(cfg.training.checkpoint_path) > 0 340 | and epoch > 0 341 | ): 342 | print(f"Storing checkpoint {checkpoint_path}.") 343 | 344 | data_to_store = { 345 | "model": model.state_dict(), 346 | "optimizer": optimizer.state_dict(), 347 | "epoch": epoch, 348 | } 349 | 350 | torch.save(data_to_store, checkpoint_path) 351 | 352 | # Render 353 | if ( 354 | epoch % cfg.training.render_interval == 0 355 | and epoch > 0 356 | ): 357 | with torch.no_grad(): 358 | test_images = render_images( 359 | model, create_surround_cameras(4.0, n_poses=20, up=(0.0, 0.0, 1.0), focal_length=2.0), 360 | cfg.data.image_size, file_prefix='nerf' 361 | ) 362 | imageio.mimsave('images/part_3.gif', [np.uint8(im * 255) for im in test_images], loop=0) 363 | 364 | 365 | @hydra.main(config_path='./configs', config_name='sphere') 366 | def main(cfg: DictConfig): 367 | os.chdir(hydra.utils.get_original_cwd()) 368 | 369 | if cfg.type == 'render': 370 | render(cfg) 371 | elif cfg.type == 'train': 372 | train(cfg) 373 | elif cfg.type == 'train_nerf': 374 | train_nerf(cfg) 375 | 376 | 377 | if __name__ == "__main__": 378 | main() 379 | 380 | --------------------------------------------------------------------------------