├── .gitignore ├── MVS.md ├── PointCloud.md ├── README.md ├── UpdateLog.md ├── avatar.md ├── colabs ├── 2DGS.ipynb ├── 4DGaussians.ipynb ├── HyperNerf.ipynb ├── gaussian_splatting_colab.ipynb └── gaussian_splatting_kaolin.ipynb ├── dynamic.md ├── generative.md ├── images └── cvpr2024.png ├── nerf.md ├── review.md └── vidgen.md /.gitignore: -------------------------------------------------------------------------------- 1 | scripts -------------------------------------------------------------------------------- /MVS.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | ## Reference 4 | - https://github.com/walsvid/Awesome-MVS -------------------------------------------------------------------------------- /PointCloud.md: -------------------------------------------------------------------------------- 1 | 2 | ### PlenopticPoints: Rasterizing Neural Feature Points for High-Quality Novel View Synthesis 3 | **Authors**: Florian Hahlbohm, Moritz Kappel, Jan-Philipp Tauscher, Martin Eisemann, Marcus Magnor 4 | **Published**: Vision, Modeling and Visualization (VMV) (The Eurographics Association) 5 | 6 | [Paper](https://graphics.tu-bs.de/publications/hahlbohm2023plenopticpoints) 7 | 8 | 9 | ### Neural Point Catacaustics for Novel-View Synthesis of Reflections 10 | **Authors**: Georgios Kopanas, Thomas Leimkühler, Gilles Rainer, Clément Jambon, George Drettakis 11 | **Published**: ACM Transactions on Graphics (TOG), 2022 12 | 13 | [Paper](https://dl.acm.org/doi/abs/10.1145/3550454.3555497) 14 | 15 | 16 | 17 | ### GaussiGAN: Controllable Image Synthesis with 3D Gaussians from Unposed Silhouettes 18 | **Authors**: Youssef A. Mejjati, Isa Milefchik, Aaron Gokaslan, Oliver Wang, Kwang In Kim, James Tompkin 19 | **Published**: BMVC 2021 + CVPRW AI for Content Creation 2021 20 | [Project Page](https://visual.cs.brown.edu/projects/gaussigan-webpage/) | [Paper](https://dl.acm.org/doi/abs/10.1145/3550454.3555497) -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Awesome 3D Gaussian Splatting Resources 2 | 3D Gaussian Splatting (3DGS) opens a new window for using neural rendering in real applications. 3 | This repo is intended to provide a collection of papers that are related to 3DGS but do not necessarily use 3DGS. 4 | 5 | In general, 3D Gaussian Splatting can be considered a variant of NeRF. This repo focuses more on the practical side of NeRF, e.g. real-time rendering, compatibility with Unity/Unreal, and ease of editing. 6 | 7 | Other resources: 8 | - [Dynamic NeRF](./dynamic.md) 9 | - [awesome-LLMs-finetuning](https://github.com/pdaicode/awesome-LLMs-finetuning) 10 | 11 | **Verified**: Papers listed with ```[+]``` have been verified by myself or colleagues. The code is runnable. Please leave an issue if you need help setting up. 12 | 13 | **If you have any additions or suggestions, feel free to contribute. Everything is welcome.** 14 | 15 | ## Most Recent Update & News: 16 | [Complete List](./UpdateLog.md) 17 | - Aug 2024: added papers for improving NeRF speed 18 | - May 2024: added **2024** section, added **LLM** subsection 19 | - Dec 2023: added **verified** section 20 | - 26 Nov 2023: added more details for custom data. 21 | - Nov 2023: Started a separate page for [**NeRF**](./nerf.md) 22 | - 29 Oct 2023: Started a separate page for [**Dynamic NeRF**](./dynamic.md) 23 | 24 | ### CVPR 2024 word cloud: 25 | 26 | ![cvpr2024](images/cvpr2024.png) 27 | 28 | ## 1. 
3D Reconstruction 29 | 30 | - **3D Gaussian Splatting for Real-Time Radiance Field Rendering**, 31 | [Bernhard Kerbl](https://scholar.google.at/citations?user=jeasMB0AAAAJ&hl=en), [Georgios Kopanas](https://scholar.google.com/citations?user=QLWLLHMAAAAJ), [Thomas Leimkühler](https://www-sop.inria.fr/members/Thomas-Sebastian.Leimkuhler/), [George Drettakis](https://scholar.google.fr/citations?user=LGo5J4IAAAAJ&hl=en), SIGGRAPH 2023 (Best Paper). 32 | [[📄 Paper (Low Resolution)](https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/3d_gaussian_splatting_low.pdf) | [📄 Paper (High Resolution)](https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/3d_gaussian_splatting_high.pdf) | [🌐 Project Page](https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/) | [💻 Code](https://github.com/graphdeco-inria/gaussian-splatting) | [🎥 Short Presentation](https://youtu.be/T_kXY43VZnk?si=DrkbDFxQAv5scQNT) | [🎥 Explanation Video](https://www.youtube.com/live/xgwvU7S0K-k?si=edF8NkYtsRbgTbKi) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pdaicode/awesome-3dgs/blob/master/colabs/gaussian_splatting_colab.ipynb)] 33 | 34 | ### Other 3D Papers 35 | - N-BVH: Neural ray queries with bounding volume hierarchies, SIGGRAPH 2024. [[Paper](https://weiphil.s3.eu-central-1.amazonaws.com/neural_bvh.pdf) | [Project](https://weiphil.github.io/portfolio/neural_bvh) | [Code](https://github.com/WeiPhil/nbvh)] 36 | - High-quality Surface Reconstruction using Gaussian Surfels. [[Paper](https://arxiv.org/pdf/2404.17774) | [Code](https://github.com/turandai/gaussian_surfels)] 37 | - Toon3D: Seeing Cartoons from a New Perspective, 2024. [[Paper](https://arxiv.org/abs/2405.10320) | [Project](https://toon3d.studio/) | [Code](https://github.com/ethanweber/toon3d)] 38 | - Texture Generation on 3D Meshes with Point-UV Diffusion, 2024. [[Paper](https://arxiv.org/abs/2308.10490) | [Project](https://cvmi-lab.github.io/Point-UV-Diffusion/) | [Code](https://github.com/CVMI-Lab/Point-UV-Diffusion)] 39 | 40 | ### 2024 41 | **General** 42 | - ```[+]``` [2DGS](./colabs/2DGS.ipynb): 2D Gaussian Splatting for Geometrically Accurate Radiance Fields, SIGGRAPH 2024. [[Paper](https://arxiv.org/abs/2403.17888) | [Project](https://surfsplatting.github.io/) | [Code](https://github.com/hbb1/2d-gaussian-splatting)] 43 | - Deblur-GS: 3D Gaussian Splatting from Camera Motion Blurred Images, I3D 2024. [[Paper](https://chaphlagical.icu/Deblur-GS/static/paper/Deblur_GS_author_version.pdf) | [Code](https://github.com/Chaphlagical/Deblur-GS)] 44 | - GaussianVTON: 3D Human Virtual Try-ON via Multi-Stage Gaussian Splatting Editing with Image Prompting, [[Paper](https://arxiv.org/abs/2405.07472) | [Code](https://github.com/HaroldChen19/GaussianVTON)] 45 | - DarkGS: Learning Neural Illumination and 3D Gaussians Relighting for Robotic Exploration in the Dark, [[Paper](https://arxiv.org/abs/2403.10814) | [Project](https://github.com/tyz1030/darkgs)] 46 | - GaussianPro: 3D Gaussian Splatting with Progressive Propagation, ICML 2024. [[Paper](https://arxiv.org/abs/2402.14650) | [Project](https://github.com/kcheng1021/GaussianPro)] 47 | - VR-GS: A Physical Dynamics-Aware Interactive Gaussian Splatting System in Virtual Reality, [[Paper](https://arxiv.org/abs/2401.16663) | [Project](https://yingjiang96.github.io/VR-GS/)] 48 | - MVSplat: Efficient 3D Gaussian Splatting from Sparse Multi-View Images, ECCV2024. 
[[Paper](https://arxiv.org/pdf/2403.14627) | [Project](https://github.com/donydchen/mvsplat)] 49 | - DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting, [[Paper](https://arxiv.org/abs/2404.06903) | [Project](https://dreamscene360.github.io/)] 50 | 51 | - COLMAP-Free 3D Gaussian Splatting, CVPR2024. [[Paper](https://arxiv.org/abs/2312.07504) | [Project](https://oasisyang.github.io/colmap-free-3dgs/) | [Code](https://github.com/NVlabs/CF-3DGS)] 52 | - FreGS: 3D Gaussian Splatting with Progressive Frequency Regularization, CVPR 2024. [[Paper](https://arxiv.org/abs/2403.06908) | [Project](https://rogeraigc.github.io/FreGS-Page/)] 53 | - GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting, CVPR 2024. [[Project](https://gs-slam.github.io/)] 54 | - LangSplat: 3D Language Gaussian Splatting, CVPR 2024 (Highlight). [[Project](https://langsplat.github.io/) | [Code](https://github.com/minghanqin/LangSplat)] 55 | - SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering, CVPR 2024. 56 | - GaussianShader: 3D Gaussian Splatting with Shading Functions for Reflective Surfaces, CVPR 2024. [[Project](https://asparagus15.github.io/GaussianShader.github.io/) | [Code](https://github.com/Asparagus15/GaussianShader)] 57 | - pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction, CVPR 2024. 58 | - Triplane Meets Gaussian Splatting: Fast and Generalizable Single-View 3D Reconstruction with Transformers, CVPR 2024. [[Project](https://zouzx.github.io/TriplaneGaussian/) | [Code](https://github.com/VAST-AI-Research/TriplaneGaussian)] 59 | - GS-IR: 3D Gaussian Splatting for Inverse Rendering, CVPR 2024. [[Project](https://lzhnb.github.io/project-pages/gs-ir.html) | [Code](https://github.com/lzhnb/GS-IR)] 60 | - GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for Real-time Human Novel View Synthesis, CVPR 2024. [[Project](https://shunyuanzheng.github.io/GPS-Gaussian) | [Code](https://github.com/aipixel/GPS-Gaussian)] 61 | 62 | - Surfel-based Gaussian Inverse Rendering for Fast and Relightable Dynamic Human Reconstruction from Monocular Video, 2024. [[Paper](https://arxiv.org/pdf/2407.15212) | [Project](https://gs-ia.github.io/)] 63 | - GaussMR: Interactive Gaussian Splatting Sandbox with GPU Particles and Signed Distance Fields, SIGGRAPH 2024. [[Paper](https://dl.acm.org/doi/pdf/10.1145/3641521.3664405?casa_token=GXIJMXbeT1sAAAAA:Pqv_zjOe9uXiTSVUEj03Hz8lDRAynMJPDIAuLBI_unPN9gG06KI_Lks6SJJFgAG4CLKRY6wFpBR5cQ)] 64 | - MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo, ECCV 2024. [[Project](https://mvsgaussian.github.io/) | [Code](https://github.com/TQTQliu/MVSGaussian)] 65 | - Depth-Regularized Optimization for 3D Gaussian Splatting in Few-Shot Image, CVPRW 2024 (3DMV). [[Project](https://robot0321.github.io/DepthRegGS/index.html) | [Code](https://github.com/robot0321/DepthRegularizedGS)] 66 | - Spec-Gaussian: Anisotropic View-Dependent Appearance for 3D Gaussian Splatting, 2024. [[Project](https://ingra14m.github.io/Spec-Gaussian-website/) | [Code](https://github.com/ingra14m/Spec-Gaussian)] 67 | - Deblurring 3D Gaussian Splatting, ECCV 2024. [[Project](https://benhenryl.github.io/Deblurring-3D-Gaussian-Splatting/) | [Code](https://github.com/benhenryL/Deblurring-3D-Gaussian-Splatting)] 68 | - bsGS: Recovering Fine Details for 3D Gaussian Splatting, ACM MM 2024. 
[Code](https://github.com/Asparagus15/GaussianShader) 69 | - GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting, 2024. [[Project](https://sai-bi.github.io/project/gs-lrm/)] 70 | 71 | - WildGaussians: 3D Gaussian Splatting in the Wild, 2024. [[Project](https://wild-gaussians.github.io/)] 72 | - Gaussian in the Wild: 3D Gaussian Splatting for Unconstrained Image Collections, 2024. [[Project](https://eastbeanzhang.github.io/GS-W/) | [Code](https://github.com/EastbeanZhang/Gaussian-Wild)] 73 | - On Scaling Up 3D Gaussian Splatting Training, 2024. [[Project](https://daohanlu.github.io/scaling-up-3dgs/) | [Code](https://github.com/nyu-systems/Grendel-GS)] 74 | 75 | **Literature Review** 76 | - Recent advances in 3D Gaussian splatting, Computational Visual Media, 2024. [[Paper](https://link.springer.com/article/10.1007/s41095-024-0436-y)] 77 | - 3D Gaussian Splatting as New Era: A Survey, IEEE Transactions on Visualization and Computer Graphics, 2024. [[Paper](https://ieeexplore.ieee.org/abstract/document/10521791?casa_token=CmDrVUqmo1kAAAAA:3ekE_T2xp9gWMActz0wLQ3Z6m7cdmyomp0ubYIl-nVZyheke22vbIoCKjId1jouaI4m7rm-UFQ)] 78 | - Gaussian Splatting: 3D Reconstruction and Novel View Synthesis: A Review, IEEE Access, 2024. [[Paper](https://ieeexplore.ieee.org/abstract/document/10545567)] 79 | 80 | **[NeRF (Improving speed & efficiency)](./nerf.md)** 81 | - How Far Can We Compress Instant-NGP-Based NeRF? CVPR 2024. [[Project](https://yihangchen-ee.github.io/project_cnc/) | [code](https://github.com/YihangChen-ee/CNC)] 82 | - FastSR-NeRF: Improving NeRF Efficiency on Consumer Devices With a Simple Super-Resolution Pipeline, WACV 2024. [[Paper](https://openaccess.thecvf.com/content/WACV2024/papers/Lin_FastSR-NeRF_Improving_NeRF_Efficiency_on_Consumer_Devices_With_a_Simple_WACV_2024_paper.pdf)] 83 | - HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces, CVPR 2024. [Project](https://haithemturki.com/hybrid-nerf/) 84 | 85 | **[Dynamic](https://github.com/pdaicode/awesome-dynamic-NeRF)** 86 | - Shape of Motion: 4D Reconstruction from a Single Video, 2024. [[Project](https://shape-of-motion.github.io/) | [Code](https://github.com/vye16/shape-of-motion/)] 87 | - MoSca: Dynamic Gaussian Fusion from Casual Videos via 4D Motion Scaffolds, 2024. [Project](https://www.cis.upenn.edu/~leijh/projects/mosca/) 88 | - Dynamic Gaussian Marbles for Novel View Synthesis of Casual Monocular Videos, 2024. 89 | - CoGS : Controllable Gaussian Splatting, CVPR 2024. [[Project](https://cogs2024.github.io/) | [Code](https://github.com/Heng14/CoGS/tree/main)] 90 | 91 | 92 | **LLM & 3D** 93 | - Comp4D: LLM-Guided Compositional 4D Scene Generation, [[Paper](https://arxiv.org/abs/2403.16993) | [Project](https://vita-group.github.io/Comp4D/)] 94 | - GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guidedGenerative Gaussian Splatting, [[Paper](https://arxiv.org/abs/2402.07207) | [Project](https://gala3d.github.io/)] 95 | 96 | **SLAM & Sensor Fusion** 97 | - Gaussian Splatting SLAM, CVPR 2024 [[Paper](https://arxiv.org/abs/2312.06741) | [Code](https://github.com/muskie82/MonoGS)] 98 | - SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM, CVPR 2024. 
[[Paper](https://arxiv.org/pdf/2312.02126.pdf) | [Code](https://github.com/spla-tam/SplaTAM)] 99 | - RGBD GS-ICP SLAM, [[Paper](https://arxiv.org/abs/2403.12550) | [Code](https://github.com/Lab-of-AI-and-Robotics/GS_ICP_SLAM)] 100 | - Gaussian-SLAM: Photo-realistic Dense SLAM with Gaussian Splatting, [[Paper](https://ivi.fnwi.uva.nl/cv/paper/GaussianSLAM.pdf) | [Code](https://github.com/VladimirYugay/Gaussian-SLAM)] 101 | - Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular, Stereo, and RGB-D Cameras, [[Paper](https://arxiv.org/pdf/2311.16728.pdf) | [Code](https://github.com/HuajianUP/Photo-SLAM)] 102 | 103 | **Compression & Efficiency** 104 | - GaussianPro: 3D Gaussian Splatting with Progressive Propagation, [[Paper](https://arxiv.org/abs/2402.14650) | [Code](https://github.com/kcheng1021/GaussianPro)] 105 | - InstantSplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 10 Seconds, [[Paper](https://arxiv.org/pdf/2403.20309.pdf) 106 | - HAC: Hash-grid Assisted Context for 3D Gaussian Splatting Compression, [[Paper](https://arxiv.org/abs/2403.14530) | [Code](https://github.com/YihangChen-ee/HAC)] 107 | - Reducing the Memory Footprint of 3D Gaussian Splatting, [[Paper](https://repo-sam.inria.fr/fungraph/reduced_3dgs/reduced_3DGS_i3d.pdf) | [Project](https://repo-sam.inria.fr/fungraph/reduced_3dgs/#:~:text=Our%20approach%20to%20reduce%20the,is%20applied%20as%20post%2Dprocessing.)] 108 | - SUNDAE: Spectrally Pruned Gaussian Fields with Neural Compensation, [[Paper](https://runyiyang.github.io/data/SUNDAE.pdf) | [Code](https://github.com/RunyiYang/SUNDAE)] 109 | - Compressed 3D Gaussian Splatting for Accelerated Novel View Synthesis, CVPR 2024. [[Project](https://keksboter.github.io/c3dgs/) | [Code](https://github.com/KeKsBoTer/c3dgs)] 110 | 111 | ### 2023 112 | Speed & Efficiency 113 | - ```[+]``` LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS, 2023. [[Paper](https://arxiv.org/abs/2311.17245) | [Code](https://github.com/VITA-Group/LightGaussian)] 114 | - ```[+]``` SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering, 2023. 
[[Paper](https://arxiv.org/abs/2311.12775) | [Code](https://github.com/Anttwo/SuGaR)] 115 | - Compact 3D Gaussian Representation for Radiance Field, [[Paper](https://github.com/maincold2/Compact-3DGS/blob/main) | [Code](https://github.com/maincold2/Compact-3DGS)] 116 | - Compact3D: Compressing Gaussian Splat Radiance Field Models with Vector Quantization, [[Paper](https://arxiv.org/abs/2311.18159) | [Code](https://github.com/UCDvision/compact3d)] 117 | 118 | Quality 119 | - Mip-Splatting: Alias-free 3D Gaussian Splatting, [[Paper](https://arxiv.org/abs/2311.16493) | [Code](https://github.com/autonomousvision/mip-splatting)] 120 | - Multi-Scale 3D Gaussian Splatting for Anti-Aliased Rendering, [[Paper](https://arxiv.org/abs/2311.17089) | [Code](https://github.com/JokerYan/MS-GS/tree/main)] 121 | - FisherRF: Active View Selection and Uncertainty Quantification for Radiance Fields using Fisher Information, [Paper](https://arxiv.org/abs/2311.17874) 122 | - COLMAP-Free 3D Gaussian Splatting, [[Paper](https://arxiv.org/pdf/2312.07504) | [Project](https://oasisyang.github.io/colmap-free-3dgs/)] 123 | - NeuSG: Neural Implicit Surface Reconstruction with 3D Gaussian Splatting Guidance, [Paper](https://arxiv.org/abs/2312.00846) 124 | - Depth-Regularized Optimization for 3D Gaussian Splatting in Few-Shot Images, [Paper](https://arxiv.org/pdf/2311.13398) 125 | - GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting, [Paper](https://arxiv.org/abs/2311.11700) 126 | 127 | Reflection & Relighting 128 | - GaussianShader: 3D Gaussian Splatting with Shading Functions for Reflective Surfaces, [[Paper](https://arxiv.org/abs/2311.17977) | [Code](https://github.com/Asparagus15/GaussianShader)] 129 | - Relightable 3D Gaussian: Real-time Point Cloud Relighting with BRDF Decomposition and Ray Tracing, [Paper](https://arxiv.org/abs/2311.16043) 130 | 131 | Others 132 | - Splatter Image: Ultra-Fast Single-View 3D Reconstruction, [[Paper](https://arxiv.org/abs/2312.13150) | [Code](https://github.com/szymanowiczs/splatter-image)] 133 | - pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction, [[Paper](https://arxiv.org/abs/2312.12337) | [Project Page](https://davidcharatan.com/pixelsplat/)] 134 | - Volume Feature Rendering for Fast Neural Radiance Field Reconstruction, NeurIPS 2023. 135 | 136 | ## 2. Dynamic 3D Gaussian Splatting: 137 | - ```[+]``` 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering, [[Paper](https://arxiv.org/pdf/2310.08528.pdf) | [Project Page](https://guanjunwu.github.io/4dgs/) | [Code](https://github.com/hustvl/4DGaussians) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pdaicode/awesome-3dgs/blob/master/colabs/4DGaussians.ipynb)] 138 | - ```[+]``` 4K4D: Real-Time 4D View Synthesis at 4K Resolution. 
[[Paper](https://drive.google.com/file/d/1Y-C6ASIB8ofvcZkyZ_Vp-a2TtbiPw1Yx/view?usp=sharing) | [Project Page](https://zju3dv.github.io/4k4d/) | [Code (Inference)](https://github.com/zju3dv/4K4D)]] 139 | - Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis, [[Paper](https://dynamic3dgaussians.github.io/paper.pdf) | [Project Page](https://dynamic3dgaussians.github.io/) | [Code](https://github.com/JonathonLuiten/Dynamic3DGaussians) | [Explanation Video](https://www.youtube.com/live/hDuy1TgD8I4?si=6oGN0IYnPRxOibpg)] 140 | - Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction, [[Paper](https://arxiv.org/pdf/2309.13101.pdf) | [Project Page](https://ingra14m.github.io/Deformable-Gaussians/) | [Code](https://github.com/ingra14m/Deformable-3D-Gaussians)] 141 | - Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting, [Paper](https://arxiv.org/pdf/2310.10642.pdf) 142 | - GauFRe: Gaussian Deformation Fields for Real-time Dynamic Novel View Synthesis, [Project Page](https://lynl7130.github.io/gaufre/index.html) 143 | 144 | ## 3. Generative 3D Gaussian Splatting: 145 | Papers with shared code are ranked higher in this list 146 | - DreamGaussian4D: Generative 4D Gaussian Splatting, [[Paper](https://arxiv.org/abs/2312.17142) | [Code](https://github.com/jiawei-ren/dreamgaussian4d)] 147 | - Text-to-3D using Gaussian Splatting, [[📄 Paper](https://arxiv.org/pdf/2309.16585.pdf) | [Project Page](https://gsgen3d.github.io/) | [Code](https://github.com/gsgen3d/gsgen) | [Explanation Video](https://www.youtube.com/live/l956ye13F8M?si=ZkvFL_lsY5OQUB7e)] 148 | - DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation, [Paper](https://arxiv.org/pdf/2309.16653.pdf) | [Project Page](https://dreamgaussian.github.io/) | [Code](https://github.com/dreamgaussian/dreamgaussian) | [Explanation Video](https://www.youtube.com/live/l956ye13F8M?si=ZkvFL_lsY5OQUB7e)] 149 | - GaussianDreamer: Fast Generation from Text to 3D Gaussian Splatting with Point Cloud Priors, [[Paper](https://arxiv.org/pdf/2310.08529.pdf) | [Project Page](https://taoranyi.com/gaussiandreamer/) | [Code](https://github.com/hustvl/GaussianDreamer)] 150 | - Gsgen: Text-to-3D using Gaussian Splatting, [[Paper](https://arxiv.org/abs/2309.16585) | [Project Page](https://gsgen3d.github.io/) | [Code](https://github.com/gsgen3d/gsgen)] 151 | - LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes, [[Paper](https://arxiv.org/abs/2311.13384) | [Project Page](https://luciddreamer-cvlab.github.io/)] 152 | - PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics, [[Paper](https://arxiv.org/abs/2311.12198) | [Project Page](https://xpandora.github.io/PhysGaussian/)] 153 | - HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting, [[Paper](https://arxiv.org/abs/2311.17061)] 154 | - Learn to Optimize Denoising Scores for 3D Generation: A Unified and Improved Diffusion Prior on NeRF and 3D Gaussian Splatting, [[Paper](https://arxiv.org/abs/2312.04820) | [Code](https://github.com/yangxiaofeng/LODS)] 155 | 156 | ## 4. 
Digital Avatar 157 | - Gaussian Shell Maps for Efficient 3D Human Generation, [[Paper](https://arxiv.org/abs/2311.17857) | [Code](https://github.com/computational-imaging/GSM)] 158 | - GauHuman: Articulated Gaussian Splatting from Monocular Human Videos, [[Paper](https://arxiv.org/pdf/2312.02973.pdf) | [Project Page](https://skhu101.github.io/GauHuman/) | [Code](https://github.com/skhu101/GauHuman)] 159 | - HeadGaS: Real-Time Animatable Head Avatars via 3D Gaussian Splatting, [Paper](https://arxiv.org/abs/2312.02902) 160 | - HUGS: Human Gaussian Splats, [Paper](https://arxiv.org/abs/2311.17910) 161 | - SplatArmor: Articulated Gaussian splatting for animatable humans from monocular RGB videos, [Paper](https://arxiv.org/pdf/2311.10812) 162 | - Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling, [Paper](https://arxiv.org/pdf/2311.16096.pdf) 163 | - Human101: Training 100+FPS Human Gaussians in 100s from 1 View, [Paper](https://arxiv.org/abs/2312.15258) 164 | - Deformable 3D Gaussian Splatting for Animatable Human Avatars, [Paper](https://arxiv.org/abs/2312.15059) 165 | 166 | ## 5. LLM 3D Gaussian Splatting 167 | - LangSplat: 3D Language Gaussian Splatting, [[Paper](https://arxiv.org/pdf/2312.16084.pdf) | [Project Page](https://langsplat.github.io/) | [Code](https://github.com/minghanqin/LangSplat)] 168 | 169 | ## 6. 3D Gaussian Viewers 170 | 171 | ### Colab 172 | - [Camenduru](https://github.com/camenduru/gaussian-splatting-colab) 173 | - [NeRFStudio](https://github.com/nerfstudio-project/nerfstudio/blob/main/colab/demo.ipynb) 174 | 175 | ### Training 176 | - [fast: C++/CUDA](https://github.com/MrNeRF/gaussian-splatting-cuda) 177 | - [nerfstudio: python/CUDA](https://github.com/nerfstudio-project/gsplat) 178 | - [Taichi 3D Gaussian Splatting](https://github.com/wanmeihuali/taichi_3d_gaussian_splatting) 179 | 180 | ### Viewers 181 | - [Playcanvas](https://github.com/playcanvas/supersplat) 182 | - [Luma AI (WebGL)](https://lumalabs.ai/luma-web-library) 183 | - [WebGL Viewer 1](https://github.com/antimatter15/splat) 184 | - [WebGL Viewer 2](https://github.com/cvlab-epfl/gaussian-splatting-web) 185 | - [Three.js](https://github.com/mkkellogg/GaussianSplats3D) 186 | - [A-Frame](https://github.com/quadjr/aframe-gaussian-splatting) 187 | 188 | ### Game Engines 189 | - [Unity Implementation](https://github.com/aras-p/UnityGaussianSplatting) 190 | - [Blender](https://github.com/ReshotAI/gaussian-splatting-blender-addon) 191 | 192 | ## 7. Documents 193 | ### Product 194 | - [Luma AI](https://lumalabs.ai/interactive-scenes) 195 | - [Polycam](https://poly.cam/gaussian-splatting) 196 | 197 | ### Blog Posts 198 | 199 | 1. [Gaussian Splatting is pretty cool](https://aras-p.info/blog/2023/09/05/Gaussian-Splatting-is-pretty-cool/) 200 | 2. [Making Gaussian Splats smaller](https://aras-p.info/blog/2023/09/13/Making-Gaussian-Splats-smaller/) 201 | 3. [Making Gaussian Splats more smaller](https://aras-p.info/blog/2023/09/27/Making-Gaussian-Splats-more-smaller/) 202 | 203 | ### Tutorial Videos 204 | 205 | 1. [Getting Started with 3DGS](https://youtu.be/UXtuigy_wYc?si=j1vfORNspcocSH-b) 206 | 2. 
[How to view 3DGS Scenes in Unity](https://youtu.be/5_GaPYBHqOo?si=6u9j1HqXwF_5WSUL) 207 | 208 | 209 | ## Reference 210 | - [Gaussian Splatting](https://github.com/graphdeco-inria/gaussian-splatting) 211 | - [MrNeRF](https://github.com/MrNeRF/awesome-3D-gaussian-splatting/tree/main) 212 | - https://dellaert.github.io/NeRF22/ 213 | - https://github.com/yangjiheng/nerf_and_beyond_docs -------------------------------------------------------------------------------- /UpdateLog.md: -------------------------------------------------------------------------------- 1 | 2 | ## Update Log: 3 | **October 27, 2023**: 4 | - Changed format to "not include abstract". 5 | **October 23, 2023**: 6 | - Added Related Papers section. 7 | -------------------------------------------------------------------------------- /avatar.md: -------------------------------------------------------------------------------- 1 | 2 | **Key Words**: 3D Morphable Models (3DMMs), NeRF (see the short 3DMM sketch after the 2024 list below) 3 | 4 | ## 2024 5 | - Portrait4D: Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data, CVPR 2024. [[Project](https://yudeng.github.io/Portrait4D/) | [Code](https://github.com/YuDeng/Portrait-4D)] 6 | - Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer, ECCV 2024. [[Code](https://github.com/YuDeng/Portrait-4D) | [Huggingface](https://huggingface.co/posts/DmitryRyumin/891674447263162)] 7 | - Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation, CVPR 2024. [Project](https://humanaigc.github.io/animate-anyone/) 8 | - Pose Adapted Shape Learning for Large-Pose Face Reenactment, CVPR 2024. 9 | - REFA: Real-time Egocentric Facial Animations for Virtual Reality, CVPR 2024. 10 | - Locally Adaptive Neural 3D Morphable Models, CVPR 2024. [[Code](https://github.com/michaeltrs/LAMM)] 11 | - Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation, CVPR 2024. [**NeRF**] [[Project](https://xiyichen.github.io/morphablediffusion/)] 12 | - HUGS: Human Gaussian Splats, CVPR 2024. [[Paper](https://arxiv.org/abs/2311.17910) | [Code](https://github.com/apple/ml-hugs)] 13 | - GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians, CVPR 2024. [[Project](https://shenhanqian.github.io/gaussian-avatars)] 14 | - VOODOO 3D: Volumetric Portrait Disentanglement for One-Shot 3D Head Reenactment, CVPR 2024. [[Project](https://p0lyfish.github.io/voodoo3d/) | [Code](https://github.com/MBZUAI-Metaverse/VOODOO3D-official)] 15 | - VRMM: A Volumetric Relightable Morphable Head Model, ACM SIGGRAPH 2024. 16 | - Bring Your Own Characters: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters, IEEE VR 2024. [[Paper](https://arxiv.org/abs/2402.13724) | [Code](https://github.com/showlab/byoc)] 17 | 18 | - TADA! Text to Animatable Digital Avatars, 3DV, 2024. [[Project](https://tada.is.tue.mpg.de/)] 19 | - AvatarOne: Monocular 3D Human Animation, WACV, 2024. [[Project](https://aku02.github.io/projects/avatarone/)] 20 | - Towards Realistic Generative 3D Face Models (AlbedoGAN), WACV 2024. [[Code](https://github.com/aashishrai3799/Towards-Realistic-Generative-3D-Face-Models)] 21 | - PSAvatar: A Point-based Shape Model for Real-Time Head Avatar Animation with 3D Gaussian Splatting, 2024. 22 | 23 | - VOODOO XP: Expressive One-Shot Head Reenactment for VR Telepresence, 2024. 24 | - HeadGAP: Few-shot 3D Head Avatar via Generalizable Gaussian Priors, 2024. [[Project](https://headgap.github.io/)] 25 | - Learn2Talk: 3D Talking Face Learns from 2D Talking Face, 2024. 
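Most of the head-avatar papers above build on the 3D Morphable Model (3DMM) idea named in the key words at the top of this page. As a rough, minimal sketch of the concept only (not the formulation of any specific paper listed here; the basis names, sizes, and random values below are made up for illustration), a 3DMM represents a face mesh as a mean shape deformed by linear identity and expression bases:

```python
import numpy as np

# Hypothetical sizes: V vertices, 80 identity and 64 expression components.
V, n_id, n_exp = 5023, 80, 64
mean_shape = np.zeros((V, 3))                    # mean face mesh, (V, 3)
id_basis = np.random.randn(n_id, V, 3) * 1e-3    # identity (shape) basis
exp_basis = np.random.randn(n_exp, V, 3) * 1e-3  # expression basis

def morphable_model(id_coeffs, exp_coeffs):
    """Return mesh vertices for the given identity/expression coefficients."""
    return (mean_shape
            + np.tensordot(id_coeffs, id_basis, axes=1)
            + np.tensordot(exp_coeffs, exp_basis, axes=1))

# Neutral identity with a small expression offset.
verts = morphable_model(np.zeros(n_id), 0.1 * np.ones(n_exp))
print(verts.shape)  # (5023, 3)
```

Many of the methods above keep this kind of low-dimensional identity/expression control but replace or augment the linear mesh bases with NeRF or 3D Gaussian representations.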
26 | 27 | ## 2023 28 | - Instant volumetric head avatars, CVPR, 2023. [[Project](https://zielon.github.io/insta/) | [Code](https://github.com/Zielon/INSTA)] 29 | - X-Avatar: Expressive Human Avatars, CVPR, 2023. [[Project](https://skype-line.github.io/projects/X-Avatar/)] 30 | - Generalizable One-shot 3D Neural Head Avatar, NueralIPS, 2023. [[Project](https://github.com/NVlabs/GOHA)] 31 | - HeadGaS: Real-Time Animatable Head Avatars via 3D Gaussian Splatting, [[Paper](https://arxiv.org/abs/2312.02902)] 32 | - SplatArmor: Articulated Gaussian splatting for animatable humans from monocular RGB videos, [[Paper](https://arxiv.org/pdf/2311.10812)] 33 | - Gaussian Shell Maps for Efficient 3D Human Generation, [[Paper](https://arxiv.org/abs/2311.17857)] 34 | - Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling, [[Paper](https://arxiv.org/pdf/2311.16096.pdf)] 35 | - Implicit Neural Head Synthesis via Controllable Local Deformation Fields, CVPR 2023. [**NeRF**] [[Project](https://imaging.cs.cmu.edu/local_deformation_fields/)] 36 | 37 | ## 2022 and before 38 | - Rodin: A generative model for sculpting 3d digital avatars using diffusion, 2022. [Project](https://3d-avatar-diffusion.microsoft.com/) 39 | - DeepFaceLab: Integrated, flexible and extensible face-swapping framework, 2020. [[Paper](https://arxiv.org/abs/2005.05535)] 40 | - First Order Motion Model for Image Animation, NeurIPS 2019. [[Project](https://aliaksandrsiarohin.github.io/first-order-model-website/) | [Code](https://github.com/AliaksandrSiarohin/first-order-model)] 41 | 42 | ## Demos 43 | - CatVTON: Concatenation Is All You Need for **Virtual Try-On** with Diffusion Models [Huggingface](https://huggingface.co/spaces/zhengchong/CatVTON) -------------------------------------------------------------------------------- /colabs/2DGS.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "provenance": [], 7 | "gpuType": "T4" 8 | }, 9 | "kernelspec": { 10 | "name": "python3", 11 | "display_name": "Python 3" 12 | }, 13 | "language_info": { 14 | "name": "python" 15 | }, 16 | "accelerator": "GPU" 17 | }, 18 | "cells": [ 19 | { 20 | "cell_type": "code", 21 | "execution_count": null, 22 | "metadata": { 23 | "id": "GsXqtQVJ-6wS" 24 | }, 25 | "outputs": [], 26 | "source": [ 27 | "import torch\n", 28 | "import torch.nn as nn\n", 29 | "import numpy as np\n", 30 | "import matplotlib.pyplot as plt\n", 31 | "import matplotlib" 32 | ] 33 | }, 34 | { 35 | "cell_type": "markdown", 36 | "source": [ 37 | "## Math utils" 38 | ], 39 | "metadata": { 40 | "id": "3tuwFOV__JfZ" 41 | } 42 | }, 43 | { 44 | "cell_type": "markdown", 45 | "source": [ 46 | "# 2D Gaussian Splatting implemented in python with a few lines of code\n", 47 | "\n", 48 | "\n", 49 | "\n", 50 | "This code implements the renderer of the paper \"2D Gaussian Splatting for Geometrically Accurate Radiance Fields\".\n", 51 | "\n", 52 | "paper: https://arxiv.org/abs/2403.17888\n", 53 | "\n", 54 | "homepage: https://surfsplatting.github.io/\n", 55 | "\n", 56 | "The cuda code is efficient and good, but it would be more readable with a pure pytorch/python code so readers can have better understanding without needing looking into the cuda implementation. 
Reader can also implement it with other preferred programming language.\n", 57 | "\n", 58 | "This code is built upon many great repos:\n", 59 | "\n", 60 | "torch-splatting: https://github.com/hbb1/torch-splatting\n", 61 | "\n", 62 | "3DGS: https://github.com/graphdeco-inria/gaussian-splatting\n", 63 | "\n", 64 | "gsplat: https://github.com/nerfstudio-project/gsplat" 65 | ], 66 | "metadata": { 67 | "id": "pcWh0PVWT7iC" 68 | } 69 | }, 70 | { 71 | "cell_type": "code", 72 | "source": [ 73 | "def build_rotation(r):\n", 74 | " norm = torch.sqrt(r[:,0]*r[:,0] + r[:,1]*r[:,1] + r[:,2]*r[:,2] + r[:,3]*r[:,3])\n", 75 | "\n", 76 | " q = r / norm[:, None]\n", 77 | "\n", 78 | " R = torch.zeros((q.size(0), 3, 3), device='cuda')\n", 79 | "\n", 80 | " r = q[:, 0]\n", 81 | " x = q[:, 1]\n", 82 | " y = q[:, 2]\n", 83 | " z = q[:, 3]\n", 84 | "\n", 85 | " R[:, 0, 0] = 1 - 2 * (y*y + z*z)\n", 86 | " R[:, 0, 1] = 2 * (x*y - r*z)\n", 87 | " R[:, 0, 2] = 2 * (x*z + r*y)\n", 88 | " R[:, 1, 0] = 2 * (x*y + r*z)\n", 89 | " R[:, 1, 1] = 1 - 2 * (x*x + z*z)\n", 90 | " R[:, 1, 2] = 2 * (y*z - r*x)\n", 91 | " R[:, 2, 0] = 2 * (x*z - r*y)\n", 92 | " R[:, 2, 1] = 2 * (y*z + r*x)\n", 93 | " R[:, 2, 2] = 1 - 2 * (x*x + y*y)\n", 94 | " return R\n", 95 | "\n", 96 | "def build_scaling_rotation(s, r):\n", 97 | " L = torch.zeros((s.shape[0], 3, 3), dtype=torch.float, device=\"cuda\")\n", 98 | " R = build_rotation(r)\n", 99 | "\n", 100 | " L[:,0,0] = s[:,0]\n", 101 | " L[:,1,1] = s[:,1]\n", 102 | " L[:,2,2] = s[:,2]\n", 103 | "\n", 104 | " L = R @ L\n", 105 | " return L\n", 106 | "\n", 107 | "def getProjectionMatrix(znear, zfar, fovX, fovY):\n", 108 | " import math\n", 109 | " tanHalfFovY = math.tan((fovY / 2))\n", 110 | " tanHalfFovX = math.tan((fovX / 2))\n", 111 | "\n", 112 | " top = tanHalfFovY * znear\n", 113 | " bottom = -top\n", 114 | " right = tanHalfFovX * znear\n", 115 | " left = -right\n", 116 | "\n", 117 | " P = torch.zeros(4, 4)\n", 118 | "\n", 119 | " z_sign = 1.0\n", 120 | "\n", 121 | " P[0, 0] = 2.0 * znear / (right - left)\n", 122 | " P[1, 1] = 2.0 * znear / (top - bottom)\n", 123 | " P[0, 2] = (right + left) / (right - left)\n", 124 | " P[1, 2] = (top + bottom) / (top - bottom)\n", 125 | " P[3, 2] = z_sign\n", 126 | " P[2, 2] = z_sign * zfar / (zfar - znear)\n", 127 | " P[2, 3] = -(zfar * znear) / (zfar - znear)\n", 128 | " return P\n", 129 | "\n", 130 | "def focal2fov(focal, pixels):\n", 131 | " import math\n", 132 | " return 2*math.atan(pixels/(2*focal))\n", 133 | "\n", 134 | "def homogeneous(points):\n", 135 | " \"\"\"\n", 136 | " homogeneous points\n", 137 | " :param points: [..., 3]\n", 138 | " \"\"\"\n", 139 | " return torch.cat([points, torch.ones_like(points[..., :1])], dim=-1)\n", 140 | "\n", 141 | "def homogeneous_vec(vec):\n", 142 | " \"\"\"\n", 143 | " homogeneous points\n", 144 | " :param points: [..., 3]\n", 145 | " \"\"\"\n", 146 | " return torch.cat([vec, torch.zeros_like(vec[..., :1])], dim=-1)" 147 | ], 148 | "metadata": { 149 | "id": "fhmSXACx_VL9" 150 | }, 151 | "execution_count": null, 152 | "outputs": [] 153 | }, 154 | { 155 | "cell_type": "markdown", 156 | "source": [ 157 | "## Surface splatting (2D Gaussian splatting)" 158 | ], 159 | "metadata": { 160 | "id": "yjPjjw2y_qL2" 161 | } 162 | }, 163 | { 164 | "cell_type": "code", 165 | "source": [ 166 | "# Surface splatting (2D Gaussian Splatting)\n", 167 | "def setup(means3D, scales, quats, opacities, colors, viewmat, projmat):\n", 168 | " rotations = build_scaling_rotation(scales, quats).permute(0,2,1)\n", 169 | "\n", 170 | " # 1. 
Viewing transform\n", 171 | " # Eq.4 and Eq.5\n", 172 | " p_view = (means3D @ viewmat[:3,:3]) + viewmat[-1:,:3]\n", 173 | " uv_view = (rotations @ viewmat[:3,:3])\n", 174 | " M = torch.cat([homogeneous_vec(uv_view[:,:2,:]), homogeneous(p_view.unsqueeze(1))], dim=1)\n", 175 | "\n", 176 | " T = M @ projmat # T stands for (WH)^T in Eq.9\n", 177 | " import pdb; pdb.set_trace()\n", 178 | " # 2. Compute AABB\n", 179 | " # Homogneous plane is very useful for both ray-splat intersection and bounding box computation\n", 180 | " # we know how to compute u,v given x,y homogeneous plane already; computing AABB is done by a reverse process.\n", 181 | " # i.e compute the x, y s.t. \\|hu^4\\| = 1 and \\|h_v^4\\|=1 (distance of gaussian center to plane in the uv space)\n", 182 | " temp_point = torch.tensor([[1.,1., -1.]]).cuda()\n", 183 | " distance = (temp_point * (T[..., 3] * T[..., 3])).sum(dim=-1, keepdims=True)\n", 184 | " f = (1 / distance) * temp_point\n", 185 | " point_image = torch.cat(\n", 186 | " [(f * T[..., 0] * T[...,3]).sum(dim=-1, keepdims=True),\n", 187 | " (f * T[..., 1] * T[...,3]).sum(dim=-1, keepdims=True),\n", 188 | " (f * T[..., 2] * T[...,3]).sum(dim=-1, keepdims=True)], dim=-1)\n", 189 | "\n", 190 | " half_extend = point_image * point_image - torch.cat(\n", 191 | " [(f * T[..., 0] * T[...,0]).sum(dim=-1, keepdims=True),\n", 192 | " (f * T[..., 1] * T[...,1]).sum(dim=-1, keepdims=True),\n", 193 | " (f * T[..., 2] * T[...,2]).sum(dim=-1, keepdims=True)], dim=-1)\n", 194 | "\n", 195 | " radii = half_extend.clamp(min=1e-4).sqrt() * 3 # three sigma\n", 196 | " center = point_image\n", 197 | "\n", 198 | " # 3. Perform Sorting\n", 199 | " depth = p_view[..., 2] # depth is used only for sorting\n", 200 | " index = depth.sort()[1]\n", 201 | " T = T[index]\n", 202 | " colors = colors[index]\n", 203 | " center = center[index]\n", 204 | " depth = depth[index]\n", 205 | " radii = radii[index]\n", 206 | " opacities = opacities[index]\n", 207 | " return T, colors, opacities, center, depth, radii\n", 208 | "\n", 209 | "def surface_splatting(means3D, scales, quats, colors, opacities, intrins, viewmat, projmat):\n", 210 | " # Rasterization setup\n", 211 | " projmat = torch.zeros(4,4).cuda()\n", 212 | " projmat[:3,:3] = intrins\n", 213 | " projmat[-1,-2] = 1.0\n", 214 | " projmat = projmat.T\n", 215 | " T, colors, opacities, center, depth, radii = setup(means3D, scales, quats, opacities, colors, viewmat, projmat)\n", 216 | "\n", 217 | " # Rasterization\n", 218 | " # 1. Generate pixels\n", 219 | " H, W = (intrins[0,-1] * 2).long(), (intrins[1,-1] * 2).long()\n", 220 | " H, W = H.item(), W.item()\n", 221 | " pix = torch.stack(torch.meshgrid(torch.arange(H),\n", 222 | " torch.arange(W), indexing='xy'), dim=-1).to('cuda')\n", 223 | "\n", 224 | " # 2. Compute ray splat intersection # Eq.9 and Eq.10\n", 225 | " x = pix.reshape(-1,1,2)[..., :1]\n", 226 | " y = pix.reshape(-1,1,2)[..., 1:]\n", 227 | " k = -T[None][..., 0] + x * T[None][..., 3]\n", 228 | " l = -T[None][..., 1] + y * T[None][..., 3]\n", 229 | " points = torch.cross(k, l, dim=-1)\n", 230 | " s = points[..., :2] / points[..., -1:]\n", 231 | "\n", 232 | " # 3. add low pass filter # Eq. 
11\n", 233 | " # when a point (2D Gaussian) viewed from a far distance or from a slended angle\n", 234 | " # the 2D Gaussian will falls between pixels and no fragment is used to rasterize the Gaussian\n", 235 | " # so we should add a low pass filter to handle such aliasing.\n", 236 | " dist3d = (s * s).sum(dim=-1)\n", 237 | " filtersze = np.sqrt(2) / 2\n", 238 | " dist2d = (1/filtersze)**2 * (torch.cat([x,y], dim=-1) - center[None,:,:2]).norm(dim=-1)**2\n", 239 | " # min of dist2 is equal to max of Gaussian exp(-0.5 * dist2)\n", 240 | " dist2 = torch.min(dist3d, dist2d)\n", 241 | " # dist2 = dist3d\n", 242 | " depth_acc = (homogeneous(s) * T[None,..., -1]).sum(dim=-1)\n", 243 | "\n", 244 | " # 4. accumulate 2D gaussians through alpha blending # Eq.12\n", 245 | " image, depthmap = alpha_blending_with_gaussians(dist2, colors, opacities, depth_acc, H, W)\n", 246 | " return image, depthmap, center, radii, dist2" 247 | ], 248 | "metadata": { 249 | "id": "f4DBG-sz_tes" 250 | }, 251 | "execution_count": null, 252 | "outputs": [] 253 | }, 254 | { 255 | "cell_type": "markdown", 256 | "source": [ 257 | "## Volume splatting (3D Gaussian Splatting)" 258 | ], 259 | "metadata": { 260 | "id": "6PgnyA6JAE5e" 261 | } 262 | }, 263 | { 264 | "cell_type": "code", 265 | "source": [ 266 | "def build_covariance_2d(\n", 267 | " mean3d, cov3d, viewmatrix, tan_fovx, tan_fovy, focal_x, focal_y\n", 268 | "):\n", 269 | " import math\n", 270 | " t = (mean3d @ viewmatrix[:3,:3]) + viewmatrix[-1:,:3]\n", 271 | " tz = t[..., 2]\n", 272 | " tx = t[..., 0]\n", 273 | " ty = t[..., 1]\n", 274 | "\n", 275 | " # Eq.29 locally affine transform\n", 276 | " # perspective transform is not affine so we approximate with first-order taylor expansion\n", 277 | " # notice that we multiply by the intrinsic so that the variance is at the sceen space\n", 278 | " J = torch.zeros(mean3d.shape[0], 3, 3).to(mean3d)\n", 279 | " J[..., 0, 0] = 1 / tz * focal_x\n", 280 | " J[..., 0, 2] = -tx / (tz * tz) * focal_x\n", 281 | " J[..., 1, 1] = 1 / tz * focal_y\n", 282 | " J[..., 1, 2] = -ty / (tz * tz) * focal_y\n", 283 | " W = viewmatrix[:3,:3].T # transpose to correct viewmatrix\n", 284 | " cov2d = J @ W @ cov3d @ W.T @ J.permute(0,2,1)\n", 285 | "\n", 286 | " # add low pass filter here according to E.q. 
32\n", 287 | " filter = torch.eye(2,2).to(cov2d) * 0.0\n", 288 | " return cov2d[:, :2, :2] + filter[None]\n", 289 | "\n", 290 | "def build_covariance_3d(s, r):\n", 291 | " L = build_scaling_rotation(s, r).permute(0,2,1)\n", 292 | " actual_covariance = L @ L.transpose(1, 2)\n", 293 | " return actual_covariance\n", 294 | "\n", 295 | "def projection_ndc(points, viewmatrix, projmatrix):\n", 296 | " points_o = homogeneous(points) # object space\n", 297 | " points_h = points_o @ viewmatrix @ projmatrix # screen space # RHS\n", 298 | " p_w = 1.0 / (points_h[..., -1:] + 0.000001)\n", 299 | " p_proj = points_h * p_w\n", 300 | " p_view = points_o @ viewmatrix\n", 301 | " in_mask = p_view[..., 2] >= 0.2\n", 302 | " return p_proj, p_view, in_mask\n", 303 | "\n", 304 | "def get_radius(cov2d):\n", 305 | " det = cov2d[:, 0, 0] * cov2d[:,1,1] - cov2d[:, 0, 1] * cov2d[:,1,0]\n", 306 | " mid = 0.5 * (cov2d[:, 0,0] + cov2d[:,1,1])\n", 307 | " lambda1 = mid + torch.sqrt((mid**2-det).clip(min=0.1))\n", 308 | " lambda2 = mid - torch.sqrt((mid**2-det).clip(min=0.1))\n", 309 | " return 3.0 * torch.sqrt(torch.max(lambda1, lambda2)).ceil()\n", 310 | "\n", 311 | "def volume_splatting(means3D, scales, quats, colors, opacities, intrins, viewmat, projmat):\n", 312 | " projmat = torch.zeros(4,4).cuda()\n", 313 | " projmat[:3,:3] = intrins\n", 314 | " projmat[-1,-2] = 1.0\n", 315 | " projmat = projmat.T\n", 316 | "\n", 317 | " mean_ndc, mean_view, in_mask = projection_ndc(means3D, viewmatrix=viewmat, projmatrix=projmat)\n", 318 | "\n", 319 | " depths = mean_view[:,2]\n", 320 | " mean_coord_x = mean_ndc[..., 0]\n", 321 | " mean_coord_y = mean_ndc[..., 1]\n", 322 | "\n", 323 | " means2D = torch.stack([mean_coord_x, mean_coord_y], dim=-1)\n", 324 | " # scales = torch.cat([scales[..., :2], scales[..., -1:]*1e-2], dim=-1)\n", 325 | " cov3d = build_covariance_3d(scales, quats)\n", 326 | "\n", 327 | " W, H = (intrins[0,-1] * 2).long().item(), (intrins[1,-1] * 2).long().item()\n", 328 | " fx, fy = intrins[0,0], intrins[1,1]\n", 329 | " tan_fovx = W / (2 * fx)\n", 330 | " tan_fovy = H / (2 * fy)\n", 331 | " cov2d = build_covariance_2d(means3D, cov3d, viewmat, tan_fovx, tan_fovy, fx, fy)\n", 332 | " radii = get_radius(cov2d)\n", 333 | "\n", 334 | " # Rasterization\n", 335 | " # generate pixels\n", 336 | " pix = torch.stack(torch.meshgrid(torch.arange(H), torch.arange(W), indexing='xy'), dim=-1).to('cuda').flatten(0,-2)\n", 337 | " sorted_conic = cov2d.inverse() # inverse of variance\n", 338 | " dx = (pix[:,None,:] - means2D[None,:]) # B P 2\n", 339 | " dist2 = dx[:, :, 0]**2 * sorted_conic[:, 0, 0] + dx[:, :, 1]**2 * sorted_conic[:, 1, 1]+ dx[:,:,0]*dx[:,:,1] * sorted_conic[:, 0, 1]+ dx[:,:,0]*dx[:,:,1] * sorted_conic[:, 1, 0]\n", 340 | " depth_acc = depths[None].expand_as(dist2)\n", 341 | "\n", 342 | " image, depthmap = alpha_blending_with_gaussians(dist2, colors, opacities, depth_acc, H, W)\n", 343 | " return image, depthmap, means2D, radii, dist2" 344 | ], 345 | "metadata": { 346 | "id": "THUjERtnANnJ" 347 | }, 348 | "execution_count": null, 349 | "outputs": [] 350 | }, 351 | { 352 | "cell_type": "markdown", 353 | "source": [ 354 | "## Rendering utils" 355 | ], 356 | "metadata": { 357 | "id": "6h6bNW-6_XZm" 358 | } 359 | }, 360 | { 361 | "cell_type": "code", 362 | "source": [ 363 | "def alpha_blending(alpha, colors):\n", 364 | " T = torch.cat([torch.ones_like(alpha[-1:]), (1-alpha).cumprod(dim=0)[:-1]], dim=0)\n", 365 | " image = (T * alpha * colors).sum(dim=0).reshape(-1, colors.shape[-1])\n", 366 | " alphamap = (T * 
alpha).sum(dim=0).reshape(-1, 1)\n", 367 | " return image, alphamap\n", 368 | "\n", 369 | "\n", 370 | "def alpha_blending_with_gaussians(dist2, colors, opacities, depth_acc, H, W):\n", 371 | " colors = colors.reshape(-1,1,colors.shape[-1])\n", 372 | " depth_acc = depth_acc.T[..., None]\n", 373 | " depth_acc = depth_acc.repeat(1,1,1)\n", 374 | "\n", 375 | " # evaluate gaussians\n", 376 | " # just for visualization, the actual cut off can be 3 sigma!\n", 377 | " cutoff = 1**2\n", 378 | " dist2 = dist2.T\n", 379 | " gaussians = torch.exp(-0.5*dist2) * (dist2 < cutoff)\n", 380 | " gaussians = gaussians[..., None]\n", 381 | " alpha = opacities.unsqueeze(1) * gaussians\n", 382 | "\n", 383 | " # accumulate gaussians\n", 384 | " image, _ = alpha_blending(alpha, colors)\n", 385 | " depthmap, alphamap = alpha_blending(alpha, depth_acc)\n", 386 | " depthmap = depthmap / alphamap\n", 387 | " depthmap = torch.nan_to_num(depthmap, 0, 0)\n", 388 | " return image.reshape(H,W,-1), depthmap.reshape(H,W,-1)" 389 | ], 390 | "metadata": { 391 | "id": "2ulOShwp-2xW" 392 | }, 393 | "execution_count": null, 394 | "outputs": [] 395 | }, 396 | { 397 | "cell_type": "markdown", 398 | "source": [ 399 | "## Utils for inputs and cameras" 400 | ], 401 | "metadata": { 402 | "id": "YoyRlnsYARRs" 403 | } 404 | }, 405 | { 406 | "cell_type": "code", 407 | "source": [ 408 | "def get_inputs(num_points=8):\n", 409 | " length = 0.5\n", 410 | " x = np.linspace(-1, 1, num_points) * length\n", 411 | " y = np.linspace(-1, 1, num_points) * length\n", 412 | " x, y = np.meshgrid(x, y)\n", 413 | " means3D = torch.from_numpy(np.stack([x,y, 0 * np.random.rand(*x.shape)], axis=-1).reshape(-1,3)).cuda().float()\n", 414 | " quats = torch.zeros(1,4).repeat(len(means3D), 1).cuda()\n", 415 | " quats[..., 0] = 1.\n", 416 | " scale = length /(num_points-1)\n", 417 | " scales = torch.zeros(1,3).repeat(len(means3D), 1).fill_(scale).cuda()\n", 418 | " return means3D, scales, quats\n", 419 | "\n", 420 | "def get_cameras():\n", 421 | " intrins = torch.tensor([[711.1111, 0.0000, 256.0000, 0.0000],\n", 422 | " [ 0.0000, 711.1111, 256.0000, 0.0000],\n", 423 | " [ 0.0000, 0.0000, 1.0000, 0.0000],\n", 424 | " [ 0.0000, 0.0000, 0.0000, 1.0000]]).cuda()\n", 425 | " c2w = torch.tensor([[-8.6086e-01, 3.7950e-01, -3.3896e-01, 6.7791e-01],\n", 426 | " [ 5.0884e-01, 6.4205e-01, -5.7346e-01, 1.1469e+00],\n", 427 | " [ 1.0934e-08, -6.6614e-01, -7.4583e-01, 1.4917e+00],\n", 428 | " [ 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0000e+00]]).cuda()\n", 429 | "\n", 430 | " width, height = 512, 512\n", 431 | " focal_x, focal_y = intrins[0, 0], intrins[1, 1]\n", 432 | " viewmat = torch.linalg.inv(c2w).permute(1,0)\n", 433 | " FoVx = focal2fov(focal_x, width)\n", 434 | " FoVy = focal2fov(focal_y, height)\n", 435 | " projmat = getProjectionMatrix(znear=0.2, zfar=1000, fovX=FoVx, fovY=FoVy).transpose(0,1).cuda()\n", 436 | " projmat = viewmat @ projmat\n", 437 | " return intrins, viewmat, projmat, height, width" 438 | ], 439 | "metadata": { 440 | "id": "lFT5k1dhAX0c" 441 | }, 442 | "execution_count": null, 443 | "outputs": [] 444 | }, 445 | { 446 | "cell_type": "markdown", 447 | "source": [ 448 | "## Visualization of the 2DGS v.s 3DGS\n", 449 | "\n", 450 | "Our 2DGS fits well to a surface." 
451 | ], 452 | "metadata": { 453 | "id": "vFfgJKy7R46e" 454 | } 455 | }, 456 | { 457 | "cell_type": "code", 458 | "source": [ 459 | "# Make inputs\n", 460 | "num_points1=8\n", 461 | "means3D, scales, quats = get_inputs(num_points=num_points1)\n", 462 | "intrins, viewmat, projmat, height, width = get_cameras()\n", 463 | "intrins = intrins[:3,:3]\n", 464 | "colors = matplotlib.colormaps['Accent'](np.random.randint(1,64, 64)/64)[..., :3]\n", 465 | "colors = torch.from_numpy(colors).cuda()\n", 466 | "\n", 467 | "opacity = torch.ones_like(means3D[:,:1])\n", 468 | "image1, depthmap1, center1, radii1, dist1 = surface_splatting(means3D, scales, quats, colors, opacity, intrins, viewmat, projmat)\n", 469 | "image2, depthmap2, center2, radii2, dist2 = volume_splatting(means3D, scales, quats, colors, opacity, intrins, viewmat, projmat)\n", 470 | "\n", 471 | "# Visualize 3DGS and 2DGS\n", 472 | "fig1, (ax1,ax2) = plt.subplots(1,2)\n", 473 | "fig2, (ax3,ax4) = plt.subplots(1,2)\n", 474 | "\n", 475 | "from matplotlib.patches import Rectangle\n", 476 | "point_image = center1.cpu().detach().numpy()\n", 477 | "half_extend = radii1.cpu().numpy() * 1/3 # only show one sigma\n", 478 | "lb = np.floor(point_image - half_extend)[..., :2]\n", 479 | "hw = np.ceil(2*(half_extend)[..., :2])\n", 480 | "\n", 481 | "ax1.set_aspect('equal')\n", 482 | "ax1.set_axis_off()\n", 483 | "ax1.set_title('2D Gaussian splatting - color')\n", 484 | "ax2.set_aspect('equal')\n", 485 | "ax2.set_axis_off()\n", 486 | "ax2.set_title('3D Gaussian splatting - color')\n", 487 | "ax1.set_aspect('equal')\n", 488 | "ax1.set_axis_off()\n", 489 | "\n", 490 | "ax3.set_title('2D Gaussian splatting - depth')\n", 491 | "ax3.set_axis_off()\n", 492 | "ax3.set_aspect('equal')\n", 493 | "ax4.set_axis_off()\n", 494 | "ax4.set_title('3D Gaussian splatting - depth')\n", 495 | "fig1.tight_layout()\n", 496 | "fig2.tight_layout()\n", 497 | "# visualize AABB\n", 498 | "for k in range(len(half_extend)):\n", 499 | " ax1.add_patch(Rectangle(lb[k], hw[k, 0], hw[k, 1], facecolor='none', edgecolor='white'))\n", 500 | " # ax3.add_patch(Rectangle(lb[k], hw[k, 0], hw[k, 1], facecolor='none', edgecolor='white'))\n", 501 | "\n", 502 | "img1 = image1.cpu().numpy()\n", 503 | "img2 = image2.cpu().numpy()\n", 504 | "ax1.imshow(img1)\n", 505 | "ax2.imshow(img2)\n", 506 | "\n", 507 | "img1 = depthmap1.cpu().numpy()\n", 508 | "img2 = depthmap2.cpu().numpy()\n", 509 | "\n", 510 | "ax3.imshow(img1)\n", 511 | "ax4.imshow(img2)\n", 512 | "\n", 513 | "plt.savefig('test1.png', transparent=True, dpi=300)" 514 | ], 515 | "metadata": { 516 | "colab": { 517 | "base_uri": "https://localhost:8080/", 518 | "height": 356 519 | }, 520 | "id": "K20GoXFJAqzp", 521 | "outputId": "4fee37a1-023f-4069-9ada-bc8d4b1f1fae" 522 | }, 523 | "execution_count": null, 524 | "outputs": [ 525 | { 526 | "output_type": "error", 527 | "ename": "RuntimeError", 528 | "evalue": "Found no NVIDIA driver on your system. 
Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx", 529 | "traceback": [ 530 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", 531 | "\u001b[0;31mRuntimeError\u001b[0m Traceback (most recent call last)", 532 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# Make inputs\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mnum_points1\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;36m8\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mmeans3D\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mscales\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mquats\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mget_inputs\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnum_points\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mnum_points1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 4\u001b[0m \u001b[0mintrins\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mviewmat\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mprojmat\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mheight\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mwidth\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mget_cameras\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mintrins\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mintrins\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 533 | "\u001b[0;32m\u001b[0m in \u001b[0;36mget_inputs\u001b[0;34m(num_points)\u001b[0m\n\u001b[1;32m 4\u001b[0m \u001b[0my\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mlinspace\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnum_points\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mlength\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mmeshgrid\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0my\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 6\u001b[0;31m \u001b[0mmeans3D\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfrom_numpy\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mstack\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0my\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m0\u001b[0m \u001b[0;34m*\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrandom\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrand\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0mx\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshape\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m,\u001b[0m 
\u001b[0maxis\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mreshape\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m-\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcuda\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfloat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 7\u001b[0m \u001b[0mquats\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mzeros\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m4\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mrepeat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlen\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmeans3D\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcuda\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 8\u001b[0m \u001b[0mquats\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m...\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m1.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 534 | "\u001b[0;32m/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py\u001b[0m in \u001b[0;36m_lazy_init\u001b[0;34m()\u001b[0m\n\u001b[1;32m 300\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;34m\"CUDA_MODULE_LOADING\"\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mos\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0menviron\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 301\u001b[0m \u001b[0mos\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0menviron\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m\"CUDA_MODULE_LOADING\"\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m\"LAZY\"\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 302\u001b[0;31m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_C\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_cuda_init\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 303\u001b[0m \u001b[0;31m# Some of the queued calls may reentrantly call _lazy_init();\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 304\u001b[0m \u001b[0;31m# we need to just return without initializing in that case.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 535 | "\u001b[0;31mRuntimeError\u001b[0m: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx" 536 | ] 537 | } 538 | ] 539 | }, 540 | { 541 | "cell_type": "markdown", 542 | "source": [ 543 | "## 2DGS V.S flatten 3DGS (setting last scale to be very small)\n", 544 | "\n", 545 | "Our 2DGS rasterizer is **perspective correct** and **depth accurate**, with the depth gradient consistent to the the normal direction." 
546 | ], 547 | "metadata": { 548 | "id": "0UHcGifCSJ3f" 549 | } 550 | }, 551 | { 552 | "cell_type": "code", 553 | "source": [ 554 | "# reduce num of points to give a close look\n", 555 | "num_points2=2\n", 556 | "means3D, scales, quats = get_inputs(num_points=num_points2)\n", 557 | "scales[:,-1] = 0e-6\n", 558 | "colors = torch.cat([colors[:num_points2, :], colors[num_points1:num_points1+num_points2, :]], dim=0)\n", 559 | "\n", 560 | "opacity = torch.ones_like(means3D[:,:1])\n", 561 | "\n", 562 | "image1, depthmap1, center1, radii1, dist1 = surface_splatting(means3D, scales, quats, colors, opacity, intrins, viewmat, projmat)\n", 563 | "image2, depthmap2, center2, radii2, dist2 = volume_splatting(means3D, scales, quats, colors, opacity, intrins, viewmat, projmat)\n", 564 | "\n", 565 | "# Visualize 3DGS and 2DGS\n", 566 | "fig1, (ax1,ax2) = plt.subplots(1,2)\n", 567 | "fig2, (ax3,ax4) = plt.subplots(1,2)\n", 568 | "\n", 569 | "from matplotlib.patches import Rectangle\n", 570 | "point_image = center1.cpu().detach().numpy()\n", 571 | "half_extend = radii1.cpu().numpy()\n", 572 | "lb = np.floor(point_image - half_extend)[..., :2]\n", 573 | "hw = np.ceil(2*(half_extend)[..., :2])\n", 574 | "\n", 575 | "ax1.set_aspect('equal')\n", 576 | "ax1.set_axis_off()\n", 577 | "# ax1.set_title('2D Gaussian splatting - color')\n", 578 | "ax2.set_aspect('equal')\n", 579 | "ax2.set_axis_off()\n", 580 | "# ax2.set_title('3D Gaussian splatting - color')\n", 581 | "ax1.set_aspect('equal')\n", 582 | "ax1.set_axis_off()\n", 583 | "\n", 584 | "# ax3.set_title('2D Gaussian splatting - depth')\n", 585 | "ax3.set_axis_off()\n", 586 | "ax3.set_aspect('equal')\n", 587 | "ax4.set_axis_off()\n", 588 | "# ax4.set_title('3D Gaussian splatting - depth')\n", 589 | "fig1.tight_layout()\n", 590 | "fig2.tight_layout()\n", 591 | "\n", 592 | "img1 = image1.cpu().numpy()\n", 593 | "img2 = image2.cpu().numpy()\n", 594 | "ax1.imshow(img1)\n", 595 | "ax2.imshow(img2)\n", 596 | "\n", 597 | "img1 = depthmap1.cpu().numpy()\n", 598 | "img2 = depthmap2.cpu().numpy()\n", 599 | "\n", 600 | "ax3.imshow(img1)\n", 601 | "ax4.imshow(img2)\n", 602 | "\n", 603 | "plt.savefig('test2.png', transparent=True, dpi=300)" 604 | ], 605 | "metadata": { 606 | "id": "LTw-URSvR_it" 607 | }, 608 | "execution_count": null, 609 | "outputs": [] 610 | }, 611 | { 612 | "cell_type": "markdown", 613 | "source": [ 614 | "you can see that the flatten 3d Gaussian has perspective distortion and the depth is constant within a splat." 
615 | ], 616 | "metadata": { 617 | "id": "O_1mLwSaVk5U" 618 | } 619 | } 620 | ] 621 | } -------------------------------------------------------------------------------- /colabs/HyperNerf.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "nbformat": 4, 3 | "nbformat_minor": 0, 4 | "metadata": { 5 | "colab": { 6 | "name": "HyperNeRF Training.ipynb", 7 | "private_outputs": true, 8 | "provenance": [], 9 | "collapsed_sections": [] 10 | }, 11 | "kernelspec": { 12 | "name": "python3", 13 | "display_name": "Python 3" 14 | }, 15 | "accelerator": "TPU" 16 | }, 17 | "cells": [ 18 | { 19 | "cell_type": "markdown", 20 | "metadata": { 21 | "id": "EZ_wkNVdTz-C" 22 | }, 23 | "source": [ 24 | "# Let's train HyperNeRF!\n", 25 | "\n", 26 | "**Author**: [Keunhong Park](https://keunhong.com)\n", 27 | "\n", 28 | "[[Project Page](https://hypernerf.github.io)]\n", 29 | "[[Paper](https://arxiv.org/abs/2106.13228)]\n", 30 | "[[GitHub](https://github.com/google/hypernerf)]\n", 31 | "\n", 32 | "This notebook provides an demo for training HyperNeRF.\n", 33 | "\n", 34 | "### Instructions\n", 35 | "\n", 36 | "1. Convert a video into our dataset format using the Nerfies [dataset processing notebook](https://colab.sandbox.google.com/github/google/nerfies/blob/main/notebooks/Nerfies_Capture_Processing.ipynb).\n", 37 | "2. Set the `data_dir` below to where you saved the dataset.\n", 38 | "3. Come back to this notebook to train HyperNeRF.\n", 39 | "\n", 40 | "\n", 41 | "### Notes\n", 42 | " * To accomodate the limited compute power of Colab runtimes, this notebook defaults to a \"toy\" version of our method. The number of samples have been reduced and the elastic regularization turned off.\n", 43 | "\n", 44 | " * To train a high-quality model, please look at the CLI options we provide in the [Github repository](https://github.com/google/hypernerf).\n", 45 | "\n", 46 | "\n", 47 | "\n", 48 | " * Please report issues on the [GitHub issue tracker](https://github.com/google/hypernerf/issues).\n", 49 | "\n", 50 | "\n", 51 | "If you find this work useful, please consider citing:\n", 52 | "```bibtex\n", 53 | "@article{park2021hypernerf\n", 54 | " author = {Park, Keunhong and Sinha, Utkarsh and Hedman, Peter and Barron, Jonathan T. and Bouaziz, Sofien and Goldman, Dan B and Martin-Brualla, Ricardo and Seitz, Steven M.},\n", 55 | " title = {HyperNeRF: A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields},\n", 56 | " journal = {arXiv preprint arXiv:2106.13228},\n", 57 | " year = {2021},\n", 58 | "}\n", 59 | "```\n" 60 | ] 61 | }, 62 | { 63 | "cell_type": "markdown", 64 | "metadata": { 65 | "id": "OlW1gF_djH6H" 66 | }, 67 | "source": [ 68 | "## Environment Setup" 69 | ] 70 | }, 71 | { 72 | "cell_type": "code", 73 | "metadata": { 74 | "id": "I6Jbspl7TnIX" 75 | }, 76 | "source": [ 77 | "!pip install flax immutabledict mediapy\n", 78 | "!pip install --upgrade git+https://github.com/google/hypernerf" 79 | ], 80 | "execution_count": null, 81 | "outputs": [] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "metadata": { 86 | "id": "zGJux-m5Xp3Z", 87 | "cellView": "form" 88 | }, 89 | "source": [ 90 | "# @title Configure notebook runtime\n", 91 | "# @markdown If you would like to use a GPU runtime instead, change the runtime type by going to `Runtime > Change runtime type`. 
\n", 92 | "# @markdown You will have to use a smaller batch size on GPU.\n", 93 | "\n", 94 | "runtime_type = 'tpu' # @param ['gpu', 'tpu']\n", 95 | "if runtime_type == 'tpu':\n", 96 | " import jax.tools.colab_tpu\n", 97 | " jax.tools.colab_tpu.setup_tpu()\n", 98 | "\n", 99 | "print('Detected Devices:', jax.devices())" 100 | ], 101 | "execution_count": null, 102 | "outputs": [] 103 | }, 104 | { 105 | "cell_type": "code", 106 | "metadata": { 107 | "id": "afUtLfRWULEi", 108 | "cellView": "form" 109 | }, 110 | "source": [ 111 | "# @title Mount Google Drive\n", 112 | "# @markdown Mount Google Drive onto `/content/gdrive`. You can skip this if running locally.\n", 113 | "\n", 114 | "from google.colab import drive\n", 115 | "drive.mount('/content/gdrive')" 116 | ], 117 | "execution_count": null, 118 | "outputs": [] 119 | }, 120 | { 121 | "cell_type": "code", 122 | "metadata": { 123 | "id": "ENOfbG3AkcVN", 124 | "cellView": "form" 125 | }, 126 | "source": [ 127 | "# @title Define imports and utility functions.\n", 128 | "\n", 129 | "import jax\n", 130 | "from jax.config import config as jax_config\n", 131 | "import jax.numpy as jnp\n", 132 | "from jax import grad, jit, vmap\n", 133 | "from jax import random\n", 134 | "\n", 135 | "import flax\n", 136 | "import flax.linen as nn\n", 137 | "from flax import jax_utils\n", 138 | "from flax import optim\n", 139 | "from flax.metrics import tensorboard\n", 140 | "from flax.training import checkpoints\n", 141 | "jax_config.enable_omnistaging() # Linen requires enabling omnistaging\n", 142 | "\n", 143 | "from absl import logging\n", 144 | "from io import BytesIO\n", 145 | "import random as pyrandom\n", 146 | "import numpy as np\n", 147 | "import PIL\n", 148 | "import IPython\n", 149 | "\n", 150 | "\n", 151 | "# Monkey patch logging.\n", 152 | "def myprint(msg, *args, **kwargs):\n", 153 | " print(msg % args)\n", 154 | "\n", 155 | "logging.info = myprint \n", 156 | "logging.warn = myprint\n", 157 | "logging.error = myprint\n", 158 | "\n", 159 | "\n", 160 | "def show_image(image, fmt='png'):\n", 161 | " image = image_utils.image_to_uint8(image)\n", 162 | " f = BytesIO()\n", 163 | " PIL.Image.fromarray(image).save(f, fmt)\n", 164 | " IPython.display.display(IPython.display.Image(data=f.getvalue()))\n", 165 | "\n" 166 | ], 167 | "execution_count": null, 168 | "outputs": [] 169 | }, 170 | { 171 | "cell_type": "markdown", 172 | "metadata": { 173 | "id": "wW7FsSB-jORB" 174 | }, 175 | "source": [ 176 | "## Configuration" 177 | ] 178 | }, 179 | { 180 | "cell_type": "code", 181 | "metadata": { 182 | "id": "rz7wRm7YT9Ka" 183 | }, 184 | "source": [ 185 | "# @title Model and dataset configuration\n", 186 | "\n", 187 | "from pathlib import Path\n", 188 | "from pprint import pprint\n", 189 | "import gin\n", 190 | "from IPython.display import display, Markdown\n", 191 | "\n", 192 | "from hypernerf import models\n", 193 | "from hypernerf import modules\n", 194 | "from hypernerf import warping\n", 195 | "from hypernerf import datasets\n", 196 | "from hypernerf import configs\n", 197 | "\n", 198 | "\n", 199 | "# @markdown The working directory.\n", 200 | "train_dir = '/content/gdrive/My Drive/nerfies/hypernerf_experiments/capture1/exp1' # @param {type: \"string\"}\n", 201 | "# @markdown The directory to the dataset capture.\n", 202 | "data_dir = '/content/gdrive/My Drive/nerfies/captures/capture1' # @param {type: \"string\"}\n", 203 | "\n", 204 | "# @markdown Training configuration.\n", 205 | "max_steps = 100000 # @param {type: 'number'}\n", 206 | "batch_size = 4096 # @param 
{type: 'number'}\n", 207 | "image_scale = 8 # @param {type: 'number'}\n", 208 | "\n", 209 | "# @markdown Model configuration.\n", 210 | "use_viewdirs = True #@param {type: 'boolean'}\n", 211 | "use_appearance_metadata = True #@param {type: 'boolean'}\n", 212 | "num_coarse_samples = 64 # @param {type: 'number'}\n", 213 | "num_fine_samples = 64 # @param {type: 'number'}\n", 214 | "\n", 215 | "# @markdown Deformation configuration.\n", 216 | "use_warp = True #@param {type: 'boolean'}\n", 217 | "warp_field_type = '@SE3Field' #@param['@SE3Field', '@TranslationField']\n", 218 | "warp_min_deg = 0 #@param{type:'number'}\n", 219 | "warp_max_deg = 6 #@param{type:'number'}\n", 220 | "\n", 221 | "# @markdown Hyper-space configuration.\n", 222 | "hyper_num_dims = 8 #@param{type:'number'}\n", 223 | "hyper_point_min_deg = 0 #@param{type:'number'}\n", 224 | "hyper_point_max_deg = 1 #@param{type:'number'}\n", 225 | "hyper_slice_method = 'bendy_sheet' #@param['none', 'axis_aligned_plane', 'bendy_sheet']\n", 226 | "\n", 227 | "\n", 228 | "checkpoint_dir = Path(train_dir, 'checkpoints')\n", 229 | "checkpoint_dir.mkdir(exist_ok=True, parents=True)\n", 230 | "\n", 231 | "config_str = f\"\"\"\n", 232 | "DELAYED_HYPER_ALPHA_SCHED = {{\n", 233 | " 'type': 'piecewise',\n", 234 | " 'schedules': [\n", 235 | " (1000, ('constant', 0.0)),\n", 236 | " (0, ('linear', 0.0, %hyper_point_max_deg, 10000))\n", 237 | " ],\n", 238 | "}}\n", 239 | "\n", 240 | "ExperimentConfig.image_scale = {image_scale}\n", 241 | "ExperimentConfig.datasource_cls = @NerfiesDataSource\n", 242 | "NerfiesDataSource.data_dir = '{data_dir}'\n", 243 | "NerfiesDataSource.image_scale = {image_scale}\n", 244 | "\n", 245 | "NerfModel.use_viewdirs = {int(use_viewdirs)}\n", 246 | "NerfModel.use_rgb_condition = {int(use_appearance_metadata)}\n", 247 | "NerfModel.num_coarse_samples = {num_coarse_samples}\n", 248 | "NerfModel.num_fine_samples = {num_fine_samples}\n", 249 | "\n", 250 | "NerfModel.use_viewdirs = True\n", 251 | "NerfModel.use_stratified_sampling = True\n", 252 | "NerfModel.use_posenc_identity = False\n", 253 | "NerfModel.nerf_trunk_width = 128\n", 254 | "NerfModel.nerf_trunk_depth = 8\n", 255 | "\n", 256 | "TrainConfig.max_steps = {max_steps}\n", 257 | "TrainConfig.batch_size = {batch_size}\n", 258 | "TrainConfig.print_every = 100\n", 259 | "TrainConfig.use_elastic_loss = False\n", 260 | "TrainConfig.use_background_loss = False\n", 261 | "\n", 262 | "# Warp configs.\n", 263 | "warp_min_deg = {warp_min_deg}\n", 264 | "warp_max_deg = {warp_max_deg}\n", 265 | "NerfModel.use_warp = {use_warp}\n", 266 | "SE3Field.min_deg = %warp_min_deg\n", 267 | "SE3Field.max_deg = %warp_max_deg\n", 268 | "SE3Field.use_posenc_identity = False\n", 269 | "NerfModel.warp_field_cls = @SE3Field\n", 270 | "\n", 271 | "TrainConfig.warp_alpha_schedule = {{\n", 272 | " 'type': 'linear',\n", 273 | " 'initial_value': {warp_min_deg},\n", 274 | " 'final_value': {warp_max_deg},\n", 275 | " 'num_steps': {int(max_steps*0.8)},\n", 276 | "}}\n", 277 | "\n", 278 | "# Hyper configs.\n", 279 | "hyper_num_dims = {hyper_num_dims}\n", 280 | "hyper_point_min_deg = {hyper_point_min_deg}\n", 281 | "hyper_point_max_deg = {hyper_point_max_deg}\n", 282 | "\n", 283 | "NerfModel.hyper_embed_cls = @hyper/GLOEmbed\n", 284 | "hyper/GLOEmbed.num_dims = %hyper_num_dims\n", 285 | "NerfModel.hyper_point_min_deg = %hyper_point_min_deg\n", 286 | "NerfModel.hyper_point_max_deg = %hyper_point_max_deg\n", 287 | "\n", 288 | "TrainConfig.hyper_alpha_schedule = %DELAYED_HYPER_ALPHA_SCHED\n", 289 | "\n", 290 | 
"hyper_sheet_min_deg = 0\n", 291 | "hyper_sheet_max_deg = 6\n", 292 | "HyperSheetMLP.min_deg = %hyper_sheet_min_deg\n", 293 | "HyperSheetMLP.max_deg = %hyper_sheet_max_deg\n", 294 | "HyperSheetMLP.output_channels = %hyper_num_dims\n", 295 | "\n", 296 | "NerfModel.hyper_slice_method = '{hyper_slice_method}'\n", 297 | "NerfModel.hyper_sheet_mlp_cls = @HyperSheetMLP\n", 298 | "NerfModel.hyper_use_warp_embed = True\n", 299 | "\n", 300 | "TrainConfig.hyper_sheet_alpha_schedule = ('constant', %hyper_sheet_max_deg)\n", 301 | "\"\"\"\n", 302 | "\n", 303 | "gin.parse_config(config_str)\n", 304 | "\n", 305 | "config_path = Path(train_dir, 'config.gin')\n", 306 | "with open(config_path, 'w') as f:\n", 307 | " logging.info('Saving config to %s', config_path)\n", 308 | " f.write(config_str)\n", 309 | "\n", 310 | "exp_config = configs.ExperimentConfig()\n", 311 | "train_config = configs.TrainConfig()\n", 312 | "eval_config = configs.EvalConfig()\n", 313 | "\n", 314 | "display(Markdown(\n", 315 | " gin.config.markdown(gin.config_str())))" 316 | ], 317 | "execution_count": null, 318 | "outputs": [] 319 | }, 320 | { 321 | "cell_type": "code", 322 | "metadata": { 323 | "id": "r872r6hiVUVS", 324 | "cellView": "form" 325 | }, 326 | "source": [ 327 | "# @title Create datasource and show an example.\n", 328 | "\n", 329 | "from hypernerf import datasets\n", 330 | "from hypernerf import image_utils\n", 331 | "\n", 332 | "dummy_model = models.NerfModel({}, 0, 0)\n", 333 | "datasource = exp_config.datasource_cls(\n", 334 | " image_scale=exp_config.image_scale,\n", 335 | " random_seed=exp_config.random_seed,\n", 336 | " # Enable metadata based on model needs.\n", 337 | " use_warp_id=dummy_model.use_warp,\n", 338 | " use_appearance_id=(\n", 339 | " dummy_model.nerf_embed_key == 'appearance'\n", 340 | " or dummy_model.hyper_embed_key == 'appearance'),\n", 341 | " use_camera_id=dummy_model.nerf_embed_key == 'camera',\n", 342 | " use_time=dummy_model.warp_embed_key == 'time')\n", 343 | "\n", 344 | "show_image(datasource.load_rgb(datasource.train_ids[0]))" 345 | ], 346 | "execution_count": null, 347 | "outputs": [] 348 | }, 349 | { 350 | "cell_type": "code", 351 | "metadata": { 352 | "id": "XC3PIY74XB05", 353 | "cellView": "form" 354 | }, 355 | "source": [ 356 | "# @title Create training iterators\n", 357 | "\n", 358 | "devices = jax.local_devices()\n", 359 | "\n", 360 | "train_iter = datasource.create_iterator(\n", 361 | " datasource.train_ids,\n", 362 | " flatten=True,\n", 363 | " shuffle=True,\n", 364 | " batch_size=train_config.batch_size,\n", 365 | " prefetch_size=3,\n", 366 | " shuffle_buffer_size=train_config.shuffle_buffer_size,\n", 367 | " devices=devices,\n", 368 | ")\n", 369 | "\n", 370 | "def shuffled(l):\n", 371 | " import random as r\n", 372 | " import copy\n", 373 | " l = copy.copy(l)\n", 374 | " r.shuffle(l)\n", 375 | " return l\n", 376 | "\n", 377 | "train_eval_iter = datasource.create_iterator(\n", 378 | " shuffled(datasource.train_ids), batch_size=0, devices=devices)\n", 379 | "val_eval_iter = datasource.create_iterator(\n", 380 | " shuffled(datasource.val_ids), batch_size=0, devices=devices)" 381 | ], 382 | "execution_count": null, 383 | "outputs": [] 384 | }, 385 | { 386 | "cell_type": "markdown", 387 | "metadata": { 388 | "id": "erY9l66KjYYW" 389 | }, 390 | "source": [ 391 | "## Training" 392 | ] 393 | }, 394 | { 395 | "cell_type": "code", 396 | "metadata": { 397 | "id": "nZnS8BhcXe5E", 398 | "cellView": "form" 399 | }, 400 | "source": [ 401 | "# @title Initialize model\n", 402 | "# @markdown 
Defines the model and initializes its parameters.\n", 403 | "\n", 404 | "from flax.training import checkpoints\n", 405 | "from hypernerf import models\n", 406 | "from hypernerf import model_utils\n", 407 | "from hypernerf import schedules\n", 408 | "from hypernerf import training\n", 409 | "\n", 410 | "# @markdown Restore a checkpoint if one exists.\n", 411 | "restore_checkpoint = False # @param{type:'boolean'}\n", 412 | "\n", 413 | "\n", 414 | "rng = random.PRNGKey(exp_config.random_seed)\n", 415 | "np.random.seed(exp_config.random_seed + jax.process_index())\n", 416 | "devices_to_use = jax.devices()\n", 417 | "\n", 418 | "learning_rate_sched = schedules.from_config(train_config.lr_schedule)\n", 419 | "nerf_alpha_sched = schedules.from_config(train_config.nerf_alpha_schedule)\n", 420 | "warp_alpha_sched = schedules.from_config(train_config.warp_alpha_schedule)\n", 421 | "elastic_loss_weight_sched = schedules.from_config(\n", 422 | "train_config.elastic_loss_weight_schedule)\n", 423 | "hyper_alpha_sched = schedules.from_config(train_config.hyper_alpha_schedule)\n", 424 | "hyper_sheet_alpha_sched = schedules.from_config(\n", 425 | " train_config.hyper_sheet_alpha_schedule)\n", 426 | "\n", 427 | "rng, key = random.split(rng)\n", 428 | "params = {}\n", 429 | "model, params['model'] = models.construct_nerf(\n", 430 | " key,\n", 431 | " batch_size=train_config.batch_size,\n", 432 | " embeddings_dict=datasource.embeddings_dict,\n", 433 | " near=datasource.near,\n", 434 | " far=datasource.far)\n", 435 | "\n", 436 | "optimizer_def = optim.Adam(learning_rate_sched(0))\n", 437 | "optimizer = optimizer_def.create(params)\n", 438 | "\n", 439 | "state = model_utils.TrainState(\n", 440 | " optimizer=optimizer,\n", 441 | " nerf_alpha=nerf_alpha_sched(0),\n", 442 | " warp_alpha=warp_alpha_sched(0),\n", 443 | " hyper_alpha=hyper_alpha_sched(0),\n", 444 | " hyper_sheet_alpha=hyper_sheet_alpha_sched(0))\n", 445 | "scalar_params = training.ScalarParams(\n", 446 | " learning_rate=learning_rate_sched(0),\n", 447 | " elastic_loss_weight=elastic_loss_weight_sched(0),\n", 448 | " warp_reg_loss_weight=train_config.warp_reg_loss_weight,\n", 449 | " warp_reg_loss_alpha=train_config.warp_reg_loss_alpha,\n", 450 | " warp_reg_loss_scale=train_config.warp_reg_loss_scale,\n", 451 | " background_loss_weight=train_config.background_loss_weight,\n", 452 | " hyper_reg_loss_weight=train_config.hyper_reg_loss_weight)\n", 453 | "\n", 454 | "if restore_checkpoint:\n", 455 | " logging.info('Restoring checkpoint from %s', checkpoint_dir)\n", 456 | " state = checkpoints.restore_checkpoint(checkpoint_dir, state)\n", 457 | "step = state.optimizer.state.step + 1\n", 458 | "state = jax_utils.replicate(state, devices=devices)\n", 459 | "del params" 460 | ], 461 | "execution_count": null, 462 | "outputs": [] 463 | }, 464 | { 465 | "cell_type": "code", 466 | "metadata": { 467 | "id": "at2CL5DRZ7By", 468 | "cellView": "form" 469 | }, 470 | "source": [ 471 | "# @title Define pmapped functions\n", 472 | "# @markdown This parallelizes the training and evaluation step functions using `jax.pmap`.\n", 473 | "\n", 474 | "import functools\n", 475 | "from hypernerf import evaluation\n", 476 | "\n", 477 | "\n", 478 | "def _model_fn(key_0, key_1, params, rays_dict, extra_params):\n", 479 | " out = model.apply({'params': params},\n", 480 | " rays_dict,\n", 481 | " extra_params=extra_params,\n", 482 | " rngs={\n", 483 | " 'coarse': key_0,\n", 484 | " 'fine': key_1\n", 485 | " },\n", 486 | " mutable=False)\n", 487 | " return jax.lax.all_gather(out, 
axis_name='batch')\n", 488 | "\n", 489 | "pmodel_fn = jax.pmap(\n", 490 | " # Note rng_keys are useless in eval mode since there's no randomness.\n", 491 | " _model_fn,\n", 492 | " in_axes=(0, 0, 0, 0, 0), # Only distribute the data input.\n", 493 | " devices=devices_to_use,\n", 494 | " axis_name='batch',\n", 495 | ")\n", 496 | "\n", 497 | "render_fn = functools.partial(evaluation.render_image,\n", 498 | " model_fn=pmodel_fn,\n", 499 | " device_count=len(devices),\n", 500 | " chunk=eval_config.chunk)\n", 501 | "train_step = functools.partial(\n", 502 | " training.train_step,\n", 503 | " model,\n", 504 | " elastic_reduce_method=train_config.elastic_reduce_method,\n", 505 | " elastic_loss_type=train_config.elastic_loss_type,\n", 506 | " use_elastic_loss=train_config.use_elastic_loss,\n", 507 | " use_background_loss=train_config.use_background_loss,\n", 508 | " use_warp_reg_loss=train_config.use_warp_reg_loss,\n", 509 | " use_hyper_reg_loss=train_config.use_hyper_reg_loss,\n", 510 | ")\n", 511 | "ptrain_step = jax.pmap(\n", 512 | " train_step,\n", 513 | " axis_name='batch',\n", 514 | " devices=devices,\n", 515 | " # rng_key, state, batch, scalar_params.\n", 516 | " in_axes=(0, 0, 0, None),\n", 517 | " # Treat use_elastic_loss as compile-time static.\n", 518 | " donate_argnums=(2,), # Donate the 'batch' argument.\n", 519 | ")" 520 | ], 521 | "execution_count": null, 522 | "outputs": [] 523 | }, 524 | { 525 | "cell_type": "code", 526 | "metadata": { 527 | "id": "vbc7cMr5aR_1", 528 | "cellView": "form" 529 | }, 530 | "source": [ 531 | "# @title Train!\n", 532 | "# @markdown This runs the training loop!\n", 533 | "\n", 534 | "import mediapy\n", 535 | "from hypernerf import utils\n", 536 | "from hypernerf import visualization as viz\n", 537 | "\n", 538 | "\n", 539 | "print_every_n_iterations = 100 # @param{type:'number'}\n", 540 | "visualize_results_every_n_iterations = 500 # @param{type:'number'}\n", 541 | "save_checkpoint_every_n_iterations = 1000 # @param{type:'number'}\n", 542 | "\n", 543 | "\n", 544 | "logging.info('Starting training')\n", 545 | "rng = rng + jax.process_index() # Make random seed separate across hosts.\n", 546 | "keys = random.split(rng, len(devices))\n", 547 | "time_tracker = utils.TimeTracker()\n", 548 | "time_tracker.tic('data', 'total')\n", 549 | "\n", 550 | "for step, batch in zip(range(step, train_config.max_steps + 1), train_iter):\n", 551 | " time_tracker.toc('data')\n", 552 | " scalar_params = scalar_params.replace(\n", 553 | " learning_rate=learning_rate_sched(step),\n", 554 | " elastic_loss_weight=elastic_loss_weight_sched(step))\n", 555 | " # pytype: enable=attribute-error\n", 556 | " nerf_alpha = jax_utils.replicate(nerf_alpha_sched(step), devices)\n", 557 | " warp_alpha = jax_utils.replicate(warp_alpha_sched(step), devices)\n", 558 | " hyper_alpha = jax_utils.replicate(hyper_alpha_sched(step), devices)\n", 559 | " hyper_sheet_alpha = jax_utils.replicate(\n", 560 | " hyper_sheet_alpha_sched(step), devices)\n", 561 | " state = state.replace(nerf_alpha=nerf_alpha,\n", 562 | " warp_alpha=warp_alpha,\n", 563 | " hyper_alpha=hyper_alpha,\n", 564 | " hyper_sheet_alpha=hyper_sheet_alpha)\n", 565 | "\n", 566 | " with time_tracker.record_time('train_step'):\n", 567 | " state, stats, keys, _ = ptrain_step(keys, state, batch, scalar_params)\n", 568 | " time_tracker.toc('total')\n", 569 | "\n", 570 | " if step % print_every_n_iterations == 0:\n", 571 | " logging.info(\n", 572 | " 'step=%d, warp_alpha=%.04f, hyper_alpha=%.04f, hyper_sheet_alpha=%.04f, %s',\n", 573 | " step, 
\n", 574 | " warp_alpha_sched(step), \n", 575 | " hyper_alpha_sched(step), \n", 576 | " hyper_sheet_alpha_sched(step), \n", 577 | " time_tracker.summary_str('last'))\n", 578 | " coarse_metrics_str = ', '.join(\n", 579 | " [f'{k}={v.mean():.04f}' for k, v in stats['coarse'].items()])\n", 580 | " fine_metrics_str = ', '.join(\n", 581 | " [f'{k}={v.mean():.04f}' for k, v in stats['fine'].items()])\n", 582 | " logging.info('\\tcoarse metrics: %s', coarse_metrics_str)\n", 583 | " if 'fine' in stats:\n", 584 | " logging.info('\\tfine metrics: %s', fine_metrics_str)\n", 585 | " \n", 586 | " if step % visualize_results_every_n_iterations == 0:\n", 587 | " print(f'[step={step}] Training set visualization')\n", 588 | " eval_batch = next(train_eval_iter)\n", 589 | " render = render_fn(state, eval_batch, rng=rng)\n", 590 | " rgb = render['rgb']\n", 591 | " acc = render['acc']\n", 592 | " depth_exp = render['depth']\n", 593 | " depth_med = render['med_depth']\n", 594 | " rgb_target = eval_batch['rgb']\n", 595 | " depth_med_viz = viz.colorize(depth_med, cmin=datasource.near, cmax=datasource.far)\n", 596 | " mediapy.show_images([rgb_target, rgb, depth_med_viz],\n", 597 | " titles=['GT RGB', 'Pred RGB', 'Pred Depth'])\n", 598 | "\n", 599 | " print(f'[step={step}] Validation set visualization')\n", 600 | " eval_batch = next(val_eval_iter)\n", 601 | " render = render_fn(state, eval_batch, rng=rng)\n", 602 | " rgb = render['rgb']\n", 603 | " acc = render['acc']\n", 604 | " depth_exp = render['depth']\n", 605 | " depth_med = render['med_depth']\n", 606 | " rgb_target = eval_batch['rgb']\n", 607 | " depth_med_viz = viz.colorize(depth_med, cmin=datasource.near, cmax=datasource.far)\n", 608 | " mediapy.show_images([rgb_target, rgb, depth_med_viz],\n", 609 | " titles=['GT RGB', 'Pred RGB', 'Pred Depth'])\n", 610 | "\n", 611 | " if step % save_checkpoint_every_n_iterations == 0:\n", 612 | " training.save_checkpoint(checkpoint_dir, state)\n", 613 | "\n", 614 | " time_tracker.tic('data', 'total')\n" 615 | ], 616 | "execution_count": null, 617 | "outputs": [] 618 | }, 619 | { 620 | "cell_type": "code", 621 | "metadata": { 622 | "id": "o69auGWvdyyd" 623 | }, 624 | "source": [ 625 | "" 626 | ], 627 | "execution_count": null, 628 | "outputs": [] 629 | } 630 | ] 631 | } -------------------------------------------------------------------------------- /colabs/gaussian_splatting_colab.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": { 7 | "id": "VjYy0F2gZIPR" 8 | }, 9 | "outputs": [], 10 | "source": [ 11 | "%cd /content\n", 12 | "!git clone --recursive https://github.com/camenduru/gaussian-splatting\n", 13 | "!pip install -q plyfile\n", 14 | "\n", 15 | "%cd /content/gaussian-splatting\n", 16 | "!pip install -q https://huggingface.co/camenduru/gaussian-splatting/resolve/main/diff_gaussian_rasterization-0.0.0-cp310-cp310-linux_x86_64.whl\n", 17 | "!pip install -q https://huggingface.co/camenduru/gaussian-splatting/resolve/main/simple_knn-0.0.0-cp310-cp310-linux_x86_64.whl\n", 18 | "\n", 19 | "!wget https://huggingface.co/camenduru/gaussian-splatting/resolve/main/tandt_db.zip\n", 20 | "!unzip tandt_db.zip\n", 21 | "\n", 22 | "!python train.py -s /content/gaussian-splatting/tandt/train\n", 23 | "\n", 24 | "# !wget https://huggingface.co/camenduru/gaussian-splatting/resolve/main/GaussianViewTest.zip\n", 25 | "# !unzip GaussianViewTest.zip\n", 26 | "# !python render.py -m 
/content/gaussian-splatting/GaussianViewTest/model\n", 27 | "# !ffmpeg -framerate 3 -i /content/gaussian-splatting/GaussianViewTest/model/train/ours_30000/renders/%05d.png -vf \"pad=ceil(iw/2)*2:ceil(ih/2)*2\" -c:v libx264 -r 3 -pix_fmt yuv420p /content/renders.mp4\n", 28 | "# !ffmpeg -framerate 3 -i /content/gaussian-splatting/GaussianViewTest/model/train/ours_30000/gt/%05d.png -vf \"pad=ceil(iw/2)*2:ceil(ih/2)*2\" -c:v libx264 -r 3 -pix_fmt yuv420p /content/gt.mp4 -y" 29 | ] 30 | } 31 | ], 32 | "metadata": { 33 | "accelerator": "GPU", 34 | "colab": { 35 | "gpuType": "T4", 36 | "provenance": [] 37 | }, 38 | "kernelspec": { 39 | "display_name": "Python 3", 40 | "name": "python3" 41 | }, 42 | "language_info": { 43 | "name": "python" 44 | } 45 | }, 46 | "nbformat": 4, 47 | "nbformat_minor": 0 48 | } -------------------------------------------------------------------------------- /dynamic.md: -------------------------------------------------------------------------------- 1 | # Dynamic NeRF 2 | 3 | **Verified**: Papers listed with ```[+]``` have been verfied by myself or colleagues. The code is runnable. Please leave an issue if you need help on setting up. 4 | 5 | # 1. Datasets 6 | ## Custom Data Preparation 7 | - [Monocular Dynamic View Synthesis: A Reality Check](https://github.com/KAIR-BAIR/dycheck/blob/main/docs/RECORD3D_CAPTURE.md) 8 | - [Process a video into a Nerfie dataset](https://colab.research.google.com/github/google/nerfies/blob/main/notebooks/Nerfies_Capture_Processing.ipynb) 9 | - [Robust Dynamic Radiance Fields](https://github.com/facebookresearch/robust-dynrf) 10 | Estimate monocular depth, Predict optical flows, Obtain motion mask. 11 | - [Neural Scene Flow Fields](https://github.com/zhengqili/Neural-Scene-Flow-Fields/tree/main) 12 | Instructions for custom data. 13 | 14 | ### Synthetic 15 | - [D-Nerf Dataset](https://www.albertpumarola.com/research/D-NeRF/index.html) 16 | 17 | 18 | ### Real 19 | - [Plenoptic Dataset](https://github.com/facebookresearch/Neural_3D_Video/releases/tag/v1.0) 20 | - [Hypernerf Dataset](https://github.com/google/hypernerf/releases/tag/v0.1) 21 | - [Nerfies Dataset](https://github.com/google/nerfies/releases/download/0.1/nerfies-vrig-dataset-v0.1.zip) 22 | - [Dynamic NeRF](https://github.com/gaochen315/DynamicNeRF) 23 | Balloon1, Balloon2, Jumping, Playground, Skating, Truck, Umbrella 24 | 25 | # 2. Papers 26 | ## 2024 27 | - Dynamic 3D Gaussians: Tracking by Persistent Dynamic View Synthesis, Luiten et. al., International Conference on 3D Vision (3DV), 2024. [[Paper](https://dynamic3dgaussians.github.io/paper.pdf) | [Project Page](https://dynamic3dgaussians.github.io/) | [Code](https://github.com/JonathonLuiten/Dynamic3DGaussians) | [Explanation Video](https://www.youtube.com/live/hDuy1TgD8I4?si=6oGN0IYnPRxOibpg)] 28 | - Sync-NeRF : Generalizing Dynamic NeRFs to Unsynchronized Videos, AAAI 2024. [[Paper](https://arxiv.org/abs/2310.13356), [Code](https://github.com/seoha-kim/Sync-NeRF)] 29 | - Endo-4DGS: Endoscopic Monocular Scene Reconstruction with 4D Gaussian Splatting, [[Paper](https://arxiv.org/abs/2401.16416) | [Code](https://github.com/lastbasket/Endo-4DGS)] 30 | - DaReNeRF: Direction-aware Representation for Dynamic Scenes, CVPR 2024 31 | - Sync-NeRF: Generalizing Dynamic NeRFs to Unsynchronized Videos, AAAI2024. [Code](https://github.com/seoha-kim/Sync-NeRF) 32 | - SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes. 
[Code](https://github.com/yihua7/SC-GS) 33 | - GaussianFlow: Splatting Gaussian Dynamics for 4D Content Creation 34 | - Entity-NeRF: Detecting and Removing Moving Entities in Urban Scenes, CVPR 2024. [Project](https://otonari726.github.io/entitynerf/) 35 | - Ced-NeRF: A Compact and Efficient Method for Dynamic Neural Radiance Fields, AAAI 2024. [Paper](https://ojs.aaai.org/index.php/AAAI/article/view/28138) 36 | - 3D Geometry-aware Deformable Gaussian Splatting for Dynamic View Synthesis, CVPR 2024. [Project](https://npucvr.github.io/GaGS/) 37 | 38 | - FPO++: efficient encoding and rendering of dynamic neural radiance fields by analyzing and enhancing Fourier PlenOctrees, The Visual Computer, 2024. 39 | - Evdnerf: Reconstructing event data with dynamic neural radiance fields, WACV 2024. [Code](https://github.com/anish-bhattacharya/EvDNeRF) 40 | - CTNeRF: Cross-time Transformer for dynamic neural radiance field from monocular video, Pattern Recognition, 2024. [Code](https://github.com/xingy038/ctnerf) 41 | - DynamicSurf: Dynamic Neural RGB-D Surface Reconstruction with an Optimizable Feature Grid, International Conference on 3D Vision (3DV) 2024. [Code](https://github.com/Mirgahney/dynsurf) 42 | - [+] Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis, CVPR 2024. [Code](https://github.com/oppo-us-research/SpacetimeGaussians) 43 | 44 | ## 2023 45 | - DynIBaR: Neural Dynamic Image-Based Rendering, CVPR, 2023 [[Project Page](https://dynibar.github.io/)] 46 | - Tensor4D : Efficient Neural 4D Decomposition for High-fidelity Dynamic Reconstruction and Rendering, Shao et. al., CVPR, 2023. [[Paper](https://arxiv.org/abs/2211.11610) | [Code](https://github.com/DSaurus/Tensor4D)] 47 | - HyperReel: High-Fidelity 6-DoF Video with Ray-Conditioned Sampling, CVPR 2023 (Highlight). [Code](https://github.com/facebookresearch/hyperreel) 48 | - HexPlane: A Fast Representation for Dynamic Scenes, Cao et. al., CVPR, 2023. [[Paper](https://caoang327.github.io/HexPlane/HexPlane.pdf) | [Project Page](https://caoang327.github.io/HexPlane/) | [Code](https://github.com/Caoang327/HexPlane)] 49 | - Robust Dynamic Radiance Fields, Liu et. al., CVPR, 2023. [[Code](https://github.com/facebookresearch/robust-dynrf)] 50 | - V4D: Voxel for 4D Novel View Synthesis, Gan et. al., IEEE Transactions on Visualization and Computer Graphics, 2023. [[Paper](https://arxiv.org/abs/2205.14332) | [Code](https://github.com/GANWANSHUI/V4D)] (instructions for custom data) 51 | - Dynamic Mesh-Aware Radiance Fields, ICCV, 2023. [[Project Page](https://mesh-aware-rf.github.io/) | [Code](https://github.com/YilingQiao/DMRF)] 52 | - NeRFPlayer: A Streamable Dynamic Scene Representation with Decomposed Neural Radiance Fields, IEEE Transactions on Visualization and Computer Graphics, vol 29(5), 2023. [[Code](https://github.com/lsongx/nerfplayer-nerfstudio)] 53 | - Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction, Yang et. al., ACM Transactions on Graphics, 2023. [[Paper](https://arxiv.org/pdf/2309.13101.pdf) | [Project Page](https://ingra14m.github.io/Deformable-Gaussians/) | [Code](https://github.com/ingra14m/Deformable-3D-Gaussians)] 54 | - V4d: Voxel for 4d novel view synthesis, Gan et. al., IEEE Transactions on Visualization and Computer Graphics, 2023. [[Code](https://github.com/GANWANSHUI/V4D)] 55 | - MixVoxels: Mixed Neural Voxels for Fast Multi-view Video Synthesis, ICCV2023 Oral. 
[Code](https://github.com/fengres/mixvoxels) 56 | - HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video, ICCV2023. [Code](https://github.com/TencentARC/HOSNeRF) 57 | - DynPoint: Dynamic Neural Point For View Synthesis, NeurIPS 2023. 58 | 59 | ## 2022 60 | - Fourier PlenOctrees for Dynamic Radiance Field Rendering in Real-time, CVPR 2022 [[Project Page](https://aoliao12138.github.io/FPO/)] 61 | - D2NeRF: Self-Supervised Decoupling of Dynamic and Static Objects from a Monocular Video, NeurIPS, 2022. [[Project Page](https://d2nerf.github.io/) | [Code](https://github.com/ChikaYan/d2nerf)] 62 | - Monocular Dynamic View Synthesis: A Reality Check, Gao et. al., Neurips 2022. [[Project Page](https://hangg7.com/dycheck/)] 63 | - TiNeuVox: Fast Dynamic Radiance Fields with Time-Aware Neural Voxels, Fang et. al., ACM SIGGRAPH Asia 2022. [[Project Page](https://jaminfong.cn/tineuvox/) | [Code](https://github.com/hustvl/TiNeuVox)] 64 | - Fourier PlenOctrees for Dynamic Radiance Field Rendering in Real-time, CVPR 2022. [Project](https://aoliao12138.github.io/FPO/) 65 | 66 | ## 2021 67 | - Nerfies: Deformable Neural Radiance Fields, ICCV, 2021. [[Code](https://github.com/google/nerfies)] (instructions for **custom data**, this is the one everyone refering to) 68 | - Dynamic View Synthesis from Dynamic Monocular Video, ICCV, 2021. [[Code](https://github.com/gaochen315/DynamicNeRF)] 69 | - HyperNeRF: A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields, ACM Trans. Graph, 2021. [[Code](https://github.com/google/hyperNeRF) | [Project Page](https://hypernerf.github.io/) | [Colab](./colabs/HyperNerf.ipynb)] (instructions for custom data) 70 | - BARF: Bundle-Adjusting Neural Radiance Fields, Lin et. al., ICCV 2021 (Oral). [[Code](https://github.com/chenhsuanlin/bundle-adjusting-NeRF)] 71 | 72 | ## 2020 73 | - D-NeRF: Neural Radiance Fields for Dynamic Scenes, Pumarola et. al, CVPR 2020. [[Project Page](https://www.albertpumarola.com/research/D-NeRF/index.html) | [Code](https://github.com/albertpumarola/D-NeRF)] -------------------------------------------------------------------------------- /generative.md: -------------------------------------------------------------------------------- 1 | 2 | ## 2023 3 | - Wonder3D: Single Image to 3D using Cross-Domain Diffusion, [[Project Page](https://www.xxlong.site/Wonder3D/?ref=aiartweekly) | [Code](https://github.com/xxlong0/Wonder3D)] -------------------------------------------------------------------------------- /images/cvpr2024.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pdaicode/awesome-3dgs/da7cfa4e1f8edc84be629bb0f6de78b962269b7f/images/cvpr2024.png -------------------------------------------------------------------------------- /nerf.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | ## Projects 4 | - 5 | 6 | ## Original 7 | - NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis, Mildenhall et al., ECCV 2020. [[paper](https://www.matthewtancik.com/nerf) | [github](https://github.com/bmild/nerf)] 8 | 9 | ## Papers 10 | ### 2024 11 | - How Far Can We Compress Instant-NGP-Based NeRF? CVPR 2024. [[Project](https://yihangchen-ee.github.io/project_cnc/) | [code](https://github.com/YihangChen-ee/CNC)] 12 | - FastSR-NeRF: Improving NeRF Efficiency on Consumer Devices With a Simple Super-Resolution Pipeline, WACV 2024. 
[[Paper](https://openaccess.thecvf.com/content/WACV2024/papers/Lin_FastSR-NeRF_Improving_NeRF_Efficiency_on_Consumer_Devices_With_a_Simple_WACV_2024_paper.pdf)] 13 | - HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces, CVPR 2024. [Project](https://haithemturki.com/hybrid-nerf/) 14 | 15 | ### 2023 16 | - Adaptive Shells for Efficient Neural Radiance Field Rendering, Zian Wang, et. al., SIGGRAPH Asia 2023. [[paper](https://nv-tlabs.github.io/adaptive-shells-website/assets/adaptiveShells_paper.pdf) | [Project](https://research.nvidia.com/labs/toronto-ai/adaptive-shells/)] 17 | - MERF - Memory-Efficient Radiance Fields for Real-time View Synthesis in Unbounded Scenes, SIGGRAPH 2023. [[Project](https://creiser.github.io/merf/) | [code](https://github.com/google-research/google-research/tree/master/merf)] -------------------------------------------------------------------------------- /review.md: -------------------------------------------------------------------------------- 1 | # A New Era of Neural Rendering: A Review of 3D Gaussian Splatting 2 | 3 | Co-authored by GPT-4o and Co-Pilot 4 | 5 | ## Abstract 6 | 7 | ## Introduction 8 | 3D Gaussian Splatting (3D-GS) has emerged as a leading technique in computer graphics, especially for 3D rendering. Recent research [Kerbl et al., 2023; Lu et al., 2023; Yu et al., 2023a] highlights its efficiency in rendering complex scenes with high detail. By representing objects and surfaces using collections of Gaussians, 3D-GS allows for accurate and efficient geometry and appearance representation [Guédon and Lepetit, 2023]. This method overcomes the limitations of traditional volume rendering, offering flexibility and adaptability [Kerbl et al., 2023]. Additionally, 3D-GS enables realistic visual effects like depth-of-field and soft shadows, making it a valuable tool in graphics research and applications [Chung et al., 2023a]. 9 | 10 | 3D Gaussian splatting is a sophisticated technique used in computer graphics to represent complex surfaces and volumes using a series of Gaussian functions. This method effectively approximates shapes and textures by distributing Gaussian "splats" over a 3D space, creating a smooth and continuous representation of objects. The versatility and precision of Gaussian splatting make it particularly valuable in mixed reality (MR) applications, where realistic and immersive visual experiences are crucial. 11 | 12 | ## Background 13 | 14 | ## Methodology 15 | 16 | ## Applications 17 | ### Implementations 18 | 19 | ### MR 20 | 21 | ## Discussion and Future Work 22 | 23 | ## Conclusion 24 | 25 | ## Reference 26 | - 3D GAUSSIAN AS A NEW VISION ERA: A SURVEY, 2024. [pdf](https://arxiv.org/pdf/2402.07181) -------------------------------------------------------------------------------- /vidgen.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | ## 2024 4 | - Portrait4D: Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data, CVPR 2024. [[Project](https://yudeng.github.io/Portrait4D/) | [Code](https://github.com/YuDeng/Portrait-4D)] 5 | - Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer, ECCV 2024. [[Code](https://github.com/YuDeng/Portrait-4D) | [Huggingface](https://huggingface.co/posts/DmitryRyumin/891674447263162)] 6 | - Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation, CVPR 2024. 
[Project](https://humanaigc.github.io/animate-anyone/) 7 | - Pose Adapted Shape Learning for Large-Pose Face Reenactment, CVPR 2024. 8 | - REFA: Real-time Egocentric Facial Animations for Virtual Reality, CVPR 2024. 9 | - Locally Adaptive Neural 3D Morphable Models, CVPR 2024. [[Code](https://github.com/michaeltrs/LAMM)] 10 | - Morphable Diffusion: 3D-Consistent Diffusion for Single-image Avatar Creation, CVPR 2024. [**NeRF**] [[Project](https://xiyichen.github.io/morphablediffusion/)] 11 | 12 | ## 2023 and before 13 | - Implicit Neural Head Synthesis via Controllable Local Deformation Fields, CVPR 2023. [**NeRF**] [[Project](https://imaging.cs.cmu.edu/local_deformation_fields/)] 14 | - DeepFaceLab: Integrated, flexible and extensible face-swapping framework, 2020. [[Paper](https://arxiv.org/abs/2005.05535)] 15 | - First Order Motion Model for Image Animation, NeurIPS 2019. [[Project](https://aliaksandrsiarohin.github.io/first-order-model-website/) | [Code](https://github.com/AliaksandrSiarohin/first-order-model)] 16 | 17 | ## Demos 18 | - CatVTON: Concatenation Is All You Need for **Virtual Try-On** with Diffusion Models [Huggingface](https://huggingface.co/spaces/zhengchong/CatVTON) 19 | - [CogVideo](https://github.com/THUDM/CogVideo/tree/main) 20 | - [FLUX](https://github.com/black-forest-labs/flux?tab=readme-ov-file#usage) --------------------------------------------------------------------------------