├── .gitignore ├── LICENSE ├── README.md ├── parse.py └── topics ├── 1_fundamentals.md ├── 2_images.md └── 3_others.md /.gitignore: -------------------------------------------------------------------------------- 1 | *.pptx 2 | *.html 3 | .DS_Store -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2023 cvpr2023-tutorial-diffusion-models 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Papers in the CVPR 2023 Diffusion Model Tutorial 2 | 3 | This repository contains a list of papers to include in the CVPR 2023 tutorial ["Denoising Diffusion Models: 4 | A Generative Learning Big Bang"](https://cvpr2023-tutorial-diffusion-models.github.io/), by Jiaming Song, Chenlin Meng, and Arash Vahdat. 5 | 6 | As the number of papers is growing quite rapidly, it is impossible to list or even carefully read every paper in this field. 7 | Therefore, we take an "RLHF" approach to our tutorial and welcome any community feedback, to ensure we don't miss interesting / important works by accident (could be yours, could be others'). 8 | 9 | ## Disclaimers 10 | 11 | 1. The goal of the tutorial is to provide a **high-level introduction** to researchers who are not familiar with diffusion models, or who wish to become more familiar with their more recent developments related to CVPR. 12 | 2. The field is growing at a near-exponential rate[^1]. Therefore, we have to consider **a curated list of selected papers**. We do not aim to make another paper tracker on diffusion models[^2]. 13 | 3. Even in the curated list, your paper might not get highlighted (i.e., taking at least 3 mins in the tutorial) in the 3-hour talk. 14 | 4. Given the time constraint (1 hour x 3 sections), we can highlight no more than 20 papers in each hour of the tutorial. 15 | 5. **We will make the final decisions regarding which papers get highlighted**, and we try to lean toward papers with novel yet highly practical ideas. 16 | 6. That being said, we will list all the papers contributed here in our slides, and strive to give a "shout out" to many of them (e.g., one or two sentences listing their connections to highlighted work), as long as they are related to the topics in the tutorial. 17 | 7. 
Although we use the term "paper", it does not have to be a paper at all. The community has lots of amazing ideas that are not always presented as a paper, and these are worth highlighting as well (an example is negative prompting). 18 | 8. The deadline for contributions is **June 9th**, around ten days before the actual tutorial. 19 | 20 | [^1]: The awesome diffusion models repo (https://github.com/heejkoo/Awesome-Diffusion-Models) already lists more than 1300 papers on the topic, and that is not even all of them! 21 | [^2]: Several trackers are here: https://github.com/heejkoo/Awesome-Diffusion-Models (about 1300 papers as of writing), https://vsehwag.github.io/blog/2023/2/all_papers_on_diffusion.html (about 500 papers), https://scorebasedgenerativemodeling.github.io/ (about 800 papers). 22 | 23 | ## Contributions 24 | 25 | We welcome all types of contributions, mostly in the form of papers under certain topics relevant to CVPR. For now, these include: 26 | 27 | - [**Fundamentals [Arash]**](https://github.com/cvpr2023-tutorial-diffusion-models/papers/blob/main/topics/1_fundamentals.md): these include methods that can be applied to general domains, such as training, sampling, guidance. 28 | - [**Applications on natural images [Chenlin]**](https://github.com/cvpr2023-tutorial-diffusion-models/papers/blob/main/topics/2_images.md): these include applications that stem from natural images but can also be applied to other domains, such as architecture, editing, personalization, fine-tuning, "low-level" vision. 29 | - [**Applications on other domains [Jiaming]**](https://github.com/cvpr2023-tutorial-diffusion-models/papers/blob/main/topics/3_others.md): these include applications from other domains, such as video, 3D, motion, large content generation, inverse problems for medical domains, etc. 30 | 31 | ### New topics 32 | If you think a paper does not fall into any of the topics that we listed, please raise an issue. 
33 | 34 | ### New papers 35 | If you want to contribute new "papers", please create a [pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork). 36 | 37 | See [this](https://github.com/cvpr2023-tutorial-diffusion-models/papers/pull/1) as an example. Generally, we will accept your pull request as long as it is a relevant paper -- this does not guarantee it being "highlighted", but we will try our best to read the paper (hopefully we don't get 1k+ PRs). 38 | 39 | Generally, we recommend adding papers in "Awesome Diffusion Models" format. If you are the author and you have some slides for the work, you can create a link to the slides as well. We will compile them to [the website](https://cvpr2023-tutorial-diffusion-models.github.io/) after the tutorial. 40 | 41 | ### Questions or suggestions 42 | 43 | Please create an issue or send an email to jiaming [dot] tsong [at] "Google's personal email service in the US" [dot] com. 44 | -------------------------------------------------------------------------------- /parse.py: -------------------------------------------------------------------------------- 1 | import re 2 | import argparse 3 | 4 | 5 | def main(): 6 | """ 7 | Convert the awesome diffusion markdown format into HTML. 
8 | """ 9 | parser = argparse.ArgumentParser() 10 | parser.add_argument('--fin', type=str, help="input markdown") 11 | parser.add_argument('--fout', type=str, help="output html") 12 | 13 | args = parser.parse_args() 14 | html = [] 15 | 16 | with open(f'{args.fin}', 'r') as f: 17 | lines = f.readlines() 18 | filtered_lines = [line for line in lines if not line.startswith("#")] 19 | lines = ''.join(filtered_lines) 20 | all_lines = lines.split('**') 21 | l = len(all_lines) // 2 22 | 23 | for i in range(0, l): 24 | title = all_lines[i*2+1] 25 | rest = all_lines[i*2+2] 26 | authors = rest.split('\n')[1] 27 | authors = re.sub(r'\*+', '', authors) 28 | authors = re.sub(r'\\+', '', authors) 29 | authors = re.sub(r'1', '', authors) 30 | authors = authors.strip().split(',') 31 | if len(authors) > 2: 32 | authors = authors[0].split()[-1] + ' et al.' 33 | else: 34 | authors = ' and '.join([a.split()[-1] for a in authors]) 35 | conference = rest.split('\n')[2].split('.')[0].split('(')[0].strip() 36 | 37 | paper_link= rest.split('\n')[2].split('[Paper]') 38 | if len(paper_link) < 2: 39 | paper_link = '' 40 | else: 41 | paper_link = paper_link[1].split(']')[0][1:-1] 42 | 43 | html.append(f'{authors}, "{title}", {conference}<br>')
44 | 45 | html = '\n'.join(html) 46 | 47 | html = f""" 48 | 49 | 50 | 51 | My HTML Page 52 | 53 | 54 | {html} 55 | 56 | """ 57 | 58 | print(args.fout) 59 | 60 | with open(f'{args.fout}', 'w') as f: 61 | f.write(html) 62 | 63 | 64 | if __name__ == '__main__': 65 | main() -------------------------------------------------------------------------------- /topics/1_fundamentals.md: -------------------------------------------------------------------------------- 1 | # Fundamentals of Diffusion Models 2 | 3 | ## Training 4 | 5 | **Denoising Diffusion Probabilistic Models** \ 6 | *Jonathan Ho, Ajay Jain, Pieter Abbeel* \ 7 | NeurIPS 2020. [[Paper](https://arxiv.org/abs/2006.11239)] [[Github](https://github.com/hojonathanho/diffusion)] [[Github2](https://github.com/pesser/pytorch_diffusion)] \ 8 | 19 Jun 2020 9 | 10 | **Score-Based Generative Modeling through Stochastic Differential Equations** \ 11 | *Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, Ben Poole* \ 12 | ICLR 2021 (Oral). [[Paper](https://arxiv.org/abs/2011.13456)] [[Github](https://github.com/yang-song/score_sde)] \ 13 | 26 Nov 2020 14 | 15 | **Variational Diffusion Models** \ 16 | *Diederik P. Kingma1, Tim Salimans1, Ben Poole, Jonathan Ho* \ 17 | arXiv 2021. [[Paper](https://arxiv.org/abs/2107.00630)] [[Github](https://github.com/google-research/vdm)] \ 18 | 1 Jul 2021 19 | 20 | **Elucidating the Design Space of Diffusion-Based Generative Models** \ 21 | *Tero Karras, Miika Aittala, Timo Aila, Samuli Laine* \ 22 | NeurIPS 2022. [[Paper](https://arxiv.org/abs/2206.00364)] \ 23 | 1 Jun 2022 24 | 25 | 26 | ## Fast sampling without additional training 27 | 28 | **Denoising Diffusion Implicit Models** \ 29 | *Jiaming Song, Chenlin Meng, Stefano Ermon* \ 30 | ICLR 2021. 
[[Paper](https://arxiv.org/abs/2010.02502)] [[Github](https://github.com/ermongroup/ddim)] \ 31 | 6 Oct 2020 32 | 33 | **Gotta Go Fast When Generating Data with Score-Based Models** \ 34 | *Alexia Jolicoeur-Martineau, Ke Li, Rémi Piché-Taillefer, Tal Kachman, Ioannis Mitliagkas* \ 35 | arXiv 2021. [[Paper](https://arxiv.org/abs/2105.14080)] [[Github](https://github.com/AlexiaJM/score_sde_fast_sampling)] \ 36 | 28 May 2021 37 | 38 | **Pseudo Numerical Methods for Diffusion Models on Manifolds** \ 39 | *Luping Liu, Yi Ren, Zhijie Lin, Zhou Zhao* \ 40 | ICLR 2022. [[Paper](https://arxiv.org/abs/2202.09778)] [[Github](https://github.com/luping-liu/PNDM)] \ 41 | 20 Feb 2022 42 | 43 | **DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps** \ 44 | *Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, Jun Zhu* \ 45 | NeurIPS 2022. [[Paper](https://arxiv.org/abs/2206.00927)] [[Github](https://github.com/LuChengTHU/dpm-solver)] \ 46 | 2 Jun 2022 47 | 48 | **DPM-Solver++: Fast Solver for Guided Sampling of Diffusion Probabilistic Models** \ 49 | *Cheng Lu, Yuhao Zhou, Fan Bao, Jianfei Chen, Chongxuan Li, Jun Zhu* \ 50 | NeurIPS 2022 (Oral). [[Paper](https://arxiv.org/abs/2211.01095)] [[Github](https://github.com/LuChengTHU/dpm-solver)] \ 51 | 2 Nov 2022 52 | 53 | **Fast Sampling of Diffusion Models with Exponential Integrator** \ 54 | *Qinsheng Zhang, Yongxin Chen* \ 55 | arXiv 2022. [[Paper](https://arxiv.org/abs/2204.13902)] \ 56 | 29 Apr 2022 57 | 58 | **gDDIM: Generalized denoising diffusion implicit models** \ 59 | *Qinsheng Zhang, Molei Tao, Yongxin Chen* \ 60 | arXiv 2022. [[Paper](https://arxiv.org/abs/2206.05564)] [[Github](https://github.com/qsh-zh/gDDIM)] \ 61 | 11 Jun 2022 62 | 63 | **UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models** \ 64 | *Wenliang Zhao, Lujia Bai, Yongming Rao, Jie Zhou, Jiwen Lu* \ 65 | arXiv 2023. 
[[Paper](https://arxiv.org/abs/2302.04867)] [[Project](https://unipc.ivg-research.xyz)] [[Github](https://github.com/wl-zhao/UniPC)] \ 66 | 9 Feb 2023 67 | 68 | **Parallel Sampling of Diffusion Models** \ 69 | *Andy Shih, Suneel Belkhale, Stefano Ermon, Dorsa Sadigh, Nima Anari* \ 70 | arxiv 2023. [[Paper](https://arxiv.org/abs/2305.16317)] [[Github](https://github.com/AndyShih12/paradigms)] \ 71 | 25 May 2023 72 | 73 | **A Geometric Perspective on Diffusion Models** \ 74 | *Defang Chen, Zhenyu Zhou, Jian-Ping Mei, Chunhua Shen, Chun Chen, Can Wang* \ 75 | arXiv 2023. [[Paper](https://arxiv.org/abs/2305.19947)] \ 76 | 31 May 2023 77 | 78 | ## Fast sampling with additional training 79 | 80 | **Tackling the Generative Learning Trilemma with Denoising Diffusion GANs** \ 81 | *Zhisheng Xiao, Karsten Kreis, Arash Vahdat* \ 82 | arXiv 2021. [[Paper](https://arxiv.org/abs/2112.07804)] [[Project](https://nvlabs.github.io/denoising-diffusion-gan)] \ 83 | 15 Dec 2021 84 | 85 | **Progressive Distillation for Fast Sampling of Diffusion Models** \ 86 | *Tim Salimans, Jonathan Ho* \ 87 | ICLR 2022. [[Paper](https://arxiv.org/abs/2202.00512)] \ 88 | 1 Feb 2022 89 | 90 | **On Distillation of Guided Diffusion Models** \ 91 | *Chenlin Meng, Ruiqi Gao, Diederik P. Kingma, Stefano Ermon, Jonathan Ho, Tim Salimans* \ 92 | arXiv 2022. [[Paper](https://arxiv.org/abs/2210.03142)] \ 93 | 6 Oct 2022 94 | 95 | **GENIE: Higher-Order Denoising Diffusion Solvers** \ 96 | *Tim Dockhorn, Arash Vahdat, Karsten Kreis* \ 97 | NeurIPS 2022 (Oral). [[Paper](https://arxiv.org/abs/2210.05475)] [[Github](https://github.com/nv-tlabs/GENIE)] \ 98 | 11 Oct 2022 99 | 100 | **Learning Fast Samplers for Diffusion Models by Differentiating Through Sample Quality** \ 101 | *Daniel Watson, William Chan, Jonathan Ho, Mohammad Norouzi* \ 102 | ICLR 2022. 
[[Paper](https://arxiv.org/abs/2202.05830)] \ 103 | 11 Feb 2022 104 | 105 | **Wavelet Diffusion Models Are Fast and Scalable Image Generators** \ 106 | *Hao Phung, Quan Dao, Anh Tran* \ 107 | CVPR 2023. [[Paper](https://arxiv.org/abs/2211.16152)] [[Github](https://github.com/VinAIResearch/WaveDiff.git)] \ 108 | 29 Nov 2022 109 | 110 | ## Conditional Generation and Guidance 111 | 112 | **Diffusion Models Beat GANs on Image Synthesis** \ 113 | *Prafulla Dhariwal1, Alex Nichol1* \ 114 | arXiv 2021. [[Paper](https://arxiv.org/abs/2105.05233)] [[Github](https://github.com/openai/guided-diffusion)] \ 115 | 11 May 2021 116 | 117 | **Classifier-Free Diffusion Guidance** \ 118 | *Jonathan Ho, Tim Salimans* \ 119 | NeurIPS Workshop 2021. [[Paper](https://arxiv.org/abs/2207.12598)] \ 120 | 28 Sep 2021 121 | 122 | **Negative Prompt** \ 123 | *Automatic1111* \ 124 | GitHub. [[Paper](https://github.com/AUTOMATIC1111/stable-diffusion-webui/wiki/Negative-prompt)] 125 | 126 | **Improving Sample Quality of Diffusion Models Using Self-Attention Guidance** \ 127 | *Susung Hong, Gyuseong Lee, Wooseok Jang, Seungryong Kim* \ 128 | arXiv 2022. [[Paper](https://arxiv.org/abs/2210.00939)] [[Github](https://github.com/KU-CVLAB/Self-Attention-Guidance/)] \ 129 | 3 Oct 2022 130 | 131 | ## Diffusion Models on Low-dimensional Spaces 132 | 133 | **Image Super-Resolution via Iterative Refinement** \ 134 | *Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David J. Fleet, Mohammad Norouzi* \ 135 | arXiv 2021. [[Paper](https://arxiv.org/abs/2104.07636)] [[Project](https://iterative-refinement.github.io/)] [[Github](https://github.com/Janspiry/Image-Super-Resolution-via-Iterative-Refinement)] \ 136 | 15 Apr 2021 137 | 138 | **Cascaded Diffusion Models for High Fidelity Image Generation** \ 139 | *Jonathan Ho1, Chitwan Saharia1, William Chan, David J. Fleet, Mohammad Norouzi, Tim Salimans* \ 140 | JMLR 2021. 
[[Paper](https://arxiv.org/abs/2106.15282)] [[Project](https://cascaded-diffusion.github.io/)] \ 141 | 30 May 2021 142 | 143 | **D2C: Diffusion-Denoising Models for Few-shot Conditional Generation** \ 144 | *Abhishek Sinha1, Jiaming Song1, Chenlin Meng, Stefano Ermon* \ 145 | NeurIPS 2021. [[Paper](https://arxiv.org/abs/2106.06819)] [[Project](https://d2c-model.github.io/)] [[Github](https://github.com/d2c-model/d2c-model.github.io)] \ 146 | 12 Jun 2021 147 | 148 | **Score-based Generative Modeling in Latent Space** \ 149 | *Arash Vahdat1, Karsten Kreis1, Jan Kautz* \ 150 | arXiv 2021. [[Paper](https://arxiv.org/abs/2106.05931)] \ 151 | 10 Jun 2021 152 | 153 | **High-Resolution Image Synthesis with Latent Diffusion Models** \ 154 | *Robin Rombach1, Andreas Blattmann1, Dominik Lorenz, Patrick Esser, Björn Ommer* \ 155 | CVPR 2022. [[Paper](https://arxiv.org/abs/2112.10752)] [[Github](https://github.com/CompVis/latent-diffusion)] \ 156 | 20 Dec 2021 157 | 158 | **Score-Guided Intermediate Layer Optimization: Fast Langevin Mixing for Inverse Problems** \ 159 | *Giannis Daras1, Yuval Dagan1, Alexandros G. Dimakis, Constantinos Daskalakis* \ 160 | ICML 2022. [[Paper](https://arxiv.org/abs/2206.09104)] [[Github](https://github.com/giannisdaras/sgilo)] \ 161 | 22 Jun 2022 162 | 163 | **Refusion: Enabling Large-Size Realistic Image Restoration with Latent-Space Diffusion Models** \ 164 | *Ziwei Luo, Fredrik K. Gustafsson, Zheng Zhao, Jens Sjölund, Thomas B. Schön* \ 165 | CVPR Workshop 2023. [[Paper](https://arxiv.org/abs/2304.08291)] [[Github](https://github.com/Algolzw/image-restoration-sde)] \ 166 | 17 Apr 2023 167 | 168 | ## Generalized Diffusion Models 169 | 170 | 171 | **Diffusion Schrödinger Bridge** \ 172 | *Valentin De Bortoli, James Thornton, Jeremy Heng, Arnaud Doucet* \ 173 | NeurIPS 2021. 
[[Paper](https://arxiv.org/abs/2106.01357)] \ 174 | 01 Jun 2021 175 | 176 | **Riemannian Score-Based Generative Modelling** \ 177 | *Valentin De Bortoli, Emile Mathieu, Michael Hutchinson, James Thornton, Yee Whye Teh, Arnaud Doucet* \ 178 | NeurIPS 2022. [[Paper](https://arxiv.org/abs/2202.02763)] \ 179 | 06 Feb 2022 180 | 181 | **Action Matching: Learning Stochastic Dynamics from Samples** \ 182 | *Kirill Neklyudov, Rob Brekelmans, Daniel Severo, Alireza Makhzani* \ 183 | ICML 2023. [[Paper](https://arxiv.org/abs/2210.06662)] [[Github](https://github.com/necludov/jam)] \ 184 | 13 Oct 2022 185 | 186 | 187 | **Cold Diffusion: Inverting Arbitrary Image Transforms Without Noise** \ 188 | *Arpit Bansal, Eitan Borgnia, Hong-Min Chu, Jie S. Li, Hamid Kazemi, Furong Huang, Micah Goldblum, Jonas Geiping, Tom Goldstein* \ 189 | arXiv 2022. [[Paper](https://arxiv.org/abs/2208.09392)] [[Github](https://github.com/arpitbansal297/Cold-Diffusion-Models)] \ 190 | 19 Aug 2022 191 | 192 | **Soft Diffusion: Score Matching for General Corruptions** \ 193 | *Giannis Daras, Mauricio Delbracio, Hossein Talebi, Alexandros G. Dimakis, Peyman Milanfar* \ 194 | TMLR 2023. [[Paper](https://openreview.net/forum?id=W98rebBxlQ)] \ 195 | 12 Sep 2022 196 | 197 | **Inversion by Direct Iteration: An Alternative to Denoising Diffusion for Image Restoration** \ 198 | *Mauricio Delbracio, Peyman Milanfar* \ 199 | arXiv 2023. [[Paper](https://arxiv.org/abs/2303.11435)] \ 200 | 20 March 2023 201 | 202 | **Image Restoration with Mean-Reverting Stochastic Differential Equations** \ 203 | *Ziwei Luo, Fredrik K. Gustafsson, Zheng Zhao, Jens Sjölund, Thomas B. Schön* \ 204 | ICML 2023. 
[[Paper](https://arxiv.org/abs/2301.11699)] [[Project](https://algolzw.github.io/ir-sde/index.html)] [[Github](https://github.com/Algolzw/image-restoration-sde)] \ 205 | 27 Jan 2023 206 | -------------------------------------------------------------------------------- /topics/2_images.md: -------------------------------------------------------------------------------- 1 | # Applications on Images 2 | 3 | Some of the applications are directly transferrable to other domains, and we try to not repeat them in the next section. 4 | 5 | ## Architecture 6 | 7 | **All are Worth Words: a ViT Backbone for Score-based Diffusion Models** \ 8 | *Fan Bao, Chongxuan Li, Yue Cao, Jun Zhu* \ 9 | arXiv 2022. [[Paper](https://arxiv.org/abs/2209.12152)] \ 10 | 25 Sep 2022 11 | 12 | **Scalable Diffusion Models with Transformers** \ 13 | *William Peebles, Saining Xie* \ 14 | arXiv 2022. [[Paper](https://arxiv.org/abs/2212.09748)] [[Project](https://www.wpeebles.com/DiT)] [[Github](https://github.com/facebookresearch/DiT)] \ 15 | 19 Dec 2022 16 | 17 | **One Transformer Fits All Distributions in Multi-Modal Diffusion at Scale** \ 18 | *Fan Bao, Shen Nie, Kaiwen Xue, Chongxuan Li, Shi Pu, Yaole Wang, Gang Yue, Yue Cao, Hang Su, Jun Zhu* \ 19 | arXiv 2023. [[Paper](https://arxiv.org/abs/2303.06555)] [[Github](https://github.com/thu-ml/unidiffuser)] \ 20 | 12 Mar 2023 21 | 22 | **Scalable Adaptive Computation for Iterative Generation** \ 23 | *Allan Jabri, David Fleet, Ting Chen* \ 24 | arXiv 2022. [[Paper](https://arxiv.org/abs/2212.11972)] \ 25 | 22 Dec 2022 26 | 27 | **simple diffusion: End-to-end diffusion for high resolution images** \ 28 | *Emiel Hoogeboom1, Jonathan Heek1, Tim Salimans* \ 29 | arXiv 2023. 
[[Paper](https://arxiv.org/abs/2301.11093)] \ 30 | 26 Jan 2023 31 | 32 | 33 | ## Editing 34 | 35 | **SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations** \ 36 | *Chenlin Meng, Yutong He, Yang Song, Jiaming Song, Jiajun Wu, Jun-Yan Zhu, Stefano Ermon* \ 37 | ICLR 2022. [[Paper](https://arxiv.org/abs/2108.01073)] [[Project](https://sde-image-editing.github.io/)] [[Github](https://github.com/ermongroup/SDEdit)] \ 38 | 2 Aug 2021 39 | 40 | **Efficient Spatially Sparse Inference for Conditional GANs and Diffusion Models** \ 41 | *Muyang Li, Ji Lin, Chenlin Meng, Stefano Ermon, Song Han, Jun-Yan Zhu* \ 42 | NeurIPS 2022. [[Paper](https://arxiv.org/abs/2211.02048)] [[Github](https://github.com/lmxyy/sige)] \ 43 | 4 Nov 2022 44 | 45 | **Blended Diffusion for Text-driven Editing of Natural Images** \ 46 | *Omri Avrahami, Dani Lischinski, Ohad Fried* \ 47 | CVPR 2022. [[Paper](https://arxiv.org/abs/2111.14818)] [[Project](https://omriavrahami.com/blended-diffusion-page/)] [[Github](https://github.com/omriav/blended-diffusion)] \ 48 | 29 Nov 2021 49 | 50 | **Prompt-to-Prompt Image Editing with Cross-Attention Control** \ 51 | *Amir Hertz, Ron Mokady, Jay Tenenbaum, Kfir Aberman, Yael Pritch, Daniel Cohen-Or* \ 52 | ICLR 2023. [[Paper](https://prompt-to-prompt.github.io/ptp_files/Prompt-to-Prompt_preprint.pdf)] | [[GitHub](https://github.com/google/prompt-to-prompt/)] 53 | 54 | **Imagic: Text-Based Real Image Editing with Diffusion Models** \ 55 | *Bahjat Kawar1, Shiran Zada1, Oran Lang, Omer Tov, Huiwen Chang, Tali Dekel, Inbar Mosseri, Michal Irani* \ 56 | CVPR 2023. [[Paper](https://arxiv.org/abs/2210.09276)] [[Project](https://imagic-editing.github.io/)] \ 57 | 17 Oct 2022 58 | 59 | **DiffEdit: Diffusion-based semantic image editing with mask guidance** \ 60 | *Guillaume Couairon, Jakob Verbeek, Holger Schwenk, Matthieu Cord* \ 61 | ICLR 2023. 
[[Paper](https://arxiv.org/abs/2210.11427)] \ 62 | 20 Oct 2022 63 | 64 | **Collage Diffusion** \ 65 | *Vishnu Sarukkai, Linden Li, Arden Ma, Christopher Ré, Kayvon Fatahalian* \ 66 | arXiv 2023. [[Paper](https://arxiv.org/abs/2303.00262)] \ 67 | 1 Mar 2023 68 | 69 | **MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation** \ 70 | *Omer Bar-Tal, Lior Yariv, Yaron Lipman, Tali Dekel* \ 71 | ICML 2023. [[Paper](https://arxiv.org/abs/2302.08113)] [[Github](https://github.com/omerbt/MultiDiffusion)] \ 72 | 16 Feb 2023 73 | 74 | 75 | ## Personalization 76 | 77 | **An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion** \ 78 | *Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or* \ 79 | ICLR 2023. [[Paper](https://arxiv.org/abs/2208.01618)] | [[Github](https://github.com/rinongal/textual_inversion)] | [[Project](https://textual-inversion.github.io/)] \ 80 | 1 Mar 2023 81 | 82 | **DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation** \ 83 | *Nataniel Ruiz, Yuanzhen Li, Varun Jampani, Yael Pritch, Michael Rubinstein, Kfir Aberman* \ 84 | CVPR 2023. [[Paper](https://arxiv.org/abs/2208.12242)] [[Project](https://dreambooth.github.io/)] [[Github](https://github.com/Victarry/stable-dreambooth)] \ 85 | 25 Aug 2022 86 | 87 | **Multi-Concept Customization of Text-to-Image Diffusion** \ 88 | *Nupur Kumari, Bingliang Zhang, Richard Zhang, Eli Shechtman, Jun-Yan Zhu* \ 89 | CVPR 2023. [[Paper](https://arxiv.org/abs/2212.04488)] [[Project](https://www.cs.cmu.edu/~custom-diffusion/)] \ 90 | 8 Dec 2022 91 | 92 | **Key-Locked Rank One Editing for Text-to-Image Personalization** \ 93 | *Yoad Tewel, Rinon Gal, Gal Chechik, Yuval Atzmon* \ 94 | SIGGRAPH 2023. 
[[Paper](https://arxiv.org/abs/2305.01644)] [[Project](https://research.nvidia.com/labs/par/Perfusion/)] \ 95 | 8 Dec 2022 96 | 97 | **A Recipe for Watermarking Diffusion Models** \ 98 | *Yunqing Zhao, Tianyu Pang, Chao Du, Xiao Yang, Ngai-Man Cheung, Min Lin* \ 99 | arXiv 2023. [[Paper](https://arxiv.org/abs/2303.10137)] [[Project](https://github.com/yunqing-me/WatermarkDM)] \ 100 | 17 March 2023 101 | 102 | 103 | ## Fine-tuning 104 | 105 | **LoRA: Low-Rank Adaptation of Large Language Models** \ 106 | *Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen* \ 107 | ICLR 2022. [[Paper](https://arxiv.org/abs/2106.09685)] \ 108 | 109 | **GLIGEN: Open-Set Grounded Text-to-Image Generation** \ 110 | *Yuheng Li, Haotian Liu, Qingyang Wu, Fangzhou Mu, Jianwei Yang, Jianfeng Gao, Chunyuan Li, Yong Jae Lee* \ 111 | CVPR 2023. [[Paper](https://arxiv.org/abs/2301.07093)] [[Project](https://gligen.github.io/)] \ 112 | 113 | **SpaText: Spatio-Textual Representation for Controllable Image Generation** \ 114 | *Omri Avrahami, Thomas Hayes, Oran Gafni, Sonal Gupta, Yaniv Taigman, Devi Parikh, Dani Lischinski, Ohad Fried, Xi Yin* \ 115 | CVPR 2023. [[Paper](https://arxiv.org/abs/2211.14305)] [[Project](https://omriavrahami.com/spatext/)] \ 116 | 25 Nov 2022 117 | 118 | **Adding Conditional Control to Text-to-Image Diffusion Models** \ 119 | *Lvmin Zhang, Maneesh Agrawala* \ 120 | arXiv 2023. [[Paper](https://arxiv.org/abs/2302.05543)] [[Github](https://github.com/lllyasviel/ControlNet)] \ 121 | 10 Feb 2023 122 | 123 | **T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models** \ 124 | *Chong Mou, Xintao Wang, Liangbin Xie, Jian Zhang, Zhongang Qi, Ying Shan, Xiaohu Qie* \ 125 | arXiv 2023. 
[[Paper](https://arxiv.org/abs/2302.08453)] [[Github](https://github.com/TencentARC/T2I-Adapter)] \ 126 | 10 Feb 2023 127 | 128 | **Editing Implicit Assumptions in Text-to-Image Diffusion Models** \ 129 | *Hadas Orgad, Bahjat Kawar, Yonatan Belinkov* \ 130 | arXiv 2023. [[Paper](https://arxiv.org/abs/2303.08084)] [[Project](https://time-diffusion.github.io/)] [[Github](https://github.com/bahjat-kawar/time-diffusion)] \ 131 | 14 Mar 2023 132 | 133 | **SVDiff: Compact Parameter Space for Diffusion Fine-Tuning** \ 134 | *Ligong Han, Yinxiao Li, Han Zhang, Peyman Milanfar, Dimitris Metaxas, Feng Yang* \ 135 | arXiv 2023. [[Paper](https://arxiv.org/abs/2303.11305)] \ 136 | 20 Mar 2023 137 | 138 | **DiffFit: Unlocking Transferability of Large Diffusion Models via Simple Parameter-Efficient Fine-Tuning** \ 139 | *Enze Xie, Lewei Yao, Han Shi, Zhili Liu, Daquan Zhou, Zhaoqiang Liu, Jiawei Li, Zhenguo Li* \ 140 | arXiv 2023. [[Paper](https://arxiv.org/abs/2304.06648)] \ 141 | 13 Apr 2023 142 | 143 | 144 | ## Low-level Vision 145 | 146 | **Palette: Image-to-Image Diffusion Models** \ 147 | *Chitwan Saharia, William Chan, Huiwen Chang, Chris A. Lee, Jonathan Ho, Tim Salimans, David J. Fleet, Mohammad Norouzi*\ 148 | SIGGRAPH 2022. [[Paper](https://arxiv.org/abs/2111.05826)] [[Project](https://iterative-refinement.github.io/palette/)] \ 149 | 150 | **Deblurring via Stochastic Refinement** \ 151 | *Jay Whang, Mauricio Delbracio, Hossein Talebi, Chitwan Saharia, Alexandros G. Dimakis, Peyman Milanfar* \ 152 | CVPR 2022. [[Paper](https://arxiv.org/abs/2112.02475)] \ 153 | 5 Dec 2021 154 | 155 | **Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models** \ 156 | *Jiarui Xu, Sifei Liu, Arash Vahdat, Wonmin Byeon, Xiaolong Wang, Shalini De Mello* \ 157 | arXiv 2023. 
[[Paper](https://arxiv.org/abs/2303.04803)] [[Project](https://jerryxu.net/ODISE/)] \ 158 | 8 Mar 2023 159 | 160 | **Monocular Depth Estimation using Diffusion Models** \ 161 | *Saurabh Saxena, Abhishek Kar, Mohammad Norouzi, David J. Fleet* \ 162 | arXiv 2023. [[Paper](https://arxiv.org/abs/2302.14816)] \ 163 | 28 April 2023 164 | 165 | ## Classification 166 | **Your Diffusion Model is Secretly a Zero-Shot Classifier** \ 167 | *Alexander C. Li, Mihir Prabhudesai, Shivam Duggal, Ellis Brown, Deepak Pathak*\ 168 | arXiv 2023. [[Paper](https://arxiv.org/abs/2303.16203)] [[Project](https://diffusion-classifier.github.io/)] [[Github](https://github.com/diffusion-classifier/diffusion-classifier)] \ 169 | 28 Mar 2023 170 | 171 | ## Adversarial Robustness 172 | 173 | **Improving Robustness using Generated Data** \ 174 | Sven Gowal, Sylvestre-Alvise Rebuffi, Olivia Wiles, Florian Stimberg, Dan Andrei Calian, Timothy Mann \ 175 | NeurIPS 2021. [[Paper](https://arxiv.org/abs/2110.09468)] [[Github](https://github.com/deepmind/deepmind-research/tree/master/adversarial_robustness)] \ 176 | 18 Oct 2021 177 | 178 | **Better Diffusion Models Further Improve Adversarial Training** \ 179 | Zekai Wang*, Tianyu Pang*, Chao Du, Min Lin, Weiwei Liu, Shuicheng Yan \ 180 | ICML 2023. [[Paper](https://arxiv.org/pdf/2302.04638.pdf)] [[Github](https://github.com/wzekai99/DM-Improves-AT)] \ 181 | 09 Feb 2023 182 | -------------------------------------------------------------------------------- /topics/3_others.md: -------------------------------------------------------------------------------- 1 | # Applications on Other Domains 2 | 3 | ## Inverse Problems and Medical Imaging 4 | 5 | **Robust Compressed Sensing MRI with Deep Generative Priors** \ 6 | Ajil Jalal, Marius Arvinte, Giannis Daras, Eric Price, Alexandros G Dimakis, Jon Tamir \ 7 | NeurIPS 2021. 
[[Paper](https://proceedings.neurips.cc/paper/2021/hash/7d6044e95a16761171b130dcb476a43e-Abstract.html)] [[Github](https://github.com/utcsilab/csgm-mri-langevin)] \ 8 | 09 Nov 2021 9 | 10 | **Solving Inverse Problems in Medical Imaging with Score-Based Generative Models** \ 11 | *Yang Song1, Liyue Shen1, Lei Xing, Stefano Ermon* \ 12 | ICLR 2022. [[Paper](https://arxiv.org/abs/2111.08005)] [[Github](https://github.com/yang-song/score_inverse_problems)] \ 13 | 15 Nov 2021 14 | 15 | **Denoising Diffusion Restoration Models** \ 16 | *Bahjat Kawar, Michael Elad, Stefano Ermon, Jiaming Song* \ 17 | NeurIPS 2022. [[Paper](https://arxiv.org/abs/2201.11793)] \ 18 | 27 Jan 2022 19 | 20 | **Improving Diffusion Models for Inverse Problems using Manifold Constraints** \ 21 | *Hyungjin Chung1, Byeongsu Sim1, Dohoon Ryu, Jong Chul Ye* \ 22 | NeurIPS 2022. [[Paper](https://arxiv.org/abs/2206.00941)] \ 23 | 2 Jun 2022 24 | 25 | **Pyramidal Denoising Diffusion Probabilistic Models** \ 26 | *Dohoon Ryu, Jong Chul Ye* \ 27 | arXiv 2022. [[Paper](https://arxiv.org/abs/2208.01864)] \ 28 | 3 Aug 2022 29 | 30 | **Diffusion Posterior Sampling for General Noisy Inverse Problems** \ 31 | *Hyungjin Chung1, Jeongsol Kim1, Michael T. Mccann, Marc L. Klasky, Jong Chul Ye* \ 32 | arXiv 2022. [[Paper](https://arxiv.org/abs/2209.14687)] [[Github](https://github.com/DPS2022/diffusion-posterior-sampling)] \ 33 | 29 Sep 2022 34 | 35 | **Score-Based Diffusion Models as Principled Priors for Inverse Imaging** \ 36 | *Berthy T. Feng, Jamie Smith, Michael Rubinstein, Huiwen Chang, Katherine L. Bouman, William T. Freeman* \ 37 | arXiv 2023. [[Paper](https://arxiv.org/abs/2304.11751)] \ 38 | 23 Apr 2023 39 | 40 | **Pseudoinverse-Guided Diffusion Models for Inverse Problems** \ 41 | *Jiaming Song, Arash Vahdat, Morteza Mardani, Jan Kautz* \ 42 | ICLR 2023. 
[[Paper](https://openreview.net/forum?id=9_gsMA8MRKQ)] \ 43 | 7 May 2023 44 | 45 | **A Variational Perspective on Solving Inverse Problems with Diffusion Models** \ 46 | *Morteza Mardani, Jiaming Song, Jan Kautz, Arash Vahdat* \ 47 | arXiv 2023. [[Paper](https://arxiv.org/abs/2305.04391)] \ 48 | 7 May 2023 49 | 50 | **Inversion by Direct Iteration: An Alternative to Denoising Diffusion for Image Restoration** \ 51 | *Mauricio Delbracio, Peyman Milanfar* \ 52 | arXiv 2023. [[Paper](https://arxiv.org/abs/2303.11435)] \ 53 | 22 Mar 2023 54 | 55 | **Removing Structured Noise with Diffusion Models** \ 56 | *Tristan S.W. Stevens, Hans van Gorp, Faik C. Meral, Jun Seob Shin, Jason Yu, Jean-Luc Robert, Ruud J.G. van Sloun* \ 57 | arXiv 2023. [[Paper](https://arxiv.org/pdf/2302.05290.pdf)] [[Blog](https://tristan-deep.github.io/posts/2023/03/diffusion-models/)] \ 58 | 24 May 2023 59 | 60 | **Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model** \ 61 | *Yinhuai Wang, Jiwen Yu, Jian Zhang* \ 62 | ICLR 2023. [[Paper](https://arxiv.org/pdf/2302.05290.pdf)] \ 63 | 1 Feb 2023 64 | 65 | ## 3D 66 | 67 | ### Point clouds 68 | 69 | **3D Shape Generation and Completion through Point-Voxel Diffusion** \ 70 | *Linqi Zhou, Yilun Du, Jiajun Wu* \ 71 | ICCV 2021. [[Paper](https://arxiv.org/abs/2104.03670)] [[Project](https://alexzhou907.github.io/pvd)] 72 | 73 | **LION: Latent Point Diffusion Models for 3D Shape Generation** \ 74 | *Xiaohui Zeng, Arash Vahdat, Francis Williams, Zan Gojcic, Or Litany, Sanja Fidler, Karsten Kreis* \ 75 | NeurIPS 2022. [[Paper](https://arxiv.org/pdf/2210.06978.pdf)] [[Project](https://nv-tlabs.github.io/LION/)] 76 | 77 | **Point-E: A System for Generating 3D Point Clouds from Complex Prompts** \ 78 | *Alex Nichol1, Heewoo Jun1, Prafulla Dhariwal, Pamela Mishkin, Mark Chen* \ 79 | arXiv 2022. 
[[Paper](https://arxiv.org/abs/2212.08751)] [[Github](https://github.com/openai/point-e)] \ 80 | 16 Dec 2022 81 | 82 | 83 | ### SDFs 84 | 85 | **DiffusionSDF: Conditional Generative Modeling of Signed Distance Functions** \ 86 | *Gene Chou, Yuval Bahat, Felix Heide* \ 87 | arXiv 2022. [[Paper](https://arxiv.org/abs/2211.13757)] \ 88 | 24 Nov 2022 89 | 90 | **SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation** \ 91 | *Yen-Chi Cheng, Hsin-Ying Lee, Sergey Tulyakov, Alexander Schwing, Liangyan Gui* \ 92 | arXiv 2022. [[Paper](https://arxiv.org/abs/2212.04493)] [[Project](https://yccyenchicheng.github.io/SDFusion/)] \ 93 | 8 Dec 2022 94 | 95 | **Neural Wavelet-domain Diffusion for 3D Shape Generation** \ 96 | *Ka-Hei Hui, Ruihui Li, Jingyu Hu, Chi-Wing Fu* \ 97 | arXiv 2022. [[Paper](https://arxiv.org/abs/2209.08725)] \ 98 | 19 Sep 2022 99 | 100 | 101 | ### Triplanes 102 | 103 | **3D Neural Field Generation using Triplane Diffusion** \ 104 | *J. Ryan Shue, Eric Ryan Chan, Ryan Po, Zachary Ankner, Jiajun Wu, Gordon Wetzstein* \ 105 | arXiv 2022. [[Paper](https://arxiv.org/abs/2211.16677)] [[Project](https://jryanshue.com/nfd/)] \ 106 | 30 Nov 2022 107 | 108 | 109 | ### NeRF and other field representations 110 | 111 | **Learning a Diffusion Prior for NeRFs** \ 112 | *Guandao Yang, Abhijit Kundu, Leonidas J. Guibas, Jonathan T. Barron, Ben Poole* \ 113 | ICLR Workshop 2023. [[Paper](https://arxiv.org/abs/2304.14473)] \ 114 | 27 Apr 2023 115 | 116 | **Shap-E: Generating Conditional 3D Implicit Functions** \ 117 | *Heewoo Jun, Alex Nichol* \ 118 | arXiv 2023. [[Paper](https://arxiv.org/abs/2305.02463)] [[Github](https://github.com/openai/shap-e)] \ 119 | 3 May 2023 120 | 121 | ### 3D using 2D models, 2D data 122 | 123 | **DreamFusion: Text-to-3D using 2D Diffusion** \ 124 | *Ben Poole, Ajay Jain, Jonathan T. Barron, Ben Mildenhall* \ 125 | arXiv 2022.
[[Paper](https://arxiv.org/abs/2209.14988)] [[Project](https://dreamfusion3d.github.io/)] \ 126 | 29 Sep 2022 127 | 128 | **Magic3D: High-Resolution Text-to-3D Content Creation** \ 129 | *Chen-Hsuan Lin1, Jun Gao1, Luming Tang1, Towaki Takikawa1, Xiaohui Zeng1, Xun Huang, Karsten Kreis, Sanja Fidler, Ming-Yu Liu, Tsung-Yi Lin* \ 130 | arXiv 2022. [[Paper](https://arxiv.org/abs/2211.10440)] [[Project](https://deepimagination.cc/Magic3D/)] \ 131 | 18 Nov 2022 132 | 133 | **Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation** \ 134 | *Haochen Wang1, Xiaodan Du1, Jiahao Li1, Raymond A. Yeh, Greg Shakhnarovich* \ 135 | arXiv 2022. [[Paper](https://arxiv.org/abs/2212.00774)] \ 136 | 1 Dec 2022 137 | 138 | **Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures** \ 139 | *Gal Metzer1, Elad Richardson1, Or Patashnik, Raja Giryes, Daniel Cohen-Or* \ 140 | arXiv 2022. [[Paper](https://arxiv.org/abs/2211.07600)] [[Github](https://github.com/eladrich/latent-nerf)] \ 141 | 14 Nov 2022 142 | 143 | **Debiasing Scores and Prompts of 2D Diffusion for Robust Text-to-3D Generation** \ 144 | *Susung Hong1, Donghoon Ahn1, Seungryong Kim* \ 145 | CVPR Workshop 2023. [[Paper](https://arxiv.org/abs/2303.15413)] \ 146 | 27 Mar 2023 147 | 148 | 149 | ### 3D using 2D models, 3D data 150 | 151 | **Novel View Synthesis with Diffusion Models** \ 152 | *Daniel Watson, William Chan, Ricardo Martin-Brualla, Jonathan Ho, Andrea Tagliasacchi, Mohammad Norouzi* \ 153 | arXiv 2022. [[Paper](https://arxiv.org/abs/2210.04628)] 154 | 155 | **Generative Novel View Synthesis with 3D-Aware Diffusion Models** \ 156 | *Eric R. Chan1, Koki Nagano1, Matthew A. Chan, Alexander W. Bergman, Jeong Joon Park, Axel Levy, Miika Aittala, Shalini De Mello, Tero Karras, Gordon Wetzstein* \ 157 | arXiv 2023.
[[Paper](https://arxiv.org/abs/2304.02602)] [[Project](https://nvlabs.github.io/genvs/)] \ 158 | 5 Apr 2023 159 | 160 | 161 | ### 3D reconstruction 162 | 163 | **NeuralLift-360: Lifting An In-the-wild 2D Photo to A 3D Object with 360° Views** \ 164 | *Dejia Xu, Yifan Jiang, Peihao Wang, Zhiwen Fan, Yi Wang, Zhangyang Wang* \ 165 | arXiv 2022. [[Paper](https://arxiv.org/abs/2211.16431)] [[Project](https://vita-group.github.io/NeuralLift-360/)] \ 166 | 29 Nov 2022 167 | 168 | **SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction** \ 169 | *Zhizhuo Zhou, Shubham Tulsiani* \ 170 | arXiv 2022. [[Paper](https://arxiv.org/abs/2212.00792)] [[Project](https://sparsefusion.github.io/)] \ 171 | 1 Dec 2022 172 | 173 | **DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model** \ 174 | *Hoigi Seo1, Hayeon Kim1, Gwanghyun Kim, Se Young Chun* \ 175 | arXiv 2023. [[Paper](https://arxiv.org/abs/2304.02827)] [[Project](https://janeyeon.github.io/ditto-nerf/)] \ 176 | 6 Apr 2023 177 | 178 | ### 3D editing 179 | 180 | **Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions** \ 181 | *Ayaan Haque, Matthew Tancik, Alexei A. Efros, Aleksander Holynski, Angjoo Kanazawa* \ 182 | arXiv 2023. [[Paper](https://arxiv.org/abs/2303.12789)] [[Project](https://instruct-nerf2nerf.github.io/)] \ 183 | 22 Mar 2023 184 | 185 | **Vox-E: Text-guided Voxel Editing of 3D Objects** \ 186 | *Etai Sella, Gal Fiebelman, Peter Hedman, Hadar Averbuch-Elor* \ 187 | arXiv 2023. [[Paper](https://arxiv.org/abs/2303.12048)] [[Project](https://tau-vailab.github.io/Vox-E/)] \ 188 | 21 Mar 2023 189 | 190 | 191 | # Video 192 | 193 | ## Video generation / prediction / interpolation 194 | 195 | **Video Diffusion Models** \ 196 | *Jonathan Ho1, Tim Salimans1, Alexey Gritsenko, William Chan, Mohammad Norouzi, David J. Fleet* \ 197 | NeurIPS 2022.
[[Paper](https://arxiv.org/abs/2204.03458)] \ 198 | 7 Apr 2022 199 | 200 | **Flexible Diffusion Modeling of Long Videos** \ 201 | *William Harvey, Saeid Naderiparizi, Vaden Masrani, Christian Weilbach, Frank Wood* \ 202 | arXiv 2022. [[Paper](https://arxiv.org/abs/2205.11495)] [[Github](https://github.com/plai-group/flexible-video-diffusion-modeling)] \ 203 | 23 May 2022 204 | 205 | **MCVD: Masked Conditional Video Diffusion for Prediction, Generation, and Interpolation** \ 206 | *Vikram Voleti1, Alexia Jolicoeur-Martineau1, Christopher Pal* \ 207 | NeurIPS 2022. [[Paper](https://arxiv.org/abs/2205.09853)] [[Github](https://github.com/voletiv/mcvd-pytorch)] \ 208 | 19 May 2022 209 | 210 | **Imagen Video: High Definition Video Generation with Diffusion Models** \ 211 | *Jonathan Ho, William Chan, Chitwan Saharia, Jay Whang, Ruiqi Gao, Alexey Gritsenko, Diederik P. Kingma, Ben Poole, Mohammad Norouzi, David J. Fleet, Tim Salimans* \ 212 | Oct 2022. [[Paper](https://imagen.research.google/video/paper.pdf)] 213 | 214 | **Make-A-Video: Text-to-Video Generation without Text-Video Data** \ 215 | *Uriel Singer, Adam Polyak, Thomas Hayes, Xi Yin, Jie An, Songyang Zhang, Qiyuan Hu, Harry Yang, Oron Ashual, Oran Gafni, Devi Parikh, Sonal Gupta, Yaniv Taigman* \ 216 | arXiv 2022. [[Paper](https://arxiv.org/abs/2209.14792)] \ 217 | 29 Sep 2022 218 | 219 | **VIDM: Video Implicit Diffusion Models** \ 220 | *Kangfu Mei, Vishal M. Patel* \ 221 | arXiv 2022. [[Paper](https://arxiv.org/abs/2212.00235)] [[Project](https://kfmei.page/vidm/)] \ 222 | 1 Dec 2022 223 | 224 | **Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models** \ 225 | *Andreas Blattmann1, Robin Rombach1, Huan Ling1, Tim Dockhorn1, Seung Wook Kim, Sanja Fidler, Karsten Kreis* \ 226 | CVPR 2023. 
[[Paper](https://arxiv.org/abs/2304.08818)] [[Project](https://research.nvidia.com/labs/toronto-ai/VideoLDM/)] \ 227 | 18 Apr 2023 228 | 229 | 230 | ## Video editing / style transfer 231 | 232 | **Zero-Shot Video Editing Using Off-The-Shelf Image Diffusion Models** \ 233 | *Wen Wang1, Kangyang Xie1, Zide Liu1, Hao Chen, Yue Cao, Xinlong Wang, Chunhua Shen* \ 234 | arXiv 2023. [[Paper](https://arxiv.org/abs/2303.17599)] \ 235 | 30 Mar 2023 236 | 237 | **Pix2Video: Video Editing using Image Diffusion** \ 238 | *Duygu Ceylan, Chun-Hao Paul Huang, Niloy J. Mitra* \ 239 | arXiv 2023. [[Paper](https://arxiv.org/abs/2303.12688)] [[Project](https://duyguceylan.github.io/pix2video.github.io/)] \ 240 | 22 Mar 2023 241 | 242 | **Structure and Content-Guided Video Synthesis with Diffusion Models** \ 243 | *Patrick Esser, Johnathan Chiu, Parmida Atighehchian, Jonathan Granskog, Anastasis Germanidis* \ 244 | arXiv 2023. [[Paper](https://arxiv.org/abs/2302.03011)] \ 245 | 6 Feb 2023 246 | 247 | # Flexible Large-Content Generation 248 | 249 | **Mixture of Diffusers for scene composition and high resolution image generation** \ 250 | *Álvaro Barbero Jiménez* \ 251 | arXiv 2023. [[Paper](https://arxiv.org/abs/2302.02412)] [[Github](https://github.com/albarji/mixture-of-diffusers)] \ 252 | 5 Feb 2023 253 | 254 | **MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation** \ 255 | *Omer Bar-Tal1, Lior Yariv1, Yaron Lipman, Tali Dekel* \ 256 | arXiv 2023. [[Paper](https://arxiv.org/abs/2302.08113)] [[Project](https://multidiffusion.github.io/)] [[Github](https://github.com/omerbt/MultiDiffusion)] \ 257 | 16 Feb 2023 258 | 259 | **DiffCollage: Parallel Generation of Large Content with Diffusion Models** \ 260 | *Qinsheng Zhang, Jiaming Song, Xun Huang, Yongxin Chen, Ming-Yu Liu* \ 261 | CVPR 2023.
[[Paper](https://arxiv.org/abs/2303.17076)] [[Project](https://research.nvidia.com/labs/dir/diffcollage/)] \ 262 | 30 Mar 2023 263 | 264 | 265 | 266 | # Motion 267 | 268 | **MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model** \ 269 | *Mingyuan Zhang, Zhongang Cai, Liang Pan, Fangzhou Hong, Xinying Guo, Lei Yang, Ziwei Liu* \ 270 | arXiv 2022. [[Paper](https://arxiv.org/abs/2208.15001)] [[Project](https://mingyuan-zhang.github.io/projects/MotionDiffuse.html)] \ 271 | 31 Aug 2022 272 | 273 | **Human Motion Diffusion Model** \ 274 | *Guy Tevet, Sigal Raab, Brian Gordon, Yonatan Shafir, Amit H. Bermano, Daniel Cohen-Or* \ 275 | arXiv 2022. [[Paper](https://arxiv.org/abs/2209.14916)] [[Project](https://guytevet.github.io/mdm-page/)] \ 276 | 29 Sep 2022 277 | 278 | **Executing your Commands via Motion Diffusion in Latent Space** \ 279 | *Xin Chen1, Biao Jiang1, Wen Liu, Zilong Huang, Bin Fu, Tao Chen, Jingyi Yu, Gang Yu* \ 280 | CVPR 2023. [[Paper](https://arxiv.org/abs/2212.04048)] [[Github](https://github.com/ChenFengYe/motion-latent-diffusion)] \ 281 | 8 Dec 2022 282 | 283 | **Avatars Grow Legs: Generating Smooth Human Motion from Sparse Tracking Inputs with Diffusion Model** \ 284 | *Yuming Du, Robin Kips, Albert Pumarola, Sebastian Starke, Ali Thabet, Artsiom Sanakoyeu* \ 285 | CVPR 2023. [[Paper](https://arxiv.org/abs/2304.08577)] [[Project](https://dulucas.github.io/agrol/)] \ 286 | 17 Apr 2023 287 | 288 | **Human Motion Diffusion as a Generative Prior** \ 289 | *Yonatan Shafir1, Guy Tevet1, Roy Kapon, Amit H. Bermano* \ 290 | arXiv 2023. [[Paper](https://arxiv.org/abs/2303.01418)] \ 291 | 2 Mar 2023 292 | 293 | 294 | ## Limitations & Mitigations 295 | 296 | **Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models** \ 297 | *Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, Tom Goldstein* \ 298 | CVPR 2023.
[[Paper](https://openaccess.thecvf.com/content/CVPR2023/papers/Somepalli_Diffusion_Art_or_Digital_Forgery_Investigating_Data_Replication_in_Diffusion_CVPR_2023_paper.pdf)] [[Github](https://github.com/somepago/DCR)] \ 299 | 7 Dec 2022 300 | 301 | **Extracting Training Data from Diffusion Models** \ 302 | *Nicholas Carlini, Jamie Hayes, Milad Nasr, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, Eric Wallace* \ 303 | arXiv 2023. [[Paper](https://arxiv.org/abs/2301.13188)] \ 304 | 30 Jan 2023 305 | 306 | **Erasing Concepts from Diffusion Models** \ 307 | *Rohit Gandikota, Joanna Materzynska, Jaden Fiotto-Kaufman, David Bau* \ 308 | arXiv 2023. [[Paper](https://arxiv.org/abs/2303.07345)] \ 309 | 13 March 2023 310 | 311 | **Ablating Concepts in Text-to-Image Diffusion Models** \ 312 | *Nupur Kumari, Bin Zhang, Sheng-Yu Wang, Eli Shechtman, Richard Zhang, Jun-Yan Zhu* \ 313 | arXiv 2023. [[Paper](https://arxiv.org/abs/2303.13516)] \ 314 | 23 March 2023 315 | 316 | **Understanding and Mitigating Copying in Diffusion Models** \ 317 | *Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, Tom Goldstein* \ 318 | arXiv 2023. [[Paper](https://arxiv.org/abs/2305.20086)] [[Github](https://github.com/somepago/DCR)] \ 319 | 31 May 2023 320 | 321 | **Extracting Training Data from Diffusion Models** \ 322 | *Nicholas Carlini1, Jamie Hayes1, Milad Nasr1, Matthew Jagielski, Vikash Sehwag, Florian Tramèr, Borja Balle, Daphne Ippolito, Eric Wallace* \ 323 | arXiv 2023. [[Paper](https://arxiv.org/abs/2302.00860)] \ 324 | 2 Feb 2023 --------------------------------------------------------------------------------