├── .gitignore
├── LICENSE
├── README.md
├── images
│   ├── cat.jpg
│   └── vae_ddpm.jpg
├── notebooks
│   ├── Random Variables.ipynb
│   ├── all_you_need_to_know_about_gaussian.ipynb
│   └── deep_dive_into_ddpms.ipynb
└── plots
    ├── bivariate_negative_covariance_density.html
    ├── bivariate_negative_covariance_density.png
    ├── bivariate_normal_example.html
    ├── bivariate_normal_example.png
    ├── bivariate_positive_covariance_density.html
    ├── bivariate_positive_covariance_density.png
    ├── bivariate_zero_covariance_density.html
    ├── bivariate_zero_covariance_density.png
    ├── convolution_of_pdfs.png
    ├── covariance_pair_plot.png
    ├── efficient_forward_process.png
    ├── forward_process.png
    ├── isotropic_gaussian.html
    ├── isotropic_gaussian.png
    ├── univariate_normal_example.html
    ├── univariate_normal_example.png
    └── univariate_with_dff_mu_sigma.png
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | *.DS_Store
3 | __pycache__/
4 | *.py[cod]
5 | *$py.class
6 |
7 | # C extensions
8 | *.so
9 |
10 | # Distribution / packaging
11 | .Python
12 | build/
13 | develop-eggs/
14 | dist/
15 | downloads/
16 | eggs/
17 | .eggs/
18 | lib/
19 | lib64/
20 | parts/
21 | sdist/
22 | var/
23 | wheels/
24 | pip-wheel-metadata/
25 | share/python-wheels/
26 | *.egg-info/
27 | .installed.cfg
28 | *.egg
29 | MANIFEST
30 |
31 | # PyInstaller
32 | # Usually these files are written by a python script from a template
33 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
34 | *.manifest
35 | *.spec
36 |
37 | # Installer logs
38 | pip-log.txt
39 | pip-delete-this-directory.txt
40 |
41 | # Unit test / coverage reports
42 | htmlcov/
43 | .tox/
44 | .nox/
45 | .coverage
46 | .coverage.*
47 | .cache
48 | nosetests.xml
49 | coverage.xml
50 | *.cover
51 | *.py,cover
52 | .hypothesis/
53 | .pytest_cache/
54 |
55 | # Translations
56 | *.mo
57 | *.pot
58 |
59 | # Django stuff:
60 | *.log
61 | local_settings.py
62 | db.sqlite3
63 | db.sqlite3-journal
64 |
65 | # Flask stuff:
66 | instance/
67 | .webassets-cache
68 |
69 | # Scrapy stuff:
70 | .scrapy
71 |
72 | # Sphinx documentation
73 | docs/_build/
74 |
75 | # PyBuilder
76 | target/
77 |
78 | # Jupyter Notebook
79 | .ipynb_checkpoints
80 |
81 | # IPython
82 | profile_default/
83 | ipython_config.py
84 |
85 | # pyenv
86 | .python-version
87 |
88 | # pipenv
89 | # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
90 | # However, in case of collaboration, if having platform-specific dependencies or dependencies
91 | # having no cross-platform support, pipenv may install dependencies that don't work, or not
92 | # install all needed dependencies.
93 | #Pipfile.lock
94 |
95 | # PEP 582; used by e.g. github.com/David-OConnor/pyflow
96 | __pypackages__/
97 |
98 | # Celery stuff
99 | celerybeat-schedule
100 | celerybeat.pid
101 |
102 | # SageMath parsed files
103 | *.sage.py
104 |
105 | # Environments
106 | .env
107 | .venv
108 | env/
109 | venv/
110 | ENV/
111 | env.bak/
112 | venv.bak/
113 |
114 | # Spyder project settings
115 | .spyderproject
116 | .spyproject
117 |
118 | # Rope project settings
119 | .ropeproject
120 |
121 | # mkdocs documentation
122 | /site
123 |
124 | # mypy
125 | .mypy_cache/
126 | .dmypy.json
127 | dmypy.json
128 |
129 | # Pyre type checker
130 | .pyre/
131 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2022 Aakash Kumar Nain
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | **Update:** Blog posts: [DDPMs from scratch](https://magic-with-latents.github.io/latent/ddpms-series.html)
2 |
3 | # Diffusion Models
4 |
5 | Diffusion models are a class of likelihood-based generative models that have recently been used
6 | to produce very high-quality images compared to other existing generative models like GANs.
7 | For example, take a look at the latest research, such as [Imagen](https://imagen.research.google/) or
8 | [GLIDE](https://arxiv.org/abs/2112.10741), where the authors used diffusion models to generate
9 | very high-quality images.
10 |
11 | Although you can find a lot of material online for learning about other generative models like GANs,
12 | the list of resources for learning about diffusion models is still sparse. On top
13 | of that, the mathematics behind diffusion models is a bit harder to understand. To address
14 | this, we are creating this repo to give you enough material to understand how
15 | diffusion models work and the maths involved.
16 |
17 | We try to keep everything organized in notebooks which you can run on **Colab.**
18 | We are also organizing the content in a series of short blog posts, but that will take some time.
19 | Also, some of the notebooks presented here are marked as *optional*. These notebooks cover
20 | the theoretical parts that you should be aware of before reading about diffusion models.
21 |
22 | ---
23 |
24 | ## Table of Contents
25 |
26 | | Chapter No | Topic | Colab | GitHub |
27 | | ------------ | ----------------------------------- | ----- | ------ |
28 | | 1. | [**Random Variables (Optional)**](https://magic-with-latents.github.io/latent/posts/ddpms/part1/) | [Colab](https://colab.research.google.com/github/AakashKumarNain/diffusion_models/blob/main/notebooks/Random%20Variables.ipynb) | [GitHub](https://github.com/AakashKumarNain/diffusion_models/blob/main/notebooks/Random%20Variables.ipynb) |
29 | | 2. | [**Gaussian Distribution and DDPMs**](https://magic-with-latents.github.io/latent/posts/ddpms/part2/) | [Colab](https://colab.research.google.com/github/AakashKumarNain/diffusion_models/blob/main/notebooks/all_you_need_to_know_about_gaussian.ipynb) | [GitHub](https://github.com/AakashKumarNain/diffusion_models/blob/main/notebooks/all_you_need_to_know_about_gaussian.ipynb) |
30 | | 3. | [**A deep dive into DDPMs**](https://magic-with-latents.github.io/latent/posts/ddpms/part3/) | [Colab](https://colab.research.google.com/github/AakashKumarNain/diffusion_models/blob/main/notebooks/deep_dive_into_ddpms.ipynb) | [GitHub](https://github.com/AakashKumarNain/diffusion_models/blob/main/notebooks/deep_dive_into_ddpms.ipynb) |
31 |
--------------------------------------------------------------------------------
/images/cat.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AakashKumarNain/diffusion_models/798f5d728f9414a1e3bb469639535730696786f6/images/cat.jpg
--------------------------------------------------------------------------------
/images/vae_ddpm.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AakashKumarNain/diffusion_models/798f5d728f9414a1e3bb469639535730696786f6/images/vae_ddpm.jpg
--------------------------------------------------------------------------------
/notebooks/Random Variables.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "id": "8470072a",
6 | "metadata": {},
7 | "source": [
8 | "**Author:** Rishabh Anand ([@rishabh16_](http://twitter.com/rishabh16_))<br>\n",
9 | "**Date created:** 2022/07/29<br>\n",
10 | "**Last modified:** 2022/08/01<br>\n",
11 | "**Description:** Refresher on Random Variables (optional read)<br>\n",
12 | "\n",
13 | "\n",
14 | "#### Foreword\n",
15 | "\n",
16 | "This is the first (optional) chapter in a 3-part series. It covers the basics of Random Variables. We do not cover a lot of material here, as we believe there is enough available online. Random Variables are a core component of statistics and probability. This chapter places a heavier emphasis on understanding the theory than on implementation specifics."
17 | ]
18 | },
19 | {
20 | "cell_type": "markdown",
21 | "id": "45f3840e",
22 | "metadata": {},
23 | "source": [
24 | "# 1. Refresher on Random Variables\n",
25 | "The study of probability (and most of statistics) revolves around “containers of information” that can hold different contents depending on which event occurs. These containers are called Random Variables (RVs), and they can take different values (one value at a time, not a combination). Each event $E_i$ (and hence, each value) also has an associated probability $p_i$ of happening. \n",
26 | "\n",
27 | "For instance, suppose we roll a fair die once. Let’s call the result of the roll $X$. Here, $X$ is a RV with 6 associated events, one for each face the die can land on, and as a result, 6 possible values, depending on which event takes place. After rolling, we are left with an upward-facing side with a number on it, and $X$ is assigned this value. The probability of each event is ⅙ (since we’re using a fair die): there’s a 1 in 6 chance that any given number will face upwards after the die has landed.\n",
28 | "\n",
29 | "\n",
30 | "### 1.1 Continuous Random Variables\n",
31 | "\n",
32 | "Continuous RVs can take any real number ($\\mathbb{R}$) as a value, depending on the context. When filling up a bottle, your bottle can have 30.6 ml or 56.89 ml or 350.47 ml of water in it, and the daily temperature can be 35°C or 29.8°C or 37.2°C. The quantity fluctuates and can take intermediate values. Some continuous RVs are unbounded, while others are restricted to a range.\n",
33 | "\n",
34 | "### 1.2 Discrete Random Variables\n",
35 | "\n",
36 | "These usually take integer values ($\\mathbb{Z}$) that may or may not fall in a specific range, depending on the context. Examples include the face value of a die when rolled, a card pulled from a deck, or letter grades on an exam like A, B, C, D, F. What these examples have in common is that the collection of values from which they are sampled is **limited** and **finite**. There is no \"in-between\" value (e.g., you cannot get 5.5 on a die). "
37 | ]
38 | },
39 | {
40 | "cell_type": "markdown",
41 | "id": "e74f9180",
42 | "metadata": {},
43 | "source": [
44 | "# 2. Independence\n",
45 | "Suppose there are two events. If one does not affect the other, the two events are independent of each other. But what does “affect” mean in this context? It means that the occurrence of one event does not change the probability of the other occurring. For instance, if I flip a fair coin 3 times, the second flip does not depend on the first flip, nor does the third flip depend on the second; in essence, the past outcome does not decide the future outcome. While there might be some causal link between events in the real world, in theory we consider the two events in isolation from each other, with no other factors involved. \n",
46 | "\n",
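"Formally, two events $A$ and $B$ are independent exactly when the probability of both occurring equals the product of their individual probabilities. As a small worked example in standard probability notation (the symbols $A$ and $B$ are introduced here only for illustration), consider two flips of a fair coin:\n",
"\n",
"$$\n",
"\\begin{align}\n",
"    P(A \\cap B) &= P(A) \\cdot P(B) \\\\\n",
"    P(\\text{heads on flip 1 and heads on flip 2}) &= \\frac{1}{2} \\cdot \\frac{1}{2} = \\frac{1}{4}\n",
"\\end{align}\n",
"$$\n",
"\n",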
47 | "On the other hand, dependent events impact the occurrence of one another. For example, if you misuse your vehicle on the road, there’s a higher probability of getting caught by the authorities as compared to using your vehicle appropriately. The occurrence of one event changes the probability of the other event occurring.\n"
48 | ]
49 | },
50 | {
51 | "cell_type": "markdown",
52 | "id": "dd098b09",
53 | "metadata": {},
54 | "source": [
55 | "# 3. Expectation \n",
56 | "When we roll a die many times (say, 1000 times), we’d like to know what score we get on average. The Expectation of a RV is a weighted sum of all its possible values, weighted by their respective probabilities of occurrence. For a fair die, each face value, 1 to 6, has probability $p_i = \\frac{1}{6}$ of occurring. In general, the expectation or expected value of a RV is\n",
57 | "\n",
58 | "$$\n",
59 | "\\begin{align}\n",
60 | " \\mathbb{E}(X) &= \\sum_{i} p_i \\cdot V_i\n",
61 | "\\end{align}\n",
62 | "$$\n",
63 | "\n",
64 | "where $p_i$ is the probability of event $E_i$ taking place, and $V_i$ is the value associated with that event. You may be wondering why the expectation can be a number, such as 3.5 for a fair die, that is not a possible value of the RV. The expectation is a long-run average, not the most frequently occurring value, so it does not have to coincide with any single outcome. It tells us the average face value we get after rolling the die numerous times. "
65 | ]
66 | },
67 | {
68 | "cell_type": "markdown",
69 | "id": "1c152c71",
70 | "metadata": {},
71 | "source": [
72 | "# 4. Variance\n",
73 | "As you roll the fair die, you probably won’t get the same value again and again; you’ll notice some deviation. The expected value of a fair die is 3.5. The Variance measures how far the die value deviates from this expectation on average: it is the expected squared deviation from the expectation. It’s always a non-negative number that represents how much the value of a RV spreads on either side of the expectation. \n",
74 | "\n",
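"To see where the value 3.5 comes from, here is a short worked computation that applies the weighted sum $\\sum_{i} p_i \\cdot V_i$ to a fair die, with $p_i = \\frac{1}{6}$ and $V_i \\in \\{1, \\dots, 6\\}$:\n",
"\n",
"$$\n",
"\\begin{align}\n",
"    \\mathbb{E}(X) &= \\frac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = \\frac{21}{6} = 3.5\n",
"\\end{align}\n",
"$$\n",
"\n",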
75 | "In fact, we’ve been using the term “deviation” a lot here. This is no coincidence: the variance is the square of the standard deviation $\\sigma$ of a RV, i.e., $\\text{Var}(X) = \\sigma^2$. The variance of a random variable $X$ can be computed as follows:\n",
76 | "\n",
77 | "$$\n",
78 | "\\begin{align}\n",
79 | " \\text{Var}(X) &= \\mathbb{E}(X^2) - (\\mathbb{E}(X))^2 \\\\\n",
80 | " &= \\mathbb{E}\\left[(X - \\mathbb{E}(X))^2\\right]\n",
81 | "\\end{align} \n",
82 | "$$"
83 | ]
84 | },
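{
"cell_type": "markdown",
"id": "variance-fair-die-example",
"metadata": {},
"source": [
"As a quick check of the identity $\\text{Var}(X) = \\mathbb{E}(X^2) - (\\mathbb{E}(X))^2$, here is a worked computation for a fair die. It is only a sketch that reuses $p_i = \\frac{1}{6}$ and $\\mathbb{E}(X) = 3.5 = \\frac{7}{2}$ from above:\n",
"\n",
"$$\n",
"\\begin{align}\n",
"    \\mathbb{E}(X^2) &= \\frac{1}{6}(1 + 4 + 9 + 16 + 25 + 36) = \\frac{91}{6} \\\\\n",
"    \\text{Var}(X) &= \\frac{91}{6} - \\left(\\frac{7}{2}\\right)^2 = \\frac{35}{12} \\approx 2.92 \\\\\n",
"    \\sigma &= \\sqrt{\\frac{35}{12}} \\approx 1.71\n",
"\\end{align}\n",
"$$"
]
},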
85 | {
86 | "cell_type": "markdown",
87 | "id": "a9c52c9c",
88 | "metadata": {},
89 | "source": [
90 | "# 5. Distributions\n",
91 | "Random Variables have associated probabilities that dictate the chances of an event occurring. For a fair die, the probability of each face is $\\frac{1}{6}$, and for a fair coin, the probability of each side is $\\frac{1}{2}$. However, what is the characteristic of the random variable that shapes these probabilities? \n",
92 | "\n",
93 | "This is given by the Distribution, a function that gives the probability of an event taking place. For some distributions, all events have the same probability, while other distributions weight certain events more than others (making some events more likely than others). There are a whole bunch of distributions that describe both synthetic and real-world systems. Here are examples of some distributions:\n",
94 | "\n",
95 | "\n",
96 | "\n",
97 | "### 5.1 Uniform Distribution\n",
98 | "For starters, let’s look at the Uniform Distribution. To represent a RV from a Uniform Distribution, we denote it as $X \\sim U(a, b)$. This distribution has two parameters we need to supply: $a$ and $b$. They denote the bounds of the possible values the RV can assume. Here, every value within the interval $[a, b]$ is equally likely. For example, if we have $X \\sim U(0, 1)$, all the values inside that range (like 0.1, 0.2, 0.03, 0.023452) are equally likely to occur. \n",
99 | "\n",
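"For a continuous RV, “equally likely” can be made precise with a probability density function. As a sketch in standard notation (the symbols $f$, $c$ and $d$ are introduced here only for illustration), the density of $X \\sim U(a, b)$ is constant on $[a, b]$, and the probability of landing in a sub-interval $[c, d]$ depends only on its length:\n",
"\n",
"$$\n",
"\\begin{align}\n",
"    f(x) &= \\frac{1}{b - a} \\quad \\text{for } a \\le x \\le b \\\\\n",
"    P(c \\le X \\le d) &= \\frac{d - c}{b - a} \\quad \\text{for } a \\le c \\le d \\le b\n",
"\\end{align}\n",
"$$\n",
"\n",
"For example, with $X \\sim U(0, 1)$, we get $P(0.2 \\le X \\le 0.5) = 0.3$.\n",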
100 | "\n",
101 | "\n",
102 | "### 5.2 Binomial Distribution\n",
103 | "In many real-life applications, there’s a notion of failure or success associated with events. A random variable can hold the value of success or failure, with a certain probability $p$ of success. The Binomial Distribution allows us to scale this single FAIL/PASS trial to many independent trials, each with the same probability of success. To represent a RV from a Binomial Distribution, we denote it as $X \\sim \\text{Bin}(p, n)$. We supply two parameters again: $p$ is the probability of success and $n$ is the number of samples we are considering, each of which can either be a success or a failure. The RV $X$ counts the number of successes among the $n$ samples. \n",
104 | "\n",
105 | "For instance, at a factory, a certain machine part is manufactured without defects with probability 0.75. If the factory wants to test a bunch of samples for quality assurance, they can collect a sample of 100. Using this, we can answer questions like “What is the probability of exactly 90 objects passing the defect test?” or “What is the probability of more than 10 objects failing the defect test?” and make changes to the process accordingly. Here, we’d say $X \\sim \\text{Bin}(0.75, 100)$.\n",
106 | "\n",
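"Questions like these can be answered with the binomial probability mass function. As a sketch in standard notation, with $k$ denoting the number of parts that pass the test (a symbol introduced here only for illustration), the two factory questions become:\n",
"\n",
"$$\n",
"\\begin{align}\n",
"    P(X = k) &= \\binom{n}{k} p^k (1 - p)^{n - k} \\\\\n",
"    P(X = 90) &= \\binom{100}{90} (0.75)^{90} (0.25)^{10} \\\\\n",
"    P(\\text{more than 10 fail}) &= P(X \\le 89) = \\sum_{k=0}^{89} \\binom{100}{k} (0.75)^{k} (0.25)^{100 - k}\n",
"\\end{align}\n",
"$$\n",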
107 | "\n",
108 | "\n",
109 | "### 5.3 Normal/Gaussian Distribution\n",
110 | "\n",
111 | "This is important for the understanding of Diffusion Models. In the real world, not everything is as clear-cut as FAIL/PASS. Nor do all events have the same probability of occurrence. Some events occur more often than others, making them statistically more probable. For example, in a sunny country like Singapore, the chances of a sunny day are much higher than the chances of a rainy day or cloudy day, _ceteris paribus_. The Normal Distribution helps us represent such events. To represent a RV from a Normal Distribution, we denote it as $X \\sim N(\\mu, \\sigma^2)$. There are two parameters: $\\mu$ is the mean (which is also the mode) of the RV, while $\\sigma^2$ is its variance (i.e., how spread out are the values around this mean?). \n",
112 | "\n",
113 | "In the next chapter, we cover the technical and implementation-specific details of the Normal Distribution and how it's used in Diffusion Models."
114 | ]
115 | },
116 | {
117 | "cell_type": "code",
118 | "execution_count": null,
119 | "id": "1b427e02",
120 | "metadata": {},
121 | "outputs": [],
122 | "source": []
123 | }
124 | ],
125 | "metadata": {
126 | "kernelspec": {
127 | "display_name": "Python 3 (ipykernel)",
128 | "language": "python",
129 | "name": "python3"
130 | },
131 | "language_info": {
132 | "codemirror_mode": {
133 | "name": "ipython",
134 | "version": 3
135 | },
136 | "file_extension": ".py",
137 | "mimetype": "text/x-python",
138 | "name": "python",
139 | "nbconvert_exporter": "python",
140 | "pygments_lexer": "ipython3",
141 | "version": "3.10.4"
142 | }
143 | },
144 | "nbformat": 4,
145 | "nbformat_minor": 5
146 | }
147 |
--------------------------------------------------------------------------------
/plots/bivariate_negative_covariance_density.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AakashKumarNain/diffusion_models/798f5d728f9414a1e3bb469639535730696786f6/plots/bivariate_negative_covariance_density.png
--------------------------------------------------------------------------------
/plots/bivariate_normal_example.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AakashKumarNain/diffusion_models/798f5d728f9414a1e3bb469639535730696786f6/plots/bivariate_normal_example.png
--------------------------------------------------------------------------------
/plots/bivariate_positive_covariance_density.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AakashKumarNain/diffusion_models/798f5d728f9414a1e3bb469639535730696786f6/plots/bivariate_positive_covariance_density.png
--------------------------------------------------------------------------------
/plots/bivariate_zero_covariance_density.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AakashKumarNain/diffusion_models/798f5d728f9414a1e3bb469639535730696786f6/plots/bivariate_zero_covariance_density.png
--------------------------------------------------------------------------------
/plots/convolution_of_pdfs.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AakashKumarNain/diffusion_models/798f5d728f9414a1e3bb469639535730696786f6/plots/convolution_of_pdfs.png
--------------------------------------------------------------------------------
/plots/covariance_pair_plot.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AakashKumarNain/diffusion_models/798f5d728f9414a1e3bb469639535730696786f6/plots/covariance_pair_plot.png
--------------------------------------------------------------------------------
/plots/efficient_forward_process.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AakashKumarNain/diffusion_models/798f5d728f9414a1e3bb469639535730696786f6/plots/efficient_forward_process.png
--------------------------------------------------------------------------------
/plots/forward_process.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AakashKumarNain/diffusion_models/798f5d728f9414a1e3bb469639535730696786f6/plots/forward_process.png
--------------------------------------------------------------------------------
/plots/isotropic_gaussian.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AakashKumarNain/diffusion_models/798f5d728f9414a1e3bb469639535730696786f6/plots/isotropic_gaussian.png
--------------------------------------------------------------------------------
/plots/univariate_normal_example.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AakashKumarNain/diffusion_models/798f5d728f9414a1e3bb469639535730696786f6/plots/univariate_normal_example.png
--------------------------------------------------------------------------------
/plots/univariate_with_dff_mu_sigma.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AakashKumarNain/diffusion_models/798f5d728f9414a1e3bb469639535730696786f6/plots/univariate_with_dff_mu_sigma.png
--------------------------------------------------------------------------------