├── .gitignore
├── README.md
├── cover.png
├── notebooks
│   ├── 01_normal.ipynb
│   ├── 02_mle.ipynb
│   ├── 03_multi_normal.ipynb
│   ├── 04_gmm.ipynb
│   ├── 05_em.ipynb
│   ├── 06_pytorch.ipynb
│   ├── 07_vae.ipynb
│   ├── 08_hvae.ipynb
│   ├── 09_diffusion.ipynb
│   ├── 10_diffusion2.ipynb
│   ├── flower.png
│   └── old_faithful.txt
├── posters
│   ├── 1부. 시야를 찾아서.png
│   ├── 2부. 상어공주.png
│   ├── 3부. DeZero의 창조자.png
│   ├── 4부. 제발, 가즈아!.png
│   ├── 5부. 피쉬카소와 천재의 초상.png
│   ├── README.md
│   └── 바닷속 딥러닝 어드벤처.png
├── requirements.txt
├── step01
│   ├── norm_dist.py
│   ├── norm_param.py
│   ├── sample_avg.py
│   └── sample_sum.py
├── step02
│   ├── fit.py
│   ├── generate.py
│   ├── height.txt
│   ├── hist.py
│   └── prob.py
├── step03
│   ├── height_weight.txt
│   ├── mle.py
│   ├── numpy_basis.py
│   ├── numpy_matrix.py
│   ├── plot_3d.py
│   ├── plot_dataset.py
│   └── plot_norm.py
├── step04
│   ├── gmm.py
│   ├── gmm_sampling.py
│   ├── old_faithful.py
│   └── old_faithful.txt
├── step05
│   ├── em.py
│   ├── generate.py
│   └── old_faithful.txt
├── step06
│   ├── gradient.py
│   ├── neuralnet.py
│   ├── regression.py
│   ├── tensor.py
│   └── vision.py
├── step07
│   └── vae.py
├── step08
│   └── hvae.py
├── step09
│   ├── diffusion_model.py
│   ├── flower.png
│   ├── gaussian_noise.py
│   └── simple_unet.py
└── step10
    ├── classifier_free_guidance.py
    └── conditional.py
/.gitignore:
--------------------------------------------------------------------------------
1 | .DS_Store
2 | *~
3 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
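1 | # 『밑바닥부터 시작하는 딥러닝 ❺』 (Deep Learning from Scratch ❺): From the Normal Distribution to Generative Models!
2 |
3 |
4 |
5 | **The principles of image generation models in 10 steps!**
6 |
7 | This book explains a range of generative models, starting from basic concepts such as the normal distribution and maximum likelihood estimation and moving on to Gaussian mixture models, variational autoencoders (VAE), hierarchical VAEs, and diffusion models. It works through the equations and algorithms carefully and shows how to implement them in practice, building on mathematical theory and Python programming. You learn generative models through hands-on practice as well as theory. In particular, the 10 steps leading up to diffusion models are woven into a single story, so you can see how the key techniques connect to and improve on one another. Start generative models from scratch with this book.
8 |
9 | [Preview](https://www.yes24.com/Product/Viewer/Preview/134648807) | [Known errata](https://docs.google.com/document/d/1SU7b_emm3Lqha4BfVLTr4Ae6eTg32BkKFWMEXl6N_vA) | [Collection of figures and equation images from the book](https://drive.google.com/file/d/1bMxCjB_SJzc7oJ913QT6Yn9sn3fjsymn/view?usp=drive_link)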
10 |
11 | ## File layout
12 |
13 | |Folder |Description |
14 | |:-- |:-- |
15 | |`step01` |Code used in Chapter 1 |
16 | |`step02` |Code used in Chapter 2 |
17 | |... |... |
18 | |`step10` |Code used in Chapter 10 |
19 | |`notebooks` |Code for Chapters 1 to 10 (Jupyter notebook format)|
20 |
21 |
22 | ## Jupyter notebooks
23 |
24 | The code in this book is also available as Jupyter notebooks. Click a link in the table below to run the notebook on the corresponding cloud service.
25 |
26 | | Step | Colab | Kaggle | Studio Lab |
27 | | :--- | :--- | :--- | :--- |
28 | | 1. Normal distribution | [Open in Colab](https://colab.research.google.com/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/01_normal.ipynb) | [Open in Kaggle](https://kaggle.com/kernels/welcome?src=https://github.com/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/01_normal.ipynb) | [Open in Studio Lab](https://studiolab.sagemaker.aws/import/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/01_normal.ipynb) |
29 | | 2. Maximum likelihood estimation | [Open in Colab](https://colab.research.google.com/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/02_mle.ipynb) | [Open in Kaggle](https://kaggle.com/kernels/welcome?src=https://github.com/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/02_mle.ipynb) | [Open in Studio Lab](https://studiolab.sagemaker.aws/import/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/02_mle.ipynb) |
30 | | 3. Multivariate normal distribution | [Open in Colab](https://colab.research.google.com/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/03_multi_normal.ipynb) | [Open in Kaggle](https://kaggle.com/kernels/welcome?src=https://github.com/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/03_multi_normal.ipynb) | [Open in Studio Lab](https://studiolab.sagemaker.aws/import/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/03_multi_normal.ipynb) |
31 | | 4. Gaussian mixture model | [Open in Colab](https://colab.research.google.com/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/04_gmm.ipynb) | [Open in Kaggle](https://kaggle.com/kernels/welcome?src=https://github.com/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/04_gmm.ipynb) | [Open in Studio Lab](https://studiolab.sagemaker.aws/import/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/04_gmm.ipynb) |
32 | | 5. EM algorithm | [Open in Colab](https://colab.research.google.com/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/05_em.ipynb) | [Open in Kaggle](https://kaggle.com/kernels/welcome?src=https://github.com/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/05_em.ipynb) | [Open in Studio Lab](https://studiolab.sagemaker.aws/import/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/05_em.ipynb) |
33 | | 6. Neural networks | [Open in Colab](https://colab.research.google.com/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/06_pytorch.ipynb) | [Open in Kaggle](https://kaggle.com/kernels/welcome?src=https://github.com/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/06_pytorch.ipynb) | [Open in Studio Lab](https://studiolab.sagemaker.aws/import/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/06_pytorch.ipynb) |
34 | | 7. Variational autoencoder | [Open in Colab](https://colab.research.google.com/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/07_vae.ipynb) | [Open in Kaggle](https://kaggle.com/kernels/welcome?src=https://github.com/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/07_vae.ipynb) | [Open in Studio Lab](https://studiolab.sagemaker.aws/import/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/07_vae.ipynb) |
35 | | 8. Diffusion model theory | [Open in Colab](https://colab.research.google.com/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/08_hvae.ipynb) | [Open in Kaggle](https://kaggle.com/kernels/welcome?src=https://github.com/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/08_hvae.ipynb) | [Open in Studio Lab](https://studiolab.sagemaker.aws/import/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/08_hvae.ipynb) |
36 | | 9. Diffusion model implementation | [Open in Colab](https://colab.research.google.com/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/09_diffusion.ipynb) | [Open in Kaggle](https://kaggle.com/kernels/welcome?src=https://github.com/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/09_diffusion.ipynb) | [Open in Studio Lab](https://studiolab.sagemaker.aws/import/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/09_diffusion.ipynb) |
37 | | 10. Diffusion model applications | [Open in Colab](https://colab.research.google.com/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/10_diffusion2.ipynb) | [Open in Kaggle](https://kaggle.com/kernels/welcome?src=https://github.com/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/10_diffusion2.ipynb) | [Open in Studio Lab](https://studiolab.sagemaker.aws/import/github/oreilly-japan/deep-learning-from-scratch-5/blob/master/notebooks/10_diffusion2.ipynb) |
38 |
39 |
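40 | ## Python and external libraries
41 |
42 | The following libraries are required to run the source code.
43 |
44 | * NumPy
45 | * Matplotlib
46 | * PyTorch (version 2.x)
47 | * torchvision
48 | * tqdm
49 |
50 | ※ Python 3.x is used. `step02/prob.py` additionally imports SciPy, which is pinned in `requirements.txt`.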
51 |
52 |
53 | ## How to run
54 |
55 | Change into the folder for each chapter and run the script with the `python` command.
56 |
57 | ```
58 | $ cd step01
59 | $ python norm_dist.py
60 |
61 | $ cd ../step02
62 | $ python generate.py
63 | ```
64 |
65 | ---
66 |
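67 | ## Fan fiction: 바닷속 딥러닝 어드벤처 (Undersea Deep Learning Adventure, a five-part series)
68 |
69 |
70 |
71 | "How are the fish that star in <밑바닥부터 시작하는 딥러닝> using deep learning to transform the undersea ecosystem? Set off on an exciting adventure with the fish of the 밑시딥 series as they lead the cutting edge of fish AI (어공지능)."
72 |
73 | Set in an undersea world, this series of stories follows marine creatures on adventures in which they develop deep learning techniques suited to their own traits and needs and use them to solve problems. Deep learning references are planted throughout so that readers of the book series can enjoy the stories even more.
74 |
75 | The protagonist and theme of each part are as follows.
76 |
77 | 1. **시야를 찾아서 (In Search of Sight)**: Ssom the rockfish develops **image processing techniques** to see the surroundings clearly
78 | 1. **상어공주 (The Shark Princess)**: Kkwaeng the bullhead shark princess wins the love of the dolphin prince with **natural language processing**
79 | 1. **DeZero의 창조자 (The Creator of DeZero)**: Ipryong the leafy seadragon builds a **deep learning framework** to spread the technology and foster collaboration
80 | 1. **제발, 가즈아! (Please, Let's Go!)**: Gajua the flounder uses **deep reinforcement learning** to open up new, food-rich waters
81 | 1. **피쉬카소와 천재의 초상 (Fishcasso and the Portrait of a Genius)**: Fishcasso the ghost pipefish transforms the undersea art world with **image generation models**
82 |
83 | Read the stories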
84 |
85 | ---
86 |
87 | ## License
88 |
89 | The source code in this repository is released under the [MIT License](http://www.opensource.org/licenses/MIT). You are free to use it for both commercial and non-commercial purposes.
90 |
--------------------------------------------------------------------------------
/cover.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WegraLee/deep-learning-from-scratch-5/86be5ee971cf3cfe3795262f1ee07ce2322e9c9a/cover.png
--------------------------------------------------------------------------------
/notebooks/06_pytorch.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Step 6: Neural Network & PyTorch"
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "## tensor.py"
15 | ]
16 | },
17 | {
18 | "cell_type": "code",
19 | "execution_count": 1,
20 | "metadata": {},
21 | "outputs": [
22 | {
23 | "name": "stdout",
24 | "output_type": "stream",
25 | "text": [
26 | "tensor(75., grad_fn=<MulBackward0>)\n",
27 | "tensor(30.)\n"
28 | ]
29 | }
30 | ],
31 | "source": [
32 | "import torch\n",
33 | "\n",
34 | "x = torch.tensor(5.0, requires_grad=True)\n",
35 | "y = 3 * x ** 2\n",
36 | "print(y)\n",
37 | "\n",
38 | "y.backward()\n",
39 | "print(x.grad)"
40 | ]
41 | },
42 | {
43 | "cell_type": "markdown",
44 | "metadata": {},
45 | "source": [
46 | "## gradient.py"
47 | ]
48 | },
49 | {
50 | "cell_type": "code",
51 | "execution_count": 2,
52 | "metadata": {},
53 | "outputs": [
54 | {
55 | "name": "stdout",
56 | "output_type": "stream",
57 | "text": [
58 | "tensor(-2.) tensor(400.)\n",
59 | "0.0 2.0\n",
60 | "0.6815015077590942 0.46292299032211304\n",
61 | "0.8253857493400574 0.6804871559143066\n",
62 | "0.8942827582359314 0.7992911338806152\n",
63 | "0.9331904053688049 0.8705660700798035\n",
64 | "0.9568046927452087 0.9152978658676147\n",
65 | "0.9716982245445251 0.9440822601318359\n",
66 | "0.9813036918640137 0.9628812670707703\n",
67 | "0.98758465051651 0.9752733111381531\n",
68 | "0.9917276501655579 0.983490526676178\n",
69 | "0.9944759607315063 0.9889602065086365\n"
70 | ]
71 | }
72 | ],
73 | "source": [
74 | "import torch\n",
75 | "\n",
76 | "def rosenbrock(x0, x1):\n",
77 | "    y = 100 * (x1 - x0 ** 2) ** 2 + (x0 - 1) ** 2\n",
78 | "    return y\n",
79 | "\n",
80 | "x0 = torch.tensor(0.0, requires_grad=True)\n",
81 | "x1 = torch.tensor(2.0, requires_grad=True)\n",
82 | "\n",
83 | "y = rosenbrock(x0, x1)\n",
84 | "y.backward()\n",
85 | "print(x0.grad, x1.grad)\n",
86 | "\n",
87 | "lr = 0.001 # learning rate\n",
88 | "iters = 10000 # iteration count\n",
89 | "\n",
90 | "for i in range(iters):\n",
91 | "    if i % 1000 == 0:\n",
92 | "        print(x0.item(), x1.item())\n",
93 | "\n",
94 | "    y = rosenbrock(x0, x1)\n",
95 | "\n",
96 | "    y.backward()\n",
97 | "\n",
98 | "    x0.data -= lr * x0.grad.data\n",
99 | "    x1.data -= lr * x1.grad.data\n",
100 | "\n",
101 | "    x0.grad.zero_()\n",
102 | "    x1.grad.zero_()\n",
103 | "\n",
104 | "print(x0.item(), x1.item())"
105 | ]
106 | },
107 | {
108 | "cell_type": "markdown",
109 | "metadata": {},
110 | "source": [
111 | "## regression.py"
112 | ]
113 | },
114 | {
115 | "cell_type": "code",
116 | "execution_count": 3,
117 | "metadata": {},
118 | "outputs": [
119 | {
120 | "name": "stdout",
121 | "output_type": "stream",
122 | "text": [
123 | "41.89796447753906\n",
124 | "0.22483204305171967\n",
125 | "0.0925208106637001\n",
126 | "0.0888015553355217\n",
127 | "0.08627457916736603\n",
128 | "0.08435674756765366\n",
129 | "0.0829005315899849\n",
130 | "0.0817948430776596\n",
131 | "0.08095530420541763\n",
132 | "0.08031783998012543\n",
133 | "0.07987643033266068\n",
134 | "====\n",
135 | "W = 2.2863590717315674\n",
136 | "b = 5.3144850730896\n"
137 | ]
138 | },
139 | {
140 | "data": {
141 | "image/png": "",
142 | "text/plain": [
143 | ""
144 | ]
145 | },
146 | "metadata": {},
147 | "output_type": "display_data"
148 | }
149 | ],
150 | "source": [
151 | "import torch\n",
152 | "\n",
153 | "\n",
154 | "torch.manual_seed(0)\n",
155 | "x = torch.rand(100, 1)\n",
156 | "y = 5 + 2 * x + torch.rand(100, 1)\n",
157 | "\n",
158 | "W = torch.zeros((1, 1), requires_grad=True)\n",
159 | "b = torch.zeros(1, requires_grad=True)\n",
160 | "\n",
161 | "def predict(x):\n",
162 | "    y = x @ W + b\n",
163 | "    return y\n",
164 | "\n",
165 | "def mean_squared_error(x0, x1):\n",
166 | "    diff = x0 - x1\n",
167 | "    N = len(diff)\n",
168 | "    return torch.sum(diff ** 2) / N\n",
169 | "\n",
170 | "lr = 0.1\n",
171 | "iters = 100\n",
172 | "\n",
173 | "for i in range(iters):\n",
174 | "    y_hat = predict(x)\n",
175 | "    loss = mean_squared_error(y, y_hat)\n",
176 | "\n",
177 | "    loss.backward()\n",
178 | "\n",
179 | "    W.data -= lr * W.grad.data\n",
180 | "    b.data -= lr * b.grad.data\n",
181 | "\n",
182 | "    W.grad.zero_()\n",
183 | "    b.grad.zero_()\n",
184 | "\n",
185 | "    if i % 10 == 0: # print every 10 iterations\n",
186 | "        print(loss.item())\n",
187 | "\n",
188 | "print(loss.item())\n",
189 | "print('====')\n",
190 | "print('W =', W.item())\n",
191 | "print('b =', b.item())\n",
192 | "\n",
193 | "\n",
194 | "# plot\n",
195 | "import matplotlib.pyplot as plt\n",
196 | "plt.scatter(x.detach().numpy(), y.detach().numpy(), s=10)\n",
197 | "x = torch.tensor([[0.0], [1.0]])\n",
198 | "y = W.detach().numpy() * x.detach().numpy() + b.detach().numpy()\n",
199 | "plt.plot(x, y, color='red')\n",
200 | "plt.xlabel('x')\n",
201 | "plt.ylabel('y')\n",
202 | "plt.show()"
203 | ]
204 | },
205 | {
206 | "cell_type": "markdown",
207 | "metadata": {},
208 | "source": [
209 | "## neuralnet.py"
210 | ]
211 | },
212 | {
213 | "cell_type": "code",
214 | "execution_count": 4,
215 | "metadata": {},
216 | "outputs": [
217 | {
218 | "name": "stdout",
219 | "output_type": "stream",
220 | "text": [
221 | "0.7643452286720276\n",
222 | "0.23656320571899414\n",
223 | "0.23226076364517212\n",
224 | "0.22441408038139343\n",
225 | "0.21026146411895752\n",
226 | "0.17957879602909088\n",
227 | "0.11798454076051712\n",
228 | "0.08481380343437195\n",
229 | "0.08023109287023544\n",
230 | "0.0796855092048645\n",
231 | "0.07945814728736877\n"
232 | ]
233 | },
234 | {
235 | "data": {
236 | "image/png": "",
237 | "text/plain": [
238 | ""
239 | ]
240 | },
241 | "metadata": {},
242 | "output_type": "display_data"
243 | }
244 | ],
245 | "source": [
246 | "import torch\n",
247 | "import torch.nn as nn\n",
248 | "import torch.nn.functional as F\n",
249 | "\n",
250 | "\n",
251 | "torch.manual_seed(0)\n",
252 | "x = torch.rand(100, 1)\n",
253 | "y = torch.sin(2 * torch.pi * x) + torch.rand(100, 1)\n",
254 | "\n",
255 | "# model\n",
256 | "class Model(nn.Module):\n",
257 | "    def __init__(self, input_size=1, hidden_size=10, output_size=1):\n",
258 | "        super().__init__()\n",
259 | "        self.linear1 = nn.Linear(input_size, hidden_size)\n",
260 | "        self.linear2 = nn.Linear(hidden_size, output_size)\n",
261 | "\n",
262 | "    def forward(self, x):\n",
263 | "        y = self.linear1(x)\n",
264 | "        y = F.sigmoid(y)\n",
265 | "        y = self.linear2(y)\n",
266 | "        return y\n",
267 | "\n",
268 | "\n",
269 | "lr = 0.2\n",
270 | "iters = 10000\n",
271 | "\n",
272 | "model = Model()\n",
273 | "optimizer = torch.optim.SGD(model.parameters(), lr=lr)\n",
274 | "\n",
275 | "for i in range(iters):\n",
276 | "    y_pred = model(x)\n",
277 | "    loss = F.mse_loss(y, y_pred)\n",
278 | "    optimizer.zero_grad()\n",
279 | "    loss.backward()\n",
280 | "    optimizer.step()\n",
281 | "\n",
282 | "    if i % 1000 == 0:\n",
283 | "        print(loss.item())\n",
284 | "\n",
285 | "print(loss.item())\n",
286 | "\n",
287 | "# plot\n",
288 | "import matplotlib.pyplot as plt\n",
289 | "plt.scatter(x.detach().numpy(), y.detach().numpy(), s=10)\n",
290 | "x = torch.linspace(0, 1, 100).reshape(-1, 1)\n",
291 | "y = model(x).detach().numpy()\n",
292 | "plt.plot(x, y, color='red')\n",
293 | "plt.show()"
294 | ]
295 | },
296 | {
297 | "cell_type": "markdown",
298 | "metadata": {},
299 | "source": [
300 | "## vision.py"
301 | ]
302 | },
303 | {
304 | "cell_type": "code",
305 | "execution_count": 5,
306 | "metadata": {},
307 | "outputs": [
308 | {
309 | "name": "stdout",
310 | "output_type": "stream",
311 | "text": [
312 | "Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\n",
313 | "Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./data/MNIST/raw/train-images-idx3-ubyte.gz\n"
314 | ]
315 | },
316 | {
317 | "name": "stderr",
318 | "output_type": "stream",
319 | "text": [
320 | "100%|██████████| 9912422/9912422 [00:01<00:00, 6778804.78it/s]\n"
321 | ]
322 | },
323 | {
324 | "name": "stdout",
325 | "output_type": "stream",
326 | "text": [
327 | "Extracting ./data/MNIST/raw/train-images-idx3-ubyte.gz to ./data/MNIST/raw\n",
328 | "\n",
329 | "Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz\n",
330 | "Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ./data/MNIST/raw/train-labels-idx1-ubyte.gz\n"
331 | ]
332 | },
333 | {
334 | "name": "stderr",
335 | "output_type": "stream",
336 | "text": [
337 | "100%|██████████| 28881/28881 [00:00<00:00, 3582730.29it/s]\n"
338 | ]
339 | },
340 | {
341 | "name": "stdout",
342 | "output_type": "stream",
343 | "text": [
344 | "Extracting ./data/MNIST/raw/train-labels-idx1-ubyte.gz to ./data/MNIST/raw\n",
345 | "\n",
346 | "Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz\n",
347 | "Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw/t10k-images-idx3-ubyte.gz\n"
348 | ]
349 | },
350 | {
351 | "name": "stderr",
352 | "output_type": "stream",
353 | "text": [
354 | "100%|██████████| 1648877/1648877 [00:00<00:00, 5282915.73it/s]\n"
355 | ]
356 | },
357 | {
358 | "name": "stdout",
359 | "output_type": "stream",
360 | "text": [
361 | "Extracting ./data/MNIST/raw/t10k-images-idx3-ubyte.gz to ./data/MNIST/raw\n",
362 | "\n",
363 | "Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz\n",
364 | "Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz\n"
365 | ]
366 | },
367 | {
368 | "name": "stderr",
369 | "output_type": "stream",
370 | "text": [
371 | "100%|██████████| 4542/4542 [00:00<00:00, 8873092.11it/s]\n"
372 | ]
373 | },
374 | {
375 | "name": "stdout",
376 | "output_type": "stream",
377 | "text": [
378 | "Extracting ./data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/MNIST/raw\n",
379 | "\n",
380 | "size: 60000\n",
381 | "type: <class 'PIL.Image.Image'>\n",
382 | "label: 5\n"
383 | ]
384 | },
385 | {
386 | "data": {
387 | "image/png": "",
388 | "text/plain": [
389 | ""
390 | ]
391 | },
392 | "metadata": {},
393 | "output_type": "display_data"
394 | },
395 | {
396 | "name": "stdout",
397 | "output_type": "stream",
398 | "text": [
399 | "type: <class 'torch.Tensor'>\n",
400 | "shape: torch.Size([1, 28, 28])\n",
401 | "x shape: torch.Size([32, 1, 28, 28])\n",
402 | "label shape: torch.Size([32])\n"
403 | ]
404 | }
405 | ],
406 | "source": [
407 | "import torch\n",
408 | "import torchvision\n",
409 | "import torchvision.transforms as transforms\n",
410 | "import matplotlib.pyplot as plt\n",
411 | "\n",
412 | "\n",
413 | "# ==== MNIST ====\n",
414 | "dataset = torchvision.datasets.MNIST(\n",
415 | "    root='./data',\n",
416 | "    train=True,\n",
417 | "    transform=None,\n",
418 | "    download=True\n",
419 | ")\n",
420 | "\n",
421 | "x, label = dataset[0]\n",
422 | "\n",
423 | "print('size:', len(dataset)) # size: 60000\n",
424 | "print('type:', type(x)) # type: <class 'PIL.Image.Image'>\n",
425 | "print('label:', label) # label: 5\n",
426 | "\n",
427 | "plt.imshow(x, cmap='gray')\n",
428 | "plt.show()\n",
429 | "\n",
430 | "\n",
431 | "# ==== preprocess ====\n",
432 | "transform = transforms.ToTensor()\n",
433 | "\n",
434 | "dataset = torchvision.datasets.MNIST(\n",
435 | "    root='./data',\n",
436 | "    train=True,\n",
437 | "    transform=transform,\n",
438 | "    download=True\n",
439 | ")\n",
440 | "\n",
441 | "x, label = dataset[0]\n",
442 | "print('type:', type(x)) # type: <class 'torch.Tensor'>\n",
443 | "print('shape:', x.shape) # shape: torch.Size([1, 28, 28])\n",
444 | "\n",
445 | "\n",
446 | "# ==== DataLoader ====\n",
447 | "dataloader = torch.utils.data.DataLoader(\n",
448 | "    dataset,\n",
449 | "    batch_size=32,\n",
450 | "    shuffle=True)\n",
451 | "\n",
452 | "for x, label in dataloader:\n",
453 | "    print('x shape:', x.shape) # shape: torch.Size([32, 1, 28, 28])\n",
454 | "    print('label shape:', label.shape) # shape: torch.Size([32])\n",
455 | "    break"
456 | ]
457 | }
458 | ],
459 | "metadata": {
460 | "kernelspec": {
461 | "display_name": "Python 3",
462 | "language": "python",
463 | "name": "python3"
464 | },
465 | "language_info": {
466 | "codemirror_mode": {
467 | "name": "ipython",
468 | "version": 3
469 | },
470 | "file_extension": ".py",
471 | "mimetype": "text/x-python",
472 | "name": "python",
473 | "nbconvert_exporter": "python",
474 | "pygments_lexer": "ipython3",
475 | "version": "3.11.0rc2"
476 | }
477 | },
478 | "nbformat": 4,
479 | "nbformat_minor": 2
480 | }
481 |
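The `gradient.py` cell above relies on PyTorch autograd for the Rosenbrock gradient. As a cross-check, the following minimal sketch does the same descent in plain Python with the partial derivatives worked out by hand. The helper names `rosenbrock_grad` and `gradient_descent` are illustrative and not part of the notebook, and the sketch starts every step from a fresh gradient, so its trajectory is only expected to approximate the notebook's printed values.

```python
def rosenbrock(x0, x1):
    return 100 * (x1 - x0 ** 2) ** 2 + (x0 - 1) ** 2


def rosenbrock_grad(x0, x1):
    # Hand-derived partial derivatives (an assumption of this sketch,
    # not code taken from the notebook).
    g0 = -400 * x0 * (x1 - x0 ** 2) + 2 * (x0 - 1)
    g1 = 200 * (x1 - x0 ** 2)
    return g0, g1


def gradient_descent(x0, x1, lr=0.001, iters=10000):
    # Same learning rate and iteration count as the notebook cell.
    for _ in range(iters):
        g0, g1 = rosenbrock_grad(x0, x1)
        x0 -= lr * g0
        x1 -= lr * g1
    return x0, x1


# The gradient at the notebook's starting point (0, 2).
print(rosenbrock_grad(0.0, 2.0))
# The descent should move toward the minimum at (1, 1).
print(gradient_descent(0.0, 2.0))
```

The gradient at `(0, 2)` agrees with the `tensor(-2.) tensor(400.)` line in the notebook output, which is a quick way to confirm that autograd and the hand derivation describe the same function.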
--------------------------------------------------------------------------------
/notebooks/flower.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WegraLee/deep-learning-from-scratch-5/86be5ee971cf3cfe3795262f1ee07ce2322e9c9a/notebooks/flower.png
--------------------------------------------------------------------------------
/notebooks/old_faithful.txt:
--------------------------------------------------------------------------------
1 | 3.6 79
2 | 1.8 54
3 | 3.333 74
4 | 2.283 62
5 | 4.533 85
6 | 2.883 55
7 | 4.7 88
8 | 3.6 85
9 | 1.95 51
10 | 4.35 85
11 | 1.833 54
12 | 3.917 84
13 | 4.2 78
14 | 1.75 47
15 | 4.7 83
16 | 2.167 52
17 | 1.75 62
18 | 4.8 84
19 | 1.6 52
20 | 4.25 79
21 | 1.8 51
22 | 1.75 47
23 | 3.45 78
24 | 3.067 69
25 | 4.533 74
26 | 3.6 83
27 | 1.967 55
28 | 4.083 76
29 | 3.85 78
30 | 4.433 79
31 | 4.3 73
32 | 4.467 77
33 | 3.367 66
34 | 4.033 80
35 | 3.833 74
36 | 2.017 52
37 | 1.867 48
38 | 4.833 80
39 | 1.833 59
40 | 4.783 90
41 | 4.35 80
42 | 1.883 58
43 | 4.567 84
44 | 1.75 58
45 | 4.533 73
46 | 3.317 83
47 | 3.833 64
48 | 2.1 53
49 | 4.633 82
50 | 2 59
51 | 4.8 75
52 | 4.716 90
53 | 1.833 54
54 | 4.833 80
55 | 1.733 54
56 | 4.883 83
57 | 3.717 71
58 | 1.667 64
59 | 4.567 77
60 | 4.317 81
61 | 2.233 59
62 | 4.5 84
63 | 1.75 48
64 | 4.8 82
65 | 1.817 60
66 | 4.4 92
67 | 4.167 78
68 | 4.7 78
69 | 2.067 65
70 | 4.7 73
71 | 4.033 82
72 | 1.967 56
73 | 4.5 79
74 | 4 71
75 | 1.983 62
76 | 5.067 76
77 | 2.017 60
78 | 4.567 78
79 | 3.883 76
80 | 3.6 83
81 | 4.133 75
82 | 4.333 82
83 | 4.1 70
84 | 2.633 65
85 | 4.067 73
86 | 4.933 88
87 | 3.95 76
88 | 4.517 80
89 | 2.167 48
90 | 4 86
91 | 2.2 60
92 | 4.333 90
93 | 1.867 50
94 | 4.817 78
95 | 1.833 63
96 | 4.3 72
97 | 4.667 84
98 | 3.75 75
99 | 1.867 51
100 | 4.9 82
101 | 2.483 62
102 | 4.367 88
103 | 2.1 49
104 | 4.5 83
105 | 4.05 81
106 | 1.867 47
107 | 4.7 84
108 | 1.783 52
109 | 4.85 86
110 | 3.683 81
111 | 4.733 75
112 | 2.3 59
113 | 4.9 89
114 | 4.417 79
115 | 1.7 59
116 | 4.633 81
117 | 2.317 50
118 | 4.6 85
119 | 1.817 59
120 | 4.417 87
121 | 2.617 53
122 | 4.067 69
123 | 4.25 77
124 | 1.967 56
125 | 4.6 88
126 | 3.767 81
127 | 1.917 45
128 | 4.5 82
129 | 2.267 55
130 | 4.65 90
131 | 1.867 45
132 | 4.167 83
133 | 2.8 56
134 | 4.333 89
135 | 1.833 46
136 | 4.383 82
137 | 1.883 51
138 | 4.933 86
139 | 2.033 53
140 | 3.733 79
141 | 4.233 81
142 | 2.233 60
143 | 4.533 82
144 | 4.817 77
145 | 4.333 76
146 | 1.983 59
147 | 4.633 80
148 | 2.017 49
149 | 5.1 96
150 | 1.8 53
151 | 5.033 77
152 | 4 77
153 | 2.4 65
154 | 4.6 81
155 | 3.567 71
156 | 4 70
157 | 4.5 81
158 | 4.083 93
159 | 1.8 53
160 | 3.967 89
161 | 2.2 45
162 | 4.15 86
163 | 2 58
164 | 3.833 78
165 | 3.5 66
166 | 4.583 76
167 | 2.367 63
168 | 5 88
169 | 1.933 52
170 | 4.617 93
171 | 1.917 49
172 | 2.083 57
173 | 4.583 77
174 | 3.333 68
175 | 4.167 81
176 | 4.333 81
177 | 4.5 73
178 | 2.417 50
179 | 4 85
180 | 4.167 74
181 | 1.883 55
182 | 4.583 77
183 | 4.25 83
184 | 3.767 83
185 | 2.033 51
186 | 4.433 78
187 | 4.083 84
188 | 1.833 46
189 | 4.417 83
190 | 2.183 55
191 | 4.8 81
192 | 1.833 57
193 | 4.8 76
194 | 4.1 84
195 | 3.966 77
196 | 4.233 81
197 | 3.5 87
198 | 4.366 77
199 | 2.25 51
200 | 4.667 78
201 | 2.1 60
202 | 4.35 82
203 | 4.133 91
204 | 1.867 53
205 | 4.6 78
206 | 1.783 46
207 | 4.367 77
208 | 3.85 84
209 | 1.933 49
210 | 4.5 83
211 | 2.383 71
212 | 4.7 80
213 | 1.867 49
214 | 3.833 75
215 | 3.417 64
216 | 4.233 76
217 | 2.4 53
218 | 4.8 94
219 | 2 55
220 | 4.15 76
221 | 1.867 50
222 | 4.267 82
223 | 1.75 54
224 | 4.483 75
225 | 4 78
226 | 4.117 79
227 | 4.083 78
228 | 4.267 78
229 | 3.917 70
230 | 4.55 79
231 | 4.083 70
232 | 2.417 54
233 | 4.183 86
234 | 2.217 50
235 | 4.45 90
236 | 1.883 54
237 | 1.85 54
238 | 4.283 77
239 | 3.95 79
240 | 2.333 64
241 | 4.15 75
242 | 2.35 47
243 | 4.933 86
244 | 2.9 63
245 | 4.583 85
246 | 3.833 82
247 | 2.083 57
248 | 4.367 82
249 | 2.133 67
250 | 4.35 74
251 | 2.2 54
252 | 4.45 83
253 | 3.567 73
254 | 4.5 73
255 | 4.15 88
256 | 3.817 80
257 | 3.917 71
258 | 4.45 83
259 | 2 56
260 | 4.283 79
261 | 4.767 78
262 | 4.533 84
263 | 1.85 58
264 | 4.25 83
265 | 1.983 43
266 | 2.25 60
267 | 4.75 75
268 | 4.117 81
269 | 2.15 46
270 | 4.417 90
271 | 1.817 46
272 | 4.467 74
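For orientation, here is a small sketch of how a file in this two-column layout might be read without NumPy. It assumes the columns are eruption duration and waiting time in minutes, as in the standard Old Faithful dataset. The `SAMPLE` text copies the first five rows above, and the 3.0-minute split is an arbitrary threshold chosen only to show the two groups that later motivate a mixture model.

```python
SAMPLE = """\
3.6 79
1.8 54
3.333 74
2.283 62
4.533 85
"""


def parse_old_faithful(text):
    # Each non-empty line holds two whitespace-separated numbers.
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        eruption, waiting = line.split()
        rows.append((float(eruption), float(waiting)))
    return rows


rows = parse_old_faithful(SAMPLE)

# Hypothetical split: short vs. long eruptions at 3.0 minutes.
short_waits = [w for e, w in rows if e < 3.0]
long_waits = [w for e, w in rows if e >= 3.0]

print(len(rows))
print(sum(short_waits) / len(short_waits))
print(sum(long_waits) / len(long_waits))
```

Even on five rows, short eruptions pair with shorter waits, which is the bimodal structure a single normal distribution cannot capture.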
273 |
--------------------------------------------------------------------------------
/posters/1부. 시야를 찾아서.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WegraLee/deep-learning-from-scratch-5/86be5ee971cf3cfe3795262f1ee07ce2322e9c9a/posters/1부. 시야를 찾아서.png
--------------------------------------------------------------------------------
/posters/2부. 상어공주.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WegraLee/deep-learning-from-scratch-5/86be5ee971cf3cfe3795262f1ee07ce2322e9c9a/posters/2부. 상어공주.png
--------------------------------------------------------------------------------
/posters/3부. DeZero의 창조자.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WegraLee/deep-learning-from-scratch-5/86be5ee971cf3cfe3795262f1ee07ce2322e9c9a/posters/3부. DeZero의 창조자.png
--------------------------------------------------------------------------------
/posters/4부. 제발, 가즈아!.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WegraLee/deep-learning-from-scratch-5/86be5ee971cf3cfe3795262f1ee07ce2322e9c9a/posters/4부. 제발, 가즈아!.png
--------------------------------------------------------------------------------
/posters/5부. 피쉬카소와 천재의 초상.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WegraLee/deep-learning-from-scratch-5/86be5ee971cf3cfe3795262f1ee07ce2322e9c9a/posters/5부. 피쉬카소와 천재의 초상.png
--------------------------------------------------------------------------------
/posters/README.md:
--------------------------------------------------------------------------------
1 | For storing posters
2 |
--------------------------------------------------------------------------------
/posters/바닷속 딥러닝 어드벤처.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WegraLee/deep-learning-from-scratch-5/86be5ee971cf3cfe3795262f1ee07ce2322e9c9a/posters/바닷속 딥러닝 어드벤처.png
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | numpy==1.24.2
2 | scipy==1.10.1
3 | matplotlib==3.7.1
4 | torch==2.0.0
5 | torchvision==0.15.1
6 | tqdm==4.65.0
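Because the pins above are exact versions, it can help to see how an existing environment differs before running the scripts. The sketch below uses only the standard library's `importlib.metadata`. The helper names `parse_pins` and `compare_with_installed` are hypothetical, and the `PINNED` text is copied from the file above rather than read from disk.

```python
from importlib import metadata

PINNED = """\
numpy==1.24.2
scipy==1.10.1
matplotlib==3.7.1
torch==2.0.0
torchvision==0.15.1
tqdm==4.65.0
"""


def parse_pins(text):
    # Map package name -> pinned version for simple "name==version" lines.
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name] = version
    return pins


def compare_with_installed(pins):
    # Map package name -> (pinned version, installed version or None).
    report = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        report[name] = (wanted, found)
    return report


for name, (wanted, found) in compare_with_installed(parse_pins(PINNED)).items():
    status = "ok" if found == wanted else "differs"
    print(f"{name}: pinned {wanted}, installed {found} ({status})")
```

A "differs" line is not necessarily a problem, since the README only asks for PyTorch 2.x and Python 3.x. The report is meant as a starting point when a script fails to import.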
--------------------------------------------------------------------------------
/step01/norm_dist.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import matplotlib.pyplot as plt
3 |
4 |
5 | def normal(x, mu=0, sigma=1):
6 |     y = 1 / (np.sqrt(2 * np.pi) * sigma) * np.exp(-(x - mu)**2 / (2 * sigma**2))
7 |     return y
8 |
9 | x = np.linspace(-5, 5, 100)
10 | y = normal(x)
11 |
12 | plt.plot(x, y)
13 | plt.xlabel('x')
14 | plt.ylabel('y')
15 | plt.show()
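A quick sanity check on the `normal` function in `norm_dist.py` is that the density integrates to roughly 1 over the plotted range. The sketch below re-implements the same formula with the standard `math` module and integrates it with a midpoint rule. The `integrate` helper and its step count are assumptions made for illustration.

```python
import math


def normal(x, mu=0.0, sigma=1.0):
    # Same density formula as norm_dist.py, for a scalar x.
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)


def integrate(f, a, b, n=10000):
    # Midpoint rule with n equal-width slices.
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx


total = integrate(normal, -5.0, 5.0)             # should be very close to 1
within_one_sigma = integrate(normal, -1.0, 1.0)  # should be about 0.68

print(total, within_one_sigma)
```

The second value reflects the familiar rule that about 68% of the probability mass lies within one standard deviation of the mean.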
--------------------------------------------------------------------------------
/step01/norm_param.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import matplotlib.pyplot as plt
3 |
4 | def normal(x, mu=0, sigma=1):
5 |     y = 1 / (np.sqrt(2 * np.pi) * sigma) * np.exp(-(x - mu)**2 / (2 * sigma**2))
6 |     return y
7 |
8 | x = np.linspace(-10, 10, 1000)
9 |
10 | # mu ====================
11 | y0 = normal(x, mu=-3)
12 | y1 = normal(x, mu=0)
13 | y2 = normal(x, mu=5)
14 |
15 | plt.plot(x, y0, label=r'$\mu$=-3')  # raw string avoids the invalid '\m' escape
16 | plt.plot(x, y1, label=r'$\mu$=0')
17 | plt.plot(x, y2, label=r'$\mu$=5')
18 | plt.legend()
19 | plt.xlabel('x')
20 | plt.ylabel('y')
21 | plt.show()
22 |
23 | # sigma ====================
24 | y0 = normal(x, mu=0, sigma=0.5)
25 | y1 = normal(x, mu=0, sigma=1)
26 | y2 = normal(x, mu=0, sigma=2)
27 |
28 | plt.plot(x, y0, label=r'$\sigma$=0.5')  # raw string avoids the invalid '\s' escape
29 | plt.plot(x, y1, label=r'$\sigma$=1')
30 | plt.plot(x, y2, label=r'$\sigma$=2')
31 | plt.legend()
32 | plt.xlabel('x')
33 | plt.ylabel('y')
34 | plt.show()
--------------------------------------------------------------------------------
/step01/sample_avg.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import matplotlib.pyplot as plt
3 |
4 | x_means = []
5 | N = 1 # sample size
6 |
7 | for _ in range(10000):
8 |     xs = []
9 |     for i in range(N):
10 |         x = np.random.rand()
11 |         xs.append(x)
12 |     mean = np.mean(xs)
13 |     x_means.append(mean)
14 |
15 | # plot
16 | plt.hist(x_means, bins='auto', density=True)
17 | plt.title(f'N={N}')
18 | plt.xlabel('x')
19 | plt.ylabel('Probability Density')
20 | plt.xlim(-0.05, 1.05)
21 | plt.ylim(0, 5)
22 | plt.show()
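`sample_avg.py` shows the histogram for a single `N`. To see the central limit theorem numerically, the sketch below repeats the experiment for several sample sizes and compares the variance of the sample means with the theoretical value 1/(12N) for Uniform(0, 1) samples. It uses the standard `random` and `statistics` modules instead of NumPy. The function name `sample_means` and the fixed seed are illustrative choices.

```python
import random
import statistics


def sample_means(n, trials=10000, seed=0):
    # Draw `trials` samples of size n from Uniform(0, 1) and average each one.
    rng = random.Random(seed)
    return [statistics.fmean([rng.random() for _ in range(n)]) for _ in range(trials)]


for n in (1, 5, 20):
    means = sample_means(n)
    print(
        n,
        statistics.fmean(means),      # stays near 0.5
        statistics.pvariance(means),  # shrinks roughly like 1 / (12 * n)
        1 / (12 * n),
    )
```

As `N` grows, the spread of the sample means narrows around 0.5, which is the same effect the histogram in `sample_avg.py` shows when `N` is increased by hand.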
--------------------------------------------------------------------------------
/step01/sample_sum.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import matplotlib.pyplot as plt
3 |
4 | x_sums = []  # each entry is the sum of N uniform samples
5 | N = 5
6 |
7 | for _ in range(10000):
8 |     xs = []
9 |     for i in range(N):
10 |         x = np.random.rand()
11 |         xs.append(x)
12 |     total = np.sum(xs)
13 |     x_sums.append(total)
14 |
15 | # normal distribution
16 | def normal(x, mu=0, sigma=1):
17 |     y = 1 / (np.sqrt(2 * np.pi) * sigma) * np.exp(-(x - mu)**2 / (2 * sigma**2))
18 |     return y
19 | x_norm = np.linspace(-5, 5, 1000)
20 | mu = 0.5 * N  # mean of a sum of N Uniform(0, 1) samples
21 | sigma = np.sqrt(1 / 12 * N)  # the variance of that sum is N / 12
22 | y_norm = normal(x_norm, mu, sigma)
23 |
24 | # plot
25 | plt.hist(x_sums, bins='auto', density=True)
26 | plt.plot(x_norm, y_norm)
27 | plt.title(f'N={N}')
28 | plt.xlim(-1, 6)
29 | plt.show()
--------------------------------------------------------------------------------
/step02/fit.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import matplotlib.pyplot as plt
4 |
5 | path = os.path.join(os.path.dirname(__file__), 'height.txt')
6 | xs = np.loadtxt(path)
7 |
8 | mu = np.mean(xs)
9 | sigma = np.std(xs)
10 |
11 | # normal distribution
12 | def normal(x, mu=0, sigma=1):
13 |     y = 1 / (np.sqrt(2 * np.pi) * sigma) * np.exp(-(x - mu)**2 / (2 * sigma**2))
14 |     return y
15 | x = np.linspace(150, 190, 1000)
16 | y = normal(x, mu, sigma)
17 |
18 | # plot
19 | plt.hist(xs, bins='auto', density=True)
20 | plt.plot(x, y)
21 | plt.xlabel('Height(cm)')
22 | plt.ylabel('Probability Density')
23 | plt.show()
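`fit.py` estimates the parameters with `np.mean` and `np.std`. With its default `ddof=0`, `np.std` divides by N, which is the maximum likelihood estimate for a normal distribution. The sketch below spells out that closed form in plain Python and checks that nudging either parameter lowers the log-likelihood. Since `height.txt` is not shown here, the data are synthetic, and the values 172.0 and 5.0 are hypothetical stand-ins.

```python
import math
import random


def fit_normal_mle(xs):
    # Closed-form MLE: sample mean and the N-denominator standard deviation.
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n  # divide by N, not N - 1
    return mu, math.sqrt(var)


def log_likelihood(xs, mu, sigma):
    # Sum of log N(x | mu, sigma^2) over the data.
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in xs
    )


rng = random.Random(0)
xs = [rng.gauss(172.0, 5.0) for _ in range(1000)]  # synthetic stand-in for height.txt

mu, sigma = fit_normal_mle(xs)
best = log_likelihood(xs, mu, sigma)

print(mu, sigma)
print(best >= log_likelihood(xs, mu + 0.1, sigma))
print(best >= log_likelihood(xs, mu, sigma * 1.05))
```

Both comparisons should print `True`, since the closed-form estimates maximize the likelihood for this model.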
--------------------------------------------------------------------------------
/step02/generate.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import matplotlib.pyplot as plt
4 |
5 |
6 | path = os.path.join(os.path.dirname(__file__), 'height.txt')
7 | xs = np.loadtxt(path)
8 | mu = np.mean(xs)
9 | sigma = np.std(xs)
10 |
11 | samples = np.random.normal(mu, sigma, 10000)
12 |
13 | plt.hist(xs, bins='auto', density=True, alpha=0.7, label='original')
14 | plt.hist(samples, bins='auto', density=True, alpha=0.7, label='generated')
15 | plt.xlabel('Height(cm)')
16 | plt.ylabel('Probability Density')
17 | plt.legend()
18 | plt.show()
--------------------------------------------------------------------------------
/step02/hist.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import matplotlib.pyplot as plt
4 |
5 | path = os.path.join(os.path.dirname(__file__), 'height.txt')
6 | xs = np.loadtxt(path)
7 | print(xs.shape)
8 |
9 | plt.hist(xs, bins='auto', density=True)
10 | plt.xlabel('Height(cm)')
11 | plt.ylabel('Probability Density')
12 | plt.show()
--------------------------------------------------------------------------------
/step02/prob.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | from scipy.stats import norm
4 |
5 |
6 | path = os.path.join(os.path.dirname(__file__), 'height.txt')
7 | xs = np.loadtxt(path)
8 | mu = np.mean(xs)
9 | sigma = np.std(xs)
10 |
11 | p1 = norm.cdf(160, mu, sigma)
12 | print('p(x <= 160):', p1)
13 |
14 | p2 = norm.cdf(180, mu, sigma)
15 | print('p(x > 180):', 1-p2)
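
For reference, the value that `norm.cdf` returns can be written with the error function from the standard library. The sketch below is a hand-rolled stand-in, not SciPy's implementation, and the parameters `mu = 172.0`, `sigma = 5.0` are hypothetical values chosen for illustration rather than estimates from `height.txt`:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), expressed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Hypothetical parameters for illustration only (not fitted from height.txt).
mu, sigma = 172.0, 5.0

p_low = normal_cdf(160, mu, sigma)                               # P(x <= 160)
p_high = 1.0 - normal_cdf(180, mu, sigma)                        # P(x > 180)
p_mid = normal_cdf(180, mu, sigma) - normal_cdf(160, mu, sigma)  # P(160 < x <= 180)
print(p_low, p_mid, p_high)
```

The three probabilities partition the real line, so they sum to 1; a difference of two CDF values gives the probability of an interval.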
--------------------------------------------------------------------------------
/step03/mle.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import matplotlib.pyplot as plt
4 |
5 | path = os.path.join(os.path.dirname(__file__), 'height_weight.txt')
6 | xs = np.loadtxt(path)
7 |
8 | # Maximum Likelihood Estimation (MLE)
9 | mu = np.mean(xs, axis=0)
10 | cov = np.cov(xs, rowvar=False, bias=True)  # bias=True divides by N, as the MLE does
11 |
12 | def multivariate_normal(x, mu, cov):
13 |     det = np.linalg.det(cov)
14 |     inv = np.linalg.inv(cov)
15 |     d = len(x)
16 |     z = 1 / np.sqrt((2 * np.pi) ** d * det)
17 |     y = z * np.exp((x - mu).T @ inv @ (x - mu) / -2.0)
18 |     return y
19 |
20 | small_xs = xs[:500]
21 | X, Y = np.meshgrid(np.arange(150, 195, 0.5),
22 |                    np.arange(45, 75, 0.5))
23 | Z = np.zeros_like(X)
24 |
25 | for i in range(X.shape[0]):
26 |     for j in range(X.shape[1]):
27 |         x = np.array([X[i, j], Y[i, j]])
28 |         Z[i, j] = multivariate_normal(x, mu, cov)
29 |
30 | fig = plt.figure()
31 | ax1 = fig.add_subplot(1, 2, 1, projection='3d')
32 | ax1.set_xlabel('x')
33 | ax1.set_ylabel('y')
34 | ax1.set_zlabel('z')
35 | ax1.plot_surface(X, Y, Z, cmap='viridis')
36 |
37 | ax2 = fig.add_subplot(1, 2, 2)
38 | ax2.scatter(small_xs[:,0], small_xs[:,1])
39 | ax2.set_xlabel('x')
40 | ax2.set_ylabel('y')
41 | ax2.set_xlim(156, 189)
42 | ax2.set_ylim(36, 79)
43 | ax2.contour(X, Y, Z)
44 | plt.show()
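
The maximum likelihood estimate of a Gaussian covariance divides by N, whereas `np.cov` divides by N - 1 unless `bias=True` is passed. A small sketch that spells the estimator out; the helper name `mle_gaussian` and the synthetic data are assumptions made for illustration:

```python
import numpy as np

def mle_gaussian(xs):
    """Maximum likelihood mean and covariance for the rows of xs, shape (N, D)."""
    mu = xs.mean(axis=0)
    diff = xs - mu
    cov = diff.T @ diff / len(xs)  # divide by N, not N - 1
    return mu, cov

# Synthetic stand-in for height_weight.txt (hypothetical values).
rng = np.random.default_rng(0)
data = rng.normal(loc=[170.0, 60.0], scale=[6.0, 5.0], size=(1000, 2))

mu_hat, cov_hat = mle_gaussian(data)
print(mu_hat)
print(cov_hat)
print(np.cov(data, rowvar=False, bias=True))  # same matrix as cov_hat
```

For large N the two normalizations differ only by the factor N / (N - 1), so plots are unlikely to change visibly; the distinction matters mainly for small datasets.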
--------------------------------------------------------------------------------
/step03/numpy_basis.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 |
3 | # array
4 | x = np.array([1, 2, 3])
5 | print(x.__class__)
6 | print(x.shape)
7 | print(x.ndim)
8 | W = np.array([[1, 2, 3],
9 | [4, 5, 6]])
10 | print(W.ndim)
11 | print(W.shape)
12 |
13 | # element-wise operation
14 | W = np.array([[1, 2, 3], [4, 5, 6]])
15 | X = np.array([[0, 1, 2], [3, 4, 5]])
16 | print(W + X)
17 | print('---')
18 | print(W * X)
19 |
20 | # inner product
21 | a = np.array([1, 2, 3])
22 | b = np.array([4, 5, 6])
23 | y = np.dot(a, b) # a @ b
24 | print(y)
25 |
26 | # matrix multiplication
27 | A = np.array([[1, 2], [3, 4]])
28 | B = np.array([[5, 6], [7, 8]])
29 | Y = np.dot(A, B) # A @ B
30 | print(Y)
--------------------------------------------------------------------------------
/step03/numpy_matrix.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 |
3 | # transpose
4 | A = np.array([[1, 2, 3], [4, 5, 6]])
5 | print(A)
6 | print('---')
7 | print(A.T)
8 |
9 | # determinant
10 | A = np.array([[3, 4], [5, 6]])
11 | d = np.linalg.det(A)
12 | print(d)
13 |
14 | # inverse matrix
15 | A = np.array([[3, 4], [5, 6]])
16 | B = np.linalg.inv(A)
17 | print(B)
18 | print('---')
19 | print(A @ B)
20 |
21 | # multivariate normal distribution
22 | def multivariate_normal(x, mu, cov):
23 |     det = np.linalg.det(cov)
24 |     inv = np.linalg.inv(cov)
25 |     D = len(x)
26 |     z = 1 / np.sqrt((2 * np.pi) ** D * det)
27 |     y = z * np.exp((x - mu).T @ inv @ (x - mu) / -2.0)
28 |     return y
29 |
30 | x = np.array([0, 0])
31 | mu = np.array([1, 2])
32 | cov = np.array([[1, 0],
33 | [0, 1]])
34 |
35 | y = multivariate_normal(x, mu, cov)
36 | print(y)
37 |
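
`multivariate_normal` evaluates one point per call, which is why the plotting scripts loop over every grid cell. A hedged sketch of a batched variant that evaluates many points at once; the name `multivariate_normal_batch` is hypothetical, and `np.linalg.solve` is swapped in for the explicit inverse:

```python
import numpy as np

def multivariate_normal_batch(xs, mu, cov):
    """Density of N(mu, cov) at each row of xs, shape (M, D)."""
    xs = np.atleast_2d(xs)
    d = xs.shape[1]
    diff = xs - mu                               # (M, D)
    solved = np.linalg.solve(cov, diff.T).T      # rows hold cov^{-1} (x - mu)
    maha = np.sum(diff * solved, axis=1)         # squared Mahalanobis distances
    norm_const = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm_const

mu = np.array([1.0, 2.0])
cov = np.eye(2)
points = np.array([[0.0, 0.0], [1.0, 2.0]])
print(multivariate_normal_batch(points, mu, cov))
```

To fill a contour grid, the mesh can be flattened with `np.column_stack([X.ravel(), Y.ravel()])`, passed through the function, and reshaped back to `X.shape`.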
--------------------------------------------------------------------------------
/step03/plot_3d.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import matplotlib.pyplot as plt
3 |
4 | X = np.array([[-2, -1, 0, 1, 2],
5 | [-2, -1, 0, 1, 2],
6 | [-2, -1, 0, 1, 2],
7 | [-2, -1, 0, 1, 2],
8 | [-2, -1, 0, 1, 2]])
9 | Y = np.array([[-2, -2, -2, -2, -2],
10 | [-1, -1, -1, -1, -1],
11 | [0, 0, 0, 0, 0],
12 | [1, 1, 1, 1, 1],
13 | [2, 2, 2, 2, 2]])
14 | Z = X ** 2 + Y ** 2
15 |
16 | ax = plt.axes(projection='3d')
17 | ax.plot_surface(X, Y, Z, cmap='viridis')
18 | ax.set_xlabel('x')
19 | ax.set_ylabel('y')
20 | ax.set_zlabel('z')
21 | plt.show()
22 |
23 | # ===== better resolution =====
24 | xs = np.arange(-2, 2, 0.1)
25 | ys = np.arange(-2, 2, 0.1)
26 |
27 | X, Y = np.meshgrid(xs, ys)
28 | Z = X ** 2 + Y ** 2
29 |
30 | ax = plt.axes(projection='3d')
31 | ax.plot_surface(X, Y, Z, cmap='viridis')
32 | ax.set_xlabel('x')
33 | ax.set_ylabel('y')
34 | ax.set_zlabel('z')
35 | plt.show()
36 |
37 | # ===== plot contour =====
38 | x = np.arange(-2, 2, 0.1)
39 | y = np.arange(-2, 2, 0.1)
40 |
41 | X, Y = np.meshgrid(x, y)
42 | Z = X ** 2 + Y ** 2
43 |
44 | ax = plt.axes()
45 | ax.contour(X, Y, Z)
46 | ax.set_xlabel('x')
47 | ax.set_ylabel('y')
48 | plt.show()
49 |
--------------------------------------------------------------------------------
/step03/plot_dataset.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import matplotlib.pyplot as plt
4 |
5 | path = os.path.join(os.path.dirname(__file__), 'height_weight.txt')
6 | xs = np.loadtxt(path)
7 |
8 | print(xs.shape)
9 |
10 | small_xs = xs[:500]
11 | plt.scatter(small_xs[:, 0], small_xs[:, 1])
12 | plt.xlabel('Height(cm)')
13 | plt.ylabel('Weight(kg)')
14 | plt.show()
--------------------------------------------------------------------------------
/step03/plot_norm.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import matplotlib.pyplot as plt
3 |
4 | def multivariate_normal(x, mu, cov):
5 |     det = np.linalg.det(cov)
6 |     inv = np.linalg.inv(cov)
7 |     D = len(x)
8 |     z = 1 / np.sqrt((2 * np.pi) ** D * det)
9 |     y = z * np.exp((x - mu).T @ inv @ (x - mu) / -2.0)
10 |     return y
11 |
12 | mu = np.array([0.5, -0.2])
13 | cov = np.array([[2.0, 0.3],
14 | [0.3, 0.5]])
15 |
16 | xs = ys = np.arange(-5, 5, 0.1)
17 | X, Y = np.meshgrid(xs, ys)
18 | Z = np.zeros_like(X)
19 |
20 | for i in range(X.shape[0]):
21 |     for j in range(X.shape[1]):
22 |         x = np.array([X[i, j], Y[i, j]])
23 |         Z[i, j] = multivariate_normal(x, mu, cov)
24 |
25 | fig = plt.figure()
26 | ax1 = fig.add_subplot(1, 2, 1, projection='3d')
27 | ax1.set_xlabel('x')
28 | ax1.set_ylabel('y')
29 | ax1.set_zlabel('z')
30 | ax1.plot_surface(X, Y, Z, cmap='viridis')
31 |
32 | ax2 = fig.add_subplot(1, 2, 2)
33 | ax2.set_xlabel('x')
34 | ax2.set_ylabel('y')
35 | ax2.contour(X, Y, Z)
36 | plt.show()
37 |
--------------------------------------------------------------------------------
/step04/gmm.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import matplotlib.pyplot as plt
3 |
4 |
5 | mus = np.array([[2.0, 54.50],
6 | [4.3, 80.0]])
7 | covs = np.array([[[0.07, 0.44],
8 | [0.44, 33.7]],
9 | [[0.17, 0.94],
10 | [0.94, 36.00 ]]])
11 | phis = np.array([0.35, 0.65])
12 |
13 |
14 | def multivariate_normal(x, mu, cov):
15 |     det = np.linalg.det(cov)
16 |     inv = np.linalg.inv(cov)
17 |     d = len(x)
18 |     z = 1 / np.sqrt((2 * np.pi) ** d * det)
19 |     y = z * np.exp((x - mu).T @ inv @ (x - mu) / -2.0)
20 |     return y
21 |
22 | def gmm(x, phis, mus, covs):
23 |     K = len(phis)
24 |     y = 0
25 |     for k in range(K):
26 |         phi, mu, cov = phis[k], mus[k], covs[k]
27 |         y += phi * multivariate_normal(x, mu, cov)
28 |     return y
29 |
30 |
31 | # plot
32 | xs = np.arange(1, 6, 0.1)
33 | ys = np.arange(40, 100, 0.1)
34 | X, Y = np.meshgrid(xs, ys)
35 | Z = np.zeros_like(X)
36 |
37 | for i in range(X.shape[0]):
38 |     for j in range(X.shape[1]):
39 |         x = np.array([X[i, j], Y[i, j]])
40 |         Z[i, j] = gmm(x, phis, mus, covs)
41 |
42 | fig = plt.figure()
43 | ax1 = fig.add_subplot(1, 2, 1, projection='3d')
44 | ax1.set_xlabel('x')
45 | ax1.set_ylabel('y')
46 | ax1.set_zlabel('z')
47 | ax1.plot_surface(X, Y, Z, cmap='viridis')
48 |
49 | ax2 = fig.add_subplot(1, 2, 2)
50 | ax2.set_xlabel('x')
51 | ax2.set_ylabel('y')
52 | ax2.contour(X, Y, Z)
53 | plt.show()
--------------------------------------------------------------------------------
/step04/gmm_sampling.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import matplotlib.pyplot as plt
3 |
4 |
5 | mus = np.array([[2.0, 54.50],
6 | [4.3, 80.0]])
7 | covs = np.array([[[0.07, 0.44],
8 | [0.44, 33.7]],
9 | [[0.17, 0.94],
10 | [0.94, 36.00 ]]])
11 | phis = np.array([0.35, 0.65])
12 |
13 |
14 | def sample():
15 |     k = np.random.choice(2, p=phis)
16 |     mu, cov = mus[k], covs[k]
17 |     x = np.random.multivariate_normal(mu, cov)
18 |     return x
19 |
20 | N = 500
21 | xs = np.zeros((N, 2))
22 | for i in range(N):
23 |     xs[i] = sample()
24 |
25 | plt.scatter(xs[:,0], xs[:,1], color='orange', alpha=0.7)
26 | plt.xlabel('x')
27 | plt.ylabel('y')
28 | plt.show()
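
Sampling from a mixture is a two-stage draw: first pick a component index according to `phis`, then draw from that component's Gaussian. A sketch that performs both stages in bulk instead of once per sample; the function name `sample_gmm` and the seed are assumptions, and the parameters are copied from the script above:

```python
import numpy as np

mus = np.array([[2.0, 54.50],
                [4.3, 80.0]])
covs = np.array([[[0.07, 0.44],
                  [0.44, 33.7]],
                 [[0.17, 0.94],
                  [0.94, 36.00]]])
phis = np.array([0.35, 0.65])

def sample_gmm(n, phis, mus, covs, seed=0):
    """Draw n points from the mixture; also return the component index of each."""
    rng = np.random.default_rng(seed)
    ks = rng.choice(len(phis), size=n, p=phis)      # stage 1: component per sample
    xs = np.empty((n, mus.shape[1]))
    for k in range(len(phis)):
        mask = ks == k
        # stage 2: draw all samples assigned to component k in one call
        xs[mask] = rng.multivariate_normal(mus[k], covs[k], size=int(mask.sum()))
    return xs, ks

xs, ks = sample_gmm(500, phis, mus, covs)
print(xs.shape, np.bincount(ks, minlength=2) / len(ks))
```

Returning `ks` alongside the points makes it easy to color the scatter plot by component, which the original script does not do.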
--------------------------------------------------------------------------------
/step04/old_faithful.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import matplotlib.pyplot as plt
4 |
5 | path = os.path.join(os.path.dirname(__file__), 'old_faithful.txt')
6 | xs = np.loadtxt(path)
7 |
8 | print(xs.shape)
9 | print(xs[0])
10 |
11 | plt.scatter(xs[:,0], xs[:,1])
12 | plt.xlabel('Eruptions(Min)')
13 | plt.ylabel('Waiting(Min)')
14 | plt.show()
15 |
--------------------------------------------------------------------------------
/step04/old_faithful.txt:
--------------------------------------------------------------------------------
1 | 3.6 79
2 | 1.8 54
3 | 3.333 74
4 | 2.283 62
5 | 4.533 85
6 | 2.883 55
7 | 4.7 88
8 | 3.6 85
9 | 1.95 51
10 | 4.35 85
11 | 1.833 54
12 | 3.917 84
13 | 4.2 78
14 | 1.75 47
15 | 4.7 83
16 | 2.167 52
17 | 1.75 62
18 | 4.8 84
19 | 1.6 52
20 | 4.25 79
21 | 1.8 51
22 | 1.75 47
23 | 3.45 78
24 | 3.067 69
25 | 4.533 74
26 | 3.6 83
27 | 1.967 55
28 | 4.083 76
29 | 3.85 78
30 | 4.433 79
31 | 4.3 73
32 | 4.467 77
33 | 3.367 66
34 | 4.033 80
35 | 3.833 74
36 | 2.017 52
37 | 1.867 48
38 | 4.833 80
39 | 1.833 59
40 | 4.783 90
41 | 4.35 80
42 | 1.883 58
43 | 4.567 84
44 | 1.75 58
45 | 4.533 73
46 | 3.317 83
47 | 3.833 64
48 | 2.1 53
49 | 4.633 82
50 | 2 59
51 | 4.8 75
52 | 4.716 90
53 | 1.833 54
54 | 4.833 80
55 | 1.733 54
56 | 4.883 83
57 | 3.717 71
58 | 1.667 64
59 | 4.567 77
60 | 4.317 81
61 | 2.233 59
62 | 4.5 84
63 | 1.75 48
64 | 4.8 82
65 | 1.817 60
66 | 4.4 92
67 | 4.167 78
68 | 4.7 78
69 | 2.067 65
70 | 4.7 73
71 | 4.033 82
72 | 1.967 56
73 | 4.5 79
74 | 4 71
75 | 1.983 62
76 | 5.067 76
77 | 2.017 60
78 | 4.567 78
79 | 3.883 76
80 | 3.6 83
81 | 4.133 75
82 | 4.333 82
83 | 4.1 70
84 | 2.633 65
85 | 4.067 73
86 | 4.933 88
87 | 3.95 76
88 | 4.517 80
89 | 2.167 48
90 | 4 86
91 | 2.2 60
92 | 4.333 90
93 | 1.867 50
94 | 4.817 78
95 | 1.833 63
96 | 4.3 72
97 | 4.667 84
98 | 3.75 75
99 | 1.867 51
100 | 4.9 82
101 | 2.483 62
102 | 4.367 88
103 | 2.1 49
104 | 4.5 83
105 | 4.05 81
106 | 1.867 47
107 | 4.7 84
108 | 1.783 52
109 | 4.85 86
110 | 3.683 81
111 | 4.733 75
112 | 2.3 59
113 | 4.9 89
114 | 4.417 79
115 | 1.7 59
116 | 4.633 81
117 | 2.317 50
118 | 4.6 85
119 | 1.817 59
120 | 4.417 87
121 | 2.617 53
122 | 4.067 69
123 | 4.25 77
124 | 1.967 56
125 | 4.6 88
126 | 3.767 81
127 | 1.917 45
128 | 4.5 82
129 | 2.267 55
130 | 4.65 90
131 | 1.867 45
132 | 4.167 83
133 | 2.8 56
134 | 4.333 89
135 | 1.833 46
136 | 4.383 82
137 | 1.883 51
138 | 4.933 86
139 | 2.033 53
140 | 3.733 79
141 | 4.233 81
142 | 2.233 60
143 | 4.533 82
144 | 4.817 77
145 | 4.333 76
146 | 1.983 59
147 | 4.633 80
148 | 2.017 49
149 | 5.1 96
150 | 1.8 53
151 | 5.033 77
152 | 4 77
153 | 2.4 65
154 | 4.6 81
155 | 3.567 71
156 | 4 70
157 | 4.5 81
158 | 4.083 93
159 | 1.8 53
160 | 3.967 89
161 | 2.2 45
162 | 4.15 86
163 | 2 58
164 | 3.833 78
165 | 3.5 66
166 | 4.583 76
167 | 2.367 63
168 | 5 88
169 | 1.933 52
170 | 4.617 93
171 | 1.917 49
172 | 2.083 57
173 | 4.583 77
174 | 3.333 68
175 | 4.167 81
176 | 4.333 81
177 | 4.5 73
178 | 2.417 50
179 | 4 85
180 | 4.167 74
181 | 1.883 55
182 | 4.583 77
183 | 4.25 83
184 | 3.767 83
185 | 2.033 51
186 | 4.433 78
187 | 4.083 84
188 | 1.833 46
189 | 4.417 83
190 | 2.183 55
191 | 4.8 81
192 | 1.833 57
193 | 4.8 76
194 | 4.1 84
195 | 3.966 77
196 | 4.233 81
197 | 3.5 87
198 | 4.366 77
199 | 2.25 51
200 | 4.667 78
201 | 2.1 60
202 | 4.35 82
203 | 4.133 91
204 | 1.867 53
205 | 4.6 78
206 | 1.783 46
207 | 4.367 77
208 | 3.85 84
209 | 1.933 49
210 | 4.5 83
211 | 2.383 71
212 | 4.7 80
213 | 1.867 49
214 | 3.833 75
215 | 3.417 64
216 | 4.233 76
217 | 2.4 53
218 | 4.8 94
219 | 2 55
220 | 4.15 76
221 | 1.867 50
222 | 4.267 82
223 | 1.75 54
224 | 4.483 75
225 | 4 78
226 | 4.117 79
227 | 4.083 78
228 | 4.267 78
229 | 3.917 70
230 | 4.55 79
231 | 4.083 70
232 | 2.417 54
233 | 4.183 86
234 | 2.217 50
235 | 4.45 90
236 | 1.883 54
237 | 1.85 54
238 | 4.283 77
239 | 3.95 79
240 | 2.333 64
241 | 4.15 75
242 | 2.35 47
243 | 4.933 86
244 | 2.9 63
245 | 4.583 85
246 | 3.833 82
247 | 2.083 57
248 | 4.367 82
249 | 2.133 67
250 | 4.35 74
251 | 2.2 54
252 | 4.45 83
253 | 3.567 73
254 | 4.5 73
255 | 4.15 88
256 | 3.817 80
257 | 3.917 71
258 | 4.45 83
259 | 2 56
260 | 4.283 79
261 | 4.767 78
262 | 4.533 84
263 | 1.85 58
264 | 4.25 83
265 | 1.983 43
266 | 2.25 60
267 | 4.75 75
268 | 4.117 81
269 | 2.15 46
270 | 4.417 90
271 | 1.817 46
272 | 4.467 74
273 |
--------------------------------------------------------------------------------
/step05/em.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import matplotlib.pyplot as plt
4 |
5 | path = os.path.join(os.path.dirname(__file__), 'old_faithful.txt')
6 | xs = np.loadtxt(path)
7 | print(xs.shape) # (272, 2)
8 |
9 | # initialize parameters
10 | phis = np.array([0.5, 0.5])
11 | mus = np.array([[0.0, 50.0], [0.0, 100.0]])
12 | covs = np.array([np.eye(2), np.eye(2)])
13 |
14 | K = len(phis) # 2
15 | N = len(xs) # 272
16 | MAX_ITERS = 100
17 | THRESHOLD = 1e-4
18 |
19 | def multivariate_normal(x, mu, cov):
20 |     det = np.linalg.det(cov)
21 |     inv = np.linalg.inv(cov)
22 |     d = len(x)
23 |     z = 1 / np.sqrt((2 * np.pi) ** d * det)
24 |     y = z * np.exp((x - mu).T @ inv @ (x - mu) / -2.0)
25 |     return y
26 |
27 | def gmm(x, phis, mus, covs):
28 |     K = len(phis)
29 |     y = 0
30 |     for k in range(K):
31 |         phi, mu, cov = phis[k], mus[k], covs[k]
32 |         y += phi * multivariate_normal(x, mu, cov)
33 |     return y
34 |
35 | def likelihood(xs, phis, mus, covs):
36 |     """ mean log likelihood """
37 |     eps = 1e-8
38 |     L = 0
39 |     N = len(xs)
40 |     for x in xs:
41 |         y = gmm(x, phis, mus, covs)
42 |         L += np.log(y + eps)
43 |     return L / N
44 |
45 |
46 | current_likelihood = likelihood(xs, phis, mus, covs)
47 |
48 | for _ in range(MAX_ITERS):
49 |     # E-step ====================
50 |     qs = np.zeros((N, K))
51 |     for n in range(N):
52 |         x = xs[n]
53 |         for k in range(K):
54 |             phi, mu, cov = phis[k], mus[k], covs[k]
55 |             qs[n, k] = phi * multivariate_normal(x, mu, cov)
56 |         qs[n] /= gmm(x, phis, mus, covs)
57 |
58 |     # M-step ====================
59 |     qs_sum = qs.sum(axis=0)
60 |     for k in range(K):
61 |         # 1. phis
62 |         phis[k] = qs_sum[k] / N
63 |
64 |         # 2. mus
65 |         c = 0
66 |         for n in range(N):
67 |             c += qs[n, k] * xs[n]
68 |         mus[k] = c / qs_sum[k]
69 |
70 |         # 3. covs
71 |         c = 0
72 |         for n in range(N):
73 |             z = xs[n] - mus[k]
74 |             z = z[:, np.newaxis]  # column vector
75 |             c += qs[n, k] * z @ z.T
76 |         covs[k] = c / qs_sum[k]
77 |
78 |     # threshold check ====================
79 |     print(f'{current_likelihood:.3f}')
80 |
81 |     next_likelihood = likelihood(xs, phis, mus, covs)
82 |     diff = np.abs(next_likelihood - current_likelihood)
83 |     if diff < THRESHOLD:
84 |         break
85 |     current_likelihood = next_likelihood
86 |
87 |
88 |
89 | # visualize
90 | def plot_contour(w, mus, covs):
91 |     x = np.arange(1, 6, 0.1)
92 |     y = np.arange(40, 100, 1)
93 |     X, Y = np.meshgrid(x, y)
94 |     Z = np.zeros_like(X)
95 |
96 |     for i in range(X.shape[0]):
97 |         for j in range(X.shape[1]):
98 |             x = np.array([X[i, j], Y[i, j]])
99 |
100 |             for k in range(len(mus)):
101 |                 mu, cov = mus[k], covs[k]
102 |                 Z[i, j] += w[k] * multivariate_normal(x, mu, cov)
103 |     plt.contour(X, Y, Z)
104 |
105 | plt.scatter(xs[:,0], xs[:,1])
106 | plot_contour(phis, mus, covs)
107 | plt.xlabel('Eruptions(Min)')
108 | plt.ylabel('Waiting(Min)')
109 | plt.show()
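
The E-step divides each weighted density by the mixture density. When a point lies far from every component, both can underflow to zero in floating point. One common remedy is to work with log densities and the log-sum-exp trick. The sketch below is an alternative formulation, not the code used in `em.py`; the names `log_gaussian` and `e_step` and the toy inputs are assumptions:

```python
import numpy as np

def log_gaussian(xs, mu, cov):
    """Log density of N(mu, cov) at each row of xs, shape (N, D)."""
    d = xs.shape[1]
    diff = xs - mu
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def e_step(xs, phis, mus, covs):
    """Return responsibilities, shape (N, K), and the mean log-likelihood."""
    log_joint = np.stack(
        [np.log(phis[k]) + log_gaussian(xs, mus[k], covs[k])
         for k in range(len(phis))],
        axis=1)                                          # log(phi_k * N(x | mu_k, cov_k))
    m = log_joint.max(axis=1, keepdims=True)
    log_px = m + np.log(np.exp(log_joint - m).sum(axis=1, keepdims=True))  # log-sum-exp
    qs = np.exp(log_joint - log_px)
    return qs, float(log_px.mean())

# Toy inputs (hypothetical): the second point is far from both components.
xs = np.array([[0.0, 0.0], [100.0, 100.0]])
phis = np.array([0.5, 0.5])
mus = np.array([[0.0, 0.0], [1.0, 1.0]])
covs = np.array([np.eye(2), np.eye(2)])

qs, mean_ll = e_step(xs, phis, mus, covs)
print(qs)
print(mean_ll)
```

Subtracting the row-wise maximum before exponentiating keeps at least one term equal to 1, so the normalizer never becomes zero and each row of `qs` still sums to 1.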
--------------------------------------------------------------------------------
/step05/generate.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import matplotlib.pyplot as plt
4 |
5 | path = os.path.join(os.path.dirname(__file__), 'old_faithful.txt')
6 | original_xs = np.loadtxt(path)
7 |
8 | # learned parameters
9 | mus = np.array([[2.0, 54.50],
10 | [4.3, 80.0]])
11 | covs = np.array([[[0.07, 0.44],
12 | [0.44, 33.7]],
13 | [[0.17, 0.94],
14 | [0.94, 36.00 ]]])
15 | phis = np.array([0.35, 0.65])
16 |
17 | def multivariate_normal(x, mu, cov):
18 |     det = np.linalg.det(cov)
19 |     inv = np.linalg.inv(cov)
20 |     d = len(x)
21 |     z = 1 / np.sqrt((2 * np.pi) ** d * det)
22 |     y = z * np.exp((x - mu).T @ inv @ (x - mu) / -2.0)
23 |     return y
24 |
25 | def gmm(x, phis, mus, covs):
26 |     K = len(phis)
27 |     y = 0
28 |     for k in range(K):
29 |         phi, mu, cov = phis[k], mus[k], covs[k]
30 |         y += phi * multivariate_normal(x, mu, cov)
31 |     return y
32 |
33 | # generate data
34 | N = 500
35 | new_xs = np.zeros((N, 2))
36 | for n in range(N):
37 |     k = np.random.choice(2, p=phis)
38 |     mu, cov = mus[k], covs[k]
39 |     new_xs[n] = np.random.multivariate_normal(mu, cov)
40 |
41 | # visualize
42 | plt.scatter(original_xs[:,0], original_xs[:,1], alpha=0.7, label='original')
43 | plt.scatter(new_xs[:,0], new_xs[:,1], alpha=0.7, label='generated')
44 | plt.legend()
45 | plt.xlabel('Eruptions(Min)')
46 | plt.ylabel('Waiting(Min)')
47 | plt.show()
--------------------------------------------------------------------------------
/step05/old_faithful.txt:
--------------------------------------------------------------------------------
1 | 3.6 79
2 | 1.8 54
3 | 3.333 74
4 | 2.283 62
5 | 4.533 85
6 | 2.883 55
7 | 4.7 88
8 | 3.6 85
9 | 1.95 51
10 | 4.35 85
11 | 1.833 54
12 | 3.917 84
13 | 4.2 78
14 | 1.75 47
15 | 4.7 83
16 | 2.167 52
17 | 1.75 62
18 | 4.8 84
19 | 1.6 52
20 | 4.25 79
21 | 1.8 51
22 | 1.75 47
23 | 3.45 78
24 | 3.067 69
25 | 4.533 74
26 | 3.6 83
27 | 1.967 55
28 | 4.083 76
29 | 3.85 78
30 | 4.433 79
31 | 4.3 73
32 | 4.467 77
33 | 3.367 66
34 | 4.033 80
35 | 3.833 74
36 | 2.017 52
37 | 1.867 48
38 | 4.833 80
39 | 1.833 59
40 | 4.783 90
41 | 4.35 80
42 | 1.883 58
43 | 4.567 84
44 | 1.75 58
45 | 4.533 73
46 | 3.317 83
47 | 3.833 64
48 | 2.1 53
49 | 4.633 82
50 | 2 59
51 | 4.8 75
52 | 4.716 90
53 | 1.833 54
54 | 4.833 80
55 | 1.733 54
56 | 4.883 83
57 | 3.717 71
58 | 1.667 64
59 | 4.567 77
60 | 4.317 81
61 | 2.233 59
62 | 4.5 84
63 | 1.75 48
64 | 4.8 82
65 | 1.817 60
66 | 4.4 92
67 | 4.167 78
68 | 4.7 78
69 | 2.067 65
70 | 4.7 73
71 | 4.033 82
72 | 1.967 56
73 | 4.5 79
74 | 4 71
75 | 1.983 62
76 | 5.067 76
77 | 2.017 60
78 | 4.567 78
79 | 3.883 76
80 | 3.6 83
81 | 4.133 75
82 | 4.333 82
83 | 4.1 70
84 | 2.633 65
85 | 4.067 73
86 | 4.933 88
87 | 3.95 76
88 | 4.517 80
89 | 2.167 48
90 | 4 86
91 | 2.2 60
92 | 4.333 90
93 | 1.867 50
94 | 4.817 78
95 | 1.833 63
96 | 4.3 72
97 | 4.667 84
98 | 3.75 75
99 | 1.867 51
100 | 4.9 82
101 | 2.483 62
102 | 4.367 88
103 | 2.1 49
104 | 4.5 83
105 | 4.05 81
106 | 1.867 47
107 | 4.7 84
108 | 1.783 52
109 | 4.85 86
110 | 3.683 81
111 | 4.733 75
112 | 2.3 59
113 | 4.9 89
114 | 4.417 79
115 | 1.7 59
116 | 4.633 81
117 | 2.317 50
118 | 4.6 85
119 | 1.817 59
120 | 4.417 87
121 | 2.617 53
122 | 4.067 69
123 | 4.25 77
124 | 1.967 56
125 | 4.6 88
126 | 3.767 81
127 | 1.917 45
128 | 4.5 82
129 | 2.267 55
130 | 4.65 90
131 | 1.867 45
132 | 4.167 83
133 | 2.8 56
134 | 4.333 89
135 | 1.833 46
136 | 4.383 82
137 | 1.883 51
138 | 4.933 86
139 | 2.033 53
140 | 3.733 79
141 | 4.233 81
142 | 2.233 60
143 | 4.533 82
144 | 4.817 77
145 | 4.333 76
146 | 1.983 59
147 | 4.633 80
148 | 2.017 49
149 | 5.1 96
150 | 1.8 53
151 | 5.033 77
152 | 4 77
153 | 2.4 65
154 | 4.6 81
155 | 3.567 71
156 | 4 70
157 | 4.5 81
158 | 4.083 93
159 | 1.8 53
160 | 3.967 89
161 | 2.2 45
162 | 4.15 86
163 | 2 58
164 | 3.833 78
165 | 3.5 66
166 | 4.583 76
167 | 2.367 63
168 | 5 88
169 | 1.933 52
170 | 4.617 93
171 | 1.917 49
172 | 2.083 57
173 | 4.583 77
174 | 3.333 68
175 | 4.167 81
176 | 4.333 81
177 | 4.5 73
178 | 2.417 50
179 | 4 85
180 | 4.167 74
181 | 1.883 55
182 | 4.583 77
183 | 4.25 83
184 | 3.767 83
185 | 2.033 51
186 | 4.433 78
187 | 4.083 84
188 | 1.833 46
189 | 4.417 83
190 | 2.183 55
191 | 4.8 81
192 | 1.833 57
193 | 4.8 76
194 | 4.1 84
195 | 3.966 77
196 | 4.233 81
197 | 3.5 87
198 | 4.366 77
199 | 2.25 51
200 | 4.667 78
201 | 2.1 60
202 | 4.35 82
203 | 4.133 91
204 | 1.867 53
205 | 4.6 78
206 | 1.783 46
207 | 4.367 77
208 | 3.85 84
209 | 1.933 49
210 | 4.5 83
211 | 2.383 71
212 | 4.7 80
213 | 1.867 49
214 | 3.833 75
215 | 3.417 64
216 | 4.233 76
217 | 2.4 53
218 | 4.8 94
219 | 2 55
220 | 4.15 76
221 | 1.867 50
222 | 4.267 82
223 | 1.75 54
224 | 4.483 75
225 | 4 78
226 | 4.117 79
227 | 4.083 78
228 | 4.267 78
229 | 3.917 70
230 | 4.55 79
231 | 4.083 70
232 | 2.417 54
233 | 4.183 86
234 | 2.217 50
235 | 4.45 90
236 | 1.883 54
237 | 1.85 54
238 | 4.283 77
239 | 3.95 79
240 | 2.333 64
241 | 4.15 75
242 | 2.35 47
243 | 4.933 86
244 | 2.9 63
245 | 4.583 85
246 | 3.833 82
247 | 2.083 57
248 | 4.367 82
249 | 2.133 67
250 | 4.35 74
251 | 2.2 54
252 | 4.45 83
253 | 3.567 73
254 | 4.5 73
255 | 4.15 88
256 | 3.817 80
257 | 3.917 71
258 | 4.45 83
259 | 2 56
260 | 4.283 79
261 | 4.767 78
262 | 4.533 84
263 | 1.85 58
264 | 4.25 83
265 | 1.983 43
266 | 2.25 60
267 | 4.75 75
268 | 4.117 81
269 | 2.15 46
270 | 4.417 90
271 | 1.817 46
272 | 4.467 74
273 |
--------------------------------------------------------------------------------
/step06/gradient.py:
--------------------------------------------------------------------------------
1 | import torch
2 |
3 | def rosenbrock(x0, x1):
4 |     y = 100 * (x1 - x0 ** 2) ** 2 + (x0 - 1) ** 2
5 |     return y
6 |
7 | x0 = torch.tensor(0.0, requires_grad=True)
8 | x1 = torch.tensor(2.0, requires_grad=True)
9 |
10 | y = rosenbrock(x0, x1)
11 | y.backward()
12 | print(x0.grad, x1.grad)
13 |
14 | lr = 0.001 # learning rate
15 | iters = 10000 # iteration count
16 |
17 | for i in range(iters):
18 |     if i % 1000 == 0:
19 |         print(x0.item(), x1.item())
20 |
21 |     y = rosenbrock(x0, x1)
22 |
23 |     y.backward()
24 |
25 |     x0.data -= lr * x0.grad.data
26 |     x1.data -= lr * x1.grad.data
27 |
28 |     x0.grad.zero_()
29 |     x1.grad.zero_()
30 |
31 | print(x0.item(), x1.item())
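
The gradient that `y.backward()` produces at the starting point can be checked by hand. Differentiating the Rosenbrock function gives ∂y/∂x0 = -400·x0·(x1 - x0²) + 2·(x0 - 1) and ∂y/∂x1 = 200·(x1 - x0²). A plain-Python sketch of that check; the helper name `rosenbrock_grad` is an assumption and no PyTorch is involved:

```python
def rosenbrock(x0, x1):
    return 100 * (x1 - x0 ** 2) ** 2 + (x0 - 1) ** 2

def rosenbrock_grad(x0, x1):
    """Partial derivatives of the Rosenbrock function, derived by hand."""
    dx0 = -400 * x0 * (x1 - x0 ** 2) + 2 * (x0 - 1)
    dx1 = 200 * (x1 - x0 ** 2)
    return dx0, dx1

print(rosenbrock_grad(0.0, 2.0))  # (-2.0, 400.0)
```

Both partial derivatives vanish at (1, 1), the minimum that the gradient-descent loop moves toward.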
--------------------------------------------------------------------------------
/step06/neuralnet.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 | import torch.nn.functional as F
4 |
5 |
6 | torch.manual_seed(0)
7 | x = torch.rand(100, 1)
8 | y = torch.sin(2 * torch.pi * x) + torch.rand(100, 1)
9 |
10 | # model
11 | class Model(nn.Module):
12 |     def __init__(self, input_size=1, hidden_size=10, output_size=1):
13 |         super().__init__()
14 |         self.linear1 = nn.Linear(input_size, hidden_size)
15 |         self.linear2 = nn.Linear(hidden_size, output_size)
16 |
17 |     def forward(self, x):
18 |         y = self.linear1(x)
19 |         y = F.sigmoid(y)
20 |         y = self.linear2(y)
21 |         return y
22 |
23 |
24 | lr = 0.2
25 | iters = 10000
26 |
27 | model = Model()
28 | optimizer = torch.optim.SGD(model.parameters(), lr=lr)
29 |
30 | for i in range(iters):
31 |     y_pred = model(x)
32 |     loss = F.mse_loss(y, y_pred)
33 |     optimizer.zero_grad()
34 |     loss.backward()
35 |     optimizer.step()
36 |
37 |     if i % 1000 == 0:
38 |         print(loss.item())
39 |
40 | print(loss.item())
41 |
42 | # plot
43 | import matplotlib.pyplot as plt
44 | plt.scatter(x.detach().numpy(), y.detach().numpy(), s=10)
45 | x = torch.linspace(0, 1, 100).reshape(-1, 1)
46 | y = model(x).detach().numpy()
47 | plt.plot(x, y, color='red')
48 | plt.show()
--------------------------------------------------------------------------------
/step06/regression.py:
--------------------------------------------------------------------------------
1 | import torch
2 |
3 |
4 | torch.manual_seed(0)
5 | x = torch.rand(100, 1)
6 | y = 5 + 2 * x + torch.rand(100, 1)
7 |
8 | W = torch.zeros((1, 1), requires_grad=True)
9 | b = torch.zeros(1, requires_grad=True)
10 |
11 | def predict(x):
12 |     y = x @ W + b
13 |     return y
14 |
15 | def mean_squared_error(x0, x1):
16 |     diff = x0 - x1
17 |     N = len(diff)
18 |     return torch.sum(diff ** 2) / N
19 |
20 | lr = 0.1
21 | iters = 100
22 |
23 | for i in range(iters):
24 |     y_hat = predict(x)
25 |     loss = mean_squared_error(y, y_hat)
26 |
27 |     loss.backward()
28 |
29 |     W.data -= lr * W.grad.data
30 |     b.data -= lr * b.grad.data
31 |
32 |     W.grad.zero_()
33 |     b.grad.zero_()
34 |
35 |     if i % 10 == 0:  # print every 10 iterations
36 |         print(loss.item())
37 |
38 | print(loss.item())
39 | print('====')
40 | print('W =', W.item())
41 | print('b =', b.item())
42 |
43 |
44 | # plot
45 | import matplotlib.pyplot as plt
46 | plt.scatter(x.detach().numpy(), y.detach().numpy(), s=10)
47 | x = torch.tensor([[0.0], [1.0]])
48 | y = W.detach().numpy() * x.detach().numpy() + b.detach().numpy()
49 | plt.plot(x, y, color='red')
50 | plt.xlabel('x')
51 | plt.ylabel('y')
52 | plt.show()
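
Linear regression with a squared-error loss also has a closed-form least-squares solution, which gives a reference point for the values gradient descent is heading toward. A NumPy sketch under stated assumptions: the helper `fit_line` is hypothetical, and the data come from NumPy's generator, so they differ from the `torch.rand` data in `regression.py`:

```python
import numpy as np

def fit_line(x, y):
    """Least-squares slope and intercept for column vectors x and y."""
    A = np.hstack([x, np.ones_like(x)])          # design matrix [x, 1]
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    W, b = coef.ravel()
    return float(W), float(b)

rng = np.random.default_rng(0)
x = rng.random((100, 1))
y = 5 + 2 * x + rng.random((100, 1))             # uniform noise on [0, 1) has mean 0.5

W, b = fit_line(x, y)
print('W =', W)
print('b =', b)
```

Because the added noise has mean 0.5 rather than 0, the fitted intercept tends toward 5.5, not 5, while the slope stays near 2.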
--------------------------------------------------------------------------------
/step06/tensor.py:
--------------------------------------------------------------------------------
1 | import torch
2 |
3 | x = torch.tensor(5.0, requires_grad=True)
4 | y = 3 * x ** 2
5 | print(y)
6 |
7 | y.backward()
8 | print(x.grad)
--------------------------------------------------------------------------------
/step06/vision.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torchvision
3 | import torchvision.transforms as transforms
4 | import matplotlib.pyplot as plt
5 |
6 |
7 | # ==== MNIST ====
8 | dataset = torchvision.datasets.MNIST(
9 |     root='./data',
10 |     train=True,
11 |     transform=None,
12 |     download=True
13 | )
14 |
15 | x, label = dataset[0]
16 |
17 | print('size:', len(dataset))  # size: 60000
18 | print('type:', type(x))  # type: <class 'PIL.Image.Image'>
19 | print('label:', label)  # label: 5
20 |
21 | plt.imshow(x, cmap='gray')
22 | plt.show()
23 |
24 |
25 | # ==== preprocess ====
26 | transform = transforms.ToTensor()
27 |
28 | dataset = torchvision.datasets.MNIST(
29 |     root='./data',
30 |     train=True,
31 |     transform=transform,
32 |     download=True
33 | )
34 |
35 | x, label = dataset[0]
36 | print('type:', type(x))  # type: <class 'torch.Tensor'>
37 | print('shape:', x.shape)  # shape: torch.Size([1, 28, 28])
38 |
39 |
40 | # ==== DataLoader ====
41 | dataloader = torch.utils.data.DataLoader(
42 |     dataset,
43 |     batch_size=32,
44 |     shuffle=True)
45 |
46 | for x, label in dataloader:
47 |     print('x shape:', x.shape)  # shape: torch.Size([32, 1, 28, 28])
48 |     print('label shape:', label.shape)  # shape: torch.Size([32])
49 |     break
--------------------------------------------------------------------------------
/step07/vae.py:
--------------------------------------------------------------------------------
1 | import matplotlib.pyplot as plt
2 | import torch
3 | import torch.nn as nn
4 | import torch.nn.functional as F
5 | import torch.optim as optim
6 | import torchvision
7 | from torchvision import datasets, transforms
8 |
9 |
10 | # hyperparameters
11 | input_dim = 784 # x dimension
12 | hidden_dim = 200 # neurons in hidden layers
13 | latent_dim = 20 # z dimension
14 | epochs = 30
15 | learning_rate = 3e-4
16 | batch_size = 32
17 |
18 |
19 | class Encoder(nn.Module):
20 |     def __init__(self, input_dim, hidden_dim, latent_dim):
21 |         super().__init__()
22 |         self.linear = nn.Linear(input_dim, hidden_dim)
23 |         self.linear_mu = nn.Linear(hidden_dim, latent_dim)
24 |         self.linear_logvar = nn.Linear(hidden_dim, latent_dim)
25 |
26 |     def forward(self, x):
27 |         h = self.linear(x)
28 |         h = F.relu(h)
29 |         mu = self.linear_mu(h)
30 |         logvar = self.linear_logvar(h)
31 |         sigma = torch.exp(0.5 * logvar)
32 |         return mu, sigma
33 |
34 |
35 | class Decoder(nn.Module):
36 |     def __init__(self, latent_dim, hidden_dim, output_dim):
37 |         super().__init__()
38 |         self.linear1 = nn.Linear(latent_dim, hidden_dim)
39 |         self.linear2 = nn.Linear(hidden_dim, output_dim)
40 |
41 |     def forward(self, z):
42 |         h = self.linear1(z)
43 |         h = F.relu(h)
44 |         h = self.linear2(h)
45 |         x_hat = F.sigmoid(h)
46 |         return x_hat
47 |
48 |
49 | def reparameterize(mu, sigma):
50 |     eps = torch.randn_like(sigma)
51 |     z = mu + eps * sigma
52 |     return z
53 |
54 |
55 | class VAE(nn.Module):
56 |     def __init__(self, input_dim, hidden_dim, latent_dim):
57 |         super().__init__()
58 |         self.encoder = Encoder(input_dim, hidden_dim, latent_dim)
59 |         self.decoder = Decoder(latent_dim, hidden_dim, input_dim)
60 |
61 |     def get_loss(self, x):
62 |         mu, sigma = self.encoder(x)
63 |         z = reparameterize(mu, sigma)
64 |         x_hat = self.decoder(z)
65 |
66 |         batch_size = len(x)
67 |         L1 = F.mse_loss(x_hat, x, reduction='sum')
68 |         L2 = - torch.sum(1 + torch.log(sigma ** 2) - mu ** 2 - sigma ** 2)
69 |         return (L1 + L2) / batch_size
70 |
71 |
72 | # datasets
73 | transform = transforms.Compose([
74 |     transforms.ToTensor(),
75 |     transforms.Lambda(torch.flatten)  # flatten
76 | ])
77 | dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
78 | dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=True)
79 |
80 | model = VAE(input_dim, hidden_dim, latent_dim)
81 | optimizer = optim.Adam(model.parameters(), lr=learning_rate)
82 | losses = []
83 |
84 | for epoch in range(epochs):
85 |     loss_sum = 0.0
86 |     cnt = 0
87 |
88 |     for x, label in dataloader:
89 |         optimizer.zero_grad()
90 |         loss = model.get_loss(x)
91 |         loss.backward()
92 |         optimizer.step()
93 |
94 |         loss_sum += loss.item()
95 |         cnt += 1
96 |
97 |     loss_avg = loss_sum / cnt
98 |     print(loss_avg)
99 |     losses.append(loss_avg)
100 |
101 | # plot losses
102 | epochs = list(range(1, epochs + 1))
103 | plt.plot(epochs, losses, marker='o', linestyle='-')
104 | plt.xlabel('Epoch')
105 | plt.ylabel('Loss')
106 | plt.show()
107 |
108 |
109 | # generate new images
110 | with torch.no_grad():
111 |     sample_size = 64
112 |     z = torch.randn(sample_size, latent_dim)
113 |     x = model.decoder(z)
114 |     generated_images = x.view(sample_size, 1, 28, 28)
115 |
116 | grid_img = torchvision.utils.make_grid(generated_images, nrow=8, padding=2, normalize=True)
117 | plt.imshow(grid_img.permute(1, 2, 0))
118 | plt.axis('off')
119 | plt.show()
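
The `L2` term in `get_loss` has the form of the closed-form KL divergence between the encoder's diagonal Gaussian and the standard normal prior, KL = -1/2 · Σ(1 + log σ² - μ² - σ²), without the factor 1/2. A NumPy sketch that compares the closed form with a Monte Carlo estimate; the helper names, the sample count, and the example values of `mu` and `sigma` are assumptions for illustration:

```python
import numpy as np

def kl_to_standard_normal(mu, sigma):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return -0.5 * np.sum(1 + np.log(sigma ** 2) - mu ** 2 - sigma ** 2)

def kl_monte_carlo(mu, sigma, n=200000, seed=0):
    """Estimate the same KL as the average of log q(z) - log p(z) over z ~ q."""
    rng = np.random.default_rng(seed)
    z = mu + sigma * rng.standard_normal((n, mu.size))   # reparameterized samples
    log_q = -0.5 * np.sum(((z - mu) / sigma) ** 2 + np.log(2 * np.pi * sigma ** 2), axis=1)
    log_p = -0.5 * np.sum(z ** 2 + np.log(2 * np.pi), axis=1)
    return float(np.mean(log_q - log_p))

mu = np.array([0.5, -1.0])      # hypothetical encoder outputs
sigma = np.array([0.8, 1.5])

print('closed form :', kl_to_standard_normal(mu, sigma))
print('monte carlo :', kl_monte_carlo(mu, sigma))
```

The sampling line uses the same `mu + eps * sigma` construction as `reparameterize`; the two printed values should agree up to Monte Carlo error.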
--------------------------------------------------------------------------------
/step08/hvae.py:
--------------------------------------------------------------------------------
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchvision
from torchvision import datasets, transforms


# hyperparameters
input_dim = 784  # mnist image 28x28
hidden_dim = 100
latent_dim = 20
epochs = 30
learning_rate = 1e-3
batch_size = 32


class Encoder(nn.Module):
    def __init__(self, input_dim, hidden_dim, latent_dim):
        super().__init__()
        self.linear = nn.Linear(input_dim, hidden_dim)
        self.linear_mu = nn.Linear(hidden_dim, latent_dim)
        self.linear_logvar = nn.Linear(hidden_dim, latent_dim)

    def forward(self, x):
        h = self.linear(x)
        h = F.relu(h)
        mu = self.linear_mu(h)
        logvar = self.linear_logvar(h)
        sigma = torch.exp(0.5 * logvar)  # standard deviation from log-variance
        return mu, sigma


class Decoder(nn.Module):
    def __init__(self, latent_dim, hidden_dim, output_dim, use_sigmoid=False):
        super().__init__()
        self.linear1 = nn.Linear(latent_dim, hidden_dim)
        self.linear2 = nn.Linear(hidden_dim, output_dim)
        self.use_sigmoid = use_sigmoid

    def forward(self, z):
        h = self.linear1(z)
        h = F.relu(h)
        h = self.linear2(h)
        if self.use_sigmoid:
            h = torch.sigmoid(h)
        return h


def reparameterize(mu, sigma):
    eps = torch.randn_like(sigma)
    z = mu + eps * sigma
    return z


class VAE(nn.Module):
    def __init__(self, input_dim, hidden_dim, latent_dim):
        super().__init__()
        self.encoder1 = Encoder(input_dim, hidden_dim, latent_dim)
        self.encoder2 = Encoder(latent_dim, hidden_dim, latent_dim)
        self.decoder1 = Decoder(latent_dim, hidden_dim, input_dim, use_sigmoid=True)
        self.decoder2 = Decoder(latent_dim, hidden_dim, latent_dim)

    def get_loss(self, x):
        mu1, sigma1 = self.encoder1(x)
        z1 = reparameterize(mu1, sigma1)
        mu2, sigma2 = self.encoder2(z1)
        z2 = reparameterize(mu2, sigma2)

        z_hat = self.decoder2(z2)
        x_hat = self.decoder1(z1)

        # loss
        batch_size = len(x)
        # L1: reconstruction term (summed squared error)
        L1 = F.mse_loss(x_hat, x, reduction='sum')
        # L2: twice KL(q(z2|z1) || N(0, I))
        L2 = - torch.sum(1 + torch.log(sigma2 ** 2) - mu2 ** 2 - sigma2 ** 2)
        # L3: twice KL(q(z1|x) || N(z_hat, I))
        L3 = - torch.sum(1 + torch.log(sigma1 ** 2) - (mu1 - z_hat) ** 2 - sigma1 ** 2)
        return (L1 + L2 + L3) / batch_size


# dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Lambda(torch.flatten)
])
dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=True)

model = VAE(input_dim, hidden_dim, latent_dim)
optimizer = optim.Adam(model.parameters(), lr=learning_rate)

for epoch in range(epochs):
    loss_sum = 0.0
    cnt = 0

    for x, label in dataloader:
        optimizer.zero_grad()
        loss = model.get_loss(x)
        loss.backward()
        optimizer.step()

        loss_sum += loss.item()
        cnt += 1

    loss_avg = loss_sum / cnt
    print(loss_avg)


# visualize generated images (ancestral sampling: z2 -> z1 -> x)
with torch.no_grad():
    sample_size = 64
    z2 = torch.randn(sample_size, latent_dim)
    z1_hat = model.decoder2(z2)
    z1 = reparameterize(z1_hat, torch.ones_like(z1_hat))
    x = model.decoder1(z1)
    generated_images = x.view(sample_size, 1, 28, 28)

grid_img = torchvision.utils.make_grid(generated_images, nrow=8, padding=2, normalize=True)

plt.imshow(grid_img.permute(1, 2, 0))
plt.axis('off')
plt.show()
--------------------------------------------------------------------------------
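The `L3` term in `hvae.py` can be read as twice the KL divergence between the first-level posterior `N(mu1, sigma1^2)` and a unit-variance Gaussian centred on `z_hat`. The sketch below checks this numerically against `torch.distributions`; the tensor shapes and random values are assumptions chosen only for illustration.

```python
import torch
from torch.distributions import Normal, kl_divergence

torch.manual_seed(0)
mu1 = torch.randn(4, 20)
sigma1 = torch.rand(4, 20) + 0.1   # strictly positive standard deviations
z_hat = torch.randn(4, 20)

# L3 written exactly as in get_loss
L3 = -torch.sum(1 + torch.log(sigma1 ** 2) - (mu1 - z_hat) ** 2 - sigma1 ** 2)

# Reference: KL( N(mu1, sigma1^2) || N(z_hat, 1) ), summed over all elements
kl = kl_divergence(Normal(mu1, sigma1),
                   Normal(z_hat, torch.ones_like(z_hat))).sum()

print(L3.item(), 2 * kl.item())
```

The same unit-variance assumption shows up at sampling time, where `z1` is drawn with `torch.ones_like(z1_hat)` as its standard deviation.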
/step09/diffusion_model.py:
--------------------------------------------------------------------------------
import math
import torch
import torchvision
import matplotlib.pyplot as plt
from torchvision import transforms
from torch.utils.data import DataLoader
from torch.optim import Adam
import torch.nn.functional as F
from torch import nn
from tqdm import tqdm


img_size = 28
batch_size = 128
num_timesteps = 1000
epochs = 10
lr = 1e-3
device = 'cuda' if torch.cuda.is_available() else 'cpu'


def show_images(images, rows=2, cols=10):
    fig = plt.figure(figsize=(cols, rows))
    i = 0
    for r in range(rows):
        for c in range(cols):
            fig.add_subplot(rows, cols, i + 1)
            plt.imshow(images[i], cmap='gray')
            plt.axis('off')
            i += 1
    plt.show()

def _pos_encoding(time_idx, output_dim, device='cpu'):
    # sinusoidal encoding of a single timestep
    t, D = time_idx, output_dim
    v = torch.zeros(D, device=device)

    i = torch.arange(0, D, device=device)
    div_term = torch.exp(i / D * math.log(10000))

    v[0::2] = torch.sin(t / div_term[0::2])
    v[1::2] = torch.cos(t / div_term[1::2])
    return v

def pos_encoding(timesteps, output_dim, device='cpu'):
    batch_size = len(timesteps)
    device = timesteps.device  # the device of `timesteps` takes precedence
    v = torch.zeros(batch_size, output_dim, device=device)
    for i in range(batch_size):
        v[i] = _pos_encoding(timesteps[i], output_dim, device)
    return v

class ConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch, time_embed_dim):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU()
        )
        self.mlp = nn.Sequential(
            nn.Linear(time_embed_dim, in_ch),
            nn.ReLU(),
            nn.Linear(in_ch, in_ch)
        )

    def forward(self, x, v):
        N, C, _, _ = x.shape
        v = self.mlp(v)
        v = v.view(N, C, 1, 1)  # broadcast the time embedding over H and W
        y = self.convs(x + v)
        return y

class UNet(nn.Module):
    def __init__(self, in_ch=1, time_embed_dim=100):
        super().__init__()
        self.time_embed_dim = time_embed_dim

        self.down1 = ConvBlock(in_ch, 64, time_embed_dim)
        self.down2 = ConvBlock(64, 128, time_embed_dim)
        self.bot1 = ConvBlock(128, 256, time_embed_dim)
        self.up2 = ConvBlock(128 + 256, 128, time_embed_dim)
        self.up1 = ConvBlock(128 + 64, 64, time_embed_dim)
        self.out = nn.Conv2d(64, in_ch, 1)

        self.maxpool = nn.MaxPool2d(2)
        self.upsample = nn.Upsample(scale_factor=2, mode='bilinear')

    def forward(self, x, timesteps):
        v = pos_encoding(timesteps, self.time_embed_dim, x.device)

        x1 = self.down1(x, v)
        x = self.maxpool(x1)
        x2 = self.down2(x, v)
        x = self.maxpool(x2)

        x = self.bot1(x, v)

        x = self.upsample(x)
        x = torch.cat([x, x2], dim=1)
        x = self.up2(x, v)
        x = self.upsample(x)
        x = torch.cat([x, x1], dim=1)
        x = self.up1(x, v)
        x = self.out(x)
        return x


class Diffuser:
    def __init__(self, num_timesteps=1000, beta_start=0.0001, beta_end=0.02, device='cpu'):
        self.num_timesteps = num_timesteps
        self.device = device
        self.betas = torch.linspace(beta_start, beta_end, num_timesteps, device=device)
        self.alphas = 1 - self.betas
        self.alpha_bars = torch.cumprod(self.alphas, dim=0)

    def add_noise(self, x_0, t):
        T = self.num_timesteps
        assert (t >= 1).all() and (t <= T).all()

        t_idx = t - 1  # alpha_bars[0] is for t=1
        alpha_bar = self.alpha_bars[t_idx]  # (N,)
        N = alpha_bar.size(0)
        alpha_bar = alpha_bar.view(N, 1, 1, 1)  # (N, 1, 1, 1)

        noise = torch.randn_like(x_0, device=self.device)
        x_t = torch.sqrt(alpha_bar) * x_0 + torch.sqrt(1 - alpha_bar) * noise
        return x_t, noise

    def denoise(self, model, x, t):
        T = self.num_timesteps
        assert (t >= 1).all() and (t <= T).all()

        t_idx = t - 1  # alphas[0] is for t=1
        alpha = self.alphas[t_idx]
        alpha_bar = self.alpha_bars[t_idx]
        # at t=1 the index is -1 and wraps to the last entry;
        # this is harmless because the noise is zeroed at t=1 below
        alpha_bar_prev = self.alpha_bars[t_idx-1]

        N = alpha.size(0)
        alpha = alpha.view(N, 1, 1, 1)
        alpha_bar = alpha_bar.view(N, 1, 1, 1)
        alpha_bar_prev = alpha_bar_prev.view(N, 1, 1, 1)

        model.eval()
        with torch.no_grad():
            eps = model(x, t)
        model.train()

        noise = torch.randn_like(x, device=self.device)
        noise[t == 1] = 0  # no noise at t=1

        mu = (x - ((1-alpha) / torch.sqrt(1-alpha_bar)) * eps) / torch.sqrt(alpha)
        std = torch.sqrt((1-alpha) * (1-alpha_bar_prev) / (1-alpha_bar))
        return mu + noise * std

    def reverse_to_img(self, x):
        x = x * 255
        x = x.clamp(0, 255)
        x = x.to(torch.uint8)
        x = x.cpu()
        to_pil = transforms.ToPILImage()
        return to_pil(x)

    def sample(self, model, x_shape=(20, 1, 28, 28)):
        batch_size = x_shape[0]
        x = torch.randn(x_shape, device=self.device)

        # run the reverse process from t=T down to t=1
        for i in tqdm(range(self.num_timesteps, 0, -1)):
            t = torch.tensor([i] * batch_size, device=self.device, dtype=torch.long)
            x = self.denoise(model, x, t)

        images = [self.reverse_to_img(x[i]) for i in range(batch_size)]
        return images


preprocess = transforms.ToTensor()
dataset = torchvision.datasets.MNIST(root='./data', download=True, transform=preprocess)
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

diffuser = Diffuser(num_timesteps, device=device)
model = UNet()
model.to(device)
optimizer = Adam(model.parameters(), lr=lr)

losses = []
for epoch in range(epochs):
    loss_sum = 0.0
    cnt = 0

    # generate samples every epoch ===================
    # images = diffuser.sample(model)
    # show_images(images)
    # ================================================

    for images, labels in tqdm(dataloader):
        optimizer.zero_grad()
        x = images.to(device)
        t = torch.randint(1, num_timesteps+1, (len(x),), device=device)

        x_noisy, noise = diffuser.add_noise(x, t)
        noise_pred = model(x_noisy, t)
        loss = F.mse_loss(noise, noise_pred)

        loss.backward()
        optimizer.step()

        loss_sum += loss.item()
        cnt += 1

    loss_avg = loss_sum / cnt
    losses.append(loss_avg)
    print(f'Epoch {epoch} | Loss: {loss_avg}')

# plot losses
plt.plot(losses)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.show()

# generate samples
images = diffuser.sample(model)
show_images(images)
--------------------------------------------------------------------------------
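`pos_encoding` in `diffusion_model.py` builds the time embedding one sample at a time. A batched variant can compute the same values with broadcasting. The sketch below is one possible way to do this and is not taken from the repository: `pos_encoding_loop` restates the original definition for comparison, and `pos_encoding_vectorized` is a hypothetical replacement.

```python
import math
import torch


def pos_encoding_loop(timesteps, output_dim):
    """Per-sample loop, following the definition in diffusion_model.py."""
    D = output_dim
    i = torch.arange(0, D, device=timesteps.device)
    div_term = torch.exp(i / D * math.log(10000))
    v = torch.zeros(len(timesteps), D, device=timesteps.device)
    for n, t in enumerate(timesteps):
        v[n, 0::2] = torch.sin(t / div_term[0::2])
        v[n, 1::2] = torch.cos(t / div_term[1::2])
    return v


def pos_encoding_vectorized(timesteps, output_dim):
    """Same values, computed for the whole batch at once (hypothetical helper)."""
    D = output_dim
    i = torch.arange(0, D, device=timesteps.device)
    div_term = torch.exp(i / D * math.log(10000))           # (D,)
    args = timesteps[:, None].float() / div_term[None, :]   # (N, D)
    v = torch.zeros_like(args)
    v[:, 0::2] = torch.sin(args[:, 0::2])   # even columns: sine
    v[:, 1::2] = torch.cos(args[:, 1::2])   # odd columns: cosine
    return v


timesteps = torch.tensor([1, 50, 999, 1000])
a = pos_encoding_loop(timesteps, 100)
b = pos_encoding_vectorized(timesteps, 100)
print(a.shape, torch.allclose(a, b, atol=1e-5))
```

With a batch size of 128, this removes 128 small tensor operations per forward pass, which may matter more on a GPU, where each of them is launched separately.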
/step09/flower.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WegraLee/deep-learning-from-scratch-5/86be5ee971cf3cfe3795262f1ee07ce2322e9c9a/step09/flower.png
--------------------------------------------------------------------------------
/step09/gaussian_noise.py:
--------------------------------------------------------------------------------
import os
import numpy as np
import matplotlib.pyplot as plt
import torch
import torchvision.transforms as transforms


# diffusion process on a random tensor: apply q(x_t|x_{t-1}) T times
x = torch.randn(3, 64, 64)
T = 1000
betas = torch.linspace(0.0001, 0.02, T)

for t in range(T):
    beta = betas[t]
    eps = torch.randn_like(x)
    x = torch.sqrt(1 - beta) * x + torch.sqrt(beta) * eps

# load image
current_dir = os.path.dirname(os.path.abspath(__file__))
file_path = os.path.join(current_dir, 'flower.png')
image = plt.imread(file_path)
print(image.shape)  # (64, 64, 3)

# preprocess
preprocess = transforms.ToTensor()
x = preprocess(image)
print(x.shape)  # (3, 64, 64)

original_x = x.clone()  # keep original image

def reverse_to_img(x):
    x = x * 255
    x = x.clamp(0, 255)
    x = x.to(torch.uint8)
    to_pil = transforms.ToPILImage()
    return to_pil(x)

T = 1000
beta_start = 0.0001
beta_end = 0.02
betas = torch.linspace(beta_start, beta_end, T)
imgs = []

for t in range(T):
    if t % 100 == 0:
        # snapshot every 100 steps
        img = reverse_to_img(x)
        imgs.append(img)

    beta = betas[t]
    eps = torch.randn_like(x)
    x = torch.sqrt(1 - beta) * x + torch.sqrt(beta) * eps

# show imgs
plt.figure(figsize=(15, 6))
for i, img in enumerate(imgs[:10]):
    plt.subplot(2, 5, i + 1)
    plt.imshow(img)
    plt.title(f'Noise: {i * 100}')
    plt.axis('off')
plt.show()



# ============================================
# q(x_t|x_0)
# ============================================
def add_noise(x_0, t, betas):
    T = len(betas)
    assert t >= 1 and t <= T
    t_idx = t - 1  # betas[0] is for t=1

    alphas = 1 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    alpha_bar = alpha_bars[t_idx]

    eps = torch.randn_like(x_0)
    x_t = torch.sqrt(alpha_bar) * x_0 + torch.sqrt(1 - alpha_bar) * eps
    return x_t

x = original_x

t = 100
x_t = add_noise(x, t, betas)

img = reverse_to_img(x_t)
plt.imshow(img)
plt.title(f'Noise: {t}')
plt.axis('off')
plt.show()
--------------------------------------------------------------------------------
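`gaussian_noise.py` adds noise in two ways: step by step with `q(x_t|x_{t-1})`, and in one shot with `q(x_t|x_0)`. Both describe the same distribution, because the coefficient on `x_0` accumulates to `sqrt(alpha_bar_t)` and the noise variance accumulates to `1 - alpha_bar_t`. The sketch below tracks those two scalars using only the standard library. The linear schedule is rebuilt by hand to mirror `torch.linspace`, and the variable names are illustrative.

```python
import math

T = 1000
beta_start, beta_end = 0.0001, 0.02
# Linear schedule, mirroring torch.linspace(beta_start, beta_end, T)
betas = [beta_start + (beta_end - beta_start) * k / (T - 1) for k in range(T)]

mean_coef = 1.0   # coefficient multiplying x_0 after t steps
var = 0.0         # variance of the accumulated Gaussian noise
alpha_bar = 1.0   # cumulative product of (1 - beta)
for beta in betas:
    mean_coef *= math.sqrt(1.0 - beta)   # x_t = sqrt(1 - beta) * x_{t-1} + ...
    var = (1.0 - beta) * var + beta      # ... + sqrt(beta) * eps
    alpha_bar *= 1.0 - beta

print(mean_coef, math.sqrt(alpha_bar))
print(var, 1.0 - alpha_bar)
```

After all 1000 steps `alpha_bar` is close to zero, so `x_T` is almost pure standard Gaussian noise, which is why sampling can start from `torch.randn`.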
/step09/simple_unet.py:
--------------------------------------------------------------------------------
import torch
from torch import nn


class ConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU()
        )

    def forward(self, x):
        return self.convs(x)


class UNet(nn.Module):
    def __init__(self, in_ch=1):
        super().__init__()

        self.down1 = ConvBlock(in_ch, 64)
        self.down2 = ConvBlock(64, 128)
        self.bot1 = ConvBlock(128, 256)
        self.up2 = ConvBlock(128 + 256, 128)
        self.up1 = ConvBlock(128 + 64, 64)
        self.out = nn.Conv2d(64, in_ch, 1)

        self.maxpool = nn.MaxPool2d(2)
        self.upsample = nn.Upsample(scale_factor=2, mode='bilinear')

    def forward(self, x):
        x1 = self.down1(x)
        x = self.maxpool(x1)
        x2 = self.down2(x)
        x = self.maxpool(x2)

        x = self.bot1(x)

        x = self.upsample(x)
        x = torch.cat([x, x2], dim=1)
        x = self.up2(x)
        x = self.upsample(x)
        x = torch.cat([x, x1], dim=1)
        x = self.up1(x)
        x = self.out(x)
        return x


model = UNet()
x = torch.randn(10, 1, 28, 28)  # dummy input
y = model(x)
print(y.shape)
--------------------------------------------------------------------------------
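The skip connections in `UNet.forward` concatenate an upsampled tensor with a tensor saved before pooling, so the two must have the same height and width. The sketch below traces one spatial dimension through the two pooling and two upsampling steps in plain Python. `unet_skip_shapes_match` is a hypothetical helper, and it assumes the `MaxPool2d(2)` / `Upsample(scale_factor=2)` pairing used in the file.

```python
def unet_skip_shapes_match(size, num_pools=2):
    """Trace one spatial dimension through the pooling/upsampling path.

    MaxPool2d(2) floors size / 2; Upsample(scale_factor=2) doubles it.
    torch.cat along channels needs the upsampled size to equal the skip size.
    """
    skips = []
    for _ in range(num_pools):
        skips.append(size)   # x1, x2 are saved before each pooling step
        size = size // 2
    for skip in reversed(skips):
        size = size * 2
        if size != skip:
            return False
    return True


for s in (28, 30, 32):
    print(s, unet_skip_shapes_match(s))
```

MNIST's 28x28 images pass (28, 14, 7, 14, 28), while a 30x30 input would fail at the first concatenation (15 pools down to 7, which upsamples to 14, not 15). For this two-level network, the side length needs to be a multiple of 4.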
/step10/classifier_free_guidance.py:
--------------------------------------------------------------------------------
import math
import numpy as np
import torch
import torchvision
import matplotlib.pyplot as plt
from torchvision import transforms
from torch.utils.data import DataLoader
from torch.optim import Adam
import torch.nn.functional as F
from torch import nn
from tqdm import tqdm


img_size = 28
batch_size = 128
num_timesteps = 1000
epochs = 10
lr = 1e-3
device = 'cuda' if torch.cuda.is_available() else 'cpu'


def show_images(images, labels=None, rows=2, cols=10):
    fig = plt.figure(figsize=(cols, rows))
    i = 0
    for r in range(rows):
        for c in range(cols):
            ax = fig.add_subplot(rows, cols, i + 1)
            plt.imshow(images[i], cmap='gray')
            if labels is not None:
                ax.set_xlabel(labels[i].item())
            ax.get_xaxis().set_ticks([])
            ax.get_yaxis().set_ticks([])
            i += 1
    plt.tight_layout()
    plt.show()

def _pos_encoding(time_idx, output_dim, device='cpu'):
    # sinusoidal encoding of a single timestep
    t, D = time_idx, output_dim
    v = torch.zeros(D, device=device)

    i = torch.arange(0, D, device=device)
    div_term = torch.exp(i / D * math.log(10000))

    v[0::2] = torch.sin(t / div_term[0::2])
    v[1::2] = torch.cos(t / div_term[1::2])
    return v

def pos_encoding(timesteps, output_dim, device='cpu'):
    batch_size = len(timesteps)
    device = timesteps.device  # the device of `timesteps` takes precedence
    v = torch.zeros(batch_size, output_dim, device=device)
    for i in range(batch_size):
        v[i] = _pos_encoding(timesteps[i], output_dim, device)
    return v

class ConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch, time_embed_dim):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU()
        )
        self.mlp = nn.Sequential(
            nn.Linear(time_embed_dim, in_ch),
            nn.ReLU(),
            nn.Linear(in_ch, in_ch)
        )

    def forward(self, x, v):
        N, C, _, _ = x.shape
        v = self.mlp(v)
        v = v.view(N, C, 1, 1)  # broadcast the embedding over H and W
        y = self.convs(x + v)
        return y

class UNetCond(nn.Module):
    def __init__(self, in_ch=1, time_embed_dim=100, num_labels=None):
        super().__init__()
        self.time_embed_dim = time_embed_dim

        self.down1 = ConvBlock(in_ch, 64, time_embed_dim)
        self.down2 = ConvBlock(64, 128, time_embed_dim)
        self.bot1 = ConvBlock(128, 256, time_embed_dim)
        self.up2 = ConvBlock(128 + 256, 128, time_embed_dim)
        self.up1 = ConvBlock(128 + 64, 64, time_embed_dim)
        self.out = nn.Conv2d(64, in_ch, 1)

        self.maxpool = nn.MaxPool2d(2)
        self.upsample = nn.Upsample(scale_factor=2, mode='bilinear')

        if num_labels is not None:
            self.label_emb = nn.Embedding(num_labels, time_embed_dim)

    def forward(self, x, timesteps, labels=None):
        t = pos_encoding(timesteps, self.time_embed_dim)

        # labels=None gives the unconditional prediction
        if labels is not None:
            t += self.label_emb(labels)

        x1 = self.down1(x, t)
        x = self.maxpool(x1)
        x2 = self.down2(x, t)
        x = self.maxpool(x2)

        x = self.bot1(x, t)

        x = self.upsample(x)
        x = torch.cat([x, x2], dim=1)
        x = self.up2(x, t)
        x = self.upsample(x)
        x = torch.cat([x, x1], dim=1)
        x = self.up1(x, t)
        x = self.out(x)
        return x


class Diffuser:
    def __init__(self, num_timesteps=1000, beta_start=0.0001, beta_end=0.02, device='cpu'):
        self.num_timesteps = num_timesteps
        self.device = device
        self.betas = torch.linspace(beta_start, beta_end, num_timesteps, device=device)
        self.alphas = 1 - self.betas
        self.alpha_bars = torch.cumprod(self.alphas, dim=0)

    def add_noise(self, x_0, t):
        T = self.num_timesteps
        assert (t >= 1).all() and (t <= T).all()

        t_idx = t - 1  # alpha_bars[0] is for t=1
        alpha_bar = self.alpha_bars[t_idx]  # (N,)
        alpha_bar = alpha_bar.view(alpha_bar.size(0), 1, 1, 1)  # (N, 1, 1, 1)

        noise = torch.randn_like(x_0, device=self.device)
        x_t = torch.sqrt(alpha_bar) * x_0 + torch.sqrt(1 - alpha_bar) * noise
        return x_t, noise

    def denoise(self, model, x, t, labels, gamma):
        T = self.num_timesteps
        assert (t >= 1).all() and (t <= T).all()

        t_idx = t - 1  # alphas[0] is for t=1
        alpha = self.alphas[t_idx]
        alpha_bar = self.alpha_bars[t_idx]
        alpha_bar_prev = self.alpha_bars[t_idx-1]

        N = alpha.size(0)
        alpha = alpha.view(N, 1, 1, 1)
        alpha_bar = alpha_bar.view(N, 1, 1, 1)
        alpha_bar_prev = alpha_bar_prev.view(N, 1, 1, 1)

        model.eval()
        with torch.no_grad():
            eps = model(x, t, labels)   # conditional prediction
            eps_uncond = model(x, t)    # unconditional prediction
            # classifier-free guidance with strength gamma
            eps = eps_uncond + gamma * (eps - eps_uncond)
        model.train()

        noise = torch.randn_like(x, device=self.device)
        noise[t == 1] = 0  # no noise at t=1

        mu = (x - ((1-alpha) / torch.sqrt(1-alpha_bar)) * eps) / torch.sqrt(alpha)
        std = torch.sqrt((1-alpha) * (1-alpha_bar_prev) / (1-alpha_bar))
        return mu + noise * std

    def reverse_to_img(self, x):
        x = x * 255
        x = x.clamp(0, 255)
        x = x.to(torch.uint8)
        x = x.cpu()
        to_pil = transforms.ToPILImage()
        return to_pil(x)

    def sample(self, model, x_shape=(20, 1, 28, 28), labels=None, gamma=3.0):
        batch_size = x_shape[0]
        x = torch.randn(x_shape, device=self.device)
        if labels is None:
            labels = torch.randint(0, 10, (len(x),), device=self.device)

        for i in tqdm(range(self.num_timesteps, 0, -1)):
            t = torch.tensor([i] * batch_size, device=self.device, dtype=torch.long)
            x = self.denoise(model, x, t, labels, gamma)

        images = [self.reverse_to_img(x[i]) for i in range(batch_size)]
        return images, labels


preprocess = transforms.ToTensor()
dataset = torchvision.datasets.MNIST(root='./data', download=True, transform=preprocess)
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

diffuser = Diffuser(num_timesteps, device=device)
model = UNetCond(num_labels=10)
model.to(device)
optimizer = Adam(model.parameters(), lr=lr)

losses = []
for epoch in range(epochs):
    loss_sum = 0.0
    cnt = 0

    # generate samples every epoch ===================
    #images, labels = diffuser.sample(model)
    #show_images(images, labels)
    # ================================================

    for images, labels in tqdm(dataloader):
        optimizer.zero_grad()
        x = images.to(device)
        labels = labels.to(device)
        t = torch.randint(1, num_timesteps+1, (len(x),), device=device)

        # with probability 0.1, drop the labels for the whole batch
        # so the model also learns the unconditional prediction
        if np.random.random() < 0.1:
            labels = None

        x_noisy, noise = diffuser.add_noise(x, t)
        noise_pred = model(x_noisy, t, labels)
        loss = F.mse_loss(noise, noise_pred)

        loss.backward()
        optimizer.step()

        loss_sum += loss.item()
        cnt += 1

    loss_avg = loss_sum / cnt
    losses.append(loss_avg)
    print(f'Epoch {epoch} | Loss: {loss_avg}')

# plot losses
plt.plot(losses)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.show()

# generate samples
images, labels = diffuser.sample(model)
show_images(images, labels)
--------------------------------------------------------------------------------
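The guidance step in `denoise` combines the conditional and unconditional noise predictions with a single scalar `gamma`. The sketch below isolates that one line so its limiting cases are easy to see. `guided_eps` is a hypothetical helper name, and the random tensors stand in for real model outputs.

```python
import torch


def guided_eps(eps_cond, eps_uncond, gamma):
    """Classifier-free guidance: start at the unconditional prediction and move
    toward (and, for gamma > 1, beyond) the conditional one."""
    return eps_uncond + gamma * (eps_cond - eps_uncond)


torch.manual_seed(0)
eps_cond = torch.randn(2, 1, 28, 28)     # stand-in for model(x, t, labels)
eps_uncond = torch.randn(2, 1, 28, 28)   # stand-in for model(x, t)

e0 = guided_eps(eps_cond, eps_uncond, 0.0)   # reduces to the unconditional prediction
e1 = guided_eps(eps_cond, eps_uncond, 1.0)   # reduces to the conditional prediction
e3 = guided_eps(eps_cond, eps_uncond, 3.0)   # extrapolates past the conditional one
```

`sample()` defaults to `gamma=3.0`, which is the extrapolating case. Each denoising step then needs two forward passes through the model, one with labels and one without.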
/step10/conditional.py:
--------------------------------------------------------------------------------
import math
import torch
import torchvision
import matplotlib.pyplot as plt
from torchvision import transforms
from torch.utils.data import DataLoader
from torch.optim import Adam
import torch.nn.functional as F
from torch import nn
from tqdm import tqdm


img_size = 28
batch_size = 128
num_timesteps = 1000
epochs = 10
lr = 1e-3
device = 'cuda' if torch.cuda.is_available() else 'cpu'


def show_images(images, labels=None, rows=2, cols=10):
    fig = plt.figure(figsize=(cols, rows))
    i = 0
    for r in range(rows):
        for c in range(cols):
            ax = fig.add_subplot(rows, cols, i + 1)
            plt.imshow(images[i], cmap='gray')
            if labels is not None:
                ax.set_xlabel(labels[i].item())
            ax.get_xaxis().set_ticks([])
            ax.get_yaxis().set_ticks([])
            i += 1
    plt.tight_layout()
    plt.show()

def _pos_encoding(time_idx, output_dim, device='cpu'):
    # sinusoidal encoding of a single timestep
    t, D = time_idx, output_dim
    v = torch.zeros(D, device=device)

    i = torch.arange(0, D, device=device)
    div_term = torch.exp(i / D * math.log(10000))

    v[0::2] = torch.sin(t / div_term[0::2])
    v[1::2] = torch.cos(t / div_term[1::2])
    return v

def pos_encoding(timesteps, output_dim, device='cpu'):
    batch_size = len(timesteps)
    device = timesteps.device  # the device of `timesteps` takes precedence
    v = torch.zeros(batch_size, output_dim, device=device)
    for i in range(batch_size):
        v[i] = _pos_encoding(timesteps[i], output_dim, device)
    return v

class ConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch, time_embed_dim):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU()
        )
        self.mlp = nn.Sequential(
            nn.Linear(time_embed_dim, in_ch),
            nn.ReLU(),
            nn.Linear(in_ch, in_ch)
        )

    def forward(self, x, v):
        N, C, _, _ = x.shape
        v = self.mlp(v)
        v = v.view(N, C, 1, 1)  # broadcast the embedding over H and W
        y = self.convs(x + v)
        return y

class UNetCond(nn.Module):
    def __init__(self, in_ch=1, time_embed_dim=100, num_labels=None):
        super().__init__()
        self.time_embed_dim = time_embed_dim

        self.down1 = ConvBlock(in_ch, 64, time_embed_dim)
        self.down2 = ConvBlock(64, 128, time_embed_dim)
        self.bot1 = ConvBlock(128, 256, time_embed_dim)
        self.up2 = ConvBlock(128 + 256, 128, time_embed_dim)
        self.up1 = ConvBlock(128 + 64, 64, time_embed_dim)
        self.out = nn.Conv2d(64, in_ch, 1)

        self.maxpool = nn.MaxPool2d(2)
        self.upsample = nn.Upsample(scale_factor=2, mode='bilinear')

        if num_labels is not None:
            self.label_emb = nn.Embedding(num_labels, time_embed_dim)

    def forward(self, x, timesteps, labels=None):
        t = pos_encoding(timesteps, self.time_embed_dim)

        if labels is not None:
            t += self.label_emb(labels)

        x1 = self.down1(x, t)
        x = self.maxpool(x1)
        x2 = self.down2(x, t)
        x = self.maxpool(x2)

        x = self.bot1(x, t)

        x = self.upsample(x)
        x = torch.cat([x, x2], dim=1)
        x = self.up2(x, t)
        x = self.upsample(x)
        x = torch.cat([x, x1], dim=1)
        x = self.up1(x, t)
        x = self.out(x)
        return x


class Diffuser:
    def __init__(self, num_timesteps=1000, beta_start=0.0001, beta_end=0.02, device='cpu'):
        self.num_timesteps = num_timesteps
        self.device = device
        self.betas = torch.linspace(beta_start, beta_end, num_timesteps, device=device)
        self.alphas = 1 - self.betas
        self.alpha_bars = torch.cumprod(self.alphas, dim=0)

    def add_noise(self, x_0, t):
        T = self.num_timesteps
        assert (t >= 1).all() and (t <= T).all()

        t_idx = t - 1  # alpha_bars[0] is for t=1
        alpha_bar = self.alpha_bars[t_idx]  # (N,)
        alpha_bar = alpha_bar.view(alpha_bar.size(0), 1, 1, 1)  # (N, 1, 1, 1)

        noise = torch.randn_like(x_0, device=self.device)
        x_t = torch.sqrt(alpha_bar) * x_0 + torch.sqrt(1 - alpha_bar) * noise
        return x_t, noise

    def denoise(self, model, x, t, labels):
        T = self.num_timesteps
        assert (t >= 1).all() and (t <= T).all()

        t_idx = t - 1  # alphas[0] is for t=1
        alpha = self.alphas[t_idx]
        alpha_bar = self.alpha_bars[t_idx]
        alpha_bar_prev = self.alpha_bars[t_idx-1]

        N = alpha.size(0)
        alpha = alpha.view(N, 1, 1, 1)
        alpha_bar = alpha_bar.view(N, 1, 1, 1)
        alpha_bar_prev = alpha_bar_prev.view(N, 1, 1, 1)

        model.eval()
        with torch.no_grad():
            eps = model(x, t, labels)  # add label embedding
        model.train()

        noise = torch.randn_like(x, device=self.device)
        noise[t == 1] = 0  # no noise at t=1

        mu = (x - ((1-alpha) / torch.sqrt(1-alpha_bar)) * eps) / torch.sqrt(alpha)
        std = torch.sqrt((1-alpha) * (1-alpha_bar_prev) / (1-alpha_bar))
        return mu + noise * std

    def reverse_to_img(self, x):
        x = x * 255
        x = x.clamp(0, 255)
        x = x.to(torch.uint8)
        x = x.cpu()
        to_pil = transforms.ToPILImage()
        return to_pil(x)

    def sample(self, model, x_shape=(20, 1, 28, 28), labels=None):
        batch_size = x_shape[0]
        x = torch.randn(x_shape, device=self.device)
        if labels is None:
            labels = torch.randint(0, 10, (len(x),), device=self.device)

        for i in tqdm(range(self.num_timesteps, 0, -1)):
            t = torch.tensor([i] * batch_size, device=self.device, dtype=torch.long)
            x = self.denoise(model, x, t, labels)

        images = [self.reverse_to_img(x[i]) for i in range(batch_size)]
        return images, labels


preprocess = transforms.ToTensor()
dataset = torchvision.datasets.MNIST(root='./data', download=True, transform=preprocess)
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)

diffuser = Diffuser(num_timesteps, device=device)
model = UNetCond(num_labels=10)
model.to(device)
optimizer = Adam(model.parameters(), lr=lr)

losses = []
for epoch in range(epochs):
    loss_sum = 0.0
    cnt = 0

    # generate samples every epoch ===================
    #images, labels = diffuser.sample(model)
    #show_images(images, labels)
    # ================================================

    for images, labels in tqdm(dataloader):
        optimizer.zero_grad()
        x = images.to(device)
        labels = labels.to(device)
        t = torch.randint(1, num_timesteps+1, (len(x),), device=device)

        x_noisy, noise = diffuser.add_noise(x, t)
        noise_pred = model(x_noisy, t, labels)
        loss = F.mse_loss(noise, noise_pred)

        loss.backward()
        optimizer.step()

        loss_sum += loss.item()
        cnt += 1

    loss_avg = loss_sum / cnt
    losses.append(loss_avg)
    print(f'Epoch {epoch} | Loss: {loss_avg}')

# plot losses
plt.plot(losses)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.show()

# generate samples
images, labels = diffuser.sample(model)
show_images(images, labels)
--------------------------------------------------------------------------------
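In `Diffuser.denoise`, `self.alpha_bars[t_idx-1]` is evaluated with index -1 when `t == 1`, which wraps around to the last schedule entry. The repository code is unaffected because it zeroes the noise at `t == 1`. The sketch below shows one possible alternative, not the author's method, in which a shifted table with `alpha_bar_0 = 1` prepended makes the standard deviation vanish at `t == 1` on its own. The name `alpha_bars_prev` is hypothetical.

```python
import torch

T = 1000
betas = torch.linspace(0.0001, 0.02, T)
alphas = 1 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

# Shifted copy with alpha_bar_0 = 1 prepended, so index t-1 holds alpha_bar_{t-1}
alpha_bars_prev = torch.cat([torch.ones(1), alpha_bars[:-1]])

t = torch.tensor([1, 2, 1000])
t_idx = t - 1

# As in denoise(): index -1 at t=1 wraps to alpha_bar_T
std_wrapped = torch.sqrt(
    (1 - alphas[t_idx]) * (1 - alpha_bars[t_idx - 1]) / (1 - alpha_bars[t_idx]))

# With the padded table: (1 - alpha_bar_0) = 0, so std is exactly 0 at t=1
std_padded = torch.sqrt(
    (1 - alphas[t_idx]) * (1 - alpha_bars_prev[t_idx]) / (1 - alpha_bars[t_idx]))

print(std_wrapped, std_padded)
```

For every `t >= 2` the two versions agree, so the difference only matters if the explicit `noise[t == 1] = 0` line is ever removed.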