├── .gitignore
├── images
│   ├── figures.png
│   ├── cDCGAN_epoch_20.png
│   ├── cDCGAN_ancient_egyptian.png
│   ├── cDCGAN_losses_epoch_20.png
│   └── cDCGAN_american_craftsman.png
├── README.md
└── generate.ipynb

/.gitignore:
--------------------------------------------------------------------------------
1 | .DS_Store
2 | .ipynb_checkpoints*
3 | arcDataset*
4 | classify_models*
5 | generated_images*
--------------------------------------------------------------------------------
/images/figures.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/carolineh101/deep-learning-architecture/HEAD/images/figures.png
--------------------------------------------------------------------------------
/images/cDCGAN_epoch_20.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/carolineh101/deep-learning-architecture/HEAD/images/cDCGAN_epoch_20.png
--------------------------------------------------------------------------------
/images/cDCGAN_ancient_egyptian.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/carolineh101/deep-learning-architecture/HEAD/images/cDCGAN_ancient_egyptian.png
--------------------------------------------------------------------------------
/images/cDCGAN_losses_epoch_20.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/carolineh101/deep-learning-architecture/HEAD/images/cDCGAN_losses_epoch_20.png
--------------------------------------------------------------------------------
/images/cDCGAN_american_craftsman.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/carolineh101/deep-learning-architecture/HEAD/images/cDCGAN_american_craftsman.png
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Deep Learning Architectures for Building Architecture
2 | 
3 | **Title:** Building Deep Learning Architectures to Understand Building Architecture Styles
4 | 
5 | **Authors:** Caroline Ho & Cole Thomson `{cho19, colet}@stanford.edu`
6 | 
7 | **Course:** CS 230 – Deep Learning
8 | 
9 | ## Requirements
10 | 
11 | - PyTorch and supporting libraries: [Anaconda Python distribution](https://www.anaconda.com)
12 | - Dataset: [Xu et al. 25-class architecture dataset](https://drive.google.com/file/d/0Bwo0SFiZwl3JVGRlWGZUaW5va00/edit)
13 | 
14 | ### classify.ipynb
15 | 
16 | - Install [tabulate](https://pypi.org/project/tabulate/): `pip install tabulate`
17 | - Install [TNT](https://github.com/pytorch/tnt): `pip install torchnet`
18 | 
19 | ## Description
20 | 
21 | ### classify.ipynb
22 | 
23 | This notebook uses transfer learning to classify images of buildings by architectural style.
24 | 
25 | **Best Results:** By fine-tuning a DenseNet pretrained on ImageNet, we achieve an accuracy of **0.795833** and an F1 score of **0.789431**. (Visualizations are available in the notebook.)
26 | 
27 | ### generate.ipynb
28 | 
29 | This notebook generates images of buildings conditioned on architectural style using a conditional deep convolutional GAN (cDCGAN).
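30 | 
31 | Conditioning works by giving the style label to both networks: the generator receives a one-hot style code alongside the noise vector, and the discriminator receives the label broadcast to image-sized channels. As a minimal sketch (not code from the notebook), sampling one 64×64 image per style from a trained generator could look like the following; it assumes the `Generator` class from `generate.ipynb` is in scope, and `cDCGAN_generator.pth` is a hypothetical checkpoint name, since the notebook does not save weights itself:
32 | 
33 | ```python
34 | import torch
35 | 
36 | label_dim, G_input_dim = 25, 625     # 25 style classes; noise size, from generate.ipynb
37 | num_filters = [1024, 512, 256, 128]  # generator filter sizes, from generate.ipynb
38 | 
39 | # One-hot style codes of shape (25, 25, 1, 1), equivalent to the notebook's `onehot` tensor
40 | onehot = torch.eye(label_dim).view(label_dim, label_dim, 1, 1)
41 | 
42 | G = Generator(G_input_dim, label_dim, num_filters, 3)
43 | G.load_state_dict(torch.load('cDCGAN_generator.pth'))  # hypothetical checkpoint
44 | G.eval()
45 | 
46 | with torch.no_grad():
47 |     z = torch.randn(label_dim, G_input_dim, 1, 1)  # one noise vector per style
48 |     images = G(z, onehot)                          # (25, 3, 64, 64), Tanh range [-1, 1]
49 |     images = (images + 1) / 2                      # same rescaling as the notebook's denorm()
50 | ```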
51 | 
52 | **Results after 20 epochs:**
53 | 
54 | ![cDCGAN results](images/cDCGAN_epoch_20.png)
55 | 
56 | Our most successful generated image is this example of **Ancient Egyptian architecture**, which is visibly a pyramid:
57 | 
58 | ![Generated Egyptian Pyramid](images/cDCGAN_ancient_egyptian.png)
59 | 
60 | However, most of our images, including this example of **American Craftsman architecture**, are less clear. (If you look closely, you can make out a blurry gabled brown roof and white walls.)
61 | 
62 | ![Generated American Craftsman](images/cDCGAN_american_craftsman.png)
63 | 
64 | 
65 | ## Acknowledgments
66 | 
67 | Much of our code is adapted from the following sources:
68 | 
69 | - Data: [Architectural Style Classification using MLLR](https://sites.google.com/site/zhexuutssjtu/projects/arch)
70 | - Classification: [PyTorch Transfer Learning Tutorial](https://pytorch.org/tutorials/beginner/transfer_learning_tutorial.html) and [Finetuning Torchvision Models Tutorial](https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html)
71 | - Confusion Matrix: [scikit-learn](https://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html)
72 | - Conditional GAN: [togheppi's cDCGAN](https://github.com/togheppi/cDCGAN)
73 | 
--------------------------------------------------------------------------------
/generate.ipynb:
--------------------------------------------------------------------------------
1 | {
2 |  "cells": [
3 |   {
4 |    "cell_type": "markdown",
5 |    "metadata": {},
6 |    "source": [
7 |     "# Architectural Style Generation"
8 |    ]
9 |   },
10 |   {
11 |    "cell_type": "code",
12 |    "execution_count": null,
13 |    "metadata": {},
14 |    "outputs": [],
15 |    "source": [
16 |     "__author__ = \"Caroline Ho and Cole Thomson\"\n",
17 |     "__version__ = \"CS230, Stanford, Autumn 2018 term\""
18 |    ]
19 |   },
20 |   {
21 |    "cell_type": "markdown",
22 |    "metadata": {},
23 |    "source": [
24 |     "## Contents\n",
25 |     "1. [Overview](#Overview)\n",
26 |     "2. [Set-Up](#Set-Up)\n",
27 |     "3. [Data](#Data)\n4. [Helper Functions](#Helper-Functions)\n",
28 |     "5. [Model](#Model)\n6. [Plotting Functions](#Plotting-Functions)\n",
29 |     "7. [Run Model](#Run-Model)\n",
30 |     "8. [Resources](#Resources)"
31 |    ]
32 |   },
33 |   {
34 |    "cell_type": "markdown",
35 |    "metadata": {},
36 |    "source": [
37 |     "## Overview\n",
38 |     "\n",
39 |     "In this notebook, we use a conditional GAN to generate images of buildings in given architectural styles."
40 |    ]
41 |   },
42 |   {
43 |    "cell_type": "markdown",
44 |    "metadata": {},
45 |    "source": [
46 |     "## Set-Up\n",
47 |     "\n",
48 |     "Run the following cells to import the necessary libraries and functions and to set global variables."
49 |    ]
50 |   },
51 |   {
52 |    "cell_type": "code",
53 |    "execution_count": 1,
54 |    "metadata": {},
55 |    "outputs": [],
56 |    "source": [
57 |     "import torch\n",
58 |     "from torch.autograd import Variable\n",
59 |     "import torchvision.datasets as dsets\n",
60 |     "import torchvision.transforms as transforms\n",
61 |     "import numpy as np\n",
62 |     "import matplotlib.pyplot as plt\n",
63 |     "import os\n",
64 |     "import imageio"
65 |    ]
66 |   },
67 |   {
68 |    "cell_type": "code",
69 |    "execution_count": 2,
70 |    "metadata": {},
71 |    "outputs": [],
72 |    "source": [
73 |     "# Parameters\n",
74 |     "image_size = 64\n",
75 |     "label_dim = 25\n",
76 |     "G_input_dim = 625\n",
77 |     "G_output_dim = 3\n",
78 |     "D_input_dim = 3\n",
79 |     "D_output_dim = 1\n",
80 |     "num_filters = [1024, 512, 256, 128]\n",
81 |     "\n",
82 |     "learning_rate = 0.0002\n",
83 |     "betas = (0.5, 0.999)\n",
84 |     "batch_size = 4\n",
85 |     "num_epochs = 20\n",
86 |     "save_dir = 'generated_images/'"
87 |    ]
88 |   },
89 |   {
90 |    "cell_type": "markdown",
91 |    "metadata": {},
92 |    "source": [
93 |     "## Data\n",
94 |     "\n",
95 |     "Download [Xu et al.'s architecture dataset](https://drive.google.com/file/d/0Bwo0SFiZwl3JVGRlWGZUaW5va00/edit) and place it in the current directory.\n",
96 |     "\n",
97 |     "This dataset contains 4,843 images of buildings from 25 architectural style classes, ranging from Achaemenid to Tudor Revival. Image dimensions and aspect ratios are not consistent."
98 |    ]
99 |   },
100 |   {
101 |    "cell_type": "code",
102 |    "execution_count": 3,
103 |    "metadata": {},
104 |    "outputs": [],
105 |    "source": [
106 |     "transform = transforms.Compose([\n",
107 |     "    transforms.Resize(image_size),\n",
108 |     "    transforms.CenterCrop(image_size),\n",
109 |     "    transforms.ToTensor(),\n",
110 |     "    transforms.Normalize(mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5))\n",
111 |     "])\n",
112 |     "\n",
113 |     "data = dsets.ImageFolder('arcDataset', transform=transform)\n",
114 |     "\n",
115 |     "data_loader = torch.utils.data.DataLoader(dataset=data,\n",
116 |     "                                          batch_size=batch_size,\n",
117 |     "                                          shuffle=True)"
118 |    ]
119 |   },
120 |   {
121 |    "cell_type": "markdown",
122 |    "metadata": {},
123 |    "source": [
124 |     "## Helper Functions"
125 |    ]
126 |   },
127 |   {
128 |    "cell_type": "code",
129 |    "execution_count": 4,
130 |    "metadata": {},
131 |    "outputs": [],
132 |    "source": [
133 |     "def to_var(x):\n",
134 |     "    if torch.cuda.is_available():\n",
135 |     "        x = x.cuda()\n",
136 |     "    return Variable(x)"
137 |    ]
138 |   },
139 |   {
140 |    "cell_type": "code",
141 |    "execution_count": 5,
142 |    "metadata": {},
143 |    "outputs": [],
144 |    "source": [
145 |     "# De-normalization\n",
146 |     "def denorm(x):\n",
147 |     "    out = (x + 1) / 2\n",
148 |     "    return out.clamp(0, 1)"
149 |    ]
150 |   },
151 |   {
152 |    "cell_type": "markdown",
153 |    "metadata": {},
154 |    "source": [
155 |     "## Model"
156 |    ]
157 |   },
158 |   {
159 |    "cell_type": "code",
160 |    "execution_count": 6,
161 |    "metadata": {},
162 |    "outputs": [],
163 |    "source": [
164 |     "# Generator model\n",
165 |     "class Generator(torch.nn.Module):\n",
166 |     "    def __init__(self, input_dim, label_dim, num_filters, output_dim):\n",
167 |     "        super(Generator, self).__init__()\n",
168 |     "\n",
169 |     "        # Hidden layers\n",
170 |     "        self.hidden_layer1 = torch.nn.Sequential()\n",
171 |     "        self.hidden_layer2 = torch.nn.Sequential()\n",
172 |     "        self.hidden_layer = torch.nn.Sequential()\n",
173 |     "        for i in range(len(num_filters)):\n",
174 |     "            # Deconvolutional layer\n",
175 |     "            if i == 0:\n",
176 |     "                # For input\n",
177 |     "                input_deconv = torch.nn.ConvTranspose2d(input_dim, int(num_filters[i]/2), kernel_size=4, stride=1, padding=0)\n",
178 |     "                self.hidden_layer1.add_module('input_deconv', input_deconv)\n",
179 |     "\n",
180 |     "                # Initializer\n",
181 |     "                torch.nn.init.normal_(input_deconv.weight, mean=0.0, std=0.02)\n",
182 |     "                torch.nn.init.constant_(input_deconv.bias, 0.0)\n",
183 |     "\n",
184 |     "                # Batch normalization\n",
185 |     "                self.hidden_layer1.add_module('input_bn', torch.nn.BatchNorm2d(int(num_filters[i]/2)))\n",
186 |     "\n",
187 |     "                # Activation\n",
188 |     "                self.hidden_layer1.add_module('input_act', torch.nn.ReLU())\n",
189 |     "\n",
190 |     "                # For label\n",
191 |     "                label_deconv = torch.nn.ConvTranspose2d(label_dim, int(num_filters[i]/2), kernel_size=4, stride=1, padding=0)\n",
192 |     "                self.hidden_layer2.add_module('label_deconv', label_deconv)\n",
193 |     "\n",
194 |     "                # Initializer\n",
195 |     "                torch.nn.init.normal_(label_deconv.weight, mean=0.0, std=0.02)\n",
196 |     "                torch.nn.init.constant_(label_deconv.bias, 0.0)\n",
197 |     "\n",
198 |     "                # Batch normalization\n",
199 |     "                self.hidden_layer2.add_module('label_bn', torch.nn.BatchNorm2d(int(num_filters[i]/2)))\n",
200 |     "\n",
201 |     "                # Activation\n",
202 |     "                self.hidden_layer2.add_module('label_act', torch.nn.ReLU())\n",
203 |     "            else:\n",
204 |     "                deconv = torch.nn.ConvTranspose2d(num_filters[i-1], num_filters[i], kernel_size=4, stride=2, padding=1)\n",
205 |     "\n",
206 |     "                deconv_name = 'deconv' + str(i + 1)\n",
207 |     "                self.hidden_layer.add_module(deconv_name, deconv)\n",
208 |     "\n",
209 |     "                # Initializer\n",
210 |     "                torch.nn.init.normal_(deconv.weight, mean=0.0, std=0.02)\n",
211 |     "                torch.nn.init.constant_(deconv.bias, 0.0)\n",
212 |     "\n",
213 |     "                # Batch normalization\n",
214 |     "                bn_name = 'bn' + str(i + 1)\n",
215 |     "                self.hidden_layer.add_module(bn_name, torch.nn.BatchNorm2d(num_filters[i]))\n",
216 |     "\n",
217 |     "                # Activation\n",
218 |     "                act_name = 'act' + str(i + 1)\n",
219 |     "                self.hidden_layer.add_module(act_name, torch.nn.ReLU())\n",
220 |     "\n",
221 |     "        # Output layer\n",
222 |     "        self.output_layer = torch.nn.Sequential()\n",
223 |     "        # Deconvolutional layer\n",
224 |     "        out = torch.nn.ConvTranspose2d(num_filters[i], output_dim, kernel_size=4, stride=2, padding=1)\n",
225 |     "        self.output_layer.add_module('out', out)\n",
226 |     "        # Initializer\n",
227 |     "        torch.nn.init.normal_(out.weight, mean=0.0, std=0.02)\n",
228 |     "        torch.nn.init.constant_(out.bias, 0.0)\n",
229 |     "        # Activation\n",
230 |     "        self.output_layer.add_module('act', torch.nn.Tanh())\n",
231 |     "\n",
232 |     "    def forward(self, z, c):\n",
233 |     "        h1 = self.hidden_layer1(z)\n",
234 |     "        h2 = self.hidden_layer2(c)\n",
235 |     "        x = torch.cat([h1, h2], 1)\n",
236 |     "        h = self.hidden_layer(x)\n",
237 |     "        out = self.output_layer(h)\n",
238 |     "        return out"
239 |    ]
240 |   },
241 |   {
242 |    "cell_type": "code",
243 |    "execution_count": 7,
244 |    "metadata": {},
245 |    "outputs": [],
246 |    "source": [
247 |     "# Discriminator model\n",
248 |     "class Discriminator(torch.nn.Module):\n",
249 |     "    def __init__(self, input_dim, label_dim, num_filters, output_dim):\n",
250 |     "        super(Discriminator, self).__init__()\n",
251 |     "\n",
252 |     "        self.hidden_layer1 = torch.nn.Sequential()\n",
253 |     "        self.hidden_layer2 = torch.nn.Sequential()\n",
254 |     "        self.hidden_layer = torch.nn.Sequential()\n",
255 |     "        for i in range(len(num_filters)):\n",
256 |     "            # Convolutional layer\n",
257 |     "            if i == 0:\n",
258 |     "                # For input\n",
259 |     "                input_conv = torch.nn.Conv2d(input_dim, int(num_filters[i]/2), kernel_size=4, stride=2, padding=1)\n",
260 |     "                self.hidden_layer1.add_module('input_conv', input_conv)\n",
261 | "\n", 262 | " # Initializer\n", 263 | " torch.nn.init.normal_(input_conv.weight, mean=0.0, std=0.02)\n", 264 | " torch.nn.init.constant_(input_conv.bias, 0.0)\n", 265 | "\n", 266 | " # Activation\n", 267 | " self.hidden_layer1.add_module('input_act', torch.nn.LeakyReLU(0.2))\n", 268 | "\n", 269 | " # For label\n", 270 | " label_conv = torch.nn.Conv2d(label_dim, int(num_filters[i]/2), kernel_size=4, stride=2, padding=1)\n", 271 | " self.hidden_layer2.add_module('label_conv', label_conv)\n", 272 | "\n", 273 | " # Initializer\n", 274 | " torch.nn.init.normal_(label_conv.weight, mean=0.0, std=0.02)\n", 275 | " torch.nn.init.constant_(label_conv.bias, 0.0)\n", 276 | "\n", 277 | " # Activation\n", 278 | " self.hidden_layer2.add_module('label_act', torch.nn.LeakyReLU(0.2))\n", 279 | " else:\n", 280 | " conv = torch.nn.Conv2d(num_filters[i-1], num_filters[i], kernel_size=4, stride=2, padding=1)\n", 281 | "\n", 282 | " conv_name = 'conv' + str(i + 1)\n", 283 | " self.hidden_layer.add_module(conv_name, conv)\n", 284 | "\n", 285 | " # Initializer\n", 286 | " torch.nn.init.normal_(conv.weight, mean=0.0, std=0.02)\n", 287 | " torch.nn.init.constant_(conv.bias, 0.0)\n", 288 | "\n", 289 | " # Batch normalization\n", 290 | " bn_name = 'bn' + str(i + 1)\n", 291 | " self.hidden_layer.add_module(bn_name, torch.nn.BatchNorm2d(num_filters[i]))\n", 292 | "\n", 293 | " # Activation\n", 294 | " act_name = 'act' + str(i + 1)\n", 295 | " self.hidden_layer.add_module(act_name, torch.nn.LeakyReLU(0.2))\n", 296 | "\n", 297 | " # Output layer\n", 298 | " self.output_layer = torch.nn.Sequential()\n", 299 | " # Convolutional layer\n", 300 | " out = torch.nn.Conv2d(num_filters[i], output_dim, kernel_size=4, stride=1, padding=0)\n", 301 | " self.output_layer.add_module('out', out)\n", 302 | " # Initializer\n", 303 | " torch.nn.init.normal_(out.weight, mean=0.0, std=0.02)\n", 304 | " torch.nn.init.constant_(out.bias, 0.0)\n", 305 | " # Activation\n", 306 | " self.output_layer.add_module('act', torch.nn.Sigmoid())\n", 307 | "\n", 308 | " def forward(self, z, c):\n", 309 | " h1 = self.hidden_layer1(z)\n", 310 | " h2 = self.hidden_layer2(c)\n", 311 | " x = torch.cat([h1, h2], 1)\n", 312 | " h = self.hidden_layer(x)\n", 313 | " out = self.output_layer(h)\n", 314 | " return out" 315 | ] 316 | }, 317 | { 318 | "cell_type": "markdown", 319 | "metadata": {}, 320 | "source": [ 321 | "## Plotting Functions" 322 | ] 323 | }, 324 | { 325 | "cell_type": "code", 326 | "execution_count": 8, 327 | "metadata": {}, 328 | "outputs": [], 329 | "source": [ 330 | "# Plot losses\n", 331 | "def plot_loss(d_losses, g_losses, num_epoch, save=False, save_dir='generated_images/', show=False):\n", 332 | " fig, ax = plt.subplots()\n", 333 | " ax.set_xlim(0, num_epochs)\n", 334 | " ax.set_ylim(0, max(np.max(g_losses), np.max(d_losses))*1.1)\n", 335 | " plt.xlabel('Epoch {0}'.format(num_epoch + 1))\n", 336 | " plt.ylabel('Loss values')\n", 337 | " plt.plot(d_losses, label='Discriminator')\n", 338 | " plt.plot(g_losses, label='Generator')\n", 339 | " plt.legend()\n", 340 | "\n", 341 | " # save figure\n", 342 | " if save:\n", 343 | " if not os.path.exists(save_dir):\n", 344 | " os.mkdir(save_dir)\n", 345 | " save_fn = save_dir + 'cDCGAN_losses_epoch_{:d}'.format(num_epoch + 1) + '.png'\n", 346 | " plt.savefig(save_fn)\n", 347 | "\n", 348 | " if show:\n", 349 | " plt.show()\n", 350 | " else:\n", 351 | " plt.close()" 352 | ] 353 | }, 354 | { 355 | "cell_type": "code", 356 | "execution_count": 9, 357 | "metadata": {}, 358 | "outputs": [], 359 | "source": 
[ 360 | "def plot_result(generator, noise, label, num_epoch, save=False, save_dir='generated_images/', show=False, fig_size=(100, 100)):\n", 361 | " generator.eval()\n", 362 | "\n", 363 | " noise = Variable(noise.cuda())\n", 364 | " label = Variable(label.cuda())\n", 365 | " gen_image = generator(noise, label)\n", 366 | " gen_image = denorm(gen_image)\n", 367 | "\n", 368 | " generator.train()\n", 369 | "\n", 370 | " n_rows = np.sqrt(noise.size()[0]).astype(np.int32)\n", 371 | " n_cols = np.sqrt(noise.size()[0]).astype(np.int32)\n", 372 | " fig, axes = plt.subplots(n_rows, n_cols, figsize=fig_size)\n", 373 | " for ax, img in zip(axes.flatten(), gen_image):\n", 374 | " ax.axis('off')\n", 375 | " ax.set_adjustable('box-forced')\n", 376 | " # Scale to 0-255\n", 377 | " img = (((img - img.min()) * 255) / (img.max() - img.min())).cpu().data.numpy().transpose(1, 2, 0).astype(\n", 378 | " np.uint8)\n", 379 | " ax.imshow(img, cmap=None, aspect='equal')\n", 380 | " plt.subplots_adjust(wspace=0, hspace=0)\n", 381 | " title = 'Epoch {0}'.format(num_epoch + 1)\n", 382 | " fig.text(0.5, 0.04, title, ha='center')\n", 383 | "\n", 384 | " # save figure\n", 385 | " if save:\n", 386 | " if not os.path.exists(save_dir):\n", 387 | " os.mkdir(save_dir)\n", 388 | " save_fn = save_dir + 'cDCGAN_epoch_{:d}'.format(num_epoch+1) + '.png'\n", 389 | " plt.savefig(save_fn)\n", 390 | "\n", 391 | " if show:\n", 392 | " plt.show()\n", 393 | " else:\n", 394 | " plt.close()" 395 | ] 396 | }, 397 | { 398 | "cell_type": "markdown", 399 | "metadata": {}, 400 | "source": [ 401 | "## Run Model\n", 402 | "\n", 403 | "You can view generated images and loss plots in the 'generated_images' folder." 404 | ] 405 | }, 406 | { 407 | "cell_type": "code", 408 | "execution_count": 10, 409 | "metadata": {}, 410 | "outputs": [ 411 | { 412 | "data": { 413 | "text/plain": [ 414 | "Discriminator(\n", 415 | " (hidden_layer1): Sequential(\n", 416 | " (input_conv): Conv2d(3, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))\n", 417 | " (input_act): LeakyReLU(negative_slope=0.2)\n", 418 | " )\n", 419 | " (hidden_layer2): Sequential(\n", 420 | " (label_conv): Conv2d(25, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))\n", 421 | " (label_act): LeakyReLU(negative_slope=0.2)\n", 422 | " )\n", 423 | " (hidden_layer): Sequential(\n", 424 | " (conv2): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))\n", 425 | " (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", 426 | " (act2): LeakyReLU(negative_slope=0.2)\n", 427 | " (conv3): Conv2d(256, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))\n", 428 | " (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", 429 | " (act3): LeakyReLU(negative_slope=0.2)\n", 430 | " (conv4): Conv2d(512, 1024, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))\n", 431 | " (bn4): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)\n", 432 | " (act4): LeakyReLU(negative_slope=0.2)\n", 433 | " )\n", 434 | " (output_layer): Sequential(\n", 435 | " (out): Conv2d(1024, 1, kernel_size=(4, 4), stride=(1, 1))\n", 436 | " (act): Sigmoid()\n", 437 | " )\n", 438 | ")" 439 | ] 440 | }, 441 | "execution_count": 10, 442 | "metadata": {}, 443 | "output_type": "execute_result" 444 | } 445 | ], 446 | "source": [ 447 | "G = Generator(G_input_dim, label_dim, num_filters, G_output_dim)\n", 448 | "D = Discriminator(D_input_dim, label_dim, num_filters[::-1], D_output_dim)\n", 449 | "G.cuda()\n", 450 | 
"D.cuda()" 451 | ] 452 | }, 453 | { 454 | "cell_type": "code", 455 | "execution_count": 11, 456 | "metadata": {}, 457 | "outputs": [], 458 | "source": [ 459 | "# Loss function\n", 460 | "criterion = torch.nn.BCELoss()\n", 461 | "\n", 462 | "# Optimizers\n", 463 | "G_optimizer = torch.optim.Adam(G.parameters(), lr=learning_rate, betas=betas)\n", 464 | "D_optimizer = torch.optim.Adam(D.parameters(), lr=learning_rate, betas=betas)" 465 | ] 466 | }, 467 | { 468 | "cell_type": "code", 469 | "execution_count": 12, 470 | "metadata": {}, 471 | "outputs": [], 472 | "source": [ 473 | "# Label preprocess\n", 474 | "onehot = torch.zeros(label_dim, label_dim)\n", 475 | "onehot = onehot.scatter_(1, torch.LongTensor(list(range(label_dim))).view(label_dim, 1), 1).view(label_dim, label_dim, 1, 1)\n", 476 | "fill = torch.zeros([label_dim, label_dim, image_size, image_size])\n", 477 | "for i in range(label_dim):\n", 478 | " fill[i, i, :, :] = 1" 479 | ] 480 | }, 481 | { 482 | "cell_type": "code", 483 | "execution_count": 13, 484 | "metadata": {}, 485 | "outputs": [], 486 | "source": [ 487 | "# Fixed noise & label for test\n", 488 | "temp_noise = torch.randn(label_dim, G_input_dim)\n", 489 | "fixed_noise = temp_noise\n", 490 | "fixed_c = torch.zeros(label_dim, 1)\n", 491 | "for i in range(label_dim - 1):\n", 492 | " fixed_noise = torch.cat([fixed_noise, temp_noise], 0)\n", 493 | " temp = torch.ones(label_dim, 1) + i\n", 494 | " fixed_c = torch.cat([fixed_c, temp], 0)\n", 495 | "\n", 496 | "fixed_noise = fixed_noise.view(-1, G_input_dim, 1, 1)\n", 497 | "fixed_label = torch.zeros(G_input_dim, label_dim)\n", 498 | "fixed_label.scatter_(1, fixed_c.type(torch.LongTensor), 1)\n", 499 | "fixed_label = fixed_label.view(-1, label_dim, 1, 1)" 500 | ] 501 | }, 502 | { 503 | "cell_type": "code", 504 | "execution_count": null, 505 | "metadata": { 506 | "scrolled": true 507 | }, 508 | "outputs": [], 509 | "source": [ 510 | "# Training GAN\n", 511 | "D_avg_losses = []\n", 512 | "G_avg_losses = []\n", 513 | "\n", 514 | "step = 0\n", 515 | "for epoch in range(num_epochs):\n", 516 | " D_losses = []\n", 517 | " G_losses = []\n", 518 | "\n", 519 | " if epoch == 5 or epoch == 10:\n", 520 | " G_optimizer.param_groups[0]['lr'] /= label_dim\n", 521 | " D_optimizer.param_groups[0]['lr'] /= label_dim\n", 522 | "\n", 523 | " # minibatch training\n", 524 | " for i, (images, labels) in enumerate(data_loader):\n", 525 | "\n", 526 | " # image data\n", 527 | " mini_batch = images.size()[0]\n", 528 | " x_ = Variable(images.cuda())\n", 529 | "\n", 530 | " # labels\n", 531 | " y_real_ = Variable(torch.ones(mini_batch).cuda())\n", 532 | " y_fake_ = Variable(torch.zeros(mini_batch).cuda())\n", 533 | " c_fill_ = Variable(fill[labels].cuda())\n", 534 | "\n", 535 | " # Train discriminator with real data\n", 536 | " D_real_decision = D(x_, c_fill_).squeeze()\n", 537 | " D_real_loss = criterion(D_real_decision, y_real_)\n", 538 | "\n", 539 | " # Train discriminator with fake data\n", 540 | " z_ = torch.randn(mini_batch, G_input_dim).view(-1, G_input_dim, 1, 1)\n", 541 | " z_ = Variable(z_.cuda())\n", 542 | "\n", 543 | " c_ = (torch.rand(mini_batch, 1) * label_dim).type(torch.LongTensor).squeeze()\n", 544 | " c_onehot_ = Variable(onehot[c_].cuda())\n", 545 | " gen_image = G(z_, c_onehot_)\n", 546 | "\n", 547 | " c_fill_ = Variable(fill[c_].cuda())\n", 548 | " D_fake_decision = D(gen_image, c_fill_).squeeze()\n", 549 | " D_fake_loss = criterion(D_fake_decision, y_fake_)\n", 550 | "\n", 551 | " # Back propagation\n", 552 | " D_loss = D_real_loss + 
D_fake_loss\n", 553 | " D.zero_grad()\n", 554 | " D_loss.backward()\n", 555 | " D_optimizer.step()\n", 556 | "\n", 557 | " # Train generator\n", 558 | " z_ = torch.randn(mini_batch, G_input_dim).view(-1, G_input_dim, 1, 1)\n", 559 | " z_ = Variable(z_.cuda())\n", 560 | "\n", 561 | " c_ = (torch.rand(mini_batch, 1) * label_dim).type(torch.LongTensor).squeeze()\n", 562 | " c_onehot_ = Variable(onehot[c_].cuda())\n", 563 | " gen_image = G(z_, c_onehot_)\n", 564 | "\n", 565 | " c_fill_ = Variable(fill[c_].cuda())\n", 566 | " D_fake_decision = D(gen_image, c_fill_).squeeze()\n", 567 | " G_loss = criterion(D_fake_decision, y_real_)\n", 568 | "\n", 569 | " # Back propagation\n", 570 | " G.zero_grad()\n", 571 | " G_loss.backward()\n", 572 | " G_optimizer.step()\n", 573 | "\n", 574 | " # Loss values\n", 575 | " D_losses.append(torch.Tensor.item(D_loss.data))\n", 576 | " G_losses.append(torch.Tensor.item(G_loss.data))\n", 577 | "\n", 578 | " print('Epoch [%d/%d], Step [%d/%d], D_loss: %.4f, G_loss: %.4f'\n", 579 | " % (epoch+1, num_epochs, i+1, len(data_loader), torch.Tensor.item(D_loss.data), torch.Tensor.item(G_loss.data)))\n", 580 | " step += 1\n", 581 | "\n", 582 | " D_avg_loss = torch.mean(torch.FloatTensor(D_losses))\n", 583 | " G_avg_loss = torch.mean(torch.FloatTensor(G_losses))\n", 584 | "\n", 585 | " # Avg loss values for plot\n", 586 | " D_avg_losses.append(D_avg_loss)\n", 587 | " G_avg_losses.append(G_avg_loss)\n", 588 | "\n", 589 | " plot_loss(D_avg_losses, G_avg_losses, epoch, save=True, save_dir=save_dir)\n", 590 | "\n", 591 | " # Show result for fixed noise\n", 592 | " plot_result(G, fixed_noise, fixed_label, epoch, save=True, save_dir=save_dir)\n", 593 | "\n", 594 | "# Make gif\n", 595 | "loss_plots = []\n", 596 | "gen_image_plots = []\n", 597 | "for epoch in range(num_epochs):\n", 598 | " # Plot for generating gif\n", 599 | " save_fn1 = save_dir + 'cDCGAN_losses_epoch_{:d}'.format(epoch + 1) + '.png'\n", 600 | " loss_plots.append(imageio.imread(save_fn1))\n", 601 | "\n", 602 | " save_fn2 = save_dir + 'cDCGAN_epoch_{:d}'.format(epoch + 1) + '.png'\n", 603 | " gen_image_plots.append(imageio.imread(save_fn2))\n", 604 | "\n", 605 | "imageio.mimsave(save_dir + 'cDCGAN_losses_epochs_{:d}'.format(num_epochs) + '.gif', loss_plots, fps=5)\n", 606 | "imageio.mimsave(save_dir + 'cDCGAN_epochs_{:d}'.format(num_epochs) + '.gif', gen_image_plots, fps=5)" 607 | ] 608 | }, 609 | { 610 | "cell_type": "markdown", 611 | "metadata": {}, 612 | "source": [ 613 | "## Resources\n", 614 | "\n", 615 | "- Much of the code in this notebook is modified from togheppi's implementation of [cDCGAN](https://github.com/togheppi/cDCGAN)." 616 | ] 617 | } 618 | ], 619 | "metadata": { 620 | "kernelspec": { 621 | "display_name": "Python 3", 622 | "language": "python", 623 | "name": "python3" 624 | }, 625 | "language_info": { 626 | "codemirror_mode": { 627 | "name": "ipython", 628 | "version": 3 629 | }, 630 | "file_extension": ".py", 631 | "mimetype": "text/x-python", 632 | "name": "python", 633 | "nbconvert_exporter": "python", 634 | "pygments_lexer": "ipython3", 635 | "version": "3.7.0" 636 | } 637 | }, 638 | "nbformat": 4, 639 | "nbformat_minor": 2 640 | } 641 | --------------------------------------------------------------------------------