├── .gitignore ├── .gitmodules ├── LICENSE ├── Lab 0 └── warm_up.py ├── Lab 1 ├── DLP_LAB1_309552007_袁鈺勛.pdf ├── Lab1-Backpropagation.pdf ├── README.md └── backpropagation.py ├── Lab 2 ├── 2048_sample.cpp ├── DLP_LAB2_309552007_袁鈺勛.pdf ├── Lab2-TD.pdf ├── README.md └── makefile ├── Lab 3 ├── DLP_LAB3_309552007_袁鈺勛.pdf ├── Lab3.pdf ├── README.md ├── S4b_test.npz ├── S4b_train.npz ├── X11b_test.npz ├── X11b_train.npz ├── dataloader.py └── eeg_classification.py ├── Lab 4 ├── DLP_LAB4_309552007_袁鈺勛.pdf ├── Lab4-Diabetic retinopathy detection.pdf ├── README.md ├── dataloader.py ├── retinopathy_detection.py ├── test_img.csv ├── test_label.csv ├── train_img.csv └── train_label.csv ├── Lab 5 ├── DLP_LAB5_309552007_袁鈺勛.pdf ├── Lab5_Conditional Sequence-to-Sequence VAE.pdf ├── README.md ├── data │ ├── readme.txt │ ├── test.txt │ └── train.txt └── tense_conversion.py ├── Lab 6 ├── Lab6-DQN-DDPG.pdf ├── README.md ├── ddpg.py ├── dqn.py └── report.pdf ├── Lab 7 ├── Lab7-Lets play GANs with Flows and friends.pdf ├── README.md ├── argument_parser.py ├── dcgan.py ├── evaluator.py ├── glow.py ├── main.py ├── report.pdf ├── sagan.py ├── task_1_dataset.py ├── task_2_dataset.py ├── test.py ├── train.py ├── util.py └── visualizer.py └── README.md /.gitignore: -------------------------------------------------------------------------------- 1 | .idea 2 | *.exe 3 | *.bin 4 | __pycache__ 5 | Lab 4/data 6 | results 7 | model 8 | Lab 6/log 9 | Lab 6/*.pth 10 | Lab 7/data 11 | Lab 7/figure 12 | Lab 7/test_figure 13 | Lab 7/model -------------------------------------------------------------------------------- /.gitmodules: -------------------------------------------------------------------------------- 1 | [submodule "Project"] 2 | path = Project 3 | url = git@github.com:steven112163/DLP-project.git 4 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 Steven Yuan 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /Lab 0/warm_up.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch.utils.data import TensorDataset, DataLoader 3 | 4 | 5 | class TwoLayerNet(torch.nn.Module): 6 | def __init__(self, D_in, H, D_out): 7 | super(TwoLayerNet, self).__init__() 8 | self.linear_1 = torch.nn.Linear(D_in, H) 9 | self.linear_2 = torch.nn.Linear(H, D_out) 10 | 11 | def forward(self, data): 12 | h = self.linear_1(data) 13 | h_relu = torch.nn.functional.relu(h) 14 | y_pred = self.linear_2(h_relu) 15 | return y_pred 16 | 17 | 18 | if __name__ == '__main__': 19 | device = torch.device('cpu') 20 | learning_rate = 1e-2 21 | 22 | x = torch.randn(64, 1000, device=device) 23 | y = torch.randn(64, 10, device=device) 24 | loader = DataLoader(TensorDataset(x, y), batch_size=8) 25 | 26 | model = TwoLayerNet(D_in=1000, H=100, D_out=10) 27 | model = model.to(device) 28 | 29 | optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate) 30 | 31 | for epoch in range(50): 32 | for x_batch, y_batch in loader: 33 | y_prediction = model(x_batch) 34 | loss = torch.nn.functional.mse_loss(y_prediction, y_batch) 35 | 36 | print(loss.item()) 37 | 38 | loss.backward() 39 | 40 | optimizer.step() 41 | optimizer.zero_grad() 42 | -------------------------------------------------------------------------------- /Lab 1/DLP_LAB1_309552007_袁鈺勛.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 1/DLP_LAB1_309552007_袁鈺勛.pdf -------------------------------------------------------------------------------- /Lab 1/Lab1-Backpropagation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 1/Lab1-Backpropagation.pdf -------------------------------------------------------------------------------- /Lab 1/README.md: -------------------------------------------------------------------------------- 1 | # Deep-Learning-and-Practice Lab 1 2 | 🚀 Backpropagation 3 | 🏹 The goal of this lab is to implement a neural network from scratch, without any deep learning framework (e.g., PyTorch).
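For example, a typical invocation (illustrative only; the full argument list is in the table below) is `python backpropagation.py -d 1 -a sigmoid -o adam`, which trains on the XOR points with the Adam optimizer.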
4 | 5 | 6 | 7 | ## Arguments 8 | |Argument|Description|Default| 9 | |---|---|---| 10 | |`'-d', '--data_type'`|0: linear data points, 1: XOR data points|0| 11 | |`'-n', '--number_of_data'`|Number of data points|100| 12 | |`'-e', '--epoch'`|Number of epochs|1000000| 13 | |`'-l', '--learning-rate'`|Learning rate of the neural network|0.1| 14 | |`'-u', '--units'`|Number of units in each hidden layer|4| 15 | |`'-a', '--activation'`|Type of activation function|'sigmoid'| 16 | |`'-o', '--optimizer'`|Type of optimizer|'gd'| -------------------------------------------------------------------------------- /Lab 1/backpropagation.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import matplotlib.pyplot as plt 3 | from typing import Tuple 4 | from argparse import ArgumentParser, Namespace, ArgumentTypeError 5 | 6 | 7 | def generate_linear(n: int = 100) -> Tuple[np.ndarray, np.ndarray]: 8 | """ 9 | Generate data points which are linearly separable 10 | :param n: number of points 11 | :return: inputs and labels 12 | """ 13 | pts = np.random.uniform(0, 1, (n, 2)) 14 | inputs, labels = [], [] 15 | 16 | for pt in pts: 17 | inputs.append([pt[0], pt[1]]) 18 | if pt[0] > pt[1]: 19 | labels.append(0) 20 | else: 21 | labels.append(1) 22 | 23 | return np.array(inputs), np.array(labels).reshape(n, 1) 24 | 25 | 26 | def generate_xor_easy(n: int = 11) -> Tuple[np.ndarray, np.ndarray]: 27 | """ 28 | Generate data points in an XOR arrangement 29 | :param n: number of points on each diagonal (the overlapping center point is only added once) 30 | :return: inputs and labels 31 | """ 32 | inputs, labels = [], [] 33 | 34 | for i in range(n): 35 | inputs.append([0.1 * i, 0.1 * i]) 36 | labels.append(0) 37 | 38 | if 0.1 * i == 0.5: 39 | continue 40 | 41 | inputs.append([0.1 * i, 1 - 0.1 * i]) 42 | labels.append(1) 43 | 44 | return np.array(inputs), np.array(labels).reshape(-1, 1) 45 | 46 | 47 | class Layer: 48 | def __init__(self, input_links: int, output_links: int, activation: str = 'sigmoid', optimizer: str = 'gd', 49 | learning_rate: float = 0.1): 50 | self.weight = np.random.normal(0, 1, (input_links + 1, output_links)) 51 | self.momentum = np.zeros((input_links + 1, output_links)) 52 | self.sum_of_squares_of_gradients = np.zeros((input_links + 1, output_links)) 53 | self.moving_average_m = np.zeros((input_links + 1, output_links)) 54 | self.moving_average_v = np.zeros((input_links + 1, output_links)) 55 | self.update_times = 1 56 | self.forward_gradient = None 57 | self.backward_gradient = None 58 | self.output = None 59 | self.activation = activation 60 | self.optimizer = optimizer 61 | self.learning_rate = learning_rate 62 | 63 | def forward(self, inputs: np.ndarray) -> np.ndarray: 64 | """ 65 | Forward feed 66 | :param inputs: input data for this layer 67 | :return: outputs computed by this layer 68 | """ 69 | self.forward_gradient = np.append(inputs, np.ones((inputs.shape[0], 1)), axis=1) 70 | if self.activation == 'sigmoid': 71 | self.output = self.sigmoid(np.matmul(self.forward_gradient, self.weight)) 72 | elif self.activation == 'tanh': 73 | self.output = self.tanh(np.matmul(self.forward_gradient, self.weight)) 74 | elif self.activation == 'relu': 75 | self.output = self.relu(np.matmul(self.forward_gradient, self.weight)) 76 | elif self.activation == 'leaky_relu': 77 | self.output = self.leaky_relu(np.matmul(self.forward_gradient, self.weight)) 78 | else: 79 | # Without activation function 80 | self.output = np.matmul(self.forward_gradient, self.weight) 81 | 82 | return self.output 83 |
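# Illustrative shape note (added commentary, not part of the original code): with input_links=2, output_links=4 and a batch of N samples, forward() appends a bias column, so forward_gradient has shape (N, 3), weight has shape (3, 4), and the returned output has shape (N, 4). 84 | def backward(self,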
derivative_loss: np.ndarray) -> np.ndarray: 85 | """ 86 | Backward propagation 87 | :param derivative_loss: loss from next layer 88 | :return: loss of this layer 89 | """ 90 | if self.activation == 'sigmoid': 91 | self.backward_gradient = np.multiply(self.derivative_sigmoid(self.output), derivative_loss) 92 | elif self.activation == 'tanh': 93 | self.backward_gradient = np.multiply(self.derivative_tanh(self.output), derivative_loss) 94 | elif self.activation == 'relu': 95 | self.backward_gradient = np.multiply(self.derivative_relu(self.output), derivative_loss) 96 | elif self.activation == 'leaky_relu': 97 | self.backward_gradient = np.multiply(self.derivative_leaky_relu(self.output), derivative_loss) 98 | else: 99 | # Without activation function 100 | self.backward_gradient = derivative_loss 101 | 102 | return np.matmul(self.backward_gradient, self.weight[:-1].T) 103 | 104 | def update(self) -> None: 105 | """ 106 | Update weights 107 | :return: None 108 | """ 109 | gradient = np.matmul(self.forward_gradient.T, self.backward_gradient) 110 | if self.optimizer == 'gd': 111 | delta_weight = -self.learning_rate * gradient 112 | elif self.optimizer == 'momentum': 113 | self.momentum = 0.9 * self.momentum - self.learning_rate * gradient 114 | delta_weight = self.momentum 115 | elif self.optimizer == 'adagrad': 116 | self.sum_of_squares_of_gradients += np.square(gradient) 117 | delta_weight = -self.learning_rate * gradient / np.sqrt(self.sum_of_squares_of_gradients + 1e-8) 118 | else: 119 | # adam 120 | self.moving_average_m = 0.9 * self.moving_average_m + 0.1 * gradient 121 | self.moving_average_v = 0.999 * self.moving_average_v + 0.001 * np.square(gradient) 122 | bias_correction_m = self.moving_average_m / (1.0 - 0.9 ** self.update_times) 123 | bias_correction_v = self.moving_average_v / (1.0 - 0.999 ** self.update_times) 124 | self.update_times += 1 125 | delta_weight = -self.learning_rate * bias_correction_m / (np.sqrt(bias_correction_v) + 1e-8) 126 | 127 | self.weight += delta_weight 128 | 129 | @staticmethod 130 | def sigmoid(x: np.ndarray) -> np.ndarray: 131 | """ 132 | Calculate sigmoid function 133 | y = 1 / (1 + e^(-x)) 134 | :param x: input data 135 | :return: sigmoid results 136 | """ 137 | return 1.0 / (1.0 + np.exp(-x)) 138 | 139 | @staticmethod 140 | def derivative_sigmoid(y: np.ndarray) -> np.ndarray: 141 | """ 142 | Calculate the derivative of sigmoid function 143 | y' = y(1 - y) 144 | :param y: value of the sigmoid function 145 | :return: derivative sigmoid result 146 | """ 147 | return np.multiply(y, 1.0 - y) 148 | 149 | @staticmethod 150 | def tanh(x: np.ndarray) -> np.ndarray: 151 | """ 152 | Calculate tanh function 153 | y = tanh(x) 154 | :param x: input data 155 | :return: tanh results 156 | """ 157 | return np.tanh(x) 158 | 159 | @staticmethod 160 | def derivative_tanh(y: np.ndarray) -> np.ndarray: 161 | """ 162 | Calculate the derivative of tanh function 163 | y' = 1 - y^2 164 | :param y: value of the tanh function 165 | :return: derivative tanh result 166 | """ 167 | return 1.0 - y ** 2 168 | 169 | @staticmethod 170 | def relu(x: np.ndarray) -> np.ndarray: 171 | """ 172 | Calculate relu function 173 | y = max(0, x) 174 | :param x: input data 175 | :return: relu results 176 | """ 177 | return np.maximum(0.0, x) 178 | 179 | @staticmethod 180 | def derivative_relu(y: np.ndarray) -> np.ndarray: 181 | """ 182 | Calculate the derivative of relu function 183 | y' = 1 if y > 0 184 | y' = 0 if y <= 0 185 | :param y: value of the relu function 186 | :return: derivative relu 
result 187 | """ 188 | return np.heaviside(y, 0.0) 189 | 190 | @staticmethod 191 | def leaky_relu(x: np.ndarray) -> np.ndarray: 192 | """ 193 | Calculate leaky relu function 194 | y = max(0, x) + 0.01 * min(0, x) 195 | :param x: input data 196 | :return: leaky relu results 197 | """ 198 | return np.maximum(0.0, x) + 0.01 * np.minimum(0.0, x) 199 | 200 | @staticmethod 201 | def derivative_leaky_relu(y: np.ndarray) -> np.ndarray: 202 | """ 203 | Calculate the derivative of leaky relu function 204 | y' = 1 if y > 0 205 | y' = 0.01 if y <= 0 206 | :param y: value of the leaky relu function 207 | :return: derivative leaky relu result 208 | """ 209 | # np.where avoids mutating y (the layer's cached output) in place 210 | return np.where(y > 0.0, 1.0, 0.01) 211 | 212 | 213 | 214 | class NeuralNetwork: 215 | def __init__(self, epoch: int = 1000000, learning_rate: float = 0.1, num_of_layers: int = 2, input_units: int = 2, 216 | hidden_units: int = 4, activation: str = 'sigmoid', optimizer: str = 'gd'): 217 | self.num_of_epoch = epoch 218 | self.learning_rate = learning_rate 219 | self.hidden_units = hidden_units 220 | self.activation = activation 221 | self.optimizer = optimizer 222 | self.learning_epoch, self.learning_loss = list(), list() 223 | 224 | # Setup layers 225 | # Input layer 226 | self.layers = [Layer(input_units, hidden_units, activation, optimizer, learning_rate)] 227 | 228 | # Hidden layers 229 | for _ in range(num_of_layers - 1): 230 | self.layers.append(Layer(hidden_units, hidden_units, activation, optimizer, learning_rate)) 231 | 232 | # Output layer 233 | self.layers.append(Layer(hidden_units, 1, 'sigmoid', optimizer, learning_rate)) 234 | 235 | def forward(self, inputs: np.ndarray) -> np.ndarray: 236 | """ 237 | Forward feed 238 | :param inputs: input data 239 | :return: predicted labels 240 | """ 241 | for layer in self.layers: 242 | inputs = layer.forward(inputs) 243 | return inputs 244 | 245 | def backward(self, derivative_loss) -> None: 246 | """ 247 | Backward propagation 248 | :param derivative_loss: loss from next layer 249 | :return: None 250 | """ 251 | for layer in self.layers[::-1]: 252 | derivative_loss = layer.backward(derivative_loss) 253 | 254 | def update(self) -> None: 255 | """ 256 | Update all weights in the neural network 257 | :return: None 258 | """ 259 | for layer in self.layers: 260 | layer.update() 261 | 262 | def train(self, inputs: np.ndarray, labels: np.ndarray) -> None: 263 | """ 264 | Train the neural network 265 | :param inputs: input data 266 | :param labels: input labels 267 | :return: None 268 | """ 269 | for epoch in range(self.num_of_epoch): 270 | prediction = self.forward(inputs) 271 | loss = self.mse_loss(prediction=prediction, ground_truth=labels) 272 | self.backward(self.mse_derivative_loss(prediction=prediction, ground_truth=labels)) 273 | self.update() 274 | 275 | if epoch % 100 == 0: 276 | print(f'Epoch {epoch} loss : {loss}') 277 | self.learning_epoch.append(epoch) 278 | self.learning_loss.append(loss) 279 | 280 | if loss < 0.001: 281 | break 282 | 283 | def predict(self, inputs: np.ndarray) -> np.ndarray: 284 | """ 285 | Predict the labels of inputs 286 | :param inputs: input data 287 | :return: predicted labels 288 | """ 289 | prediction = self.forward(inputs=inputs) 290 | print(prediction) 291 | return np.round(prediction) 292 | 293 | def show_result(self, inputs: np.ndarray, labels: np.ndarray) -> None: 294 | """ 295 | Show the ground truth and predicted results 296 | :param inputs: input data points 297 | :param labels: ground truth labels 298 | :return: None 299 | """ 300 | # Plot ground truth and
prediction 301 | plt.figure() 302 | plt.subplot(1, 2, 1) 303 | plt.title('Ground truth', fontsize=18) 304 | for idx, point in enumerate(inputs): 305 | plt.plot(point[0], point[1], 'ro' if labels[idx][0] == 0 else 'bo') 306 | 307 | pred_labels = self.predict(inputs) 308 | plt.subplot(1, 2, 2) 309 | plt.title('Predicted result', fontsize=18) 310 | for idx, point in enumerate(inputs): 311 | plt.plot(point[0], point[1], 'ro' if pred_labels[idx][0] == 0 else 'bo') 312 | print(f'Activation : {self.activation}') 313 | print(f'Hidden units : {self.hidden_units}') 314 | print(f'Optimizer : {self.optimizer}') 315 | print(f'Accuracy : {float(np.sum(pred_labels == labels)) / len(labels)}') 316 | 317 | # Plot learning curve 318 | plt.figure() 319 | plt.title('Learning curve', fontsize=18) 320 | plt.plot(self.learning_epoch, self.learning_loss) 321 | 322 | plt.show() 323 | 324 | @staticmethod 325 | def mse_loss(prediction: np.ndarray, ground_truth: np.ndarray) -> float: 326 | """ 327 | Mean squared error loss 328 | :param prediction: prediction from neural network 329 | :param ground_truth: ground truth 330 | :return: loss 331 | """ 332 | return np.mean((prediction - ground_truth) ** 2) 333 | 334 | @staticmethod 335 | def mse_derivative_loss(prediction: np.ndarray, ground_truth: np.ndarray) -> np.ndarray: 336 | """ 337 | Derivative of MSE loss 338 | :param prediction: prediction from neural network 339 | :param ground_truth: ground truth 340 | :return: derivative loss 341 | """ 342 | return 2 * (prediction - ground_truth) / len(ground_truth) 343 | 344 | 345 | def check_data_type(input_value: str) -> int: 346 | """ 347 | Check whether data type is 0 or 1 348 | :param input_value: input string value 349 | :return: integer value 350 | """ 351 | int_value = int(input_value) 352 | if int_value not in (0, 1): 353 | raise ArgumentTypeError(f'Data type ({input_value}) should be 0 or 1.') 354 | return int_value 355 | 356 | 357 | def check_activation_type(input_value: str) -> str: 358 | """ 359 | Check activation function type 360 | :param input_value: input function type 361 | :return: original function type if the type is valid 362 | """ 363 | if input_value not in ('none', 'sigmoid', 'tanh', 'relu', 'leaky_relu'): 364 | raise ArgumentTypeError( 365 | "Activation function type should be 'none', 'sigmoid', 'tanh', 'relu' or 'leaky_relu'.") 366 | 367 | return input_value 368 | 369 | 370 | def check_optimizer_type(input_value: str) -> str: 371 | """ 372 | Check optimizer 373 | :param input_value: input optimizer 374 | :return: original optimizer if it is valid 375 | """ 376 | if input_value not in ('gd', 'momentum', 'adagrad', 'adam'): 377 | raise ArgumentTypeError("Optimizer should be 'gd', 'momentum', 'adagrad' or 'adam'.") 378 | 379 | return input_value 380 | 381 | 382 | def parse_arguments() -> Namespace: 383 | """ 384 | Parse arguments 385 | :return: all arguments 386 | """ 387 | parser = ArgumentParser(description='Neural Network') 388 | parser.add_argument('-d', '--data_type', default=0, type=check_data_type, 389 | help='0: linear data points, 1: XOR data points') 390 | parser.add_argument('-n', '--number_of_data', default=100, type=int, help='Number of data points') 391 | parser.add_argument('-e', '--epoch', default=1000000, type=int, help='Number of epochs') 392 | parser.add_argument('-l', '--learning-rate', default=0.1, type=float, help='Learning rate of the neural
network') 393 | parser.add_argument('-u', '--units', default=4, type=int, help='Number of units in each hidden layer') 394 | parser.add_argument('-a', '--activation', default='sigmoid', type=check_activation_type, 395 | help='Type of activation function') 396 | parser.add_argument('-o', '--optimizer', default='gd', type=check_optimizer_type, help='Type of optimizer') 397 | 398 | return parser.parse_args() 399 | 400 | 401 | def main() -> None: 402 | """ 403 | Main function 404 | :return: None 405 | """ 406 | args = parse_arguments() 407 | data_type = args.data_type 408 | number_of_data = args.number_of_data 409 | epoch = args.epoch 410 | learning_rate = args.learning_rate 411 | hidden_units = args.units 412 | activation = args.activation 413 | optimizer = args.optimizer 414 | 415 | # Generate data points 416 | if not data_type: 417 | inputs, labels = generate_linear(number_of_data) 418 | else: 419 | inputs, labels = generate_xor_easy(number_of_data) 420 | 421 | neural_network = NeuralNetwork(epoch=epoch, 422 | learning_rate=learning_rate, 423 | hidden_units=hidden_units, 424 | activation=activation, 425 | optimizer=optimizer) 426 | neural_network.train(inputs=inputs, labels=labels) 427 | neural_network.show_result(inputs=inputs, labels=labels) 428 | 429 | 430 | if __name__ == '__main__': 431 | main() 432 | -------------------------------------------------------------------------------- /Lab 2/DLP_LAB2_309552007_袁鈺勛.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 2/DLP_LAB2_309552007_袁鈺勛.pdf -------------------------------------------------------------------------------- /Lab 2/Lab2-TD.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 2/Lab2-TD.pdf -------------------------------------------------------------------------------- /Lab 2/README.md: -------------------------------------------------------------------------------- 1 | # Deep-Learning-and-Practice Lab 2 2 | 🚀 Temporal Difference 3 | 🏹 The goal of this lab is to implement the before-state version of the temporal-difference (TD(0)) algorithm to solve 2048 with an n-tuple network.
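As a rough illustration of the update rule (a minimal tabular TD(0) sketch; the lab itself approximates the value function with an n-tuple network over 2048 boards, and the names below are illustrative, not taken from `2048_sample.cpp`):

```python
def td_update(v, state, reward, next_state, alpha=0.1):
    # One TD(0) step: move V(state) toward the one-step target
    # reward + V(next_state); alpha is the learning rate.
    v[state] += alpha * (reward + v[next_state] - v[state])
```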
4 | 5 | 6 | 7 | ## Compile & Clean 8 | |Command|Description| 9 | |---|---| 10 | |`make`|Compile cpp file and produce 2048 execution file| 11 | |`make clean`|Delete 2048 execution file| -------------------------------------------------------------------------------- /Lab 2/makefile: -------------------------------------------------------------------------------- 1 | all: 2 | g++ -std=c++11 -O3 -o 2048 2048_sample.cpp 3 | 4 | clean: 5 | rm 2048 -------------------------------------------------------------------------------- /Lab 3/DLP_LAB3_309552007_袁鈺勛.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 3/DLP_LAB3_309552007_袁鈺勛.pdf -------------------------------------------------------------------------------- /Lab 3/Lab3.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 3/Lab3.pdf -------------------------------------------------------------------------------- /Lab 3/README.md: -------------------------------------------------------------------------------- 1 | # Deep-Learning-and-Practice Lab 3 2 | 🚀 EEGNet & DeepConvNet 3 | 🏹 The goal of this lab is to implement EEGNet and DeepConvNet with Pytorch to classify EEG. 4 | 5 | 6 | 7 | ## Arguments 8 | |Argument|Description|Default| 9 | |---|---|---| 10 | |`'-m', '--model'`|EEGNet or DeepConvNet|'EEG'| 11 | |`'-e', '--epochs'`|Number of epochs|150| 12 | |`'-lr', '--learning_rate'`|Learning rate|1e-2| 13 | |`'-b', '--batch_size'`|Batch size|64| 14 | |`'-o', '--optimizer'`|Optimizer|'adam'| 15 | |`'-lf', '--loss_function'`|Loss function|'cross_entropy'| 16 | |`'-d', '--dropout'`|Dropout probability|0.25| 17 | |`'-l', '--linear'`|Extra linear layers in DeepConvNet (default is 1)|1| 18 | |`'-v', '--verbosity'`|Whether to show info log|0| -------------------------------------------------------------------------------- /Lab 3/S4b_test.npz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 3/S4b_test.npz -------------------------------------------------------------------------------- /Lab 3/S4b_train.npz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 3/S4b_train.npz -------------------------------------------------------------------------------- /Lab 3/X11b_test.npz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 3/X11b_test.npz -------------------------------------------------------------------------------- /Lab 3/X11b_train.npz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 3/X11b_train.npz -------------------------------------------------------------------------------- /Lab 3/dataloader.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | def read_bci_data(): 5 | S4b_train = 
np.load('S4b_train.npz') 6 | X11b_train = np.load('X11b_train.npz') 7 | S4b_test = np.load('S4b_test.npz') 8 | X11b_test = np.load('X11b_test.npz') 9 | 10 | train_data = np.concatenate((S4b_train['signal'], X11b_train['signal']), axis=0) 11 | train_label = np.concatenate((S4b_train['label'], X11b_train['label']), axis=0) 12 | test_data = np.concatenate((S4b_test['signal'], X11b_test['signal']), axis=0) 13 | test_label = np.concatenate((S4b_test['label'], X11b_test['label']), axis=0) 14 | 15 | train_label = train_label - 1 16 | test_label = test_label - 1 17 | train_data = np.transpose(np.expand_dims(train_data, axis=1), (0, 1, 3, 2)) 18 | test_data = np.transpose(np.expand_dims(test_data, axis=1), (0, 1, 3, 2)) 19 | 20 | mask = np.where(np.isnan(train_data)) 21 | train_data[mask] = np.nanmean(train_data) 22 | 23 | mask = np.where(np.isnan(test_data)) 24 | test_data[mask] = np.nanmean(test_data) 25 | 26 | print(train_data.shape, train_label.shape, test_data.shape, test_label.shape) 27 | 28 | return train_data, train_label, test_data, test_label 29 | -------------------------------------------------------------------------------- /Lab 3/eeg_classification.py: -------------------------------------------------------------------------------- 1 | from dataloader import read_bci_data 2 | from torch import Tensor, device, cuda, no_grad 3 | from torch import max as tensor_max 4 | from torch.utils.data import TensorDataset, DataLoader 5 | from argparse import ArgumentParser, ArgumentTypeError, Namespace 6 | from typing import Dict, List, Tuple 7 | from functools import reduce 8 | from collections import OrderedDict 9 | from tqdm import tqdm 10 | import sys 11 | import torch.nn as nn 12 | import torch.optim as op 13 | import matplotlib.pyplot as plt 14 | 15 | 16 | class EEGNet(nn.Module): 17 | def __init__(self, activation: nn.modules.activation, dropout: float): 18 | super().__init__() 19 | 20 | self.first_conv = nn.Sequential( 21 | nn.Conv2d( 22 | in_channels=1, 23 | out_channels=16, 24 | kernel_size=(1, 51), 25 | stride=(1, 1), 26 | padding=(0, 25), 27 | bias=False 28 | ), 29 | nn.BatchNorm2d(16) 30 | ) 31 | 32 | self.depth_wise_conv = nn.Sequential( 33 | nn.Conv2d( 34 | in_channels=16, 35 | out_channels=32, 36 | kernel_size=(2, 1), 37 | stride=(1, 1), 38 | groups=16, 39 | bias=False 40 | ), 41 | nn.BatchNorm2d(32), 42 | activation(), 43 | nn.AvgPool2d(kernel_size=(1, 4), stride=(1, 4), padding=0), 44 | nn.Dropout(p=dropout) 45 | ) 46 | 47 | self.separable_conv = nn.Sequential( 48 | nn.Conv2d( 49 | in_channels=32, 50 | out_channels=32, 51 | kernel_size=(1, 15), 52 | stride=(1, 1), 53 | padding=(0, 7), 54 | bias=False 55 | ), 56 | nn.BatchNorm2d(32), 57 | activation(), 58 | nn.AvgPool2d(kernel_size=(1, 8), stride=(1, 8), padding=0), 59 | nn.Dropout(p=dropout) 60 | ) 61 | 62 | self.classify = nn.Sequential( 63 | nn.Flatten(), 64 | nn.Linear(in_features=736, out_features=2, bias=True) 65 | ) 66 | 67 | def forward(self, inputs: TensorDataset) -> Tensor: 68 | """ 69 | Forward propagation 70 | :param inputs: input data 71 | :return: results 72 | """ 73 | first_conv_results = self.first_conv(inputs) 74 | depth_wise_conv_results = self.depth_wise_conv(first_conv_results) 75 | separable_conv_results = self.separable_conv(depth_wise_conv_results) 76 | return self.classify(separable_conv_results) 77 | 78 | 79 | class DeepConvNet(nn.Module): 80 | def __init__(self, activation: nn.modules.activation, dropout: float, num_of_linear: int, 81 | filters: Tuple[int] = (25, 50, 100, 200)): 82 | super().__init__() 83 
| 84 | self.filters = filters 85 | self.conv_0 = nn.Sequential( 86 | # an input = [1, 1, 2, 750] 87 | nn.Conv2d( 88 | in_channels=1, 89 | out_channels=filters[0], 90 | kernel_size=(1, 5), 91 | bias=False 92 | ), 93 | # an input = [1, 25, 2, 746] 94 | nn.Conv2d( 95 | in_channels=filters[0], 96 | out_channels=filters[0], 97 | kernel_size=(2, 1), 98 | bias=False 99 | ), 100 | # an input = [1, 25, 1, 746] 101 | nn.BatchNorm2d(filters[0]), 102 | activation(), 103 | nn.MaxPool2d(kernel_size=(1, 2)), 104 | # an input = [1, 25, 1, 373] 105 | nn.Dropout(p=dropout) 106 | ) 107 | 108 | for idx, num_of_filters in enumerate(filters[:-1], start=1): 109 | setattr(self, f'conv_{idx}', nn.Sequential( 110 | nn.Conv2d( 111 | in_channels=num_of_filters, 112 | out_channels=filters[idx], 113 | kernel_size=(1, 5), 114 | bias=False 115 | ), 116 | nn.BatchNorm2d(filters[idx]), 117 | activation(), 118 | nn.MaxPool2d(kernel_size=(1, 2)), 119 | nn.Dropout(p=dropout) 120 | )) 121 | 122 | # If num_of_linear == 1, then there are 2 linear layers 123 | self.flatten_size = filters[-1] * reduce(lambda x, _: round((x - 4) / 2), filters[:-1], 373) 124 | interval = round((100.0 - 2.0) / num_of_linear) 125 | next_layer = 100 126 | features = [self.flatten_size] 127 | while next_layer > 2: 128 | features.append(next_layer) 129 | next_layer -= interval 130 | features.append(2) 131 | 132 | layers = [('flatten', nn.Flatten())] 133 | for idx, in_features in enumerate(features[:-1]): 134 | layers.append((f'linear_{idx}', nn.Linear(in_features=in_features, 135 | out_features=features[idx + 1], 136 | bias=True))) 137 | if idx != len(features) - 2: 138 | layers.append((f'activation_{idx}', activation())) 139 | layers.append((f'dropout_{idx}', nn.Dropout(p=dropout))) 140 | self.classify = nn.Sequential(OrderedDict(layers)) 141 | 142 | def forward(self, inputs: TensorDataset) -> Tensor: 143 | """ 144 | Forward propagation 145 | :param inputs: input data 146 | :return: results 147 | """ 148 | partial_results = inputs 149 | for idx in range(len(self.filters)): 150 | partial_results = getattr(self, f'conv_{idx}')(partial_results) 151 | return self.classify(partial_results) 152 | 153 | 154 | def show_results(target_model: str, epochs: int, accuracy: Dict[str, dict], keys: List[str]) -> None: 155 | """ 156 | Show accuracy results 157 | :param target_model: target training model 158 | :param epochs: number of epochs 159 | :param accuracy: training and testing accuracy of different activation functions 160 | :param keys: names of neural network with different activation functions 161 | :return: None 162 | """ 163 | # Get the number of characters of the longest neural network name 164 | longest = len(max(keys, key=len)) + 6 165 | 166 | # Plot 167 | plt.figure(0) 168 | if target_model == 'EEG': 169 | plt.title('Activation Function Comparison (EEGNet)') 170 | else: 171 | plt.title('Activation Function Comparison (DeepConvNet)') 172 | plt.xlabel('Epoch') 173 | plt.ylabel('Accuracy (%)') 174 | 175 | for train_or_test, acc in accuracy.items(): 176 | for model in keys: 177 | plt.plot(range(epochs), acc[model], label=f'{model}_{train_or_test}') 178 | spaces = ''.join([' ' for _ in range(longest - len(f'{model}_{train_or_test}'))]) 179 | print(f'{model}_{train_or_test}: {spaces}{max(acc[model]):.2f} %') 180 | 181 | plt.legend(loc='lower right') 182 | plt.show() 183 | 184 | 185 | def train(target_model: str, epochs: int, learning_rate: float, batch_size: int, optimizer: op, 186 | loss_function: nn.modules.loss, dropout: float, num_of_linear: int, 
train_device: device, 187 | train_dataset: TensorDataset, test_dataset: TensorDataset, verbosity: int) -> None: 188 | """ 189 | Train the models 190 | :param target_model: target training model 191 | :param epochs: number of epochs 192 | :param learning_rate: learning rate 193 | :param batch_size: batch size 194 | :param optimizer: optimizer 195 | :param loss_function: loss function 196 | :param dropout: dropout probability 197 | :param num_of_linear: number of extra linear layers in DeepConvNet 198 | :param train_device: training device 199 | :param train_dataset: training dataset 200 | :param test_dataset: testing dataset 201 | :param verbosity: whether to show info log 202 | :return: None 203 | """ 204 | # Setup models for different activation functions 205 | info_log('Setup models', verbosity=verbosity) 206 | if target_model == 'EEG': 207 | models = { 208 | 'EEG_ELU': EEGNet(nn.ELU, dropout=dropout).to(train_device), 209 | 'EEG_ReLU': EEGNet(nn.ReLU, dropout=dropout).to(train_device), 210 | 'EEG_LeakyReLU': EEGNet(nn.LeakyReLU, dropout=dropout).to(train_device) 211 | } 212 | else: 213 | models = { 214 | 'Deep_ELU': DeepConvNet(nn.ELU, dropout=dropout, num_of_linear=num_of_linear).to(train_device), 215 | 'Deep_ReLU': DeepConvNet(nn.ReLU, dropout=dropout, num_of_linear=num_of_linear).to(train_device), 216 | 'Deep_LeakyReLU': DeepConvNet(nn.LeakyReLU, dropout=dropout, num_of_linear=num_of_linear).to(train_device) 217 | } 218 | 219 | # Setup accuracy structure 220 | info_log('Setup accuracy structure', verbosity=verbosity) 221 | keys = [f'{target_model}_ELU', f'{target_model}_ReLU', f'{target_model}_LeakyReLU'] 222 | accuracy = { 223 | 'train': {key: [0 for _ in range(epochs)] for key in keys}, 224 | 'test': {key: [0 for _ in range(epochs)] for key in keys} 225 | } 226 | 227 | # Start training 228 | info_log('Start training', verbosity=verbosity) 229 | train_loader = DataLoader(train_dataset, batch_size=batch_size) 230 | test_loader = DataLoader(test_dataset, len(test_dataset)) 231 | for key, model in models.items(): 232 | info_log(f'Training {key} ...', verbosity=verbosity) 233 | model_optimizer = optimizer(model.parameters(), lr=learning_rate) 234 | 235 | for epoch in tqdm(range(epochs)): 236 | # Train model 237 | model.train() 238 | for data, label in train_loader: 239 | inputs = data.to(train_device) 240 | labels = label.to(train_device).long() 241 | 242 | pred_labels = model.forward(inputs=inputs) 243 | 244 | model_optimizer.zero_grad() 245 | loss = loss_function(pred_labels, labels) 246 | loss.backward() 247 | model_optimizer.step() 248 | 249 | accuracy['train'][key][epoch] += (tensor_max(pred_labels, 1)[1] == labels).sum().item() 250 | accuracy['train'][key][epoch] = 100.0 * accuracy['train'][key][epoch] / len(train_dataset) 251 | 252 | # Test model 253 | model.eval() 254 | with no_grad(): 255 | for data, label in test_loader: 256 | inputs = data.to(train_device) 257 | labels = label.to(train_device).long() 258 | 259 | pred_labels = model.forward(inputs=inputs) 260 | 261 | accuracy['test'][key][epoch] += (tensor_max(pred_labels, 1)[1] == labels).sum().item() 262 | accuracy['test'][key][epoch] = 100.0 * accuracy['test'][key][epoch] / len(test_dataset) 263 | print() 264 | cuda.empty_cache() 265 | 266 | show_results(target_model=target_model, epochs=epochs, accuracy=accuracy, keys=keys) 267 | 268 | 269 | def info_log(log: str, verbosity: int) -> None: 270 | """ 271 | Print information log 272 | :param log: log to be displayed 273 | :param verbosity: whether to show info log 274 | 
:return: None 275 | """ 276 | if verbosity: 277 | print(f'[\033[96mINFO\033[00m] {log}') 278 | sys.stdout.flush() 279 | 280 | 281 | def check_model_type(input_value: str) -> str: 282 | """ 283 | Check whether the model is eeg or deep 284 | :param input_value: input string value 285 | :return: model type 286 | """ 287 | if input_value not in ('EEG', 'Deep'): 288 | raise ArgumentTypeError('Only "EEG" and "Deep" are supported.') 289 | 290 | return input_value 291 | 292 | 293 | def check_optimizer_type(input_value: str) -> op: 294 | """ 295 | Check whether the optimizer is supported 296 | :param input_value: input string value 297 | :return: optimizer 298 | """ 299 | if input_value == 'adam': 300 | return op.Adam 301 | elif input_value == 'adadelta': 302 | return op.Adadelta 303 | elif input_value == 'adagrad': 304 | return op.Adagrad 305 | elif input_value == 'adamw': 306 | return op.AdamW 307 | elif input_value == 'adamax': 308 | return op.Adamax 309 | 310 | raise ArgumentTypeError(f'Optimizer {input_value} is not supported.') 311 | 312 | 313 | def check_loss_type(input_value: str) -> nn.modules.loss: 314 | """ 315 | Check whether the loss function is supported 316 | :param input_value: input string value 317 | :return: loss function 318 | """ 319 | if input_value == 'cross_entropy': 320 | return nn.CrossEntropyLoss() 321 | 322 | raise ArgumentTypeError(f'Loss function {input_value} is not supported.') 323 | 324 | 325 | def check_linear_type(input_value: str) -> int: 326 | """ 327 | Check whether number of extra linear layers is greater than 0 328 | :param input_value: input string value 329 | :return: integer value 330 | """ 331 | int_value = int(input_value) 332 | if int_value < 1: 333 | raise ArgumentTypeError('Number of extra linear layers should be greater than 0.') 334 | return int_value 335 | 336 | 337 | def check_verbosity_type(input_value: str) -> int: 338 | """ 339 | Check whether verbosity is true or false 340 | :param input_value: input string value 341 | :return: integer value 342 | """ 343 | int_value = int(input_value) 344 | if int_value not in (0, 1): 345 | raise ArgumentTypeError('Verbosity should be 0 or 1.') 346 | return int_value 347 | 348 | 349 | def parse_arguments() -> Namespace: 350 | """ 351 | Parse arguments 352 | :return: arguments 353 | """ 354 | parser = ArgumentParser(description='EEGNet & DeepConvNet') 355 | parser.add_argument('-m', '--model', default='EEG', type=check_model_type, help='EEGNet or DeepConvNet') 356 | parser.add_argument('-e', '--epochs', default=150, type=int, help='Number of epochs') 357 | parser.add_argument('-lr', '--learning_rate', default=1e-2, type=float, help='Learning rate') 358 | parser.add_argument('-b', '--batch_size', default=64, type=int, help='Batch size') 359 | parser.add_argument('-o', '--optimizer', default='adam', type=check_optimizer_type, help='Optimizer') 360 | parser.add_argument('-lf', '--loss_function', default='cross_entropy', type=check_loss_type, help='Loss function') 361 | parser.add_argument('-d', '--dropout', default=0.25, type=float, help='Dropout probability') 362 | parser.add_argument('-l', '--linear', default=1, type=check_linear_type, 363 | help='Extra linear layers in DeepConvNet (default is 1)') 364 | parser.add_argument('-v', '--verbosity', default=0, type=check_verbosity_type, help='Whether to show info log') 365 | 366 | return parser.parse_args() 367 | 368 | 369 | def main() -> None: 370 | """ 371 | Main function 372 | :return: None 373 | """ 374 | # Parse arguments 375 | arguments
= parse_arguments() 376 | model = arguments.model 377 | epochs = arguments.epochs 378 | learning_rate = arguments.learning_rate 379 | batch_size = arguments.batch_size 380 | optimizer = arguments.optimizer 381 | loss_function = arguments.loss_function 382 | dropout = arguments.dropout 383 | num_of_linear = arguments.linear 384 | verbosity = arguments.verbosity 385 | info_log(f'Model: {model}', verbosity=verbosity) 386 | info_log(f'Epochs: {epochs}', verbosity=verbosity) 387 | info_log(f'Learning rate: {learning_rate}', verbosity=verbosity) 388 | info_log(f'Batch size: {batch_size}', verbosity=verbosity) 389 | info_log(f'Optimizer: {optimizer}', verbosity=verbosity) 390 | info_log(f'Loss function: {loss_function}', verbosity=verbosity) 391 | info_log(f'Dropout: {dropout}', verbosity=verbosity) 392 | if model == 'Deep': 393 | info_log(f'Number of linear layers: {num_of_linear + 1}', verbosity=verbosity) 394 | 395 | # Read data 396 | info_log('Reading data ...', verbosity=verbosity) 397 | train_data, train_label, test_data, test_label = read_bci_data() 398 | train_dataset = TensorDataset(Tensor(train_data), Tensor(train_label)) 399 | test_dataset = TensorDataset(Tensor(test_data), Tensor(test_label)) 400 | 401 | # Get training device 402 | train_device = device("cuda" if cuda.is_available() else "cpu") 403 | info_log(f'Training device: {train_device}', verbosity=verbosity) 404 | 405 | # Train models 406 | train(target_model=model, 407 | epochs=epochs, 408 | learning_rate=learning_rate, 409 | batch_size=batch_size, 410 | optimizer=optimizer, 411 | loss_function=loss_function, 412 | dropout=dropout, 413 | num_of_linear=num_of_linear, 414 | train_device=train_device, 415 | train_dataset=train_dataset, 416 | test_dataset=test_dataset, 417 | verbosity=verbosity) 418 | 419 | 420 | if __name__ == '__main__': 421 | main() 422 | -------------------------------------------------------------------------------- /Lab 4/DLP_LAB4_309552007_袁鈺勛.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 4/DLP_LAB4_309552007_袁鈺勛.pdf -------------------------------------------------------------------------------- /Lab 4/Lab4-Diabetic retinopathy detection.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 4/Lab4-Diabetic retinopathy detection.pdf -------------------------------------------------------------------------------- /Lab 4/README.md: -------------------------------------------------------------------------------- 1 | # Deep-Learning-and-Practice Lab 4 2 | 🚀 ResNet18 & ResNet50 3 | 🏹 The goal of this lab is to implement ResNet18 and ResNet50 with Pytorch to classify diabetic retinopathy data. 
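For example, a comparison run might look like (an illustrative invocation; the full argument list is in the table below) `python retinopathy_detection.py -t ResNet18 -c 1 -b 4 -e 10`, which trains ResNet18 both with and without pretraining and plots the accuracy curves of the two variants.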
4 | 5 | 6 | 7 | ## Arguments 8 | |Argument|Description|Default| 9 | |---|---|---| 10 | |`'-t', '--target_model'`|ResNet18 or ResNet50|'ResNet18'| 11 | |`'-c', '--comparison'`|Whether to compare the accuracies of the w/ pretraining and w/o pretraining models|1| 12 | |`'-p', '--pretrain'`|Train the w/ pretraining model or the w/o pretraining model when "comparison" is false|0| 13 | |`'-l', '--load'`|Whether to load the stored model and accuracies|0| 14 | |`'-s', '--show_only'`|Whether to only show the results|0| 15 | |`'-b', '--batch_size'`|Batch size|4| 16 | |`'-lr', '--learning_rate'`|Learning rate|1e-3| 17 | |`'-e', '--epochs'`|Number of epochs|10| 18 | |`'-o', '--optimizer'`|Optimizer|'sgd'| 19 | |`'-m', '--momentum'`|Momentum factor for SGD|0.9| 20 | |`'-w', '--weight_decay'`|Weight decay (L2 penalty)|5e-4| 21 | |`'-v', '--verbosity'`|Verbosity level|0| -------------------------------------------------------------------------------- /Lab 4/dataloader.py: -------------------------------------------------------------------------------- 1 | from torch.utils import data 2 | from torchvision import transforms 3 | import pandas as pd 4 | import numpy as np 5 | import os 6 | import PIL 7 | 8 | 9 | def get_data(mode): 10 | if mode == 'train': 11 | img = pd.read_csv('train_img.csv') 12 | label = pd.read_csv('train_label.csv') 13 | return np.squeeze(img.values), np.squeeze(label.values) 14 | else: 15 | img = pd.read_csv('test_img.csv') 16 | label = pd.read_csv('test_label.csv') 17 | return np.squeeze(img.values), np.squeeze(label.values) 18 | 19 | 20 | class RetinopathyLoader(data.Dataset): 21 | def __init__(self, root: str, mode: str, transformations=None): 22 | """ 23 | Args: 24 | root (string): Root path of the dataset. 25 | mode : Indicate procedure status (training or testing) 26 | 27 | self.img_name (string list): String list that stores all image names. 28 | self.label (int or float list): Numerical list that stores all ground truth label values. 29 | """ 30 | self.root = root 31 | self.img_name, self.label = get_data(mode) 32 | self.mode = mode 33 | trans = [] 34 | if transformations: 35 | trans += transformations 36 | trans.append(transforms.ToTensor()) 37 | self.transform = transforms.Compose(trans) 38 | print("> Found %d images ..." % (len(self.img_name))) 39 | 40 | def __len__(self): 41 | """Return the size of the dataset""" 42 | return len(self.img_name) 43 | 44 | def __getitem__(self, index: int): 45 | """something you should implement here""" 46 | 47 | """ 48 | step1. Get the image path from 'self.img_name' and load it. 49 | hint : path = root + self.img_name[index] + '.jpeg' 50 | 51 | step2. Get the ground truth label from self.label 52 | 53 | step3. Transform the .jpeg rgb images during the training phase, such as resizing, random flipping, 54 | rotation, cropping, normalization etc. But at the beginning, I suggest you follow the hints. 55 | 56 | In the testing phase, if you have a normalization process during the training phase, you only need 57 | to normalize the data. 58 | 59 | hints : Convert the pixel value to [0, 1] 60 | Transpose the image shape from [H, W, C] to [C, H, W] 61 | 62 | step4.
Return processed image and label 63 | """ 64 | path = os.path.join(self.root, f'{self.img_name[index]}.jpeg') 65 | img = self.transform(PIL.Image.open(path)) 66 | label = self.label[index] 67 | 68 | return img, label 69 | -------------------------------------------------------------------------------- /Lab 4/retinopathy_detection.py: -------------------------------------------------------------------------------- 1 | from dataloader import RetinopathyLoader 2 | from torch import Tensor, device, cuda, no_grad, load, save 3 | from torch import max as tensor_max 4 | from torch.utils.data import TensorDataset, DataLoader 5 | from torchvision import transforms 6 | from argparse import ArgumentParser, ArgumentTypeError, Namespace 7 | from typing import Optional, Type, Union, List, Dict 8 | from tqdm import tqdm 9 | from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay 10 | import sys 11 | import os 12 | import torch.nn as nn 13 | import torch.optim as op 14 | import torchvision.models as torch_models 15 | import matplotlib.pyplot as plt 16 | import numpy as np 17 | import pickle 18 | 19 | 20 | class BasicBlock(nn.Module): 21 | """ 22 | output = (channels, H, W) -> conv2d (3x3) -> (channels, H, W) -> conv2d (3x3) -> (channels, H, W) + (channels, H, W) 23 | """ 24 | expansion: int = 1 25 | 26 | def __init__(self, in_channels: int, out_channels: int, stride: int = 1, down_sample: Optional[nn.Module] = None): 27 | super(BasicBlock, self).__init__() 28 | 29 | self.activation = nn.ReLU(inplace=True) 30 | self.block = nn.Sequential( 31 | nn.Conv2d( 32 | in_channels=in_channels, 33 | out_channels=out_channels, 34 | kernel_size=3, 35 | stride=stride, 36 | padding=1, 37 | bias=False), 38 | nn.BatchNorm2d(out_channels), 39 | self.activation, 40 | nn.Conv2d( 41 | in_channels=out_channels, 42 | out_channels=out_channels, 43 | kernel_size=3, 44 | padding=1, 45 | bias=False), 46 | nn.BatchNorm2d(out_channels), 47 | ) 48 | self.down_sample = down_sample 49 | 50 | def forward(self, inputs: TensorDataset) -> Tensor: 51 | """ 52 | Forward propagation 53 | :param inputs: input data 54 | :return: results 55 | """ 56 | residual = inputs 57 | outputs = self.block(inputs) 58 | if self.down_sample is not None: 59 | residual = self.down_sample(inputs) 60 | 61 | outputs = self.activation(outputs + residual) 62 | 63 | return outputs 64 | 65 | 66 | class BottleneckBlock(nn.Module): 67 | """ 68 | output = (channels * 4, H, W) -> conv2d (1x1) -> (channels, H, W) -> conv2d (3x3) -> (channels, H, W) 69 | -> conv2d (1x1) -> (channels * 4, H, W) + (channels * 4, H, W) 70 | """ 71 | expansion: int = 4 72 | 73 | def __init__(self, in_channels: int, out_channels: int, stride: int = 1, down_sample: Optional[nn.Module] = None): 74 | super(BottleneckBlock, self).__init__() 75 | 76 | external_channels = out_channels * self.expansion 77 | self.activation = nn.ReLU(inplace=True) 78 | self.block = nn.Sequential( 79 | nn.Conv2d(in_channels=in_channels, 80 | out_channels=out_channels, 81 | kernel_size=1, 82 | bias=False), 83 | nn.BatchNorm2d(out_channels), 84 | self.activation, 85 | nn.Conv2d(in_channels=out_channels, 86 | out_channels=out_channels, 87 | kernel_size=3, 88 | stride=stride, 89 | padding=1, 90 | bias=False), 91 | nn.BatchNorm2d(out_channels), 92 | self.activation, 93 | nn.Conv2d(in_channels=out_channels, 94 | out_channels=external_channels, 95 | kernel_size=1, 96 | bias=False), 97 | nn.BatchNorm2d(external_channels), 98 | ) 99 | self.down_sample = down_sample 100 | 101 | def forward(self, inputs: TensorDataset) 
-> Tensor: 102 | """ 103 | Forward propagation 104 | :param inputs: input data 105 | :return: results 106 | """ 107 | residual = inputs 108 | outputs = self.block(inputs) 109 | if self.down_sample is not None: 110 | residual = self.down_sample(inputs) 111 | 112 | outputs = self.activation(outputs + residual) 113 | 114 | return outputs 115 | 116 | 117 | class ResNet(nn.Module): 118 | def __init__(self, architecture: str, block: Type[Union[BasicBlock, BottleneckBlock]], layers: List[int], 119 | pretrain: bool): 120 | super(ResNet, self).__init__() 121 | 122 | if pretrain: 123 | pretrained_resnet = getattr(torch_models, architecture)(pretrained=True) 124 | self.conv_1 = nn.Sequential( 125 | getattr(pretrained_resnet, 'conv1'), 126 | getattr(pretrained_resnet, 'bn1'), 127 | getattr(pretrained_resnet, 'relu'), 128 | getattr(pretrained_resnet, 'maxpool') 129 | ) 130 | 131 | # Layers 132 | self.conv_2 = getattr(pretrained_resnet, 'layer1') 133 | self.conv_3 = getattr(pretrained_resnet, 'layer2') 134 | self.conv_4 = getattr(pretrained_resnet, 'layer3') 135 | self.conv_5 = getattr(pretrained_resnet, 'layer4') 136 | 137 | self.classify = nn.Sequential( 138 | getattr(pretrained_resnet, 'avgpool'), 139 | nn.Flatten(), 140 | nn.Linear(getattr(pretrained_resnet, 'fc').in_features, out_features=50), 141 | nn.ReLU(inplace=True), 142 | nn.Dropout(p=0.25), 143 | nn.Linear(in_features=50, out_features=5) 144 | ) 145 | 146 | del pretrained_resnet 147 | else: 148 | self.current_channels = 64 149 | 150 | self.conv_1 = nn.Sequential( 151 | nn.Conv2d( 152 | in_channels=3, 153 | out_channels=64, 154 | kernel_size=7, 155 | stride=2, 156 | padding=3, 157 | bias=False), 158 | nn.BatchNorm2d(64), 159 | nn.ReLU(inplace=True), 160 | nn.MaxPool2d(kernel_size=3, 161 | stride=2, 162 | padding=1) 163 | ) 164 | 165 | # Layers 166 | self.conv_2 = self.make_layer(block=block, 167 | num_of_blocks=layers[0], 168 | in_channels=64) 169 | self.conv_3 = self.make_layer(block=block, 170 | num_of_blocks=layers[1], 171 | in_channels=128, 172 | stride=2) 173 | self.conv_4 = self.make_layer(block=block, 174 | num_of_blocks=layers[2], 175 | in_channels=256, 176 | stride=2) 177 | self.conv_5 = self.make_layer(block=block, 178 | num_of_blocks=layers[3], 179 | in_channels=512, 180 | stride=2) 181 | 182 | self.classify = nn.Sequential( 183 | nn.AdaptiveAvgPool2d((1, 1)), 184 | nn.Flatten(), 185 | nn.Linear(in_features=512 * block.expansion, out_features=50), 186 | nn.ReLU(inplace=True), 187 | nn.Dropout(p=0.25), 188 | nn.Linear(in_features=50, out_features=5) 189 | ) 190 | 191 | def make_layer(self, block: Type[Union[BasicBlock, BottleneckBlock]], num_of_blocks: int, in_channels: int, 192 | stride: int = 1) -> nn.Sequential: 193 | """ 194 | Make a layer with given block 195 | :param block: block to be used to compose the layer 196 | :param num_of_blocks: number of blocks in this layer 197 | :param in_channels: channels used in the blocks 198 | :param stride: stride 199 | :return: convolution layer composed with given block 200 | """ 201 | down_sample = None 202 | if stride != 1 or self.current_channels != in_channels * block.expansion: 203 | down_sample = nn.Sequential( 204 | nn.Conv2d(in_channels=self.current_channels, 205 | out_channels=in_channels * block.expansion, 206 | kernel_size=1, 207 | stride=stride, 208 | bias=False), 209 | nn.BatchNorm2d(in_channels * block.expansion), 210 | ) 211 | 212 | layers = [ 213 | block(in_channels=self.current_channels, 214 | out_channels=in_channels, 215 | stride=stride, 216 | down_sample=down_sample) 
217 | ] 218 | self.current_channels = in_channels * block.expansion 219 | layers += [block(in_channels=self.current_channels, out_channels=in_channels) for _ in range(1, num_of_blocks)] 220 | 221 | return nn.Sequential(*layers) 222 | 223 | def forward(self, inputs: TensorDataset) -> Tensor: 224 | """ 225 | Forward propagation 226 | :param inputs: input data 227 | :return: results 228 | """ 229 | partial_results = inputs 230 | for idx in range(1, 6): 231 | partial_results = getattr(self, f'conv_{idx}')(partial_results) 232 | return self.classify(partial_results) 233 | 234 | 235 | def resnet_18(pretrain: bool = False) -> ResNet: 236 | """ 237 | Get ResNet18 238 | :param pretrain: whether use pretrained model 239 | :return: ResNet18 240 | """ 241 | return ResNet(architecture='resnet18', block=BasicBlock, layers=[2, 2, 2, 2], pretrain=pretrain) 242 | 243 | 244 | def resnet_50(pretrain: bool = False) -> ResNet: 245 | """ 246 | Get ResNet50 247 | :param pretrain: whether use pretrained model 248 | :return: ResNet50 249 | """ 250 | return ResNet(architecture='resnet50', block=BottleneckBlock, layers=[3, 4, 6, 3], pretrain=pretrain) 251 | 252 | 253 | def save_object(obj, name: str) -> None: 254 | """ 255 | Save object 256 | :param obj: object to be saved 257 | :param name: name of the file 258 | :return: None 259 | """ 260 | if not os.path.exists('./model'): 261 | os.mkdir('./model') 262 | with open(f'./model/{name}.pkl', 'wb') as f: 263 | pickle.dump(obj, f, pickle.HIGHEST_PROTOCOL) 264 | 265 | 266 | def load_object(name: str): 267 | """ 268 | Load object 269 | :param name: name of the file 270 | :return: the stored object 271 | """ 272 | with open(f'./model/{name}.pkl', 'rb') as f: 273 | return pickle.load(f) 274 | 275 | 276 | def show_results(target_model: str, 277 | epochs: int, 278 | accuracy: Dict[str, dict], 279 | prediction: Dict[str, np.ndarray], 280 | ground_truth: np.ndarray, 281 | keys: List[str], 282 | show_only: int) -> None: 283 | """ 284 | Show accuracy results 285 | :param target_model: ResNet18 or ResNet50 286 | :param epochs: number of epochs 287 | :param accuracy: training and testing accuracy of different ResNets 288 | :param prediction: predictions of different ResNets 289 | :param ground_truth: ground truth of testing data 290 | :param keys: names of ResNet w/ or w/o pretraining 291 | :param show_only: Whether only show the results 292 | :return: None 293 | """ 294 | # Get the number of characters of the longest ResNet name 295 | longest = len(max(keys, key=len)) + 6 296 | 297 | if not os.path.exists('./results'): 298 | os.mkdir('./results') 299 | 300 | # Plot 301 | plt.figure(0) 302 | plt.title(f'Result Comparison ({target_model})') 303 | plt.xlabel('Epoch') 304 | plt.ylabel('Accuracy (%)') 305 | 306 | for train_or_test, acc in accuracy.items(): 307 | for model in keys: 308 | plt.plot(range(epochs), acc[model], label=f'{model}_{train_or_test}') 309 | spaces = ''.join([' ' for _ in range(longest - len(f'{model}_{train_or_test}'))]) 310 | print(f'{model}_{train_or_test}: {spaces}{max(acc[model]):.2f} %') 311 | 312 | plt.legend(loc='lower right') 313 | plt.tight_layout() 314 | if not show_only: 315 | plt.savefig(f'./results/{target_model}_comparison.png') 316 | plt.close() 317 | 318 | for key, pred_labels in prediction.items(): 319 | cm = confusion_matrix(y_true=ground_truth, y_pred=pred_labels, normalize='true') 320 | ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=[0, 1, 2, 3, 4]).plot(cmap=plt.cm.Blues) 321 | plt.title(f'Normalized confusion matrix ({key})') 322 
| plt.tight_layout() 323 | if not show_only: 324 | plt.savefig(f'./results/{key.replace(" ", "_").replace("/", "_")}_confusion.png') 325 | plt.close() 326 | 327 | if show_only: 328 | plt.show() 329 | 330 | 331 | def train(target_model: str, 332 | comparison: int, 333 | pretrain: int, 334 | load_or_not: int, 335 | show_only: int, 336 | batch_size: int, 337 | learning_rate: float, 338 | epochs: int, 339 | optimizer: op, 340 | momentum: float, 341 | weight_decay: float, 342 | train_device: device, 343 | train_dataset: RetinopathyLoader, 344 | test_dataset: RetinopathyLoader) -> None: 345 | """ 346 | Train the models 347 | :param target_model: ResNet18 or ResNet50 348 | :param comparison: Whether compare w/ pretraining and w/o pretraining models 349 | :param pretrain: Whether use pretrained model when comparison is false 350 | :param load_or_not: Whether load the stored model and accuracies 351 | :param show_only: Whether only show the results 352 | :param batch_size: batch size 353 | :param learning_rate: learning rate 354 | :param epochs: number of epochs 355 | :param optimizer: optimizer 356 | :param momentum: momentum for SGD 357 | :param weight_decay: weight decay factor 358 | :param train_device: training device (cpu or gpu) 359 | :param train_dataset: training dataset 360 | :param test_dataset: testing dataset 361 | :return: None 362 | """ 363 | # Setup models w/ or w/o pretraining 364 | info_log('Setup models ...') 365 | if target_model == 'ResNet18': 366 | if comparison: 367 | keys = [ 368 | 'ResNet18 (w/o pretraining)', 369 | 'ResNet18 (w/ pretraining)' 370 | ] 371 | models = { 372 | keys[0]: resnet_18().to(train_device), 373 | keys[1]: resnet_18(pretrain=True).to(train_device) 374 | } 375 | else: 376 | if pretrain: 377 | keys = ['ResNet18 (w/ pretraining)'] 378 | models = {keys[0]: resnet_18(pretrain=True).to(train_device)} 379 | else: 380 | keys = ['ResNet18 (w/o pretraining)'] 381 | models = {keys[0]: resnet_18().to(train_device)} 382 | if load_or_not: 383 | checkpoint = load(f'./model/{target_model}.pt') 384 | models[keys[0]].load_state_dict(checkpoint['model_state_dict']) 385 | else: 386 | if comparison: 387 | keys = [ 388 | 'ResNet50 (w/o pretraining)', 389 | 'ResNet50 (w/ pretraining)' 390 | ] 391 | models = { 392 | keys[0]: resnet_50().to(train_device), 393 | keys[1]: resnet_50(pretrain=True).to(train_device) 394 | } 395 | else: 396 | if pretrain: 397 | keys = ['ResNet50 (w/ pretraining)'] 398 | models = {keys[0]: resnet_50(pretrain=True).to(train_device)} 399 | else: 400 | keys = ['ResNet50 (w/o pretraining)'] 401 | models = {keys[0]: resnet_50().to(train_device)} 402 | if load_or_not: 403 | checkpoint = load(f'./model/{target_model}.pt') 404 | models[keys[0]].load_state_dict(checkpoint['model_state_dict']) 405 | 406 | # Setup accuracy structure 407 | info_log('Setup accuracy structure ...') 408 | if show_only: 409 | accuracy = load_object(name='accuracy') 410 | elif not comparison and load_or_not: 411 | last_accuracy = load_object(name='accuracy') 412 | accuracy = { 413 | 'train': {key: last_accuracy['train'][key] + [0 for _ in range(epochs)] for key in keys}, 414 | 'test': {key: last_accuracy['test'][key] + [0 for _ in range(epochs)] for key in keys} 415 | } 416 | else: 417 | accuracy = { 418 | 'train': {key: [0 for _ in range(epochs)] for key in keys}, 419 | 'test': {key: [0 for _ in range(epochs)] for key in keys} 420 | } 421 | 422 | # Setup prediction structure 423 | info_log('Setup prediction structure ...') 424 | prediction = load_object('prediction') if not 
comparison and load_or_not else {key: None for key in keys} 425 | 426 | # Load data 427 | info_log('Load data ...') 428 | train_loader = DataLoader(train_dataset, batch_size=batch_size) 429 | test_loader = DataLoader(test_dataset, batch_size=batch_size) 430 | ground_truth = np.array([], dtype=int) 431 | for _, label in test_loader: 432 | ground_truth = np.concatenate((ground_truth, label.long().view(-1).numpy())) 433 | 434 | # For storing model 435 | stored_check_point = { 436 | 'epoch': None, 437 | 'model_state_dict': None, 438 | 'optimizer_state_dict': None 439 | } 440 | 441 | # Start training 442 | last_epoch = checkpoint['epoch'] if not comparison and load_or_not else 0 443 | if not show_only: 444 | info_log('Start training') 445 | for key, model in models.items(): 446 | info_log(f'Training {key} ...') 447 | if optimizer is op.SGD: 448 | model_optimizer = optimizer(model.parameters(), lr=learning_rate, momentum=momentum, 449 | weight_decay=weight_decay) 450 | else: 451 | model_optimizer = optimizer(model.parameters(), lr=learning_rate, weight_decay=weight_decay) 452 | 453 | if not comparison and load_or_not: 454 | model_optimizer.load_state_dict(checkpoint['optimizer_state_dict']) 455 | 456 | max_test_acc = 0 457 | for epoch in tqdm(range(last_epoch, epochs + last_epoch)): 458 | # Train model 459 | model.train() 460 | for data, label in train_loader: 461 | inputs = data.to(train_device) 462 | labels = label.to(train_device).long().view(-1) 463 | 464 | pred_labels = model.forward(inputs=inputs) 465 | 466 | model_optimizer.zero_grad() 467 | loss = nn.CrossEntropyLoss()(pred_labels, labels) 468 | loss.backward() 469 | model_optimizer.step() 470 | 471 | accuracy['train'][key][epoch] += (tensor_max(pred_labels, 1)[1] == labels).sum().item() 472 | accuracy['train'][key][epoch] = 100.0 * accuracy['train'][key][epoch] / len(train_dataset) 473 | 474 | # Test model 475 | model.eval() 476 | with no_grad(): 477 | pred_labels = np.array([], dtype=int) 478 | for data, label in test_loader: 479 | inputs = data.to(train_device) 480 | labels = label.to(train_device).long().view(-1) 481 | 482 | outputs = model.forward(inputs=inputs) 483 | outputs = tensor_max(outputs, 1)[1] 484 | pred_labels = np.concatenate((pred_labels, outputs.cpu().numpy())) 485 | 486 | accuracy['test'][key][epoch] += (outputs == labels).sum().item() 487 | accuracy['test'][key][epoch] = 100.0 * accuracy['test'][key][epoch] / len(test_dataset) 488 | 489 | if accuracy['test'][key][epoch] > max_test_acc: 490 | max_test_acc = accuracy['test'][key][epoch] 491 | prediction[key] = pred_labels 492 | 493 | debug_log(f'Train accuracy: {accuracy["train"][key][epoch]:.2f}%') 494 | debug_log(f'Test accuracy: {accuracy["test"][key][epoch]:.2f}%') 495 | print() 496 | if not comparison: 497 | if not os.path.exists('./model'): 498 | os.mkdir('./model') 499 | stored_check_point['epoch'] = last_epoch + epochs 500 | stored_check_point['model_state_dict'] = model.state_dict() 501 | stored_check_point['optimizer_state_dict'] = model_optimizer.state_dict() 502 | save(stored_check_point, f'./model/{target_model}.pt') 503 | save_object(obj=accuracy, name='accuracy') 504 | save_object(obj=prediction, name='prediction') 505 | cuda.empty_cache() 506 | 507 | # Show results 508 | show_results(target_model=target_model, 509 | epochs=last_epoch + epochs if not show_only else last_epoch, 510 | accuracy=accuracy, 511 | prediction=prediction, 512 | ground_truth=ground_truth, 513 | keys=keys, show_only=show_only) 514 | 515 | 516 | def info_log(log: str) -> None: 517 | 
""" 518 | Print information log 519 | :param log: log to be displayed 520 | :return: None 521 | """ 522 | global verbosity 523 | if verbosity: 524 | print(f'[\033[96mINFO\033[00m] {log}') 525 | sys.stdout.flush() 526 | 527 | 528 | def debug_log(log: str) -> None: 529 | """ 530 | Print debug log 531 | :param log: log to be displayed 532 | :return: None 533 | """ 534 | global verbosity 535 | if verbosity > 1: 536 | print(f'[\033[93mDEBUG\033[00m] {log}') 537 | sys.stdout.flush() 538 | 539 | 540 | def check_model_type(input_value: str) -> str: 541 | """ 542 | Check whether the model is resnet18 or resnet50 543 | :param input_value: input string value 544 | :return: model name 545 | """ 546 | lowercase_input = input_value.lower() 547 | if lowercase_input != 'resnet18' and lowercase_input != 'resnet50': 548 | raise ArgumentTypeError('Only "ResNet18" and "ResNet50" are supported.') 549 | elif lowercase_input == 'resnet18': 550 | return 'ResNet18' 551 | else: 552 | return 'ResNet50' 553 | 554 | 555 | def check_comparison_type(input_value: str) -> int: 556 | """ 557 | Check whether the comparison is 0 or 1 558 | :param input_value: input string value 559 | :return: integer value 560 | """ 561 | int_value = int(input_value) 562 | if int_value != 0 and int_value != 1: 563 | raise ArgumentTypeError(f'Comparison should be 0 or 1.') 564 | return int_value 565 | 566 | 567 | def check_pretrain_type(input_value: str) -> int: 568 | """ 569 | Check whether the pretrain is 0 or 1 570 | :param input_value: input string value 571 | :return: integer value 572 | """ 573 | int_value = int(input_value) 574 | if int_value != 0 and int_value != 1: 575 | raise ArgumentTypeError(f'Pretrain should be 0 or 1.') 576 | return int_value 577 | 578 | 579 | def check_load_type(input_value: str) -> int: 580 | """ 581 | Check whether the load is 0 or 1 582 | :param input_value: input string value 583 | :return: integer value 584 | """ 585 | int_value = int(input_value) 586 | if int_value != 0 and int_value != 1: 587 | raise ArgumentTypeError(f'Load should be 0 or 1.') 588 | return int_value 589 | 590 | 591 | def check_show_type(input_value: str) -> int: 592 | """ 593 | Check whether the show_only is 0 or 1 594 | :param input_value: input string value 595 | :return: integer value 596 | """ 597 | int_value = int(input_value) 598 | if int_value != 0 and int_value != 1: 599 | raise ArgumentTypeError(f'Show_only should be 0 or 1.') 600 | return int_value 601 | 602 | 603 | def check_optimizer_type(input_value: str) -> op: 604 | """ 605 | Check whether the optimizer is supported 606 | :param input_value: input string value 607 | :return: optimizer 608 | """ 609 | if input_value == 'sgd': 610 | return op.SGD 611 | elif input_value == 'adam': 612 | return op.Adam 613 | elif input_value == 'adadelta': 614 | return op.Adadelta 615 | elif input_value == 'adagrad': 616 | return op.Adagrad 617 | elif input_value == 'adamw': 618 | return op.AdamW 619 | elif input_value == 'adamax': 620 | return op.Adamax 621 | 622 | raise ArgumentTypeError(f'Optimizer {input_value} is not supported.') 623 | 624 | 625 | def check_verbosity_type(input_value: str) -> int: 626 | """ 627 | Check whether verbosity is true or false 628 | :param input_value: input string value 629 | :return: integer value 630 | """ 631 | int_value = int(input_value) 632 | if int_value != 0 and int_value != 1 and int_value != 2: 633 | raise ArgumentTypeError(f'Verbosity should be 0, 1 or 2.') 634 | return int_value 635 | 636 | 637 | def parse_arguments() -> Namespace: 638 | """ 639 | 
Parse arguments 640 | :return: arguments 641 | """ 642 | parser = ArgumentParser(description='ResNet') 643 | parser.add_argument('-t', '--target_model', default='ResNet18', type=check_model_type, help='ResNet18 or ResNet50') 644 | parser.add_argument('-c', '--comparison', default=1, type=check_comparison_type, 645 | help='Whether compare the accuracies of w/ pretraining and w/o pretraining models') 646 | parser.add_argument('-p', '--pretrain', default=0, type=check_pretrain_type, 647 | help='Train w/ pretraining model or w/o pretraining model when "comparison" is false') 648 | parser.add_argument('-l', '--load', default=0, type=check_load_type, 649 | help='Whether load the stored model and accuracies') 650 | parser.add_argument('-s', '--show_only', default=0, type=check_show_type, help='Whether only show the results') 651 | parser.add_argument('-b', '--batch_size', default=4, type=int, help='Batch size') 652 | parser.add_argument('-lr', '--learning_rate', default=1e-3, type=float, help='Learning rate') 653 | parser.add_argument('-e', '--epochs', default=10, type=int, help='Number of epochs') 654 | parser.add_argument('-o', '--optimizer', default='sgd', type=check_optimizer_type, help='Optimizer') 655 | parser.add_argument('-m', '--momentum', default=0.9, type=float, help='Momentum factor for SGD') 656 | parser.add_argument('-w', '--weight_decay', default=5e-4, type=float, help='Weight decay (L2 penalty)') 657 | parser.add_argument('-v', '--verbosity', default=0, type=check_verbosity_type, help='Verbosity level') 658 | 659 | return parser.parse_args() 660 | 661 | 662 | def main() -> None: 663 | """ 664 | Main function 665 | :return: None 666 | """ 667 | # Parse arguments 668 | arguments = parse_arguments() 669 | target_model = arguments.target_model 670 | comparison = arguments.comparison 671 | pretrain = arguments.pretrain 672 | load_or_not = arguments.load 673 | show_only = arguments.show_only 674 | batch_size = arguments.batch_size 675 | learning_rate = arguments.learning_rate 676 | epochs = arguments.epochs 677 | optimizer = arguments.optimizer 678 | momentum = arguments.momentum 679 | weight_decay = arguments.weight_decay 680 | global verbosity 681 | verbosity = arguments.verbosity 682 | info_log(f'Target model: {target_model}') 683 | info_log(f'Compare w/ and w/o pretraining: {"True" if comparison else "False"}') 684 | info_log(f'Use pretrained model: {"True" if pretrain else "False"}') 685 | info_log(f'Use loaded model: {"True" if load_or_not else "False"}') 686 | info_log(f'Only show the results: {"True" if show_only else "False"}') 687 | info_log(f'Batch size: {batch_size}') 688 | info_log(f'Learning rate: {learning_rate}') 689 | info_log(f'Epochs: {epochs}') 690 | info_log(f'Optimizer: {optimizer}') 691 | info_log(f'Momentum: {momentum}') 692 | info_log(f'Weight decay: {weight_decay}') 693 | 694 | # Read data 695 | info_log('Reading data ...') 696 | train_dataset = RetinopathyLoader('./data', 'train', [ 697 | transforms.RandomHorizontalFlip(p=0.5), 698 | transforms.RandomVerticalFlip(p=0.5) 699 | ]) 700 | test_dataset = RetinopathyLoader('./data', 'test') 701 | 702 | # Get training device 703 | train_device = device("cuda" if cuda.is_available() else "cpu") 704 | info_log(f'Training device: {train_device}') 705 | 706 | # Train models 707 | train(target_model=target_model, 708 | comparison=comparison, 709 | pretrain=pretrain, 710 | load_or_not=load_or_not, 711 | show_only=show_only, 712 | batch_size=batch_size, 713 | learning_rate=learning_rate, 714 | epochs=epochs, 715 | 
optimizer=optimizer, 716 | momentum=momentum, 717 | weight_decay=weight_decay, 718 | train_device=train_device, 719 | train_dataset=train_dataset, 720 | test_dataset=test_dataset) 721 | 722 | 723 | if __name__ == '__main__': 724 | verbosity = None 725 | main() 726 | -------------------------------------------------------------------------------- /Lab 5/DLP_LAB5_309552007_袁鈺勛.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 5/DLP_LAB5_309552007_袁鈺勛.pdf -------------------------------------------------------------------------------- /Lab 5/Lab5_Conditional Sequence-to-Sequence VAE.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 5/Lab5_Conditional Sequence-to-Sequence VAE.pdf -------------------------------------------------------------------------------- /Lab 5/README.md: -------------------------------------------------------------------------------- 1 | # Deep-Learning-and-Practice Lab 5 2 | 🚀 CVAE 3 | 🏹 The goal of this lab is to implement CVAE using LSTM with Pytorch for English tense conversion. 4 | 5 | 6 | 7 | ## Arguments 8 | |Argument|Description|Default| 9 | |---|---|---| 10 | |`'-hs', '--hidden_size'`|RNN hidden size|256| 11 | |`'-ls', '--latent_size'`|Latent size|32| 12 | |`'-c', '--condition_embedding_size'`|Condition embedding size|8| 13 | |`'-k', '--kl_weight'`|KL weight|0.0| 14 | |`'-kt', '--kl_weight_type'`|Fixed, monotonic or cyclical KL weight|'monotonic'| 15 | |`'-t', '--teacher_forcing_ratio'`|Teacher forcing ratio|0.5| 16 | |`'-tt', '--teacher_forcing_type'`|Fixed or decreasing teacher forcing ratio|'decreasing'| 17 | |`'-lr', '--learning_rate'`|Learning rate|0.007| 18 | |`'-e', '--epochs'`|Number of epochs|100| 19 | |`'-l', '--load'`|Whether load the stored model and accuracies|0| 20 | |`'-s', '--show_only'`|Whether only show the results|0| 21 | |`'-v', '--verbosity'`|Verbosity level|0| -------------------------------------------------------------------------------- /Lab 5/data/readme.txt: -------------------------------------------------------------------------------- 1 | This is the specification file for conditional seq2seq VAE training and validation dataset - English tense 2 | 3 | 1. train.txt 4 | The file is for training. There are 1227 training pairs. 5 | Each training pair includes 4 words: simple present(sp), third person(tp), present progressive(pg), simple past(p). 6 | 7 | 8 | 2. test.txt 9 | The file is for validating. There are 10 validating pairs. 10 | Each training pair includes 2 words with different combination of tenses. 11 | You have to follow those tenses to test your model. 
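(Editor's sketch, not part of the dataset files: one way rows in this format might be read. The helper name `load_tense_rows`, the `TENSES` ordering, and the path are illustrative assumptions.)

```python
from typing import List

TENSES = ['sp', 'tp', 'pg', 'p']  # column order: simple present, third person, progressive, past

def load_tense_rows(path: str) -> List[List[str]]:
    """Each non-empty line of train.txt holds the four tenses of one verb."""
    with open(path) as f:
        return [line.split() for line in f if line.strip()]

# rows = load_tense_rows('data/train.txt')  # len(rows) should be 1227
```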
12 | 13 | Here are the details of the file: 14 | 15 | sp -> p 16 | sp -> pg 17 | sp -> tp 18 | sp -> tp 19 | p -> tp 20 | sp -> pg 21 | p -> sp 22 | pg -> sp 23 | pg -> p 24 | pg -> tp 25 | 26 | -------------------------------------------------------------------------------- /Lab 5/data/test.txt: -------------------------------------------------------------------------------- 1 | abandon abandoned 2 | abet abetting 3 | begin begins 4 | expend expends 5 | sent sends 6 | split splitting 7 | flared flare 8 | functioning function 9 | functioning functioned 10 | healing heals 11 | -------------------------------------------------------------------------------- /Lab 6/Lab6-DQN-DDPG.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 6/Lab6-DQN-DDPG.pdf -------------------------------------------------------------------------------- /Lab 6/README.md: -------------------------------------------------------------------------------- 1 | # Deep-Learning-and-Practice Lab 6 2 | 🚀 DQN & DDPG 3 | 🏹 The goal of this lab is to solve lunar landing using DQN, and solve continuous lunar landing using DDPG. 4 | 5 | 6 | 7 | ## Arguments 8 | ### DQN 9 | |Argument|Default| 10 | |---|---| 11 | |`'-d', '--device'`|'cuda'| 12 | |`'-m', '--model'`|'dqn.pth'| 13 | |`'--logdir'`|'log/dqn'| 14 | |`'--warmup'`|10000| 15 | |`'--episode'`|1200| 16 | |`'--capacity'`|10000| 17 | |`'--batch_size'`|128| 18 | |`'--lr'`|.0005| 19 | |`'--eps_decay'`|.995| 20 | |`'--eps_min'`|.01| 21 | |`'--gamma'`|.99| 22 | |`'--freq'`|4| 23 | |`'--target_freq'`|100| 24 | |`'--test_only'`|'store_true'| 25 | |`'--render'`|'store_true'| 26 | |`'--seed'`|20200519| 27 | |`'--test_epsilon'`|.001| 28 | 29 | ### DDPG 30 | |Argument|Default| 31 | |---|---| 32 | |`'-d', '--device'`|'cuda'| 33 | |`'-m', '--model'`|'ddpg.pth'| 34 | |`'--logdir'`|'log/ddpg'| 35 | |`'--warmup'`|10000| 36 | |`'--episode'`|1200| 37 | |`'--batch_size'`|64| 38 | |`'--capacity'`|500000| 39 | |`'--lra'`|1e-3| 40 | |`'--lrc'`|1e-3| 41 | |`'--gamma'`|.99| 42 | |`'--tau'`|.005| 43 | |`'--test_only'`|'store_true'| 44 | |`'--render'`|'store_true'| 45 | |`'--seed'`|20200519| -------------------------------------------------------------------------------- /Lab 6/ddpg.py: -------------------------------------------------------------------------------- 1 | """DLP DDPG Lab""" 2 | __author__ = 'chengscott' 3 | __copyright__ = 'Copyright 2020, NCTU CGI Lab' 4 | 5 | import argparse 6 | from collections import deque 7 | import itertools 8 | import random 9 | import time 10 | 11 | import gym 12 | import numpy as np 13 | import torch 14 | import torch.nn as nn 15 | from torch.utils.tensorboard import SummaryWriter 16 | from torch.optim import Adam 17 | 18 | 19 | class GaussianNoise: 20 | def __init__(self, dim, mu=None, std=None): 21 | self.mu = mu if mu else np.zeros(dim) 22 | self.std = std if std else np.ones(dim) * .1 23 | 24 | def sample(self): 25 | """ 26 | Sample from the Gaussian noise 27 | :return: sampled noises 28 | """ 29 | return np.random.normal(self.mu, self.std) 30 | 31 | 32 | class ReplayMemory: 33 | __slots__ = ['buffer'] 34 | 35 | def __init__(self, capacity): 36 | self.buffer = deque(maxlen=capacity) 37 | 38 | def __len__(self): 39 | return len(self.buffer) 40 | 41 | def append(self, *transition): 42 | """ 43 | Append (state, action, reward, next_state, done) to the buffer 44 | :param transition: (state, action, reward,
next_state, done) 45 | :return: None 46 | """ 47 | self.buffer.append(tuple(map(tuple, transition))) 48 | 49 | def sample(self, batch_size, device): 50 | """ 51 | Sample a batch of transition tensors 52 | :param batch_size: batch size 53 | :param device: training device 54 | :return: a batch of transition tensors 55 | """ 56 | # TODO 57 | transitions = random.sample(self.buffer, batch_size) 58 | return (torch.tensor(x, dtype=torch.float, device=device) 59 | for x in zip(*transitions)) 60 | 61 | 62 | class ActorNet(nn.Module): 63 | def __init__(self, state_dim=8, action_dim=2, hidden_dim=(400, 300)): 64 | super().__init__() 65 | # TODO 66 | h1, h2 = hidden_dim 67 | self.network = nn.Sequential( 68 | nn.Linear(state_dim, h1), 69 | nn.ReLU(inplace=True), 70 | nn.Linear(h1, h2), 71 | nn.ReLU(inplace=True), 72 | nn.Linear(h2, action_dim), 73 | nn.Tanh() 74 | ) 75 | 76 | def forward(self, x): 77 | # TODO 78 | return self.network(x) 79 | 80 | 81 | class CriticNet(nn.Module): 82 | def __init__(self, state_dim=8, action_dim=2, hidden_dim=(400, 300)): 83 | super().__init__() 84 | h1, h2 = hidden_dim 85 | self.critic_head = nn.Sequential( 86 | nn.Linear(state_dim + action_dim, h1), 87 | nn.ReLU(), 88 | ) 89 | self.critic = nn.Sequential( 90 | nn.Linear(h1, h2), 91 | nn.ReLU(), 92 | nn.Linear(h2, 1), 93 | ) 94 | 95 | def forward(self, x, action): 96 | x = self.critic_head(torch.cat([x, action], dim=1)) 97 | return self.critic(x) 98 | 99 | 100 | class DDPG: 101 | def __init__(self, args): 102 | # Behavior network 103 | self._actor_net = ActorNet().to(args.device) 104 | self._critic_net = CriticNet().to(args.device) 105 | 106 | # Target network 107 | self._target_actor_net = ActorNet().to(args.device) 108 | self._target_critic_net = CriticNet().to(args.device) 109 | 110 | # Initialize target network 111 | self._target_actor_net.load_state_dict(self._actor_net.state_dict()) 112 | self._target_critic_net.load_state_dict(self._critic_net.state_dict()) 113 | 114 | # TODO 115 | # self._actor_opt = ? 116 | # self._critic_opt = ? 
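# Editor's sketch (an addition, not part of the original lab skeleton): the
# TODO above is resolved below with one Adam optimizer per network, following
# the original DDPG paper, which trains the actor and critic with separate
# learning rates (args.lra and args.lrc here). Two optimizers are needed
# because the actor and critic losses are backpropagated in separate steps.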
117 | self._actor_opt = Adam(self._actor_net.parameters(), lr=args.lra) 118 | self._critic_opt = Adam(self._critic_net.parameters(), lr=args.lrc) 119 | 120 | # Action noise 121 | self._action_noise = GaussianNoise(dim=2) 122 | 123 | # Memory 124 | self._memory = ReplayMemory(capacity=args.capacity) 125 | 126 | # Config 127 | self.device = args.device 128 | self.batch_size = args.batch_size 129 | self.tau = args.tau 130 | self.gamma = args.gamma 131 | 132 | def select_action(self, state, noise=True): 133 | """ 134 | Select an action based on the behavior (actor) network and exploration noise 135 | :param state: current state 136 | :param noise: whether add Gaussian noise 137 | :return: action 138 | """ 139 | # TODO 140 | state = torch.from_numpy(state).float().to(self.device) 141 | 142 | self._actor_net.eval() 143 | with torch.no_grad(): 144 | action = self._actor_net(state).cpu().data.numpy() 145 | self._actor_net.train() 146 | 147 | if noise: 148 | action += self._action_noise.sample() 149 | 150 | return action 151 | 152 | def append(self, state, action, reward, next_state, done): 153 | """ 154 | Append a step to the memory 155 | :param state: current state 156 | :param action: best action 157 | :param reward: reward 158 | :param next_state: next state 159 | :param done: whether the game is finished 160 | :return: None 161 | """ 162 | self._memory.append(state, action, [reward / 100], next_state, 163 | [int(done)]) 164 | 165 | def update(self): 166 | """ 167 | Update behavior networks and target networks 168 | :return: None 169 | """ 170 | # Update the behavior networks 171 | self._update_behavior_network(self.gamma) 172 | 173 | # Update the target networks 174 | self._update_target_network(self._target_actor_net, self._actor_net, 175 | self.tau) 176 | self._update_target_network(self._target_critic_net, self._critic_net, 177 | self.tau) 178 | 179 | def _update_behavior_network(self, gamma): 180 | """ 181 | Update behavior network 182 | :param gamma: gamma 183 | :return: None 184 | """ 185 | actor_net, critic_net = self._actor_net, self._critic_net 186 | target_actor_net, target_critic_net = self._target_actor_net, self._target_critic_net 187 | actor_opt, critic_opt = self._actor_opt, self._critic_opt 188 | 189 | # Sample a mini-batch of transitions 190 | state, action, reward, next_state, done = self._memory.sample( 191 | self.batch_size, self.device) 192 | 193 | # Update critic 194 | # critic loss 195 | # TODO 196 | # q_value = ? 197 | # with torch.no_grad(): 198 | # a_next = ? 199 | # q_next = ? 200 | # q_target = ? 201 | # criterion = ? 202 | # critic_loss = criterion(q_value, q_target) 203 | q_value = critic_net(state, action) 204 | with torch.no_grad(): 205 | a_next = target_actor_net(next_state) 206 | q_next = target_critic_net(next_state, a_next) 207 | q_target = reward + (gamma * q_next * (1 - done)) 208 | critic_loss = nn.MSELoss()(q_value, q_target) 209 | # Optimize critic 210 | actor_net.zero_grad() 211 | critic_net.zero_grad() 212 | critic_loss.backward() 213 | critic_opt.step() 214 | 215 | # Update actor 216 | # actor loss 217 | # TODO 218 | # action = ? 219 | # actor_loss = ? 
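# Editor's sketch of the reasoning (an addition, not the original skeleton):
# the critic above is regressed onto the one-step TD target
#     q_target = r + gamma * (1 - done) * Q'(s', mu'(s')),
# built from the *target* networks under no_grad so gradients never flow
# through the bootstrap. The actor below then follows the deterministic
# policy gradient: maximizing Q(s, mu(s)) over the batch is implemented as
# minimizing actor_loss = -Q(s, mu(s)).mean().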
220 | action = actor_net(state) 221 | actor_loss = -critic_net(state, action).mean() 222 | # Optimize actor 223 | actor_net.zero_grad() 224 | critic_net.zero_grad() 225 | actor_loss.backward() 226 | actor_opt.step() 227 | 228 | @staticmethod 229 | def _update_target_network(target_net, net, tau): 230 | """ 231 | Update target network by _soft_ copying from behavior network 232 | :param target_net: target network 233 | :param net: behavior network 234 | :param tau: weight 235 | :return: None 236 | """ 237 | for target, behavior in zip(target_net.parameters(), net.parameters()): 238 | # TODO 239 | target.data.copy_(tau * behavior.data + (1.0 - tau) * target.data) 240 | 241 | def save(self, model_path, checkpoint=False): 242 | """ 243 | Save behavior networks (and target networks and optimizers) into model_path 244 | :param model_path: name of the stored model 245 | :param checkpoint: whether to store target networks and optimizers 246 | :return: None 247 | """ 248 | if checkpoint: 249 | torch.save( 250 | { 251 | 'actor': self._actor_net.state_dict(), 252 | 'critic': self._critic_net.state_dict(), 253 | 'target_actor': self._target_actor_net.state_dict(), 254 | 'target_critic': self._target_critic_net.state_dict(), 255 | 'actor_opt': self._actor_opt.state_dict(), 256 | 'critic_opt': self._critic_opt.state_dict(), 257 | }, model_path) 258 | else: 259 | torch.save( 260 | { 261 | 'actor': self._actor_net.state_dict(), 262 | 'critic': self._critic_net.state_dict(), 263 | }, model_path) 264 | 265 | def load(self, model_path, checkpoint=False): 266 | """ 267 | Load behavior networks (and target networks and optimizers) from model_path 268 | :param model_path: name of the stored model 269 | :param checkpoint: whether target networks and optimizers are stored in the model path 270 | :return: None 271 | """ 272 | model = torch.load(model_path) 273 | self._actor_net.load_state_dict(model['actor']) 274 | self._critic_net.load_state_dict(model['critic']) 275 | if checkpoint: 276 | self._target_actor_net.load_state_dict(model['target_actor']) 277 | self._target_critic_net.load_state_dict(model['target_critic']) 278 | self._actor_opt.load_state_dict(model['actor_opt']) 279 | self._critic_opt.load_state_dict(model['critic_opt']) 280 | 281 | 282 | def train(args, env, agent, writer): 283 | """ 284 | Training 285 | :param args: arguments 286 | :param env: environment 287 | :param agent: agent 288 | :param writer: Tensorboard writer 289 | :return: None 290 | """ 291 | print('Start Training') 292 | total_steps = 0 293 | ewma_reward = 0 294 | for episode in range(args.episode): 295 | total_reward = 0 296 | state = env.reset() 297 | for t in itertools.count(start=1): 298 | # select action 299 | if total_steps < args.warmup: 300 | action = env.action_space.sample() 301 | else: 302 | action = agent.select_action(state) 303 | # execute action 304 | next_state, reward, done, _ = env.step(action) 305 | # store transition 306 | agent.append(state, action, reward, next_state, done) 307 | if total_steps >= args.warmup: 308 | agent.update() 309 | 310 | state = next_state 311 | total_reward += reward 312 | total_steps += 1 313 | if done: 314 | ewma_reward = 0.05 * total_reward + (1 - 0.05) * ewma_reward 315 | writer.add_scalar('Train/Episode Reward', total_reward, 316 | total_steps) 317 | writer.add_scalar('Train/Ewma Reward', ewma_reward, 318 | total_steps) 319 | print( 320 | 'Step: {}\tEpisode: {}\tLength: {:3d}\tTotal reward: {:.2f}\tEwma reward: {:.2f}' 321 | .format(total_steps, episode, t, total_reward, 322 | 
ewma_reward)) 323 | break 324 | env.close() 325 | 326 | 327 | def test(args, env, agent, writer): 328 | """ 329 | Testing 330 | :param args: arguments 331 | :param env: environment 332 | :param agent: agent 333 | :param writer: Tensorboard writer 334 | :return: None 335 | """ 336 | print('Start Testing') 337 | seeds = (args.seed + i for i in range(10)) 338 | rewards = [] 339 | for n_episode, seed in enumerate(seeds): 340 | total_reward = 0 341 | env.seed(seed) 342 | state = env.reset() 343 | # TODO 344 | # ... 345 | # if done: 346 | # writer.add_scalar('Test/Episode Reward', total_reward, n_episode) 347 | # ... 348 | for _ in range(1000): 349 | action = agent.select_action(state, False) 350 | state, reward, done, _ = env.step(action) 351 | total_reward += reward 352 | if done: 353 | writer.add_scalar('Test/Episode Reward', total_reward, n_episode) 354 | break 355 | rewards.append(total_reward) 356 | print('Average Reward', np.mean(rewards)) 357 | env.close() 358 | 359 | 360 | def main(): 361 | # Arguments 362 | parser = argparse.ArgumentParser(description=__doc__) 363 | parser.add_argument('-d', '--device', default='cuda') 364 | parser.add_argument('-m', '--model', default='ddpg.pth') 365 | parser.add_argument('--logdir', default='log/ddpg') 366 | # Train arguments 367 | parser.add_argument('--warmup', default=10000, type=int) 368 | parser.add_argument('--episode', default=1200, type=int) 369 | parser.add_argument('--batch_size', default=64, type=int) 370 | parser.add_argument('--capacity', default=500000, type=int) 371 | parser.add_argument('--lra', default=1e-3, type=float) 372 | parser.add_argument('--lrc', default=1e-3, type=float) 373 | parser.add_argument('--gamma', default=.99, type=float) 374 | parser.add_argument('--tau', default=.005, type=float) 375 | # Testing arguments 376 | parser.add_argument('--test_only', action='store_true') 377 | parser.add_argument('--render', action='store_true') 378 | parser.add_argument('--seed', default=20200519, type=int) 379 | args = parser.parse_args() 380 | 381 | # Main 382 | env = gym.make('LunarLanderContinuous-v2') 383 | agent = DDPG(args) 384 | writer = SummaryWriter(args.logdir) 385 | if not args.test_only: 386 | train(args, env, agent, writer) 387 | agent.save(args.model) 388 | agent.load(args.model) 389 | test(args, env, agent, writer) 390 | 391 | 392 | if __name__ == '__main__': 393 | main() 394 | -------------------------------------------------------------------------------- /Lab 6/dqn.py: -------------------------------------------------------------------------------- 1 | """DLP DQN Lab""" 2 | __author__ = 'chengscott' 3 | __copyright__ = 'Copyright 2020, NCTU CGI Lab' 4 | 5 | import argparse 6 | from collections import deque 7 | import itertools 8 | import random 9 | import time 10 | 11 | import gym 12 | import numpy as np 13 | import torch 14 | import torch.nn as nn 15 | from torch.utils.tensorboard import SummaryWriter 16 | from torch.optim import Adam 17 | 18 | 19 | class ReplayMemory: 20 | __slots__ = ['buffer'] 21 | 22 | def __init__(self, capacity): 23 | self.buffer = deque(maxlen=capacity) 24 | 25 | def __len__(self): 26 | return len(self.buffer) 27 | 28 | def append(self, *transition): 29 | """ 30 | Append (state, action, reward, next_state, done) to the buffer 31 | :param transition: (state, action, reward, next_state, done) 32 | :return: None 33 | """ 34 | self.buffer.append(tuple(map(tuple, transition))) 35 | 36 | def sample(self, batch_size, device): 37 | """ 38 | Sample a batch of transition tensors 39 | :param 
batch_size: batch size 40 | :param device: training device 41 | :return: a batch of transition tensors 42 | """ 43 | transitions = random.sample(self.buffer, batch_size) 44 | return (torch.tensor(x, dtype=torch.float, device=device) 45 | for x in zip(*transitions)) 46 | 47 | 48 | class Net(nn.Module): 49 | def __init__(self, state_dim=8, action_dim=4, hidden_dim=32): 50 | super().__init__() 51 | self.network = nn.Sequential( 52 | nn.Linear(in_features=state_dim, 53 | out_features=hidden_dim), 54 | nn.ReLU(inplace=True), 55 | nn.Linear(in_features=hidden_dim, 56 | out_features=hidden_dim), 57 | nn.ReLU(inplace=True), 58 | nn.Linear(in_features=hidden_dim, 59 | out_features=action_dim) 60 | ) 61 | 62 | def forward(self, x): 63 | return self.network(x) 64 | 65 | 66 | class DQN: 67 | def __init__(self, args): 68 | self._behavior_net = Net().to(args.device) 69 | self._target_net = Net().to(args.device) 70 | 71 | # Initialize target network 72 | self._target_net.load_state_dict(self._behavior_net.state_dict()) 73 | 74 | # TODO 75 | # self._optimizer = ? 76 | self._optimizer = Adam(self._behavior_net.parameters(), lr=args.lr) 77 | 78 | # Memory 79 | self._memory = ReplayMemory(capacity=args.capacity) 80 | 81 | # Config 82 | self.device = args.device 83 | self.batch_size = args.batch_size 84 | self.gamma = args.gamma 85 | self.freq = args.freq 86 | self.target_freq = args.target_freq 87 | 88 | def select_action(self, state, epsilon, action_space): 89 | """ 90 | epsilon-greedy based on behavior network 91 | :param state: current state 92 | :param epsilon: probability 93 | :param action_space: action space of current game 94 | :return: an action 95 | """ 96 | # TODO 97 | if random.random() > epsilon: 98 | state = torch.from_numpy(state).float().unsqueeze(0).to(self.device) 99 | self._behavior_net.eval() 100 | with torch.no_grad(): 101 | action_values = self._behavior_net(state) 102 | self._behavior_net.train() 103 | return np.argmax(action_values.cpu().data.numpy()) 104 | else: 105 | return random.choice(np.arange(action_space.n)) 106 | 107 | def append(self, state, action, reward, next_state, done): 108 | """ 109 | Append a step to the memory 110 | :param state: current state 111 | :param action: best action 112 | :param reward: reward 113 | :param next_state: next state 114 | :param done: whether the game is finished 115 | :return: None 116 | """ 117 | self._memory.append(state, [action], [reward / 10], next_state, 118 | [int(done)]) 119 | 120 | def update(self, total_steps): 121 | """ 122 | Update behavior networks and target networks 123 | :return: None 124 | """ 125 | if total_steps % self.freq == 0: 126 | self._update_behavior_network(self.gamma) 127 | if total_steps % self.target_freq == 0: 128 | # TODO DQN 129 | # self._update_target_network() 130 | # TODO DDQN 131 | self._soft_update_target_network() 132 | 133 | def _update_behavior_network(self, gamma): 134 | """ 135 | Update behavior network 136 | :param gamma: gamma 137 | :return: None 138 | """ 139 | # Sample a mini-batch of transitions 140 | state, action, reward, next_state, done = self._memory.sample( 141 | self.batch_size, self.device) 142 | 143 | # TODO DQN 144 | # q_value = ? 145 | # with torch.no_grad(): 146 | # q_next = ? 147 | # q_target = ? 148 | # criterion = ? 
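# Editor's note (an addition, not the original skeleton): both targets appear
# below. Vanilla DQN (commented out) bootstraps from max_a Q_target(s', a),
# where one network both selects and evaluates the action, which tends to
# overestimate values. Double DQN (the active branch) decouples the two: the
# behavior network chooses q_argmax = argmax_a Q_behavior(s', a) and the
# target network evaluates that choice, giving q_next.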
149 | # loss = criterion(q_value, q_target) 150 | # q_value = self._behavior_net(state).gather(1, action.long()) 151 | # with torch.no_grad(): 152 | # q_next = self._target_net(next_state).detach().max(1)[0].unsqueeze(1) 153 | # q_target = reward + (gamma * q_next * (1 - done)) 154 | 155 | # TODO DDQN 156 | q_value = self._behavior_net(state).gather(1, action.long()) 157 | with torch.no_grad(): 158 | q_argmax = self._behavior_net(next_state).detach().max(1)[1].unsqueeze(1) 159 | q_next = self._target_net(next_state).detach().gather(1, q_argmax) 160 | q_target = reward + (gamma * q_next * (1 - done)) 161 | 162 | loss = nn.MSELoss()(q_value, q_target) 163 | 164 | # Optimize 165 | self._optimizer.zero_grad() 166 | loss.backward() 167 | nn.utils.clip_grad_norm_(self._behavior_net.parameters(), 5) 168 | self._optimizer.step() 169 | 170 | def _update_target_network(self): 171 | """ 172 | Update target network by copying from behavior network 173 | :return: None 174 | """ 175 | # TODO 176 | self._target_net.load_state_dict(self._behavior_net.state_dict()) 177 | 178 | def _soft_update_target_network(self, tau=.9): 179 | """ 180 | Update target network by _soft_ copying from behavior network 181 | :param tau: weight 182 | :return: None 183 | """ 184 | for target, behavior in zip(self._target_net.parameters(), self._behavior_net.parameters()): 185 | target.data.copy_(tau * behavior.data + (1.0 - tau) * target.data) 186 | 187 | def save(self, model_path, checkpoint=False): 188 | """ 189 | Save behavior networks (and target networks and optimizers) into model_path 190 | :param model_path: name of the stored model 191 | :param checkpoint: whether to store target networks and optimizers 192 | :return: None 193 | """ 194 | if checkpoint: 195 | torch.save( 196 | { 197 | 'behavior_net': self._behavior_net.state_dict(), 198 | 'target_net': self._target_net.state_dict(), 199 | 'optimizer': self._optimizer.state_dict(), 200 | }, model_path) 201 | else: 202 | torch.save({ 203 | 'behavior_net': self._behavior_net.state_dict(), 204 | }, model_path) 205 | 206 | def load(self, model_path, checkpoint=False): 207 | """ 208 | Load behavior networks (and target networks and optimizers) from model_path 209 | :param model_path: name of the stored model 210 | :param checkpoint: whether target networks and optimizers are stored in the model path 211 | :return: None 212 | """ 213 | model = torch.load(model_path) 214 | self._behavior_net.load_state_dict(model['behavior_net']) 215 | if checkpoint: 216 | self._target_net.load_state_dict(model['target_net']) 217 | self._optimizer.load_state_dict(model['optimizer']) 218 | 219 | 220 | def train(args, env, agent, writer): 221 | """ 222 | Training 223 | :param args: arguments 224 | :param env: environment 225 | :param agent: agent 226 | :param writer: Tensorboard writer 227 | :return: None 228 | """ 229 | print('Start Training') 230 | action_space = env.action_space 231 | total_steps, epsilon = 0, 1. 
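# Editor's note (an addition, not the original skeleton): epsilon starts at
# 1.0 and is multiplied by args.eps_decay (default .995) after every
# post-warmup action, flooring at args.eps_min (default .01); since
# .995 ** 919 ~= .01, exploration reaches its floor after roughly 900 steps.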
232 | ewma_reward = 0 233 | for episode in range(args.episode): 234 | total_reward = 0 235 | state = env.reset() 236 | for t in itertools.count(start=1): 237 | # Select action 238 | if total_steps < args.warmup: 239 | action = action_space.sample() 240 | else: 241 | action = agent.select_action(state, epsilon, action_space) 242 | epsilon = max(epsilon * args.eps_decay, args.eps_min) 243 | # Execute action 244 | next_state, reward, done, _ = env.step(action) 245 | # Store transition 246 | agent.append(state, action, reward, next_state, done) 247 | if total_steps >= args.warmup: 248 | agent.update(total_steps) 249 | 250 | state = next_state 251 | total_reward += reward 252 | total_steps += 1 253 | if done: 254 | ewma_reward = 0.05 * total_reward + (1 - 0.05) * ewma_reward 255 | writer.add_scalar('Train/Episode Reward', total_reward, 256 | total_steps) 257 | writer.add_scalar('Train/Ewma Reward', ewma_reward, 258 | total_steps) 259 | print( 260 | 'Step: {}\tEpisode: {}\tLength: {:3d}\tTotal reward: {:.2f}\tEwma reward: {:.2f}\tEpsilon: {:.3f}' 261 | .format(total_steps, episode, t, total_reward, ewma_reward, 262 | epsilon)) 263 | break 264 | env.close() 265 | 266 | 267 | def test(args, env, agent, writer): 268 | """ 269 | Testing 270 | :param args: arguments 271 | :param env: environment 272 | :param agent: agent 273 | :param writer: Tensorboard writer 274 | :return: None 275 | """ 276 | print('Start Testing') 277 | action_space = env.action_space 278 | epsilon = args.test_epsilon 279 | seeds = (args.seed + i for i in range(10)) 280 | rewards = [] 281 | for n_episode, seed in enumerate(seeds): 282 | total_reward = 0 283 | env.seed(seed) 284 | state = env.reset() 285 | # TODO 286 | # ... 287 | # if done: 288 | # writer.add_scalar('Test/Episode Reward', total_reward, n_episode) 289 | # ... 
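# Editor's sketch of the filled-in loop below (an addition, not the original
# skeleton): each of the 10 seeded episodes is rolled out for at most 1000
# steps with the small test epsilon, its return is logged to TensorBoard,
# and the mean return over the seeds is printed at the end.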
290 | for _ in range(1000): 291 | action = agent.select_action(state, epsilon, action_space)  # act with the small test epsilon 292 | state, reward, done, _ = env.step(action) 293 | total_reward += reward 294 | if done: 295 | writer.add_scalar('Test/Episode Reward', total_reward, n_episode) 296 | break 297 | rewards.append(total_reward) 298 | print('Average Reward', np.mean(rewards)) 299 | env.close() 300 | 301 | 302 | def main(): 303 | # Arguments 304 | parser = argparse.ArgumentParser(description=__doc__) 305 | parser.add_argument('-d', '--device', default='cuda') 306 | parser.add_argument('-m', '--model', default='dqn.pth') 307 | parser.add_argument('--logdir', default='log/dqn') 308 | # Training arguments 309 | parser.add_argument('--warmup', default=10000, type=int) 310 | parser.add_argument('--episode', default=1200, type=int) 311 | parser.add_argument('--capacity', default=10000, type=int) 312 | parser.add_argument('--batch_size', default=128, type=int) 313 | parser.add_argument('--lr', default=.0005, type=float) 314 | parser.add_argument('--eps_decay', default=.995, type=float) 315 | parser.add_argument('--eps_min', default=.01, type=float) 316 | parser.add_argument('--gamma', default=.99, type=float) 317 | parser.add_argument('--freq', default=4, type=int) 318 | parser.add_argument('--target_freq', default=100, type=int) 319 | # Testing arguments 320 | parser.add_argument('--test_only', action='store_true') 321 | parser.add_argument('--render', action='store_true') 322 | parser.add_argument('--seed', default=20200519, type=int) 323 | parser.add_argument('--test_epsilon', default=.001, type=float) 324 | args = parser.parse_args() 325 | 326 | # Main 327 | env = gym.make('LunarLander-v2') 328 | agent = DQN(args) 329 | writer = SummaryWriter(args.logdir) 330 | if not args.test_only: 331 | train(args, env, agent, writer) 332 | agent.save(args.model) 333 | agent.load(args.model) 334 | test(args, env, agent, writer) 335 | 336 | 337 | if __name__ == '__main__': 338 | main() 339 | -------------------------------------------------------------------------------- /Lab 6/report.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 6/report.pdf -------------------------------------------------------------------------------- /Lab 7/Lab7-Lets play GANs with Flows and friends.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 7/Lab7-Lets play GANs with Flows and friends.pdf -------------------------------------------------------------------------------- /Lab 7/README.md: -------------------------------------------------------------------------------- 1 | # Deep-Learning-and-Practice Lab 7 2 | 🚀 cGAN & cNF 3 | 🏹 The goal of this lab is to implement conditional GAN and conditional Normalizing Flow to generate object images and human faces.
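As a quick illustration of the conditioning interface used throughout this lab (an editor's sketch, not repo code: `noise_size = 100` and the batch size are assumptions, while the 24-dimensional multi-hot label matches the evaluator shipped with this lab):

```python
import torch

noise_size, label_size, batch = 100, 24, 4
labels = torch.zeros(batch, label_size)
labels[0, [2, 5]] = 1.0  # first sample should contain objects 2 and 5
z = torch.randn(batch, noise_size, 1, 1)
# DCGenerator (below) consumes noise and condition concatenated on the channel axis
gen_input = torch.cat([z, labels.view(batch, label_size, 1, 1)], dim=1)
# gen_input has shape (batch, noise_size + label_size, 1, 1)
```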
4 | 5 | 6 | 7 | ## Reference 8 | 💡 cDCGAN is adapted from [pytorch tutorial](https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html) 9 | 💡 cSAGAN is adapted from [github](https://github.com/heykeetae/Self-Attention-GAN) 10 | 💡 cGlow is adapted from [github](https://github.com/5yearsKim/Conditional-Normalizing-Flow) 11 | 12 | 13 | 14 | ## Arguments 15 | |Argument|Description|Default| 16 | |---|---|---| 17 | |`'-b', '--batch_size'`|Batch size|64| 18 | |`'-i', '--image_size'`|Image size|64| 19 | |`'-w', '--width'`|Dimension of the hidden layers in normalizing flow|128| 20 | |`'-d', '--depth'`|Depth of the normalizing flow|8| 21 | |`'-n', '--num_levels'`|Number of levels in normalizing flow|3| 22 | |`'-gv', '--grad_value_clip'`|Clip gradients at specific value|0| 23 | |`'-gn', '--grad_norm_clip'`|Clip gradients' norm at specific value|0| 24 | |`'-lrd', '--learning_rate_discriminator'`|Learning rate of discriminator|0.0001| 25 | |`'-lrg', '--learning_rate_generator'`|Learning rate of generator|0.0004| 26 | |`'-lrnf', '--learning_rate_normalizing_flow'`|Learning rate of normalizing flow|0.0005| 27 | |`'-e', '--epochs'`|Number of epochs|300| 28 | |`'-wu', '--warmup'`|Number of warmup epochs|10| 29 | |`'-t', '--task'`|Task 1 or task 2|1 (1-2)| 30 | |`'-m', '--model'`|cGAN or cNF|'dcgan'| 31 | |`'-inf', '--inference'`|Only infer or not|False| 32 | |`'-v', '--verbosity'`|Verbosity level|0 (0-2)| -------------------------------------------------------------------------------- /Lab 7/argument_parser.py: -------------------------------------------------------------------------------- 1 | from argparse import ArgumentParser, ArgumentTypeError, Namespace 2 | 3 | 4 | def check_model_type(input_value: str) -> str: 5 | """ 6 | Check whether the model is dcgan, sagan or glow 7 | :param input_value: input string value 8 | :return: upper-cased model name 9 | """ 10 | if input_value != 'dcgan' and input_value != 'sagan' and input_value != 'glow': 11 | raise ArgumentTypeError('Model should be "dcgan", "sagan", or "glow"') 12 | return input_value.upper() 13 | 14 | 15 | def parse_arguments() -> Namespace: 16 | """ 17 | Parse arguments from command line 18 | :return: arguments 19 | """ 20 | parser = ArgumentParser(description='cGAN & cNF') 21 | parser.add_argument('-b', '--batch_size', default=64, type=int, help='Batch size') 22 | parser.add_argument('-i', '--image_size', default=64, type=int, help='Image size') 23 | parser.add_argument('-w', '--width', default=128, type=int, 24 | help='Dimension of the hidden layers in normalizing flow') 25 | parser.add_argument('-d', '--depth', default=8, type=int, help='Depth of the normalizing flow') 26 | parser.add_argument('-n', '--num_levels', default=3, type=int, help='Number of levels in normalizing flow') 27 | parser.add_argument('-gv', '--grad_value_clip', default=0, type=float, help='Clip gradients at specific value') 28 | parser.add_argument('-gn', '--grad_norm_clip', default=0, type=float, help="Clip gradients' norm at specific value") 29 | parser.add_argument('-lrd', '--learning_rate_discriminator', default=0.0001, type=float, 30 | help='Learning rate of discriminator') 31 | parser.add_argument('-lrg', '--learning_rate_generator', default=0.0004, type=float, 32 | help='Learning rate of generator') 33 | parser.add_argument('-lrnf', '--learning_rate_normalizing_flow', default=0.0005, type=float, 34 | help='Learning rate of normalizing flow') 35 | parser.add_argument('-e', '--epochs', default=300, type=int, help='Number of epochs') 36 | parser.add_argument('-wu', '--warmup',
default=10, type=int, help='Number of warmup epochs') 37 | parser.add_argument('-t', '--task', default=1, type=int, choices=[1, 2], help='Task 1 or task 2') 38 | parser.add_argument('-m', '--model', default='dcgan', type=check_model_type, help='cGAN or cNF') 39 | parser.add_argument('-inf', '--inference', action='store_true', help='Only infer or not') 40 | parser.add_argument('-v', '--verbosity', default=0, type=int, choices=[0, 1, 2], help='Verbosity level') 41 | 42 | return parser.parse_args() 43 | -------------------------------------------------------------------------------- /Lab 7/dcgan.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import torch 3 | 4 | 5 | # DCGAN is designed for image size 64 6 | 7 | def weights_init(model: nn.Module) -> None: 8 | """ 9 | Initialize weights in convolution and batch norm layers 10 | :param model: Model 11 | :return: None 12 | """ 13 | classname = model.__class__.__name__ 14 | if classname.find('Conv') != -1: 15 | nn.init.normal_(model.weight.data, 0.0, 0.02) 16 | elif classname.find('BatchNorm') != -1: 17 | nn.init.normal_(model.weight.data, 1.0, 0.02) 18 | nn.init.constant_(model.bias.data, 0) 19 | 20 | 21 | class DCGenerator(nn.Module): 22 | def __init__(self, noise_size: int, label_size: int): 23 | super(DCGenerator, self).__init__() 24 | 25 | self.net = nn.Sequential( 26 | nn.ConvTranspose2d(in_channels=noise_size + label_size, 27 | out_channels=512, 28 | kernel_size=4, 29 | stride=1, 30 | padding=0, 31 | bias=False), 32 | nn.BatchNorm2d(num_features=512), 33 | nn.ReLU(True), 34 | 35 | nn.ConvTranspose2d(in_channels=512, 36 | out_channels=256, 37 | kernel_size=4, 38 | stride=2, 39 | padding=1, 40 | bias=False), 41 | nn.BatchNorm2d(num_features=256), 42 | nn.ReLU(True), 43 | 44 | nn.ConvTranspose2d(in_channels=256, 45 | out_channels=128, 46 | kernel_size=4, 47 | stride=2, 48 | padding=1, 49 | bias=False), 50 | nn.BatchNorm2d(num_features=128), 51 | nn.ReLU(True), 52 | 53 | nn.ConvTranspose2d(in_channels=128, 54 | out_channels=64, 55 | kernel_size=4, 56 | stride=2, 57 | padding=1, 58 | bias=False), 59 | nn.BatchNorm2d(num_features=64), 60 | nn.ReLU(True), 61 | 62 | nn.ConvTranspose2d(in_channels=64, 63 | out_channels=3, 64 | kernel_size=4, 65 | stride=2, 66 | padding=1, 67 | bias=False), 68 | nn.Tanh() 69 | ) 70 | 71 | def forward(self, x: torch.Tensor) -> torch.Tensor: 72 | """ 73 | Generator forwarding 74 | :param x: Batched data 75 | :return: Batched image 76 | """ 77 | return self.net(x) 78 | 79 | 80 | class DCDiscriminator(nn.Module): 81 | def __init__(self, num_classes: int, image_size: int): 82 | super(DCDiscriminator, self).__init__() 83 | 84 | self.net = nn.Sequential( 85 | nn.Conv2d(in_channels=4, 86 | out_channels=64, 87 | kernel_size=4, 88 | stride=2, 89 | padding=1, 90 | bias=False), 91 | nn.LeakyReLU(negative_slope=0.2, inplace=True), 92 | 93 | nn.Conv2d(in_channels=64, 94 | out_channels=128, 95 | kernel_size=4, 96 | stride=2, 97 | padding=1, 98 | bias=False), 99 | nn.BatchNorm2d(num_features=128), 100 | nn.LeakyReLU(negative_slope=0.2, inplace=True), 101 | 102 | nn.Conv2d(in_channels=128, 103 | out_channels=256, 104 | kernel_size=4, 105 | stride=2, 106 | padding=1, 107 | bias=False), 108 | nn.BatchNorm2d(num_features=256), 109 | nn.LeakyReLU(negative_slope=0.2, inplace=True), 110 | 111 | nn.Conv2d(in_channels=256, 112 | out_channels=512, 113 | kernel_size=4, 114 | stride=2, 115 | padding=1, 116 | bias=False), 117 | nn.BatchNorm2d(num_features=512), 118 | 
nn.LeakyReLU(negative_slope=0.2, inplace=True), 119 | 120 | nn.Conv2d(in_channels=512, 121 | out_channels=1, 122 | kernel_size=4, 123 | stride=1, 124 | padding=0, 125 | bias=False) 126 | ) 127 | 128 | self.label_to_condition = nn.Sequential( 129 | nn.ConvTranspose2d(in_channels=num_classes, 130 | out_channels=16, 131 | kernel_size=4, 132 | stride=1, 133 | padding=0, 134 | bias=False), 135 | nn.BatchNorm2d(num_features=16), 136 | nn.ReLU(True), 137 | 138 | nn.ConvTranspose2d(in_channels=16, 139 | out_channels=4, 140 | kernel_size=4, 141 | stride=2, 142 | padding=1, 143 | bias=False), 144 | nn.BatchNorm2d(num_features=4), 145 | nn.ReLU(True), 146 | 147 | nn.ConvTranspose2d(in_channels=4, 148 | out_channels=1, 149 | kernel_size=4, 150 | stride=2, 151 | padding=1, 152 | bias=False), 153 | nn.BatchNorm2d(num_features=1), 154 | nn.ReLU(True), 155 | ) 156 | self.linear = nn.Sequential( 157 | nn.Linear(in_features=16 * 16, 158 | out_features=32 * 32, 159 | bias=False), 160 | nn.ReLU(True), 161 | 162 | nn.Linear(in_features=32 * 32, 163 | out_features=image_size * image_size, 164 | bias=False), 165 | nn.Tanh() 166 | ) 167 | self.image_size = image_size 168 | 169 | def forward(self, x: torch.Tensor, label: torch.Tensor) -> torch.Tensor: 170 | """ 171 | Discriminator forwarding 172 | :param x: Batched data 173 | :param label: Batched labels 174 | :return: Discrimination results 175 | """ 176 | batch_size, num_classes = label.size() 177 | label = label.view(batch_size, num_classes, 1, 1) 178 | condition = self.label_to_condition(label).view(batch_size, 1, -1) 179 | condition = self.linear(condition).view(-1, 1, self.image_size, self.image_size) 180 | inputs = torch.cat([x, condition], 1) 181 | return self.net(inputs).view(-1, 1) 182 | -------------------------------------------------------------------------------- /Lab 7/evaluator.py: -------------------------------------------------------------------------------- 1 | from torch import device 2 | import torch 3 | import torch.nn as nn 4 | import torchvision.models as models 5 | 6 | '''=============================================================== 7 | 1. Title: 8 | 9 | DLP spring 2021 Lab7 classifier 10 | 11 | 2. Purpose: 12 | 13 | For computing the classification accuracy. 14 | 15 | 3. Details: 16 | 17 | The model is based on ResNet18 with only changing the 18 | last linear layer. The model is trained on iclevr dataset 19 | with 1 to 5 objects and the resolution is the up-sampled 20 | 64x64 images from 32x32 images. 21 | 22 | It will capture the top k highest accuracy indexes on generated 23 | images and compare them with ground truth labels. 24 | 25 | 4. How to use 26 | 27 | You should call eval(images, labels) and to get total accuracy. 28 | images shape: (batch_size, 3, 64, 64) 29 | labels shape: (batch_size, 24) where labels are one-hot vectors 30 | e.g. [[1,1,0,...,0],[0,1,1,0,...],...] 
31 | 32 | ===============================================================''' 33 | 34 | 35 | class EvaluationModel: 36 | def __init__(self, training_device: device): 37 | checkpoint = torch.load('data/task_1/classifier_weight.pth', map_location=training_device) 38 | self.resnet18 = models.resnet18(pretrained=False) 39 | self.resnet18.fc = nn.Sequential( 40 | nn.Linear(512, 24), 41 | nn.Sigmoid() 42 | ) 43 | self.resnet18.load_state_dict(checkpoint['model']) 44 | self.resnet18 = self.resnet18.to(training_device) 45 | self.resnet18.eval() 46 | self.class_num = 24 47 | 48 | @staticmethod 49 | def compute_acc(out: torch.Tensor, one_hot_labels: torch.Tensor): 50 | """ 51 | Compute accuracy for one_hot_labels based on out 52 | :param out: output from ResNet18 53 | :param one_hot_labels: one_hot_labels from generator 54 | :return: accuracy 55 | """ 56 | batch_size = out.size(0) 57 | acc = 0 58 | total = 0 59 | for i in range(batch_size): 60 | k = int(one_hot_labels[i].sum().item()) 61 | total += k 62 | out_v, out_i = out[i].topk(k) 63 | lv, li = one_hot_labels[i].topk(k) 64 | for j in out_i: 65 | if j in li: 66 | acc += 1 67 | return acc / total 68 | 69 | def eval(self, images: torch.Tensor, labels: torch.Tensor): 70 | """ 71 | Evaluate labels generated from the generator 72 | :param images: images from generator 73 | :param labels: labels from generator 74 | :return: accuracy 75 | """ 76 | with torch.no_grad(): 77 | # Your image shape should be (batch, 3, 64, 64) 78 | out = self.resnet18(images) 79 | acc = self.compute_acc(out.cpu(), labels.cpu()) 80 | return acc 81 | -------------------------------------------------------------------------------- /Lab 7/glow.py: -------------------------------------------------------------------------------- 1 | from typing import Tuple, Optional, List 2 | import torch.nn as nn 3 | import torch.nn.functional as func 4 | import numpy as np 5 | import torch 6 | 7 | 8 | class ActNorm(nn.Module): 9 | """ 10 | Activation normalization for 2D inputs. 11 | The bias and scale get initialized using the mean and variance of the first mini-batch. 12 | After the init, bias and scale are trainable parameters. 
13 | Adapted from: https://github.com/openai/glow 14 | :arg in_channels: Number of channels in the input 15 | :arg scale: Scale factor for initial logs 16 | """ 17 | 18 | def __init__(self, in_channels: int, scale: float = 1.): 19 | super(ActNorm, self).__init__() 20 | 21 | self.bias = nn.Parameter(torch.zeros(1, in_channels, 1, 1)) 22 | self.logs = nn.Parameter(torch.zeros(1, in_channels, 1, 1)) 23 | 24 | self.num_features = in_channels 25 | self.scale = float(scale) 26 | self.eps = 1e-6 27 | self.is_initialized = False 28 | 29 | def initialize_parameters(self, x: torch.Tensor) -> None: 30 | """ 31 | Initialize bias and logs 32 | :param x: First mini-batch 33 | :return: None 34 | """ 35 | if not self.training: 36 | return 37 | 38 | with torch.no_grad(): 39 | bias = -torch.mean(x.clone(), dim=[0, 2, 3], keepdim=True) 40 | v = torch.mean((x.clone() + bias) ** 2, dim=[0, 2, 3], keepdim=True) 41 | logs = (self.scale / (v.sqrt() + self.eps)).log() 42 | self.bias.data.copy_(bias.data) 43 | self.logs.data.copy_(logs.data) 44 | self.is_initialized = True 45 | 46 | def _center(self, x: torch.Tensor, reverse: bool = False) -> torch.Tensor: 47 | """ 48 | Translate the data 49 | :param x: Batched data 50 | :param reverse: Reverse or not 51 | :return: Translated data 52 | """ 53 | if not reverse: 54 | return x + self.bias 55 | else: 56 | return x - self.bias 57 | 58 | def _scale(self, x: torch.Tensor, sld: torch.Tensor, reverse: bool = False) -> Tuple[torch.Tensor, ...]: 59 | """ 60 | Scale the data 61 | :param x: Batched data 62 | :param sld: Sum of log-determinant 63 | :param reverse: Reverse or not 64 | :return: Scaled data and sum of log-determinant 65 | """ 66 | if not reverse: 67 | x *= self.logs.exp() 68 | else: 69 | x *= (-self.logs).exp() 70 | 71 | if sld is not None: 72 | ld = self.logs.sum() * x.size(2) * x.size(3) 73 | if not reverse: 74 | sld += ld 75 | else: 76 | sld -= ld 77 | 78 | return x, sld 79 | 80 | def forward(self, x: torch.Tensor, sld: torch.Tensor = None, reverse: bool = False) -> Tuple[torch.Tensor, ...]: 81 | """ 82 | Actnorm forwarding 83 | :param x: Batched data 84 | :param sld: Sum of log-determinant 85 | :param reverse: Reverse or not 86 | :return: Scaled and translated data & sum of log-determinant 87 | """ 88 | if not self.is_initialized: 89 | self.initialize_parameters(x=x) 90 | 91 | if not reverse: 92 | x = self._center(x=x, reverse=False) 93 | x, sld = self._scale(x=x, sld=sld, reverse=False) 94 | else: 95 | x, sld = self._scale(x=x, sld=sld, reverse=True) 96 | x = self._center(x=x, reverse=True) 97 | 98 | return x, sld 99 | 100 | 101 | class InvConv(nn.Module): 102 | """ 103 | Invertible 1x1 Convolution for 2D inputs. 104 | Originally described in Glow (https://arxiv.org/abs/1807.03039). 
105 | :arg num_channels: Number of channels in the input and output 106 | """ 107 | 108 | def __init__(self, num_channels: int): 109 | super(InvConv, self).__init__() 110 | 111 | self.num_channels = num_channels 112 | 113 | # Initialize with a random orthogonal matrix 114 | w_init = torch.qr(torch.randn(num_channels, num_channels))[0] 115 | p, lower, upper = torch.lu_unpack(*torch.lu(w_init)) 116 | s = torch.diag(upper) 117 | sign_s = torch.sign(s) 118 | log_s = torch.log(torch.abs(s)) 119 | upper = torch.triu(upper, 1) 120 | l_mask = torch.tril(torch.ones(num_channels, num_channels), -1) 121 | eye = torch.eye(num_channels, num_channels) 122 | 123 | self.register_buffer("p", p) 124 | self.register_buffer("sign_s", sign_s) 125 | self.lower = nn.Parameter(lower) 126 | self.log_s = nn.Parameter(log_s) 127 | self.upper = nn.Parameter(upper) 128 | self.l_mask = l_mask 129 | self.eye = eye 130 | 131 | def forward(self, x: torch.Tensor, sld: torch.tensor, reverse: bool = False) -> Tuple[torch.Tensor, ...]: 132 | """ 133 | Invertible 1x1 convolution forwarding 134 | :param x: Batched data 135 | :param sld: Sum of log-determinant 136 | :param reverse: Reverse or not 137 | :return: Transformed data and sum of log-determinant 138 | """ 139 | self.l_mask = self.l_mask.to(x.device) 140 | self.eye = self.eye.to(x.device) 141 | 142 | lower = self.lower * self.l_mask + self.eye 143 | 144 | u = self.upper * self.l_mask.transpose(0, 1).contiguous() 145 | u += torch.diag(self.sign_s * torch.exp(self.log_s)) 146 | 147 | ld = self.log_s.sum() * x.size(2) * x.size(3) 148 | 149 | if not reverse: 150 | weight = torch.matmul(self.p, torch.matmul(lower, u)).view(self.num_channels, self.num_channels, 1, 1) 151 | sld += ld 152 | else: 153 | u_inv = torch.inverse(u) 154 | l_inv = torch.inverse(lower) 155 | p_inv = torch.inverse(self.p) 156 | 157 | weight = torch.matmul(u_inv, torch.matmul(l_inv, p_inv)).view(self.num_channels, self.num_channels, 1, 1) 158 | sld -= ld 159 | 160 | z = func.conv2d(input=x, weight=weight) 161 | 162 | return z, sld 163 | 164 | 165 | def compute_same_pad(kernel_size: Optional[int or List[int]], stride: Optional[int or List[int]]) -> List[int]: 166 | """ 167 | Compute paddings 168 | :param kernel_size: Kernel size 169 | :param stride: Stride 170 | :return: Paddings 171 | """ 172 | if isinstance(kernel_size, int): 173 | kernel_size = [kernel_size] 174 | 175 | if isinstance(stride, int): 176 | stride = [stride] 177 | 178 | assert len(stride) == len( 179 | kernel_size 180 | ), "Pass kernel size and stride both as int, or both as equal length iterable" 181 | 182 | return [((k - 1) * s + 1) // 2 for k, s in zip(kernel_size, stride)] 183 | 184 | 185 | class Conv2d(nn.Module): 186 | """ 187 | Conv2d with actnorm 188 | :arg in_channels: Input channels 189 | :arg out_channels: Output channels 190 | :arg kernel_size: Kernel size 191 | :arg stride: Stride 192 | :arg padding: Padding 193 | :arg do_actnorm: Whether use actnorm 194 | :arg weight_std: Weight standard deviation 195 | """ 196 | 197 | def __init__( 198 | self, 199 | in_channels: int, 200 | out_channels: int, 201 | kernel_size: Optional[int or Tuple[int, int]] = (3, 3), 202 | stride: Optional[int or Tuple[int, int]] = (1, 1), 203 | padding: str = "same", 204 | do_actnorm: bool = True, 205 | weight_std: float = 0.05, 206 | ): 207 | super().__init__() 208 | 209 | if padding == "same": 210 | padding = compute_same_pad(kernel_size=kernel_size, stride=stride) 211 | elif padding == "valid": 212 | padding = 0 213 | 214 | self.conv = nn.Conv2d( 215 | 
in_channels=in_channels, 216 | out_channels=out_channels, 217 | kernel_size=kernel_size, 218 | stride=stride, 219 | padding=padding, 220 | bias=(not do_actnorm), 221 | ) 222 | 223 | # init weight with std 224 | self.conv.weight.data.normal_(mean=0.0, std=weight_std) 225 | 226 | if not do_actnorm: 227 | self.conv.bias.data.zero_() 228 | else: 229 | self.actnorm = ActNorm(in_channels=out_channels) 230 | 231 | self.do_actnorm = do_actnorm 232 | 233 | def forward(self, x: torch.Tensor) -> torch.Tensor: 234 | """ 235 | Forwarding 236 | :param x: Batched data 237 | :return: Batched data 238 | """ 239 | output = self.conv(x) 240 | if self.do_actnorm: 241 | output, _ = self.actnorm.forward(x=output) 242 | return output 243 | 244 | 245 | class Conv2dZeros(nn.Module): 246 | """ 247 | Conv2d with zero initial weight and bias 248 | :arg in_channels: Input channels 249 | :arg out_channels: Output channels 250 | :arg kernel_size: Kernel size 251 | :arg stride: Stride 252 | :arg padding: Padding 253 | :arg logscale_factor: Log scale factor 254 | """ 255 | 256 | def __init__( 257 | self, 258 | in_channels: int, 259 | out_channels: int, 260 | kernel_size: Optional[int or Tuple[int, int]] = (3, 3), 261 | stride: Optional[int or Tuple[int, int]] = (1, 1), 262 | padding: str = "same", 263 | logscale_factor: int = 3, 264 | ): 265 | super().__init__() 266 | 267 | if padding == "same": 268 | padding = compute_same_pad(kernel_size=kernel_size, stride=stride) 269 | elif padding == "valid": 270 | padding = 0 271 | 272 | self.conv = nn.Conv2d( 273 | in_channels=in_channels, 274 | out_channels=out_channels, 275 | kernel_size=kernel_size, 276 | stride=stride, 277 | padding=padding) 278 | 279 | self.conv.weight.data.zero_() 280 | self.conv.bias.data.zero_() 281 | 282 | self.logscale_factor = logscale_factor 283 | self.logs = nn.Parameter(torch.zeros(out_channels, 1, 1)) 284 | 285 | def forward(self, x: torch.Tensor) -> torch.Tensor: 286 | """ 287 | Forwarding 288 | :param x: Batched data 289 | :return: Batched data 290 | """ 291 | output = self.conv(x) 292 | return output * torch.exp(self.logs * self.logscale_factor) 293 | 294 | 295 | class LinearZeros(nn.Module): 296 | """ 297 | Linear with zero initial weight and bias 298 | :arg in_channels: Input features 299 | :arg out_channels: Output features 300 | :arg logscale_factor: Log scale factor 301 | """ 302 | 303 | def __init__(self, in_channels: int, out_channels: int, logscale_factor: int = 3): 304 | super().__init__() 305 | 306 | self.linear = nn.Linear(in_features=in_channels, out_features=out_channels) 307 | self.linear.weight.data.zero_() 308 | self.linear.bias.data.zero_() 309 | 310 | self.logscale_factor = logscale_factor 311 | 312 | self.logs = nn.Parameter(torch.zeros(out_channels)) 313 | 314 | def forward(self, x: torch.Tensor) -> torch.Tensor: 315 | """ 316 | Forwarding 317 | :param x: Batched data 318 | :return: Batched data 319 | """ 320 | output = self.linear(x) 321 | return output * torch.exp(self.logs * self.logscale_factor) 322 | 323 | 324 | class Coupling(nn.Module): 325 | """ 326 | Affine coupling layer 327 | :arg in_channels: Number of channels in the input 328 | :arg mid_channels: Number of channels in the intermediate activation in NN 329 | """ 330 | 331 | def __init__(self, in_channels: int, cond_channels: int, mid_channels: int): 332 | super(Coupling, self).__init__() 333 | self.nn = NN(in_channels=in_channels, 334 | cond_channels=cond_channels, 335 | mid_channels=mid_channels, 336 | out_channels=2 * in_channels) 337 | self.scale = 
nn.Parameter(torch.ones(in_channels, 1, 1)) 338 | 339 | def forward(self, 340 | x: torch.Tensor, 341 | x_cond: torch.Tensor, 342 | sld: torch.Tensor, 343 | reverse: bool = False) -> Tuple[torch.Tensor, ...]: 344 | """ 345 | Affine coupling forwarding 346 | :param x: Batched data 347 | :param x_cond: Batched conditions 348 | :param sld: Sum of log-determinant 349 | :param reverse: Reverse or not 350 | :return: Affine coupled data and sum of log-determinant 351 | """ 352 | x_id, x_change = x.chunk(2, dim=1) 353 | 354 | scale_and_translate = self.nn.forward(x=x_id, x_cond=x_cond) 355 | scale, translate = scale_and_translate[:, 0::2, ...], scale_and_translate[:, 1::2, ...] 356 | scale = self.scale * torch.tanh(scale) 357 | 358 | # Scale and translate 359 | ld = scale.flatten(1).sum(-1) 360 | if not reverse: 361 | x_change = scale.exp() * x_change + translate 362 | sld += ld 363 | else: 364 | x_change = (x_change - translate) * scale.mul(-1).exp() 365 | sld -= ld 366 | 367 | x = torch.cat((x_id, x_change), dim=1) 368 | 369 | return x, sld 370 | 371 | 372 | class NN(nn.Module): 373 | """ 374 | Small convolutional network used to compute scale and translate factors. 375 | :arg in_channels: Number of channels in the input 376 | :arg cond_channels: Number of channels in the condition 377 | :arg mid_channels: Number of channels in the hidden activations 378 | :arg out_channels: Number of channels in the output 379 | """ 380 | 381 | def __init__(self, in_channels: int, cond_channels: int, mid_channels: int, out_channels: int): 382 | super(NN, self).__init__() 383 | 384 | self.in_conv = Conv2d(in_channels=in_channels, 385 | out_channels=mid_channels) 386 | self.in_cond_conv = Conv2d(in_channels=cond_channels, 387 | out_channels=in_channels) 388 | 389 | self.mid_conv = Conv2d(in_channels=mid_channels, 390 | out_channels=mid_channels, 391 | kernel_size=(1, 1)) 392 | self.mid_cond_conv = Conv2d(in_channels=cond_channels, 393 | out_channels=mid_channels, 394 | kernel_size=(1, 1)) 395 | 396 | self.out_conv = Conv2dZeros(in_channels=mid_channels, 397 | out_channels=out_channels) 398 | 399 | def forward(self, x: torch.Tensor, x_cond: torch.Tensor) -> torch.Tensor: 400 | """ 401 | Compute scale and translate from batched data 402 | :param x: Batched data 403 | :param x_cond: Batched conditions 404 | :return: Scale and translate as one tensor 405 | """ 406 | x = self.in_conv(x + self.in_cond_conv(x_cond)) 407 | x = func.relu(x) 408 | 409 | x = self.mid_conv(x + self.mid_cond_conv(x_cond)) 410 | x = func.relu(x) 411 | 412 | return self.out_conv(x) 413 | 414 | 415 | class FlowStep(nn.Module): 416 | """ 417 | Single flow step 418 | Forward: ActNorm -> InvConv -> Coupling 419 | Reverse: Coupling -> InvConv -> ActNorm 420 | :arg in_channels: Number of channels in the input 421 | :arg cond_channels: Number of channels in the condition 422 | :arg mid_channels: Number of hidden channels in the coupling layer 423 | """ 424 | 425 | def __init__(self, in_channels: int, cond_channels: int, mid_channels: int): 426 | super(FlowStep, self).__init__() 427 | 428 | # Activation normalization, invertible 1x1 convolution, affine coupling 429 | self.norm = ActNorm(in_channels=in_channels) 430 | self.conv = InvConv(num_channels=in_channels) 431 | self.coup = Coupling(in_channels=in_channels // 2, 432 | cond_channels=cond_channels, 433 | mid_channels=mid_channels) 434 | 435 | def forward(self, 436 | x: torch.Tensor, 437 | x_cond: torch.Tensor, 438 | sld: torch.Tensor = None, 439 | reverse: bool = False) -> Tuple[torch.Tensor, ...]: 
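# Channel bookkeeping for the step above (descriptive note, not repo code):
# Coupling is constructed with in_channels // 2 because Coupling.forward()
# chunks the full input into (x_id, x_change) halves, and NN maps x_id to
# 2 * (in_channels // 2) output channels that are de-interleaved (even
# channels -> scale, odd channels -> translate), one pair per changed channel.
# Each sub-module also threads the running sum of log-determinants (sld)
# through: it is added in the forward direction and subtracted in reverse,
# so a forward pass followed by a reverse pass returns sld to its start.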
440 | """ 441 | Single flow step 442 | Forward: ActNorm -> InvConv -> Coupling 443 | Reverse: Coupling -> InvConv -> ActNorm 444 | :param x: Batched data 445 | :param x_cond: Batched conditions 446 | :param sld: Sum of log-determinant 447 | :param reverse: Reverse or not 448 | :return: Batched data and sum of log-determinant 449 | """ 450 | if not reverse: 451 | # Normal flow 452 | x, sld = self.norm.forward(x=x, sld=sld, reverse=False) 453 | x, sld = self.conv.forward(x=x, sld=sld, reverse=False) 454 | x, sld = self.coup.forward(x=x, x_cond=x_cond, sld=sld, reverse=False) 455 | else: 456 | # Reverse flow 457 | x, sld = self.coup.forward(x=x, x_cond=x_cond, sld=sld, reverse=True) 458 | x, sld = self.conv.forward(x=x, sld=sld, reverse=True) 459 | x, sld = self.norm.forward(x=x, sld=sld, reverse=True) 460 | 461 | return x, sld 462 | 463 | 464 | class CGlow(nn.Module): 465 | """ 466 | Conditional Glow model 467 | :arg num_channels: Number of channels in the hidden layers 468 | :arg num_levels: Number of levels in the model (number of _CGlow classes) 469 | :arg num_steps: Number of flow steps in each level 470 | :arg num_classes: Number of classes in the condition 471 | :arg image_size: Image size 472 | """ 473 | 474 | def __init__(self, num_channels: int, num_levels: int, num_steps: int, num_classes: int, image_size: int): 475 | super(CGlow, self).__init__() 476 | 477 | # Use bounds to rescale images before converting to logits, not learned 478 | self.register_buffer('bounds', torch.tensor([0.95], dtype=torch.float32)) 479 | 480 | self.squeeze = Squeeze2d() 481 | self.flows = _CGlow(in_channels=3 * 4, 482 | cond_channels=1 * 4, 483 | mid_channels=num_channels, 484 | num_levels=num_levels, 485 | num_steps=num_steps) 486 | 487 | self.learn_top_fn = Conv2dZeros(in_channels=3 * 2, 488 | out_channels=3 * 2) 489 | self.register_buffer( 490 | 'prior_h', 491 | torch.zeros(1, 3 * 2, image_size, image_size), 492 | ) 493 | 494 | # Project label to condition 495 | self.project_label = LinearZeros(in_channels=num_classes, 496 | out_channels=3 * 2) 497 | 498 | # Project latent code to label 499 | self.project_latent = LinearZeros(in_channels=3, 500 | out_channels=num_classes) 501 | 502 | self.num_classes = num_classes 503 | self.image_size = image_size 504 | 505 | self.label_to_condition = nn.Sequential( 506 | nn.ConvTranspose2d(in_channels=num_classes, 507 | out_channels=16, 508 | kernel_size=4, 509 | stride=1, 510 | padding=0, 511 | bias=False), 512 | nn.BatchNorm2d(num_features=16), 513 | nn.ReLU(inplace=True), 514 | 515 | nn.ConvTranspose2d(in_channels=16, 516 | out_channels=4, 517 | kernel_size=4, 518 | stride=2, 519 | padding=1, 520 | bias=False), 521 | nn.BatchNorm2d(num_features=4), 522 | nn.ReLU(inplace=True), 523 | 524 | nn.ConvTranspose2d(in_channels=4, 525 | out_channels=1, 526 | kernel_size=4, 527 | stride=2, 528 | padding=1, 529 | bias=False), 530 | nn.BatchNorm2d(num_features=1), 531 | nn.ReLU(inplace=True), 532 | ) 533 | self.linear = nn.Sequential( 534 | nn.Linear(in_features=16 * 16, 535 | out_features=32 * 32, 536 | bias=False), 537 | nn.ReLU(inplace=True), 538 | 539 | nn.Linear(in_features=32 * 32, 540 | out_features=image_size * image_size, 541 | bias=False), 542 | nn.Sigmoid() 543 | ) 544 | 545 | def forward(self, 546 | x_label: torch.Tensor, 547 | x: torch.Tensor = None, 548 | reverse: bool = False) -> Optional[Tuple[torch.Tensor, ...] 
]: 549 | """ 550 | CGlow forwarding 551 | :param x_label: Batched labels 552 | :param x: Batched data; None in reverse mode means sampling from the label-conditioned prior 553 | :param reverse: Reverse or not 554 | :return: Batched data, negative log-likelihood in bits per dimension, and label probabilities 555 | """ 556 | if not reverse: 557 | x, sld = self._pre_process(x) 558 | else: 559 | if x is None: 560 | x = torch.zeros(x_label.size(0), device=x_label.device) # dummy tensor; only its batch size is used by get_mean_and_logs 561 | mean, logs = self.get_mean_and_logs(data=x, label=x_label) 562 | x = GaussianDiag.sample(mean, logs) 563 | sld = torch.zeros(x.size(0), device=x.device) 564 | 565 | x_cond = self.label_to_condition(x_label.view(-1, self.num_classes, 1, 1)) 566 | x_cond = self.linear(x_cond.view(-1, 1, x_cond.size(2) * x_cond.size(3))) 567 | x_cond = x_cond.view(-1, 1, self.image_size, self.image_size) 568 | 569 | x = self.squeeze.forward(x=x, reverse=False) 570 | x_cond = self.squeeze.forward(x=x_cond, reverse=False) 571 | if not reverse: 572 | x, sld = self.flows.forward(x=x, x_cond=x_cond, sld=sld, reverse=False) 573 | else: 574 | with torch.no_grad(): 575 | x, sld = self.flows.forward(x=x, x_cond=x_cond, sld=sld, reverse=True) 576 | x = self.squeeze.forward(x=x, reverse=True) 577 | 578 | mean, logs = self.get_mean_and_logs(data=x, label=x_label) 579 | sld += GaussianDiag.log_prob(mean=mean, logs=logs, x=x) 580 | nll = (-sld) / float(np.log(2.0) * x.size(1) * x.size(2) * x.size(3)) # negative log-likelihood in bits per dimension 581 | 582 | label_logits = self.project_latent(x.mean(2).mean(2)).view(-1, self.num_classes) 583 | label_logits = torch.sigmoid(label_logits) # already probabilities, not raw logits 584 | 585 | if reverse: 586 | x = torch.sigmoid(x) 587 | 588 | return x, nll, label_logits 589 | 590 | def get_mean_and_logs(self, data: torch.Tensor, label: torch.Tensor) -> Tuple[torch.Tensor, ...]: 591 | """ 592 | Get mean and logs from label 593 | :param data: Batched data 594 | :param label: Batched labels 595 | :return: Mean and logs 596 | """ 597 | h = self.prior_h.repeat(data.shape[0], 1, 1, 1) 598 | channels = h.size(1) 599 | h = self.learn_top_fn(h) 600 | h += self.project_label(label).view(h.shape[0], channels, 1, 1) 601 | return h[:, : channels // 2, ...], h[:, channels // 2:, ...] 602 | 603 | def _pre_process(self, x: torch.Tensor) -> Tuple[torch.Tensor, ...]: 604 | """ 605 | Dequantize x to (255 * x + U(0, 1)) / 256, then apply the rescaled logit transform 606 | :param x: Batched data 607 | :return: Preprocessed data and sum of log-determinant 608 | """ 609 | x = (x * 255. + torch.rand_like(x)) / 256. 610 | x = (2 * x - 1) * self.bounds 611 | x = (x + 1) / 2 612 | x = x.log() - (1. - x).log() 613 | 614 | # Save log-determinant of Jacobian of initial transform 615 | ld = func.softplus(x) + func.softplus(-x) - func.softplus((1. - self.bounds).log() - self.bounds.log()) 616 | sld = ld.flatten(1).sum(-1) 617 | 618 | return x, sld 619 | 620 | 621 | class _CGlow(nn.Module): 622 | """ 623 | Recursive constructor for a cGlow model. 624 | Each call creates a single level. 625 | :arg in_channels: Number of channels in the input 626 | :arg cond_channels: Number of channels in the condition 627 | :arg mid_channels: Number of channels in hidden layers of each step :arg num_levels: Number of levels to construct.
Counter for recursion 628 | :arg num_steps: Number of steps of flow for each level 629 | """ 630 | 631 | def __init__(self, in_channels: int, cond_channels: int, mid_channels: int, num_levels: int, num_steps: int): 632 | super(_CGlow, self).__init__() 633 | 634 | self.squeeze = Squeeze2d() 635 | self.steps = nn.ModuleList([FlowStep(in_channels=in_channels, 636 | cond_channels=cond_channels, 637 | mid_channels=mid_channels) 638 | for _ in range(num_steps)]) 639 | 640 | self.level = num_levels 641 | 642 | if num_levels > 1: 643 | self.next = _CGlow(in_channels=in_channels * 2, 644 | cond_channels=cond_channels * 4, 645 | mid_channels=mid_channels, 646 | num_levels=num_levels - 1, 647 | num_steps=num_steps) 648 | else: 649 | self.next = None 650 | 651 | def forward(self, 652 | x: torch.Tensor, 653 | x_cond: torch.Tensor, 654 | sld: torch.Tensor, 655 | reverse: bool = False) -> Tuple[torch.Tensor, ...]: 656 | """ 657 | Forwarding of each level 658 | :param x: Batched data 659 | :param x_cond: Batched conditions 660 | :param sld: Sum of log-determinant 661 | :param reverse: Reverse or not 662 | :return: Batched data and sum of log-determinant 663 | """ 664 | if not reverse: 665 | for step in self.steps: 666 | x, sld = step.forward(x=x, x_cond=x_cond, sld=sld, reverse=False) 667 | 668 | if self.next is not None: # squeeze, split off half the channels, recurse on one half, then merge back and un-squeeze 669 | x = self.squeeze.forward(x=x, reverse=False) 670 | x_cond = self.squeeze.forward(x=x_cond, reverse=False) 671 | x, x_split = x.chunk(2, dim=1) 672 | x, sld = self.next.forward(x=x, x_cond=x_cond, sld=sld, reverse=reverse) 673 | x = torch.cat((x, x_split), dim=1) 674 | x = self.squeeze.forward(x=x, reverse=True) 675 | x_cond = self.squeeze.forward(x=x_cond, reverse=True) 676 | 677 | if reverse: 678 | for step in reversed(self.steps): 679 | x, sld = step.forward(x=x, x_cond=x_cond, sld=sld, reverse=True) 680 | 681 | return x, sld 682 | 683 | 684 | class Squeeze2d(nn.Module): 685 | """ 686 | Trade spatial extent for channels. 687 | In forward direction, convert each 1x2x2 volume of input into a 4x1x1 volume of output. 688 | """ 689 | 690 | def __init__(self): 691 | super(Squeeze2d, self).__init__() 692 | 693 | def forward(self, x: torch.Tensor, reverse: bool = False) -> torch.Tensor: 694 | """ 695 | Squeeze forwarding 696 | :param x: Batched data 697 | :param reverse: Reverse or not 698 | :return: Squeezed/Un-squeezed data 699 | """ 700 | batch_size, channel_size, height, width = x.size() 701 | if not reverse: 702 | # Squeeze 703 | x = x.view(batch_size, channel_size, height // 2, 2, width // 2, 2) 704 | x = x.permute(0, 1, 3, 5, 2, 4).contiguous() 705 | x = x.view(batch_size, channel_size * 2 * 2, height // 2, width // 2) 706 | else: 707 | # Un-squeeze 708 | x = x.view(batch_size, channel_size // 4, 2, 2, height, width) 709 | x = x.permute(0, 1, 4, 2, 5, 3).contiguous() 710 | x = x.view(batch_size, channel_size // 4, height * 2, width * 2) 711 | 712 | return x 713 | 714 | 715 | class GaussianDiag: 716 | Log2PI = float(np.log(2 * np.pi)) 717 | 718 | @staticmethod 719 | def likelihood(mean: torch.Tensor, logs: torch.Tensor, x: torch.Tensor) -> torch.Tensor: 720 | """ 721 | lnL = -1/2 * { ln|Var| + ((X - Mu)^T)(Var^-1)(X - Mu) + k * ln(2*PI) } 722 | :param mean: Mean 723 | :param logs: Log std 724 | :param x: Batched data 725 | :return: Log-likelihood 726 | """ 727 | return -0.5 * (logs * 2. + ((x - mean) ** 2) / torch.exp(logs * 2.) 
+ GaussianDiag.Log2PI) 728 | 729 | @staticmethod 730 | def log_prob(mean: torch.Tensor, logs: torch.Tensor, x: torch.Tensor) -> torch.Tensor: 731 | """ 732 | Get log-likelihood for each batch 733 | :param mean: Mean 734 | :param logs: Log std 735 | :param x: Batched data 736 | :return: Batched log-likelihood 737 | """ 738 | likelihood = GaussianDiag.likelihood(mean=mean, logs=logs, x=x) 739 | return torch.sum(likelihood, dim=[1, 2, 3]) - np.log(256.0) * np.prod(x.size()[1:]) # discretization correction: ln(256) per dimension 740 | 741 | @staticmethod 742 | def sample(mean: torch.Tensor, logs: torch.Tensor) -> torch.Tensor: 743 | """ 744 | Sample data from Gaussian with mean and logs 745 | :param mean: Mean 746 | :param logs: Log std 747 | :return: Sampled data 748 | """ 749 | return torch.normal(mean=mean, std=torch.exp(logs)) 750 | 751 | 752 | class NLLLoss(nn.Module): 753 | """ 754 | Negative log-likelihood loss 755 | """ 756 | 757 | def __init__(self): 758 | super(NLLLoss, self).__init__() 759 | 760 | def forward(self, nll: torch.Tensor, label_logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor: 761 | """ 762 | Compute loss 763 | :param nll: Negative log-likelihood 764 | :param label_logits: Label probabilities (CGlow.forward already applies a sigmoid) 765 | :param labels: Labels 766 | :return: Loss 767 | """ 768 | return func.binary_cross_entropy(input=label_logits, target=labels.float()) * 0.01 + torch.mean(nll) # plain BCE, since the inputs are sigmoid outputs rather than raw logits 769 | -------------------------------------------------------------------------------- /Lab 7/main.py: -------------------------------------------------------------------------------- 1 | from dcgan import DCGenerator, DCDiscriminator, weights_init 2 | from sagan import SAGenerator, SADiscriminator 3 | from glow import CGlow, NLLLoss 4 | from task_1_dataset import ICLEVRLoader 5 | from task_2_dataset import CelebALoader 6 | from train import train_cgan, train_cnf 7 | from test import test_cgan, test_cnf, inference_celeb 8 | from evaluator import EvaluationModel 9 | from argument_parser import parse_arguments 10 | from visualizer import plot_losses, plot_accuracies 11 | from util import info_log, create_directories, get_score 12 | from torch.utils.data import DataLoader 13 | from torchvision.transforms import transforms 14 | from torchvision.utils import save_image, make_grid 15 | from torch import device, cuda 16 | from argparse import Namespace 17 | from math import inf 18 | import torch.optim as optim 19 | import os 20 | import torch 21 | 22 | 23 | def train_and_evaluate_cgan(train_loader: DataLoader, 24 | test_loader: DataLoader, 25 | new_test_loader: DataLoader, 26 | evaluator: EvaluationModel, 27 | num_classes: int, 28 | args: Namespace, 29 | training_device: device) -> None: 30 | """ 31 | Train and test cGAN 32 | :param train_loader: Training data loader 33 | :param test_loader: Testing data loader 34 | :param new_test_loader: New testing data loader 35 | :param evaluator: Evaluator 36 | :param num_classes: Number of classes (object IDs) 37 | :param args: All arguments 38 | :param training_device: Training device 39 | :return: None 40 | """ 41 | # Setup models 42 | info_log('Setup models ...', args.verbosity) 43 | 44 | if args.model == 'DCGAN': 45 | # DCGAN 46 | generator = DCGenerator(noise_size=args.image_size, 47 | label_size=num_classes).to(training_device) 48 | discriminator = DCDiscriminator(num_classes=num_classes, 49 | image_size=args.image_size).to(training_device) 50 | generator.apply(weights_init) 51 | discriminator.apply(weights_init) 52 | else: 53 | # Self Attention GAN 54 | generator = SAGenerator(noise_size=args.image_size, 55 | 
label_size=num_classes, 56 | conv_dim=args.image_size).to(training_device) 57 | discriminator = SADiscriminator(num_classes=num_classes, 58 | image_size=args.image_size, 59 | conv_dim=args.image_size).to(training_device) 60 | 61 | if os.path.exists(f'model/task_1/{args.model}.pt'): 62 | checkpoint = torch.load(f'model/task_1/{args.model}.pt') 63 | generator.load_state_dict(checkpoint['generator']) 64 | discriminator.load_state_dict(checkpoint['discriminator']) 65 | 66 | optimizer_g = optim.Adam(generator.parameters(), lr=args.learning_rate_generator, betas=(0.5, 0.999)) 67 | optimizer_d = optim.Adam(discriminator.parameters(), lr=args.learning_rate_discriminator, betas=(0.5, 0.999)) 68 | scheduler_g = torch.optim.lr_scheduler.LambdaLR(optimizer_g, lr_lambda=lambda e: min(1.0, (e + 1) / args.warmup)) 69 | scheduler_d = torch.optim.lr_scheduler.LambdaLR(optimizer_d, lr_lambda=lambda e: min(1.0, (e + 1) / args.warmup)) 70 | 71 | # Setup average losses/accuracies container 72 | generator_losses = [0.0 for _ in range(args.epochs)] 73 | discriminator_losses = [0.0 for _ in range(args.epochs)] 74 | accuracies = [0.0 for _ in range(args.epochs)] 75 | new_accuracies = [0.0 for _ in range(args.epochs)] 76 | 77 | if not args.inference: 78 | # Start training 79 | info_log('Start training', args.verbosity) 80 | max_score = 0.0 81 | for epoch in range(args.epochs): 82 | # Train 83 | total_g_loss, total_d_loss = train_cgan(data_loader=train_loader, 84 | generator=generator, 85 | discriminator=discriminator, 86 | optimizer_g=optimizer_g, 87 | optimizer_d=optimizer_d, 88 | scheduler_g=scheduler_g, 89 | scheduler_d=scheduler_d, 90 | num_classes=num_classes, 91 | epoch=epoch, 92 | args=args, 93 | training_device=training_device) 94 | generator_losses[epoch] = total_g_loss / len(train_loader) 95 | discriminator_losses[epoch] = total_d_loss / len(train_loader) 96 | print(f'[{epoch + 1}/{args.epochs}] Average generator loss: {generator_losses[epoch]}') 97 | print(f'[{epoch + 1}/{args.epochs}] Average discriminator loss: {discriminator_losses[epoch]}') 98 | 99 | # Test 100 | generated_image, total_accuracy = test_cgan(data_loader=test_loader, 101 | generator=generator, 102 | num_classes=num_classes, 103 | epoch=epoch, 104 | evaluator=evaluator, 105 | args=args, 106 | training_device=training_device) 107 | accuracies[epoch] = total_accuracy / len(test_loader) 108 | 109 | # New Test 110 | new_generated_image, total_accuracy = test_cgan(data_loader=new_test_loader, 111 | generator=generator, 112 | num_classes=num_classes, 113 | epoch=epoch, 114 | evaluator=evaluator, 115 | args=args, 116 | training_device=training_device) 117 | new_accuracies[epoch] = total_accuracy / len(new_test_loader) 118 | 119 | print(f'[{epoch + 1}/{args.epochs}] Average accuracy: {accuracies[epoch]:.2f}') 120 | print(f'[{epoch + 1}/{args.epochs}] New Average accuracy: {new_accuracies[epoch]:.2f}') 121 | 122 | # Save generator and discriminator, and plot test image 123 | score = get_score(accuracies[epoch], new_accuracies[epoch]) 124 | if score >= max_score: 125 | # Update 126 | max_score = score 127 | 128 | # Save images 129 | save_image(make_grid(generated_image, nrow=8), 130 | f'test_figure/{args.model}_{epoch}_{accuracies[epoch]:.2f}.jpg') 131 | save_image(make_grid(new_generated_image, nrow=8), 132 | f'test_figure/{args.model}_{epoch}_new_{new_accuracies[epoch]:.2f}.jpg') 133 | 134 | # Save model 135 | checkpoint = { 136 | 'generator': generator.state_dict(), 137 | 'discriminator': discriminator.state_dict() 138 | } 139 | 
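The LambdaLR schedulers created near the top of this function implement a linear learning-rate warmup. A minimal standalone sketch of that multiplier (not repo code; `warmup` is an assumed stand-in for `args.warmup`):

```python
import torch

# Warmup: scale the base lr by min(1, (epoch + 1) / warmup), i.e. a linear
# ramp over the first `warmup` epochs, then a constant factor of 1.0
opt = torch.optim.SGD([torch.zeros(1, requires_grad=True)], lr=1e-3)
warmup = 5
sched = torch.optim.lr_scheduler.LambdaLR(opt, lr_lambda=lambda e: min(1.0, (e + 1) / warmup))
for epoch in range(8):
    print(epoch, sched.get_last_lr())  # 0.0002, 0.0004, ..., 0.001, then stays at 0.001
    opt.step()
    sched.step()
```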
torch.save(checkpoint, 140 | f'model/task_1/{args.model}_{epoch}_{accuracies[epoch]:.4f}_new_{new_accuracies[epoch]:.4f}.pt') 141 | 142 | # Plot losses and accuracies 143 | info_log('Plot losses and accuracies ...', args.verbosity) 144 | plot_losses(losses=(generator_losses, discriminator_losses), labels=['Generator', 'Discriminator'], 145 | epoch=epoch, task='task_1', model=args.model) 146 | plot_accuracies(accuracies=(accuracies, new_accuracies), labels=['Test', 'New Test'], epoch=epoch, 147 | model=args.model) 148 | else: 149 | # Start inferring 150 | info_log('Start inferring', args.verbosity) 151 | generated_image, test_accuracy = test_cgan(data_loader=test_loader, 152 | generator=generator, 153 | num_classes=num_classes, 154 | epoch=0, 155 | evaluator=evaluator, 156 | args=args, 157 | training_device=training_device) 158 | test_accuracy /= len(test_loader) 159 | save_image(make_grid(generated_image, nrow=8), f'figure/task_1/{args.model}_{test_accuracy:.2f}.png') 160 | 161 | # New Test 162 | generated_image, new_test_accuracy = test_cgan(data_loader=new_test_loader, 163 | generator=generator, 164 | num_classes=num_classes, 165 | epoch=0, 166 | evaluator=evaluator, 167 | args=args, 168 | training_device=training_device) 169 | new_test_accuracy /= len(new_test_loader) 170 | save_image(make_grid(generated_image, nrow=8), f'figure/task_1/{args.model}_new_{new_test_accuracy:.2f}.png') 171 | 172 | print(f'Average accuracy: {test_accuracy:.2f}') 173 | print(f'New Average accuracy: {new_test_accuracy:.2f}') 174 | 175 | 176 | def train_and_evaluate_cnf(train_loader: DataLoader, 177 | test_loader: DataLoader, 178 | new_test_loader: DataLoader, 179 | evaluator: EvaluationModel, 180 | num_classes: int, 181 | args: Namespace, 182 | training_device: device) -> None: 183 | """ 184 | Train and test cNF 185 | :param train_loader: Training data loader 186 | :param test_loader: Testing data loader 187 | :param new_test_loader: New testing data loader 188 | :param evaluator: Evaluator 189 | :param num_classes: Number of different conditions 190 | :param args: All arguments 191 | :param training_device: Training device 192 | :return: None 193 | """ 194 | # Setup models 195 | info_log('Setup models ...', args.verbosity) 196 | 197 | normalizing_flow = CGlow(num_channels=args.width, 198 | num_levels=args.num_levels, 199 | num_steps=args.depth, 200 | num_classes=num_classes, 201 | image_size=args.image_size).to(training_device) 202 | if os.path.exists(f'model/task_1/{args.model}.pt'): 203 | checkpoint = torch.load(f'model/task_1/{args.model}.pt') 204 | normalizing_flow.load_state_dict(checkpoint['normalizing_flow']) 205 | 206 | optimizer = optim.Adamax(normalizing_flow.parameters(), lr=args.learning_rate_normalizing_flow, weight_decay=5e-5) 207 | scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda e: min(1.0, (e + 1) / args.warmup)) 208 | loss_fn = NLLLoss().to(training_device) 209 | 210 | # Setup average losses/accuracies container 211 | losses = [0.0 for _ in range(args.epochs)] 212 | accuracies = [0.0 for _ in range(args.epochs)] 213 | new_accuracies = [0.0 for _ in range(args.epochs)] 214 | 215 | if not args.inference: 216 | # Start training 217 | info_log('Start training', args.verbosity) 218 | max_score = 0.0 219 | for epoch in range(args.epochs): 220 | # Train 221 | total_loss = train_cnf(data_loader=train_loader, 222 | normalizing_flow=normalizing_flow, 223 | optimizer=optimizer, 224 | scheduler=scheduler, 225 | loss_fn=loss_fn, 226 | epoch=epoch, 227 | args=args, 228 | 
training_device=training_device) 229 | losses[epoch] = total_loss / len(train_loader) 230 | print(f'[{epoch + 1}/{args.epochs}] Average loss: {losses[epoch]}') 231 | 232 | # Test 233 | generated_image, total_accuracy = test_cnf(data_loader=test_loader, 234 | normalizing_flow=normalizing_flow, 235 | epoch=epoch, 236 | evaluator=evaluator, 237 | args=args, 238 | training_device=training_device) 239 | accuracies[epoch] = total_accuracy / len(test_loader) 240 | 241 | # New Test 242 | new_generated_image, total_accuracy = test_cnf(data_loader=new_test_loader, 243 | normalizing_flow=normalizing_flow, 244 | epoch=epoch, 245 | evaluator=evaluator, 246 | args=args, 247 | training_device=training_device) 248 | new_accuracies[epoch] = total_accuracy / len(new_test_loader) 249 | 250 | print(f'[{epoch + 1}/{args.epochs}] Average accuracy: {accuracies[epoch]:.2f}') 251 | print(f'[{epoch + 1}/{args.epochs}] New Average accuracy: {new_accuracies[epoch]:.2f}') 252 | 253 | # Save normalizing flow, and plot test image 254 | score = get_score(accuracies[epoch], new_accuracies[epoch]) 255 | if score >= max_score: 256 | # Update 257 | max_score = score 258 | 259 | # Save images 260 | save_image(make_grid(generated_image, nrow=8), 261 | f'test_figure/{args.model}_{epoch}_{accuracies[epoch]:.2f}.jpg') 262 | save_image(make_grid(new_generated_image, nrow=8), 263 | f'test_figure/{args.model}_{epoch}_new_{new_accuracies[epoch]:.2f}.jpg') 264 | 265 | # Save model 266 | checkpoint = {'normalizing_flow': normalizing_flow.state_dict()} 267 | torch.save(checkpoint, 268 | f'model/task_1/{args.model}_{epoch}_{accuracies[epoch]:.4f}_new_{new_accuracies[epoch]:.4f}.pt') 269 | 270 | # Plot losses and accuracies 271 | info_log('Plot losses and accuracies ...', args.verbosity) 272 | plot_losses(losses=(losses,), labels=['loss'], epoch=epoch, task='task_1', model=args.model) 273 | plot_accuracies(accuracies=(accuracies, new_accuracies), labels=['Test', 'New Test'], epoch=epoch, 274 | model=args.model) 275 | else: 276 | # Start inferring 277 | info_log('Start inferring', args.verbosity) 278 | generated_image, test_accuracy = test_cnf(data_loader=test_loader, 279 | normalizing_flow=normalizing_flow, 280 | epoch=0, 281 | evaluator=evaluator, 282 | args=args, 283 | training_device=training_device) 284 | test_accuracy /= len(test_loader) 285 | save_image(make_grid(generated_image, nrow=8), f'figure/task_1/{args.model}_{test_accuracy:.2f}.png') 286 | 287 | # New test 288 | generated_image, new_test_accuracy = test_cnf(data_loader=new_test_loader, 289 | normalizing_flow=normalizing_flow, 290 | epoch=0, 291 | evaluator=evaluator, 292 | args=args, 293 | training_device=training_device) 294 | new_test_accuracy /= len(new_test_loader) 295 | save_image(make_grid(generated_image, nrow=8), f'figure/task_1/{args.model}_new_{new_test_accuracy:.2f}.png') 296 | 297 | print(f'Average accuracy: {test_accuracy:.2f}') 298 | print(f'New Average accuracy: {new_test_accuracy:.2f}') 299 | 300 | 301 | def train_and_inference_celeb(train_dataset: CelebALoader, 302 | train_loader: DataLoader, 303 | num_classes: int, 304 | args: Namespace, 305 | training_device: device) -> None: 306 | """ 307 | Train and inference cGlow 308 | :param train_dataset: Training dataset 309 | :param train_loader: Training data loader 310 | :param num_classes: Number of different conditions 311 | :param args: All arguments 312 | :param training_device: Training device 313 | :return: None 314 | """ 315 | # Setup models 316 | info_log('Setup models ...', args.verbosity) 317 | 318 | 
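Before the model is constructed below, here is a hedged, self-contained dry run of the CGlow interface defined in glow.py above. The hyperparameter values are illustrative assumptions (the real ones come from args.width, args.num_levels, args.depth, and the dataset), and it assumes a PyTorch version in which torch.qr/torch.lu, used by InvConv, are still available:

```python
import torch
from glow import CGlow  # as imported at the top of main.py

# Assumed toy values; 40 classes matches CelebALoader.num_classes
model = CGlow(num_channels=32, num_levels=3, num_steps=4, num_classes=40, image_size=64)
x = torch.rand(2, 3, 64, 64)                      # images scaled to [0, 1]
labels = torch.randint(0, 2, (2, 40)).float()     # binary attribute vector per image
z, nll, label_probs = model(x_label=labels, x=x)  # encoding pass
print(z.shape, nll.shape, label_probs.shape)      # (2, 3, 64, 64), (2,), (2, 40)
samples, _, _ = model(x_label=labels, reverse=True)  # decode samples from the prior
```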
normalizing_flow = CGlow(num_channels=args.width, 319 | num_levels=args.num_levels, 320 | num_steps=args.depth, 321 | num_classes=num_classes, 322 | image_size=args.image_size).to(training_device) 323 | if os.path.exists(f'model/task_2/{args.model}.pt'): 324 | checkpoint = torch.load(f'model/task_2/{args.model}.pt') 325 | normalizing_flow.load_state_dict(checkpoint['normalizing_flow']) 326 | 327 | optimizer = optim.Adamax(normalizing_flow.parameters(), lr=args.learning_rate_normalizing_flow, weight_decay=5e-5) 328 | scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda e: min(1.0, (e + 1) / args.warmup)) 329 | loss_fn = NLLLoss().to(training_device) 330 | 331 | # Setup average losses container 332 | losses = [0.0 for _ in range(args.epochs)] 333 | 334 | if not args.inference: 335 | # Start training 336 | info_log('Start training', args.verbosity) 337 | min_loss = inf 338 | for epoch in range(args.epochs): 339 | # Train 340 | total_loss = train_cnf(data_loader=train_loader, 341 | normalizing_flow=normalizing_flow, 342 | optimizer=optimizer, 343 | scheduler=scheduler, 344 | loss_fn=loss_fn, 345 | epoch=epoch, 346 | args=args, 347 | training_device=training_device) 348 | losses[epoch] = total_loss / len(train_loader) 349 | print(f'[{epoch + 1}/{args.epochs}] Average loss: {losses[epoch]}') 350 | 351 | # 3 applications 352 | inference_celeb(data_loader=train_loader, 353 | train_dataset=train_dataset, 354 | normalizing_flow=normalizing_flow, 355 | num_classes=num_classes, 356 | args=args, 357 | training_device=training_device) 358 | 359 | # Save the model 360 | if losses[epoch] < min_loss: 361 | min_loss = losses[epoch] 362 | checkpoint = {'normalizing_flow': normalizing_flow.state_dict()} 363 | torch.save(checkpoint, f'model/task_2/{args.model}_{epoch}_{losses[epoch]:.4f}.pt') 364 | 365 | # Plot losses 366 | info_log('Plot losses ...', args.verbosity) 367 | plot_losses(losses=(losses,), labels=['loss'], epoch=epoch, task='task_2', model=args.model) 368 | else: 369 | # Start inferring 370 | info_log('Start inferring', args.verbosity) 371 | inference_celeb(data_loader=train_loader, 372 | train_dataset=train_dataset, 373 | normalizing_flow=normalizing_flow, 374 | num_classes=num_classes, 375 | args=args, 376 | training_device=training_device) 377 | 378 | 379 | def main() -> None: 380 | """ 381 | Main function 382 | :return: None 383 | """ 384 | # Get training device 385 | training_device = device('cuda' if cuda.is_available() else 'cpu') 386 | 387 | # Parse arguments 388 | args = parse_arguments() 389 | info_log(f'Batch size: {args.batch_size}', args.verbosity) 390 | info_log(f'Image size: {args.image_size}', args.verbosity) 391 | info_log(f'Dimension of the hidden layers in normalizing flow: {args.width}', args.verbosity) 392 | info_log(f'Depth of the normalizing flow: {args.depth}', args.verbosity) 393 | info_log(f'Number of levels in normalizing flow: {args.num_levels}', args.verbosity) 394 | info_log(f'Clip gradients at specific value: {args.grad_value_clip}', args.verbosity) 395 | info_log(f"Clip gradients' norm at specific value: {args.grad_norm_clip}", args.verbosity) 396 | info_log(f'Learning rate of discriminator: {args.learning_rate_discriminator}', args.verbosity) 397 | info_log(f'Learning rate of generator: {args.learning_rate_generator}', args.verbosity) 398 | info_log(f'Learning rate of normalizing flow: {args.learning_rate_normalizing_flow}', args.verbosity) 399 | info_log(f'Number of epochs: {args.epochs}', args.verbosity) 400 | info_log(f'Number of warmup 
epochs: {args.warmup}', args.verbosity) 401 | info_log(f'Perform task: {args.task}', args.verbosity) 402 | info_log(f'Which model will be used: {args.model}', args.verbosity) 403 | info_log(f'Only inference or not: {True if args.inference else False}', args.verbosity) 404 | info_log(f'Training device: {training_device}', args.verbosity) 405 | 406 | # Read data 407 | info_log('Read data ...', args.verbosity) 408 | 409 | if args.model == 'GLOW': 410 | if args.task == 1: 411 | transformation = transforms.Compose([transforms.RandomCrop(240), 412 | transforms.RandomHorizontalFlip(), 413 | transforms.Resize(args.image_size), 414 | transforms.ToTensor()]) 415 | else: 416 | transformation = transforms.Compose([transforms.RandomHorizontalFlip(), 417 | transforms.Resize(args.image_size), 418 | transforms.ToTensor()]) 419 | else: 420 | transformation = transforms.Compose([transforms.RandomCrop(240), 421 | transforms.RandomHorizontalFlip(), 422 | transforms.Resize(args.image_size), 423 | transforms.ToTensor(), 424 | transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) 425 | 426 | if args.task == 1: 427 | train_dataset = ICLEVRLoader(root_folder='data/task_1/', trans=transformation, mode='train') 428 | test_dataset = ICLEVRLoader(root_folder='data/task_1/', mode='test') 429 | new_test_dataset = ICLEVRLoader(root_folder='data/task_1/', mode='new_test') 430 | 431 | train_loader = DataLoader(train_dataset, batch_size=args.batch_size, shuffle=True) 432 | test_loader = DataLoader(test_dataset, batch_size=args.batch_size, shuffle=False) 433 | new_test_loader = DataLoader(new_test_dataset, batch_size=args.batch_size, shuffle=False) 434 | 435 | num_classes = train_dataset.num_classes 436 | else: 437 | train_dataset = CelebALoader(root_folder='data/task_2/', trans=transformation) 438 | train_loader = DataLoader(train_dataset, batch_size=args.batch_size, shuffle=True) 439 | num_classes = train_dataset.num_classes 440 | 441 | # Setup evaluator 442 | evaluator = EvaluationModel(training_device=training_device) 443 | 444 | # Create directories 445 | create_directories() 446 | 447 | if args.task == 1: 448 | if args.model == 'GLOW': 449 | train_and_evaluate_cnf(train_loader=train_loader, 450 | test_loader=test_loader, 451 | new_test_loader=new_test_loader, 452 | evaluator=evaluator, 453 | num_classes=num_classes, 454 | args=args, 455 | training_device=training_device) 456 | else: 457 | train_and_evaluate_cgan(train_loader=train_loader, 458 | test_loader=test_loader, 459 | new_test_loader=new_test_loader, 460 | evaluator=evaluator, 461 | num_classes=num_classes, 462 | args=args, 463 | training_device=training_device) 464 | else: 465 | train_and_inference_celeb(train_dataset=train_dataset, 466 | train_loader=train_loader, 467 | num_classes=num_classes, 468 | args=args, 469 | training_device=training_device) 470 | 471 | 472 | if __name__ == '__main__': 473 | main() 474 | -------------------------------------------------------------------------------- /Lab 7/report.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/steven112163/Deep-Learning-and-Practice/159dcdf323e7bb034b15ab221541a128ebc967b1/Lab 7/report.pdf -------------------------------------------------------------------------------- /Lab 7/sagan.py: -------------------------------------------------------------------------------- 1 | from torch.nn import Parameter 2 | import torch.nn as nn 3 | import numpy as np 4 | import torch 5 | 6 | 7 | class SelfAttn(nn.Module): 8 | def __init__(self, 
in_dim: int): 9 | super(SelfAttn, self).__init__() 10 | self.channel_in = in_dim 11 | 12 | self.query_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim // 8, kernel_size=1) 13 | self.key_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim // 8, kernel_size=1) 14 | self.value_conv = nn.Conv2d(in_channels=in_dim, out_channels=in_dim, kernel_size=1) 15 | self.gamma = nn.Parameter(torch.zeros(1)) 16 | 17 | self.softmax = nn.Softmax(dim=-1) 18 | 19 | def forward(self, x: torch.Tensor) -> torch.Tensor: 20 | """ 21 | Forwarding 22 | :param x: Batched data 23 | :return: Self attention value + input feature 24 | """ 25 | batch_size, channel_size, width, height = x.size() 26 | proj_query = self.query_conv(x).view(batch_size, -1, width * height).permute(0, 2, 1) # (B, N, C // 8), N = W * H 27 | proj_key = self.key_conv(x).view(batch_size, -1, width * height) # (B, C // 8, N) 28 | energy = torch.bmm(proj_query, proj_key) # (B, N, N) attention energies 29 | attention = self.softmax(energy) # softmax over the last dimension 30 | proj_value = self.value_conv(x).view(batch_size, -1, width * height) # (B, C, N) 31 | 32 | out = torch.bmm(proj_value, attention.permute(0, 2, 1)) 33 | out = out.view(batch_size, channel_size, width, height) 34 | return self.gamma * out + x # gamma starts at zero, so attention is blended in gradually during training 35 | 36 | 37 | def l2normalize(v: torch.Tensor, eps: float = 1e-12) -> torch.Tensor: 38 | """ 39 | L2 normalize 40 | :param v: Batched data 41 | :param eps: Value to avoid division by 0 42 | :return: L2 normalized data 43 | """ 44 | return v / (v.norm() + eps) 45 | 46 | 47 | class SpectralNorm(nn.Module): 48 | def __init__(self, module, name='weight', power_iterations=1): 49 | super(SpectralNorm, self).__init__() 50 | self.module = module 51 | self.name = name 52 | self.power_iterations = power_iterations 53 | if not self._made_params(): 54 | self._make_params() 55 | 56 | def _update_u_v(self): 57 | u = getattr(self.module, self.name + "_u") 58 | v = getattr(self.module, self.name + "_v") 59 | w = getattr(self.module, self.name + "_bar") 60 | 61 | height = w.data.shape[0] 62 | for _ in range(self.power_iterations): 63 | v.data = l2normalize(torch.mv(torch.t(w.view(height, -1).data), u.data)) 64 | u.data = l2normalize(torch.mv(w.view(height, -1).data, v.data)) 65 | 66 | # Rayleigh quotient u^T W v estimates the largest singular value of W 67 | sigma = u.dot(w.view(height, -1).mv(v)) 68 | setattr(self.module, self.name, w / sigma.expand_as(w)) 69 | 70 | def _made_params(self): 71 | try: 72 | u = getattr(self.module, self.name + "_u") 73 | v = getattr(self.module, self.name + "_v") 74 | w = getattr(self.module, self.name + "_bar") 75 | return True 76 | except AttributeError: 77 | return False 78 | 79 | def _make_params(self): 80 | w = getattr(self.module, self.name) 81 | 82 | height = w.data.shape[0] 83 | width = w.view(height, -1).data.shape[1] 84 | 85 | u = Parameter(w.data.new(height).normal_(0, 1), requires_grad=False) 86 | v = Parameter(w.data.new(width).normal_(0, 1), requires_grad=False) 87 | u.data = l2normalize(u.data) 88 | v.data = l2normalize(v.data) 89 | w_bar = Parameter(w.data) 90 | 91 | del self.module._parameters[self.name] 92 | 93 | self.module.register_parameter(self.name + "_u", u) 94 | self.module.register_parameter(self.name + "_v", v) 95 | self.module.register_parameter(self.name + "_bar", w_bar) 96 | 97 | def forward(self, *args): 98 | self._update_u_v() 99 | return self.module.forward(*args) 100 | 101 | 102 | class SAGenerator(nn.Module): 103 | def __init__(self, noise_size: int, label_size: int, conv_dim: int): 104 | super(SAGenerator, self).__init__() 105 | 106 | layer1 = 
[] 107 | layer2 = [] 108 | layer3 = [] 109 | last = [] 110 | 111 | repeat_num = int(np.log2(noise_size)) - 3 112 | multi = 2 ** repeat_num # 8 113 | layer1.append(SpectralNorm(nn.ConvTranspose2d(noise_size + label_size, conv_dim * multi, 4))) 114 | layer1.append(nn.BatchNorm2d(conv_dim * multi)) 115 | layer1.append(nn.ReLU()) 116 | 117 | curr_dim = conv_dim * multi 118 | 119 | layer2.append(SpectralNorm(nn.ConvTranspose2d(curr_dim, int(curr_dim / 2), 4, 2, 1))) 120 | layer2.append(nn.BatchNorm2d(int(curr_dim / 2))) 121 | layer2.append(nn.ReLU()) 122 | 123 | curr_dim = int(curr_dim / 2) 124 | 125 | layer3.append(SpectralNorm(nn.ConvTranspose2d(curr_dim, int(curr_dim / 2), 4, 2, 1))) 126 | layer3.append(nn.BatchNorm2d(int(curr_dim / 2))) 127 | layer3.append(nn.ReLU()) 128 | 129 | if noise_size == 64: 130 | layer4 = [] 131 | curr_dim = int(curr_dim / 2) 132 | layer4.append(SpectralNorm(nn.ConvTranspose2d(curr_dim, int(curr_dim / 2), 4, 2, 1))) 133 | layer4.append(nn.BatchNorm2d(int(curr_dim / 2))) 134 | layer4.append(nn.ReLU()) 135 | self.l4 = nn.Sequential(*layer4) 136 | curr_dim = int(curr_dim / 2) 137 | 138 | self.l1 = nn.Sequential(*layer1) 139 | self.l2 = nn.Sequential(*layer2) 140 | self.l3 = nn.Sequential(*layer3) 141 | 142 | last.append(nn.ConvTranspose2d(curr_dim, 3, 4, 2, 1)) 143 | last.append(nn.Tanh()) 144 | self.last = nn.Sequential(*last) 145 | 146 | self.attn1 = SelfAttn(128) 147 | self.attn2 = SelfAttn(64) 148 | 149 | def forward(self, x: torch.Tensor) -> torch.Tensor: 150 | """ 151 | Forwarding 152 | :param x: Batched noises and labels 153 | :return: Fake images 154 | """ 155 | x = x.view(x.size(0), x.size(1), 1, 1) 156 | out = self.l1(x) 157 | out = self.l2(out) 158 | out = self.l3(out) 159 | out = self.attn1(out) 160 | out = self.l4(out) 161 | out = self.attn2(out) 162 | out = self.last(out) 163 | 164 | return out 165 | 166 | 167 | class SADiscriminator(nn.Module): 168 | def __init__(self, num_classes: int, image_size: int, conv_dim: int): 169 | super(SADiscriminator, self).__init__() 170 | 171 | self.image_size = image_size 172 | layer1 = [] 173 | layer2 = [] 174 | layer3 = [] 175 | last = [] 176 | 177 | layer1.append(SpectralNorm(nn.Conv2d(4, conv_dim, 4, 2, 1))) 178 | layer1.append(nn.LeakyReLU(0.1)) 179 | 180 | curr_dim = conv_dim 181 | 182 | layer2.append(SpectralNorm(nn.Conv2d(curr_dim, curr_dim * 2, 4, 2, 1))) 183 | layer2.append(nn.LeakyReLU(0.1)) 184 | curr_dim = curr_dim * 2 185 | 186 | layer3.append(SpectralNorm(nn.Conv2d(curr_dim, curr_dim * 2, 4, 2, 1))) 187 | layer3.append(nn.LeakyReLU(0.1)) 188 | curr_dim = curr_dim * 2 189 | 190 | if image_size == 64: 191 | layer4 = [] 192 | layer4.append(SpectralNorm(nn.Conv2d(curr_dim, curr_dim * 2, 4, 2, 1))) 193 | layer4.append(nn.LeakyReLU(0.1)) 194 | self.l4 = nn.Sequential(*layer4) 195 | curr_dim = curr_dim * 2 196 | self.l1 = nn.Sequential(*layer1) 197 | self.l2 = nn.Sequential(*layer2) 198 | self.l3 = nn.Sequential(*layer3) 199 | 200 | last.append(nn.Conv2d(curr_dim, 1, 4)) 201 | self.last = nn.Sequential(*last) 202 | 203 | self.attn1 = SelfAttn(256) 204 | self.attn2 = SelfAttn(512) 205 | 206 | self.label_to_condition = nn.Sequential( 207 | nn.ConvTranspose2d(in_channels=num_classes, 208 | out_channels=16, 209 | kernel_size=4, 210 | stride=1, 211 | padding=0, 212 | bias=False), 213 | nn.BatchNorm2d(num_features=16), 214 | nn.ReLU(True), 215 | 216 | nn.ConvTranspose2d(in_channels=16, 217 | out_channels=4, 218 | kernel_size=4, 219 | stride=2, 220 | padding=1, 221 | bias=False), 222 | nn.BatchNorm2d(num_features=4), 
223 | nn.ReLU(True), 224 | 225 | nn.ConvTranspose2d(in_channels=4, 226 | out_channels=1, 227 | kernel_size=4, 228 | stride=2, 229 | padding=1, 230 | bias=False), 231 | nn.BatchNorm2d(num_features=1), 232 | nn.ReLU(True), 233 | ) 234 | self.linear = nn.Sequential( 235 | nn.Linear(in_features=16 * 16, 236 | out_features=32 * 32, 237 | bias=False), 238 | nn.ReLU(True), 239 | 240 | nn.Linear(in_features=32 * 32, 241 | out_features=image_size * image_size, 242 | bias=False), 243 | nn.Tanh() 244 | ) 245 | 246 | def forward(self, x: torch.Tensor, label: torch.Tensor) -> torch.Tensor: 247 | """ 248 | Forwarding 249 | :param x: Batched data 250 | :param label: Batched labels 251 | :return: Discrimination results 252 | """ 253 | batch_size, num_classes = label.size() 254 | label = label.view(batch_size, num_classes, 1, 1) 255 | condition = self.label_to_condition(label).view(batch_size, 1, -1) 256 | condition = self.linear(condition).view(-1, 1, self.image_size, self.image_size) 257 | inputs = torch.cat([x, condition], 1) 258 | 259 | out = self.l1(inputs) 260 | out = self.l2(out) 261 | out = self.l3(out) 262 | out = self.attn1(out) 263 | out = self.l4(out) 264 | out = self.attn2(out) 265 | return self.last(out).view(-1, 1) 266 | -------------------------------------------------------------------------------- /Lab 7/task_1_dataset.py: -------------------------------------------------------------------------------- 1 | from torch.utils import data 2 | from typing import Dict, List 3 | from PIL import Image 4 | import torchvision.transforms as transforms 5 | import os 6 | import json 7 | import numpy as np 8 | 9 | 10 | def change_labels_to_one_hot(obj: Dict[str, int], ori_label: List[List[str]]): 11 | """ 12 | Change labels in each image to one hot vectors 13 | :param obj: ID for each label 14 | :param ori_label: original labels 15 | :return: converted labels 16 | """ 17 | converted_labels = np.zeros((len(ori_label), len(obj))) 18 | for img_idx, labels in enumerate(ori_label): 19 | for label_idx, label in enumerate(labels): 20 | ori_label[img_idx][label_idx] = obj[label] 21 | tmp = np.zeros(len(obj)) 22 | tmp[ori_label[img_idx]] = 1 23 | converted_labels[img_idx] = tmp 24 | 25 | return converted_labels 26 | 27 | 28 | def get_iclevr_data(root_folder: str, mode: str): 29 | """ 30 | Read training/testing/new_testing data from the file in root_folder 31 | :param root_folder: root folder containing training/testing data 32 | :param mode: train or test 33 | :return: image & labels for train, otherwise none & labels 34 | """ 35 | if mode == 'train': 36 | training_data = json.load(open(os.path.join(root_folder, 'train.json'))) 37 | obj = json.load(open(os.path.join(root_folder, 'objects.json'))) 38 | img = list(training_data.keys()) 39 | label = list(training_data.values()) 40 | label = change_labels_to_one_hot(obj=obj, ori_label=label) 41 | return np.squeeze(img), np.squeeze(label) 42 | else: 43 | testing_data = json.load(open(os.path.join(root_folder, f'{mode}.json'))) 44 | obj = json.load(open(os.path.join(root_folder, 'objects.json'))) 45 | label = testing_data 46 | label = change_labels_to_one_hot(obj=obj, ori_label=label) 47 | return None, label 48 | 49 | 50 | class ICLEVRLoader(data.Dataset): 51 | def __init__(self, root_folder: str, trans: transforms.transforms = None, cond: bool = False, mode: str = 'train'): 52 | self.root_folder = root_folder 53 | self.mode = mode 54 | self.img_list, self.label_list = get_iclevr_data(root_folder, mode) 55 | if self.mode == 'train': 56 | print(f'> Found 
{len(self.img_list)} images...') 57 | 58 | self.transform = trans 59 | self.cond = cond 60 | self.num_classes = 24 61 | 62 | def __len__(self): 63 | """ 64 | Return the size of dataset 65 | :return: size of dataset 66 | """ 67 | return len(self.label_list) 68 | 69 | def __getitem__(self, index: int): 70 | """ 71 | Get current data 72 | :param index: index of training/testing data 73 | :return: data 74 | """ 75 | if self.mode == 'train': 76 | img_path = self.root_folder + 'images/' + self.img_list[index] 77 | label = self.label_list[index] 78 | image = Image.open(img_path).convert('RGB') 79 | image = self.transform(image) 80 | return image, label 81 | else: 82 | return self.label_list[index] 83 | -------------------------------------------------------------------------------- /Lab 7/task_2_dataset.py: -------------------------------------------------------------------------------- 1 | from torch.utils import data 2 | from PIL import Image 3 | import numpy as np 4 | import os 5 | 6 | 7 | def get_celebrity_data(root_folder): 8 | img_list = os.listdir(os.path.join(root_folder, 'CelebA-HQ-img')) 9 | label_list = [] 10 | f = open(os.path.join(root_folder, 'CelebA-HQ-attribute-anno.txt'), 'r') 11 | num_images = int(f.readline()[:-1]) 12 | attrs = f.readline()[:-1].split(' ') 13 | for idx in range(num_images): 14 | line = f.readline()[:-1].split(' ') 15 | label = line[2:] 16 | label = list(map(int, label)) 17 | label_list.append(label) 18 | f.close() 19 | return img_list, np.array(label_list) 20 | 21 | 22 | class CelebALoader(data.Dataset): 23 | def __init__(self, root_folder, trans=None, cond=False): 24 | self.root_folder = root_folder 25 | assert os.path.isdir(self.root_folder), '{} is not a valid directory'.format(self.root_folder) 26 | 27 | self.img_list, self.label_list = get_celebrity_data(self.root_folder) 28 | 29 | print("> Found %d images..." 
% (len(self.img_list))) 30 | 31 | self.cond = cond 32 | self.transform = trans 33 | self.num_classes = 40 34 | 35 | def __len__(self): 36 | """ 37 | Return the size of dataset 38 | :return: size of dataset 39 | """ 40 | return len(self.label_list) 41 | 42 | def __getitem__(self, index): 43 | """ 44 | Get current data 45 | :param index: index of training data 46 | :return: data 47 | """ 48 | img_path = self.root_folder + 'CelebA-HQ-img/' + self.img_list[index] 49 | label = self.label_list[index] 50 | image = Image.open(img_path).convert('RGB') 51 | image = self.transform(image) 52 | return image, label 53 | -------------------------------------------------------------------------------- /Lab 7/test.py: -------------------------------------------------------------------------------- 1 | from dcgan import DCGenerator 2 | from sagan import SAGenerator 3 | from glow import CGlow 4 | from task_2_dataset import CelebALoader 5 | from evaluator import EvaluationModel 6 | from util import debug_log 7 | from torch.utils.data import DataLoader 8 | from torchvision.transforms import transforms 9 | from torchvision.utils import save_image, make_grid 10 | from torch import device 11 | from typing import Tuple, Optional 12 | from argparse import Namespace 13 | import torch 14 | 15 | 16 | def test_cgan(data_loader: DataLoader, 17 | generator: Optional[DCGenerator or SAGenerator], 18 | num_classes: int, 19 | epoch: int, 20 | evaluator: EvaluationModel, 21 | args: Namespace, 22 | training_device: device) -> Tuple[torch.Tensor, float]: 23 | """ 24 | Test cGAN 25 | :param data_loader: Testing data loader 26 | :param generator: Generator 27 | :param num_classes: Number of classes (object IDs) 28 | :param epoch: Current epoch :param evaluator: Evaluator used to score the generated images 29 | :param args: All arguments 30 | :param training_device: Training device 31 | :return: Generated images and total accuracy of all batches 32 | """ 33 | generator.eval() 34 | total_accuracy = 0.0 35 | norm_image = torch.randn(0, 3, args.image_size, args.image_size) # empty container, filled batch by batch below 36 | for batch_idx, batch_data in enumerate(data_loader): 37 | labels = batch_data 38 | batch_size = len(labels) 39 | labels = labels.to(training_device).type(torch.float) 40 | 41 | # Generate batch of latent vectors 42 | noise = torch.cat([ 43 | torch.randn((batch_size, args.image_size)), 44 | torch.clone(labels).cpu().detach() 45 | ], 1).view(-1, args.image_size + num_classes, 1, 1).to(training_device) 46 | 47 | # Generate fake image batch with generator 48 | with torch.no_grad(): 49 | fake_outputs = generator.forward(noise) 50 | 51 | # Compute accuracy 52 | acc = evaluator.eval(fake_outputs, labels) 53 | total_accuracy += acc 54 | for fake_image in fake_outputs: 55 | n_image = fake_image.cpu().detach() 56 | n_image = ((n_image + 1) / 2.0).clamp_(0, 1) 57 | norm_image = torch.cat([norm_image, n_image.view(1, 3, args.image_size, args.image_size)], 0) 58 | 59 | debug_log(f'[{epoch + 1}/{args.epochs}][{batch_idx + 1}/{len(data_loader)}] Accuracy: {acc}', args.verbosity) 60 | 61 | return norm_image, total_accuracy 62 | 63 | 64 | def test_cnf(data_loader: DataLoader, 65 | normalizing_flow: CGlow, 66 | epoch: int, 67 | evaluator: EvaluationModel, 68 | args: Namespace, 69 | training_device: device) -> Tuple[torch.Tensor, float]: 70 | """ 71 | Test cNF 72 | :param data_loader: Testing data loader 73 | :param normalizing_flow: Conditional normalizing flow model 74 | :param epoch: Current epoch 75 | :param evaluator: Evaluator 76 | :param args: All arguments 77 | :param training_device: Training device 78 | :return: Generated images 
and total accuracy 79 | """ 80 | normalizing_flow.eval() 81 | total_accuracy = 0.0 82 | generated_image = torch.randn(0, 3, args.image_size, args.image_size) 83 | for batch_idx, batch_data in enumerate(data_loader): 84 | labels = batch_data 85 | labels = labels.to(training_device).type(torch.float) 86 | 87 | with torch.no_grad(): 88 | fake_images, _, _ = normalizing_flow.forward(x=None, x_label=labels, reverse=True) 89 | 90 | transformed_images = torch.randn(0, 3, args.image_size, args.image_size) 91 | transformation = transforms.Compose([transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) 92 | for fake_image in fake_images: 93 | n_image = fake_image.cpu().detach() 94 | transformed_images = torch.cat([transformed_images, 95 | transformation(n_image).view(1, 3, args.image_size, args.image_size)], 0) 96 | transformed_images = transformed_images.to(training_device) 97 | 98 | acc = evaluator.eval(transformed_images, labels) 99 | total_accuracy += acc 100 | 101 | for fake_image in fake_images: 102 | n_image = fake_image.cpu().detach() 103 | generated_image = torch.cat([generated_image, n_image.view(1, 3, args.image_size, args.image_size)], 0) 104 | 105 | debug_log(f'[{epoch + 1}/{args.epochs}][{batch_idx + 1}/{len(data_loader)}] Accuracy: {acc}', args.verbosity) 106 | 107 | return generated_image, total_accuracy 108 | 109 | 110 | def inference_celeb(data_loader: DataLoader, 111 | train_dataset: CelebALoader, 112 | normalizing_flow: CGlow, 113 | num_classes: int, 114 | args: Namespace, 115 | training_device: device) -> None: 116 | """ 117 | Use cNF to inference celebrity data with 3 applications 118 | :param data_loader: Training data loader 119 | :param train_dataset: Training dataset 120 | :param normalizing_flow: Conditional normalizing flow model 121 | :param num_classes: Number of classes (attributes) 122 | :param args: All arguments 123 | :param training_device: Training device 124 | :return: None 125 | """ 126 | normalizing_flow.eval() 127 | 128 | # Application 1 129 | debug_log(f'Perform app 1', args.verbosity) 130 | application_one(train_dataset=train_dataset, 131 | normalizing_flow=normalizing_flow, 132 | num_classes=num_classes, 133 | args=args, 134 | training_device=training_device) 135 | 136 | # Application 2 137 | debug_log(f'Perform app 2', args.verbosity) 138 | application_two(train_dataset=train_dataset, 139 | normalizing_flow=normalizing_flow, 140 | args=args, 141 | training_device=training_device) 142 | 143 | # Application 3 144 | debug_log(f'Perform app 3', args.verbosity) 145 | application_three(data_loader=data_loader, 146 | train_dataset=train_dataset, 147 | normalizing_flow=normalizing_flow, 148 | args=args, 149 | training_device=training_device) 150 | 151 | 152 | def application_one(train_dataset: CelebALoader, 153 | normalizing_flow: CGlow, 154 | num_classes: int, 155 | args: Namespace, 156 | training_device: device) -> None: 157 | """ 158 | Application 1 159 | :param train_dataset: Training dataset 160 | :param normalizing_flow: Conditional normalizing flow model 161 | :param num_classes: Number of classes (attributes) 162 | :param args: All arguments 163 | :param training_device: Training device 164 | :return: None 165 | """ 166 | # Get labels for inference 167 | labels = torch.rand(0, num_classes) 168 | for idx in range(32): 169 | _, label = train_dataset[idx] 170 | labels = torch.cat([labels, torch.from_numpy(label).view(1, 40)], 0) 171 | labels = labels.to(training_device).type(torch.float) 172 | 173 | # Produce fake images 174 | with torch.no_grad(): 175 | 
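# Passing x=None with reverse=True makes CGlow.forward draw a latent from the
# label-conditioned prior N(mean, exp(logs)) via GaussianDiag.sample and run
# the flow backwards, so fake_images below is a batch of decoded images in [0, 1].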
175 | fake_images, _, _ = normalizing_flow.forward(x=None, x_label=labels, reverse=True) 176 | 177 | # Save fake images for application 1 178 | generated_images = torch.empty(0, 3, args.image_size, args.image_size) 179 | for fake_image in fake_images: 180 | n_image = fake_image.cpu().detach() 181 | generated_images = torch.cat([generated_images, n_image.view(1, 3, args.image_size, args.image_size)], 0) 182 | save_image(make_grid(generated_images, nrow=8), f'figure/task_2/{args.model}_app_1.jpg') 183 | 184 | 185 | def application_two(train_dataset: CelebALoader, 186 | normalizing_flow: CGlow, 187 | args: Namespace, 188 | training_device: device) -> None: 189 | """ 190 | Application 2 191 | :param train_dataset: Training dataset 192 | :param normalizing_flow: Conditional normalizing flow model 193 | :param args: All arguments 194 | :param training_device: Training device 195 | :return: None 196 | """ 197 | # Linearly interpolate between 5 pairs of training images 198 | linear_images = torch.empty(0, 3, args.image_size, args.image_size) 199 | for idx in range(5): 200 | # Get first image and label 201 | first_image, first_label = train_dataset[idx] 202 | first_image = first_image.to(training_device).type(torch.float).view(1, 3, args.image_size, args.image_size) 203 | first_label = torch.from_numpy(first_label).to(training_device).type(torch.float).view(1, 40) 204 | 205 | # Get second image and label 206 | second_image, second_label = train_dataset[idx + 5] 207 | second_image = second_image.to(training_device).type(torch.float).view(1, 3, args.image_size, args.image_size) 208 | second_label = torch.from_numpy(second_label).to(training_device).type(torch.float).view(1, 40) 209 | 210 | # Generate latent code 211 | with torch.no_grad(): 212 | first_z, _, _ = normalizing_flow.forward(x=first_image, x_label=first_label) 213 | second_z, _, _ = normalizing_flow.forward(x=second_image, x_label=second_label) 214 | 215 | # Compute interpolation step (8 equal steps yield 9 images) 216 | interval_z = (second_z - first_z) / 8.0 217 | interval_label = (second_label - first_label) / 8.0 218 | 219 | # Generate linear images 220 | for num_of_intervals in range(9): 221 | with torch.no_grad(): 222 | image, _, _ = normalizing_flow.forward(x=first_z + num_of_intervals * interval_z, 223 | x_label=first_label + num_of_intervals * interval_label, 224 | reverse=True) 225 | linear_images = torch.cat([linear_images, 226 | image.cpu().detach().view(1, 3, args.image_size, args.image_size)], 0) 227 | save_image(make_grid(linear_images, nrow=9), f'figure/task_2/{args.model}_app_2.jpg') 228 | 229 | 230 | def application_three(data_loader: DataLoader, 231 | train_dataset: CelebALoader, 232 | normalizing_flow: CGlow, 233 | args: Namespace, 234 | training_device: device) -> None: 235 | """ 236 | Application 3 237 | :param data_loader: Training data loader 238 | :param train_dataset: Training dataset 239 | :param normalizing_flow: Conditional normalizing flow model 240 | :param args: All arguments 241 | :param training_device: Training device 242 | :return: None 243 | """ 244 | # Get an image and labels with negative/positive smiling/bald 245 | image, label = train_dataset[1] 246 | image = image.to(training_device).type(torch.float).view(1, 3, args.image_size, args.image_size) 247 | label = torch.from_numpy(label).to(training_device).type(torch.float).view(1, 40) 248 | with torch.no_grad(): 249 | latent, _, _ = normalizing_flow.forward(x=image, x_label=label) 250 | 251 | # Get negative labels 252 | neg_smiling_label = torch.clone(label) 253 | neg_bald_label = torch.clone(label)
254 | neg_smiling_label[0, 31] = -1.  # CelebA attribute 31 = Smiling 255 | neg_bald_label[0, 4] = -1.  # CelebA attribute 4 = Bald 256 | 257 | # Get positive labels 258 | pos_smiling_label = torch.clone(label) 259 | pos_bald_label = torch.clone(label) 260 | pos_smiling_label[0, 31] = 1. 261 | pos_bald_label[0, 4] = 1. 262 | 263 | # Compute conditional interval 264 | interval_smiling_label = (pos_smiling_label - neg_smiling_label) / 4.0 265 | interval_bald_label = (pos_bald_label - neg_bald_label) / 4.0 266 | 267 | # Generate manipulated images 268 | manipulated_images = torch.empty(0, 3, args.image_size, args.image_size) 269 | 270 | # Generate smiling 271 | manipulated_images = generate_manipulated_images(data_loader=data_loader, 272 | normalizing_flow=normalizing_flow, 273 | latent=latent, 274 | neg_label=neg_smiling_label, 275 | interval_label=interval_smiling_label, 276 | manipulated_images=manipulated_images, 277 | idx=31,  # Smiling 278 | args=args, 279 | training_device=training_device) 280 | 281 | # Generate bald 282 | manipulated_images = generate_manipulated_images(data_loader=data_loader, 283 | normalizing_flow=normalizing_flow, 284 | latent=latent, 285 | neg_label=neg_bald_label, 286 | interval_label=interval_bald_label, 287 | manipulated_images=manipulated_images, 288 | idx=4,  # Bald 289 | args=args, 290 | training_device=training_device) 291 | 292 | save_image(make_grid(manipulated_images, nrow=5), f'figure/task_2/{args.model}_app_3.jpg') 293 | 294 | 295 | def generate_manipulated_images(data_loader: DataLoader, 296 | normalizing_flow: CGlow, 297 | latent: torch.Tensor, 298 | neg_label: torch.Tensor, 299 | interval_label: torch.Tensor, 300 | manipulated_images: torch.Tensor, 301 | idx: int, 302 | args: Namespace, 303 | training_device: device) -> torch.Tensor: 304 | """ 305 | Generate images with manipulated attribute 306 | :param data_loader: Training data loader 307 | :param normalizing_flow: Conditional normalizing flow model 308 | :param latent: Latent code of the target image 309 | :param neg_label: Label with negative target attribute 310 | :param interval_label: Interval from negative label to positive label 311 | :param manipulated_images: Tensor of manipulated images 312 | :param idx: Index of the target attribute 313 | :param args: All arguments 314 | :param training_device: Training device 315 | :return: Manipulated images 316 | """ 317 | pos_z_mean = torch.zeros(*(latent.size()), dtype=torch.float) 318 | neg_z_mean = torch.zeros(*(latent.size()), dtype=torch.float) 319 | num_pos, num_neg = 0, 0 320 | for images, labels in data_loader: 321 | images = images.to(training_device) 322 | labels = labels.to(training_device).type(torch.float) 323 | pos_indices = (labels[:, idx] == 1).nonzero(as_tuple=True)[0] 324 | neg_indices = (labels[:, idx] == -1).nonzero(as_tuple=True)[0] 325 | 326 | with torch.no_grad(): 327 | z, _, _ = normalizing_flow.forward(x=images, x_label=labels) 328 | z = z.cpu().detach() 329 | 330 | if len(pos_indices) > 0: 331 | num_pos += len(pos_indices) 332 | pos_z_mean = (num_pos - len(pos_indices)) / num_pos * pos_z_mean + z[pos_indices].sum(dim=0) / num_pos  # running mean over positive samples seen so far 333 | if len(neg_indices) > 0: 334 | num_neg += len(neg_indices) 335 | neg_z_mean = (num_neg - len(neg_indices)) / num_neg * neg_z_mean + z[neg_indices].sum(dim=0) / num_neg  # running mean over negative samples seen so far 336 | interval_z = 1.6 * (pos_z_mean - neg_z_mean)  # scaled attribute direction in latent space 337 | interval_z = interval_z.to(training_device) 338 | 339 | alphas = [-1.0, -0.5, 0.0, 0.5, 1.0]  # how far to move along the attribute direction 340 | for num_of_intervals, alpha in enumerate(alphas): 341 | with torch.no_grad(): 342 | image, _, _ = normalizing_flow.forward(x=latent + alpha * interval_z,
343 | x_label=neg_label + num_of_intervals * interval_label, 344 | reverse=True) 345 | manipulated_images = torch.cat([manipulated_images, 346 | image.cpu().detach().view(1, 3, args.image_size, args.image_size)], 0) 347 | 348 | return manipulated_images 349 |
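The attribute manipulation above boils down to two steps: estimate the mean latent code of the positive and the negative samples with a running mean, then slide the target image's latent along the scaled difference vector. A minimal, self-contained sketch of that idea follows; the shapes, sample counts, and the `streaming_mean` helper are illustrative assumptions, not part of this repo.

```python
import torch


def streaming_mean(mean: torch.Tensor, count: int, batch: torch.Tensor):
    """Running mean over batches (same update rule as in generate_manipulated_images)."""
    new_count = count + batch.size(0)
    # Down-weight the old mean by count/new_count, then add the new batch's share
    return mean * (count / new_count) + batch.sum(dim=0) / new_count, new_count


# Hypothetical latents: 100 positive and 80 negative samples, 10 dimensions each
pos, neg = torch.randn(100, 10), torch.randn(80, 10)
pos_mean, _ = streaming_mean(torch.zeros(10), 0, pos)
neg_mean, _ = streaming_mean(torch.zeros(10), 0, neg)

direction = 1.6 * (pos_mean - neg_mean)  # same scaling as interval_z above
z = torch.randn(10)                      # latent of the image to manipulate
frames = [z + alpha * direction for alpha in (-1.0, -0.5, 0.0, 0.5, 1.0)]
```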
-------------------------------------------------------------------------------- /Lab 7/train.py: -------------------------------------------------------------------------------- 1 | from dcgan import DCGenerator, DCDiscriminator 2 | from sagan import SAGenerator, SADiscriminator 3 | from glow import CGlow, NLLLoss 4 | from util import debug_log 5 | from torch.utils.data import DataLoader 6 | from torch import device 7 | from typing import Tuple, Union 8 | from argparse import Namespace 9 | import torch.optim as optim 10 | import torch.nn as nn 11 | import torch 12 | 13 | 14 | def train_cgan(data_loader: DataLoader, 15 | generator: Union[DCGenerator, SAGenerator], 16 | discriminator: Union[DCDiscriminator, SADiscriminator], 17 | optimizer_g: optim.Optimizer, 18 | optimizer_d: optim.Optimizer, 19 | scheduler_g: optim.lr_scheduler._LRScheduler, 20 | scheduler_d: optim.lr_scheduler._LRScheduler, 21 | num_classes: int, 22 | epoch: int, 23 | args: Namespace, 24 | training_device: device) -> Tuple[float, float]: 25 | """ 26 | Train cGAN 27 | :param data_loader: Training data loader 28 | :param generator: Generator 29 | :param discriminator: Discriminator 30 | :param optimizer_g: Optimizer for generator 31 | :param optimizer_d: Optimizer for discriminator 32 | :param scheduler_g: Learning rate scheduler for generator 33 | :param scheduler_d: Learning rate scheduler for discriminator 34 | :param num_classes: Number of classes (object IDs) 35 | :param epoch: Current epoch 36 | :param args: All arguments 37 | :param training_device: Training device 38 | :return: Total generator loss and total discriminator loss 39 | """ 40 | generator.train() 41 | discriminator.train() 42 | total_g_loss = 0.0 43 | total_d_loss = 0.0 44 | 45 | criterion = nn.BCEWithLogitsLoss().to(training_device) 46 | 47 | for batch_idx, batch_data in enumerate(data_loader): 48 | images, real_labels = batch_data 49 | real_labels = real_labels.to(training_device).type(torch.float) 50 | 51 | ############################ 52 | # (1) Update D network 53 | ########################### 54 | # Train with all-real batch 55 | # Format batch 56 | images = images.to(training_device) 57 | batch_size = images.size(0) 58 | 59 | # Forward pass real batch through discriminator 60 | outputs = discriminator.forward(images, real_labels) 61 | 62 | # Calculate loss on all-real batch 63 | loss_d_real = criterion(outputs, torch.ones_like(outputs, dtype=torch.float)) 64 | 65 | # Train with all-fake batch 66 | # Generate batch of latent vectors 67 | noise = torch.cat([ 68 | torch.randn((batch_size, args.image_size)), 69 | torch.clone(real_labels).cpu().detach() 70 | ], 1).view(-1, args.image_size + num_classes, 1, 1).to(training_device) 71 | 72 | # Generate fake image batch with generator 73 | fake_outputs = generator.forward(noise) 74 | 75 | # Forward pass fake batch through discriminator 76 | outputs = discriminator.forward(fake_outputs.detach(), real_labels) 77 | 78 | # Calculate loss on fake batch 79 | loss_d_fake = criterion(outputs, torch.zeros_like(outputs, dtype=torch.float)) 80 | 81 | # Compute loss of discriminator as sum over the fake and the real batches 82 | loss_d = loss_d_real + loss_d_fake 83 | 84 | total_d_loss += loss_d.item() 85 | optimizer_d.zero_grad() 86 | loss_d.backward() 87 | optimizer_d.step() 88 | scheduler_d.step() 89 | 90 | ############################ 91 | # (2) Update G network: maximize log(D(G(z))) 92 | ########################### 93 | # Since we just updated discriminator, perform another forward pass of all-fake batch through discriminator 94 | outputs = discriminator.forward(fake_outputs, real_labels) 95 | 96 | # Calculate generator's loss based on this output 97 | loss_g = criterion(outputs, torch.ones_like(outputs, dtype=torch.float)) 98 | total_g_loss += loss_g.item() 99 | 100 | # Calculate gradients for generator and update 101 | optimizer_g.zero_grad() 102 | loss_g.backward() 103 | optimizer_g.step() 104 | scheduler_g.step() 105 | 106 | if batch_idx % 50 == 0: 107 | output_string = f'[{epoch + 1}/{args.epochs}][{batch_idx + 1}/{len(data_loader)}] ' + \ 108 | f'Loss_D: {loss_d.item():.4f} Loss_G: {loss_g.item():.4f}' 109 | debug_log(output_string, args.verbosity) 110 | 111 | return total_g_loss, total_d_loss 112 | 113 | 114 | def train_cnf(data_loader: DataLoader, 115 | normalizing_flow: CGlow, 116 | optimizer: optim.Optimizer, 117 | scheduler: optim.lr_scheduler._LRScheduler, 118 | loss_fn: NLLLoss, 119 | epoch: int, 120 | args: Namespace, 121 | training_device: device) -> float: 122 | """ 123 | Train cNF 124 | :param data_loader: Training data loader 125 | :param normalizing_flow: Conditional normalizing flow model 126 | :param optimizer: Glow optimizer 127 | :param scheduler: Learning rate scheduler 128 | :param loss_fn: Loss function 129 | :param epoch: Current epoch 130 | :param args: All arguments 131 | :param training_device: Training device 132 | :return: Total loss 133 | """ 134 | normalizing_flow.train() 135 | total_loss = 0.0 136 | for batch_idx, batch_data in enumerate(data_loader): 137 | images, labels = batch_data 138 | images = images.to(training_device) 139 | labels = labels.to(training_device).type(torch.float) 140 | 141 | if epoch == 0 and batch_idx == 0: 142 | # Initialize the network 143 | normalizing_flow.forward(x=images, x_label=labels) 144 | 145 | z, nll, label_logits = normalizing_flow.forward(x=images, x_label=labels) 146 | loss = loss_fn.forward(nll=nll, label_logits=label_logits, labels=labels) 147 | total_loss += loss.item() 148 | 149 | optimizer.zero_grad() 150 | loss.backward() 151 | 152 | if args.grad_value_clip > 0: 153 | nn.utils.clip_grad_value_(normalizing_flow.parameters(), args.grad_value_clip) 154 | if args.grad_norm_clip > 0: 155 | nn.utils.clip_grad_norm_(normalizing_flow.parameters(), args.grad_norm_clip) 156 | 157 | optimizer.step() 158 | scheduler.step() 159 | 160 | if batch_idx % 50 == 0: 161 | debug_log(f'[{epoch + 1}/{args.epochs}][{batch_idx + 1}/{len(data_loader)}] Loss: {loss.item():.4f}', 162 | args.verbosity) 163 | 164 | return total_loss 165 |
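The extra forward pass on the very first batch of `train_cnf` exists so the flow can initialize itself from data. In Glow-style models this is usually ActNorm: each such layer sets its per-channel scale and bias from the statistics of the first batch, so activations start out roughly zero-mean and unit-variance. The sketch below illustrates that mechanism under this assumption; `ActNorm` here is illustrative and not CGlow's actual internals.

```python
import torch
import torch.nn as nn


class ActNorm(nn.Module):
    """Per-channel affine layer with data-dependent initialization (Glow-style)."""

    def __init__(self, num_channels: int):
        super().__init__()
        self.log_scale = nn.Parameter(torch.zeros(1, num_channels, 1, 1))
        self.bias = nn.Parameter(torch.zeros(1, num_channels, 1, 1))
        self.initialized = False

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.initialized:
            with torch.no_grad():
                mean = x.mean(dim=(0, 2, 3), keepdim=True)
                std = x.std(dim=(0, 2, 3), keepdim=True)
                self.bias.copy_(-mean / (std + 1e-6))         # shift to zero mean
                self.log_scale.copy_(-torch.log(std + 1e-6))  # scale to unit variance
            self.initialized = True
        return torch.exp(self.log_scale) * x + self.bias
```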
-------------------------------------------------------------------------------- /Lab 7/util.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import os 3 | 4 | 5 | def info_log(log: str, verbosity: int) -> None: 6 | """ 7 | Print information log 8 | :param log: log to be displayed 9 | :param verbosity: verbosity level 10 | :return: None 11 | """ 12 | if verbosity: 13 | print(f'[\033[96mINFO\033[00m] {log}') 14 | sys.stdout.flush() 15 | 16 | 17 | def debug_log(log: str, verbosity: int) -> None: 18 | """ 19 | Print debug log 20 | :param log: log to be displayed 21 | :param verbosity: verbosity level 22 | :return: None 23 | """ 24 | if verbosity > 1: 25 | print(f'[\033[93mDEBUG\033[00m] {log}') 26 | sys.stdout.flush() 27 | 28 | 29 | def create_directories() -> None: 30 | """ 31 | Create all directories needed in this lab 32 | :return: None 33 | """ 34 | if not os.path.exists('./model/task_1'): 35 | os.makedirs('./model/task_1') 36 | if not os.path.exists('./model/task_2'): 37 | os.makedirs('./model/task_2') 38 | if not os.path.exists('./test_figure'): 39 | os.mkdir('./test_figure') 40 | if not os.path.exists('./figure/task_1'): 41 | os.makedirs('./figure/task_1') 42 | if not os.path.exists('./figure/task_2'): 43 | os.makedirs('./figure/task_2') 44 | 45 | 46 | def get_score(test: float, new_test: float) -> float: 47 | """ 48 | Get score according to the test and new_test accuracy 49 | :param test: Test accuracy 50 | :param new_test: New test accuracy 51 | :return: Score 52 | """ 53 | score = 0.0 54 | if test >= 0.8: 55 | score += 0.05 * 100 56 | elif 0.8 > test >= 0.7: 57 | score += 0.05 * 90 58 | elif 0.7 > test >= 0.6: 59 | score += 0.05 * 80 60 | elif 0.6 > test >= 0.5: 61 | score += 0.05 * 70 62 | elif 0.5 > test >= 0.4: 63 | score += 0.05 * 60 64 | 65 | if new_test >= 0.8: 66 | score += 0.1 * 100 67 | elif 0.8 > new_test >= 0.7: 68 | score += 0.1 * 90 69 | elif 0.7 > new_test >= 0.6: 70 | score += 0.1 * 80 71 | elif 0.6 > new_test >= 0.5: 72 | score += 0.1 * 70 73 | elif 0.5 > new_test >= 0.4: 74 | score += 0.1 * 60 75 | 76 | return score 77 | -------------------------------------------------------------------------------- /Lab 7/visualizer.py: -------------------------------------------------------------------------------- 1 | from typing import List, Tuple 2 | import matplotlib.pyplot as plt 3 | 4 | 5 | def plot_losses(losses: Tuple[List[float], ...], labels: List[str], epoch: int, task: str, model: str) -> None: 6 | """ 7 | Plot losses 8 | :param losses: Per-epoch loss lists, one per curve 9 | :param labels: Label of each loss list 10 | :param epoch: Current epoch 11 | :param task: 'task_1' or 'task_2' 12 | :param model: Which model is used 13 | :return: None 14 | """ 15 | plt.clf() 16 | plt.title('Loss') 17 | plt.xlabel('Epoch') 18 | plt.ylabel('Loss') 19 | for idx, loss in enumerate(losses): 20 | plt.plot(range(epoch + 1), loss[:epoch + 1], label=labels[idx]) 21 | plt.legend() 22 | plt.tight_layout() 23 | plt.savefig(f'./figure/{task}/{model}_loss.png') 24 | 25 | 26 | def plot_accuracies(accuracies: Tuple[List[float], ...], labels: List[str], epoch: int, model: str) -> None: 27 | """ 28 | Plot accuracies 29 | :param accuracies: Per-epoch accuracy lists, one per curve 30 | :param labels: Label of each accuracy list 31 | :param epoch: Current epoch 32 | :param model: Which model is used 33 | :return: None 34 | """ 35 | plt.clf() 36 | plt.title('Accuracy') 37 | plt.xlabel('Epoch') 38 | plt.ylabel('Accuracy') 39 | for idx, accuracy in enumerate(accuracies): 40 | plt.plot(range(epoch + 1), accuracy[:epoch + 1], label=labels[idx]) 41 | plt.legend() 42 | plt.tight_layout() 43 | plt.savefig(f'./figure/task_1/{model}_accuracy.png')  # accuracy curves are produced for task 1 only 44 |
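A typical way these plotting helpers are driven from a training loop is sketched below; the loss and accuracy values and the 'dcgan' model name are made up for illustration, and `create_directories()` from util.py is called first so `./figure/task_1` exists.

```python
from util import create_directories
from visualizer import plot_losses, plot_accuracies

create_directories()

g_losses = [2.31, 1.87, 1.52]  # hypothetical per-epoch generator losses
d_losses = [0.74, 0.92, 1.10]  # hypothetical per-epoch discriminator losses
test_acc = [0.41, 0.52, 0.60]  # hypothetical per-epoch test accuracies

for epoch in range(3):
    plot_losses((g_losses, d_losses), ['Generator', 'Discriminator'],
                epoch=epoch, task='task_1', model='dcgan')
    plot_accuracies((test_acc,), ['Test'], epoch=epoch, model='dcgan')
```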
-------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Deep-Learning-and-Practice 2 | 💻 HWs of Deep Learning and Practice, Spring 2021, NCTU (NYCU) 深度學習與實務 5261 3 | 4 | 5 | 6 | ## Prerequisites 7 | * python >= 3.8 8 | * gym >= 0.18.0 9 | * matplotlib >= 3.3.4 10 | * numpy >= 1.19.2 11 | * pandas >= 1.2.4 12 | * scikit-learn >= 0.24.1 13 | * tensorboard >= 2.5.0 14 | * torch == 1.7.1 15 | * tqdm >= 4.60.0 16 | * CUDA == 10.2 17 | 18 | 19 | 20 | ## Labs 21 | |Lab|Description| 22 | |---|---| 23 | |Lab 0|Warm up| 24 | |Lab 1|Backpropagation| 25 | |Lab 2|Temporal Difference| 26 | |Lab 3|EEGNet & DeepConvNet| 27 | |Lab 4|ResNet| 28 | |Lab 5|CVAE with LSTM| 29 | |Lab 6|DQN & DDPG| 30 | |Lab 7|cGAN & cNF| 31 | |Project|Stock Prediction using Transformer| 32 | 33 | 34 | 35 | ## Usage 36 | ```shell 37 | $ git clone git@github.com:steven112163/Deep-Learning-and-Practice.git 38 | $ git submodule init 39 | $ git submodule update 40 | ``` --------------------------------------------------------------------------------
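Cloning and submodule initialization can also be combined into a single step with standard git flags:

```shell
$ git clone --recurse-submodules git@github.com:steven112163/Deep-Learning-and-Practice.git
```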