├── README.md
├── data
│   ├── combine_A_and_B.py
│   ├── diabetes.csv.gz
│   ├── downloadCelebA.sh
│   ├── download_cyclegan_dataset.sh
│   ├── download_cyclegan_datasetTrain.sh
│   ├── download_pix2pix_dataset.sh
│   ├── lsun
│   │   ├── README.md
│   │   ├── category_indices.txt
│   │   ├── data.py
│   │   └── download.py
│   ├── make_dataset_aligned.py
│   ├── names_test.csv.gz
│   ├── names_train.csv.gz
│   ├── setting_up_script.sh
│   └── shakespeare.txt.gz
├── logo
│   └── pytorch_logo.png
├── slides
│   ├── Lecture 02_ Linear Model.pptx
│   ├── Lecture 03_ Gradient Descent.pptx
│   ├── Lecture 04_ Back-propagation and PyTorch autograd.pptx
│   ├── Lecture 05_ Linear regression in PyTorch way.pptx
│   ├── Lecture 06_ Logistic Regression.pptx
│   ├── Lecture 07_ Wide _ Deep.pptx
│   ├── Lecture 08_ DataLoader.pptx
│   ├── Lecture 09_ Softmax Classifier.pptx
│   ├── Lecture 10_ Basic CNN.pptx
│   ├── Lecture 11_ Advanced CNN.pptx
│   ├── Lecture 12_ RNN.pdf
│   ├── Lecture 12_ RNN.pptx
│   ├── Lecture 13_ RNN II.pdf
│   ├── Lecture 13_ RNN II.pptx
│   ├── Lecture 14_ Seq2Seq.pptx
│   ├── Lecture 15_ NSML_ Smartest ML Platform.pptx
│   ├── P-Epilogue_ What_s the next_.pptx
│   └── template.pptx
└── tutorials
    ├── 01-basics
    │   ├── Back_propagration
    │   │   └── main.py
    │   ├── DataLoader
    │   │   ├── main.py
    │   │   └── main_logistic.py
    │   ├── Feedforward_neural_network
    │   │   ├── main-gpu.py
    │   │   └── main.py
    │   ├── Gradient_Descent
    │   │   └── main.py
    │   ├── Linear_Model
    │   │   └── main.py
    │   ├── Linear_regression
    │   │   ├── lr.py
    │   │   └── main.py
    │   ├── Logistic_regression
    │   │   └── main.py
    │   ├── Softmax_Classifier
    │   │   ├── main.py
    │   │   └── main_mnist.py
    │   ├── Wide_Deep
    │   │   └── main.py
    │   └── pytorch_basics
    │       └── main.py
    ├── 02-intermediate
    │   ├── bidirectional_recurrent_neural_network
    │   │   ├── main-gpu.py
    │   │   └── main.py
    │   ├── convolutional_neural_network
    │   │   ├── main-gpu.py
    │   │   └── main.py
    │   ├── deep_residual_network
    │   │   ├── main-gpu.py
    │   │   └── main.py
    │   ├── generative_adversarial_network
    │   │   └── main.py
    │   ├── language_model
    │   │   ├── data
    │   │   │   └── train.txt
    │   │   ├── data_utils.py
    │   │   ├── main-gpu.py
    │   │   └── main.py
    │   └── recurrent_neural_network
    │       ├── main-gpu.py
    │       └── main.py
    ├── 03-advanced
    │   ├── deep_convolutional_gan
    │   │   ├── README.md
    │   │   ├── data_loader.py
    │   │   ├── download.sh
    │   │   ├── main.py
    │   │   ├── model.py
    │   │   ├── png
    │   │   │   ├── dcgan.png
    │   │   │   ├── sample1.png
    │   │   │   └── sample2.png
    │   │   ├── requirements.txt
    │   │   └── solver.py
    │   ├── image_captioning
    │   │   ├── README.md
    │   │   ├── build_vocab.py
    │   │   ├── data_loader.py
    │   │   ├── download.sh
    │   │   ├── model.py
    │   │   ├── png
    │   │   │   ├── example.png
    │   │   │   ├── image_captioning.png
    │   │   │   └── model.png
    │   │   ├── requirements.txt
    │   │   ├── resize.py
    │   │   ├── sample.py
    │   │   └── train.py
    │   ├── neural_style_transfer
    │   │   ├── README.md
    │   │   ├── main.py
    │   │   ├── png
    │   │   │   ├── content.png
    │   │   │   ├── neural_style.png
    │   │   │   ├── neural_style2.png
    │   │   │   ├── style.png
    │   │   │   ├── style2.png
    │   │   │   ├── style3.png
    │   │   │   └── style4.png
    │   │   └── requirements.txt
    │   └── variational_auto_encoder
    │       ├── README.md
    │       ├── main.py
    │       ├── png
    │       │   ├── real.png
    │       │   ├── reconst.png
    │       │   └── vae.png
    │       └── requirements.txt
    └── 04-utils
        └── tensorboard
            ├── README.md
            ├── gif
            │   ├── g
            │   └── tensorboard.gif
            ├── logger.py
            ├── main.py
            └── requirements.txt


/README.md:
--------------------------------------------------------------------------------


--------------------------------------------------------------------------------

This repository provides tutorial code for deep learning researchers to learn [PyTorch](https://github.com/pytorch/pytorch). In the tutorial, most of the models are implemented in fewer than 30 lines of code. Before starting this tutorial, it is recommended to finish the [Official PyTorch Tutorial](http://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html).


## Table of Contents
#### PPTs
* [PyTorchZeroToAll](https://drive.google.com/drive/folders/0B41Zbb4c8HVyUndGdGdJSXd5d3M)
#### Videos
* [PyTorchZeroToAll-lecture1~lecture4-Chinese version](https://youtu.be/MuyUFqJf_Ug)
#### Install
* [PyTorch install]
## Installing on OSX with pip (Python 3.6, without CUDA)
```bash
$ pip3 install http://download.pytorch.org/whl/torch-0.3.1-cp36-cp36m-macosx_10_7_x86_64.whl
$ pip3 install torchvision
```
## Installing on Linux with pip (Python 3.5, with CUDA 9.0)
```bash
$ pip3 install http://download.pytorch.org/whl/cu90/torch-0.3.1-cp35-cp35m-linux_x86_64.whl
$ pip3 install torchvision
```
#### 1. Basics
* [Lecture2:Linear Model](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/01-basics/Linear_Model/main.py)
* [Lecture3:Gradient_Descent](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/01-basics/Gradient_Descent/main.py)
* [Lecture4:Back_propagration](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/01-basics/Back_propagration/main.py)
* [Lecture5:Linear_regression](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/01-basics/Linear_regression/main.py)
* [Lecture6:Logistic_regression](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/01-basics/Logistic_regression/main.py)
* [Lecture7:Wide_Deep](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/01-basics/Wide_Deep/main.py)
* [Lecture8:DataLoader](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/01-basics/DataLoader/main.py)
* [Lecture8:DataLoader_logistic](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/01-basics/DataLoader/main_logistic.py)
* [Lecture9:Softmax_Classifier](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/01-basics/Softmax_Classifier/main.py)
* [Lecture9:Softmax_Classifier_mnist](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/01-basics/Softmax_Classifier/main_mnist.py)

#### 2. Intermediate
* [Convolutional Neural Network](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/02-intermediate/convolutional_neural_network/main.py#L33-L53)
* [Deep Residual Network](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/02-intermediate/deep_residual_network/main.py#L67-L103)
* [Recurrent Neural Network](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/02-intermediate/recurrent_neural_network/main.py#L38-L56)
* [Bidirectional Recurrent Neural Network](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/02-intermediate/bidirectional_recurrent_neural_network/main.py#L38-L57)
* [Language Model (RNN-LM)](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/02-intermediate/language_model/main.py#L28-L53)
* [Generative Adversarial Network](https://github.com/Tim810306/PytorchTutorial/blob/master/tutorials/02-intermediate/generative_adversarial_network/main.py#L34-L50)

#### 3. Advanced
* [Image Captioning (CNN-RNN)](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/03-advanced/image_captioning)
* [Deep Convolutional GAN (DCGAN)](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/03-advanced/deep_convolutional_gan)
* [Variational Auto-Encoder](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/03-advanced/variational_auto_encoder)
* [Neural Style Transfer](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/03-advanced/neural_style_transfer)

#### 4. Utilities
* [TensorBoard in PyTorch](https://github.com/Tim810306/PytorchTutorial/tree/master/tutorials/04-utils/tensorboard)


## Getting Started
```bash
$ git clone https://github.com/Tim810306/PytorchTutorial.git
$ cd PytorchTutorial/tutorials/project_path
$ python main.py               # cpu version
$ python main-gpu.py           # gpu version
$ python main_XXX.py           # run the XXX variant (cpu version)
```

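As a rough aid for choosing between the two entry points above, the sketch below checks whether PyTorch can be imported and whether it reports a usable CUDA device. The helper name `pick_script` is hypothetical and not part of this repository; it assumes Python 3, and the only PyTorch call it relies on is `torch.cuda.is_available()`. Not every tutorial folder ships a `main-gpu.py`, so treat the result as a suggestion only.

```python
import importlib.util


def pick_script():
    """Suggest 'main-gpu.py' when a CUDA device is usable, else 'main.py'.

    Hypothetical helper for illustration; not part of this repository.
    """
    # PyTorch may not be installed yet, so probe for it before importing.
    if importlib.util.find_spec("torch") is None:
        return "main.py"
    import torch
    return "main-gpu.py" if torch.cuda.is_available() else "main.py"


if __name__ == "__main__":
    print("suggested script:", pick_script())
```

The lookup with `find_spec` keeps the check usable on a machine where the install step has not been completed yet: it falls back to the CPU script instead of raising `ImportError`.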

## Dependencies
* [Python 2.7 or 3.5](https://www.continuum.io/downloads)
* [PyTorch 0.1.12](http://pytorch.org/)





## Author
Cheng Yu Ting / [@Tim810306](https://github.com/Tim810306)

--------------------------------------------------------------------------------
/data/combine_A_and_B.py:
--------------------------------------------------------------------------------
import os
import numpy as np
import cv2
import argparse

parser = argparse.ArgumentParser('create image pairs')
parser.add_argument('--fold_A', dest='fold_A', help='input directory for image A', type=str, default='../dataset/50kshoes_edges')
parser.add_argument('--fold_B', dest='fold_B', help='input directory for image B', type=str, default='../dataset/50kshoes_jpg')
parser.add_argument('--fold_AB', dest='fold_AB', help='output directory', type=str, default='../dataset/test_AB')
parser.add_argument('--num_imgs', dest='num_imgs', help='number of images', type=int, default=1000000)
parser.add_argument('--use_AB', dest='use_AB', help='if true: (0001_A, 0001_B) to (0001_AB)', action='store_true')
args = parser.parse_args()

for arg in vars(args):
    print('[%s] = ' % arg, getattr(args, arg))

# Each subdirectory of fold_A (e.g. train, val, test) is treated as one split.
splits = os.listdir(args.fold_A)

for sp in splits:
    img_fold_A = os.path.join(args.fold_A, sp)
    img_fold_B = os.path.join(args.fold_B, sp)
    img_list = os.listdir(img_fold_A)
    if args.use_AB:
        img_list = [img_path for img_path in img_list if '_A.' in img_path]

    num_imgs = min(args.num_imgs, len(img_list))
    print('split = %s, use %d/%d images' % (sp, num_imgs, len(img_list)))
    img_fold_AB = os.path.join(args.fold_AB, sp)
    if not os.path.isdir(img_fold_AB):
        os.makedirs(img_fold_AB)
    print('split = %s, number of images = %d' % (sp, num_imgs))
    for n in range(num_imgs):
        name_A = img_list[n]
        path_A = os.path.join(img_fold_A, name_A)
        if args.use_AB:
            name_B = name_A.replace('_A.', '_B.')
        else:
            name_B = name_A
        path_B = os.path.join(img_fold_B, name_B)
        if os.path.isfile(path_A) and os.path.isfile(path_B):
            name_AB = name_A
            if args.use_AB:
                name_AB = name_AB.replace('_A.', '.')  # remove _A
            path_AB = os.path.join(img_fold_AB, name_AB)
            # Load both images as 3-channel colour images.
            im_A = cv2.imread(path_A, cv2.IMREAD_COLOR)
            im_B = cv2.imread(path_B, cv2.IMREAD_COLOR)
            # Place A and B side by side (concatenate along the width axis).
            im_AB = np.concatenate([im_A, im_B], 1)
            cv2.imwrite(path_AB, im_AB)

--------------------------------------------------------------------------------
/data/diabetes.csv.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/data/diabetes.csv.gz

--------------------------------------------------------------------------------
/data/downloadCelebA.sh:
--------------------------------------------------------------------------------
# CelebA images
URL=https://www.dropbox.com/s/3e5cmqgplchz85o/CelebA_nocrop.zip?dl=0
ZIP_FILE=./data/CelebA_nocrop.zip
mkdir -p ./data/
wget -N "$URL" -O "$ZIP_FILE"
unzip "$ZIP_FILE" -d ./data/
rm "$ZIP_FILE"

# CelebA attribute labels
URL=https://www.dropbox.com/s/auexdy98c6g7y25/list_attr_celeba.zip?dl=0
ZIP_FILE=./data/list_attr_celeba.zip
wget -N "$URL" -O "$ZIP_FILE"
unzip "$ZIP_FILE" -d ./data/
rm "$ZIP_FILE"

-------------------------------------------------------------------------------- /data/download_cyclegan_dataset.sh: -------------------------------------------------------------------------------- 1 | FILE=$1 2 | 3 | if [[ $FILE != "ae_photos" && $FILE != "apple2orange" && $FILE != "summer2winter_yosemite" && $FILE != "horse2zebra" && $FILE != "monet2photo" && $FILE != "cezanne2photo" && $FILE != "ukiyoe2photo" && $FILE != "vangogh2photo" && $FILE != "maps" && $FILE != "cityscapes" && $FILE != "facades" && $FILE != "iphone2dslr_flower" && $FILE != "ae_photos" ]]; then 4 | echo "Available datasets are: apple2orange, summer2winter_yosemite, horse2zebra, monet2photo, cezanne2photo, ukiyoe2photo, vangogh2photo, maps, cityscapes, facades, iphone2dslr_flower, ae_photos" 5 | exit 1 6 | fi 7 | 8 | URL=https://people.eecs.berkeley.edu/~taesung_park/CycleGAN/datasets/$FILE.zip 9 | ZIP_FILE=./datasets/$FILE.zip 10 | TARGET_DIR=./datasets/$FILE/ 11 | wget -N $URL -O $ZIP_FILE 12 | mkdir $TARGET_DIR 13 | unzip $ZIP_FILE -d ./datasets/ 14 | rm $ZIP_FILE 15 | -------------------------------------------------------------------------------- /data/download_cyclegan_datasetTrain.sh: -------------------------------------------------------------------------------- 1 | FILE=$1 2 | 3 | if [[ $FILE != "ae_photos" && $FILE != "apple2orange" && $FILE != "summer2winter_yosemite" && $FILE != "horse2zebra" && $FILE != "monet2photo" && $FILE != "cezanne2photo" && $FILE != "ukiyoe2photo" && $FILE != "vangogh2photo" && $FILE != "maps" && $FILE != "cityscapes" && $FILE != "facades" && $FILE != "iphone2dslr_flower" && $FILE != "ae_photos" ]]; then 4 | echo "Available datasets are: apple2orange, summer2winter_yosemite, horse2zebra, monet2photo, cezanne2photo, ukiyoe2photo, vangogh2photo, maps, cityscapes, facades, iphone2dslr_flower, ae_photos" 5 | exit 1 6 | fi 7 | 8 | URL=https://people.eecs.berkeley.edu/~taesung_park/CycleGAN/datasets/$FILE.zip 9 | ZIP_FILE=./datasets/$FILE.zip 10 | 
TARGET_DIR=./datasets/$FILE/ 11 | wget -N $URL -O $ZIP_FILE 12 | mkdir $TARGET_DIR 13 | unzip $ZIP_FILE -d ./datasets/ 14 | rm $ZIP_FILE 15 | -------------------------------------------------------------------------------- /data/download_pix2pix_dataset.sh: -------------------------------------------------------------------------------- 1 | FILE=$1 2 | URL=https://people.eecs.berkeley.edu/~tinghuiz/projects/pix2pix/datasets/$FILE.tar.gz 3 | TAR_FILE=./datasets/$FILE.tar.gz 4 | TARGET_DIR=./datasets/$FILE/ 5 | wget -N $URL -O $TAR_FILE 6 | mkdir $TARGET_DIR 7 | tar -zxvf $TAR_FILE -C ./datasets/ 8 | rm $TAR_FILE -------------------------------------------------------------------------------- /data/lsun/README.md: -------------------------------------------------------------------------------- 1 | # LSUN 2 | 3 | Please check [LSUN webpage](http://www.yf.io/p/lsun) for more information about the dataset. 4 | 5 | ## Data Release 6 | 7 | All the images in one category are stored in one lmdb database 8 | file. The value 9 | of each entry is the jpg binary data. We resize all the images so 10 | that the 11 | smaller dimension is 256 and compress the images in jpeg with 12 | quality 75. 
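
The resizing rule described above (smaller dimension scaled to 256, aspect ratio preserved) comes down to a little arithmetic, sketched below. This only illustrates the rule as stated; the function name `lsun_target_size` and the choice to round to the nearest pixel are assumptions, not the LSUN authors' code.

```python
def lsun_target_size(width, height, short_side=256):
    """Scale (width, height) so that the smaller dimension equals short_side."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    scale = short_side / float(min(width, height))
    # Rounding to the nearest pixel is an assumption made for this sketch.
    return int(round(width * scale)), int(round(height * scale))


if __name__ == "__main__":
    print(lsun_target_size(1024, 512))  # landscape input -> (512, 256)
    print(lsun_target_size(300, 600))   # portrait input  -> (256, 512)
```

The returned pair could be passed to an image library's resize call before saving as JPEG; which library and interpolation the dataset authors used is not stated here.
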

### Citing LSUN

If you find the LSUN dataset useful in your research, please consider citing:

    @article{yu15lsun,
        Author = {Yu, Fisher and Zhang, Yinda and Song, Shuran and Seff, Ari and Xiao, Jianxiong},
        Title = {LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop},
        Journal = {arXiv preprint arXiv:1506.03365},
        Year = {2015}
    }

### Download data
Please make sure you have cURL installed
```bash
# Download the whole latest data set
python2.7 download.py
# Download the whole latest data set to <out_dir>
python2.7 download.py -o <out_dir>
# Download data for bedroom
python2.7 download.py -c bedroom
# Download testing set
python2.7 download.py -c test
```

## Demo code

### Dependency

Install Python

Install Python dependencies: numpy, lmdb, opencv

### Usage:

View the lmdb content

```bash
python2.7 data.py view <lmdb_path>
```

Export the images to a folder

```bash
python2.7 data.py export <lmdb_path> --out_dir <out_dir>
```

### Example:

Export all the images in the validation sets in the current folder to a
"data" subfolder.

```bash
python2.7 data.py export *_val_lmdb --out_dir data
```

## Submission

We expect one category prediction for each image in the testing
set. The name of each image is the key value in the LMDB
database. Each category has an index as listed in
[index list](https://github.com/fyu/lsun_toolkit/blob/master/category_indices.txt). The
submitted results on the testing set will be stored in a text file
with one line per image. In each line, there are two fields separated
by whitespace. The first is the image key and the second is the
predicted category index.
For example:

```
0001c44e5f5175a7e6358d207660f971d90abaf4 0
000319b73404935eec40ac49d1865ce197b3a553 1
00038e8b13a97577ada8a884702d607220ce6d15 2
00039ba1bf659c30e50b757280efd5eba6fc2fe1 3
...
```

The score for the submission is the percentage of correctly predicted
labels. In our evaluation, we will double check our ground truth
labels for the testing images and we may remove some images with
controversial labels in the final evaluation.

--------------------------------------------------------------------------------
/data/lsun/category_indices.txt:
--------------------------------------------------------------------------------
bedroom 0
bridge 1
church_outdoor 2
classroom 3
conference_room 4
dining_room 5
kitchen 6
living_room 7
restaurant 8
tower 9

--------------------------------------------------------------------------------
/data/lsun/data.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python2.7

from __future__ import print_function
import argparse
import cv2
import lmdb
import numpy
import os
from os.path import exists, join

__author__ = 'Fisher Yu'
__email__ = 'fy@cs.princeton.edu'
__license__ = 'MIT'


def view(db_path):
    print('Viewing', db_path)
    print('Press ESC to exit or SPACE to advance.')
    window_name = 'LSUN'
    cv2.namedWindow(window_name)
    env = lmdb.open(db_path, map_size=1099511627776,
                    max_readers=100, readonly=True)
    with env.begin(write=False) as txn:
        cursor = txn.cursor()
        for key, val in cursor:
            print('Current key:', key)
            # Decode the stored image bytes into a colour image.
            img = cv2.imdecode(
                numpy.frombuffer(val, dtype=numpy.uint8), 1)
            cv2.imshow(window_name, img)
            c = cv2.waitKey()
            if c == 27:  # ESC
                break


def export_images(db_path, out_dir, flat=False, limit=-1):
    print('Exporting', db_path, 'to', out_dir)
    env = lmdb.open(db_path, map_size=1099511627776,
                    max_readers=100, readonly=True)
    count = 0
    with env.begin(write=False) as txn:
        cursor = txn.cursor()
        for key, val in cursor:
            if not flat:
                # Nest the output by the first six characters of the key.
                image_out_dir = join(out_dir, '/'.join(key[:6]))
            else:
                image_out_dir = out_dir
            if not exists(image_out_dir):
                os.makedirs(image_out_dir)
            image_out_path = join(image_out_dir, key + '.webp')
            # The value is raw image bytes, so write in binary mode.
            with open(image_out_path, 'wb') as fp:
                fp.write(val)
            count += 1
            if count == limit:
                break
            if count % 1000 == 0:
                print('Finished', count, 'images')


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('command', nargs='?', type=str,
                        choices=['view', 'export'],
                        help='view: view the images in the lmdb database '
                             'interactively.\n'
                             'export: Export the images in the lmdb databases '
                             'to a folder. The images are grouped in subfolders'
                             ' determined by the prefix of image key.')
    parser.add_argument('lmdb_path', nargs='+', type=str,
                        help='The path to the lmdb database folder. '
                             'Support multiple database paths.')
    parser.add_argument('--out_dir', type=str, default='')
    parser.add_argument('--flat', action='store_true',
                        help='If enabled, the images are exported into output '
                             'directory directly instead of hierarchical '
                             'directories.')
    args = parser.parse_args()

    command = args.command
    lmdb_paths = args.lmdb_path

    for lmdb_path in lmdb_paths:
        if command == 'view':
            view(lmdb_path)
        elif command == 'export':
            export_images(lmdb_path, args.out_dir, args.flat)


if __name__ == '__main__':
    main()

--------------------------------------------------------------------------------
/data/lsun/download.py:
--------------------------------------------------------------------------------
#!/usr/bin/env python
# -*- coding: utf-8 -*-

from __future__ import print_function, division
import argparse
import json
from os.path import join

import subprocess
import urllib2

__author__ = 'Fisher Yu'
__email__ = 'fy@cs.princeton.edu'
__license__ = 'MIT'


def list_categories(tag):
    url = 'http://lsun.cs.princeton.edu/htbin/list.cgi?tag=' + tag
    f = urllib2.urlopen(url)
    return json.loads(f.read())


def download(out_dir, category, set_name, tag):
    url = 'http://lsun.cs.princeton.edu/htbin/download.cgi?tag={tag}' \
          '&category={category}&set={set_name}'.format(**locals())
    if set_name == 'test':
        out_name = 'test_lmdb.zip'
    else:
        out_name = '{category}_{set_name}_lmdb.zip'.format(**locals())
    out_path = join(out_dir, out_name)
    cmd = ['curl', url, '-o', out_path]
    print('Downloading', category, set_name, 'set')
    subprocess.call(cmd)


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--tag', type=str, default='latest')
    parser.add_argument('-o', '--out_dir', default='')
    parser.add_argument('-c', '--category', default=None)
    args = parser.parse_args()

    categories = list_categories(args.tag)
    if args.category is None:
        print('Downloading', len(categories), 'categories')
        for category in categories:
            download(args.out_dir, category, 'train', args.tag)
            download(args.out_dir, category, 'val', args.tag)
        download(args.out_dir, '', 'test', args.tag)
    else:
        if args.category == 'test':
            download(args.out_dir, '', 'test', args.tag)
        elif args.category not in categories:
            print('Error:', args.category, "doesn't exist in",
                  args.tag, 'LSUN release')
        else:
            download(args.out_dir, args.category, 'train', args.tag)
            download(args.out_dir, args.category, 'val', args.tag)


if __name__ == '__main__':
    main()

--------------------------------------------------------------------------------
/data/make_dataset_aligned.py:
--------------------------------------------------------------------------------
import os

from PIL import Image


def get_file_paths(folder):
    image_file_paths = []
    for root, dirs, filenames in os.walk(folder):
        filenames = sorted(filenames)
        for filename in filenames:
            input_path = os.path.abspath(root)
            file_path = os.path.join(input_path, filename)
            if filename.endswith('.png') or filename.endswith('.jpg'):
                image_file_paths.append(file_path)

        break  # prevent descending into subfolders
    return image_file_paths


def align_images(a_file_paths, b_file_paths, target_path):
    if not os.path.exists(target_path):
        os.makedirs(target_path)

    for i in range(len(a_file_paths)):
        img_a = Image.open(a_file_paths[i])
        img_b = Image.open(b_file_paths[i])
        assert(img_a.size == img_b.size)

        # Paste A on the left half and B on the right half of a double-width canvas.
        aligned_image = Image.new("RGB", (img_a.size[0] * 2, img_a.size[1]))
        aligned_image.paste(img_a, (0, 0))
        aligned_image.paste(img_b, (img_a.size[0], 0))
        aligned_image.save(os.path.join(target_path, '{:04d}.jpg'.format(i)))


if __name__ == '__main__':
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--dataset-path',
        dest='dataset_path',
        help='Which folder to process (it should have subfolders testA, testB, trainA and trainB)'
    )
    args = parser.parse_args()

    dataset_folder = args.dataset_path
    print(dataset_folder)

    test_a_path = os.path.join(dataset_folder, 'testA')
    test_b_path = os.path.join(dataset_folder, 'testB')
    test_a_file_paths = get_file_paths(test_a_path)
    test_b_file_paths = get_file_paths(test_b_path)
    assert(len(test_a_file_paths) == len(test_b_file_paths))
    test_path = os.path.join(dataset_folder, 'test')

    train_a_path = os.path.join(dataset_folder, 'trainA')
    train_b_path = os.path.join(dataset_folder, 'trainB')
    train_a_file_paths = get_file_paths(train_a_path)
    train_b_file_paths = get_file_paths(train_b_path)
    assert(len(train_a_file_paths) == len(train_b_file_paths))
    train_path = os.path.join(dataset_folder, 'train')

    align_images(test_a_file_paths, test_b_file_paths, test_path)
    align_images(train_a_file_paths, train_b_file_paths, train_path)

--------------------------------------------------------------------------------
/data/names_test.csv.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/data/names_test.csv.gz

--------------------------------------------------------------------------------
/data/names_train.csv.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/data/names_train.csv.gz

-------------------------------------------------------------------------------- /data/setting_up_script.sh: -------------------------------------------------------------------------------- 1 | ## First time installs 2 | #pip install tensorboard_logger 3 | #pip install opencv-python 4 | 5 | ## Download CAT dataset 6 | wget -nc https://archive.org/download/CAT_DATASET/CAT_DATASET_01.zip 7 | wget -nc https://archive.org/download/CAT_DATASET/CAT_DATASET_02.zip 8 | wget -nc https://archive.org/download/CAT_DATASET/00000003_015.jpg.cat 9 | 10 | ## Setting up folder 11 | unzip CAT_DATASET_01.zip -d cat_dataset 12 | unzip CAT_DATASET_02.zip -d cat_dataset 13 | mv cat_dataset/CAT_00/* cat_dataset 14 | rmdir cat_dataset/CAT_00 15 | mv cat_dataset/CAT_01/* cat_dataset 16 | rmdir cat_dataset/CAT_01 17 | mv cat_dataset/CAT_02/* cat_dataset 18 | rmdir cat_dataset/CAT_02 19 | mv cat_dataset/CAT_03/* cat_dataset 20 | rmdir cat_dataset/CAT_03 21 | mv cat_dataset/CAT_04/* cat_dataset 22 | rmdir cat_dataset/CAT_04 23 | mv cat_dataset/CAT_05/* cat_dataset 24 | rmdir cat_dataset/CAT_05 25 | mv cat_dataset/CAT_06/* cat_dataset 26 | rmdir cat_dataset/CAT_06 27 | 28 | ## Error correction 29 | rm cat_dataset/00000003_019.jpg.cat 30 | mv 00000003_015.jpg.cat cat_dataset/00000003_015.jpg.cat 31 | 32 | ## Removing outliers 33 | # Corrupted, drawings, badly cropped, inverted, impossible to tell it's a cat, blocked face 34 | cd cat_dataset 35 | rm 00000004_007.jpg 00000007_002.jpg 00000045_028.jpg 00000050_014.jpg 00000056_013.jpg 00000059_002.jpg 00000108_005.jpg 00000122_023.jpg 00000126_005.jpg 00000132_018.jpg 00000142_024.jpg 00000142_029.jpg 00000143_003.jpg 00000145_021.jpg 00000166_021.jpg 00000169_021.jpg 00000186_002.jpg 00000202_022.jpg 00000208_023.jpg 00000210_003.jpg 00000229_005.jpg 00000236_025.jpg 00000249_016.jpg 00000254_013.jpg 00000260_019.jpg 00000261_029.jpg 00000265_029.jpg 00000271_020.jpg 00000282_026.jpg 00000316_004.jpg 00000352_014.jpg 00000400_026.jpg 
00000406_006.jpg 00000431_024.jpg 00000443_027.jpg 00000502_015.jpg 00000504_012.jpg 00000510_019.jpg 00000514_016.jpg 00000514_008.jpg 00000515_021.jpg 00000519_015.jpg 00000522_016.jpg 00000523_021.jpg 00000529_005.jpg 00000556_022.jpg 00000574_011.jpg 00000581_018.jpg 00000582_011.jpg 00000588_016.jpg 00000588_019.jpg 00000590_006.jpg 00000592_018.jpg 00000593_027.jpg 00000617_013.jpg 00000618_016.jpg 00000619_025.jpg 00000622_019.jpg 00000622_021.jpg 00000630_007.jpg 00000645_016.jpg 00000656_017.jpg 00000659_000.jpg 00000660_022.jpg 00000660_029.jpg 00000661_016.jpg 00000663_005.jpg 00000672_027.jpg 00000673_027.jpg 00000675_023.jpg 00000692_006.jpg 00000800_017.jpg 00000805_004.jpg 00000807_020.jpg 00000823_010.jpg 00000824_010.jpg 00000836_008.jpg 00000843_021.jpg 00000850_025.jpg 00000862_017.jpg 00000864_007.jpg 00000865_015.jpg 00000870_007.jpg 00000877_014.jpg 00000882_013.jpg 00000887_028.jpg 00000893_022.jpg 00000907_013.jpg 00000921_029.jpg 00000929_022.jpg 00000934_006.jpg 00000960_021.jpg 00000976_004.jpg 00000987_000.jpg 00000993_009.jpg 00001006_014.jpg 00001008_013.jpg 00001012_019.jpg 00001014_005.jpg 00001020_017.jpg 00001039_008.jpg 00001039_023.jpg 00001048_029.jpg 00001057_003.jpg 00001068_005.jpg 00001113_015.jpg 00001140_007.jpg 00001157_029.jpg 00001158_000.jpg 00001167_007.jpg 00001184_007.jpg 00001188_019.jpg 00001204_027.jpg 00001205_022.jpg 00001219_005.jpg 00001243_010.jpg 00001261_005.jpg 00001270_028.jpg 00001274_006.jpg 00001293_015.jpg 00001312_021.jpg 00001365_026.jpg 00001372_006.jpg 00001379_018.jpg 00001388_024.jpg 00001389_026.jpg 00001418_028.jpg 00001425_012.jpg 00001431_001.jpg 00001456_018.jpg 00001458_003.jpg 00001468_019.jpg 00001475_009.jpg 00001487_020.jpg 36 | rm 00000004_007.jpg.cat 00000007_002.jpg.cat 00000045_028.jpg.cat 00000050_014.jpg.cat 00000056_013.jpg.cat 00000059_002.jpg.cat 00000108_005.jpg.cat 00000122_023.jpg.cat 00000126_005.jpg.cat 00000132_018.jpg.cat 00000142_024.jpg.cat 00000142_029.jpg.cat 
00000143_003.jpg.cat 00000145_021.jpg.cat 00000166_021.jpg.cat 00000169_021.jpg.cat 00000186_002.jpg.cat 00000202_022.jpg.cat 00000208_023.jpg.cat 00000210_003.jpg.cat 00000229_005.jpg.cat 00000236_025.jpg.cat 00000249_016.jpg.cat 00000254_013.jpg.cat 00000260_019.jpg.cat 00000261_029.jpg.cat 00000265_029.jpg.cat 00000271_020.jpg.cat 00000282_026.jpg.cat 00000316_004.jpg.cat 00000352_014.jpg.cat 00000400_026.jpg.cat 00000406_006.jpg.cat 00000431_024.jpg.cat 00000443_027.jpg.cat 00000502_015.jpg.cat 00000504_012.jpg.cat 00000510_019.jpg.cat 00000514_016.jpg.cat 00000514_008.jpg.cat 00000515_021.jpg.cat 00000519_015.jpg.cat 00000522_016.jpg.cat 00000523_021.jpg.cat 00000529_005.jpg.cat 00000556_022.jpg.cat 00000574_011.jpg.cat 00000581_018.jpg.cat 00000582_011.jpg.cat 00000588_016.jpg.cat 00000588_019.jpg.cat 00000590_006.jpg.cat 00000592_018.jpg.cat 00000593_027.jpg.cat 00000617_013.jpg.cat 00000618_016.jpg.cat 00000619_025.jpg.cat 00000622_019.jpg.cat 00000622_021.jpg.cat 00000630_007.jpg.cat 00000645_016.jpg.cat 00000656_017.jpg.cat 00000659_000.jpg.cat 00000660_022.jpg.cat 00000660_029.jpg.cat 00000661_016.jpg.cat 00000663_005.jpg.cat 00000672_027.jpg.cat 00000673_027.jpg.cat 00000675_023.jpg.cat 00000692_006.jpg.cat 00000800_017.jpg.cat 00000805_004.jpg.cat 00000807_020.jpg.cat 00000823_010.jpg.cat 00000824_010.jpg.cat 00000836_008.jpg.cat 00000843_021.jpg.cat 00000850_025.jpg.cat 00000862_017.jpg.cat 00000864_007.jpg.cat 00000865_015.jpg.cat 00000870_007.jpg.cat 00000877_014.jpg.cat 00000882_013.jpg.cat 00000887_028.jpg.cat 00000893_022.jpg.cat 00000907_013.jpg.cat 00000921_029.jpg.cat 00000929_022.jpg.cat 00000934_006.jpg.cat 00000960_021.jpg.cat 00000976_004.jpg.cat 00000987_000.jpg.cat 00000993_009.jpg.cat 00001006_014.jpg.cat 00001008_013.jpg.cat 00001012_019.jpg.cat 00001014_005.jpg.cat 00001020_017.jpg.cat 00001039_008.jpg.cat 00001039_023.jpg.cat 00001048_029.jpg.cat 00001057_003.jpg.cat 00001068_005.jpg.cat 00001113_015.jpg.cat 00001140_007.jpg.cat 
00001157_029.jpg.cat 00001158_000.jpg.cat 00001167_007.jpg.cat 00001184_007.jpg.cat 00001188_019.jpg.cat 00001204_027.jpg.cat 00001205_022.jpg.cat 00001219_005.jpg.cat 00001243_010.jpg.cat 00001261_005.jpg.cat 00001270_028.jpg.cat 00001274_006.jpg.cat 00001293_015.jpg.cat 00001312_021.jpg.cat 00001365_026.jpg.cat 00001372_006.jpg.cat 00001379_018.jpg.cat 00001388_024.jpg.cat 00001389_026.jpg.cat 00001418_028.jpg.cat 00001425_012.jpg.cat 00001431_001.jpg.cat 00001456_018.jpg.cat 00001458_003.jpg.cat 00001468_019.jpg.cat 00001475_009.jpg.cat 00001487_020.jpg.cat 37 | cd .. 38 | 39 | ## Preprocessing and putting in folders for different image sizes 40 | mkdir cats_bigger_than_64x64 41 | mkdir cats_bigger_than_128x128 42 | wget -nc https://raw.githubusercontent.com/AlexiaJM/Deep-learning-with-cats/master/preprocess_cat_dataset.py 43 | python preprocess_cat_dataset.py 44 | 45 | ## Removing cat_dataset 46 | rm -r cat_dataset 47 | 48 | ## Move to your favorite place 49 | #mv cats_bigger_than_64x64 /home/alexia/Datasets/Meow_64x64 50 | #mv cats_bigger_than_128x128 /home/alexia/Datasets/Meow_128x128 51 | -------------------------------------------------------------------------------- /data/shakespeare.txt.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/data/shakespeare.txt.gz -------------------------------------------------------------------------------- /logo/pytorch_logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/logo/pytorch_logo.png -------------------------------------------------------------------------------- /slides/Lecture 02_ Linear Model.pptx: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 02_ Linear Model.pptx -------------------------------------------------------------------------------- /slides/Lecture 03_ Gradient Descent.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 03_ Gradient Descent.pptx -------------------------------------------------------------------------------- /slides/Lecture 04_ Back-propagation and PyTorch autograd.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 04_ Back-propagation and PyTorch autograd.pptx -------------------------------------------------------------------------------- /slides/Lecture 05_ Linear regression in PyTorch way.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 05_ Linear regression in PyTorch way.pptx -------------------------------------------------------------------------------- /slides/Lecture 06_ Logistic Regression.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 06_ Logistic Regression.pptx -------------------------------------------------------------------------------- /slides/Lecture 07_ Wide _ Deep.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 07_ Wide _ Deep.pptx 
-------------------------------------------------------------------------------- /slides/Lecture 08_ DataLoader.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 08_ DataLoader.pptx -------------------------------------------------------------------------------- /slides/Lecture 09_ Softmax Classifier.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 09_ Softmax Classifier.pptx -------------------------------------------------------------------------------- /slides/Lecture 10_ Basic CNN.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 10_ Basic CNN.pptx -------------------------------------------------------------------------------- /slides/Lecture 11_ Advanced CNN.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 11_ Advanced CNN.pptx -------------------------------------------------------------------------------- /slides/Lecture 12_ RNN.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 12_ RNN.pdf -------------------------------------------------------------------------------- /slides/Lecture 12_ RNN.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 12_ RNN.pptx 
-------------------------------------------------------------------------------- /slides/Lecture 13_ RNN II.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 13_ RNN II.pdf -------------------------------------------------------------------------------- /slides/Lecture 13_ RNN II.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 13_ RNN II.pptx -------------------------------------------------------------------------------- /slides/Lecture 14_ Seq2Seq.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 14_ Seq2Seq.pptx -------------------------------------------------------------------------------- /slides/Lecture 15_ NSML_ Smartest ML Platform.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/Lecture 15_ NSML_ Smartest ML Platform.pptx -------------------------------------------------------------------------------- /slides/P-Epilogue_ What_s the next_.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/P-Epilogue_ What_s the next_.pptx -------------------------------------------------------------------------------- /slides/template.pptx: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/slides/template.pptx -------------------------------------------------------------------------------- /tutorials/01-basics/Back_propagration/main.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch.autograd import Variable 3 | 4 | x_data = [1.0, 2.0, 3.0] 5 | y_data = [2.0, 4.0, 6.0] 6 | 7 | w = Variable(torch.Tensor([1.0]), requires_grad=True) # Any random value 8 | 9 | # our model forward pass 10 | 11 | 12 | def forward(x): 13 | return x * w 14 | 15 | # Loss function 16 | 17 | 18 | def loss(x, y): 19 | y_pred = forward(x) 20 | return (y_pred - y) * (y_pred - y) 21 | 22 | # Before training 23 | print("predict (before training)", 4, forward(4).data[0]) 24 | 25 | # Training loop 26 | for epoch in range(10): 27 | for x_val, y_val in zip(x_data, y_data): 28 | l = loss(x_val, y_val) 29 | l.backward() 30 | print("\tgrad: ", x_val, y_val, w.grad.data[0]) 31 | w.data = w.data - 0.01 * w.grad.data 32 | 33 | # Manually zero the gradients after updating weights 34 | w.grad.data.zero_() 35 | 36 | print("progress:", epoch, l.data[0]) 37 | 38 | # After training 39 | print("predict (after training)", 4, forward(4).data[0]) 40 | -------------------------------------------------------------------------------- /tutorials/01-basics/DataLoader/main.py: -------------------------------------------------------------------------------- 1 | # References 2 | # https://github.com/yunjey/pytorch-tutorial/blob/master/tutorials/01-basics/pytorch_basics/main.py 3 | # http://pytorch.org/tutorials/beginner/data_loading_tutorial.html#dataset-class 4 | import torch 5 | import numpy as np 6 | from torch.autograd import Variable 7 | from torch.utils.data import Dataset, DataLoader 8 | 9 | 10 | class DiabetesDataset(Dataset): 11 | """ Diabetes dataset.""" 12 | 13 | # Initialize your data, download, etc. 
14 | def __init__(self): 15 | xy = np.loadtxt('./data/diabetes.csv.gz', 16 | delimiter=',', dtype=np.float32) 17 | self.len = xy.shape[0] 18 | self.x_data = torch.from_numpy(xy[:, 0:-1]) 19 | self.y_data = torch.from_numpy(xy[:, [-1]]) 20 | 21 | def __getitem__(self, index): 22 | return self.x_data[index], self.y_data[index] 23 | 24 | def __len__(self): 25 | return self.len 26 | 27 | 28 | dataset = DiabetesDataset() 29 | train_loader = DataLoader(dataset=dataset, 30 | batch_size=32, 31 | shuffle=True, 32 | num_workers=2) 33 | 34 | for epoch in range(2): 35 | for i, data in enumerate(train_loader, 0): 36 | # get the inputs 37 | inputs, labels = data 38 | 39 | # wrap them in Variable 40 | inputs, labels = Variable(inputs), Variable(labels) 41 | 42 | # Run your training process 43 | print(epoch, i, "inputs", inputs.data, "labels", labels.data) 44 | -------------------------------------------------------------------------------- /tutorials/01-basics/DataLoader/main_logistic.py: -------------------------------------------------------------------------------- 1 | # References 2 | # https://github.com/yunjey/pytorch-tutorial/blob/master/tutorials/01-basics/pytorch_basics/main.py 3 | # http://pytorch.org/tutorials/beginner/data_loading_tutorial.html#dataset-class 4 | import torch 5 | import numpy as np 6 | from torch.autograd import Variable 7 | from torch.utils.data import Dataset, DataLoader 8 | 9 | 10 | class DiabetesDataset(Dataset): 11 | """ Diabetes dataset.""" 12 | 13 | # Initialize your data, download, etc. 
14 | def __init__(self): 15 | xy = np.loadtxt('./data/diabetes.csv.gz', 16 | delimiter=',', dtype=np.float32) 17 | self.len = xy.shape[0] 18 | self.x_data = torch.from_numpy(xy[:, 0:-1]) 19 | self.y_data = torch.from_numpy(xy[:, [-1]]) 20 | 21 | def __getitem__(self, index): 22 | return self.x_data[index], self.y_data[index] 23 | 24 | def __len__(self): 25 | return self.len 26 | 27 | 28 | dataset = DiabetesDataset() 29 | train_loader = DataLoader(dataset=dataset, 30 | batch_size=32, 31 | shuffle=True, 32 | num_workers=2) 33 | 34 | 35 | class Model(torch.nn.Module): 36 | 37 | def __init__(self): 38 | """ 39 | In the constructor we instantiate three nn.Linear modules 40 | """ 41 | super(Model, self).__init__() 42 | self.l1 = torch.nn.Linear(8, 6) 43 | self.l2 = torch.nn.Linear(6, 4) 44 | self.l3 = torch.nn.Linear(4, 1) 45 | 46 | self.sigmoid = torch.nn.Sigmoid() 47 | 48 | def forward(self, x): 49 | """ 50 | In the forward function we accept a Variable of input data and we must return 51 | a Variable of output data. We can use Modules defined in the constructor as 52 | well as arbitrary operators on Variables. 53 | """ 54 | out1 = self.sigmoid(self.l1(x)) 55 | out2 = self.sigmoid(self.l2(out1)) 56 | y_pred = self.sigmoid(self.l3(out2)) 57 | return y_pred 58 | 59 | # our model 60 | model = Model() 61 | 62 | 63 | # Construct our loss function and an Optimizer. The call to model.parameters() 64 | # in the SGD constructor will contain the learnable parameters of the three 65 | # nn.Linear modules which are members of the model. 
66 | criterion = torch.nn.BCELoss(size_average=True) 67 | optimizer = torch.optim.SGD(model.parameters(), lr=0.1) 68 | 69 | # Training loop 70 | for epoch in range(2): 71 | for i, data in enumerate(train_loader, 0): 72 | # get the inputs 73 | inputs, labels = data 74 | 75 | # wrap them in Variable 76 | inputs, labels = Variable(inputs), Variable(labels) 77 | 78 | # Forward pass: Compute predicted y by passing x to the model 79 | y_pred = model(inputs) 80 | 81 | # Compute and print loss 82 | loss = criterion(y_pred, labels) 83 | print(epoch, i, loss.data[0]) 84 | 85 | # Zero gradients, perform a backward pass, and update the weights. 86 | optimizer.zero_grad() 87 | loss.backward() 88 | optimizer.step() 89 | -------------------------------------------------------------------------------- /tutorials/01-basics/Feedforward_neural_network/main-gpu.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torchvision.datasets as dsets 4 | import torchvision.transforms as transforms 5 | from torch.autograd import Variable 6 | 7 | 8 | # Hyper Parameters 9 | input_size = 784 10 | hidden_size = 500 11 | num_classes = 10 12 | num_epochs = 5 13 | batch_size = 100 14 | learning_rate = 0.001 15 | 16 | # MNIST Dataset 17 | train_dataset = dsets.MNIST(root='./data', 18 | train=True, 19 | transform=transforms.ToTensor(), 20 | download=True) 21 | 22 | test_dataset = dsets.MNIST(root='./data', 23 | train=False, 24 | transform=transforms.ToTensor()) 25 | 26 | # Data Loader (Input Pipeline) 27 | train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 28 | batch_size=batch_size, 29 | shuffle=True) 30 | 31 | test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 32 | batch_size=batch_size, 33 | shuffle=False) 34 | 35 | # Neural Network Model (1 hidden layer) 36 | class Net(nn.Module): 37 | def __init__(self, input_size, hidden_size, num_classes): 38 | super(Net, self).__init__() 39 | 
self.fc1 = nn.Linear(input_size, hidden_size) 40 | self.relu = nn.ReLU() 41 | self.fc2 = nn.Linear(hidden_size, num_classes) 42 | 43 | def forward(self, x): 44 | out = self.fc1(x) 45 | out = self.relu(out) 46 | out = self.fc2(out) 47 | return out 48 | 49 | net = Net(input_size, hidden_size, num_classes) 50 | net.cuda() 51 | 52 | # Loss and Optimizer 53 | criterion = nn.CrossEntropyLoss() 54 | optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate) 55 | 56 | # Train the Model 57 | for epoch in range(num_epochs): 58 | for i, (images, labels) in enumerate(train_loader): 59 | # Convert torch tensor to Variable 60 | images = Variable(images.view(-1, 28*28).cuda()) 61 | labels = Variable(labels.cuda()) 62 | 63 | # Forward + Backward + Optimize 64 | optimizer.zero_grad() # zero the gradient buffer 65 | outputs = net(images) 66 | loss = criterion(outputs, labels) 67 | loss.backward() 68 | optimizer.step() 69 | 70 | if (i+1) % 100 == 0: 71 | print ('Epoch [%d/%d], Step [%d/%d], Loss: %.4f' 72 | %(epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.data[0])) 73 | 74 | # Test the Model 75 | correct = 0 76 | total = 0 77 | for images, labels in test_loader: 78 | images = Variable(images.view(-1, 28*28)).cuda() 79 | outputs = net(images) 80 | _, predicted = torch.max(outputs.data, 1) 81 | total += labels.size(0) 82 | correct += (predicted.cpu() == labels).sum() 83 | 84 | print('Accuracy of the network on the 10000 test images: %d %%' % (100 * correct / total)) 85 | 86 | # Save the Model 87 | torch.save(net.state_dict(), 'model.pkl') 88 | -------------------------------------------------------------------------------- /tutorials/01-basics/Feedforward_neural_network/main.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torchvision.datasets as dsets 4 | import torchvision.transforms as transforms 5 | from torch.autograd import Variable 6 | 7 | 8 | # Hyper Parameters 9 | 
input_size = 784 10 | hidden_size = 500 11 | num_classes = 10 12 | num_epochs = 5 13 | batch_size = 100 14 | learning_rate = 0.001 15 | 16 | # MNIST Dataset 17 | train_dataset = dsets.MNIST(root='./data', 18 | train=True, 19 | transform=transforms.ToTensor(), 20 | download=True) 21 | 22 | test_dataset = dsets.MNIST(root='./data', 23 | train=False, 24 | transform=transforms.ToTensor()) 25 | 26 | # Data Loader (Input Pipeline) 27 | train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 28 | batch_size=batch_size, 29 | shuffle=True) 30 | 31 | test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 32 | batch_size=batch_size, 33 | shuffle=False) 34 | 35 | # Neural Network Model (1 hidden layer) 36 | class Net(nn.Module): 37 | def __init__(self, input_size, hidden_size, num_classes): 38 | super(Net, self).__init__() 39 | self.fc1 = nn.Linear(input_size, hidden_size) 40 | self.relu = nn.ReLU() 41 | self.fc2 = nn.Linear(hidden_size, num_classes) 42 | 43 | def forward(self, x): 44 | out = self.fc1(x) 45 | out = self.relu(out) 46 | out = self.fc2(out) 47 | return out 48 | 49 | net = Net(input_size, hidden_size, num_classes) 50 | 51 | 52 | # Loss and Optimizer 53 | criterion = nn.CrossEntropyLoss() 54 | optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate) 55 | 56 | # Train the Model 57 | for epoch in range(num_epochs): 58 | for i, (images, labels) in enumerate(train_loader): 59 | # Convert torch tensor to Variable 60 | images = Variable(images.view(-1, 28*28)) 61 | labels = Variable(labels) 62 | 63 | # Forward + Backward + Optimize 64 | optimizer.zero_grad() # zero the gradient buffer 65 | outputs = net(images) 66 | loss = criterion(outputs, labels) 67 | loss.backward() 68 | optimizer.step() 69 | 70 | if (i+1) % 100 == 0: 71 | print ('Epoch [%d/%d], Step [%d/%d], Loss: %.4f' 72 | %(epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.data[0])) 73 | 74 | # Test the Model 75 | correct = 0 76 | total = 0 77 | for images, labels in 
test_loader: 78 | images = Variable(images.view(-1, 28*28)) 79 | outputs = net(images) 80 | _, predicted = torch.max(outputs.data, 1) 81 | total += labels.size(0) 82 | correct += (predicted == labels).sum() 83 | 84 | print('Accuracy of the network on the 10000 test images: %d %%' % (100 * correct / total)) 85 | 86 | # Save the Model 87 | torch.save(net.state_dict(), 'model.pkl') -------------------------------------------------------------------------------- /tutorials/01-basics/Gradient_Descent/main.py: -------------------------------------------------------------------------------- 1 | x_data = [1.0, 2.0, 3.0] 2 | y_data = [2.0, 4.0, 6.0] 3 | 4 | w = 1.0 # a random guess: random value 5 | 6 | # our model forward pass 7 | 8 | 9 | def forward(x): 10 | return x * w 11 | 12 | 13 | # Loss function 14 | def loss(x, y): 15 | y_pred = forward(x) 16 | return (y_pred - y) * (y_pred - y) 17 | 18 | 19 | # compute gradient 20 | def gradient(x, y): # d_loss/d_w 21 | return 2 * x * (x * w - y) 22 | 23 | # Before training 24 | print("predict (before training)", 4, forward(4)) 25 | 26 | # Training loop 27 | for epoch in range(10): 28 | for x_val, y_val in zip(x_data, y_data): 29 | grad = gradient(x_val, y_val) 30 | w = w - 0.01 * grad 31 | print("\tgrad: ", x_val, y_val, round(grad, 2)) 32 | l = loss(x_val, y_val) 33 | 34 | print("progress:", epoch, "w=", round(w, 2), "loss=", round(l, 2)) 35 | 36 | # After training 37 | print("predict (after training)", "4 hours", forward(4)) 38 | -------------------------------------------------------------------------------- /tutorials/01-basics/Linear_Model/main.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import matplotlib.pyplot as plt 3 | 4 | x_data = [1.0, 2.0, 3.0] 5 | y_data = [2.0, 4.0, 6.0] 6 | 7 | 8 | # our model for the forward pass 9 | def forward(x): 10 | return x * w 11 | 12 | 13 | # Loss function 14 | def loss(x, y): 15 | y_pred = forward(x) 16 | return (y_pred - 
y) * (y_pred - y) 17 | 18 | 19 | w_list = [] 20 | mse_list = [] 21 | 22 | for w in np.arange(0.0, 4.1, 0.1): 23 | print("w=", w) 24 | l_sum = 0 25 | for x_val, y_val in zip(x_data, y_data): 26 | y_pred_val = forward(x_val) 27 | l = loss(x_val, y_val) 28 | l_sum += l 29 | print("\t", x_val, y_val, y_pred_val, l) 30 | print("MSE=", l_sum / 3) 31 | w_list.append(w) 32 | mse_list.append(l_sum / 3) 33 | 34 | plt.plot(w_list, mse_list) 35 | plt.ylabel('Loss') 36 | plt.xlabel('w') 37 | plt.show() 38 | -------------------------------------------------------------------------------- /tutorials/01-basics/Linear_regression/lr.py: -------------------------------------------------------------------------------- 1 | 2 | import torch 3 | from torch.autograd import Variable 4 | 5 | x_data = Variable(torch.Tensor([[1.0], [2.0], [3.0]])) 6 | y_data = Variable(torch.Tensor([[2.0], [4.0], [6.0]])) 7 | 8 | 9 | class Model(torch.nn.Module): 10 | 11 | def __init__(self): 12 | """ 13 | In the constructor we instantiate a single nn.Linear module 14 | """ 15 | super(Model, self).__init__() 16 | self.linear = torch.nn.Linear(1, 1) # One in and one out 17 | 18 | def forward(self, x): 19 | """ 20 | In the forward function we accept a Variable of input data and we must return 21 | a Variable of output data. We can use Modules defined in the constructor as 22 | well as arbitrary operators on Variables. 23 | """ 24 | y_pred = self.linear(x) 25 | return y_pred 26 | 27 | # our model 28 | model = Model() 29 | 30 | 31 | # Construct our loss function and an Optimizer. The call to model.parameters() 32 | # in the SGD constructor will contain the learnable parameters of the 33 | # nn.Linear module which is a member of the model. 
34 | criterion = torch.nn.MSELoss(size_average=False) 35 | optimizer = torch.optim.SGD(model.parameters(), lr=0.01) 36 | 37 | # Training loop 38 | for epoch in range(500): 39 | # Forward pass: Compute predicted y by passing x to the model 40 | y_pred = model(x_data) 41 | 42 | # Compute and print loss 43 | loss = criterion(y_pred, y_data) 44 | print(epoch, loss.data[0]) 45 | 46 | # Zero gradients, perform a backward pass, and update the weights. 47 | optimizer.zero_grad() 48 | loss.backward() 49 | optimizer.step() 50 | 51 | 52 | # After training 53 | hour_var = Variable(torch.Tensor([[4.0]])) 54 | y_pred = model(hour_var) 55 | print("predict (after training)", 4, model(hour_var).data[0][0]) 56 | -------------------------------------------------------------------------------- /tutorials/01-basics/Linear_regression/main.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import numpy as np 4 | import matplotlib.pyplot as plt 5 | from torch.autograd import Variable 6 | 7 | 8 | # Hyper Parameters 9 | input_size = 1 10 | output_size = 1 11 | num_epochs = 60 12 | learning_rate = 0.001 13 | 14 | # Toy Dataset 15 | x_train = np.array([[3.3], [4.4], [5.5], [6.71], [6.93], [4.168], 16 | [9.779], [6.182], [7.59], [2.167], [7.042], 17 | [10.791], [5.313], [7.997], [3.1]], dtype=np.float32) 18 | 19 | y_train = np.array([[1.7], [2.76], [2.09], [3.19], [1.694], [1.573], 20 | [3.366], [2.596], [2.53], [1.221], [2.827], 21 | [3.465], [1.65], [2.904], [1.3]], dtype=np.float32) 22 | 23 | # Linear Regression Model 24 | class LinearRegression(nn.Module): 25 | def __init__(self, input_size, output_size): 26 | super(LinearRegression, self).__init__() 27 | self.linear = nn.Linear(input_size, output_size) 28 | 29 | def forward(self, x): 30 | out = self.linear(x) 31 | return out 32 | 33 | model = LinearRegression(input_size, output_size) 34 | 35 | # Loss and Optimizer 36 | criterion = nn.MSELoss() 37 | optimizer 
= torch.optim.SGD(model.parameters(), lr=learning_rate) 38 | 39 | # Train the Model 40 | for epoch in range(num_epochs): 41 | # Convert numpy array to torch Variable 42 | inputs = Variable(torch.from_numpy(x_train)) 43 | targets = Variable(torch.from_numpy(y_train)) 44 | 45 | # Forward + Backward + Optimize 46 | optimizer.zero_grad() 47 | outputs = model(inputs) 48 | loss = criterion(outputs, targets) 49 | loss.backward() 50 | optimizer.step() 51 | 52 | if (epoch+1) % 5 == 0: 53 | print ('Epoch [%d/%d], Loss: %.4f' 54 | %(epoch+1, num_epochs, loss.data[0])) 55 | 56 | # Plot the graph 57 | predicted = model(Variable(torch.from_numpy(x_train))).data.numpy() 58 | plt.plot(x_train, y_train, 'ro', label='Original data') 59 | plt.plot(x_train, predicted, label='Fitted line') 60 | plt.legend() 61 | plt.show() 62 | 63 | # Save the Model 64 | torch.save(model.state_dict(), 'model.pkl') -------------------------------------------------------------------------------- /tutorials/01-basics/Logistic_regression/main.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torchvision.datasets as dsets 4 | import torchvision.transforms as transforms 5 | from torch.autograd import Variable 6 | 7 | 8 | # Hyper Parameters 9 | input_size = 784 10 | num_classes = 10 11 | num_epochs = 5 12 | batch_size = 100 13 | learning_rate = 0.001 14 | 15 | # MNIST Dataset (Images and Labels) 16 | train_dataset = dsets.MNIST(root='./data', 17 | train=True, 18 | transform=transforms.ToTensor(), 19 | download=True) 20 | 21 | test_dataset = dsets.MNIST(root='./data', 22 | train=False, 23 | transform=transforms.ToTensor()) 24 | 25 | # Dataset Loader (Input Pipeline) 26 | train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 27 | batch_size=batch_size, 28 | shuffle=True) 29 | 30 | test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 31 | batch_size=batch_size, 32 | shuffle=False) 33 | 34 | # Model 35 | 
class LogisticRegression(nn.Module): 36 | def __init__(self, input_size, num_classes): 37 | super(LogisticRegression, self).__init__() 38 | self.linear = nn.Linear(input_size, num_classes) 39 | 40 | def forward(self, x): 41 | out = self.linear(x) 42 | return out 43 | 44 | model = LogisticRegression(input_size, num_classes) 45 | 46 | # Loss and Optimizer 47 | # Softmax is internally computed. 48 | # Set parameters to be updated. 49 | criterion = nn.CrossEntropyLoss() 50 | optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate) 51 | 52 | # Training the Model 53 | for epoch in range(num_epochs): 54 | for i, (images, labels) in enumerate(train_loader): 55 | images = Variable(images.view(-1, 28*28)) 56 | labels = Variable(labels) 57 | 58 | # Forward + Backward + Optimize 59 | optimizer.zero_grad() 60 | outputs = model(images) 61 | loss = criterion(outputs, labels) 62 | loss.backward() 63 | optimizer.step() 64 | 65 | if (i+1) % 100 == 0: 66 | print ('Epoch: [%d/%d], Step: [%d/%d], Loss: %.4f' 67 | % (epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.data[0])) 68 | 69 | # Test the Model 70 | correct = 0 71 | total = 0 72 | for images, labels in test_loader: 73 | images = Variable(images.view(-1, 28*28)) 74 | outputs = model(images) 75 | _, predicted = torch.max(outputs.data, 1) 76 | total += labels.size(0) 77 | correct += (predicted == labels).sum() 78 | 79 | print('Accuracy of the model on the 10000 test images: %d %%' % (100 * correct / total)) 80 | 81 | # Save the Model 82 | torch.save(model.state_dict(), 'model.pkl') -------------------------------------------------------------------------------- /tutorials/01-basics/Softmax_Classifier/main.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import torch.optim as optim 5 | from torchvision import datasets, transforms 6 | from torch.autograd import Variable 7 | 8 | 9 | # Cross 
entropy example 10 | import numpy as np 11 | # One hot 12 | # 0: 1 0 0 13 | # 1: 0 1 0 14 | # 2: 0 0 1 15 | Y = np.array([1, 0, 0]) 16 | 17 | Y_pred1 = np.array([0.7, 0.2, 0.1]) 18 | Y_pred2 = np.array([0.1, 0.3, 0.6]) 19 | print("loss1 = ", np.sum(-Y * np.log(Y_pred1))) 20 | print("loss2 = ", np.sum(-Y * np.log(Y_pred2))) 21 | 22 | # Softmax + CrossEntropy (logSoftmax + NLLLoss) 23 | loss = nn.CrossEntropyLoss() 24 | 25 | # target is of size nBatch 26 | # each element in target has to have 0 <= value < nClasses (0-2) 27 | # Target is a class index, not one-hot 28 | Y = Variable(torch.LongTensor([0]), requires_grad=False) 29 | 30 | # input is of size nBatch x nClasses = 1 x 3 31 | # Y_pred are logits (not softmax) 32 | Y_pred1 = Variable(torch.Tensor([[2.0, 1.0, 0.1]])) 33 | Y_pred2 = Variable(torch.Tensor([[0.5, 2.0, 0.3]])) 34 | 35 | l1 = loss(Y_pred1, Y) 36 | l2 = loss(Y_pred2, Y) 37 | 38 | print("PyTorch Loss1 = ", l1.data, "\nPyTorch Loss2=", l2.data) 39 | 40 | print("Y_pred1=", torch.max(Y_pred1.data, 1)[1]) 41 | print("Y_pred2=", torch.max(Y_pred2.data, 1)[1]) 42 | 43 | # target is of size nBatch 44 | # each element in target has to have 0 <= value < nClasses (0-2) 45 | # Target is a class index, not one-hot 46 | Y = Variable(torch.LongTensor([2, 0, 1]), requires_grad=False) 47 | 48 | # input is of size nBatch x nClasses = 3 x 3 49 | # Y_pred are logits (not softmax) 50 | Y_pred1 = Variable(torch.Tensor([[0.1, 0.2, 0.9], 51 | [1.1, 0.1, 0.2], 52 | [0.2, 2.1, 0.1]])) 53 | 54 | 55 | Y_pred2 = Variable(torch.Tensor([[0.8, 0.2, 0.3], 56 | [0.2, 0.3, 0.5], 57 | [0.2, 0.2, 0.5]])) 58 | 59 | l1 = loss(Y_pred1, Y) 60 | l2 = loss(Y_pred2, Y) 61 | 62 | print("Batch Loss1 = ", l1.data, "\nBatch Loss2=", l2.data) 63 | -------------------------------------------------------------------------------- /tutorials/01-basics/Softmax_Classifier/main_mnist.py: -------------------------------------------------------------------------------- 1 | # 
https://github.com/pytorch/examples/blob/master/mnist/main.py 2 | from __future__ import print_function 3 | import torch 4 | import torch.nn as nn 5 | import torch.nn.functional as F 6 | import torch.optim as optim 7 | from torchvision import datasets, transforms 8 | from torch.autograd import Variable 9 | 10 | # Training settings 11 | batch_size = 64 12 | 13 | # MNIST Dataset 14 | train_dataset = datasets.MNIST(root='./mnist_data/', 15 | train=True, 16 | transform=transforms.ToTensor(), 17 | download=True) 18 | 19 | test_dataset = datasets.MNIST(root='./mnist_data/', 20 | train=False, 21 | transform=transforms.ToTensor()) 22 | 23 | # Data Loader (Input Pipeline) 24 | train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 25 | batch_size=batch_size, 26 | shuffle=True) 27 | 28 | test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 29 | batch_size=batch_size, 30 | shuffle=False) 31 | 32 | 33 | class Net(nn.Module): 34 | 35 | def __init__(self): 36 | super(Net, self).__init__() 37 | self.l1 = nn.Linear(784, 520) 38 | self.l2 = nn.Linear(520, 320) 39 | self.l3 = nn.Linear(320, 240) 40 | self.l4 = nn.Linear(240, 120) 41 | self.l5 = nn.Linear(120, 10) 42 | 43 | def forward(self, x): 44 | x = x.view(-1, 784) # Flatten the data (n, 1, 28, 28)-> (n, 784) 45 | x = F.relu(self.l1(x)) 46 | x = F.relu(self.l2(x)) 47 | x = F.relu(self.l3(x)) 48 | x = F.relu(self.l4(x)) 49 | return self.l5(x) 50 | 51 | 52 | model = Net() 53 | 54 | criterion = nn.CrossEntropyLoss() 55 | optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.5) 56 | 57 | 58 | def train(epoch): 59 | model.train() 60 | for batch_idx, (data, target) in enumerate(train_loader): 61 | data, target = Variable(data), Variable(target) 62 | optimizer.zero_grad() 63 | output = model(data) 64 | loss = criterion(output, target) 65 | loss.backward() 66 | optimizer.step() 67 | if batch_idx % 10 == 0: 68 | print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format( 69 | epoch, batch_idx * 
len(data), len(train_loader.dataset), 70 | 100. * batch_idx / len(train_loader), loss.data[0])) 71 | 72 | 73 | def test(): 74 | model.eval() 75 | test_loss = 0 76 | correct = 0 77 | for data, target in test_loader: 78 | data, target = Variable(data, volatile=True), Variable(target) 79 | output = model(data) 80 | # sum up batch loss 81 | test_loss += criterion(output, target).data[0] 82 | # get the index of the max 83 | pred = output.data.max(1, keepdim=True)[1] 84 | correct += pred.eq(target.data.view_as(pred)).cpu().sum() 85 | 86 | test_loss /= len(test_loader.dataset) 87 | print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format( 88 | test_loss, correct, len(test_loader.dataset), 89 | 100. * correct / len(test_loader.dataset))) 90 | 91 | 92 | for epoch in range(1, 10): 93 | train(epoch) 94 | test() 95 | -------------------------------------------------------------------------------- /tutorials/01-basics/Wide_Deep/main.py: -------------------------------------------------------------------------------- 1 | 2 | import torch 3 | from torch.autograd import Variable 4 | import numpy as np 5 | 6 | xy = np.loadtxt('./data/diabetes.csv.gz', delimiter=',', dtype=np.float32) 7 | x_data = Variable(torch.from_numpy(xy[:, 0:-1])) 8 | y_data = Variable(torch.from_numpy(xy[:, [-1]])) 9 | 10 | print(x_data.data.shape) 11 | print(y_data.data.shape) 12 | 13 | 14 | class Model(torch.nn.Module): 15 | 16 | def __init__(self): 17 | """ 18 | In the constructor we instantiate three nn.Linear modules 19 | """ 20 | super(Model, self).__init__() 21 | self.l1 = torch.nn.Linear(8, 6) 22 | self.l2 = torch.nn.Linear(6, 4) 23 | self.l3 = torch.nn.Linear(4, 1) 24 | 25 | self.sigmoid = torch.nn.Sigmoid() 26 | 27 | def forward(self, x): 28 | """ 29 | In the forward function we accept a Variable of input data and we must return 30 | a Variable of output data. We can use Modules defined in the constructor as 31 | well as arbitrary operators on Variables. 
32 | """ 33 | out1 = self.sigmoid(self.l1(x)) 34 | out2 = self.sigmoid(self.l2(out1)) 35 | y_pred = self.sigmoid(self.l3(out2)) 36 | return y_pred 37 | 38 | # our model 39 | model = Model() 40 | 41 | 42 | # Construct our loss function and an Optimizer. The call to model.parameters() 43 | # in the SGD constructor returns the learnable parameters of the three 44 | # nn.Linear modules which are members of the model. 45 | criterion = torch.nn.BCELoss(size_average=True) 46 | optimizer = torch.optim.SGD(model.parameters(), lr=0.1) 47 | 48 | # Training loop 49 | for epoch in range(100): 50 | # Forward pass: Compute predicted y by passing x to the model 51 | y_pred = model(x_data) 52 | 53 | # Compute and print loss 54 | loss = criterion(y_pred, y_data) 55 | print(epoch, loss.data[0]) 56 | 57 | # Zero gradients, perform a backward pass, and update the weights. 58 | optimizer.zero_grad() 59 | loss.backward() 60 | optimizer.step() 61 | -------------------------------------------------------------------------------- /tutorials/01-basics/pytorch_basics/main.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torchvision 3 | import torch.nn as nn 4 | import numpy as np 5 | import torch.utils.data as data 6 | import torchvision.transforms as transforms 7 | import torchvision.datasets as dsets 8 | from torch.autograd import Variable 9 | 10 | 11 | #========================== Table of Contents ==========================# 12 | # 1. Basic autograd example 1 (Line 21 to 36) 13 | # 2. Basic autograd example 2 (Line 39 to 77) 14 | # 3. Loading data from numpy (Line 80 to 83) 15 | # 4. Implementing the input pipeline (Line 86 to 113) 16 | # 5. Input pipeline for custom dataset (Line 116 to 138) 17 | # 6. Using pretrained model (Line 141 to 155) 18 | # 7. Save and load model (Line 158 to 165) 19 | 20 | 21 | #======================= Basic autograd example 1 =======================# 22 | # Create tensors. 
23 | x = Variable(torch.Tensor([1]), requires_grad=True) 24 | w = Variable(torch.Tensor([2]), requires_grad=True) 25 | b = Variable(torch.Tensor([3]), requires_grad=True) 26 | 27 | # Build a computational graph. 28 | y = w * x + b # y = 2 * x + 3 29 | 30 | # Compute gradients. 31 | y.backward() 32 | 33 | # Print out the gradients. 34 | print(x.grad) # x.grad = 2 35 | print(w.grad) # w.grad = 1 36 | print(b.grad) # b.grad = 1 37 | 38 | 39 | #======================== Basic autograd example 2 =======================# 40 | # Create tensors. 41 | x = Variable(torch.randn(5, 3)) 42 | y = Variable(torch.randn(5, 2)) 43 | 44 | # Build a linear layer. 45 | linear = nn.Linear(3, 2) 46 | print ('w: ', linear.weight) 47 | print ('b: ', linear.bias) 48 | 49 | # Build Loss and Optimizer. 50 | criterion = nn.MSELoss() 51 | optimizer = torch.optim.SGD(linear.parameters(), lr=0.01) 52 | 53 | # Forward propagation. 54 | pred = linear(x) 55 | 56 | # Compute loss. 57 | loss = criterion(pred, y) 58 | print('loss: ', loss.data[0]) 59 | 60 | # Backpropagation. 61 | loss.backward() 62 | 63 | # Print out the gradients. 64 | print ('dL/dw: ', linear.weight.grad) 65 | print ('dL/db: ', linear.bias.grad) 66 | 67 | # 1-step Optimization (gradient descent). 68 | optimizer.step() 69 | 70 | # You can also do optimization at the low level as shown below. 71 | # linear.weight.data.sub_(0.01 * linear.weight.grad.data) 72 | # linear.bias.data.sub_(0.01 * linear.bias.grad.data) 73 | 74 | # Print out the loss after optimization. 
75 | pred = linear(x) 76 | loss = criterion(pred, y) 77 | print('loss after 1 step optimization: ', loss.data[0]) 78 | 79 | 80 | #======================== Loading data from numpy ========================# 81 | a = np.array([[1,2], [3,4]]) 82 | b = torch.from_numpy(a) # convert numpy array to torch tensor 83 | c = b.numpy() # convert torch tensor to numpy array 84 | 85 | 86 | #===================== Implementing the input pipeline =====================# 87 | # Download and construct dataset. 88 | train_dataset = dsets.CIFAR10(root='../data/', 89 | train=True, 90 | transform=transforms.ToTensor(), 91 | download=True) 92 | 93 | # Select one data pair (read data from disk). 94 | image, label = train_dataset[0] 95 | print (image.size()) 96 | print (label) 97 | 98 | # Data Loader (this provides queue and thread in a very simple way). 99 | train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 100 | batch_size=100, 101 | shuffle=True, 102 | num_workers=2) 103 | 104 | # When iteration starts, queue and thread start to load dataset from files. 105 | data_iter = iter(train_loader) 106 | 107 | # Mini-batch images and labels. 108 | images, labels = next(data_iter) 109 | 110 | # Actual usage of data loader is as below. 111 | for images, labels in train_loader: 112 | # Your training code will be written here 113 | pass 114 | 115 | 116 | #===================== Input pipeline for custom dataset =====================# 117 | # You should build custom dataset as below. 118 | class CustomDataset(data.Dataset): 119 | def __init__(self): 120 | # TODO 121 | # 1. Initialize file path or list of file names. 122 | pass 123 | def __getitem__(self, index): 124 | # TODO 125 | # 1. Read one data sample from file (e.g. using numpy.fromfile, PIL.Image.open). 126 | # 2. Preprocess the data (e.g. torchvision.transforms). 127 | # 3. Return a data pair (e.g. image and label). 128 | pass 129 | def __len__(self): 130 | # You should change 0 to the total size of your dataset. 
131 | return 0 132 | 133 | # Then, you can just use prebuilt torch's data loader. 134 | custom_dataset = CustomDataset() 135 | train_loader = torch.utils.data.DataLoader(dataset=custom_dataset, 136 | batch_size=100, 137 | shuffle=True, 138 | num_workers=2) 139 | 140 | 141 | #========================== Using pretrained model ==========================# 142 | # Download and load pretrained resnet. 143 | resnet = torchvision.models.resnet18(pretrained=True) 144 | 145 | # If you want to finetune only top layer of the model. 146 | for param in resnet.parameters(): 147 | param.requires_grad = False 148 | 149 | # Replace top layer for finetuning. 150 | resnet.fc = nn.Linear(resnet.fc.in_features, 100) # 100 is for example. 151 | 152 | # For test. 153 | images = Variable(torch.randn(10, 3, 256, 256)) 154 | outputs = resnet(images) 155 | print (outputs.size()) # (10, 100) 156 | 157 | 158 | #============================ Save and load the model ============================# 159 | # Save and load the entire model. 160 | torch.save(resnet, 'model.pkl') 161 | model = torch.load('model.pkl') 162 | 163 | # Save and load only the model parameters(recommended). 
164 | torch.save(resnet.state_dict(), 'params.pkl') 165 | resnet.load_state_dict(torch.load('params.pkl')) -------------------------------------------------------------------------------- /tutorials/02-intermediate/bidirectional_recurrent_neural_network/main-gpu.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torchvision.datasets as dsets 4 | import torchvision.transforms as transforms 5 | from torch.autograd import Variable 6 | 7 | 8 | # Hyper Parameters 9 | sequence_length = 28 10 | input_size = 28 11 | hidden_size = 128 12 | num_layers = 2 13 | num_classes = 10 14 | batch_size = 100 15 | num_epochs = 2 16 | learning_rate = 0.003 17 | 18 | # MNIST Dataset 19 | train_dataset = dsets.MNIST(root='./data/', 20 | train=True, 21 | transform=transforms.ToTensor(), 22 | download=True) 23 | 24 | test_dataset = dsets.MNIST(root='./data/', 25 | train=False, 26 | transform=transforms.ToTensor()) 27 | 28 | # Data Loader (Input Pipeline) 29 | train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 30 | batch_size=batch_size, 31 | shuffle=True) 32 | 33 | test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 34 | batch_size=batch_size, 35 | shuffle=False) 36 | 37 | # BiRNN Model (Many-to-One) 38 | class BiRNN(nn.Module): 39 | def __init__(self, input_size, hidden_size, num_layers, num_classes): 40 | super(BiRNN, self).__init__() 41 | self.hidden_size = hidden_size 42 | self.num_layers = num_layers 43 | self.lstm = nn.LSTM(input_size, hidden_size, num_layers, 44 | batch_first=True, bidirectional=True) 45 | self.fc = nn.Linear(hidden_size*2, num_classes) # 2 for bidirection 46 | 47 | def forward(self, x): 48 | # Set initial states 49 | h0 = Variable(torch.zeros(self.num_layers*2, x.size(0), self.hidden_size)).cuda() # 2 for bidirection 50 | c0 = Variable(torch.zeros(self.num_layers*2, x.size(0), self.hidden_size)).cuda() 51 | 52 | # Forward propagate RNN 53 | out, _ = 
self.lstm(x, (h0, c0)) 54 | 55 | # Decode hidden state of last time step 56 | out = self.fc(out[:, -1, :]) 57 | return out 58 | 59 | rnn = BiRNN(input_size, hidden_size, num_layers, num_classes) 60 | rnn.cuda() 61 | 62 | # Loss and Optimizer 63 | criterion = nn.CrossEntropyLoss() 64 | optimizer = torch.optim.Adam(rnn.parameters(), lr=learning_rate) 65 | 66 | # Train the Model 67 | for epoch in range(num_epochs): 68 | for i, (images, labels) in enumerate(train_loader): 69 | images = Variable(images.view(-1, sequence_length, input_size)).cuda() 70 | labels = Variable(labels).cuda() 71 | 72 | # Forward + Backward + Optimize 73 | optimizer.zero_grad() 74 | outputs = rnn(images) 75 | loss = criterion(outputs, labels) 76 | loss.backward() 77 | optimizer.step() 78 | 79 | if (i+1) % 100 == 0: 80 | print ('Epoch [%d/%d], Step [%d/%d], Loss: %.4f' 81 | %(epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.data[0])) 82 | 83 | # Test the Model 84 | correct = 0 85 | total = 0 86 | for images, labels in test_loader: 87 | images = Variable(images.view(-1, sequence_length, input_size)).cuda() 88 | outputs = rnn(images) 89 | _, predicted = torch.max(outputs.data, 1) 90 | total += labels.size(0) 91 | correct += (predicted.cpu() == labels).sum() 92 | 93 | print('Test Accuracy of the model on the 10000 test images: %d %%' % (100 * correct / total)) 94 | 95 | # Save the Model 96 | torch.save(rnn.state_dict(), 'rnn.pkl') -------------------------------------------------------------------------------- /tutorials/02-intermediate/bidirectional_recurrent_neural_network/main.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torchvision.datasets as dsets 4 | import torchvision.transforms as transforms 5 | from torch.autograd import Variable 6 | 7 | 8 | # Hyper Parameters 9 | sequence_length = 28 10 | input_size = 28 11 | hidden_size = 128 12 | num_layers = 2 13 | num_classes = 10 14 | 
batch_size = 100 15 | num_epochs = 2 16 | learning_rate = 0.003 17 | 18 | # MNIST Dataset 19 | train_dataset = dsets.MNIST(root='./data/', 20 | train=True, 21 | transform=transforms.ToTensor(), 22 | download=True) 23 | 24 | test_dataset = dsets.MNIST(root='./data/', 25 | train=False, 26 | transform=transforms.ToTensor()) 27 | 28 | # Data Loader (Input Pipeline) 29 | train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 30 | batch_size=batch_size, 31 | shuffle=True) 32 | 33 | test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 34 | batch_size=batch_size, 35 | shuffle=False) 36 | 37 | # BiRNN Model (Many-to-One) 38 | class BiRNN(nn.Module): 39 | def __init__(self, input_size, hidden_size, num_layers, num_classes): 40 | super(BiRNN, self).__init__() 41 | self.hidden_size = hidden_size 42 | self.num_layers = num_layers 43 | self.lstm = nn.LSTM(input_size, hidden_size, num_layers, 44 | batch_first=True, bidirectional=True) 45 | self.fc = nn.Linear(hidden_size*2, num_classes) # 2 for bidirection 46 | 47 | def forward(self, x): 48 | # Set initial states 49 | h0 = Variable(torch.zeros(self.num_layers*2, x.size(0), self.hidden_size)) # 2 for bidirection 50 | c0 = Variable(torch.zeros(self.num_layers*2, x.size(0), self.hidden_size)) 51 | 52 | # Forward propagate RNN 53 | out, _ = self.lstm(x, (h0, c0)) 54 | 55 | # Decode hidden state of last time step 56 | out = self.fc(out[:, -1, :]) 57 | return out 58 | 59 | rnn = BiRNN(input_size, hidden_size, num_layers, num_classes) 60 | 61 | 62 | # Loss and Optimizer 63 | criterion = nn.CrossEntropyLoss() 64 | optimizer = torch.optim.Adam(rnn.parameters(), lr=learning_rate) 65 | 66 | # Train the Model 67 | for epoch in range(num_epochs): 68 | for i, (images, labels) in enumerate(train_loader): 69 | images = Variable(images.view(-1, sequence_length, input_size)) 70 | labels = Variable(labels) 71 | 72 | # Forward + Backward + Optimize 73 | optimizer.zero_grad() 74 | outputs = rnn(images) 75 | loss = 
criterion(outputs, labels) 76 | loss.backward() 77 | optimizer.step() 78 | 79 | if (i+1) % 100 == 0: 80 | print ('Epoch [%d/%d], Step [%d/%d], Loss: %.4f' 81 | %(epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.data[0])) 82 | 83 | # Test the Model 84 | correct = 0 85 | total = 0 86 | for images, labels in test_loader: 87 | images = Variable(images.view(-1, sequence_length, input_size)) 88 | outputs = rnn(images) 89 | _, predicted = torch.max(outputs.data, 1) 90 | total += labels.size(0) 91 | correct += (predicted == labels).sum() 92 | 93 | print('Test Accuracy of the model on the 10000 test images: %d %%' % (100 * correct / total)) 94 | 95 | # Save the Model 96 | torch.save(rnn.state_dict(), 'rnn.pkl') -------------------------------------------------------------------------------- /tutorials/02-intermediate/convolutional_neural_network/main-gpu.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torchvision.datasets as dsets 4 | import torchvision.transforms as transforms 5 | from torch.autograd import Variable 6 | 7 | 8 | # Hyper Parameters 9 | num_epochs = 5 10 | batch_size = 100 11 | learning_rate = 0.001 12 | 13 | # MNIST Dataset 14 | train_dataset = dsets.MNIST(root='./data/', 15 | train=True, 16 | transform=transforms.ToTensor(), 17 | download=True) 18 | 19 | test_dataset = dsets.MNIST(root='./data/', 20 | train=False, 21 | transform=transforms.ToTensor()) 22 | 23 | # Data Loader (Input Pipeline) 24 | train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 25 | batch_size=batch_size, 26 | shuffle=True) 27 | 28 | test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 29 | batch_size=batch_size, 30 | shuffle=False) 31 | 32 | # CNN Model (2 conv layer) 33 | class CNN(nn.Module): 34 | def __init__(self): 35 | super(CNN, self).__init__() 36 | self.layer1 = nn.Sequential( 37 | nn.Conv2d(1, 16, kernel_size=5, padding=2), 38 | nn.BatchNorm2d(16), 
39 | nn.ReLU(), 40 | nn.MaxPool2d(2)) 41 | self.layer2 = nn.Sequential( 42 | nn.Conv2d(16, 32, kernel_size=5, padding=2), 43 | nn.BatchNorm2d(32), 44 | nn.ReLU(), 45 | nn.MaxPool2d(2)) 46 | self.fc = nn.Linear(7*7*32, 10) 47 | 48 | def forward(self, x): 49 | out = self.layer1(x) 50 | out = self.layer2(out) 51 | out = out.view(out.size(0), -1) 52 | out = self.fc(out) 53 | return out 54 | 55 | cnn = CNN() 56 | cnn.cuda() 57 | 58 | # Loss and Optimizer 59 | criterion = nn.CrossEntropyLoss() 60 | optimizer = torch.optim.Adam(cnn.parameters(), lr=learning_rate) 61 | 62 | # Train the Model 63 | for epoch in range(num_epochs): 64 | for i, (images, labels) in enumerate(train_loader): 65 | images = Variable(images).cuda() 66 | labels = Variable(labels).cuda() 67 | 68 | # Forward + Backward + Optimize 69 | optimizer.zero_grad() 70 | outputs = cnn(images) 71 | loss = criterion(outputs, labels) 72 | loss.backward() 73 | optimizer.step() 74 | 75 | if (i+1) % 100 == 0: 76 | print ('Epoch [%d/%d], Iter [%d/%d] Loss: %.4f' 77 | %(epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.data[0])) 78 | 79 | # Test the Model 80 | cnn.eval() # Change model to 'eval' mode (BN uses moving mean/var). 
81 | correct = 0 82 | total = 0 83 | for images, labels in test_loader: 84 | images = Variable(images).cuda() 85 | outputs = cnn(images) 86 | _, predicted = torch.max(outputs.data, 1) 87 | total += labels.size(0) 88 | correct += (predicted.cpu() == labels).sum() 89 | 90 | print('Test Accuracy of the model on the 10000 test images: %d %%' % (100 * correct / total)) 91 | 92 | # Save the Trained Model 93 | torch.save(cnn.state_dict(), 'cnn.pkl') -------------------------------------------------------------------------------- /tutorials/02-intermediate/convolutional_neural_network/main.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torchvision.datasets as dsets 4 | import torchvision.transforms as transforms 5 | from torch.autograd import Variable 6 | 7 | 8 | # Hyper Parameters 9 | num_epochs = 5 10 | batch_size = 100 11 | learning_rate = 0.001 12 | 13 | # MNIST Dataset 14 | train_dataset = dsets.MNIST(root='./data/', 15 | train=True, 16 | transform=transforms.ToTensor(), 17 | download=True) 18 | 19 | test_dataset = dsets.MNIST(root='./data/', 20 | train=False, 21 | transform=transforms.ToTensor()) 22 | 23 | # Data Loader (Input Pipeline) 24 | train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 25 | batch_size=batch_size, 26 | shuffle=True) 27 | 28 | test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 29 | batch_size=batch_size, 30 | shuffle=False) 31 | 32 | # CNN Model (2 conv layer) 33 | class CNN(nn.Module): 34 | def __init__(self): 35 | super(CNN, self).__init__() 36 | self.layer1 = nn.Sequential( 37 | nn.Conv2d(1, 16, kernel_size=5, padding=2), 38 | nn.BatchNorm2d(16), 39 | nn.ReLU(), 40 | nn.MaxPool2d(2)) 41 | self.layer2 = nn.Sequential( 42 | nn.Conv2d(16, 32, kernel_size=5, padding=2), 43 | nn.BatchNorm2d(32), 44 | nn.ReLU(), 45 | nn.MaxPool2d(2)) 46 | self.fc = nn.Linear(7*7*32, 10) 47 | 48 | def forward(self, x): 49 | out = self.layer1(x) 50 | 
out = self.layer2(out) 51 | out = out.view(out.size(0), -1) 52 | out = self.fc(out) 53 | return out 54 | 55 | cnn = CNN() 56 | 57 | 58 | # Loss and Optimizer 59 | criterion = nn.CrossEntropyLoss() 60 | optimizer = torch.optim.Adam(cnn.parameters(), lr=learning_rate) 61 | 62 | # Train the Model 63 | for epoch in range(num_epochs): 64 | for i, (images, labels) in enumerate(train_loader): 65 | images = Variable(images) 66 | labels = Variable(labels) 67 | 68 | # Forward + Backward + Optimize 69 | optimizer.zero_grad() 70 | outputs = cnn(images) 71 | loss = criterion(outputs, labels) 72 | loss.backward() 73 | optimizer.step() 74 | 75 | if (i+1) % 100 == 0: 76 | print ('Epoch [%d/%d], Iter [%d/%d] Loss: %.4f' 77 | %(epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.data[0])) 78 | 79 | # Test the Model 80 | cnn.eval() # Change model to 'eval' mode (BN uses moving mean/var). 81 | correct = 0 82 | total = 0 83 | for images, labels in test_loader: 84 | images = Variable(images) 85 | outputs = cnn(images) 86 | _, predicted = torch.max(outputs.data, 1) 87 | total += labels.size(0) 88 | correct += (predicted == labels).sum() 89 | 90 | print('Test Accuracy of the model on the 10000 test images: %d %%' % (100 * correct / total)) 91 | 92 | # Save the Trained Model 93 | torch.save(cnn.state_dict(), 'cnn.pkl') -------------------------------------------------------------------------------- /tutorials/02-intermediate/deep_residual_network/main-gpu.py: -------------------------------------------------------------------------------- 1 | # Implementation of https://arxiv.org/pdf/1512.03385.pdf/ 2 | # See section 4.2 for model architecture on CIFAR-10. 3 | # Some part of the code was referenced below. 
4 | # https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py 5 | import torch 6 | import torch.nn as nn 7 | import torchvision.datasets as dsets 8 | import torchvision.transforms as transforms 9 | from torch.autograd import Variable 10 | 11 | # Image Preprocessing 12 | transform = transforms.Compose([ 13 | transforms.Scale(40), 14 | transforms.RandomHorizontalFlip(), 15 | transforms.RandomCrop(32), 16 | transforms.ToTensor()]) 17 | 18 | # CIFAR-10 Dataset 19 | train_dataset = dsets.CIFAR10(root='./data/', 20 | train=True, 21 | transform=transform, 22 | download=True) 23 | 24 | test_dataset = dsets.CIFAR10(root='./data/', 25 | train=False, 26 | transform=transforms.ToTensor()) 27 | 28 | # Data Loader (Input Pipeline) 29 | train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 30 | batch_size=100, 31 | shuffle=True) 32 | 33 | test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 34 | batch_size=100, 35 | shuffle=False) 36 | 37 | # 3x3 Convolution 38 | def conv3x3(in_channels, out_channels, stride=1): 39 | return nn.Conv2d(in_channels, out_channels, kernel_size=3, 40 | stride=stride, padding=1, bias=False) 41 | 42 | # Residual Block 43 | class ResidualBlock(nn.Module): 44 | def __init__(self, in_channels, out_channels, stride=1, downsample=None): 45 | super(ResidualBlock, self).__init__() 46 | self.conv1 = conv3x3(in_channels, out_channels, stride) 47 | self.bn1 = nn.BatchNorm2d(out_channels) 48 | self.relu = nn.ReLU(inplace=True) 49 | self.conv2 = conv3x3(out_channels, out_channels) 50 | self.bn2 = nn.BatchNorm2d(out_channels) 51 | self.downsample = downsample 52 | 53 | def forward(self, x): 54 | residual = x 55 | out = self.conv1(x) 56 | out = self.bn1(out) 57 | out = self.relu(out) 58 | out = self.conv2(out) 59 | out = self.bn2(out) 60 | if self.downsample: 61 | residual = self.downsample(x) 62 | out += residual 63 | out = self.relu(out) 64 | return out 65 | 66 | # ResNet Module 67 | class ResNet(nn.Module): 68 | def 
__init__(self, block, layers, num_classes=10): 69 | super(ResNet, self).__init__() 70 | self.in_channels = 16 71 | self.conv = conv3x3(3, 16) 72 | self.bn = nn.BatchNorm2d(16) 73 | self.relu = nn.ReLU(inplace=True) 74 | self.layer1 = self.make_layer(block, 16, layers[0]) 75 | self.layer2 = self.make_layer(block, 32, layers[1], 2) 76 | self.layer3 = self.make_layer(block, 64, layers[2], 2) 77 | self.avg_pool = nn.AvgPool2d(8) 78 | self.fc = nn.Linear(64, num_classes) 79 | 80 | def make_layer(self, block, out_channels, blocks, stride=1): 81 | downsample = None 82 | if (stride != 1) or (self.in_channels != out_channels): 83 | downsample = nn.Sequential( 84 | conv3x3(self.in_channels, out_channels, stride=stride), 85 | nn.BatchNorm2d(out_channels)) 86 | layers = [] 87 | layers.append(block(self.in_channels, out_channels, stride, downsample)) 88 | self.in_channels = out_channels 89 | for i in range(1, blocks): 90 | layers.append(block(out_channels, out_channels)) 91 | return nn.Sequential(*layers) 92 | 93 | def forward(self, x): 94 | out = self.conv(x) 95 | out = self.bn(out) 96 | out = self.relu(out) 97 | out = self.layer1(out) 98 | out = self.layer2(out) 99 | out = self.layer3(out) 100 | out = self.avg_pool(out) 101 | out = out.view(out.size(0), -1) 102 | out = self.fc(out) 103 | return out 104 | 105 | resnet = ResNet(ResidualBlock, [3, 3, 3]) 106 | resnet.cuda() 107 | 108 | # Loss and Optimizer 109 | criterion = nn.CrossEntropyLoss() 110 | lr = 0.001 111 | optimizer = torch.optim.Adam(resnet.parameters(), lr=lr) 112 | 113 | # Training 114 | for epoch in range(80): 115 | for i, (images, labels) in enumerate(train_loader): 116 | images = Variable(images.cuda()) 117 | labels = Variable(labels.cuda()) 118 | 119 | # Forward + Backward + Optimize 120 | optimizer.zero_grad() 121 | outputs = resnet(images) 122 | loss = criterion(outputs, labels) 123 | loss.backward() 124 | optimizer.step() 125 | 126 | if (i+1) % 100 == 0: 127 | print ("Epoch [%d/%d], Iter [%d/%d] Loss: %.4f" 
%(epoch+1, 80, i+1, 500, loss.data[0])) 128 | 129 | # Decaying Learning Rate 130 | if (epoch+1) % 20 == 0: 131 | lr /= 3 132 | optimizer = torch.optim.Adam(resnet.parameters(), lr=lr) 133 | 134 | # Test 135 | correct = 0 136 | total = 0 137 | for images, labels in test_loader: 138 | images = Variable(images.cuda()) 139 | outputs = resnet(images) 140 | _, predicted = torch.max(outputs.data, 1) 141 | total += labels.size(0) 142 | correct += (predicted.cpu() == labels).sum() 143 | 144 | print('Accuracy of the model on the test images: %d %%' % (100 * correct / total)) 145 | 146 | # Save the Model 147 | torch.save(resnet.state_dict(), 'resnet.pkl') -------------------------------------------------------------------------------- /tutorials/02-intermediate/deep_residual_network/main.py: -------------------------------------------------------------------------------- 1 | # Implementation of https://arxiv.org/pdf/1512.03385.pdf. 2 | # See section 4.2 for model architecture on CIFAR-10. 3 | # Some part of the code was referenced below. 
4 | # https://github.com/pytorch/vision/blob/master/torchvision/models/resnet.py 5 | import torch 6 | import torch.nn as nn 7 | import torchvision.datasets as dsets 8 | import torchvision.transforms as transforms 9 | from torch.autograd import Variable 10 | 11 | # Image Preprocessing 12 | transform = transforms.Compose([ 13 | transforms.Scale(40), 14 | transforms.RandomHorizontalFlip(), 15 | transforms.RandomCrop(32), 16 | transforms.ToTensor()]) 17 | 18 | # CIFAR-10 Dataset 19 | train_dataset = dsets.CIFAR10(root='./data/', 20 | train=True, 21 | transform=transform, 22 | download=True) 23 | 24 | test_dataset = dsets.CIFAR10(root='./data/', 25 | train=False, 26 | transform=transforms.ToTensor()) 27 | 28 | # Data Loader (Input Pipeline) 29 | train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 30 | batch_size=100, 31 | shuffle=True) 32 | 33 | test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 34 | batch_size=100, 35 | shuffle=False) 36 | 37 | # 3x3 Convolution 38 | def conv3x3(in_channels, out_channels, stride=1): 39 | return nn.Conv2d(in_channels, out_channels, kernel_size=3, 40 | stride=stride, padding=1, bias=False) 41 | 42 | # Residual Block 43 | class ResidualBlock(nn.Module): 44 | def __init__(self, in_channels, out_channels, stride=1, downsample=None): 45 | super(ResidualBlock, self).__init__() 46 | self.conv1 = conv3x3(in_channels, out_channels, stride) 47 | self.bn1 = nn.BatchNorm2d(out_channels) 48 | self.relu = nn.ReLU(inplace=True) 49 | self.conv2 = conv3x3(out_channels, out_channels) 50 | self.bn2 = nn.BatchNorm2d(out_channels) 51 | self.downsample = downsample 52 | 53 | def forward(self, x): 54 | residual = x 55 | out = self.conv1(x) 56 | out = self.bn1(out) 57 | out = self.relu(out) 58 | out = self.conv2(out) 59 | out = self.bn2(out) 60 | if self.downsample: 61 | residual = self.downsample(x) 62 | out += residual 63 | out = self.relu(out) 64 | return out 65 | 66 | # ResNet Module 67 | class ResNet(nn.Module): 68 | def 
__init__(self, block, layers, num_classes=10): 69 | super(ResNet, self).__init__() 70 | self.in_channels = 16 71 | self.conv = conv3x3(3, 16) 72 | self.bn = nn.BatchNorm2d(16) 73 | self.relu = nn.ReLU(inplace=True) 74 | self.layer1 = self.make_layer(block, 16, layers[0]) 75 | self.layer2 = self.make_layer(block, 32, layers[1], 2) 76 | self.layer3 = self.make_layer(block, 64, layers[2], 2) 77 | self.avg_pool = nn.AvgPool2d(8) 78 | self.fc = nn.Linear(64, num_classes) 79 | 80 | def make_layer(self, block, out_channels, blocks, stride=1): 81 | downsample = None 82 | if (stride != 1) or (self.in_channels != out_channels): 83 | downsample = nn.Sequential( 84 | conv3x3(self.in_channels, out_channels, stride=stride), 85 | nn.BatchNorm2d(out_channels)) 86 | layers = [] 87 | layers.append(block(self.in_channels, out_channels, stride, downsample)) 88 | self.in_channels = out_channels 89 | for i in range(1, blocks): 90 | layers.append(block(out_channels, out_channels)) 91 | return nn.Sequential(*layers) 92 | 93 | def forward(self, x): 94 | out = self.conv(x) 95 | out = self.bn(out) 96 | out = self.relu(out) 97 | out = self.layer1(out) 98 | out = self.layer2(out) 99 | out = self.layer3(out) 100 | out = self.avg_pool(out) 101 | out = out.view(out.size(0), -1) 102 | out = self.fc(out) 103 | return out 104 | 105 | resnet = ResNet(ResidualBlock, [2, 2, 2]) 106 | 107 | 108 | # Loss and Optimizer 109 | criterion = nn.CrossEntropyLoss() 110 | lr = 0.001 111 | optimizer = torch.optim.Adam(resnet.parameters(), lr=lr) 112 | 113 | # Training 114 | for epoch in range(80): 115 | for i, (images, labels) in enumerate(train_loader): 116 | images = Variable(images) 117 | labels = Variable(labels) 118 | 119 | # Forward + Backward + Optimize 120 | optimizer.zero_grad() 121 | outputs = resnet(images) 122 | loss = criterion(outputs, labels) 123 | loss.backward() 124 | optimizer.step() 125 | 126 | if (i+1) % 100 == 0: 127 | print ("Epoch [%d/%d], Iter [%d/%d] Loss: %.4f" %(epoch+1, 80, i+1, 500, 
loss.data[0])) 128 | 129 | # Decaying Learning Rate 130 | if (epoch+1) % 20 == 0: 131 | lr /= 3 132 | optimizer = torch.optim.Adam(resnet.parameters(), lr=lr) 133 | 134 | # Test 135 | correct = 0 136 | total = 0 137 | for images, labels in test_loader: 138 | images = Variable(images) 139 | outputs = resnet(images) 140 | _, predicted = torch.max(outputs.data, 1) 141 | total += labels.size(0) 142 | correct += (predicted == labels).sum() 143 | 144 | print('Accuracy of the model on the test images: %d %%' % (100 * correct / total)) 145 | 146 | # Save the Model 147 | torch.save(resnet.state_dict(), 'resnet.pkl') -------------------------------------------------------------------------------- /tutorials/02-intermediate/generative_adversarial_network/main.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torchvision 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | from torchvision import datasets 6 | from torchvision import transforms 7 | from torchvision.utils import save_image 8 | from torch.autograd import Variable 9 | 10 | 11 | def to_var(x): 12 | if torch.cuda.is_available(): 13 | x = x.cuda() 14 | return Variable(x) 15 | 16 | def denorm(x): 17 | out = (x + 1) / 2 18 | return out.clamp(0, 1) 19 | 20 | # Image processing 21 | transform = transforms.Compose([ 22 | transforms.ToTensor(), 23 | transforms.Normalize(mean=(0.5, 0.5, 0.5), 24 | std=(0.5, 0.5, 0.5))]) 25 | # MNIST dataset 26 | mnist = datasets.MNIST(root='./data/', 27 | train=True, 28 | transform=transform, 29 | download=True) 30 | # Data loader 31 | data_loader = torch.utils.data.DataLoader(dataset=mnist, 32 | batch_size=100, 33 | shuffle=True) 34 | # Discriminator 35 | D = nn.Sequential( 36 | nn.Linear(784, 256), 37 | nn.LeakyReLU(0.2), 38 | nn.Linear(256, 256), 39 | nn.LeakyReLU(0.2), 40 | nn.Linear(256, 1), 41 | nn.Sigmoid()) 42 | 43 | # Generator 44 | G = nn.Sequential( 45 | nn.Linear(64, 256), 46 | nn.LeakyReLU(0.2), 47 | 
nn.Linear(256, 256), 48 | nn.LeakyReLU(0.2), 49 | nn.Linear(256, 784), 50 | nn.Tanh()) 51 | 52 | if torch.cuda.is_available(): 53 | D.cuda() 54 | G.cuda() 55 | 56 | # Binary cross entropy loss and optimizer 57 | criterion = nn.BCELoss() 58 | d_optimizer = torch.optim.Adam(D.parameters(), lr=0.0003) 59 | g_optimizer = torch.optim.Adam(G.parameters(), lr=0.0003) 60 | 61 | # Start training 62 | for epoch in range(200): 63 | for i, (images, _) in enumerate(data_loader): 64 | # Build mini-batch dataset 65 | batch_size = images.size(0) 66 | images = to_var(images.view(batch_size, -1)) 67 | 68 | # Create the labels (shaped like D's output) which are later used as input for the BCE loss 69 | real_labels = to_var(torch.ones(batch_size, 1)) 70 | fake_labels = to_var(torch.zeros(batch_size, 1)) 71 | 72 | #============= Train the discriminator =============# 73 | # Compute BCELoss using real images where BCELoss(x, y): - y * log(D(x)) - (1-y) * log(1 - D(x)) 74 | # Second term of the loss is always zero since real_labels == 1 75 | outputs = D(images) 76 | d_loss_real = criterion(outputs, real_labels) 77 | real_score = outputs 78 | 79 | # Compute BCELoss using fake images 80 | # First term of the loss is always zero since fake_labels == 0 81 | z = to_var(torch.randn(batch_size, 64)) 82 | fake_images = G(z) 83 | outputs = D(fake_images) 84 | d_loss_fake = criterion(outputs, fake_labels) 85 | fake_score = outputs 86 | 87 | # Backprop + Optimize 88 | d_loss = d_loss_real + d_loss_fake 89 | D.zero_grad() 90 | d_loss.backward() 91 | d_optimizer.step() 92 | 93 | #=============== Train the generator ===============# 94 | # Compute loss with fake images 95 | z = to_var(torch.randn(batch_size, 64)) 96 | fake_images = G(z) 97 | outputs = D(fake_images) 98 | 99 | # We train G to maximize log(D(G(z))) instead of minimizing log(1-D(G(z))) 100 | # For the reason, see the last paragraph of section 3. 
https://arxiv.org/pdf/1406.2661.pdf 101 | g_loss = criterion(outputs, real_labels) 102 | 103 | # Backprop + Optimize 104 | D.zero_grad() 105 | G.zero_grad() 106 | g_loss.backward() 107 | g_optimizer.step() 108 | 109 | if (i+1) % 300 == 0: 110 | print('Epoch [%d/%d], Step[%d/%d], d_loss: %.4f, ' 111 | 'g_loss: %.4f, D(x): %.2f, D(G(z)): %.2f' 112 | %(epoch, 200, i+1, 600, d_loss.data[0], g_loss.data[0], 113 | real_score.data.mean(), fake_score.data.mean())) 114 | 115 | # Save real images 116 | if (epoch+1) == 1: 117 | images = images.view(images.size(0), 1, 28, 28) 118 | save_image(denorm(images.data), './data/real_images.png') 119 | 120 | # Save sampled images 121 | fake_images = fake_images.view(fake_images.size(0), 1, 28, 28) 122 | save_image(denorm(fake_images.data), './data/fake_images-%d.png' %(epoch+1)) 123 | 124 | # Save the trained parameters 125 | torch.save(G.state_dict(), './generator.pkl') 126 | torch.save(D.state_dict(), './discriminator.pkl') 127 | -------------------------------------------------------------------------------- /tutorials/02-intermediate/language_model/data_utils.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import os 3 | 4 | class Dictionary(object): 5 | def __init__(self): 6 | self.word2idx = {} 7 | self.idx2word = {} 8 | self.idx = 0 9 | 10 | def add_word(self, word): 11 | if not word in self.word2idx: 12 | self.word2idx[word] = self.idx 13 | self.idx2word[self.idx] = word 14 | self.idx += 1 15 | 16 | def __len__(self): 17 | return len(self.word2idx) 18 | 19 | class Corpus(object): 20 | def __init__(self, path='./data'): 21 | self.dictionary = Dictionary() 22 | self.train = os.path.join(path, 'train.txt') 23 | self.test = os.path.join(path, 'test.txt') 24 | 25 | def get_data(self, path, batch_size=20): 26 | # Add words to the dictionary 27 | with open(path, 'r') as f: 28 | tokens = 0 29 | for line in f: 30 | words = line.split() + [''] 31 | tokens += len(words) 32 | for 
word in words: 33 | self.dictionary.add_word(word) 34 | 35 | # Tokenize the file content 36 | ids = torch.LongTensor(tokens) 37 | token = 0 38 | with open(path, 'r') as f: 39 | for line in f: 40 | words = line.split() + [''] 41 | for word in words: 42 | ids[token] = self.dictionary.word2idx[word] 43 | token += 1 44 | num_batches = ids.size(0) // batch_size 45 | ids = ids[:num_batches*batch_size] 46 | return ids.view(batch_size, -1) -------------------------------------------------------------------------------- /tutorials/02-intermediate/language_model/main-gpu.py: -------------------------------------------------------------------------------- 1 | # Some part of the code was referenced from below. 2 | # https://github.com/pytorch/examples/tree/master/word_language_model 3 | import torch 4 | import torch.nn as nn 5 | import numpy as np 6 | from torch.autograd import Variable 7 | from data_utils import Dictionary, Corpus 8 | 9 | # Hyper Parameters 10 | embed_size = 128 11 | hidden_size = 1024 12 | num_layers = 1 13 | num_epochs = 5 14 | num_samples = 1000 # number of words to be sampled 15 | batch_size = 20 16 | seq_length = 30 17 | learning_rate = 0.002 18 | 19 | # Load Penn Treebank Dataset 20 | train_path = './data/train.txt' 21 | sample_path = './sample.txt' 22 | corpus = Corpus() 23 | ids = corpus.get_data(train_path, batch_size) 24 | vocab_size = len(corpus.dictionary) 25 | num_batches = ids.size(1) // seq_length 26 | 27 | # RNN Based Language Model 28 | class RNNLM(nn.Module): 29 | def __init__(self, vocab_size, embed_size, hidden_size, num_layers): 30 | super(RNNLM, self).__init__() 31 | self.embed = nn.Embedding(vocab_size, embed_size) 32 | self.lstm = nn.LSTM(embed_size, hidden_size, num_layers, batch_first=True) 33 | self.linear = nn.Linear(hidden_size, vocab_size) 34 | self.init_weights() 35 | 36 | def init_weights(self): 37 | self.embed.weight.data.uniform_(-0.1, 0.1) 38 | self.linear.bias.data.fill_(0) 39 | self.linear.weight.data.uniform_(-0.1, 0.1) 
40 | 41 | def forward(self, x, h): 42 | # Embed word ids to vectors 43 | x = self.embed(x) 44 | 45 | # Forward propagate RNN 46 | out, h = self.lstm(x, h) 47 | 48 | # Reshape output to (batch_size*sequence_length, hidden_size) 49 | out = out.contiguous().view(out.size(0)*out.size(1), out.size(2)) 50 | 51 | # Decode hidden states of all time steps 52 | out = self.linear(out) 53 | return out, h 54 | 55 | model = RNNLM(vocab_size, embed_size, hidden_size, num_layers) 56 | model.cuda() 57 | 58 | # Loss and Optimizer 59 | criterion = nn.CrossEntropyLoss() 60 | optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate) 61 | 62 | # Truncated Backpropagation 63 | def detach(states): 64 | return [state.detach() for state in states] 65 | 66 | # Training 67 | for epoch in range(num_epochs): 68 | # Initial hidden and memory states 69 | states = (Variable(torch.zeros(num_layers, batch_size, hidden_size)).cuda(), 70 | Variable(torch.zeros(num_layers, batch_size, hidden_size)).cuda()) 71 | 72 | for i in range(0, ids.size(1) - seq_length, seq_length): 73 | # Get batch inputs and targets 74 | inputs = Variable(ids[:, i:i+seq_length]).cuda() 75 | targets = Variable(ids[:, (i+1):(i+1)+seq_length].contiguous()).cuda() 76 | 77 | # Forward + Backward + Optimize 78 | model.zero_grad() 79 | states = detach(states) 80 | outputs, states = model(inputs, states) 81 | loss = criterion(outputs, targets.view(-1)) 82 | loss.backward() 83 | torch.nn.utils.clip_grad_norm(model.parameters(), 0.5) 84 | optimizer.step() 85 | 86 | step = (i+1) // seq_length 87 | if step % 100 == 0: 88 | print ('Epoch [%d/%d], Step[%d/%d], Loss: %.3f, Perplexity: %5.2f' % 89 | (epoch+1, num_epochs, step, num_batches, loss.data[0], np.exp(loss.data[0]))) 90 | 91 | # Sampling 92 | with open(sample_path, 'w') as f: 93 | # Set initial hidden and memory states 94 | state = (Variable(torch.zeros(num_layers, 1, hidden_size)).cuda(), 95 | Variable(torch.zeros(num_layers, 1, hidden_size)).cuda()) 96 | 97 | # Select one 
word id randomly 98 | prob = torch.ones(vocab_size) 99 | input = Variable(torch.multinomial(prob, num_samples=1).unsqueeze(1), 100 | volatile=True).cuda() 101 | 102 | for i in range(num_samples): 103 | # Forward propagate rnn 104 | output, state = model(input, state) 105 | 106 | # Sample a word id 107 | prob = output.squeeze().data.exp().cpu() 108 | word_id = torch.multinomial(prob, 1)[0] 109 | 110 | # Feed sampled word id to next time step 111 | input.data.fill_(word_id) 112 | 113 | # File write 114 | word = corpus.dictionary.idx2word[word_id] 115 | word = '\n' if word == '' else word + ' ' 116 | f.write(word) 117 | 118 | if (i+1) % 100 == 0: 119 | print('Sampled [%d/%d] words and save to %s'%(i+1, num_samples, sample_path)) 120 | 121 | # Save the Trained Model 122 | torch.save(model.state_dict(), 'model.pkl') 123 | -------------------------------------------------------------------------------- /tutorials/02-intermediate/language_model/main.py: -------------------------------------------------------------------------------- 1 | # Some part of the code was referenced from below. 
2 | # https://github.com/pytorch/examples/tree/master/word_language_model 3 | import torch 4 | import torch.nn as nn 5 | import numpy as np 6 | from torch.autograd import Variable 7 | from data_utils import Dictionary, Corpus 8 | 9 | # Hyper Parameters 10 | embed_size = 128 11 | hidden_size = 1024 12 | num_layers = 1 13 | num_epochs = 5 14 | num_samples = 1000 # number of words to be sampled 15 | batch_size = 20 16 | seq_length = 30 17 | learning_rate = 0.002 18 | 19 | # Load Penn Treebank Dataset 20 | train_path = './data/train.txt' 21 | sample_path = './sample.txt' 22 | corpus = Corpus() 23 | ids = corpus.get_data(train_path, batch_size) 24 | vocab_size = len(corpus.dictionary) 25 | num_batches = ids.size(1) // seq_length 26 | 27 | # RNN Based Language Model 28 | class RNNLM(nn.Module): 29 | def __init__(self, vocab_size, embed_size, hidden_size, num_layers): 30 | super(RNNLM, self).__init__() 31 | self.embed = nn.Embedding(vocab_size, embed_size) 32 | self.lstm = nn.LSTM(embed_size, hidden_size, num_layers, batch_first=True) 33 | self.linear = nn.Linear(hidden_size, vocab_size) 34 | self.init_weights() 35 | 36 | def init_weights(self): 37 | self.embed.weight.data.uniform_(-0.1, 0.1) 38 | self.linear.bias.data.fill_(0) 39 | self.linear.weight.data.uniform_(-0.1, 0.1) 40 | 41 | def forward(self, x, h): 42 | # Embed word ids to vectors 43 | x = self.embed(x) 44 | 45 | # Forward propagate RNN 46 | out, h = self.lstm(x, h) 47 | 48 | # Reshape output to (batch_size*sequence_length, hidden_size) 49 | out = out.contiguous().view(out.size(0)*out.size(1), out.size(2)) 50 | 51 | # Decode hidden states of all time step 52 | out = self.linear(out) 53 | return out, h 54 | 55 | model = RNNLM(vocab_size, embed_size, hidden_size, num_layers) 56 | 57 | 58 | # Loss and Optimizer 59 | criterion = nn.CrossEntropyLoss() 60 | optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate) 61 | 62 | # Truncated Backpropagation 63 | def detach(states): 64 | return [state.detach() for 
state in states] 65 | 66 | # Training 67 | for epoch in range(num_epochs): 68 | # Initial hidden and memory states 69 | states = (Variable(torch.zeros(num_layers, batch_size, hidden_size)), 70 | Variable(torch.zeros(num_layers, batch_size, hidden_size))) 71 | 72 | for i in range(0, ids.size(1) - seq_length, seq_length): 73 | # Get batch inputs and targets 74 | inputs = Variable(ids[:, i:i+seq_length]) 75 | targets = Variable(ids[:, (i+1):(i+1)+seq_length].contiguous()) 76 | 77 | # Forward + Backward + Optimize 78 | model.zero_grad() 79 | states = detach(states) 80 | outputs, states = model(inputs, states) 81 | loss = criterion(outputs, targets.view(-1)) 82 | loss.backward() 83 | torch.nn.utils.clip_grad_norm(model.parameters(), 0.5) 84 | optimizer.step() 85 | 86 | step = (i+1) // seq_length 87 | if step % 100 == 0: 88 | print ('Epoch [%d/%d], Step[%d/%d], Loss: %.3f, Perplexity: %5.2f' % 89 | (epoch+1, num_epochs, step, num_batches, loss.data[0], np.exp(loss.data[0]))) 90 | 91 | # Sampling 92 | with open(sample_path, 'w') as f: 93 | # Set initial hidden and memory states 94 | state = (Variable(torch.zeros(num_layers, 1, hidden_size)), 95 | Variable(torch.zeros(num_layers, 1, hidden_size))) 96 | 97 | # Select one word id randomly 98 | prob = torch.ones(vocab_size) 99 | input = Variable(torch.multinomial(prob, num_samples=1).unsqueeze(1), 100 | volatile=True) 101 | 102 | for i in range(num_samples): 103 | # Forward propagate rnn 104 | output, state = model(input, state) 105 | 106 | # Sample a word id 107 | prob = output.squeeze().data.exp() 108 | word_id = torch.multinomial(prob, 1)[0] 109 | 110 | # Feed sampled word id to next time step 111 | input.data.fill_(word_id) 112 | 113 | # File write 114 | word = corpus.dictionary.idx2word[word_id] 115 | word = '\n' if word == '' else word + ' ' 116 | f.write(word) 117 | 118 | if (i+1) % 100 == 0: 119 | print('Sampled [%d/%d] words and save to %s'%(i+1, num_samples, sample_path)) 120 | 121 | # Save the Trained Model 122 | 
torch.save(model.state_dict(), 'model.pkl') 123 | -------------------------------------------------------------------------------- /tutorials/02-intermediate/recurrent_neural_network/main-gpu.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torchvision.datasets as dsets 4 | import torchvision.transforms as transforms 5 | from torch.autograd import Variable 6 | 7 | 8 | # Hyper Parameters 9 | sequence_length = 28 10 | input_size = 28 11 | hidden_size = 128 12 | num_layers = 2 13 | num_classes = 10 14 | batch_size = 100 15 | num_epochs = 2 16 | learning_rate = 0.01 17 | 18 | # MNIST Dataset 19 | train_dataset = dsets.MNIST(root='./data/', 20 | train=True, 21 | transform=transforms.ToTensor(), 22 | download=True) 23 | 24 | test_dataset = dsets.MNIST(root='./data/', 25 | train=False, 26 | transform=transforms.ToTensor()) 27 | 28 | # Data Loader (Input Pipeline) 29 | train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 30 | batch_size=batch_size, 31 | shuffle=True) 32 | 33 | test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 34 | batch_size=batch_size, 35 | shuffle=False) 36 | 37 | # RNN Model (Many-to-One) 38 | class RNN(nn.Module): 39 | def __init__(self, input_size, hidden_size, num_layers, num_classes): 40 | super(RNN, self).__init__() 41 | self.hidden_size = hidden_size 42 | self.num_layers = num_layers 43 | self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True) 44 | self.fc = nn.Linear(hidden_size, num_classes) 45 | 46 | def forward(self, x): 47 | # Set initial states 48 | h0 = Variable(torch.zeros(self.num_layers, x.size(0), self.hidden_size).cuda()) 49 | c0 = Variable(torch.zeros(self.num_layers, x.size(0), self.hidden_size).cuda()) 50 | 51 | # Forward propagate RNN 52 | out, _ = self.lstm(x, (h0, c0)) 53 | 54 | # Decode hidden state of last time step 55 | out = self.fc(out[:, -1, :]) 56 | return out 57 | 58 | rnn = 
RNN(input_size, hidden_size, num_layers, num_classes) 59 | rnn.cuda() 60 | 61 | # Loss and Optimizer 62 | criterion = nn.CrossEntropyLoss() 63 | optimizer = torch.optim.Adam(rnn.parameters(), lr=learning_rate) 64 | 65 | # Train the Model 66 | for epoch in range(num_epochs): 67 | for i, (images, labels) in enumerate(train_loader): 68 | images = Variable(images.view(-1, sequence_length, input_size)).cuda() 69 | labels = Variable(labels).cuda() 70 | 71 | # Forward + Backward + Optimize 72 | optimizer.zero_grad() 73 | outputs = rnn(images) 74 | loss = criterion(outputs, labels) 75 | loss.backward() 76 | optimizer.step() 77 | 78 | if (i+1) % 100 == 0: 79 | print ('Epoch [%d/%d], Step [%d/%d], Loss: %.4f' 80 | %(epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.data[0])) 81 | 82 | # Test the Model 83 | correct = 0 84 | total = 0 85 | for images, labels in test_loader: 86 | images = Variable(images.view(-1, sequence_length, input_size)).cuda() 87 | outputs = rnn(images) 88 | _, predicted = torch.max(outputs.data, 1) 89 | total += labels.size(0) 90 | correct += (predicted.cpu() == labels).sum() 91 | 92 | print('Test Accuracy of the model on the 10000 test images: %d %%' % (100 * correct / total)) 93 | 94 | # Save the Model 95 | torch.save(rnn.state_dict(), 'rnn.pkl') -------------------------------------------------------------------------------- /tutorials/02-intermediate/recurrent_neural_network/main.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torchvision.datasets as dsets 4 | import torchvision.transforms as transforms 5 | from torch.autograd import Variable 6 | 7 | 8 | # Hyper Parameters 9 | sequence_length = 28 10 | input_size = 28 11 | hidden_size = 128 12 | num_layers = 2 13 | num_classes = 10 14 | batch_size = 100 15 | num_epochs = 2 16 | learning_rate = 0.01 17 | 18 | # MNIST Dataset 19 | train_dataset = dsets.MNIST(root='./data/', 20 | train=True, 21 | 
transform=transforms.ToTensor(), 22 | download=True) 23 | 24 | test_dataset = dsets.MNIST(root='./data/', 25 | train=False, 26 | transform=transforms.ToTensor()) 27 | 28 | # Data Loader (Input Pipeline) 29 | train_loader = torch.utils.data.DataLoader(dataset=train_dataset, 30 | batch_size=batch_size, 31 | shuffle=True) 32 | 33 | test_loader = torch.utils.data.DataLoader(dataset=test_dataset, 34 | batch_size=batch_size, 35 | shuffle=False) 36 | 37 | # RNN Model (Many-to-One) 38 | class RNN(nn.Module): 39 | def __init__(self, input_size, hidden_size, num_layers, num_classes): 40 | super(RNN, self).__init__() 41 | self.hidden_size = hidden_size 42 | self.num_layers = num_layers 43 | self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True) 44 | self.fc = nn.Linear(hidden_size, num_classes) 45 | 46 | def forward(self, x): 47 | # Set initial states 48 | h0 = Variable(torch.zeros(self.num_layers, x.size(0), self.hidden_size)) 49 | c0 = Variable(torch.zeros(self.num_layers, x.size(0), self.hidden_size)) 50 | 51 | # Forward propagate RNN 52 | out, _ = self.lstm(x, (h0, c0)) 53 | 54 | # Decode hidden state of last time step 55 | out = self.fc(out[:, -1, :]) 56 | return out 57 | 58 | rnn = RNN(input_size, hidden_size, num_layers, num_classes) 59 | 60 | 61 | # Loss and Optimizer 62 | criterion = nn.CrossEntropyLoss() 63 | optimizer = torch.optim.Adam(rnn.parameters(), lr=learning_rate) 64 | 65 | # Train the Model 66 | for epoch in range(num_epochs): 67 | for i, (images, labels) in enumerate(train_loader): 68 | images = Variable(images.view(-1, sequence_length, input_size)) 69 | labels = Variable(labels) 70 | 71 | # Forward + Backward + Optimize 72 | optimizer.zero_grad() 73 | outputs = rnn(images) 74 | loss = criterion(outputs, labels) 75 | loss.backward() 76 | optimizer.step() 77 | 78 | if (i+1) % 100 == 0: 79 | print ('Epoch [%d/%d], Step [%d/%d], Loss: %.4f' 80 | %(epoch+1, num_epochs, i+1, len(train_dataset)//batch_size, loss.data[0])) 81 | 82 | # Test 
the Model 83 | correct = 0 84 | total = 0 85 | for images, labels in test_loader: 86 | images = Variable(images.view(-1, sequence_length, input_size)) 87 | outputs = rnn(images) 88 | _, predicted = torch.max(outputs.data, 1) 89 | total += labels.size(0) 90 | correct += (predicted == labels).sum() 91 | 92 | print('Test Accuracy of the model on the 10000 test images: %d %%' % (100 * correct / total)) 93 | 94 | # Save the Model 95 | torch.save(rnn.state_dict(), 'rnn.pkl') -------------------------------------------------------------------------------- /tutorials/03-advanced/deep_convolutional_gan/README.md: -------------------------------------------------------------------------------- 1 | ## Deep Convolutional GAN 2 | A [Generative Adversarial Network](https://arxiv.org/abs/1406.2661) is a generative model that consists of a discriminator and a generator. The discriminator is a binary classifier trained to classify real images as real and fake images as fake; that is, it is trained to assign 1 to a real image and 0 to a fake image. The generator creates an image from a latent code and is trained to produce images that cannot be distinguished from real ones, in order to deceive the discriminator. 3 | 4 | In the [Deep Convolutional GAN (DCGAN)](https://arxiv.org/abs/1511.06434) paper, the authors introduce architecture guidelines for stable GAN training. They replace all pooling layers with strided convolutions (in the discriminator) and fractional-strided convolutions (in the generator), and use batchnorm in both the discriminator and the generator. In addition, they use ReLU activation in the generator and LeakyReLU activation in the discriminator. In our case, however, we use LeakyReLU activation in both models to avoid sparse gradients. 5 | 6 | ![alt text](png/dcgan.png) 7 | 8 | 9 | ## Usage 10 | 11 | #### 1. 
Install the dependencies 12 | ```bash 13 | $ pip install -r requirements.txt 14 | ``` 15 | 16 | #### 2. Download the dataset 17 | ```bash 18 | $ chmod +x download.sh 19 | $ ./download.sh 20 | ``` 21 | 22 | #### 3. Train the model 23 | ```bash 24 | $ python main.py --mode='train' 25 | ``` 26 | 27 | #### 4. Sample the images 28 | ```bash 29 | $ python main.py --mode='sample' 30 | ``` 31 | 32 | 33 | 34 |
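As a sanity check on the strided and fractional-strided convolutions mentioned in the overview, the following minimal sketch computes the spatial sizes that each layer should produce. It assumes the kernel size 4, stride 2, padding 1 settings used in `model.py`, with no dilation and no output padding. The helper names (`conv_out`, `deconv_out`, `generator_sizes`, `discriminator_sizes`) are illustrative only and are not part of the tutorial code.

```python
def conv_out(size, kernel, stride, pad):
    """Spatial output size of a standard convolution (no dilation)."""
    return (size + 2 * pad - kernel) // stride + 1


def deconv_out(size, kernel, stride, pad):
    """Spatial output size of a transposed convolution (no dilation, no output padding)."""
    return (size - 1) * stride - 2 * pad + kernel


def generator_sizes(image_size=64):
    """Sizes after each generator layer, starting from a 1x1 latent map."""
    # First layer: kernel image_size/16, stride 1, no padding.
    size = deconv_out(1, image_size // 16, 1, 0)
    sizes = [size]
    # Four upsampling layers, each doubling the spatial size.
    for _ in range(4):
        size = deconv_out(size, 4, 2, 1)
        sizes.append(size)
    return sizes


def discriminator_sizes(image_size=64):
    """Sizes after each discriminator layer, starting from the input image."""
    size = image_size
    sizes = []
    # Four downsampling layers, each halving the spatial size.
    for _ in range(4):
        size = conv_out(size, 4, 2, 1)
        sizes.append(size)
    # Final layer: kernel image_size/16, stride 1, no padding.
    sizes.append(conv_out(size, image_size // 16, 1, 0))
    return sizes


if __name__ == "__main__":
    print(generator_sizes(64))
    print(discriminator_sizes(64))
```

With `image_size=64` the generator sizes should agree with the shape comments in `model.py` (4, 8, 16, 32, 64), and the discriminator should end with a single value per image.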
35 | 36 | ## Results 37 | 38 | The following is the result on the CelebA dataset. 39 | 40 | ![alt text](png/sample1.png) 41 | ![alt text](png/sample2.png) 42 | -------------------------------------------------------------------------------- /tutorials/03-advanced/deep_convolutional_gan/data_loader.py: -------------------------------------------------------------------------------- 1 | import os 2 | from torch.utils import data 3 | from torchvision import transforms 4 | from PIL import Image 5 | 6 | 7 | class ImageFolder(data.Dataset): 8 | """Custom Dataset compatible with prebuilt DataLoader. 9 | 10 | This is just for tutorial. You can use the prebuilt torchvision.datasets.ImageFolder. 11 | """ 12 | def __init__(self, root, transform=None): 13 | """Initializes image paths and preprocessing module.""" 14 | self.image_paths = list(map(lambda x: os.path.join(root, x), os.listdir(root))) 15 | self.transform = transform 16 | 17 | def __getitem__(self, index): 18 | """Reads an image from a file and preprocesses it and returns.""" 19 | image_path = self.image_paths[index] 20 | image = Image.open(image_path).convert('RGB') 21 | if self.transform is not None: 22 | image = self.transform(image) 23 | return image 24 | 25 | def __len__(self): 26 | """Returns the total number of image files.""" 27 | return len(self.image_paths) 28 | 29 | 30 | def get_loader(image_path, image_size, batch_size, num_workers=2): 31 | """Builds and returns Dataloader.""" 32 | 33 | transform = transforms.Compose([ 34 | transforms.Scale(image_size), 35 | transforms.ToTensor(), 36 | transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))]) 37 | 38 | dataset = ImageFolder(image_path, transform) 39 | data_loader = data.DataLoader(dataset=dataset, 40 | batch_size=batch_size, 41 | shuffle=True, 42 | num_workers=num_workers) 43 | return data_loader -------------------------------------------------------------------------------- /tutorials/03-advanced/deep_convolutional_gan/download.sh: 
-------------------------------------------------------------------------------- 1 | wget "https://www.dropbox.com/s/e0ig4nf1v94hyj8/CelebA_128crop_FD.zip?dl=0" -O CelebA_128crop_FD.zip 2 | unzip CelebA_128crop_FD.zip -d ./ 3 | -------------------------------------------------------------------------------- /tutorials/03-advanced/deep_convolutional_gan/main.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | from solver import Solver 4 | from data_loader import get_loader 5 | from torch.backends import cudnn 6 | 7 | 8 | def main(config): 9 | cudnn.benchmark = True 10 | 11 | data_loader = get_loader(image_path=config.image_path, 12 | image_size=config.image_size, 13 | batch_size=config.batch_size, 14 | num_workers=config.num_workers) 15 | 16 | solver = Solver(config, data_loader) 17 | 18 | # Create directories if not exist 19 | if not os.path.exists(config.model_path): 20 | os.makedirs(config.model_path) 21 | if not os.path.exists(config.sample_path): 22 | os.makedirs(config.sample_path) 23 | 24 | # Train and sample the images 25 | if config.mode == 'train': 26 | solver.train() 27 | elif config.mode == 'sample': 28 | solver.sample() 29 | 30 | if __name__ == '__main__': 31 | parser = argparse.ArgumentParser() 32 | 33 | # model hyper-parameters 34 | parser.add_argument('--image_size', type=int, default=64) 35 | parser.add_argument('--z_dim', type=int, default=100) 36 | parser.add_argument('--g_conv_dim', type=int, default=64) 37 | parser.add_argument('--d_conv_dim', type=int, default=64) 38 | 39 | # training hyper-parameters 40 | parser.add_argument('--num_epochs', type=int, default=20) 41 | parser.add_argument('--batch_size', type=int, default=32) 42 | parser.add_argument('--sample_size', type=int, default=100) 43 | parser.add_argument('--num_workers', type=int, default=2) 44 | parser.add_argument('--lr', type=float, default=0.0002) 45 | parser.add_argument('--beta1', type=float, default=0.5) # momentum1 in Adam 46 | 
parser.add_argument('--beta2', type=float, default=0.999) # momentum2 in Adam 47 | 48 | # misc 49 | parser.add_argument('--mode', type=str, default='train') 50 | parser.add_argument('--model_path', type=str, default='./models') 51 | parser.add_argument('--sample_path', type=str, default='./samples') 52 | parser.add_argument('--image_path', type=str, default='./CelebA/128_crop') 53 | parser.add_argument('--log_step', type=int, default=10) 54 | parser.add_argument('--sample_step', type=int, default=500) 55 | 56 | config = parser.parse_args() 57 | print(config) 58 | main(config) -------------------------------------------------------------------------------- /tutorials/03-advanced/deep_convolutional_gan/model.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import torch.nn.functional as F 3 | 4 | 5 | def deconv(c_in, c_out, k_size, stride=2, pad=1, bn=True): 6 | """Custom deconvolutional layer for simplicity.""" 7 | layers = [] 8 | layers.append(nn.ConvTranspose2d(c_in, c_out, k_size, stride, pad)) 9 | if bn: 10 | layers.append(nn.BatchNorm2d(c_out)) 11 | return nn.Sequential(*layers) 12 | 13 | 14 | class Generator(nn.Module): 15 | """Generator containing 5 deconvolutional layers.""" 16 | def __init__(self, z_dim=256, image_size=128, conv_dim=64): 17 | super(Generator, self).__init__() 18 | self.fc = deconv(z_dim, conv_dim*8, int(image_size/16), 1, 0, bn=False) 19 | self.deconv1 = deconv(conv_dim*8, conv_dim*4, 4) 20 | self.deconv2 = deconv(conv_dim*4, conv_dim*2, 4) 21 | self.deconv3 = deconv(conv_dim*2, conv_dim, 4) 22 | self.deconv4 = deconv(conv_dim, 3, 4, bn=False) 23 | 24 | def forward(self, z): 25 | z = z.view(z.size(0), z.size(1), 1, 1) # If image_size is 64, output shape is as below. 
26 | out = self.fc(z) # (?, 512, 4, 4) 27 | out = F.leaky_relu(self.deconv1(out), 0.05) # (?, 256, 8, 8) 28 | out = F.leaky_relu(self.deconv2(out), 0.05) # (?, 128, 16, 16) 29 | out = F.leaky_relu(self.deconv3(out), 0.05) # (?, 64, 32, 32) 30 | out = F.tanh(self.deconv4(out)) # (?, 3, 64, 64) 31 | return out 32 | 33 | 34 | def conv(c_in, c_out, k_size, stride=2, pad=1, bn=True): 35 | """Custom convolutional layer for simplicity.""" 36 | layers = [] 37 | layers.append(nn.Conv2d(c_in, c_out, k_size, stride, pad)) 38 | if bn: 39 | layers.append(nn.BatchNorm2d(c_out)) 40 | return nn.Sequential(*layers) 41 | 42 | 43 | class Discriminator(nn.Module): 44 | """Discriminator containing 4 convolutional layers.""" 45 | def __init__(self, image_size=128, conv_dim=64): 46 | super(Discriminator, self).__init__() 47 | self.conv1 = conv(3, conv_dim, 4, bn=False) 48 | self.conv2 = conv(conv_dim, conv_dim*2, 4) 49 | self.conv3 = conv(conv_dim*2, conv_dim*4, 4) 50 | self.conv4 = conv(conv_dim*4, conv_dim*8, 4) 51 | self.fc = conv(conv_dim*8, 1, int(image_size/16), 1, 0, False) 52 | 53 | def forward(self, x): # If image_size is 64, output shape is as below. 
54 | out = F.leaky_relu(self.conv1(x), 0.05) # (?, 64, 32, 32) 55 | out = F.leaky_relu(self.conv2(out), 0.05) # (?, 128, 16, 16) 56 | out = F.leaky_relu(self.conv3(out), 0.05) # (?, 256, 8, 8) 57 | out = F.leaky_relu(self.conv4(out), 0.05) # (?, 512, 4, 4) 58 | out = self.fc(out).squeeze() 59 | return out -------------------------------------------------------------------------------- /tutorials/03-advanced/deep_convolutional_gan/png/dcgan.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/deep_convolutional_gan/png/dcgan.png -------------------------------------------------------------------------------- /tutorials/03-advanced/deep_convolutional_gan/png/sample1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/deep_convolutional_gan/png/sample1.png -------------------------------------------------------------------------------- /tutorials/03-advanced/deep_convolutional_gan/png/sample2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/deep_convolutional_gan/png/sample2.png -------------------------------------------------------------------------------- /tutorials/03-advanced/deep_convolutional_gan/requirements.txt: -------------------------------------------------------------------------------- 1 | torch 2 | torchvision 3 | Pillow 4 | argparse 5 | -------------------------------------------------------------------------------- /tutorials/03-advanced/deep_convolutional_gan/solver.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import 
torchvision 3 | import os 4 | from torch import optim 5 | from torch.autograd import Variable 6 | from model import Discriminator 7 | from model import Generator 8 | 9 | 10 | class Solver(object): 11 | def __init__(self, config, data_loader): 12 | self.generator = None 13 | self.discriminator = None 14 | self.g_optimizer = None 15 | self.d_optimizer = None 16 | self.g_conv_dim = config.g_conv_dim 17 | self.d_conv_dim = config.d_conv_dim 18 | self.z_dim = config.z_dim 19 | self.beta1 = config.beta1 20 | self.beta2 = config.beta2 21 | self.image_size = config.image_size 22 | self.data_loader = data_loader 23 | self.num_epochs = config.num_epochs 24 | self.batch_size = config.batch_size 25 | self.sample_size = config.sample_size 26 | self.lr = config.lr 27 | self.log_step = config.log_step 28 | self.sample_step = config.sample_step 29 | self.sample_path = config.sample_path 30 | self.model_path = config.model_path 31 | self.build_model() 32 | 33 | def build_model(self): 34 | """Build generator and discriminator.""" 35 | self.generator = Generator(z_dim=self.z_dim, 36 | image_size=self.image_size, 37 | conv_dim=self.g_conv_dim) 38 | self.discriminator = Discriminator(image_size=self.image_size, 39 | conv_dim=self.d_conv_dim) 40 | self.g_optimizer = optim.Adam(self.generator.parameters(), 41 | self.lr, [self.beta1, self.beta2]) 42 | self.d_optimizer = optim.Adam(self.discriminator.parameters(), 43 | self.lr, [self.beta1, self.beta2]) 44 | 45 | if torch.cuda.is_available(): 46 | self.generator.cuda() 47 | self.discriminator.cuda() 48 | 49 | def to_variable(self, x): 50 | """Convert tensor to variable.""" 51 | if torch.cuda.is_available(): 52 | x = x.cuda() 53 | return Variable(x) 54 | 55 | def to_data(self, x): 56 | """Convert variable to tensor.""" 57 | if torch.cuda.is_available(): 58 | x = x.cpu() 59 | return x.data 60 | 61 | def reset_grad(self): 62 | """Zero the gradient buffers.""" 63 | self.discriminator.zero_grad() 64 | self.generator.zero_grad() 65 | 66 | def 
denorm(self, x): 67 | """Convert range (-1, 1) to (0, 1)""" 68 | out = (x + 1) / 2 69 | return out.clamp(0, 1) 70 | 71 | def train(self): 72 | """Train generator and discriminator.""" 73 | fixed_noise = self.to_variable(torch.randn(self.batch_size, self.z_dim)) 74 | total_step = len(self.data_loader) 75 | for epoch in range(self.num_epochs): 76 | for i, images in enumerate(self.data_loader): 77 | 78 | #===================== Train D =====================# 79 | images = self.to_variable(images) 80 | batch_size = images.size(0) 81 | noise = self.to_variable(torch.randn(batch_size, self.z_dim)) 82 | 83 | # Train D to recognize real images as real. 84 | outputs = self.discriminator(images) 85 | real_loss = torch.mean((outputs - 1) ** 2) # L2 loss instead of Binary cross entropy loss (this is optional for stable training) 86 | 87 | # Train D to recognize fake images as fake. 88 | fake_images = self.generator(noise) 89 | outputs = self.discriminator(fake_images) 90 | fake_loss = torch.mean(outputs ** 2) 91 | 92 | # Backprop + optimize 93 | d_loss = real_loss + fake_loss 94 | self.reset_grad() 95 | d_loss.backward() 96 | self.d_optimizer.step() 97 | 98 | #===================== Train G =====================# 99 | noise = self.to_variable(torch.randn(batch_size, self.z_dim)) 100 | 101 | # Train G so that D recognizes G(z) as real. 
102 | fake_images = self.generator(noise) 103 | outputs = self.discriminator(fake_images) 104 | g_loss = torch.mean((outputs - 1) ** 2) 105 | 106 | # Backprop + optimize 107 | self.reset_grad() 108 | g_loss.backward() 109 | self.g_optimizer.step() 110 | 111 | # print the log info 112 | if (i+1) % self.log_step == 0: 113 | print('Epoch [%d/%d], Step[%d/%d], d_real_loss: %.4f, ' 114 | 'd_fake_loss: %.4f, g_loss: %.4f' 115 | %(epoch+1, self.num_epochs, i+1, total_step, 116 | real_loss.data[0], fake_loss.data[0], g_loss.data[0])) 117 | 118 | # save the sampled images 119 | if (i+1) % self.sample_step == 0: 120 | fake_images = self.generator(fixed_noise) 121 | torchvision.utils.save_image(self.denorm(fake_images.data), 122 | os.path.join(self.sample_path, 123 | 'fake_samples-%d-%d.png' %(epoch+1, i+1))) 124 | 125 | # save the model parameters for each epoch 126 | g_path = os.path.join(self.model_path, 'generator-%d.pkl' %(epoch+1)) 127 | d_path = os.path.join(self.model_path, 'discriminator-%d.pkl' %(epoch+1)) 128 | torch.save(self.generator.state_dict(), g_path) 129 | torch.save(self.discriminator.state_dict(), d_path) 130 | 131 | def sample(self): 132 | 133 | # Load trained parameters 134 | g_path = os.path.join(self.model_path, 'generator-%d.pkl' %(self.num_epochs)) 135 | d_path = os.path.join(self.model_path, 'discriminator-%d.pkl' %(self.num_epochs)) 136 | self.generator.load_state_dict(torch.load(g_path)) 137 | self.discriminator.load_state_dict(torch.load(d_path)) 138 | self.generator.eval() 139 | self.discriminator.eval() 140 | 141 | # Sample the images 142 | noise = self.to_variable(torch.randn(self.sample_size, self.z_dim)) 143 | fake_images = self.generator(noise) 144 | sample_path = os.path.join(self.sample_path, 'fake_samples-final.png') 145 | torchvision.utils.save_image(self.denorm(fake_images.data), sample_path, nrow=12) 146 | 147 | print("Saved sampled images to '%s'" %sample_path) 148 | 
-------------------------------------------------------------------------------- /tutorials/03-advanced/image_captioning/README.md: -------------------------------------------------------------------------------- 1 | # Image Captioning 2 | The goal of image captioning is to convert a given input image into a natural language description. The encoder-decoder framework is widely used for this task. The image encoder is a convolutional neural network (CNN). In this tutorial, we use the [resnet-152](https://arxiv.org/abs/1512.03385) model pretrained on the [ILSVRC-2012-CLS](http://www.image-net.org/challenges/LSVRC/2012/) image classification dataset. The decoder is a long short-term memory (LSTM) network. 3 | 4 | ![alt text](png/model.png) 5 | 6 | #### Training phase 7 | For the encoder part, the pretrained CNN extracts the feature vector from a given input image. The feature vector is linearly transformed to have the same dimension as the input dimension of the LSTM network. For the decoder part, source and target texts are predefined. For example, if the image description is **"Giraffes standing next to each other"**, the source sequence is a list containing **['\<start\>', 'Giraffes', 'standing', 'next', 'to', 'each', 'other']** and the target sequence is a list containing **['Giraffes', 'standing', 'next', 'to', 'each', 'other', '\<end\>']**. Using these source and target sequences and the feature vector, the LSTM decoder is trained as a language model conditioned on the feature vector. 8 | 9 | #### Test phase 10 | In the test phase, the encoder part is almost the same as in the training phase. The only difference is that the batchnorm layer uses the moving average and variance instead of mini-batch statistics. This can be easily implemented using [encoder.eval()](https://github.com/yunjey/pytorch-tutorial/blob/master/tutorials/03-advanced/image_captioning/sample.py#L41). For the decoder part, there is a significant difference between the training phase and the test phase. 
In the test phase, the LSTM decoder can't see the image description. To deal with this problem, the LSTM decoder feeds the previously generated word back in as the next input. This can be implemented using a [for-loop](https://github.com/yunjey/pytorch-tutorial/blob/master/tutorials/03-advanced/image_captioning/model.py#L57-L68). 11 | 12 | 13 | 14 | ## Usage 15 | 16 | 17 | #### 1. Clone the repositories 18 | ```bash 19 | $ git clone https://github.com/pdollar/coco.git 20 | $ cd coco/PythonAPI/ 21 | $ make 22 | $ python setup.py build 23 | $ python setup.py install 24 | $ cd ../../ 25 | $ git clone https://github.com/yunjey/pytorch-tutorial.git 26 | $ cd pytorch-tutorial/tutorials/03-advanced/image_captioning/ 27 | ``` 28 | 29 | #### 2. Download the dataset 30 | 31 | ```bash 32 | $ pip install -r requirements.txt 33 | $ chmod +x download.sh 34 | $ ./download.sh 35 | ``` 36 | 37 | #### 3. Preprocessing 38 | 39 | ```bash 40 | $ python build_vocab.py 41 | $ python resize.py 42 | ``` 43 | 44 | #### 4. Train the model 45 | 46 | ```bash 47 | $ python train.py 48 | ``` 49 | 50 | #### 5. Test the model 51 | 52 | ```bash 53 | $ python sample.py --image='png/example.png' 54 | ``` 55 | 56 |
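The difference between the training phase and the test phase described above can be sketched without any deep learning library. This is a minimal illustration, not the code in `model.py`: `make_source_target`, `greedy_decode` and the toy lookup `table` are hypothetical names, the `<start>` and `<end>` strings are assumed boundary markers, and the lookup table stands in for the trained LSTM's most likely next word.

```python
START, END = '<start>', '<end>'  # assumed boundary markers


def make_source_target(caption):
    """Training phase: build the shifted source/target pair for a caption."""
    words = caption.split()
    source = [START] + words   # what the decoder reads
    target = words + [END]     # what the decoder should predict
    return source, target


def greedy_decode(next_word, max_len=20):
    """Test phase: feed each predicted word back in as the next input."""
    word, generated = START, []
    for _ in range(max_len):   # maximum sampling length
        word = next_word(word)
        if word == END:
            break
        generated.append(word)
    return generated


# Toy stand-in for the trained decoder: previous word -> next word.
table = {START: 'Giraffes', 'Giraffes': 'standing', 'standing': END}

source, target = make_source_target('Giraffes standing next to each other')
print(source)
print(target)
print(greedy_decode(table.get))
```

During training the whole source sequence is known in advance, so every step can be computed from the ground-truth previous word. At test time only the loop in `greedy_decode` is available, which is why a maximum length is needed to guarantee termination.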
57 | 58 | ## Pretrained model 59 | If you do not want to train the model from scratch, you can use a pretrained model. You can download the pretrained model [here](https://www.dropbox.com/s/ne0ixz5d58ccbbz/pretrained_model.zip?dl=0) and the vocabulary file [here](https://www.dropbox.com/s/26adb7y9m98uisa/vocap.zip?dl=0). Extract pretrained_model.zip to `./models/` and vocab.pkl to `./data/` using the `unzip` command. 60 | -------------------------------------------------------------------------------- /tutorials/03-advanced/image_captioning/build_vocab.py: -------------------------------------------------------------------------------- 1 | import nltk 2 | import pickle 3 | import argparse 4 | from collections import Counter 5 | from pycocotools.coco import COCO 6 | 7 | 8 | class Vocabulary(object): 9 | """Simple vocabulary wrapper.""" 10 | def __init__(self): 11 | self.word2idx = {} 12 | self.idx2word = {} 13 | self.idx = 0 14 | 15 | def add_word(self, word): 16 | if word not in self.word2idx: 17 | self.word2idx[word] = self.idx 18 | self.idx2word[self.idx] = word 19 | self.idx += 1 20 | 21 | def __call__(self, word): 22 | if word not in self.word2idx: 23 | return self.word2idx['<unk>'] 24 | return self.word2idx[word] 25 | 26 | def __len__(self): 27 | return len(self.word2idx) 28 | 29 | def build_vocab(json, threshold): 30 | """Build a simple vocabulary wrapper.""" 31 | coco = COCO(json) 32 | counter = Counter() 33 | ids = coco.anns.keys() 34 | for i, id in enumerate(ids): 35 | caption = str(coco.anns[id]['caption']) 36 | tokens = nltk.tokenize.word_tokenize(caption.lower()) 37 | counter.update(tokens) 38 | 39 | if i % 1000 == 0: 40 | print("[%d/%d] Tokenized the captions." %(i, len(ids))) 41 | 42 | # If the word frequency is less than 'threshold', then the word is discarded. 43 | words = [word for word, cnt in counter.items() if cnt >= threshold] 44 | 45 | # Create a vocab wrapper and add the special tokens ('<pad>' gets index 0, matching the zero padding in collate_fn). 
46 | vocab = Vocabulary() 47 | vocab.add_word('<pad>') 48 | vocab.add_word('<start>') 49 | vocab.add_word('<end>') 50 | vocab.add_word('<unk>') 51 | 52 | # Add the words to the vocabulary. 53 | for i, word in enumerate(words): 54 | vocab.add_word(word) 55 | return vocab 56 | 57 | def main(args): 58 | vocab = build_vocab(json=args.caption_path, 59 | threshold=args.threshold) 60 | vocab_path = args.vocab_path 61 | with open(vocab_path, 'wb') as f: 62 | pickle.dump(vocab, f) 63 | print("Total vocabulary size: %d" %len(vocab)) 64 | print("Saved the vocabulary wrapper to '%s'" %vocab_path) 65 | 66 | 67 | if __name__ == '__main__': 68 | parser = argparse.ArgumentParser() 69 | parser.add_argument('--caption_path', type=str, 70 | default='/usr/share/mscoco/annotations/captions_train2014.json', 71 | help='path for train annotation file') 72 | parser.add_argument('--vocab_path', type=str, default='./data/vocab.pkl', 73 | help='path for saving vocabulary wrapper') 74 | parser.add_argument('--threshold', type=int, default=4, 75 | help='minimum word count threshold') 76 | args = parser.parse_args() 77 | main(args) -------------------------------------------------------------------------------- /tutorials/03-advanced/image_captioning/data_loader.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torchvision.transforms as transforms 3 | import torch.utils.data as data 4 | import os 5 | import pickle 6 | import numpy as np 7 | import nltk 8 | from PIL import Image 9 | from build_vocab import Vocabulary 10 | from pycocotools.coco import COCO 11 | 12 | 13 | class CocoDataset(data.Dataset): 14 | """COCO Custom Dataset compatible with torch.utils.data.DataLoader.""" 15 | def __init__(self, root, json, vocab, transform=None): 16 | """Set the path for images, captions and vocabulary wrapper. 17 | 18 | Args: 19 | root: image directory. 20 | json: coco annotation file path. 21 | vocab: vocabulary wrapper. 22 | transform: image transformer. 
23 | """ 24 | self.root = root 25 | self.coco = COCO(json) 26 | self.ids = list(self.coco.anns.keys()) 27 | self.vocab = vocab 28 | self.transform = transform 29 | 30 | def __getitem__(self, index): 31 | """Returns one data pair (image and caption).""" 32 | coco = self.coco 33 | vocab = self.vocab 34 | ann_id = self.ids[index] 35 | caption = coco.anns[ann_id]['caption'] 36 | img_id = coco.anns[ann_id]['image_id'] 37 | path = coco.loadImgs(img_id)[0]['file_name'] 38 | 39 | image = Image.open(os.path.join(self.root, path)).convert('RGB') 40 | if self.transform is not None: 41 | image = self.transform(image) 42 | 43 | # Convert caption (string) to word ids. 44 | tokens = nltk.tokenize.word_tokenize(str(caption).lower()) 45 | caption = [] 46 | caption.append(vocab('<start>')) 47 | caption.extend([vocab(token) for token in tokens]) 48 | caption.append(vocab('<end>')) 49 | target = torch.Tensor(caption) 50 | return image, target 51 | 52 | def __len__(self): 53 | return len(self.ids) 54 | 55 | 56 | def collate_fn(data): 57 | """Creates mini-batch tensors from the list of tuples (image, caption). 58 | 59 | We should build custom collate_fn rather than using default collate_fn, 60 | because merging caption (including padding) is not supported in default. 61 | 62 | Args: 63 | data: list of tuple (image, caption). 64 | - image: torch tensor of shape (3, 256, 256). 65 | - caption: torch tensor of shape (?); variable length. 66 | 67 | Returns: 68 | images: torch tensor of shape (batch_size, 3, 256, 256). 69 | targets: torch tensor of shape (batch_size, padded_length). 70 | lengths: list; valid length for each padded caption. 71 | """ 72 | # Sort a data list by caption length (descending order). 73 | data.sort(key=lambda x: len(x[1]), reverse=True) 74 | images, captions = zip(*data) 75 | 76 | # Merge images (from tuple of 3D tensor to 4D tensor). 77 | images = torch.stack(images, 0) 78 | 79 | # Merge captions (from tuple of 1D tensor to 2D tensor). 
80 | lengths = [len(cap) for cap in captions] 81 | targets = torch.zeros(len(captions), max(lengths)).long() 82 | for i, cap in enumerate(captions): 83 | end = lengths[i] 84 | targets[i, :end] = cap[:end] 85 | return images, targets, lengths 86 | 87 | 88 | def get_loader(root, json, vocab, transform, batch_size, shuffle, num_workers): 89 | """Returns torch.utils.data.DataLoader for custom coco dataset.""" 90 | # COCO caption dataset 91 | coco = CocoDataset(root=root, 92 | json=json, 93 | vocab=vocab, 94 | transform=transform) 95 | 96 | # Data loader for COCO dataset 97 | # This will return (images, captions, lengths) for every iteration. 98 | # images: tensor of shape (batch_size, 3, 224, 224). 99 | # captions: tensor of shape (batch_size, padded_length). 100 | # lengths: list indicating valid length for each caption. length is (batch_size). 101 | data_loader = torch.utils.data.DataLoader(dataset=coco, 102 | batch_size=batch_size, 103 | shuffle=shuffle, 104 | num_workers=num_workers, 105 | collate_fn=collate_fn) 106 | return data_loader -------------------------------------------------------------------------------- /tutorials/03-advanced/image_captioning/download.sh: -------------------------------------------------------------------------------- 1 | mkdir data 2 | wget http://msvocds.blob.core.windows.net/annotations-1-0-3/captions_train-val2014.zip -P ./data/ 3 | wget http://msvocds.blob.core.windows.net/coco2014/train2014.zip -P ./data/ 4 | wget http://msvocds.blob.core.windows.net/coco2014/val2014.zip -P ./data/ 5 | 6 | unzip ./data/captions_train-val2014.zip -d ./data/ 7 | rm ./data/captions_train-val2014.zip 8 | unzip ./data/train2014.zip -d ./data/ 9 | rm ./data/train2014.zip 10 | unzip ./data/val2014.zip -d ./data/ 11 | rm ./data/val2014.zip 12 | -------------------------------------------------------------------------------- /tutorials/03-advanced/image_captioning/model.py: -------------------------------------------------------------------------------- 1 
| import torch 2 | import torch.nn as nn 3 | import torchvision.models as models 4 | from torch.nn.utils.rnn import pack_padded_sequence 5 | from torch.autograd import Variable 6 | 7 | 8 | class EncoderCNN(nn.Module): 9 | def __init__(self, embed_size): 10 | """Load the pretrained ResNet-152 and replace top fc layer.""" 11 | super(EncoderCNN, self).__init__() 12 | resnet = models.resnet152(pretrained=True) 13 | modules = list(resnet.children())[:-1] # delete the last fc layer. 14 | self.resnet = nn.Sequential(*modules) 15 | self.linear = nn.Linear(resnet.fc.in_features, embed_size) 16 | self.bn = nn.BatchNorm1d(embed_size, momentum=0.01) 17 | self.init_weights() 18 | 19 | def init_weights(self): 20 | """Initialize the weights.""" 21 | self.linear.weight.data.normal_(0.0, 0.02) 22 | self.linear.bias.data.fill_(0) 23 | 24 | def forward(self, images): 25 | """Extract the image feature vectors.""" 26 | features = self.resnet(images) 27 | features = Variable(features.data) 28 | features = features.view(features.size(0), -1) 29 | features = self.bn(self.linear(features)) 30 | return features 31 | 32 | 33 | class DecoderRNN(nn.Module): 34 | def __init__(self, embed_size, hidden_size, vocab_size, num_layers): 35 | """Set the hyper-parameters and build the layers.""" 36 | super(DecoderRNN, self).__init__() 37 | self.embed = nn.Embedding(vocab_size, embed_size) 38 | self.lstm = nn.LSTM(embed_size, hidden_size, num_layers, batch_first=True) 39 | self.linear = nn.Linear(hidden_size, vocab_size) 40 | self.init_weights() 41 | 42 | def init_weights(self): 43 | """Initialize weights.""" 44 | self.embed.weight.data.uniform_(-0.1, 0.1) 45 | self.linear.weight.data.uniform_(-0.1, 0.1) 46 | self.linear.bias.data.fill_(0) 47 | 48 | def forward(self, features, captions, lengths): 49 | """Decode image feature vectors and generates captions.""" 50 | embeddings = self.embed(captions) 51 | embeddings = torch.cat((features.unsqueeze(1), embeddings), 1) 52 | packed = 
pack_padded_sequence(embeddings, lengths, batch_first=True) 53 | hiddens, _ = self.lstm(packed) 54 | outputs = self.linear(hiddens[0]) 55 | return outputs 56 | 57 | def sample(self, features, states=None): 58 | """Samples captions for given image features (Greedy search).""" 59 | sampled_ids = [] 60 | inputs = features.unsqueeze(1) 61 | for i in range(20): # maximum sampling length 62 | hiddens, states = self.lstm(inputs, states) # (batch_size, 1, hidden_size), 63 | outputs = self.linear(hiddens.squeeze(1)) # (batch_size, vocab_size) 64 | predicted = outputs.max(1)[1] 65 | sampled_ids.append(predicted) 66 | inputs = self.embed(predicted) 67 | inputs = inputs.unsqueeze(1) # (batch_size, 1, embed_size) 68 | sampled_ids = torch.cat(sampled_ids, 1) # (batch_size, 20) 69 | return sampled_ids.squeeze() 70 | -------------------------------------------------------------------------------- /tutorials/03-advanced/image_captioning/png/example.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/image_captioning/png/example.png -------------------------------------------------------------------------------- /tutorials/03-advanced/image_captioning/png/image_captioning.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/image_captioning/png/image_captioning.png -------------------------------------------------------------------------------- /tutorials/03-advanced/image_captioning/png/model.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/image_captioning/png/model.png 
-------------------------------------------------------------------------------- /tutorials/03-advanced/image_captioning/requirements.txt: -------------------------------------------------------------------------------- 1 | matplotlib 2 | nltk 3 | numpy 4 | Pillow 5 | argparse -------------------------------------------------------------------------------- /tutorials/03-advanced/image_captioning/resize.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import os 3 | from PIL import Image 4 | 5 | 6 | def resize_image(image, size): 7 | """Resize an image to the given size.""" 8 | return image.resize(size, Image.LANCZOS) # LANCZOS is the same filter formerly exposed as ANTIALIAS 9 | 10 | def resize_images(image_dir, output_dir, size): 11 | """Resize the images in 'image_dir' and save into 'output_dir'.""" 12 | if not os.path.exists(output_dir): 13 | os.makedirs(output_dir) 14 | 15 | images = os.listdir(image_dir) 16 | num_images = len(images) 17 | for i, image in enumerate(images): 18 | with open(os.path.join(image_dir, image), 'r+b') as f: 19 | with Image.open(f) as img: 20 | img = resize_image(img, size) 21 | img.save(os.path.join(output_dir, image), img.format) 22 | if i % 100 == 0: 23 | print ("[%d/%d] Resized the images and saved into '%s'." 
24 | %(i, num_images, output_dir)) 25 | 26 | def main(args): 27 | splits = ['train', 'val'] 28 | for split in splits: 29 | image_dir = args.image_dir 30 | output_dir = args.output_dir 31 | image_size = [args.image_size, args.image_size] 32 | resize_images(image_dir, output_dir, image_size) 33 | 34 | 35 | if __name__ == '__main__': 36 | parser = argparse.ArgumentParser() 37 | parser.add_argument('--image_dir', type=str, default='./data/train2014/', 38 | help='directory for train images') 39 | parser.add_argument('--output_dir', type=str, default='./data/resized2014/', 40 | help='directory for saving resized images') 41 | parser.add_argument('--image_size', type=int, default=256, 42 | help='size for image after processing') 43 | args = parser.parse_args() 44 | main(args) -------------------------------------------------------------------------------- /tutorials/03-advanced/image_captioning/sample.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import matplotlib.pyplot as plt 3 | import numpy as np 4 | import argparse 5 | import pickle 6 | import os 7 | from torch.autograd import Variable 8 | from torchvision import transforms 9 | from build_vocab import Vocabulary 10 | from model import EncoderCNN, DecoderRNN 11 | from PIL import Image 12 | 13 | 14 | def to_var(x, volatile=False): 15 | if torch.cuda.is_available(): 16 | x = x.cuda() 17 | return Variable(x, volatile=volatile) 18 | 19 | def load_image(image_path, transform=None): 20 | image = Image.open(image_path) 21 | image = image.resize([224, 224], Image.LANCZOS) 22 | 23 | if transform is not None: 24 | image = transform(image).unsqueeze(0) 25 | 26 | return image 27 | 28 | def main(args): 29 | # Image preprocessing 30 | transform = transforms.Compose([ 31 | transforms.ToTensor(), 32 | transforms.Normalize((0.485, 0.456, 0.406), 33 | (0.229, 0.224, 0.225))]) 34 | 35 | # Load vocabulary wrapper 36 | with open(args.vocab_path, 'rb') as f: 37 | vocab = 
pickle.load(f) 38 | 39 | # Build Models 40 | encoder = EncoderCNN(args.embed_size) 41 | encoder.eval() # evaluation mode (BN uses moving mean/variance) 42 | decoder = DecoderRNN(args.embed_size, args.hidden_size, 43 | len(vocab), args.num_layers) 44 | 45 | 46 | # Load the trained model parameters 47 | encoder.load_state_dict(torch.load(args.encoder_path)) 48 | decoder.load_state_dict(torch.load(args.decoder_path)) 49 | 50 | # Prepare Image 51 | image = load_image(args.image, transform) 52 | image_tensor = to_var(image, volatile=True) 53 | 54 | # Use the GPU if available 55 | if torch.cuda.is_available(): 56 | encoder.cuda() 57 | decoder.cuda() 58 | 59 | # Generate caption from image 60 | feature = encoder(image_tensor) 61 | sampled_ids = decoder.sample(feature) 62 | sampled_ids = sampled_ids.cpu().data.numpy() 63 | 64 | # Decode word_ids to words 65 | sampled_caption = [] 66 | for word_id in sampled_ids: 67 | word = vocab.idx2word[word_id] 68 | sampled_caption.append(word) 69 | if word == '<end>': 70 | break 71 | sentence = ' '.join(sampled_caption) 72 | 73 | # Print out image and generated caption. 
74 | print (sentence) 75 | image = Image.open(args.image) 76 | plt.imshow(np.asarray(image)) 77 | 78 | if __name__ == '__main__': 79 | parser = argparse.ArgumentParser() 80 | parser.add_argument('--image', type=str, required=True, 81 | help='input image for generating caption') 82 | parser.add_argument('--encoder_path', type=str, default='./models/encoder-5-3000.pkl', 83 | help='path for trained encoder') 84 | parser.add_argument('--decoder_path', type=str, default='./models/decoder-5-3000.pkl', 85 | help='path for trained decoder') 86 | parser.add_argument('--vocab_path', type=str, default='./data/vocab.pkl', 87 | help='path for vocabulary wrapper') 88 | 89 | # Model parameters (should be the same as the parameters in train.py) 90 | parser.add_argument('--embed_size', type=int , default=256, 91 | help='dimension of word embedding vectors') 92 | parser.add_argument('--hidden_size', type=int , default=512, 93 | help='dimension of lstm hidden states') 94 | parser.add_argument('--num_layers', type=int , default=1 , 95 | help='number of layers in lstm') 96 | args = parser.parse_args() 97 | main(args) -------------------------------------------------------------------------------- /tutorials/03-advanced/image_captioning/train.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import torch 3 | import torch.nn as nn 4 | import numpy as np 5 | import os 6 | import pickle 7 | from data_loader import get_loader 8 | from build_vocab import Vocabulary 9 | from model import EncoderCNN, DecoderRNN 10 | from torch.autograd import Variable 11 | from torch.nn.utils.rnn import pack_padded_sequence 12 | from torchvision import transforms 13 | 14 | def to_var(x, volatile=False): 15 | if torch.cuda.is_available(): 16 | x = x.cuda() 17 | return Variable(x, volatile=volatile) 18 | 19 | def main(args): 20 | # Create model directory 21 | if not os.path.exists(args.model_path): 22 | os.makedirs(args.model_path) 23 | 24 | # Image preprocessing 
25 | # For normalization, see https://github.com/pytorch/vision#models 26 | transform = transforms.Compose([ 27 | transforms.RandomCrop(args.crop_size), 28 | transforms.RandomHorizontalFlip(), 29 | transforms.ToTensor(), 30 | transforms.Normalize((0.485, 0.456, 0.406), 31 | (0.229, 0.224, 0.225))]) 32 | 33 | # Load vocabulary wrapper. 34 | with open(args.vocab_path, 'rb') as f: 35 | vocab = pickle.load(f) 36 | 37 | # Build data loader 38 | data_loader = get_loader(args.image_dir, args.caption_path, vocab, 39 | transform, args.batch_size, 40 | shuffle=True, num_workers=args.num_workers) 41 | 42 | # Build the models 43 | encoder = EncoderCNN(args.embed_size) 44 | decoder = DecoderRNN(args.embed_size, args.hidden_size, 45 | len(vocab), args.num_layers) 46 | 47 | if torch.cuda.is_available(): 48 | encoder.cuda() 49 | decoder.cuda() 50 | 51 | # Loss and Optimizer 52 | criterion = nn.CrossEntropyLoss() 53 | params = list(decoder.parameters()) + list(encoder.linear.parameters()) + list(encoder.bn.parameters()) 54 | optimizer = torch.optim.Adam(params, lr=args.learning_rate) 55 | 56 | # Train the Models 57 | total_step = len(data_loader) 58 | for epoch in range(args.num_epochs): 59 | for i, (images, captions, lengths) in enumerate(data_loader): 60 | 61 | # Set mini-batch dataset 62 | images = to_var(images, volatile=True) 63 | captions = to_var(captions) 64 | targets = pack_padded_sequence(captions, lengths, batch_first=True)[0] 65 | 66 | # Forward, Backward and Optimize 67 | decoder.zero_grad() 68 | encoder.zero_grad() 69 | features = encoder(images) 70 | outputs = decoder(features, captions, lengths) 71 | loss = criterion(outputs, targets) 72 | loss.backward() 73 | optimizer.step() 74 | 75 | # Print log info 76 | if i % args.log_step == 0: 77 | print('Epoch [%d/%d], Step [%d/%d], Loss: %.4f, Perplexity: %5.4f' 78 | %(epoch, args.num_epochs, i, total_step, 79 | loss.data[0], np.exp(loss.data[0]))) 80 | 81 | # Save the models 82 | if (i+1) % args.save_step == 0: 83 | 
torch.save(decoder.state_dict(), 84 | os.path.join(args.model_path, 85 | 'decoder-%d-%d.pkl' %(epoch+1, i+1))) 86 | torch.save(encoder.state_dict(), 87 | os.path.join(args.model_path, 88 | 'encoder-%d-%d.pkl' %(epoch+1, i+1))) 89 | 90 | if __name__ == '__main__': 91 | parser = argparse.ArgumentParser() 92 | parser.add_argument('--model_path', type=str, default='./models/' , 93 | help='path for saving trained models') 94 | parser.add_argument('--crop_size', type=int, default=224 , 95 | help='size for randomly cropping images') 96 | parser.add_argument('--vocab_path', type=str, default='./data/vocab.pkl', 97 | help='path for vocabulary wrapper') 98 | parser.add_argument('--image_dir', type=str, default='./data/resized2014' , 99 | help='directory for resized images') 100 | parser.add_argument('--caption_path', type=str, 101 | default='./data/annotations/captions_train2014.json', 102 | help='path for train annotation json file') 103 | parser.add_argument('--log_step', type=int , default=10, 104 | help='step size for printing log info') 105 | parser.add_argument('--save_step', type=int , default=1000, 106 | help='step size for saving trained models') 107 | 108 | # Model parameters 109 | parser.add_argument('--embed_size', type=int , default=256 , 110 | help='dimension of word embedding vectors') 111 | parser.add_argument('--hidden_size', type=int , default=512 , 112 | help='dimension of lstm hidden states') 113 | parser.add_argument('--num_layers', type=int , default=1 , 114 | help='number of layers in lstm') 115 | 116 | parser.add_argument('--num_epochs', type=int, default=5) 117 | parser.add_argument('--batch_size', type=int, default=128) 118 | parser.add_argument('--num_workers', type=int, default=2) 119 | parser.add_argument('--learning_rate', type=float, default=0.001) 120 | args = parser.parse_args() 121 | print(args) 122 | main(args) -------------------------------------------------------------------------------- 
/tutorials/03-advanced/neural_style_transfer/README.md: -------------------------------------------------------------------------------- 1 | # Neural Style Transfer 2 | 3 | [Neural style transfer](https://arxiv.org/abs/1508.06576) is an algorithm that combines the content of one image with the style of another image using CNN. Given a content image and a style image, the goal is to generate a target image that minimizes the content difference with the content image and the style difference with the style image. 4 | 5 |

6 | 7 | 8 | #### Content loss 9 | 10 | To minimize the content difference, we forward propagate the content image and the target image through a pretrained [VGGNet](https://arxiv.org/abs/1409.1556) separately, and extract feature maps from multiple convolutional layers. Then, the target image is updated to minimize the [mean-squared error](https://github.com/yunjey/pytorch-tutorial/blob/master/tutorials/03-advanced/neural_style_transfer/main.py#L92-L93) between its feature maps and those of the content image. 11 | 12 | #### Style loss 13 | 14 | As in computing the content loss, we forward propagate the style image and the target image through the VGGNet and extract convolutional feature maps. To generate a texture that matches the style of the style image, we update the target image by minimizing the mean-squared error between the Gram matrix of the style image and the Gram matrix of the target image (that is, we match the correlations between feature channels). See [here](https://github.com/yunjey/pytorch-tutorial/blob/master/tutorials/03-advanced/neural_style_transfer/main.py#L95-L105) for how to compute the style loss. 15 | 16 | 17 | 18 | 19 |
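The two losses above can be sketched in plain Python on tiny hand-written feature maps. This is a simplified illustration under stated assumptions, not the tutorial's `main.py`: a feature map is represented as a list of `c` channels, each a flat list of `h * w` activations, and the helper names `gram_matrix`, `mse`, `content_loss` and `style_loss` are hypothetical. The division by `c * h * w` follows the normalization used in `main.py`.

```python
def gram_matrix(features):
    """Channel-by-channel inner products of a (c, h*w) feature map."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]


def mse(xs, ys):
    """Mean-squared error between two equally shaped 2D lists."""
    diffs = [(x - y) ** 2 for rx, ry in zip(xs, ys) for x, y in zip(rx, ry)]
    return sum(diffs) / len(diffs)


def content_loss(target_feat, content_feat):
    """Compare the feature maps directly, position by position."""
    return mse(target_feat, content_feat)


def style_loss(target_feat, style_feat):
    """Compare Gram matrices, normalized by the feature map size."""
    c, hw = len(target_feat), len(target_feat[0])
    return mse(gram_matrix(target_feat), gram_matrix(style_feat)) / (c * hw)


target = [[1.0, 0.0], [0.0, 1.0]]  # 2 channels, 2 spatial positions
style = [[1.0, 1.0], [0.0, 0.0]]
print(content_loss(target, target))
print(style_loss(target, style))
```

Because the Gram matrix sums over spatial positions, shuffling the positions of a feature map leaves it unchanged. That is why the style loss captures texture rather than layout, while the content loss stays sensitive to where each activation occurs.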
20 | 21 | ## Usage 22 | 23 | ```bash 24 | $ pip install -r requirements.txt 25 | $ python main.py --content='png/content.png' --style='png/style.png' 26 | ``` 27 | 28 |
29 | 30 | ## Results 31 | The following is the result of applying various styles of artwork to Anne Hathaway's photograph. 32 | 33 | ![alt text](png/neural_style.png) 34 | -------------------------------------------------------------------------------- /tutorials/03-advanced/neural_style_transfer/main.py: -------------------------------------------------------------------------------- 1 | from __future__ import division 2 | from torch.backends import cudnn 3 | from torch.autograd import Variable 4 | from torchvision import models 5 | from torchvision import transforms 6 | from PIL import Image 7 | import argparse 8 | import torch 9 | import torchvision 10 | import torch.nn as nn 11 | import numpy as np 12 | 13 | 14 | use_cuda = torch.cuda.is_available() 15 | dtype = torch.cuda.FloatTensor if use_cuda else torch.FloatTensor 16 | 17 | # Load an image file and convert it into a tensor 18 | # unsqueeze adds a batch dimension, giving the 4D tensor that conv layers expect 19 | def load_image(image_path, transform=None, max_size=None, shape=None): 20 | image = Image.open(image_path) 21 | 22 | if max_size is not None: 23 | scale = max_size / max(image.size) 24 | size = np.array(image.size) * scale 25 | image = image.resize(size.astype(int), Image.LANCZOS) 26 | 27 | if shape is not None: 28 | image = image.resize(shape, Image.LANCZOS) 29 | 30 | if transform is not None: 31 | image = transform(image).unsqueeze(0) 32 | 33 | return image.type(dtype) 34 | 35 | # Pretrained VGGNet 36 | class VGGNet(nn.Module): 37 | def __init__(self): 38 | """Select conv1_1 ~ conv5_1 activation maps.""" 39 | super(VGGNet, self).__init__() 40 | self.select = ['0', '5', '10', '19', '28'] 41 | self.vgg = models.vgg19(pretrained=True).features 42 | 43 | def forward(self, x): 44 | """Extract 5 conv activation maps from an input image. 45 | 46 | Args: 47 | x: 4D tensor of shape (1, 3, height, width). 48 | 49 | Returns: 50 | features: a list containing 5 conv activation maps. 
51 | """ 52 | features = [] 53 | for name, layer in self.vgg._modules.items(): 54 | x = layer(x) 55 | if name in self.select: 56 | features.append(x) 57 | return features 58 | 59 | 60 | def main(config): 61 | 62 | # Image preprocessing 63 | # For normalization, see https://github.com/pytorch/vision#models 64 | transform = transforms.Compose([ 65 | transforms.ToTensor(), 66 | transforms.Normalize((0.485, 0.456, 0.406), 67 | (0.229, 0.224, 0.225))]) 68 | 69 | # Load content and style images 70 | # make content.size() == style.size() 71 | content = load_image(config.content, transform, max_size=config.max_size) 72 | style = load_image(config.style, transform, shape=[content.size(2), content.size(3)]) 73 | 74 | # Initialization and optimizer 75 | target = Variable(content.clone(), requires_grad=True) 76 | optimizer = torch.optim.Adam([target], lr=config.lr, betas=[0.5, 0.999]) 77 | 78 | vgg = VGGNet() 79 | if use_cuda: 80 | vgg.cuda() 81 | 82 | for step in range(config.total_step): 83 | 84 | # Extract multiple(5) conv feature vectors 85 | target_features = vgg(target) 86 | content_features = vgg(Variable(content)) 87 | style_features = vgg(Variable(style)) 88 | 89 | style_loss = 0 90 | content_loss = 0 91 | for f1, f2, f3 in zip(target_features, content_features, style_features): 92 | # Compute content loss (target and content image) 93 | content_loss += torch.mean((f1 - f2)**2) 94 | 95 | # Reshape conv features 96 | _, c, h, w = f1.size() 97 | f1 = f1.view(c, h * w) 98 | f3 = f3.view(c, h * w) 99 | 100 | # Compute gram matrix 101 | f1 = torch.mm(f1, f1.t()) 102 | f3 = torch.mm(f3, f3.t()) 103 | 104 | # Compute style loss (target and style image) 105 | style_loss += torch.mean((f1 - f3)**2) / (c * h * w) 106 | 107 | # Compute total loss, backprop and optimize 108 | loss = content_loss + config.style_weight * style_loss 109 | optimizer.zero_grad() 110 | loss.backward() 111 | optimizer.step() 112 | 113 | if (step+1) % config.log_step == 0: 114 | print ('Step [%d/%d], 
Content Loss: %.4f, Style Loss: %.4f' 115 | %(step+1, config.total_step, content_loss.data[0], style_loss.data[0])) 116 | 117 | if (step+1) % config.sample_step == 0: 118 | # Save the generated image 119 | denorm = transforms.Normalize((-2.12, -2.04, -1.80), (4.37, 4.46, 4.44)) 120 | img = target.clone().cpu().squeeze() 121 | img = denorm(img.data).clamp_(0, 1) 122 | torchvision.utils.save_image(img, 'output-%d.png' %(step+1)) 123 | 124 | 125 | if __name__ == "__main__": 126 | parser = argparse.ArgumentParser() 127 | parser.add_argument('--content', type=str, default='./png/content.png') 128 | parser.add_argument('--style', type=str, default='./png/style.png') 129 | parser.add_argument('--max_size', type=int, default=400) 130 | parser.add_argument('--total_step', type=int, default=5000) 131 | parser.add_argument('--log_step', type=int, default=10) 132 | parser.add_argument('--sample_step', type=int, default=1000) 133 | parser.add_argument('--style_weight', type=float, default=100) 134 | parser.add_argument('--lr', type=float, default=0.003) 135 | config = parser.parse_args() 136 | print(config) 137 | main(config) -------------------------------------------------------------------------------- /tutorials/03-advanced/neural_style_transfer/png/content.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/neural_style_transfer/png/content.png -------------------------------------------------------------------------------- /tutorials/03-advanced/neural_style_transfer/png/neural_style.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/neural_style_transfer/png/neural_style.png -------------------------------------------------------------------------------- 
/tutorials/03-advanced/neural_style_transfer/png/neural_style2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/neural_style_transfer/png/neural_style2.png -------------------------------------------------------------------------------- /tutorials/03-advanced/neural_style_transfer/png/style.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/neural_style_transfer/png/style.png -------------------------------------------------------------------------------- /tutorials/03-advanced/neural_style_transfer/png/style2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/neural_style_transfer/png/style2.png -------------------------------------------------------------------------------- /tutorials/03-advanced/neural_style_transfer/png/style3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/neural_style_transfer/png/style3.png -------------------------------------------------------------------------------- /tutorials/03-advanced/neural_style_transfer/png/style4.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/neural_style_transfer/png/style4.png -------------------------------------------------------------------------------- /tutorials/03-advanced/neural_style_transfer/requirements.txt: 
-------------------------------------------------------------------------------- 1 | argparse 2 | torch 3 | torchvision 4 | Pillow 5 | -------------------------------------------------------------------------------- /tutorials/03-advanced/variational_auto_encoder/README.md: -------------------------------------------------------------------------------- 1 | # Variational Auto-Encoder 2 | The [Variational Auto-Encoder (VAE)](https://arxiv.org/abs/1312.6114) is a generative model. From a neural network perspective, the only difference between the VAE and the Auto-Encoder (AE) is that the latent vector z in the VAE is stochastically sampled. This addresses the problem that the AE can learn an identity mapping and fail to form meaningful representations in latent space. The VAE uses the [reparameterization trick](https://github.com/yunjey/pytorch-tutorial/blob/master/tutorials/03-advanced/variational_auto_encoder/main.py#L40-L44) to enable back-propagation through the sampling step: instead of sampling z directly from the mean and variance, it samples noise from N(0, 1) and then scales and shifts it by the predicted standard deviation and mean. 3 | 4 | #### VAE loss 5 | As in conventional auto-encoders, the VAE minimizes the reconstruction loss between the input image and the generated image. In addition, the VAE pushes the distribution of z toward the standard normal distribution (through a KL-divergence term) so that the decoder can be used on its own for sampling in the test phase. 6 | 7 |
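The two ingredients described above, the reparameterization trick and the KL term that pulls z toward the standard normal, can be sketched without any framework. This is a minimal illustration using only the Python standard library, not the tutorial's PyTorch code; the function names `reparameterize` and `kl_to_standard_normal` are hypothetical, and the formulas assume a diagonal Gaussian parameterized by a mean and a log-variance, as in `main.py`.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + eps * sigma per latent dimension, with eps ~ N(0, 1).

    sigma is recovered from the log-variance as exp(log_var / 2).
    """
    return [m + rng.gauss(0.0, 1.0) * math.exp(lv / 2.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return sum(0.5 * (m * m + math.exp(lv) - lv - 1.0)
               for m, lv in zip(mu, log_var))


# Illustrative values only (assumed, not taken from a trained model).
rng = random.Random(0)
mu, log_var = [0.5, -1.0], [0.0, -2.0]
z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
print("z =", z)
print("KL =", kl)
```

Because the randomness lives entirely in `eps`, z is a deterministic function of the mean and log-variance once `eps` is drawn, which is what lets gradients flow back to the encoder. The KL term is zero exactly when the mean is 0 and the log-variance is 0, that is, when the encoder's output already matches the standard normal.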

8 | 9 | 10 | 11 | 12 | ## Usage 13 | 14 | ```bash 15 | $ pip install -r requirements.txt 16 | $ python main.py 17 | ``` 18 | 19 |
20 | 21 | ## Results 22 | Real image | Reconstructed image 23 | :-------------------------:|:-------------------------: 24 | ![alt text](png/real.png) | ![alt text](png/reconst.png) 25 | -------------------------------------------------------------------------------- /tutorials/03-advanced/variational_auto_encoder/main.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | from torch.autograd import Variable 5 | from torchvision import datasets 6 | from torchvision import transforms 7 | import torchvision 8 | 9 | # MNIST dataset 10 | dataset = datasets.MNIST(root='./data', 11 | train=True, 12 | transform=transforms.ToTensor(), 13 | download=True) 14 | 15 | # Data loader 16 | data_loader = torch.utils.data.DataLoader(dataset=dataset, 17 | batch_size=100, 18 | shuffle=True) 19 | 20 | def to_var(x): 21 | if torch.cuda.is_available(): 22 | x = x.cuda() 23 | return Variable(x) 24 | 25 | # VAE model 26 | class VAE(nn.Module): 27 | def __init__(self, image_size=784, h_dim=400, z_dim=20): 28 | super(VAE, self).__init__() 29 | self.encoder = nn.Sequential( 30 | nn.Linear(image_size, h_dim), 31 | nn.LeakyReLU(0.2), 32 | nn.Linear(h_dim, z_dim*2)) # z_dim*2 outputs: mean and log-variance. 33 | 34 | self.decoder = nn.Sequential( 35 | nn.Linear(z_dim, h_dim), 36 | nn.ReLU(), 37 | nn.Linear(h_dim, image_size), 38 | nn.Sigmoid()) 39 | 40 | def reparametrize(self, mu, log_var): 41 | """z = mean + eps * sigma where eps is sampled from N(0, 1).""" 42 | eps = to_var(torch.randn(mu.size(0), mu.size(1))) 43 | z = mu + eps * torch.exp(log_var/2) # exp(log_var/2) converts log-variance to std 44 | return z 45 | 46 | def forward(self, x): 47 | h = self.encoder(x) 48 | mu, log_var = torch.chunk(h, 2, dim=1) # mean and log variance. 
49 | z = self.reparametrize(mu, log_var) 50 | out = self.decoder(z) 51 | return out, mu, log_var 52 | 53 | def sample(self, z): 54 | return self.decoder(z) 55 | 56 | vae = VAE() 57 | 58 | if torch.cuda.is_available(): 59 | vae.cuda() 60 | 61 | optimizer = torch.optim.Adam(vae.parameters(), lr=0.001) 62 | iter_per_epoch = len(data_loader) 63 | data_iter = iter(data_loader) 64 | 65 | # fixed inputs for debugging 66 | fixed_z = to_var(torch.randn(100, 20)) 67 | fixed_x, _ = next(data_iter) 68 | torchvision.utils.save_image(fixed_x.cpu(), './data/real_images.png') 69 | fixed_x = to_var(fixed_x.view(fixed_x.size(0), -1)) 70 | 71 | for epoch in range(50): 72 | for i, (images, _) in enumerate(data_loader): 73 | 74 | images = to_var(images.view(images.size(0), -1)) 75 | out, mu, log_var = vae(images) 76 | 77 | # Compute reconstruction loss and kl divergence 78 | # For kl_divergence, see Appendix B in the paper or http://yunjey47.tistory.com/43 79 | reconst_loss = F.binary_cross_entropy(out, images, size_average=False) 80 | kl_divergence = torch.sum(0.5 * (mu**2 + torch.exp(log_var) - log_var -1)) 81 | 82 | # Backprop + Optimize 83 | total_loss = reconst_loss + kl_divergence 84 | optimizer.zero_grad() 85 | total_loss.backward() 86 | optimizer.step() 87 | 88 | if i % 100 == 0: 89 | print ("Epoch[%d/%d], Step [%d/%d], Total Loss: %.4f, " 90 | "Reconst Loss: %.4f, KL Div: %.7f" 91 | %(epoch+1, 50, i+1, iter_per_epoch, total_loss.data[0], 92 | reconst_loss.data[0], kl_divergence.data[0])) 93 | 94 | # Save the reconstructed images 95 | reconst_images, _, _ = vae(fixed_x) 96 | reconst_images = reconst_images.view(reconst_images.size(0), 1, 28, 28) 97 | torchvision.utils.save_image(reconst_images.data.cpu(), 98 | './data/reconst_images_%d.png' %(epoch+1)) 99 | -------------------------------------------------------------------------------- /tutorials/03-advanced/variational_auto_encoder/png/real.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/variational_auto_encoder/png/real.png -------------------------------------------------------------------------------- /tutorials/03-advanced/variational_auto_encoder/png/reconst.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/variational_auto_encoder/png/reconst.png -------------------------------------------------------------------------------- /tutorials/03-advanced/variational_auto_encoder/png/vae.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/03-advanced/variational_auto_encoder/png/vae.png -------------------------------------------------------------------------------- /tutorials/03-advanced/variational_auto_encoder/requirements.txt: -------------------------------------------------------------------------------- 1 | torch 2 | torchvision 3 | -------------------------------------------------------------------------------- /tutorials/04-utils/tensorboard/README.md: -------------------------------------------------------------------------------- 1 | # TensorBoard in PyTorch 2 | 3 | In this tutorial, we implement an MNIST classifier using a simple neural network and visualize the training process using [TensorBoard](https://www.tensorflow.org/get_started/summaries_and_tensorboard). During training, we plot the loss and accuracy through `scalar_summary` and visualize the training images through `image_summary`. In addition, we visualize the weight and gradient values of the network's parameters using `histo_summary`. 
The PyTorch code that calls these summary functions can be found [here](https://github.com/yunjey/pytorch-tutorial/blob/master/tutorials/04-utils/tensorboard/main.py#L83-L105). 4 | 5 | ![alt text](gif/tensorboard.gif) 6 | 7 |
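To give a feel for what a histogram summary contains, here is a framework-free sketch of the fields that `histo_summary` in `logger.py` fills in: min, max, count, sum, sum of squares, bucket limits and bucket counts. It uses only the Python standard library rather than NumPy or TensorFlow, the helper name `histogram_fields` is hypothetical, and the equal-width binning only approximates what `numpy.histogram` does (edge cases such as constant input are handled differently).

```python
def histogram_fields(values, bins=4):
    """Compute histogram summary fields for a flat list of floats.

    Hypothetical helper: mirrors the fields set in logger.py, not its exact code.
    """
    lo, hi = min(values), max(values)
    # Equal-width bins over [lo, hi]; fall back to width 1.0 for constant input.
    width = (hi - lo) / bins if hi > lo else 1.0
    counts = [0] * bins
    for v in values:
        # The last bin also includes the maximum value (closed on the right).
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return {
        "min": float(lo),
        "max": float(hi),
        "num": len(values),
        "sum": float(sum(values)),
        "sum_squares": float(sum(v * v for v in values)),
        # Upper edge of each bin; the start of the first bin is dropped.
        "bucket_limit": [lo + width * (i + 1) for i in range(bins)],
        "bucket": counts,
    }


# Illustrative input (assumed values, e.g. a handful of weights).
fields = histogram_fields([0.0, 1.0, 2.0, 3.0, 4.0], bins=4)
for name, value in fields.items():
    print(name, value)
```

In the tutorial these fields are computed for each parameter tensor and for its gradient at every logging step, which is what TensorBoard uses to draw how the weight distributions change over training.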
8 | 9 | ## Usage 10 | 11 | #### 1. Install the dependencies 12 | ```bash 13 | $ pip install -r requirements.txt 14 | ``` 15 | 16 | #### 2. Train the model 17 | ```bash 18 | $ python main.py 19 | ``` 20 | 21 | #### 3. Open the TensorBoard 22 | To run the TensorBoard, open a new terminal and run the command below. Then, open http://localhost:6006/ in your web browser. 23 | ```bash 24 | $ tensorboard --logdir='./logs' --port=6006 25 | ``` -------------------------------------------------------------------------------- /tutorials/04-utils/tensorboard/gif/g: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /tutorials/04-utils/tensorboard/gif/tensorboard.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim810306/PytorchTutorial/a14fe66a454b40108fbe703407971026d83d943f/tutorials/04-utils/tensorboard/gif/tensorboard.gif -------------------------------------------------------------------------------- /tutorials/04-utils/tensorboard/logger.py: -------------------------------------------------------------------------------- 1 | # Code referenced from https://gist.github.com/gyglim/1f8dfb1b5c82627ae3efcfbbadb9f514 2 | import tensorflow as tf 3 | import numpy as np 4 | import scipy.misc 5 | try: 6 | from StringIO import StringIO # Python 2.7 7 | except ImportError: 8 | from io import BytesIO # Python 3.x 9 | 10 | 11 | class Logger(object): 12 | 13 | def __init__(self, log_dir): 14 | """Create a summary writer logging to log_dir.""" 15 | self.writer = tf.summary.FileWriter(log_dir) 16 | 17 | def scalar_summary(self, tag, value, step): 18 | """Log a scalar variable.""" 19 | summary = tf.Summary(value=[tf.Summary.Value(tag=tag, simple_value=value)]) 20 | self.writer.add_summary(summary, step) 21 | 22 | def image_summary(self, tag, images, step): 23 | """Log a list of images.""" 24 | 25 | 
img_summaries = [] 26 | for i, img in enumerate(images): 27 | # Write the image to a string 28 | try: 29 | s = StringIO() 30 | except: 31 | s = BytesIO() 32 | scipy.misc.toimage(img).save(s, format="png") 33 | 34 | # Create an Image object 35 | img_sum = tf.Summary.Image(encoded_image_string=s.getvalue(), 36 | height=img.shape[0], 37 | width=img.shape[1]) 38 | # Create a Summary value 39 | img_summaries.append(tf.Summary.Value(tag='%s/%d' % (tag, i), image=img_sum)) 40 | 41 | # Create and write Summary 42 | summary = tf.Summary(value=img_summaries) 43 | self.writer.add_summary(summary, step) 44 | 45 | def histo_summary(self, tag, values, step, bins=1000): 46 | """Log a histogram of the tensor of values.""" 47 | 48 | # Create a histogram using numpy 49 | counts, bin_edges = np.histogram(values, bins=bins) 50 | 51 | # Fill the fields of the histogram proto 52 | hist = tf.HistogramProto() 53 | hist.min = float(np.min(values)) 54 | hist.max = float(np.max(values)) 55 | hist.num = int(np.prod(values.shape)) 56 | hist.sum = float(np.sum(values)) 57 | hist.sum_squares = float(np.sum(values**2)) 58 | 59 | # Drop the start of the first bin 60 | bin_edges = bin_edges[1:] 61 | 62 | # Add bin edges and counts 63 | for edge in bin_edges: 64 | hist.bucket_limit.append(edge) 65 | for c in counts: 66 | hist.bucket.append(c) 67 | 68 | # Create and write Summary 69 | summary = tf.Summary(value=[tf.Summary.Value(tag=tag, histo=hist)]) 70 | self.writer.add_summary(summary, step) 71 | self.writer.flush() 72 | -------------------------------------------------------------------------------- /tutorials/04-utils/tensorboard/main.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torchvision.datasets as dsets 4 | import torchvision.transforms as transforms 5 | from torch.autograd import Variable 6 | from logger import Logger 7 | 8 | 9 | # MNIST Dataset 10 | dataset = dsets.MNIST(root='./data', 11 | 
train=True, 12 | transform=transforms.ToTensor(), 13 | download=True) 14 | 15 | # Data Loader (Input Pipeline) 16 | data_loader = torch.utils.data.DataLoader(dataset=dataset, 17 | batch_size=100, 18 | shuffle=True) 19 | 20 | def to_np(x): 21 | return x.data.cpu().numpy() 22 | 23 | def to_var(x): 24 | if torch.cuda.is_available(): 25 | x = x.cuda() 26 | return Variable(x) 27 | 28 | # Neural Network Model (1 hidden layer) 29 | class Net(nn.Module): 30 | def __init__(self, input_size=784, hidden_size=500, num_classes=10): 31 | super(Net, self).__init__() 32 | self.fc1 = nn.Linear(input_size, hidden_size) 33 | self.relu = nn.ReLU() 34 | self.fc2 = nn.Linear(hidden_size, num_classes) 35 | 36 | def forward(self, x): 37 | out = self.fc1(x) 38 | out = self.relu(out) 39 | out = self.fc2(out) 40 | return out 41 | 42 | net = Net() 43 | if torch.cuda.is_available(): 44 | net.cuda() 45 | 46 | # Set the logger 47 | logger = Logger('./logs') 48 | 49 | # Loss and Optimizer 50 | criterion = nn.CrossEntropyLoss() 51 | optimizer = torch.optim.Adam(net.parameters(), lr=0.00001) 52 | 53 | data_iter = iter(data_loader) 54 | iter_per_epoch = len(data_loader) 55 | total_step = 50000 56 | 57 | # Start training 58 | for step in range(total_step): 59 | 60 | # Reset the data_iter 61 | if (step+1) % iter_per_epoch == 0: 62 | data_iter = iter(data_loader) 63 | 64 | # Fetch the images and labels and convert them to variables 65 | images, labels = next(data_iter) 66 | images, labels = to_var(images.view(images.size(0), -1)), to_var(labels) 67 | 68 | # Forward, backward and optimize 69 | optimizer.zero_grad() # zero the gradient buffer 70 | outputs = net(images) 71 | loss = criterion(outputs, labels) 72 | loss.backward() 73 | optimizer.step() 74 | 75 | # Compute accuracy 76 | _, argmax = torch.max(outputs, 1) 77 | accuracy = (labels == argmax.squeeze()).float().mean() 78 | 79 | if (step+1) % 100 == 0: 80 | print ('Step [%d/%d], Loss: %.4f, Acc: %.2f' 81 | %(step+1, total_step, loss.data[0], 
accuracy.data[0])) 82 | 83 | #============ TensorBoard logging ============# 84 | # (1) Log the scalar values 85 | info = { 86 | 'loss': loss.data[0], 87 | 'accuracy': accuracy.data[0] 88 | } 89 | 90 | for tag, value in info.items(): 91 | logger.scalar_summary(tag, value, step+1) 92 | 93 | # (2) Log values and gradients of the parameters (histogram) 94 | for tag, value in net.named_parameters(): 95 | tag = tag.replace('.', '/') 96 | logger.histo_summary(tag, to_np(value), step+1) 97 | logger.histo_summary(tag+'/grad', to_np(value.grad), step+1) 98 | 99 | # (3) Log the images 100 | info = { 101 | 'images': to_np(images.view(-1, 28, 28)[:10]) 102 | } 103 | 104 | for tag, images in info.items(): 105 | logger.image_summary(tag, images, step+1) -------------------------------------------------------------------------------- /tutorials/04-utils/tensorboard/requirements.txt: -------------------------------------------------------------------------------- 1 | tensorflow 2 | torch 3 | torchvision 4 | scipy 5 | numpy 6 | --------------------------------------------------------------------------------