├── report.pdf ├── LICENSE ├── README.md ├── part1.py ├── part2.py └── part3.py /report.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/AybukeYALCINER/image_classification/HEAD/report.pdf -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 Aybüke YALÇINER 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ## Access the dataset: https://drive.google.com/open?id=1XzLpuQ-jtqXgU-SsFKLrNPJOrvtmtyE3 2 | 3 | Use python-3 and pytorch library.
4 | 5 | To use Colab, add the dataset folder to your Google Drive and run:
6 | 7 | from google.colab import drive
8 | drive.mount('/content/drive')
9 | 10 | Then set the dataset path to something like "drive/My Drive/dataset".
11 | 12 | The dataset has 10 classes, with 400 training, 250 validation and 250 test images per class.
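The split sizes above can be checked before any training is started. The following is a minimal sketch, assuming the `dataset/<split>/<class>/<image>` layout that the scripts read through `datasets.ImageFolder`; `count_images` is a hypothetical helper name and is not part of the repository.

```python
import os

def count_images(data_dir, splits=("train", "validation", "test")):
    """Return {split: {class_name: number_of_files}} for an ImageFolder-style tree."""
    counts = {}
    for split in splits:
        split_dir = os.path.join(data_dir, split)
        if not os.path.isdir(split_dir):
            continue  # a missing split is skipped (Part I only needs train and test)
        counts[split] = {}
        for class_name in sorted(os.listdir(split_dir)):
            class_dir = os.path.join(split_dir, class_name)
            if os.path.isdir(class_dir):
                # Count regular files only; nested folders are ignored.
                counts[split][class_name] = sum(
                    os.path.isfile(os.path.join(class_dir, f))
                    for f in os.listdir(class_dir)
                )
    return counts

if __name__ == "__main__":
    # "dataset" is the default data_dir used by the scripts.
    for split, per_class in count_images("dataset").items():
        print(split, len(per_class), "classes,", sum(per_class.values()), "images")
```

If the dataset matches the description, each split should report 10 classes, with 4000 images for train and 2500 each for validation and test.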
13 | 14 | ## PART-I ## 15 | 16 | In this part, a VGG-16 model pretrained on ImageNet is used as a feature extractor (features are taken from the FC7 layer, after the ReLU). 17 | The extracted features are then classified with a one-vs-rest multiclass linear SVM. Only the train and test sets are used. 18 | This part runs on the CPU.
19 | 20 | The script defines the following functions:
21 | 22 | def plot_confusion_matrix(y_true, y_pred, classes, normalize=False, title=None, cmap=plt.cm.Blues) => # generates the confusion matrix of the predictions
23 | takes the true labels, predicted labels, class names, a boolean normalize flag, the plot title and the colormap as parameters, respectively 24 |
25 | def feature_extract(dirName) => # returns a list holding the array of extracted image features and the array of the images' class labels.
26 | takes dirName, which selects the image set ('train' or 'test') to extract features from. 27 |
28 | def imshow(inp, title=None) => # shows one batch of images with the label of each image; takes the images and their class names as parameters 29 |
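The one-vs-rest scheme mentioned above trains one binary linear classifier per class and predicts the class whose classifier gives the highest score. The sketch below shows only that decision rule in plain Python. The weights are made up for illustration and `ovr_predict` is a hypothetical helper; the script itself relies on scikit-learn's `OneVsRestClassifier(LinearSVC(...))`.

```python
def ovr_predict(x, weights, biases):
    """Return (predicted class index, per-class scores) for one feature vector."""
    # One linear score w.x + b per class, as each binary classifier would give.
    scores = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    # One-vs-rest picks the class with the largest score.
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Made-up weights for 3 classes over 2-dimensional features (illustration only).
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
biases = [0.0, 0.0, 0.5]

label, scores = ovr_predict([0.2, 0.9], weights, biases)
print(label)  # 1, because the second classifier scores highest
```

In Part I the feature vectors come from FC7 (4096 values per image) and there are 10 such binary classifiers, but the decision rule is the same.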
30 | 31 | ## PART-II ## 32 | 33 | In this part, we fine-tune a VGG-16 CNN that is pretrained on ImageNet. After fine-tuning, we evaluate the model on the test set and report the top-1 and top-5 accuracy.
34 | We use the train, validation and test sets. This part runs on the GPU. To run on the CPU, you can remove the "device" (the script also falls back to the CPU when CUDA is not available). Early stopping is enabled during training; to disable it, comment out the lines that implement it.
35 | NOTE: We replace the last layer of the model because our dataset has 10 classes, whereas ImageNet has 1000.
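Top-1 accuracy counts a prediction as correct only when the highest-scoring class is the true one, while top-5 accepts the true class anywhere among the five highest scores. A minimal pure-Python sketch of that metric follows. `topk_accuracy` and the toy scores are illustrative assumptions; part2.py computes the same quantity with `outputs.topk(5, dim=1)` on tensors.

```python
def topk_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)

# Toy scores for 3 samples over 3 classes (k=2 plays the role of top-5 here).
scores = [
    [0.1, 0.7, 0.2],  # true class 1: hit at k=1
    [0.5, 0.3, 0.2],  # true class 1: hit only at k=2
    [0.6, 0.3, 0.1],  # true class 2: miss at k=2
]
labels = [1, 1, 2]

print(topk_accuracy(scores, labels, k=1))  # 1 of 3 samples correct
print(topk_accuracy(scores, labels, k=2))  # 2 of 3 samples correct
```

With 10 classes, top-5 accuracy is a loose measure (a random guess already reaches 50%), so it is best read alongside top-1.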
36 | 37 | The script defines the following functions: 38 |
39 | def plot_graph(val_loss, val_acc, tr_loss, tr_acc, num_epochs) => # plots the train and validation loss/accuracy curves and saves the figure.
40 | takes the arrays of validation losses, validation accuracies, train losses and train accuracies, and the number of epochs, respectively. 41 |
42 | def train_model(model, dataloaders, criterion, optimizer, device,num_epochs=25) => # trains the model
43 | takes the model, dataloaders, criterion, optimizer, device (GPU or CPU) and number of epochs as parameters, respectively
44 | returns the model and the arrays of validation accuracy, validation loss, train accuracy and train loss, respectively
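The early stopping used inside `train_model` in this part can be isolated into a few lines: training stops once the validation loss has failed to improve for a fixed number of consecutive epochs (part2.py uses `n_epochs_stop = 5`). The class below is a sketch of that rule, not the repository's code; `EarlyStopper` and its method names are assumptions.

```python
class EarlyStopper:
    """Stop when the validation loss has not improved for `patience` epochs in a row."""

    def __init__(self, patience=5):
        self.patience = patience
        self.best_loss = float("inf")
        self.epochs_no_improve = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss:
            self.best_loss = val_loss
            self.epochs_no_improve = 0
        else:
            self.epochs_no_improve += 1
        return self.epochs_no_improve >= self.patience

# Toy run with patience 2: the loss stops improving after the second epoch.
stopper = EarlyStopper(patience=2)
for epoch, loss in enumerate([0.9, 0.7, 0.8, 0.75, 0.6]):
    if stopper.step(loss):
        print("stopping after epoch", epoch)  # stopping after epoch 3
        break
```

When training stops early, the loss and accuracy histories are shorter than `num_epochs`, so a caller that plots them would need to pass the actual number of completed epochs.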
45 | 46 | ## PART-III ## 47 | 48 | In this part, a VGG-16 model pretrained on ImageNet is fine-tuned and then used as a feature extractor (features are taken from the FC7 layer, after the ReLU).
49 | The extracted features are then classified with a one-vs-rest multiclass linear SVM. This part runs on the GPU. To run on the CPU, you can remove the "device".
50 | NOTE: We replace the last layer of the model because our dataset has 10 classes, whereas ImageNet has 1000.
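The two classifier edits used in this part (swapping the 1000-way output layer for a 10-way one, then cutting the head back to FC7 + ReLU) can be tried on a small stand-in network. This is a sketch under assumptions: `make_head` is a hypothetical toy with the same seven-module layout as VGG-16's classifier, not the real model, and the layer sizes are made up so that it runs without downloading weights.

```python
import torch
import torch.nn as nn

def make_head(in_features=32, hidden=16, imagenet_classes=1000):
    # Toy stand-in (assumption) mirroring the layout of VGG-16's classifier:
    # Linear, ReLU, Dropout, Linear, ReLU, Dropout, Linear
    return nn.Sequential(
        nn.Linear(in_features, hidden), nn.ReLU(), nn.Dropout(),
        nn.Linear(hidden, hidden), nn.ReLU(), nn.Dropout(),
        nn.Linear(hidden, imagenet_classes),
    )

head = make_head()

# Freezing works per parameter: requires_grad is an attribute of each tensor.
for param in head.parameters():
    param.requires_grad = False

# A freshly created layer is trainable by default, so only it will be updated.
head[6] = nn.Linear(16, 10)
trainable = [name for name, p in head.named_parameters() if p.requires_grad]
print(trainable)  # ['6.weight', '6.bias']

# Dropping the last two modules (Dropout, Linear) leaves the FC7 + ReLU output.
fc7 = nn.Sequential(*list(head.children())[:-2])
fc7.eval()  # keep the remaining Dropout inactive during feature extraction
with torch.no_grad():
    features = fc7(torch.randn(4, 32))
print(features.shape)  # torch.Size([4, 16])
```

Assigning `requires_grad` on a module object, as opposed to its parameters, only sets a plain Python attribute; looping over `module.parameters()` or calling `module.requires_grad_(True)` is what changes the tensors.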
51 | 52 | The script defines the following functions:
53 | 54 | def plot_confusion_matrix(y_true, y_pred, classes, normalize=False, title=None, cmap=plt.cm.Blues) => # generates the confusion matrix of the predictions
55 | takes the true labels, predicted labels, class names, a boolean normalize flag, the plot title and the colormap as parameters, respectively
56 | 57 | def feature_extract(dirName, device) => # returns a list holding the array of extracted image features and the array of the images' class labels.
58 | takes dirName, which selects the image set ('train' or 'test') to extract features from, and the device (GPU or CPU) to run the model on
59 | 60 | def train_model(model, dataloaders, criterion, optimizer, device,num_epochs=25) => # trains the model
61 | takes the model, dataloaders, criterion, optimizer, device (GPU or CPU) and number of epochs as parameters, respectively
62 | returns the model and the arrays of validation accuracy, validation loss, train accuracy and train loss, respectively
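With `normalize=True`, `plot_confusion_matrix` divides each row of the confusion matrix by its row sum, so the diagonal holds the fraction of each class that was classified correctly. A minimal pure-Python sketch of that arithmetic follows. The helper names `confusion_counts` and `normalize_rows` are illustrative; the scripts use `sklearn.metrics.confusion_matrix` and NumPy instead.

```python
def confusion_counts(y_true, y_pred, num_classes):
    """cm[i][j] = number of samples with true class i that were predicted as class j."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def normalize_rows(cm):
    """Divide every row by its sum; rows with no samples stay all-zero."""
    out = []
    for row in cm:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 0, 1, 1, 0, 2]

cm = confusion_counts(y_true, y_pred, num_classes=3)
norm = normalize_rows(cm)
per_class_acc = [norm[i][i] for i in range(3)]

print(cm)             # [[3, 1, 0], [0, 2, 0], [1, 0, 1]]
print(per_class_acc)  # [0.75, 1.0, 0.5]
```

The per-class printout in part1.py and part3.py divides each class's correct count by the total number of test images, so with ten equally sized classes each value can reach at most 10%. Dividing by the class's own sample count, as the diagonal above does, gives figures on a 0 to 100% scale.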
63 | -------------------------------------------------------------------------------- /part1.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.optim as optim 4 | from torch.optim import lr_scheduler 5 | import numpy as np 6 | import torchvision 7 | from torchvision import datasets, models, transforms 8 | import matplotlib.pyplot as plt 9 | from sklearn.svm import LinearSVC 10 | from sklearn.multiclass import OneVsRestClassifier 11 | from sklearn.metrics import confusion_matrix 12 | from sklearn.utils.multiclass import unique_labels 13 | import time 14 | import os 15 | import copy 16 | 17 | plt.ion() # interactive mode 18 | 19 | # generate the confusion matrix of the predictions 20 | # takes correct labels, predicted labels, class names,boolean normalize, title of the graph and background colors of the graph as parameter respectively 21 | def plot_confusion_matrix(y_true, y_pred, classes, 22 | normalize=False, 23 | title=None, 24 | cmap=plt.cm.Blues): 25 | """ 26 | This function prints and plots the confusion matrix. 27 | Normalization can be applied by setting `normalize=True`. 28 | """ 29 | if not title: 30 | if normalize: 31 | title = 'Normalized confusion matrix' 32 | else: 33 | title = 'Confusion matrix, without normalization' 34 | 35 | # Compute confusion matrix 36 | cm = confusion_matrix(y_true, y_pred) 37 | # Only use the labels that appear in the data 38 | 39 | if normalize: 40 | cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis] 41 | print("Normalized confusion matrix") 42 | else: 43 | print('Confusion matrix, without normalization') 44 | 45 | print(cm) 46 | 47 | fig, ax = plt.subplots() 48 | im = ax.imshow(cm, interpolation='nearest', cmap=plt.cm.Blues) 49 | ax.figure.colorbar(im, ax=ax) 50 | # We want to show all ticks... 51 | ax.set(xticks=np.arange(cm.shape[1]), 52 | yticks=np.arange(cm.shape[0]), 53 | # ... 
and label them with the respective list entries 54 | xticklabels=classes, yticklabels=classes, 55 | title=title, 56 | ylabel='True label', 57 | xlabel='Predicted label') 58 | 59 | # Rotate the tick labels and set their alignment. 60 | plt.setp(ax.get_xticklabels(), rotation=45, ha="right", 61 | rotation_mode="anchor") 62 | 63 | # Loop over data dimensions and create text annotations. 64 | fmt = '.2f' if normalize else 'd' 65 | thresh = cm.max() / 2. 66 | for i in range(cm.shape[0]): 67 | for j in range(cm.shape[1]): 68 | ax.text(j, i, format(cm[i, j], fmt), 69 | ha="center", va="center", 70 | color="white" if cm[i, j] > thresh else "black") 71 | fig.tight_layout() 72 | fig.savefig("conf_mtrx_prt1_b4.png") 73 | return ax 74 | 75 | # returns an array that holds array of extracted features of the images and array of the class names of the images. 76 | # takes dirName as parameter that specify the which images extracted 77 | def feature_extract(dirName): 78 | feature = [] 79 | label = [] 80 | for inputs, labels in dataloaders[dirName]: 81 | outputs = model_ft(inputs) 82 | feature.extend(outputs.cpu().detach().numpy()) 83 | label.extend(labels) 84 | label = np.array(label) 85 | return [feature,label] 86 | 87 | # shows the number of batch size images and corresponding labels for each image. 
88 | # takes images and names of the images as parameters 89 | def imshow(inp, title=None): 90 | """Imshow for Tensor.""" 91 | inp = inp.numpy().transpose((1, 2, 0)) 92 | mean = np.array([0.485, 0.456, 0.406]) 93 | std = np.array([0.229, 0.224, 0.225]) 94 | inp = std * inp + mean 95 | inp = np.clip(inp, 0, 1) 96 | plt.imshow(inp) 97 | if title is not None: 98 | plt.title(title) 99 | plt.pause(0.001) # pause a bit so that plots are updated 100 | 101 | # data augmentation 102 | data_transforms = { 103 | 'train': transforms.Compose([ 104 | transforms.RandomResizedCrop(224), 105 | transforms.RandomHorizontalFlip(), 106 | transforms.ToTensor(), 107 | transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) 108 | ]), 109 | 'test': transforms.Compose([ 110 | transforms.Resize(256), 111 | transforms.CenterCrop(224), 112 | transforms.ToTensor(), 113 | transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) 114 | ]), 115 | } 116 | # loads the images from test and train folders who are in dataset folder 117 | # batch size is 4 118 | data_dir = 'dataset' 119 | image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), 120 | data_transforms[x]) 121 | for x in ['train', 'test']} 122 | dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4, 123 | shuffle=True, num_workers=4) 124 | for x in ['train', 'test']} 125 | dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'test']} 126 | class_names = image_datasets['train'].classes 127 | 128 | # Get a batch of training data 129 | inputs, classes = next(iter(dataloaders['train'])) 130 | 131 | # Make a grid from batch 132 | out = torchvision.utils.make_grid(inputs) 133 | # shows the number of batch size images to check 134 | imshow(out, title=[class_names[x] for x in classes]) 135 | # pretrained model is called 136 | model_ft = models.vgg16(pretrained=True) 137 | model_ft.classifier= nn.Sequential(*list(model_ft.classifier.children())[:-2]) # delete the last two layer 138 | # 
freeze layers 139 | for param in model_ft.parameters(): 140 | param.requires_grad = False 141 | 142 | # call feature_extract() function and sets the arrays that are returned 143 | test_features = feature_extract('test') 144 | test_feature = test_features[0] 145 | test_label = test_features[1] 146 | 147 | train_features = feature_extract('train') 148 | train_feature = train_features[0] 149 | train_label = train_features[1] 150 | 151 | 152 | # one vs rest multiclass linear SVM 153 | clf = OneVsRestClassifier(LinearSVC(random_state=0)) # random_state => randomness (optional) 154 | classifier = clf.fit(train_feature, train_label) 155 | 156 | # predictions 157 | length_test = len(test_feature) 158 | class_based_acc = np.zeros(10) 159 | for test_ind in range(length_test): 160 | y_pred = classifier.predict([test_feature[test_ind]]) 161 | if(y_pred == test_label[test_ind]): 162 | class_based_acc[int(y_pred)] += 1 163 | 164 | #print class based and general acc 165 | for acc_ind in range(10): 166 | print("Accuracy of class "+ str(acc_ind)+ " "+str(class_based_acc[acc_ind]*100/length_test)) 167 | acc = classifier.score(test_feature,test_label) 168 | print("Accuracy: "+ str(acc*100)) 169 | y_pred = classifier.predict(test_feature) 170 | # confusion matrix 171 | plot_confusion_matrix(test_label, y_pred, classes=class_names, normalize=True, title='Normalized confusion matrix') 172 | -------------------------------------------------------------------------------- /part2.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | from __future__ import division 3 | import torch 4 | import torch.nn as nn 5 | import torch.optim as optim 6 | import tensorflow as tf 7 | import numpy as np 8 | import torchvision 9 | from torchvision import datasets, models, transforms 10 | from torch.autograd import Variable 11 | import matplotlib.pyplot as plt 12 | import time 13 | import os 14 | import copy 15 | import tensorflow 
as tf 16 | 17 | # plot the train and validation loss/accuracy plots and save it. 18 | # takes array of validation losses, validation accuracy, train losses, train accuracy and number of epochs respectively. 19 | def plot_graph(val_loss, val_acc, tr_loss, tr_acc, num_epochs): 20 | plt.subplot(211) 21 | plt.title("Loss plots vs. Number of Training Epochs") 22 | plt.plot(range(1,num_epochs+1),val_loss,label="validation") 23 | plt.plot(range(1,num_epochs+1),tr_loss,label="train") 24 | 25 | plt.xticks(np.arange(1, num_epochs+1, 1.0)) 26 | plt.legend() 27 | 28 | plt.subplot(212) 29 | plt.title("Accuracy plots vs. Number of Training Epochs") 30 | plt.plot(range(1,num_epochs+1),val_acc,label="validation") 31 | plt.plot(range(1,num_epochs+1),tr_acc,label="train") 32 | 33 | plt.xticks(np.arange(1, num_epochs+1, 1.0)) 34 | plt.legend() 35 | 36 | plt.tight_layout() 37 | plt.savefig("plot.png") 38 | 39 | # train the model 40 | # takes the model, dataloaders, criterion, optimizer, device(GPU or CPU) and number of epochs respectively as parameters 41 | # returns model, array of validation accuracy, validation loss, train accuracy, train loss respectively 42 | def train_model(model, dataloaders, criterion, optimizer, device,num_epochs=25): 43 | since = time.time() 44 | 45 | val_acc_history = [] 46 | val_loss_history = [] 47 | tr_acc_history = [] 48 | tr_loss_history = [] 49 | 50 | best_model_wts = copy.deepcopy(model.state_dict()) 51 | best_acc = 0.0 52 | 53 | 54 | n_epochs_stop = 5 55 | min_val_loss = np.Inf 56 | epochs_no_improve = 0 57 | for epoch in range(num_epochs): 58 | print('Epoch {}/{}'.format(epoch, num_epochs - 1)) 59 | print('-' * 10) 60 | #scheduler.step(epoch) #for lr_scheduler 61 | # Each epoch has a training and validation phase 62 | for phase in ['train', 'validation']: 63 | 64 | if phase == 'train': 65 | model.train() # Set model to training mode 66 | else: 67 | model.eval() # Set model to evaluate mode 68 | 69 | running_loss = 0.0 70 | running_corrects = 0 71 | 
loader = dataloaders[phase] 72 | # Iterate over data. 73 | for inputs, labels in loader: 74 | inputs = inputs.to(device) 75 | labels = labels.to(device) 76 | # print("in dataloaders", end=" ") 77 | # zero the parameter gradients 78 | optimizer.zero_grad() 79 | 80 | # forward 81 | # track history if only in train 82 | with torch.set_grad_enabled(phase == 'train'): 83 | 84 | outputs = model(inputs) 85 | # print("x") 86 | 87 | _, preds = torch.max(outputs, 1) 88 | loss = criterion(outputs, labels) 89 | 90 | # backward + optimize only if in training phase 91 | if phase == 'train': 92 | loss.backward() 93 | optimizer.step() 94 | 95 | # statistics 96 | running_loss += loss.item() * inputs.size(0) 97 | running_corrects += torch.sum(preds == labels.data) 98 | 99 | epoch_loss = running_loss / len(dataloaders[phase].dataset) 100 | epoch_acc = running_corrects.double() / len(dataloaders[phase].dataset) 101 | 102 | print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc)) 103 | 104 | # deep copy the model 105 | if phase == 'validation' and epoch_acc > best_acc: 106 | best_acc = epoch_acc 107 | best_model_wts = copy.deepcopy(model.state_dict()) 108 | if phase == 'validation': 109 | val_acc_history.append(epoch_acc) 110 | val_loss_history.append(epoch_loss) 111 | if phase == 'train': 112 | tr_acc_history.append(epoch_acc) 113 | tr_loss_history.append(epoch_loss) 114 | # early stopping: track the best validation loss seen so far 115 | if phase == 'validation': 116 | if epoch_loss < min_val_loss: 117 | epochs_no_improve = 0 118 | min_val_loss = epoch_loss # was val_loss, which is undefined in this function 119 | 120 | else: 121 | epochs_no_improve += 1 122 | # Check early stopping condition 123 | if epochs_no_improve == n_epochs_stop: 124 | print('Early stopping!') 125 | return model, val_acc_history, val_loss_history, tr_acc_history, tr_loss_history 126 | 127 | 128 | 129 | time_elapsed = time.time() - since 130 | print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60)) 131 | print('Best val Acc: {:.4f}'.format(best_acc)) 132 |
133 | # load best model weights 134 | model.load_state_dict(best_model_wts) 135 | return model, val_acc_history, val_loss_history, tr_acc_history, tr_loss_history 136 | 137 | 138 | 139 | # data augmentation 140 | data_transforms = { 141 | 'train': transforms.Compose([ 142 | transforms.RandomResizedCrop(224), 143 | transforms.RandomHorizontalFlip(), 144 | transforms.ToTensor(), 145 | transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) 146 | ]), 147 | 'validation': transforms.Compose([ 148 | transforms.Resize(256), 149 | transforms.CenterCrop(224), 150 | transforms.ToTensor(), 151 | transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) 152 | ]), 153 | 'test': transforms.Compose([ 154 | transforms.Resize(256), 155 | transforms.CenterCrop(224), 156 | transforms.ToTensor(), 157 | transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) 158 | ]), 159 | } 160 | 161 | data_dir = "dataset" 162 | num_classes = 10 163 | batch_size = 32 164 | image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), 165 | data_transforms[x]) 166 | for x in ['train', 'validation', 'test']} 167 | dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size, 168 | shuffle=True, num_workers=2) 169 | for x in ['train', 'validation', 'test']} 170 | dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'validation', 'test']} 171 | class_names = image_datasets['train'].classes 172 | 173 | model_ft = models.vgg16(pretrained=True) 174 | 175 | # freeze layers before classifiers 176 | for param in model_ft.features.parameters(): 177 | # print(param) 178 | param.requires_grad = False 179 | #different number of layer freeze 180 | #model_ft.features[-1].requires_grad = True 181 | #model_ft.features[-2].requires_grad = True 182 | #model_ft.features[-3].requires_grad = True 183 | 184 | 185 | model_ft.classifier[6] = nn.Linear(4096,10) #modify the last layer 186 | 187 | 188 | # specify loss function 189 | criterion = 
nn.CrossEntropyLoss() 190 | 191 | # specify optimizer 192 | optimizer = torch.optim.Adam(model_ft.parameters(), lr=0.001) 193 | 194 | #lr_scheduler 195 | #scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1) 196 | 197 | #different optimizer 198 | #optimizer = torch.optim.SGD(model_ft.parameters(), lr=0.001) 199 | 200 | #weight_decay 201 | #optimizer = torch.optim.Adam(model_ft.parameters(), lr=0.1,weight_decay= 0.001) 202 | 203 | device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu") 204 | model_ft = model_ft.to(device) #send the model to the gpu 205 | model_ft, val_acc, val_loss, tr_acc, tr_loss = train_model(model_ft, dataloaders, criterion, optimizer,device, num_epochs=30) #train model 206 | 207 | #test the model 208 | correct = 0 209 | topk = 0 210 | total = 0 211 | testloader = dataloaders['test'] 212 | with torch.no_grad(): 213 | for data in testloader: 214 | images, labels = data 215 | images = images.to(device) 216 | labels = labels.to(device) 217 | outputs = model_ft(images) 218 | _, predicted = torch.max(outputs.data, 1) 219 | total += labels.size(0) 220 | correct += (predicted == labels).sum().item() 221 | 222 | probs, classes = outputs.topk(5, dim=1) 223 | labels_size = labels.size(0) 224 | for i in range(labels_size): 225 | if(labels[i] in classes[i]): 226 | topk += 1 227 | 228 | 229 | print('Accuracy of the model on the test images: %d %%' % (100 * correct / total)) 230 | print('Accuracy of the top 5 on the test images: %d %%' % (100 * topk / total)) 231 | 232 | # val/train loss and accuracy plots 233 | plot_graph(val_loss, val_acc, tr_loss, tr_acc, 30) 234 | 235 | 236 | 237 | 238 | 239 | 240 | 241 | 242 | 243 | 244 | 245 | 246 | 247 | 248 | 249 | 250 | 251 | 252 | 253 | 254 | 255 | 256 | 257 | 258 | 259 | 260 | 261 | 262 | 263 | -------------------------------------------------------------------------------- /part3.py: -------------------------------------------------------------------------------- 1 | from 
__future__ import print_function 2 | from __future__ import division 3 | import torch 4 | import torch.nn as nn 5 | import torch.optim as optim 6 | import numpy as np 7 | import torchvision 8 | from torchvision import datasets, models, transforms 9 | from torch.autograd import Variable 10 | import matplotlib.pyplot as plt 11 | from sklearn.svm import LinearSVC 12 | from sklearn.multiclass import OneVsRestClassifier 13 | from sklearn.metrics import confusion_matrix 14 | from sklearn.utils.multiclass import unique_labels 15 | import time 16 | import os 17 | import copy 18 | 19 | # generate the confusion matrix of the predictions 20 | # takes correct labels, predicted labels, class names,boolean normalize, title of the graph and background colors of the graph as parameter respectively 21 | def plot_confusion_matrix(y_true, y_pred, classes, 22 | normalize=False, 23 | title=None, 24 | cmap=plt.cm.Blues): 25 | """ 26 | This function prints and plots the confusion matrix. 27 | Normalization can be applied by setting `normalize=True`. 28 | """ 29 | if not title: 30 | if normalize: 31 | title = 'Normalized confusion matrix' 32 | else: 33 | title = 'Confusion matrix, without normalization' 34 | 35 | # Compute confusion matrix 36 | cm = confusion_matrix(y_true, y_pred) 37 | # Only use the labels that appear in the data 38 | 39 | if normalize: 40 | cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis] 41 | print("Normalized confusion matrix") 42 | else: 43 | print('Confusion matrix, without normalization') 44 | 45 | print(cm) 46 | 47 | fig, ax = plt.subplots() 48 | im = ax.imshow(cm, interpolation='nearest', cmap=plt.cm.Blues) 49 | ax.figure.colorbar(im, ax=ax) 50 | # We want to show all ticks... 51 | ax.set(xticks=np.arange(cm.shape[1]), 52 | yticks=np.arange(cm.shape[0]), 53 | # ... 
and label them with the respective list entries 54 | xticklabels=classes, yticklabels=classes, 55 | title=title, 56 | ylabel='True label', 57 | xlabel='Predicted label') 58 | 59 | # Rotate the tick labels and set their alignment. 60 | plt.setp(ax.get_xticklabels(), rotation=45, ha="right", 61 | rotation_mode="anchor") 62 | 63 | # Loop over data dimensions and create text annotations. 64 | fmt = '.2f' if normalize else 'd' 65 | thresh = cm.max() / 2. 66 | for i in range(cm.shape[0]): 67 | for j in range(cm.shape[1]): 68 | ax.text(j, i, format(cm[i, j], fmt), 69 | ha="center", va="center", 70 | color="white" if cm[i, j] > thresh else "black") 71 | fig.tight_layout() 72 | fig.savefig("conf_mtrx_prt1_b4.png") 73 | return ax 74 | 75 | # returns an array that holds array of extracted features of the images and array of the class names of the images. 76 | # takes dirName as parameter that specify the which images extracted 77 | def feature_extract(dirName,device): 78 | feature = [] 79 | label = [] 80 | for inputs, labels in dataloaders[dirName]: 81 | inputs = inputs.to(device) 82 | labels = labels.to(device) 83 | outputs = model_ft(inputs) 84 | feature.extend(outputs.cpu().numpy()) 85 | label.extend(labels.cpu().numpy()) 86 | return [feature,label] 87 | 88 | # train the model 89 | # takes the model, dataloaders, criterion, optimizer, device(GPU or CPU) and number of epochs respectively as parameters 90 | # returns model, array of validation accuracy, validation loss, train accuracy, train loss respectively 91 | def train_model(model, dataloaders, criterion, optimizer, device,num_epochs=25): 92 | since = time.time() 93 | 94 | val_acc_history = [] 95 | val_loss_history = [] 96 | tr_acc_history = [] 97 | tr_loss_history = [] 98 | 99 | best_model_wts = copy.deepcopy(model.state_dict()) 100 | best_acc = 0.0 101 | 102 | 103 | for epoch in range(num_epochs): 104 | print('Epoch {}/{}'.format(epoch, num_epochs - 1)) 105 | print('-' * 10) 106 | 107 | # Each epoch has a training and 
validation phase 108 | for phase in ['train', 'validation']: 109 | 110 | if phase == 'train': 111 | model.train() # Set model to training mode 112 | else: 113 | model.eval() # Set model to evaluate mode 114 | 115 | running_loss = 0.0 116 | running_corrects = 0 117 | loader = dataloaders[phase] 118 | # Iterate over data. 119 | for inputs, labels in loader: 120 | inputs = inputs.to(device) 121 | labels = labels.to(device) 122 | # print("in dataloaders", end=" ") 123 | # zero the parameter gradients 124 | optimizer.zero_grad() 125 | 126 | # forward 127 | # track history if only in train 128 | with torch.set_grad_enabled(phase == 'train'): 129 | 130 | outputs = model(inputs) 131 | # print("x") 132 | 133 | _, preds = torch.max(outputs, 1) 134 | loss = criterion(outputs, labels) 135 | 136 | # backward + optimize only if in training phase 137 | if phase == 'train': 138 | loss.backward() 139 | optimizer.step() 140 | 141 | # statistics 142 | running_loss += loss.item() * inputs.size(0) 143 | running_corrects += torch.sum(preds == labels.data) 144 | 145 | epoch_loss = running_loss / len(dataloaders[phase].dataset) 146 | epoch_acc = running_corrects.double() / len(dataloaders[phase].dataset) 147 | 148 | print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc)) 149 | 150 | # deep copy the model 151 | if phase == 'validation' and epoch_acc > best_acc: 152 | best_acc = epoch_acc 153 | best_model_wts = copy.deepcopy(model.state_dict()) 154 | if phase == 'validation': 155 | val_acc_history.append(epoch_acc) 156 | val_loss_history.append(epoch_loss) 157 | if phase == 'train': 158 | tr_acc_history.append(epoch_acc) 159 | tr_loss_history.append(epoch_loss) 160 | 161 | 162 | 163 | time_elapsed = time.time() - since 164 | print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60)) 165 | print('Best val Acc: {:4f}'.format(best_acc)) 166 | 167 | # load best model weights 168 | model.load_state_dict(best_model_wts) 169 | return model, 
val_acc_history, val_loss_history, tr_acc_history, tr_loss_history 170 | 171 | 172 | # data augmentation 173 | data_transforms = { 174 | 'train': transforms.Compose([ 175 | transforms.RandomResizedCrop(224), 176 | transforms.RandomHorizontalFlip(), 177 | transforms.ToTensor(), 178 | transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) 179 | ]), 180 | 'validation': transforms.Compose([ 181 | transforms.Resize(256), 182 | transforms.CenterCrop(224), 183 | transforms.ToTensor(), 184 | transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) 185 | ]), 186 | 'test': transforms.Compose([ 187 | transforms.Resize(256), 188 | transforms.CenterCrop(224), 189 | transforms.ToTensor(), 190 | transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]) 191 | ]), 192 | } 193 | 194 | data_dir = "dataset" 195 | num_classes = 10 196 | batch_size = 32 197 | image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), 198 | data_transforms[x]) 199 | for x in ['train', 'validation', 'test']} 200 | dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size, 201 | shuffle=True, num_workers=2) 202 | for x in ['train', 'validation', 'test']} 203 | dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'validation', 'test']} 204 | class_names = image_datasets['train'].classes 205 | 206 | model_ft = models.vgg16(pretrained=True) 207 | 208 | #freeze layers 209 | for param in model_ft.features.parameters(): 210 | param.requires_grad = False 211 | model_ft.features[-1].requires_grad = True 212 | model_ft.features[-2].requires_grad = True 213 | model_ft.features[-3].requires_grad = True 214 | 215 | #modify last layer 216 | model_ft.classifier[6] = nn.Linear(4096,10) 217 | 218 | # specify loss function 219 | criterion = nn.CrossEntropyLoss() 220 | 221 | # specify optimizer 222 | optimizer = torch.optim.Adam(model_ft.parameters(), lr=0.001) 223 | 224 | 225 | device = torch.device("cuda:0" if torch.cuda.is_available() else 
"cpu") 226 | model_ft = model_ft.to(device) 227 | model_ft, val_acc, val_loss, tr_acc, tr_loss = train_model(model_ft, dataloaders, criterion, optimizer,device, num_epochs=30) #train model 228 | 229 | model_ft.classifier= nn.Sequential(*list(model_ft.classifier.children())[:-2]) # delete the last two layer 230 | # freeze all layers 231 | for param in model_ft.parameters(): 232 | param.requires_grad = False 233 | 234 | #extract feaatures 235 | test_features = feature_extract('test',device) 236 | test_feature = test_features[0] 237 | test_label = test_features[1] 238 | 239 | train_features = feature_extract('train',device) 240 | train_feature = train_features[0] 241 | train_label = train_features[1] 242 | 243 | 244 | # one vs rest multiclass linear SVM 245 | clf = OneVsRestClassifier(LinearSVC(random_state=0,max_iter = 20000)) # random_state => randomness (optional) 246 | classifier = clf.fit(train_feature, train_label) 247 | 248 | # predictions 249 | length_test = len(test_feature) 250 | class_based_acc = np.zeros(10) 251 | for test_ind in range(length_test): 252 | y_pred = classifier.predict([test_feature[test_ind]]) 253 | if(y_pred == test_label[test_ind]): 254 | class_based_acc[int(y_pred)] += 1 255 | 256 | #print class based and general acc 257 | for acc_ind in range(10): 258 | print("Accuracy of class "+ str(acc_ind)+ " "+str(class_based_acc[acc_ind]*100/length_test)) 259 | acc = classifier.score(test_feature,test_label) 260 | print("Accuracy: "+ str(acc*100)) 261 | y_pred = classifier.predict(test_feature) 262 | # confusion matrix 263 | plot_confusion_matrix(test_label, y_pred, classes=class_names, normalize=True, title='Normalized confusion matrix') 264 | 265 | 266 | 267 | 268 | 269 | 270 | 271 | 272 | 273 | 274 | 275 | 276 | 277 | 278 | 279 | 280 | 281 | 282 | 283 | 284 | 285 | 286 | 287 | 288 | 289 | 290 | 291 | 292 | --------------------------------------------------------------------------------