├── report.pdf
├── LICENSE
├── README.md
├── part1.py
├── part2.py
└── part3.py
/report.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/AybukeYALCINER/image_classification/HEAD/report.pdf
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2019 Aybüke YALÇINER
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ## Access the dataset: https://drive.google.com/open?id=1XzLpuQ-jtqXgU-SsFKLrNPJOrvtmtyE3
2 |
3 | Requires Python 3 with the PyTorch, torchvision, scikit-learn and matplotlib libraries.
4 |
5 | To use Colab, add the dataset folder to your Google Drive and run:
6 |
7 | from google.colab import drive
8 | drive.mount('/content/drive')
9 |
10 | Then set the dataset path to something like "drive/My Drive/dataset".
11 |
12 | The dataset has 10 classes, each with 400 training, 250 validation and 250 test images.
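
A quick way to check a downloaded copy against these counts is to walk the folder tree. This is a minimal sketch, assuming the `dataset/<split>/<class>/` layout that the scripts load with `ImageFolder`; the helper name `count_images` is hypothetical and not part of the repository.

```python
from pathlib import Path


def count_images(data_dir):
    """Return {split: {class_name: number_of_files}} for a split/class folder tree."""
    counts = {}
    for split_dir in sorted(Path(data_dir).iterdir()):
        if not split_dir.is_dir():
            continue
        # Each sub-folder of a split is treated as one class.
        counts[split_dir.name] = {
            class_dir.name: sum(1 for f in class_dir.iterdir() if f.is_file())
            for class_dir in sorted(split_dir.iterdir())
            if class_dir.is_dir()
        }
    return counts
```

With the dataset described above, `count_images("dataset")` should report 400 files per class under `train` and 250 per class under `validation` and `test`.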
13 |
14 | ## PART-I ##
15 |
16 | In this part, a VGG-16 model pretrained on ImageNet is used as a feature extractor (features are taken from the FC7 layer, after the ReLU).
17 | The extracted features are then classified with a one-vs-rest multiclass linear SVM. Only the train and test sets are used.
18 | This part runs on the CPU.
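
The one-vs-rest rule itself is simple: one binary linear scorer per class, and the prediction is the class with the highest score. The sketch below illustrates only that decision rule with hand-picked toy weights; the function name and the numbers are hypothetical, and the repository relies on scikit-learn's `OneVsRestClassifier(LinearSVC(...))` to learn the weights from the extracted features.

```python
def one_vs_rest_predict(weights, biases, x):
    """Return the index of the class whose linear score w.x + b is largest."""
    scores = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Toy example: 3 classes, 2-dimensional features (hypothetical values).
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
biases = [0.0, 0.0, 0.0]
print(one_vs_rest_predict(weights, biases, [2.0, 0.5]))    # scores: 2.0, 0.5, -2.5
print(one_vs_rest_predict(weights, biases, [-1.0, -1.0]))  # scores: -1.0, -1.0, 2.0
```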
19 |
20 | The script defines these functions:
21 |
22 | def plot_confusion_matrix(y_true, y_pred, classes, normalize=False, title=None, cmap=plt.cm.Blues) => # plots the confusion matrix of the predictions
23 | takes the true labels, predicted labels, class names, a boolean normalize flag, the title of the plot and its colormap as parameters, in that order
24 |
25 | def feature_extract(dirName) => # returns a list that holds the extracted features of the images and the class labels of the images
26 | takes dirName as a parameter, which specifies the image set to extract from
27 |
28 | def imshow(inp, title=None) => # shows one batch of images with the corresponding label for each image; takes the images and their class names as parameters
29 |
30 |
31 | ## PART-II ##
32 |
33 | In this part, we fine-tune a VGG-16 model pretrained on ImageNet. After fine-tuning, we test the model and report its top-1 and top-5 accuracy.
34 | We use the train, validation and test sets. This part is meant to run on the GPU; the script falls back to the CPU when CUDA is not available. Early stopping is applied during training; to disable it, comment out the early-stopping lines in train_model.
35 | NOTE: We replace the last layer of the model because we have 10 classes, whereas ImageNet has 1000.
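
Top-k accuracy counts a prediction as correct when the true label is among the k highest-scoring classes, so top-1 is ordinary accuracy. A minimal pure-Python sketch of that metric follows; the helper name `topk_accuracy` and the toy scores are hypothetical, and part2.py computes the same quantity on model outputs with `outputs.topk(5, dim=1)`.

```python
def topk_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        best_k = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in best_k
    return hits / len(labels)


# Toy example: 3 samples, 3 classes (hypothetical scores).
scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
labels = [1, 1, 0]
print(topk_accuracy(scores, labels, k=1))  # only the first sample is a top-1 hit
print(topk_accuracy(scores, labels, k=2))  # the first two samples are top-2 hits
```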
36 |
37 | The script defines these functions:
38 |
39 | def plot_graph(val_loss, val_acc, tr_loss, tr_acc, num_epochs) => # plots the train and validation loss/accuracy curves and saves the figure
40 | takes the validation losses, validation accuracies, train losses, train accuracies and the number of epochs, in that order
41 |
42 | def train_model(model, dataloaders, criterion, optimizer, device,num_epochs=25) => # trains the model
43 | takes the model, dataloaders, criterion, optimizer, device (GPU or CPU) and number of epochs as parameters, in that order
44 | returns the model and the lists of validation accuracy, validation loss, train accuracy and train loss, in that order
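
The early-stopping rule used inside `train_model` can be looked at in isolation: training stops once the validation loss has failed to improve for a fixed number of consecutive epochs (`n_epochs_stop = 5` in part2.py). The sketch below replays that rule over a list of validation losses; the function name and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return the epoch index at which training would stop, or None if it never stops."""
    best = float("inf")
    epochs_no_improve = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # New best validation loss: reset the counter.
            best = loss
            epochs_no_improve = 0
        else:
            epochs_no_improve += 1
            if epochs_no_improve == patience:
                return epoch
    return None


# Hypothetical validation losses: the best value is reached at epoch 1.
losses = [1.0, 0.8, 0.9, 0.85, 0.81]
print(early_stopping_epoch(losses, patience=3))  # stops at epoch 4
print(early_stopping_epoch(losses, patience=5))  # never triggers on this list
```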
45 |
46 | ## PART-III ##
47 |
48 | In this part, a VGG-16 model pretrained on ImageNet is first fine-tuned and then used as a feature extractor (features are taken from the FC7 layer, after the ReLU).
49 | The extracted features are then classified with a one-vs-rest multiclass linear SVM. This part is meant to run on the GPU; the script falls back to the CPU when CUDA is not available.
50 | NOTE: We replace the last layer of the model because we have 10 classes, whereas ImageNet has 1000.
51 |
52 | The script defines these functions:
53 |
54 | def plot_confusion_matrix(y_true, y_pred, classes, normalize=False, title=None, cmap=plt.cm.Blues) => # plots the confusion matrix of the predictions
55 | takes the true labels, predicted labels, class names, a boolean normalize flag, the title of the plot and its colormap as parameters, in that order
56 |
57 | def feature_extract(dirName, device) => # returns a list that holds the extracted features of the images and the class labels of the images
58 | takes dirName, which specifies the image set to extract from, and the device (GPU or CPU) as parameters
59 |
60 | def train_model(model, dataloaders, criterion, optimizer, device,num_epochs=25) => # trains the model
61 | takes the model, dataloaders, criterion, optimizer, device (GPU or CPU) and number of epochs as parameters, in that order
62 | returns the model and the lists of validation accuracy, validation loss, train accuracy and train loss, in that order
63 |
--------------------------------------------------------------------------------
/part1.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 | import torch.optim as optim
4 | from torch.optim import lr_scheduler
5 | import numpy as np
6 | import torchvision
7 | from torchvision import datasets, models, transforms
8 | import matplotlib.pyplot as plt
9 | from sklearn.svm import LinearSVC
10 | from sklearn.multiclass import OneVsRestClassifier
11 | from sklearn.metrics import confusion_matrix
12 | from sklearn.utils.multiclass import unique_labels
13 | import time
14 | import os
15 | import copy
16 |
17 | plt.ion() # interactive mode
18 |
19 | # plots the confusion matrix of the predictions
20 | # takes the true labels, predicted labels, class names, a boolean normalize flag, the title of the plot and its colormap as parameters, in that order
21 | def plot_confusion_matrix(y_true, y_pred, classes,
22 |                           normalize=False,
23 |                           title=None,
24 |                           cmap=plt.cm.Blues):
25 |     """
26 |     This function prints and plots the confusion matrix.
27 |     Normalization can be applied by setting `normalize=True`.
28 |     """
29 |     if not title:
30 |         if normalize:
31 |             title = 'Normalized confusion matrix'
32 |         else:
33 |             title = 'Confusion matrix, without normalization'
34 |
35 |     # Compute confusion matrix
36 |     cm = confusion_matrix(y_true, y_pred)
37 |     # Rows are true labels, columns are predicted labels
38 |
39 |     if normalize:
40 |         cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
41 |         print("Normalized confusion matrix")
42 |     else:
43 |         print('Confusion matrix, without normalization')
44 |
45 |     print(cm)
46 |
47 |     fig, ax = plt.subplots()
48 |     im = ax.imshow(cm, interpolation='nearest', cmap=cmap)  # use the cmap argument instead of a hard-coded colormap
49 |     ax.figure.colorbar(im, ax=ax)
50 |     # We want to show all ticks...
51 |     ax.set(xticks=np.arange(cm.shape[1]),
52 |            yticks=np.arange(cm.shape[0]),
53 |            # ... and label them with the respective list entries
54 |            xticklabels=classes, yticklabels=classes,
55 |            title=title,
56 |            ylabel='True label',
57 |            xlabel='Predicted label')
58 |
59 |     # Rotate the tick labels and set their alignment.
60 |     plt.setp(ax.get_xticklabels(), rotation=45, ha="right",
61 |              rotation_mode="anchor")
62 |
63 |     # Loop over data dimensions and create text annotations.
64 |     fmt = '.2f' if normalize else 'd'
65 |     thresh = cm.max() / 2.
66 |     for i in range(cm.shape[0]):
67 |         for j in range(cm.shape[1]):
68 |             ax.text(j, i, format(cm[i, j], fmt),
69 |                     ha="center", va="center",
70 |                     color="white" if cm[i, j] > thresh else "black")
71 |     fig.tight_layout()
72 |     fig.savefig("conf_mtrx_prt1_b4.png")
73 |     return ax
74 |
75 | # returns a list that holds the extracted features of the images and the class labels of the images
76 | # takes dirName as a parameter, which specifies the image set to extract from
77 | def feature_extract(dirName):
78 |     feature = []
79 |     label = []
80 |     for inputs, labels in dataloaders[dirName]:
81 |         outputs = model_ft(inputs)
82 |         feature.extend(outputs.cpu().detach().numpy())
83 |         label.extend(labels.numpy())  # store plain integers rather than 0-d tensors
84 |     label = np.array(label)
85 |     return [feature, label]
86 |
87 | # shows one batch of images with the corresponding label for each image
88 | # takes the images and their class names as parameters
89 | def imshow(inp, title=None):
90 |     """Imshow for Tensor."""
91 |     inp = inp.numpy().transpose((1, 2, 0))
92 |     mean = np.array([0.485, 0.456, 0.406])
93 |     std = np.array([0.229, 0.224, 0.225])
94 |     inp = std * inp + mean  # undo the normalization applied in data_transforms
95 |     inp = np.clip(inp, 0, 1)
96 |     plt.imshow(inp)
97 |     if title is not None:
98 |         plt.title(title)
99 |     plt.pause(0.001)  # pause a bit so that plots are updated
100 |
101 | # data augmentation
102 | data_transforms = {
103 |     'train': transforms.Compose([
104 |         transforms.RandomResizedCrop(224),
105 |         transforms.RandomHorizontalFlip(),
106 |         transforms.ToTensor(),
107 |         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
108 |     ]),
109 |     'test': transforms.Compose([
110 |         transforms.Resize(256),
111 |         transforms.CenterCrop(224),
112 |         transforms.ToTensor(),
113 |         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
114 |     ]),
115 | }
116 | # loads the images from the train and test folders inside the dataset folder
117 | # batch size is 4
118 | data_dir = 'dataset'
119 | image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
120 |                                           data_transforms[x])
121 |                   for x in ['train', 'test']}
122 | dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=4,
123 |                                              shuffle=True, num_workers=4)
124 |                for x in ['train', 'test']}
125 | dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'test']}
126 | class_names = image_datasets['train'].classes
127 |
128 | # Get a batch of training data
129 | inputs, classes = next(iter(dataloaders['train']))
130 |
131 | # Make a grid from batch
132 | out = torchvision.utils.make_grid(inputs)
133 | # shows the number of batch size images to check
134 | imshow(out, title=[class_names[x] for x in classes])
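135 | # load the pretrained model
136 | model_ft = models.vgg16(pretrained=True)
137 | model_ft.classifier = nn.Sequential(*list(model_ft.classifier.children())[:-2])  # drop the last two layers (Dropout and the final Linear) so the output is FC7 after ReLU
138 | # freeze layers
139 | for param in model_ft.parameters():
140 |     param.requires_grad = False
141 | model_ft.eval()  # evaluation mode, so the remaining Dropout layer does not perturb the extracted features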
142 | # call feature_extract() function and sets the arrays that are returned
143 | test_features = feature_extract('test')
144 | test_feature = test_features[0]
145 | test_label = test_features[1]
146 |
147 | train_features = feature_extract('train')
148 | train_feature = train_features[0]
149 | train_label = train_features[1]
150 |
151 |
152 | # one vs rest multiclass linear SVM
153 | clf = OneVsRestClassifier(LinearSVC(random_state=0)) # random_state => randomness (optional)
154 | classifier = clf.fit(train_feature, train_label)
155 |
156 | # predictions
157 | y_pred = classifier.predict(test_feature)
158 | class_totals = np.bincount(test_label, minlength=10)  # number of test images per class
159 | class_based_acc = np.zeros(10)
160 | for true_label, pred_label in zip(test_label, y_pred):
161 |     if pred_label == true_label:
162 |         class_based_acc[int(true_label)] += 1
163 |
164 | # print class-based and overall accuracy (each class is divided by its own number of test images)
165 | for acc_ind in range(10):
166 |     print("Accuracy of class " + str(acc_ind) + " " + str(class_based_acc[acc_ind] * 100 / class_totals[acc_ind]))
167 | acc = classifier.score(test_feature, test_label)
168 | print("Accuracy: " + str(acc * 100))
169 | y_pred = classifier.predict(test_feature)
170 | # confusion matrix
171 | plot_confusion_matrix(test_label, y_pred, classes=class_names, normalize=True, title='Normalized confusion matrix')
172 |
--------------------------------------------------------------------------------
/part2.py:
--------------------------------------------------------------------------------
1 | from __future__ import print_function
2 | from __future__ import division
3 | import torch
4 | import torch.nn as nn
5 | import torch.optim as optim
6 | import numpy as np
7 | import torchvision
8 | from torchvision import datasets, models, transforms
9 | from torch.autograd import Variable
10 | import matplotlib.pyplot as plt
11 | import time
12 | import os
13 | import copy
14 |
15 |
16 |
17 | # plot the train and validation loss/accuracy plots and save it.
18 | # takes array of validation losses, validation accuracy, train losses, train accuracy and number of epochs respectively.
19 | def plot_graph(val_loss, val_acc, tr_loss, tr_acc, num_epochs):
20 |     plt.subplot(211)
21 |     plt.title("Loss plots vs. Number of Training Epochs")
22 |     plt.plot(range(1,num_epochs+1),val_loss,label="validation")
23 |     plt.plot(range(1,num_epochs+1),tr_loss,label="train")
24 |
25 |     plt.xticks(np.arange(1, num_epochs+1, 1.0))
26 |     plt.legend()
27 |
28 |     plt.subplot(212)
29 |     plt.title("Accuracy plots vs. Number of Training Epochs")
30 |     plt.plot(range(1,num_epochs+1),val_acc,label="validation")
31 |     plt.plot(range(1,num_epochs+1),tr_acc,label="train")
32 |
33 |     plt.xticks(np.arange(1, num_epochs+1, 1.0))
34 |     plt.legend()
35 |
36 |     plt.tight_layout()
37 |     plt.savefig("plot.png")
38 |
39 | # train the model
40 | # takes the model, dataloaders, criterion, optimizer, device (GPU or CPU) and number of epochs as parameters, in that order
41 | # returns the model and the lists of validation accuracy, validation loss, train accuracy and train loss, in that order
42 | def train_model(model, dataloaders, criterion, optimizer, device,num_epochs=25):
43 |     since = time.time()
44 |
45 |     val_acc_history = []
46 |     val_loss_history = []
47 |     tr_acc_history = []
48 |     tr_loss_history = []
49 |
50 |     best_model_wts = copy.deepcopy(model.state_dict())
51 |     best_acc = 0.0
52 |
53 |
54 |     n_epochs_stop = 5
55 |     min_val_loss = np.inf  # np.Inf was removed in NumPy 2.0
56 |     epochs_no_improve = 0
57 |     for epoch in range(num_epochs):
58 |         print('Epoch {}/{}'.format(epoch, num_epochs - 1))
59 |         print('-' * 10)
60 |         #scheduler.step(epoch) #for lr_scheduler
61 |         # Each epoch has a training and validation phase
62 |         for phase in ['train', 'validation']:
63 |
64 |             if phase == 'train':
65 |                 model.train()  # Set model to training mode
66 |             else:
67 |                 model.eval()   # Set model to evaluate mode
68 |
69 |             running_loss = 0.0
70 |             running_corrects = 0
71 |             loader = dataloaders[phase]
72 |             # Iterate over data.
73 |             for inputs, labels in loader:
74 |                 inputs = inputs.to(device)
75 |                 labels = labels.to(device)
76 |                 # print("in dataloaders", end=" ")
77 |                 # zero the parameter gradients
78 |                 optimizer.zero_grad()
79 |
80 |                 # forward
81 |                 # track history if only in train
82 |                 with torch.set_grad_enabled(phase == 'train'):
83 |
84 |                     outputs = model(inputs)
85 |                     # print("x")
86 |
87 |                     _, preds = torch.max(outputs, 1)
88 |                     loss = criterion(outputs, labels)
89 |
90 |                     # backward + optimize only if in training phase
91 |                     if phase == 'train':
92 |                         loss.backward()
93 |                         optimizer.step()
94 |
95 |                 # statistics
96 |                 running_loss += loss.item() * inputs.size(0)
97 |                 running_corrects += torch.sum(preds == labels.data)
98 |
99 |             epoch_loss = running_loss / len(dataloaders[phase].dataset)
100 |             epoch_acc = running_corrects.double().item() / len(dataloaders[phase].dataset)  # plain float, so the history can be plotted even when training on the GPU
101 |
102 |             print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))
103 |
104 |             # deep copy the model
105 |             if phase == 'validation' and epoch_acc > best_acc:
106 |                 best_acc = epoch_acc
107 |                 best_model_wts = copy.deepcopy(model.state_dict())
108 |             if phase == 'validation':
109 |                 val_acc_history.append(epoch_acc)
110 |                 val_loss_history.append(epoch_loss)
111 |             if phase == 'train':
112 |                 tr_acc_history.append(epoch_acc)
113 |                 tr_loss_history.append(epoch_loss)
114 |             # early stopping
115 |             if phase == 'validation':
116 |                 if epoch_loss < min_val_loss:
117 |                     epochs_no_improve = 0
118 |                     min_val_loss = epoch_loss  # was `val_loss`, which is not defined inside this function
119 |                 else:
120 |                     epochs_no_improve += 1
121 |                 # Check early stopping condition
122 |                 if epochs_no_improve == n_epochs_stop:
123 |                     print('Early stopping!')
124 |                     model.load_state_dict(best_model_wts)  # return the best weights seen so far
125 |                     return model, val_acc_history, val_loss_history, tr_acc_history, tr_loss_history
126 |
127 |
128 |
129 |     time_elapsed = time.time() - since
130 |     print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
131 |     print('Best val Acc: {:4f}'.format(best_acc))
132 |
133 |     # load best model weights
134 |     model.load_state_dict(best_model_wts)
135 |     return model, val_acc_history, val_loss_history, tr_acc_history, tr_loss_history
136 |
137 |
138 |
139 | # data augmentation
140 | data_transforms = {
141 |     'train': transforms.Compose([
142 |         transforms.RandomResizedCrop(224),
143 |         transforms.RandomHorizontalFlip(),
144 |         transforms.ToTensor(),
145 |         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
146 |     ]),
147 |     'validation': transforms.Compose([
148 |         transforms.Resize(256),
149 |         transforms.CenterCrop(224),
150 |         transforms.ToTensor(),
151 |         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
152 |     ]),
153 |     'test': transforms.Compose([
154 |         transforms.Resize(256),
155 |         transforms.CenterCrop(224),
156 |         transforms.ToTensor(),
157 |         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
158 |     ]),
159 | }
160 |
161 | data_dir = "dataset"
162 | num_classes = 10
163 | batch_size = 32
164 | image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
165 |                                           data_transforms[x])
166 |                   for x in ['train', 'validation', 'test']}
167 | dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size,
168 |                                              shuffle=True, num_workers=2)
169 |                for x in ['train', 'validation', 'test']}
170 | dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'validation', 'test']}
171 | class_names = image_datasets['train'].classes
172 |
173 | model_ft = models.vgg16(pretrained=True)
174 |
175 | # freeze layers before classifiers
176 | for param in model_ft.features.parameters():
177 |     # print(param)
178 |     param.requires_grad = False
179 | #different number of layer freeze
180 | #model_ft.features[-1].requires_grad = True
181 | #model_ft.features[-2].requires_grad = True
182 | #model_ft.features[-3].requires_grad = True
183 |
184 |
185 | model_ft.classifier[6] = nn.Linear(4096,10) #modify the last layer
186 |
187 |
188 | # specify loss function
189 | criterion = nn.CrossEntropyLoss()
190 |
191 | # specify optimizer
192 | optimizer = torch.optim.Adam(model_ft.parameters(), lr=0.001)
193 |
194 | #lr_scheduler
195 | #scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)
196 |
197 | #different optimizer
198 | #optimizer = torch.optim.SGD(model_ft.parameters(), lr=0.001)
199 |
200 | #weight_decay
201 | #optimizer = torch.optim.Adam(model_ft.parameters(), lr=0.1,weight_decay= 0.001)
202 |
203 | device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
204 | model_ft = model_ft.to(device) #send the model to the gpu
205 | model_ft, val_acc, val_loss, tr_acc, tr_loss = train_model(model_ft, dataloaders, criterion, optimizer,device, num_epochs=30) #train model
206 |
207 | # test the model
208 | correct = 0
209 | topk = 0
210 | total = 0
211 | testloader = dataloaders['test']
212 | with torch.no_grad():
213 |     for data in testloader:
214 |         images, labels = data
215 |         images = images.to(device)
216 |         labels = labels.to(device)
217 |         outputs = model_ft(images)
218 |         _, predicted = torch.max(outputs.data, 1)
219 |         total += labels.size(0)
220 |         correct += (predicted == labels).sum().item()
221 |
222 |         probs, classes = outputs.topk(5, dim=1)  # the 5 highest-scoring classes per image
223 |         labels_size = labels.size(0)
224 |         for i in range(labels_size):
225 |             if(labels[i] in classes[i]):
226 |                 topk += 1
227 |
228 |
229 | print('Accuracy of the model on the test images: %d %%' % (100 * correct / total))
230 | print('Accuracy of the top 5 on the test images: %d %%' % (100 * topk / total))
231 |
232 | # val/train loss and accuracy plots
233 | plot_graph(val_loss, val_acc, tr_loss, tr_acc, len(val_loss))  # number of epochs actually run; early stopping may end training before 30
234 |
--------------------------------------------------------------------------------
/part3.py:
--------------------------------------------------------------------------------
1 | from __future__ import print_function
2 | from __future__ import division
3 | import torch
4 | import torch.nn as nn
5 | import torch.optim as optim
6 | import numpy as np
7 | import torchvision
8 | from torchvision import datasets, models, transforms
9 | from torch.autograd import Variable
10 | import matplotlib.pyplot as plt
11 | from sklearn.svm import LinearSVC
12 | from sklearn.multiclass import OneVsRestClassifier
13 | from sklearn.metrics import confusion_matrix
14 | from sklearn.utils.multiclass import unique_labels
15 | import time
16 | import os
17 | import copy
18 |
19 | # plots the confusion matrix of the predictions
20 | # takes the true labels, predicted labels, class names, a boolean normalize flag, the title of the plot and its colormap as parameters, in that order
21 | def plot_confusion_matrix(y_true, y_pred, classes,
22 |                           normalize=False,
23 |                           title=None,
24 |                           cmap=plt.cm.Blues):
25 |     """
26 |     This function prints and plots the confusion matrix.
27 |     Normalization can be applied by setting `normalize=True`.
28 |     """
29 |     if not title:
30 |         if normalize:
31 |             title = 'Normalized confusion matrix'
32 |         else:
33 |             title = 'Confusion matrix, without normalization'
34 |
35 |     # Compute confusion matrix
36 |     cm = confusion_matrix(y_true, y_pred)
37 |     # Rows are true labels, columns are predicted labels
38 |
39 |     if normalize:
40 |         cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
41 |         print("Normalized confusion matrix")
42 |     else:
43 |         print('Confusion matrix, without normalization')
44 |
45 |     print(cm)
46 |
47 |     fig, ax = plt.subplots()
48 |     im = ax.imshow(cm, interpolation='nearest', cmap=cmap)  # use the cmap argument instead of a hard-coded colormap
49 |     ax.figure.colorbar(im, ax=ax)
50 |     # We want to show all ticks...
51 |     ax.set(xticks=np.arange(cm.shape[1]),
52 |            yticks=np.arange(cm.shape[0]),
53 |            # ... and label them with the respective list entries
54 |            xticklabels=classes, yticklabels=classes,
55 |            title=title,
56 |            ylabel='True label',
57 |            xlabel='Predicted label')
58 |
59 |     # Rotate the tick labels and set their alignment.
60 |     plt.setp(ax.get_xticklabels(), rotation=45, ha="right",
61 |              rotation_mode="anchor")
62 |
63 |     # Loop over data dimensions and create text annotations.
64 |     fmt = '.2f' if normalize else 'd'
65 |     thresh = cm.max() / 2.
66 |     for i in range(cm.shape[0]):
67 |         for j in range(cm.shape[1]):
68 |             ax.text(j, i, format(cm[i, j], fmt),
69 |                     ha="center", va="center",
70 |                     color="white" if cm[i, j] > thresh else "black")
71 |     fig.tight_layout()
72 |     fig.savefig("conf_mtrx_prt1_b4.png")
73 |     return ax
74 |
75 | # returns a list that holds the extracted features of the images and the class labels of the images
76 | # takes dirName, which specifies the image set to extract from, and the device (GPU or CPU) as parameters
77 | def feature_extract(dirName,device):
78 |     feature = []
79 |     label = []
80 |     for inputs, labels in dataloaders[dirName]:
81 |         inputs = inputs.to(device)
82 |         labels = labels.to(device)
83 |         outputs = model_ft(inputs)
84 |         feature.extend(outputs.detach().cpu().numpy())  # detach so the conversion works even if gradients are tracked
85 |         label.extend(labels.cpu().numpy())
86 |     return [feature,label]
87 |
88 | # train the model
89 | # takes the model, dataloaders, criterion, optimizer, device(GPU or CPU) and number of epochs respectively as parameters
90 | # returns model, array of validation accuracy, validation loss, train accuracy, train loss respectively
91 | def train_model(model, dataloaders, criterion, optimizer, device,num_epochs=25):
92 |     since = time.time()
93 |
94 |     val_acc_history = []
95 |     val_loss_history = []
96 |     tr_acc_history = []
97 |     tr_loss_history = []
98 |
99 |     best_model_wts = copy.deepcopy(model.state_dict())
100 |     best_acc = 0.0
101 |
102 |
103 |     for epoch in range(num_epochs):
104 |         print('Epoch {}/{}'.format(epoch, num_epochs - 1))
105 |         print('-' * 10)
106 |
107 |         # Each epoch has a training and validation phase
108 |         for phase in ['train', 'validation']:
109 |
110 |             if phase == 'train':
111 |                 model.train()  # Set model to training mode
112 |             else:
113 |                 model.eval()   # Set model to evaluate mode
114 |
115 |             running_loss = 0.0
116 |             running_corrects = 0
117 |             loader = dataloaders[phase]
118 |             # Iterate over data.
119 |             for inputs, labels in loader:
120 |                 inputs = inputs.to(device)
121 |                 labels = labels.to(device)
122 |                 # print("in dataloaders", end=" ")
123 |                 # zero the parameter gradients
124 |                 optimizer.zero_grad()
125 |
126 |                 # forward
127 |                 # track history if only in train
128 |                 with torch.set_grad_enabled(phase == 'train'):
129 |
130 |                     outputs = model(inputs)
131 |                     # print("x")
132 |
133 |                     _, preds = torch.max(outputs, 1)
134 |                     loss = criterion(outputs, labels)
135 |
136 |                     # backward + optimize only if in training phase
137 |                     if phase == 'train':
138 |                         loss.backward()
139 |                         optimizer.step()
140 |
141 |                 # statistics
142 |                 running_loss += loss.item() * inputs.size(0)
143 |                 running_corrects += torch.sum(preds == labels.data)
144 |
145 |             epoch_loss = running_loss / len(dataloaders[phase].dataset)
146 |             epoch_acc = running_corrects.double() / len(dataloaders[phase].dataset)
147 |
148 |             print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))
149 |
150 |             # deep copy the model
151 |             if phase == 'validation' and epoch_acc > best_acc:
152 |                 best_acc = epoch_acc
153 |                 best_model_wts = copy.deepcopy(model.state_dict())
154 |             if phase == 'validation':
155 |                 val_acc_history.append(epoch_acc)
156 |                 val_loss_history.append(epoch_loss)
157 |             if phase == 'train':
158 |                 tr_acc_history.append(epoch_acc)
159 |                 tr_loss_history.append(epoch_loss)
160 |
161 |
162 |
163 |     time_elapsed = time.time() - since
164 |     print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
165 |     print('Best val Acc: {:4f}'.format(best_acc))
166 |
167 |     # load best model weights
168 |     model.load_state_dict(best_model_wts)
169 |     return model, val_acc_history, val_loss_history, tr_acc_history, tr_loss_history
170 |
171 |
172 | # data augmentation
173 | data_transforms = {
174 |     'train': transforms.Compose([
175 |         transforms.RandomResizedCrop(224),
176 |         transforms.RandomHorizontalFlip(),
177 |         transforms.ToTensor(),
178 |         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
179 |     ]),
180 |     'validation': transforms.Compose([
181 |         transforms.Resize(256),
182 |         transforms.CenterCrop(224),
183 |         transforms.ToTensor(),
184 |         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
185 |     ]),
186 |     'test': transforms.Compose([
187 |         transforms.Resize(256),
188 |         transforms.CenterCrop(224),
189 |         transforms.ToTensor(),
190 |         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
191 |     ]),
192 | }
193 |
194 | data_dir = "dataset"
195 | num_classes = 10
196 | batch_size = 32
197 | image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x),
198 |                                           data_transforms[x])
199 |                   for x in ['train', 'validation', 'test']}
200 | dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size,
201 |                                              shuffle=True, num_workers=2)
202 |                for x in ['train', 'validation', 'test']}
203 | dataset_sizes = {x: len(image_datasets[x]) for x in ['train', 'validation', 'test']}
204 | class_names = image_datasets['train'].classes
205 |
206 | model_ft = models.vgg16(pretrained=True)
207 |
208 | # freeze the convolutional layers
209 | for param in model_ft.features.parameters():
210 |     param.requires_grad = False
211 | # unfreeze the last three feature modules; setting requires_grad on a module object has no effect on its parameters
212 | for param in model_ft.features[-3:].parameters():
213 |     param.requires_grad = True
214 |
215 | #modify last layer
216 | model_ft.classifier[6] = nn.Linear(4096,10)
217 |
218 | # specify loss function
219 | criterion = nn.CrossEntropyLoss()
220 |
221 | # specify optimizer
222 | optimizer = torch.optim.Adam(model_ft.parameters(), lr=0.001)
223 |
224 |
225 | device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
226 | model_ft = model_ft.to(device)
227 | model_ft, val_acc, val_loss, tr_acc, tr_loss = train_model(model_ft, dataloaders, criterion, optimizer,device, num_epochs=30) #train model
228 |
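229 | model_ft.classifier = nn.Sequential(*list(model_ft.classifier.children())[:-2])  # drop the last two layers (Dropout and the final Linear) so the output is FC7 after ReLU
230 | # freeze all layers
231 | for param in model_ft.parameters():
232 |     param.requires_grad = False
233 | model_ft.eval()  # evaluation mode, so Dropout does not perturb the extracted features
234 | # extract features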
235 | test_features = feature_extract('test',device)
236 | test_feature = test_features[0]
237 | test_label = test_features[1]
238 |
239 | train_features = feature_extract('train',device)
240 | train_feature = train_features[0]
241 | train_label = train_features[1]
242 |
243 |
244 | # one vs rest multiclass linear SVM
245 | clf = OneVsRestClassifier(LinearSVC(random_state=0,max_iter = 20000)) # random_state => randomness (optional)
246 | classifier = clf.fit(train_feature, train_label)
247 |
248 | # predictions
249 | y_pred = classifier.predict(test_feature)
250 | class_totals = np.bincount(test_label, minlength=10)  # number of test images per class
251 | class_based_acc = np.zeros(10)
252 | for true_label, pred_label in zip(test_label, y_pred):
253 |     if pred_label == true_label:
254 |         class_based_acc[int(true_label)] += 1
255 |
256 | # print class-based and overall accuracy (each class is divided by its own number of test images)
257 | for acc_ind in range(10):
258 |     print("Accuracy of class " + str(acc_ind) + " " + str(class_based_acc[acc_ind] * 100 / class_totals[acc_ind]))
259 | acc = classifier.score(test_feature, test_label)
260 | print("Accuracy: " + str(acc * 100))
261 | y_pred = classifier.predict(test_feature)
262 | # confusion matrix
263 | plot_confusion_matrix(test_label, y_pred, classes=class_names, normalize=True, title='Normalized confusion matrix')
264 |
--------------------------------------------------------------------------------