├── Result.PNG
├── Image Classifier Project.pdf
├── LICENSE
├── README.md
├── predict.py
└── train.py


/Result.PNG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Ashleshk/Image-Classifier/main/Result.PNG


--------------------------------------------------------------------------------
/Image Classifier Project.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Ashleshk/Image-Classifier/main/Image Classifier Project.pdf


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2022 Ashlesh Khajbage

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Image-Classifier

## Developing an AI application
Going forward, AI algorithms will be incorporated into more and more everyday applications. For example, you might want to include an image classifier in a smartphone app. To do this, you'd use a deep learning model trained on hundreds of thousands of images as part of the overall application architecture. A large part of software development in the future will be using these types of models as common parts of applications.

In this project, you'll train an image classifier to recognize different species of flowers. You can imagine using something like this in a phone app that tells you the name of the flower your camera is looking at. In practice you'd train this classifier, then export it for use in your application. I'll be using this dataset of 102 flower categories.

Steps that I used to train on the dataset:
1. Load a pre-trained network (if you need a starting point, the VGG networks work great and are straightforward to use)
2. Define a new, untrained feed-forward network as a classifier, using ReLU activations and dropout
3. Train the classifier layers with backpropagation, using the pre-trained network to get the features
4. Track the loss and accuracy on the validation set to determine the best hyperparameters

## Result
![Result](https://github.com/Ashleshk/Image-Classifier/blob/main/Result.PNG)

## Learning Tips
When training, make sure you're updating only the weights of the feed-forward network. You should be able to get the validation accuracy above 70% if you build everything right. Make sure to try different hyperparameters (learning rate, units in the classifier, epochs, etc.) to find the best model.
Save those hyperparameters to use as default values in the next part of the project.

One last important tip if you're using the workspace to run your code: to avoid having your workspace disconnect during the long-running tasks in this notebook, please read the earlier page in this lesson called Intro to GPU Workspaces about Keeping Your Session Active. You'll want to include code from the workspace_utils.py module.

Note for Workspace users: If your network is over 1 GB when saved as a checkpoint, there might be issues with saving backups in your workspace. Typically this happens with wide dense layers after the convolutional layers. If your saved checkpoint is larger than 1 GB (you can open a terminal and check with ls -lh), you should reduce the size of your hidden layers and train again.



--------------------------------------------------------------------------------
/predict.py:
--------------------------------------------------------------------------------
#Author: Pranav Menon
#imports
#arg
import argparse
#torch library
import torch
#variable
from torch.autograd import Variable
#transforms and models
from torchvision import transforms, models
#F
import torch.nn.functional as F
#np
import numpy as np
#Image
from PIL import Image
#json
import json
#os
import os
#random
import random


#method to load the checkpoint data
def load_checkpoint(filepath):
    #loading it to a variable checkpoint; map_location lets a GPU-trained
    #checkpoint load on a CPU-only machine
    #(newer PyTorch releases may also need weights_only=False here, because
    #the checkpoint pickles whole model objects)
    checkpoint = torch.load(filepath, map_location='cpu')
    model = checkpoint['model']
    model.classifier = checkpoint['classifier']
    learning_rate = checkpoint['learning_rate']
    epochs = checkpoint['epochs']
    optimizer = checkpoint['optimizer']
    model.load_state_dict(checkpoint['state_dict'])
    model.class_to_idx = checkpoint['class_to_idx']

    return model

#method to load the category names file
def load_cat_names(filename):
    with open(filename) as f:
        category_names = json.load(f)
    return category_names

#method to parse the arguments
def parse_args():
    parser = argparse.ArgumentParser()
    #checkpoint (nargs='?' makes the positional optional so the default applies)
    parser.add_argument('checkpoint', action='store', nargs='?', default='checkpoint.pth')
    #top k number of plants
    parser.add_argument('--top_k', dest='top_k', type=int, default=3)
    # default filepath to primrose image
    parser.add_argument('--filepath', dest='filepath', default='flowers/test/1/image_06743.jpg')
    #category
    parser.add_argument('--category_names', dest='category_names', default='cat_to_name.json')
    #gpu
    parser.add_argument('--gpu', action='store', default='gpu')
    return parser.parse_args()

#method to process the actual image
def process_image(image):
    ''' Scales, crops, and normalizes the image at the given path for a PyTorch model,
        returns a PyTorch tensor
    '''
    # use Image
    img_pil = Image.open(image)
    #final adjustments
    adjustments = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])

    image = adjustments(img_pil)

    return image

#method to predict what flower it is
def predict(image_path, model, topk=3, gpu='gpu'):
    ''' Get the top-k probability values and their respective flower classes.
    '''

    # use CUDA only when it was requested and is actually available
    device = torch.device('cuda' if gpu == 'gpu' and torch.cuda.is_available() else 'cpu')
    model = model.to(device)
    # evaluation mode switches off dropout in the classifier
    model.eval()

    img_torch = process_image(image_path)
    img_torch = img_torch.unsqueeze_(0)
    img_torch = img_torch.float()

    with torch.no_grad():
        output = model.forward(img_torch.to(device))

    # softmax over the LogSoftmax output gives the same probabilities as exp()
    probability = F.softmax(output, dim=1) # use F

    # move the results to the CPU before converting them
    top_probs, top_indices = probability.topk(topk)
    probs = top_probs[0].cpu().numpy()

    index_to_class = {val: key for key, val in model.class_to_idx.items()} # from reviewer advice
    # np.int was removed from NumPy, so use the built-in int
    top_classes = [int(index_to_class[each]) for each in top_indices[0].tolist()]

    return probs, top_classes

def main():
    args = parse_args()
    gpu = args.gpu
    model = load_checkpoint(args.checkpoint)
    cat_to_name = load_cat_names(args.category_names)

    img_path = args.filepath
    probs, classes = predict(img_path, model, int(args.top_k), gpu)
    labels = [cat_to_name[str(index)] for index in classes]
    probability = probs
    print('File selected: ' + img_path)

    print(labels)
    print(probability)
    # this prints out the top classes
    # and their probabilities
    i=0
    while i < len(labels):
        print("{} with a probability of {}".format(labels[i], probability[i]))

        i += 1

if __name__ == "__main__":
    main()

--------------------------------------------------------------------------------
/train.py:
--------------------------------------------------------------------------------
#Author: Pranav Menon
#imports
#args
import argparse
#torch library
import torch
#nn
from torch import nn
#optim
from torch import optim
#variable
from torch.autograd import Variable
#datasets, transforms, models
from torchvision import datasets, transforms, models
#Image Folder
from torchvision.datasets import ImageFolder
#F
import torch.nn.functional as F
#Image
from PIL import Image
#OrderedDict
from collections import OrderedDict
#time
import time
#np
import numpy as np
#plt
import matplotlib.pyplot as plt


#save_checkpoint method
def save_checkpoint(path, model, optimizer, args, classifier):
    #checkpoint
    checkpoint = {'arch': args.arch,
                  'model': model,
                  'learning_rate': args.learning_rate,
                  'hidden_units': args.hidden_units,
                  'classifier' : classifier,
                  'epochs': args.epochs,
                  'optimizer': optimizer.state_dict(),
                  'state_dict': model.state_dict(),
                  'class_to_idx': model.class_to_idx}
    #torch save
    torch.save(checkpoint, path) # the path will be user defined, if not it autosets to checkpoint.pth

#parser method
def parse_args():
    parser = argparse.ArgumentParser(description="Training")
    #data directory
    parser.add_argument('data_dir', action='store')
    #type of training
    parser.add_argument('--arch', dest='arch', default='vgg13', choices=['vgg13', 'densenet121'])
    #learning rate
    parser.add_argument('--learning_rate', dest='learning_rate', default='0.01')
    #hidden units
    parser.add_argument('--hidden_units', dest='hidden_units', default='512')
    #epochs
    parser.add_argument('--epochs', dest='epochs', default='20')
    #gpu
    parser.add_argument('--gpu', action='store', default='gpu')
    #save directory
    parser.add_argument('--save_dir', dest="save_dir", action="store", default="checkpoint.pth")
    return parser.parse_args()

#train method
def train(model, criterion, optimizer, dataloaders, epochs, gpu):
    steps = 0
    print_every = 10
    # pick the device once instead of moving the model on every batch;
    # fall back to the CPU when CUDA is not available
    device = torch.device('cuda' if gpu == 'gpu' and torch.cuda.is_available() else 'cpu')
    model.to(device)
    for e in range(epochs):
        model.train()
        running_loss = 0
        #0 means to train
        for ii, (inputs, labels) in enumerate(dataloaders[0]):
            steps += 1
            inputs, labels = inputs.to(device), labels.to(device)
            optimizer.zero_grad()
            #forward pass
            outputs = model.forward(inputs)
            loss = criterion(outputs, labels)
            #backward pass
            loss.backward()
            optimizer.step()
            #runningloss
            running_loss += loss.item()

            if steps % print_every == 0:
                # evaluation mode switches off dropout while validating
                model.eval()
                valloss = 0
                accuracy = 0
                # 1 is validation
                with torch.no_grad():
                    for inputs2, labels2 in dataloaders[1]:
                        inputs2, labels2 = inputs2.to(device), labels2.to(device)
                        outputs = model.forward(inputs2)
                        # accumulate the loss over all validation batches
                        valloss += criterion(outputs, labels2).item()
                        ps = torch.exp(outputs)
                        equality = (labels2 == ps.max(1)[1])
                        accuracy += equality.float().mean().item()
                #valloss calc
                valloss = valloss / len(dataloaders[1])
                #accuracy
                accuracy = accuracy / len(dataloaders[1])
                #Epoch, Training Loss, Validation Loss, Accuracy
                print("Epoch: {}/{}... ".format(e+1, epochs),
                      "Training Loss: {:.4f}".format(running_loss/print_every),
                      "Validation Loss {:.4f}".format(valloss),
                      "Accuracy: {:.4f}".format(accuracy),
                      )

                running_loss = 0
                # back to training mode so dropout is active again
                model.train()

def main():

    args = parse_args()
    # use the data directory given on the command line
    data_dir = args.data_dir
    train_dir = data_dir + '/train'
    val_dir = data_dir + '/valid'
    test_dir = data_dir + '/test'

    training_transforms = transforms.Compose([transforms.RandomRotation(30), transforms.RandomResizedCrop(224),
                                              transforms.RandomHorizontalFlip(), transforms.ToTensor(),
                                              transforms.Normalize([0.485, 0.456, 0.406],
                                                                   [0.229, 0.224, 0.225])])

    validation_transforms = transforms.Compose([transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(),
                                                transforms.Normalize([0.485, 0.456, 0.406],
                                                                     [0.229, 0.224, 0.225])])

    testing_transforms = transforms.Compose([transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor(),
                                             transforms.Normalize([0.485, 0.456, 0.406],
                                                                  [0.229, 0.224, 0.225])])

    image_datasets = [ImageFolder(train_dir, transform=training_transforms),
                      ImageFolder(val_dir, transform=validation_transforms),
                      ImageFolder(test_dir, transform=testing_transforms)]

    dataloaders = [torch.utils.data.DataLoader(image_datasets[0], batch_size=64, shuffle=True),
                   torch.utils.data.DataLoader(image_datasets[1], batch_size=64, shuffle=True),
                   torch.utils.data.DataLoader(image_datasets[2], batch_size=64, shuffle=True)]

    model = getattr(models, args.arch)(pretrained=True)

    # freeze the pre-trained feature extractor; only the new classifier is trained
    for param in model.parameters():
        param.requires_grad = False

    # the hidden layer size comes from --hidden_units
    hidden_units = int(args.hidden_units)

    if args.arch == "vgg13":
        feature_num = model.classifier[0].in_features
        classifier = nn.Sequential(OrderedDict([
                                  ('fc1', nn.Linear(feature_num, hidden_units)),
                                  ('drop', nn.Dropout(p=0.5)),
                                  ('relu', nn.ReLU()),
                                  ('fc2', nn.Linear(hidden_units, 102)),
                                  ('output', nn.LogSoftmax(dim=1))]))
    elif args.arch == "densenet121":
        classifier = nn.Sequential(OrderedDict([
                                  ('fc1', nn.Linear(1024, hidden_units)),
                                  ('drop', nn.Dropout(p=0.6)),
                                  ('relu', nn.ReLU()),
                                  ('fc2', nn.Linear(hidden_units, 102)),
                                  ('output', nn.LogSoftmax(dim=1))]))

    model.classifier = classifier
    #neuralnetwork loss
    criterion = nn.NLLLoss()
    #optimizer
    optimizer = optim.Adam(model.classifier.parameters(), lr=float(args.learning_rate))
    #epochs
    epochs = int(args.epochs)
    #convert
    class_index = image_datasets[0].class_to_idx
    # get the gpu settings
    gpu = args.gpu
    #call train
    train(model, criterion, optimizer, dataloaders, epochs, gpu)
    #convert
    model.class_to_idx = class_index
    # get the new save location
    path = args.save_dir
    #checkpoint!
    save_checkpoint(path, model, optimizer, args, classifier)

#calling the main method
#During this part of the project for both the train.py and predict.py
#section, I referenced this github link https://github.com/kwahid,
#which helped me understand how to do a lot of it. After looking at
#his code I now fully understand how to do this :D

if __name__ == "__main__":
    main()

--------------------------------------------------------------------------------
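
predict.py inverts `class_to_idx` before looking names up in `cat_to_name.json`. That step matters because `ImageFolder` assigns indices to class folders in sorted string order, so output index 1 belongs to folder "10", not folder "2". The sketch below shows the same lookup without PyTorch, as a rough illustration only. The function name `top_k_labels` and the toy values are hypothetical and not part of the repository, and it assumes the model outputs log-probabilities, as the `LogSoftmax` layer in train.py does.

```python
import math


def top_k_labels(log_probs, class_to_idx, cat_to_name, k=3):
    """Return the k most likely (flower name, probability) pairs.

    Hypothetical helper for illustration; mirrors the lookup in predict.py.
    """
    # exp() turns log-probabilities back into probabilities
    probs = [math.exp(lp) for lp in log_probs]
    # invert class_to_idx: output index -> class folder name
    idx_to_class = {idx: cls for cls, idx in class_to_idx.items()}
    # output indices ordered by descending probability, cut to the top k
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(cat_to_name[idx_to_class[i]], probs[i]) for i in top]


# Toy values (assumed for this example): folders sorted as strings give
# "1" -> 0, "10" -> 1, "2" -> 2
class_to_idx = {"1": 0, "10": 1, "2": 2}
cat_to_name = {"1": "pink primrose", "2": "hard-leaved pocket orchid", "10": "globe thistle"}
log_probs = [math.log(0.2), math.log(0.7), math.log(0.1)]

result = top_k_labels(log_probs, class_to_idx, cat_to_name, k=2)
for name, prob in result:
    print("{} with a probability of {:.2f}".format(name, prob))
```

Skipping the inversion and treating the output index as the category number directly would mislabel most classes, which is why predict.py builds `index_to_class` first.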