├── weightsFromCloud └── .gitkeep ├── DRH.png ├── noise_problem.png ├── basic_conv.py ├── README.md ├── Inference_Stanford_Cars_ResNet50_Student.py ├── Inference_Stanford_Cars_ResNet50_Teacher.py ├── Inference_Stanford_Cars_TResNet_L_Student.py ├── Inference_Stanford_Cars_TResNet_L_Teacher.py ├── LICENSE ├── Stanford_Cars_TResNet_L_Distillation.py ├── Stanford_Cars_ResNet50_Distillation.py ├── Stanford_Cars_ResNet50_PMAL.py └── Stanford_Cars_TResNet_L_PMAL.py /weightsFromCloud/.gitkeep: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /DRH.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Dichao-Liu/Anti-noise_FGVR/HEAD/DRH.png -------------------------------------------------------------------------------- /noise_problem.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Dichao-Liu/Anti-noise_FGVR/HEAD/noise_problem.png -------------------------------------------------------------------------------- /basic_conv.py: -------------------------------------------------------------------------------- 1 | import torch.nn as nn 2 | import torch 3 | 4 | 5 | 6 | class BasicConv(nn.Module): 7 | def __init__(self, in_planes, out_planes, kernel_size, stride=1, padding=0, dilation=1, groups=1, relu=True, bn=True, bias=False): 8 | super(BasicConv, self).__init__() 9 | self.out_channels = out_planes 10 | self.conv = nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, 11 | stride=stride, padding=padding, dilation=dilation, groups=groups, bias=bias) 12 | self.bn = nn.BatchNorm2d(out_planes, eps=1e-5, 13 | momentum=0.01, affine=True) if bn else None 14 | self.relu = nn.ReLU() if relu else None 15 | 16 | def forward(self, x): 17 | x = self.conv(x) 18 | if self.bn is not None: 19 | x = self.bn(x) 20 | if self.relu is not None: 21 | x = self.relu(x) 22 | return x 23 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | # Anti-noise_FGVR 4 | 5 | This repository provides a PyTorch implementation of the fine-grained vehicle recognition method, as proposed in my paper: [Progressive Multi-Task Anti-Noise Learning and Distilling Frameworks for Fine-Grained Vehicle Recognition](https://ieeexplore.ieee.org/document/10623841). 6 | 7 | ### Figure 1: The target problem addressed by the proposed method 8 | 9 | ![The target problem addressed by the proposed method.](https://raw.githubusercontent.com/Dichao-Liu/Anti-noise_FGVR/main/noise_problem.png) 10 | 11 | 12 | ### Figure 2: The proposed module 13 | 14 | ![The proposed module.](https://raw.githubusercontent.com/Dichao-Liu/Anti-noise_FGVR/main/DRH.png) 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | ### Environment 23 | 24 | This source code was tested in the following environment: 25 | 26 | Python = 3.8.13 27 | PyTorch = 1.12.0 28 | torchvision = 0.13.0 29 | Ubuntu 20.04.6 LTS 30 | NVIDIA GeForce RTX 3080 Ti 31 | 32 | ### Pre-trained Models 33 | The pre-trained models can be downloaded from [this link](https://wani.teracloud.jp/share/11f23df41b4a6f82). 34 | 35 | Please save the downloaded models in the `weightsFromCloud` folder. 36 | 37 | The `xxxxx_Network.pth` file was saved using `torch.save(model, 'xxxxx_Network.pth')`. 
38 | 
39 | The `xxxxx_Weight.pth` file was saved using `torch.save(model.state_dict(), 'xxxxx_Weight.pth')`.
40 | 
41 | If you decide to register on InfiniCLOUD to download the models, I would appreciate it if you could kindly use my referral code `XTQQJ` during the process. This small gesture will be of great help to me.
42 | 
43 | 
44 | ### Dependencies
45 | 
46 | * **(1) Installation**
47 | 
48 | Install `Inplace-ABN` following the instructions:
49 | 
50 | https://github.com/Alibaba-MIIL/TResNet/blob/master/requirements.txt
51 | 
52 | https://github.com/Alibaba-MIIL/TResNet/blob/master/INPLACE_ABN_TIPS.md
53 | 
54 | Install `imgaug`:
55 | 
56 |     pip install imgaug
57 | 
58 | * **(2) Download**
59 | 
60 | Download the folder `src` from https://github.com/Alibaba-MIIL/TResNet,
61 | the folder `vic` from https://github.com/styler00dollar/pytorch-loss-functions,
62 | the folder `example` and the Python file `sam.py` from https://github.com/davda54/sam,
63 | and save them as:
64 | 
65 |     Anti-noise_FGVR
66 |     ├── basic_conv.py
67 |     ├── Inference_Stanford_Cars_ResNet50_Student.py
68 |     ├── Inference_Stanford_Cars_ResNet50_Teacher.py
69 |     ├── ...
70 |     ├── src
71 |     ├── vic
72 |     ├── example
73 |     ├── sam.py
74 | 
75 | Alternatively, you can download them by running the following commands (note that `subversion` must be installed beforehand, e.g., `sudo apt install subversion`):
76 | 
77 |     git clone https://github.com/Dichao-Liu/Anti-noise_FGVR.git
78 |     cd Anti-noise_FGVR
79 |     svn export https://github.com/Alibaba-MIIL/TResNet/branches/master/src
80 |     svn export https://github.com/davda54/sam/branches/main/example
81 |     svn export https://github.com/davda54/sam/branches/main/sam.py
82 |     svn export https://github.com/styler00dollar/pytorch-loss-functions/branches/main/vic
83 | 
84 | > ⚠️ **Notice**
85 | > The original `vic` folder used in this project is no longer available. It mainly contained a `CharbonnierLoss` implementation.
86 | > If you are trying to reproduce the results, you may consider using this unofficial alternative:
87 | > [https://gist.github.com/onion-liu/6b8c75585799243f2be1a98cb1bde5ad](https://gist.github.com/onion-liu/6b8c75585799243f2be1a98cb1bde5ad)
88 | > Note: this version was found recently and has not been tested with this project, but it may serve as a good replacement.
89 | 
90 | 
91 | ### Dataset
92 | 
93 | * **(1) Download the Stanford Cars dataset or the other datasets mentioned in the paper, and organize the structure as follows:**
94 | ```
95 | dataset folder
96 | ├── train
97 | │   ├── class_001
98 | │   │   ├── 1.jpg
99 | │   │   ├── 2.jpg
100 | │   │   └── ...
101 | │   ├── class_002
102 | │   │   ├── 1.jpg
103 | │   │   ├── 2.jpg
104 | │   │   └── ...
105 | │   └── ...
106 | └── test
107 |     ├── class_001
108 |     │   ├── 1.jpg
109 |     │   ├── 2.jpg
110 |     │   └── ...
111 |     ├── class_002
112 |     │   ├── 1.jpg
113 |     │   ├── 2.jpg
114 |     │   └── ...
115 |     └── ...
116 | ```
117 | * **(2) Modify the paths to the dataset folders in the training and inference scripts.**
118 | 
119 | ### Train
120 | 
121 |     python Stanford_Cars_ResNet50_PMAL.py
122 |     python Stanford_Cars_ResNet50_Distillation.py
123 | 
124 | When training the student network, the `--from_local` option lets you specify whether to use the teacher model downloaded from InfiniCLOUD or a model you have trained yourself with the provided code.
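For example, to distill from a teacher you trained locally instead of the downloaded weights:

    python Stanford_Cars_ResNet50_Distillation.py --from_local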
125 | 126 | 127 | ### Inference 128 | 129 | python Inference_Stanford_Cars_ResNet50_Teacher.py 130 | python Inference_Stanford_Cars_ResNet50_Student.py 131 | 132 | 133 | ### Bibtex 134 | 135 | ``` 136 | @ARTICLE{10623841, 137 | author={Liu, Dichao}, 138 | journal={IEEE Transactions on Intelligent Transportation Systems}, 139 | title={Progressive Multi-Task Anti-Noise Learning and Distilling Frameworks for Fine-Grained Vehicle Recognition}, 140 | year={2024}, 141 | volume={25}, 142 | number={9}, 143 | pages={10667-10678}, 144 | keywords={Noise;Task analysis;Image recognition;Multitasking;Training;Noise measurement;Accuracy;Fine-grained vehicle recognition;intelligent transportation systems;ConvNets;object recognition}, 145 | doi={10.1109/TITS.2024.3420151} 146 | } 147 | ``` 148 | 149 | 150 | 151 | 152 | 153 | -------------------------------------------------------------------------------- /Inference_Stanford_Cars_ResNet50_Student.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | import os 3 | os.environ["CUDA_VISIBLE_DEVICES"] = "0" 4 | import torchvision.models 5 | from sam import SAM 6 | from torch.utils.model_zoo import load_url as load_state_dict_from_url 7 | import cv2 8 | 9 | import numpy as np 10 | import torchvision 11 | from torch.autograd import Variable 12 | from torchvision import transforms 13 | from basic_conv import * 14 | from example.model.smooth_cross_entropy import smooth_crossentropy 15 | from example.utility.bypass_bn import enable_running_stats, disable_running_stats 16 | 17 | 18 | class Student_Wrapper(nn.Module): 19 | def __init__(self, net_layers, classifier): 20 | super(Student_Wrapper, self).__init__() 21 | self.net_layer_0 = nn.Sequential(net_layers[0]) 22 | self.net_layer_1 = nn.Sequential(net_layers[1]) 23 | self.net_layer_2 = nn.Sequential(net_layers[2]) 24 | self.net_layer_3 = nn.Sequential(net_layers[3]) 25 | self.net_layer_4 = nn.Sequential(*net_layers[4]) 26 | self.net_layer_5 = nn.Sequential(*net_layers[5]) 27 | self.net_layer_6 = nn.Sequential(*net_layers[6]) 28 | self.net_layer_7 = nn.Sequential(*net_layers[7]) 29 | 30 | self.net_layer_8 = nn.Sequential(classifier[0]) 31 | self.net_layer_9 = nn.Sequential(classifier[1]) 32 | 33 | self.bn1 = nn.Sequential(nn.BatchNorm2d(64)) 34 | self.relu = nn.Sequential(nn.ReLU()) 35 | 36 | def forward(self, x): 37 | x = self.net_layer_0(x) 38 | x = self.net_layer_1(x) 39 | x = self.net_layer_2(x) 40 | x = self.net_layer_3(x) 41 | x = self.net_layer_4(x) 42 | x1 = self.net_layer_5(x) 43 | x2 = self.net_layer_6(x1) 44 | x3 = self.net_layer_7(x2) 45 | 46 | feat = self.net_layer_8(x3) 47 | feat = feat.view(feat.size(0), -1) 48 | out = self.net_layer_9(feat) 49 | 50 | return out, x1, x2, x3 51 | 52 | 53 | def cosine_anneal_schedule(t, nb_epoch, lr): 54 | cos_inner = np.pi * (t % (nb_epoch)) # t - 1 is used when t has 1-based indexing. 
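    # This function computes cosine annealing without restarts:
    #     lr(t) = (lr / 2) * (cos(pi * (t % nb_epoch) / nb_epoch) + 1)
    # e.g., with lr = 0.002 and nb_epoch = 100: lr(0) = 0.002, lr(50) = 0.001, lr(99) ≈ 0.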
55 |     cos_inner /= (nb_epoch)
56 |     cos_out = np.cos(cos_inner) + 1
57 | 
58 |     return float(lr / 2 * cos_out)
59 | 
60 | 
61 | def test(net, criterion, batch_size, test_path):
62 |     net.eval()
63 |     use_cuda = torch.cuda.is_available()
64 |     test_loss = 0
65 |     correct = 0
66 |     total = 0
67 |     idx = 0
68 |     device = torch.device("cuda")
69 | 
70 |     transform_test = transforms.Compose([
71 |         transforms.Resize((550, 550)),
72 |         transforms.CenterCrop(448),
73 |         transforms.ToTensor(),
74 |         transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
75 |     ])
76 |     testset = torchvision.datasets.ImageFolder(root=test_path,
77 |                                                 transform=transform_test)
78 |     testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=True, num_workers=4)
79 | 
80 |     for batch_idx, (inputs, targets) in enumerate(testloader):
81 |         idx = batch_idx
82 |         if use_cuda:
83 |             inputs, targets = inputs.to(device), targets.to(device)
84 |         with torch.no_grad():  # `Variable(..., volatile=True)` was removed from PyTorch; no_grad() skips graph construction during evaluation
85 |             output, _, _, _ = net(inputs)
86 | 
87 |         loss = criterion(output, targets).mean()
88 | 
89 |         test_loss += loss.item()
90 |         _, predicted = torch.max(output.data, 1)
91 | 
92 |         total += targets.size(0)
93 |         correct += predicted.eq(targets.data).cpu().sum()
94 | 
95 |         if batch_idx % 50 == 0:
96 |             print('Step: %d | Loss: %.3f | Acc: %.3f%% (%d/%d)' % (
97 |                 batch_idx, test_loss / (batch_idx + 1),
98 |                 100. * float(correct) / total, correct, total))
99 | 
100 |     test_acc_en = 100. * float(correct) / total
101 |     test_loss = test_loss / (idx + 1)
102 | 
103 |     net.train()
104 | 
105 |     return test_acc_en, test_loss
106 | 
107 | 
108 | def show_im(image, h, w):
109 |     img = (np.transpose(image.cpu().detach().numpy(), (1, 2, 0)) * (0.5, 0.5, 0.5) + (0.5, 0.5, 0.5))
110 | 
111 |     img = cv2.resize(img, (w, h))
112 |     return img
113 | 
114 | 
115 | class Features(nn.Module):
116 |     def __init__(self, net_layers):
117 |         super(Features, self).__init__()
118 |         self.net_layer_0 = nn.Sequential(net_layers[0])
119 |         self.net_layer_1 = nn.Sequential(net_layers[1])
120 |         self.net_layer_2 = nn.Sequential(net_layers[2])
121 |         self.net_layer_3 = nn.Sequential(net_layers[3])
122 |         self.net_layer_4 = nn.Sequential(*net_layers[4])
123 |         self.net_layer_5 = nn.Sequential(*net_layers[5])
124 |         self.net_layer_6 = nn.Sequential(*net_layers[6])
125 |         self.net_layer_7 = nn.Sequential(*net_layers[7])
126 | 
127 |     def forward(self, x):
128 |         x = self.net_layer_0(x)
129 |         x = self.net_layer_1(x)
130 |         x = self.net_layer_2(x)
131 |         x = self.net_layer_3(x)
132 |         x = self.net_layer_4(x)
133 |         x1 = self.net_layer_5(x)
134 |         x2 = self.net_layer_6(x1)
135 |         x3 = self.net_layer_7(x2)
136 |         return x1, x2, x3
137 | 
138 | 
139 | class Network_Wrapper(nn.Module):
140 |     def __init__(self, net_layers, num_class, classifier):
141 |         super().__init__()
142 |         self.Features = Features(net_layers)
143 |         self.classifier_pool = nn.Sequential(classifier[0])
144 |         self.classifier_initial = nn.Sequential(classifier[1])
145 |         self.sigmoid = nn.Sigmoid()
146 |         self.lrelu = nn.LeakyReLU(negative_slope=0.1, inplace=True)
147 | 
148 |         self.max_pool1 = nn.MaxPool2d(kernel_size=56, stride=1)
149 |         self.max_pool2 = nn.MaxPool2d(kernel_size=28, stride=1)
150 |         self.max_pool3 = nn.MaxPool2d(kernel_size=14, stride=1)
151 | 
152 |         self.conv_block1 = nn.Sequential(
153 |             BasicConv(512, 512, kernel_size=1, stride=1, padding=0, relu=True),
154 |             BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True)
155 |         )
156 |         self.classifier1 = nn.Sequential(
157 |             nn.BatchNorm1d(1024),
158 | 
nn.Linear(1024, 512), 159 | nn.BatchNorm1d(512), 160 | nn.ELU(inplace=True), 161 | nn.Linear(512, num_class) 162 | ) 163 | 164 | self.conv_block2 = nn.Sequential( 165 | BasicConv(1024, 512, kernel_size=1, stride=1, padding=0, relu=True), 166 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 167 | ) 168 | self.classifier2 = nn.Sequential( 169 | nn.BatchNorm1d(1024), 170 | nn.Linear(1024, 512), 171 | nn.BatchNorm1d(512), 172 | nn.ELU(inplace=True), 173 | nn.Linear(512, num_class), 174 | ) 175 | 176 | self.conv_block3 = nn.Sequential( 177 | BasicConv(2048, 512, kernel_size=1, stride=1, padding=0, relu=True), 178 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 179 | ) 180 | self.classifier3 = nn.Sequential( 181 | nn.BatchNorm1d(1024), 182 | nn.Linear(1024, 512), 183 | nn.BatchNorm1d(512), 184 | nn.ELU(inplace=True), 185 | nn.Linear(512, num_class), 186 | ) 187 | 188 | 189 | def forward(self, x): 190 | x1, x2, x3 = self.Features(x) 191 | map1 = x1.clone() 192 | map2 = x2.clone() 193 | map3 = x3.clone() 194 | 195 | classifiers = self.classifier_pool(x3).view(x3.size(0), -1) 196 | classifiers = self.classifier_initial(classifiers) 197 | 198 | x1_ = self.conv_block1(x1) 199 | x1_ = self.max_pool1(x1_) 200 | x1_f = x1_.view(x1_.size(0), -1) 201 | x1_c = self.classifier1(x1_f) 202 | 203 | x2_ = self.conv_block2(x2) 204 | x2_ = self.max_pool2(x2_) 205 | x2_f = x2_.view(x2_.size(0), -1) 206 | x2_c = self.classifier2(x2_f) 207 | 208 | x3_ = self.conv_block3(x3) 209 | x3_ = self.max_pool3(x3_) 210 | x3_f = x3_.view(x3_.size(0), -1) 211 | x3_c = self.classifier3(x3_f) 212 | 213 | return x1_c, x2_c, x3_c, classifiers, map1, map2, map3 214 | 215 | 216 | def img_add_noise(x, transformation_seq): 217 | x = x.permute(0, 2, 3, 1) 218 | x = x.cpu().numpy() 219 | x = transformation_seq(images=x) 220 | x = torch.from_numpy(x.astype(np.float32)) 221 | x = x.permute(0, 3, 1, 2) 222 | return x 223 | 224 | 225 | def CELoss(x, y): 226 | return smooth_crossentropy(x, y, smoothing=0.1) 227 | 228 | 229 | def inference(batch_size=7, model_path='', num_class=196, data_path='', use_state_dict=False): 230 | use_cuda = torch.cuda.is_available() 231 | print(use_cuda) 232 | 233 | if use_state_dict: 234 | net = torchvision.models.resnet50() 235 | state_dict = load_state_dict_from_url('https://download.pytorch.org/models/resnet50-19c8e357.pth') 236 | net.load_state_dict(state_dict) 237 | fc_features = net.fc.in_features 238 | net.fc = nn.Linear(fc_features, num_class) 239 | 240 | net_layers = list(net.children()) 241 | classifier = net_layers[8:10] 242 | net_layers = net_layers[0:8] 243 | 244 | net_student = Student_Wrapper(net_layers, classifier) 245 | net_student.load_state_dict(torch.load(model_path)) 246 | else: 247 | net_student = torch.load(model_path) 248 | 249 | device = torch.device("cuda") 250 | net_student.to(device) 251 | val_acc_com, val_loss = test(net_student, CELoss, batch_size, data_path + '/test') 252 | print("Validation Accuracy (%): {} | Validation Loss: {}".format(val_acc_com, val_loss)) 253 | 254 | 255 | 256 | if __name__ == '__main__': 257 | data_path = '/Stanford Cars' 258 | 259 | # set model_path as: 260 | # model_path='/Stanford_Cars_ResNet50_Student_Network.pth', or 261 | # model_path='/Stanford_Cars_ResNet50_Student_Weight.pth' 262 | model_path = "" 263 | 264 | model_path_file = model_path.split('/') 265 | model_path_file = model_path_file[-1] 266 | if 'Weight' in model_path_file: 267 | use_state_dict = True 268 | elif 'Network' in model_path_file: 269 | 
        use_state_dict = False
270 |     else:
271 |         raise Exception("Unknown Model "+model_path_file)
272 | 
273 |     inference(batch_size=7,
274 |               model_path=model_path,
275 |               num_class=196,
276 |               data_path = data_path,
277 |               use_state_dict = use_state_dict)
278 | 
279 | 
--------------------------------------------------------------------------------
/Inference_Stanford_Cars_ResNet50_Teacher.py:
--------------------------------------------------------------------------------
1 | from __future__ import print_function
2 | import os
3 | os.environ["CUDA_VISIBLE_DEVICES"] = "0"
4 | import torchvision.models
5 | from sam import SAM
6 | from torch.utils.model_zoo import load_url as load_state_dict_from_url
7 | import imgaug as ia
8 | import imgaug.augmenters as iaa
9 | from vic.loss import CharbonnierLoss
10 | import numpy as np
11 | import torchvision
12 | from torch.autograd import Variable
13 | from torchvision import transforms
14 | from basic_conv import *
15 | from example.model.smooth_cross_entropy import smooth_crossentropy
16 | from example.utility.bypass_bn import enable_running_stats, disable_running_stats
17 | 
18 | def cosine_anneal_schedule(t, nb_epoch, lr):
19 |     cos_inner = np.pi * (t % (nb_epoch))
20 |     cos_inner /= (nb_epoch)
21 |     cos_out = np.cos(cos_inner) + 1
22 | 
23 |     return float(lr / 2 * cos_out)
24 | 
25 | def test(net, criterion, batch_size, test_path):
26 |     net.eval()
27 |     use_cuda = torch.cuda.is_available()
28 |     test_loss = 0
29 |     correct = 0
30 |     correct_com = 0
31 |     total = 0
32 |     idx = 0
33 |     device = torch.device("cuda")
34 | 
35 |     transform_test = transforms.Compose([
36 |         transforms.Resize((550, 550)),
37 |         transforms.CenterCrop(448),
38 |         transforms.ToTensor(),
39 |         transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
40 |     ])
41 |     testset = torchvision.datasets.ImageFolder(root=test_path,
42 |                                                 transform=transform_test)
43 |     testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=True, num_workers=4)
44 | 
45 |     for batch_idx, (inputs, targets) in enumerate(testloader):
46 |         idx = batch_idx
47 |         if use_cuda:
48 |             inputs, targets = inputs.to(device), targets.to(device)
49 |         with torch.no_grad():  # `Variable(..., volatile=True)` was removed from PyTorch; no_grad() skips graph construction during evaluation
50 |             output_1, output_2, output_3, output_ORI, map1, map2, map3 = net(inputs)
51 | 
52 |         outputs_com = output_1 + output_2 + output_3 + output_ORI
53 | 
54 |         loss = criterion(output_ORI, targets).mean()
55 | 
56 |         test_loss += loss.item()
57 |         _, predicted = torch.max(output_ORI.data, 1)
58 |         _, predicted_com = torch.max(outputs_com.data, 1)
59 | 
60 |         total += targets.size(0)
61 |         correct += predicted.eq(targets.data).cpu().sum()
62 |         correct_com += predicted_com.eq(targets.data).cpu().sum()
63 | 
64 |         if batch_idx % 50 == 0:
65 |             print('Step: %d | Loss: %.3f | Combined Acc: %.3f%% (%d/%d)' % (
66 |                 batch_idx, test_loss / (batch_idx + 1),
67 |                 100. * float(correct_com) / total, correct_com, total))
68 | 
69 |     test_acc_en = 100.
* float(correct_com) / total 70 | test_loss = test_loss / (idx + 1) 71 | 72 | return test_acc_en, test_loss 73 | 74 | 75 | 76 | 77 | class Features(nn.Module): 78 | def __init__(self, net_layers): 79 | super(Features, self).__init__() 80 | self.net_layer_0 = nn.Sequential(net_layers[0]) 81 | self.net_layer_1 = nn.Sequential(net_layers[1]) 82 | self.net_layer_2 = nn.Sequential(net_layers[2]) 83 | self.net_layer_3 = nn.Sequential(net_layers[3]) 84 | self.net_layer_4 = nn.Sequential(*net_layers[4]) 85 | self.net_layer_5 = nn.Sequential(*net_layers[5]) 86 | self.net_layer_6 = nn.Sequential(*net_layers[6]) 87 | self.net_layer_7 = nn.Sequential(*net_layers[7]) 88 | 89 | 90 | def forward(self, x): 91 | x = self.net_layer_0(x) 92 | x = self.net_layer_1(x) 93 | x = self.net_layer_2(x) 94 | x = self.net_layer_3(x) 95 | x = self.net_layer_4(x) 96 | x1 = self.net_layer_5(x) 97 | x2 = self.net_layer_6(x1) 98 | x3 = self.net_layer_7(x2) 99 | return x1, x2, x3 100 | 101 | 102 | class Anti_Noise_Decoder(nn.Module): 103 | def __init__(self, scale, in_channel): 104 | super(Anti_Noise_Decoder, self).__init__() 105 | self.Sigmoid = nn.Sigmoid() 106 | 107 | in_channel = in_channel // (scale * scale) 108 | 109 | self.skip = nn.Sequential( 110 | nn.Conv2d(3, 64, 3, 1, 1, bias=False), 111 | nn.LeakyReLU(negative_slope=0.1, inplace=True), 112 | nn.Conv2d(64, 3, 3, 1, 1, bias=False), 113 | nn.LeakyReLU(negative_slope=0.1, inplace=True) 114 | 115 | ) 116 | 117 | self.process = nn.Sequential( 118 | nn.PixelShuffle(scale), 119 | nn.Conv2d(in_channel, 256, 3, 1, 1, bias=False), 120 | nn.LeakyReLU(negative_slope=0.1, inplace=True), 121 | nn.PixelShuffle(2), 122 | nn.Conv2d(64, 128, 3, 1, 1, bias=False), 123 | nn.LeakyReLU(negative_slope=0.1, inplace=True), 124 | nn.PixelShuffle(2), 125 | nn.Conv2d(32, 64, 3, 1, 1, bias=False), 126 | nn.LeakyReLU(negative_slope=0.1, inplace=True), 127 | nn.PixelShuffle(2), 128 | nn.Conv2d(16, 3, 3, 1, 1, bias=False), 129 | nn.LeakyReLU(negative_slope=0.1, inplace=True) 130 | ) 131 | 132 | def forward(self, x, map): 133 | return self.skip(x) + self.process(map) 134 | 135 | class Network_Wrapper(nn.Module): 136 | def __init__(self, net_layers, num_class, classifier): 137 | super().__init__() 138 | self.Features = Features(net_layers) 139 | self.classifier_pool = nn.Sequential(classifier[0]) 140 | self.classifier_initial = nn.Sequential(classifier[1]) 141 | self.sigmoid = nn.Sigmoid() 142 | self.lrelu = nn.LeakyReLU(negative_slope=0.1, inplace=True) 143 | 144 | self.max_pool1 = nn.MaxPool2d(kernel_size=56, stride=1) 145 | self.max_pool2 = nn.MaxPool2d(kernel_size=28, stride=1) 146 | self.max_pool3 = nn.MaxPool2d(kernel_size=14, stride=1) 147 | 148 | self.conv_block1 = nn.Sequential( 149 | BasicConv(512, 512, kernel_size=1, stride=1, padding=0, relu=True), 150 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 151 | ) 152 | self.classifier1 = nn.Sequential( 153 | nn.BatchNorm1d(1024), 154 | nn.Linear(1024, 512), 155 | nn.BatchNorm1d(512), 156 | nn.ELU(inplace=True), 157 | nn.Linear(512, num_class) 158 | ) 159 | 160 | self.conv_block2 = nn.Sequential( 161 | BasicConv(1024, 512, kernel_size=1, stride=1, padding=0, relu=True), 162 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 163 | ) 164 | self.classifier2 = nn.Sequential( 165 | nn.BatchNorm1d(1024), 166 | nn.Linear(1024, 512), 167 | nn.BatchNorm1d(512), 168 | nn.ELU(inplace=True), 169 | nn.Linear(512, num_class), 170 | ) 171 | 172 | self.conv_block3 = nn.Sequential( 173 | BasicConv(2048, 512, 
kernel_size=1, stride=1, padding=0, relu=True), 174 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 175 | ) 176 | self.classifier3 = nn.Sequential( 177 | nn.BatchNorm1d(1024), 178 | nn.Linear(1024, 512), 179 | nn.BatchNorm1d(512), 180 | nn.ELU(inplace=True), 181 | nn.Linear(512, num_class), 182 | ) 183 | 184 | 185 | def forward(self, x): 186 | x1, x2, x3 = self.Features(x) 187 | map1 = x1.clone() 188 | map2 = x2.clone() 189 | map3 = x3.clone() 190 | 191 | classifiers = self.classifier_pool(x3).view(x3.size(0), -1) 192 | classifiers = self.classifier_initial(classifiers) 193 | 194 | x1_ = self.conv_block1(x1) 195 | x1_ = self.max_pool1(x1_) 196 | x1_f = x1_.view(x1_.size(0), -1) 197 | x1_c = self.classifier1(x1_f) 198 | 199 | x2_ = self.conv_block2(x2) 200 | x2_ = self.max_pool2(x2_) 201 | x2_f = x2_.view(x2_.size(0), -1) 202 | x2_c = self.classifier2(x2_f) 203 | 204 | 205 | x3_ = self.conv_block3(x3) 206 | x3_ = self.max_pool3(x3_) 207 | x3_f = x3_.view(x3_.size(0), -1) 208 | x3_c = self.classifier3(x3_f) 209 | 210 | return x1_c, x2_c, x3_c, classifiers, map1, map2, map3 211 | 212 | def img_add_noise(x, transformation_seq): 213 | 214 | x = x.permute(0, 2, 3, 1) 215 | x = x.cpu().numpy() 216 | x = transformation_seq(images=x) 217 | x = torch.from_numpy(x.astype(np.float32)) 218 | x = x.permute(0, 3, 1, 2) 219 | return x 220 | 221 | 222 | def CELoss(x, y): 223 | return smooth_crossentropy(x, y, smoothing=0.1) 224 | 225 | 226 | def inference(batch_size=3, model_path='', num_class=0, data_path='', use_state_dict = False): 227 | 228 | use_cuda = torch.cuda.is_available() 229 | print(use_cuda) 230 | 231 | if use_state_dict: 232 | net = torchvision.models.resnet50() 233 | state_dict = load_state_dict_from_url('https://download.pytorch.org/models/resnet50-19c8e357.pth') 234 | net.load_state_dict(state_dict) 235 | fc_features = net.fc.in_features 236 | net.fc = nn.Linear(fc_features, num_class) 237 | net_layers = list(net.children()) 238 | classifier = net_layers[8:10] 239 | net_layers = net_layers[0:8] 240 | net = Network_Wrapper(net_layers, num_class, classifier) 241 | net.load_state_dict(torch.load(model_path)) 242 | else: 243 | net = torch.load(model_path) 244 | 245 | device = torch.device("cuda") 246 | net.to(device) 247 | 248 | val_acc_com, val_loss = test(net, CELoss, batch_size, data_path+'/test') 249 | print("Validation Accuracy (%): {} | Validation Loss: {}".format(val_acc_com, val_loss)) 250 | 251 | 252 | 253 | if __name__ == '__main__': 254 | data_path = '/Stanford Cars' 255 | 256 | # set model_path as: 257 | # model_path='/Stanford_Cars_ResNet50_Teacher_Network.pth', or 258 | # model_path='/Stanford_Cars_ResNet50_Teacher_Weight.pth' 259 | model_path = "" 260 | 261 | model_path_file = model_path.split('/') 262 | model_path_file = model_path_file[-1] 263 | if 'Weight' in model_path_file: 264 | use_state_dict = True 265 | elif 'Network' in model_path_file: 266 | use_state_dict = False 267 | else: 268 | raise Exception("Unknown Model "+model_path_file) 269 | 270 | inference(batch_size=7, 271 | model_path=model_path, 272 | num_class=196, 273 | data_path = data_path, 274 | use_state_dict = use_state_dict) 275 | 276 | -------------------------------------------------------------------------------- /Inference_Stanford_Cars_TResNet_L_Student.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | import os 3 | os.environ["CUDA_VISIBLE_DEVICES"] = "0" 4 | import torchvision.models 5 | from sam import 
SAM
6 | from torch.utils.model_zoo import load_url as load_state_dict_from_url
7 | import numpy as np
8 | import torchvision
9 | from torch.autograd import Variable
10 | from torchvision import transforms
11 | from basic_conv import *
12 | from example.model.smooth_cross_entropy import smooth_crossentropy
13 | from example.utility.bypass_bn import enable_running_stats, disable_running_stats
14 | from src.models.tresnet_v2.tresnet_v2 import TResnetL_V2 as TResnetL368
15 | import requests
16 | 
17 | 
18 | class Student_Wrapper(nn.Module):
19 |     def __init__(self, net_layers, classifier):
20 |         super(Student_Wrapper, self).__init__()
21 |         self.net_layer_0 = nn.Sequential(net_layers[0])
22 |         self.net_layer_1 = nn.Sequential(*net_layers[1])
23 |         self.net_layer_2 = nn.Sequential(*net_layers[2])
24 |         self.net_layer_3 = nn.Sequential(*net_layers[3])
25 |         self.net_layer_4 = nn.Sequential(*net_layers[4])
26 |         self.net_layer_5 = nn.Sequential(*net_layers[5])
27 | 
28 |         self.classifier_pool = nn.Sequential(classifier[0])
29 |         self.classifier_initial = nn.Sequential(classifier[1])
30 | 
31 |     def forward(self, x):
32 |         x = self.net_layer_0(x)
33 |         x = self.net_layer_1(x)
34 |         x = self.net_layer_2(x)
35 |         x1 = self.net_layer_3(x)
36 |         x2 = self.net_layer_4(x1)
37 |         x3 = self.net_layer_5(x2)
38 | 
39 | 
40 |         classifiers = self.classifier_pool(x3).view(x3.size(0), -1)
41 |         out = self.classifier_initial(classifiers)
42 | 
43 |         return out, x1, x2, x3
44 | 
45 | 
46 | def cosine_anneal_schedule(t, nb_epoch, lr):
47 |     cos_inner = np.pi * (t % (nb_epoch))  # t - 1 is used when t has 1-based indexing.
48 |     cos_inner /= (nb_epoch)
49 |     cos_out = np.cos(cos_inner) + 1
50 | 
51 |     return float(lr / 2 * cos_out)
52 | 
53 | 
54 | def test(net, criterion, batch_size, test_path):
55 |     net.eval()
56 |     use_cuda = torch.cuda.is_available()
57 |     test_loss = 0
58 |     correct = 0
59 |     total = 0
60 |     idx = 0
61 |     device = torch.device("cuda")
62 | 
63 |     transform_test = transforms.Compose([
64 |         transforms.Resize((421, 421)),
65 |         transforms.CenterCrop(368),
66 |         transforms.ToTensor(),
67 |         transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
68 |     ])
69 |     testset = torchvision.datasets.ImageFolder(root=test_path,
70 |                                                 transform=transform_test)
71 |     testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=True, num_workers=4)
72 | 
73 |     for batch_idx, (inputs, targets) in enumerate(testloader):
74 |         idx = batch_idx
75 |         if use_cuda:
76 |             inputs, targets = inputs.to(device), targets.to(device)
77 |         with torch.no_grad():  # `Variable(..., volatile=True)` was removed from PyTorch; no_grad() skips graph construction during evaluation
78 |             output, _, _, _ = net(inputs)
79 | 
80 |         loss = criterion(output, targets).mean()
81 | 
82 |         test_loss += loss.item()
83 |         _, predicted = torch.max(output.data, 1)
84 | 
85 |         total += targets.size(0)
86 |         correct += predicted.eq(targets.data).cpu().sum()
87 | 
88 |         if batch_idx % 50 == 0:
89 |             print('Step: %d | Loss: %.3f | Acc: %.3f%% (%d/%d)' % (
90 |                 batch_idx, test_loss / (batch_idx + 1),
91 |                 100. * float(correct) / total, correct, total))
92 | 
93 |     test_acc_en = 100.
* float(correct) / total 94 | test_loss = test_loss / (idx + 1) 95 | 96 | net.train() 97 | 98 | return test_acc_en, test_loss 99 | 100 | 101 | 102 | class Features(nn.Module): 103 | def __init__(self, net_layers): 104 | super(Features, self).__init__() 105 | self.net_layer_0 = nn.Sequential(net_layers[0]) 106 | self.net_layer_1 = nn.Sequential(*net_layers[1]) 107 | self.net_layer_2 = nn.Sequential(*net_layers[2]) 108 | self.net_layer_3 = nn.Sequential(*net_layers[3]) 109 | self.net_layer_4 = nn.Sequential(*net_layers[4]) 110 | self.net_layer_5 = nn.Sequential(*net_layers[5]) 111 | 112 | def forward(self, x): 113 | x = self.net_layer_0(x) 114 | x = self.net_layer_1(x) 115 | x = self.net_layer_2(x) 116 | x1 = self.net_layer_3(x) 117 | x2 = self.net_layer_4(x1) 118 | x3 = self.net_layer_5(x2) 119 | 120 | return x1, x2, x3 121 | 122 | 123 | class Network_Wrapper(nn.Module): 124 | def __init__(self, net_layers, num_classes, classifier): 125 | super().__init__() 126 | self.Features = Features(net_layers) 127 | self.classifier_pool = nn.Sequential(classifier[0]) 128 | self.classifier_initial = nn.Sequential(classifier[1]) 129 | self.sigmoid = nn.Sigmoid() 130 | self.lrelu = nn.LeakyReLU(negative_slope=0.1, inplace=True) 131 | 132 | self.max_pool1 = nn.MaxPool2d(kernel_size=46, stride=1) 133 | self.max_pool2 = nn.MaxPool2d(kernel_size=23, stride=1) 134 | self.max_pool3 = nn.MaxPool2d(kernel_size=12, stride=1) 135 | 136 | self.conv_block1 = nn.Sequential( 137 | BasicConv(512, 512, kernel_size=1, stride=1, padding=0, relu=True), 138 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 139 | ) 140 | self.classifier1 = nn.Sequential( 141 | nn.BatchNorm1d(1024), 142 | nn.Linear(1024, 512), 143 | nn.BatchNorm1d(512), 144 | nn.ELU(inplace=True), 145 | nn.Linear(512, num_classes) 146 | ) 147 | 148 | self.conv_block2 = nn.Sequential( 149 | BasicConv(1024, 512, kernel_size=1, stride=1, padding=0, relu=True), 150 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 151 | ) 152 | self.classifier2 = nn.Sequential( 153 | nn.BatchNorm1d(1024), 154 | nn.Linear(1024, 512), 155 | nn.BatchNorm1d(512), 156 | nn.ELU(inplace=True), 157 | nn.Linear(512, num_classes), 158 | ) 159 | 160 | self.conv_block3 = nn.Sequential( 161 | BasicConv(2048, 512, kernel_size=1, stride=1, padding=0, relu=True), 162 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 163 | ) 164 | self.classifier3 = nn.Sequential( 165 | nn.BatchNorm1d(1024), 166 | nn.Linear(1024, 512), 167 | nn.BatchNorm1d(512), 168 | nn.ELU(inplace=True), 169 | nn.Linear(512, num_classes), 170 | ) 171 | 172 | def forward(self, x): 173 | x1, x2, x3 = self.Features(x) 174 | map1 = x1.clone() 175 | map2 = x2.clone() 176 | map3 = x3.clone() 177 | 178 | classifiers = self.classifier_pool(x3).view(x3.size(0), -1) 179 | classifiers = self.classifier_initial(classifiers) 180 | 181 | x1_ = self.conv_block1(x1) 182 | x1_ = self.max_pool1(x1_) 183 | x1_f = x1_.view(x1_.size(0), -1) 184 | 185 | x1_c = self.classifier1(x1_f) 186 | 187 | x2_ = self.conv_block2(x2) 188 | x2_ = self.max_pool2(x2_) 189 | x2_f = x2_.view(x2_.size(0), -1) 190 | x2_c = self.classifier2(x2_f) 191 | 192 | x3_ = self.conv_block3(x3) 193 | x3_ = self.max_pool3(x3_) 194 | x3_f = x3_.view(x3_.size(0), -1) 195 | x3_c = self.classifier3(x3_f) 196 | 197 | return x1_c, x2_c, x3_c, classifiers, map1, map2, map3 198 | 199 | 200 | def img_add_noise(x, transformation_seq): 201 | x = x.permute(0, 2, 3, 1) 202 | x = x.cpu().numpy() 203 | x = transformation_seq(images=x) 
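    # imgaug's augmenters expect NumPy arrays in NHWC layout; the remaining lines
    # convert the augmented batch back to a float32 NCHW tensor for PyTorch.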
204 | x = torch.from_numpy(x.astype(np.float32)) 205 | x = x.permute(0, 3, 1, 2) 206 | return x 207 | 208 | 209 | def CELoss(x, y): 210 | return smooth_crossentropy(x, y, smoothing=0.1) 211 | 212 | 213 | 214 | def inference(batch_size=7, model_path='', num_class=196, data_path='', use_state_dict=False): 215 | 216 | use_cuda = torch.cuda.is_available() 217 | print(use_cuda) 218 | 219 | 220 | if use_state_dict: 221 | model_params = {'num_classes': num_class} 222 | model = TResnetL368(model_params) 223 | weights_url = \ 224 | 'https://miil-public-eu.oss-eu-central-1.aliyuncs.com/model-zoo/tresnet/stanford_cars_tresnet-l-v2_96_27.pth' 225 | weights_path = "tresnet-l-v2.pth" 226 | 227 | if not os.path.exists(weights_path): 228 | print('downloading weights...') 229 | r = requests.get(weights_url) 230 | with open(weights_path, "wb") as code: 231 | code.write(r.content) 232 | pretrained_weights = torch.load(weights_path) 233 | model.load_state_dict(pretrained_weights['model']) 234 | 235 | net_layers = list(model.children()) 236 | classifier = net_layers[1:3] 237 | net_layers = net_layers[0] 238 | net_layers = list(net_layers.children()) 239 | 240 | net_student = Student_Wrapper(net_layers, classifier) 241 | net_student.load_state_dict(torch.load(model_path)) 242 | else: 243 | net_student = torch.load(model_path) 244 | 245 | device = torch.device("cuda") 246 | net_student.to(device) 247 | 248 | val_acc_com, val_loss = test(net_student, CELoss, batch_size, data_path + '/test') 249 | print("Validation Accuracy (%): {} | Validation Loss: {}".format(val_acc_com, val_loss)) 250 | 251 | 252 | 253 | 254 | 255 | if __name__ == '__main__': 256 | 257 | data_path = '/home/liu/Downloads/Experiments/Classification/Datasets/car_ims_organized' 258 | 259 | # set model_path as: 260 | # model_path='/Stanford_Cars_TResNet-L_Student_Network.pth', or 261 | # model_path='/Stanford_Cars_TResNet-L_Student_Weight.pth' 262 | model_path = 'results/Stanford_Cars_TResNet_L_Distillation/model_Network.pth' 263 | # model_path = "/mnt/ssd/LIU/SSS/Upload_code_for_submission/Upload_RealPrepareReady/results/Stanford_Cars_TResNet_L_Distillation/Stanford_Cars_TResNet-L_Student_Network.pth" 264 | 265 | model_path_file = model_path.split('/') 266 | model_path_file = model_path_file[-1] 267 | if 'Weight' in model_path_file: 268 | use_state_dict = True 269 | elif 'Network' in model_path_file: 270 | use_state_dict = False 271 | else: 272 | raise Exception("Unknown Model "+model_path_file) 273 | 274 | inference(batch_size=7, 275 | model_path=model_path, 276 | num_class=196, 277 | data_path = data_path, 278 | use_state_dict = use_state_dict) 279 | -------------------------------------------------------------------------------- /Inference_Stanford_Cars_TResNet_L_Teacher.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | import os 3 | 4 | os.environ["CUDA_VISIBLE_DEVICES"] = "1" 5 | import torchvision.models 6 | from sam import SAM 7 | from torch.utils.model_zoo import load_url as load_state_dict_from_url 8 | import imgaug as ia 9 | import imgaug.augmenters as iaa 10 | from vic.loss import CharbonnierLoss 11 | import numpy as np 12 | import torchvision 13 | from torch.autograd import Variable 14 | from torchvision import transforms 15 | from basic_conv import * 16 | from example.model.smooth_cross_entropy import smooth_crossentropy 17 | from example.utility.bypass_bn import enable_running_stats, disable_running_stats 18 | from src.models.tresnet_v2.tresnet_v2 import 
TResnetL_V2 as TResnetL368 19 | import requests 20 | import torch.nn.functional as F 21 | 22 | 23 | def cosine_anneal_schedule(t, nb_epoch, lr): 24 | cos_inner = np.pi * (t % (nb_epoch)) 25 | cos_inner /= (nb_epoch) 26 | cos_out = np.cos(cos_inner) + 1 27 | 28 | return float(lr / 2 * cos_out) 29 | 30 | 31 | def test(net, criterion, batch_size, test_path): 32 | net.eval() 33 | use_cuda = torch.cuda.is_available() 34 | test_loss = 0 35 | correct = 0 36 | correct_com = 0 37 | total = 0 38 | idx = 0 39 | device = torch.device("cuda") 40 | 41 | transform_test = transforms.Compose([ 42 | transforms.Resize((421, 421)), 43 | transforms.CenterCrop(368), 44 | transforms.ToTensor(), 45 | transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)), 46 | ]) 47 | testset = torchvision.datasets.ImageFolder(root=test_path, 48 | transform=transform_test) 49 | testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=True, num_workers=4) 50 | 51 | for batch_idx, (inputs, targets) in enumerate(testloader): 52 | idx = batch_idx 53 | if use_cuda: 54 | inputs, targets = inputs.to(device), targets.to(device) 55 | with torch.no_grad(): 56 | inputs, targets = Variable(inputs), Variable(targets) 57 | output_1, output_2, output_3, output_ORI, _, _, _ = net(inputs) 58 | 59 | outputs_com = output_1.cpu() + output_2.cpu() + output_3.cpu() + output_ORI.cpu() 60 | 61 | loss = criterion(output_ORI, targets).mean().cpu() 62 | 63 | test_loss += loss.cpu().item() 64 | _, predicted = torch.max(output_ORI.data.cpu(), 1) 65 | _, predicted_com = torch.max(outputs_com.data.cpu(), 1) 66 | 67 | total += targets.size(0) 68 | correct += predicted.eq(targets.data.cpu()).cpu().sum() 69 | correct_com += predicted_com.eq(targets.data.cpu()).cpu().sum() 70 | 71 | if batch_idx % 50 == 0: 72 | print('Step: %d | Loss: %.3f |Combined Acc: %.3f%% (%d/%d)' % ( 73 | batch_idx, test_loss / (batch_idx + 1), 74 | 100. * float(correct_com) / total, correct_com, total)) 75 | 76 | test_acc_en = 100. 
* float(correct_com) / total 77 | test_loss = test_loss / (idx + 1) 78 | del inputs 79 | del loss 80 | del targets 81 | del output_1 82 | del output_2 83 | del output_3 84 | del output_ORI 85 | torch.cuda.empty_cache() 86 | 87 | return test_acc_en, test_loss 88 | 89 | 90 | 91 | class Features(nn.Module): 92 | def __init__(self, net_layers_FeatureHead): 93 | super(Features, self).__init__() 94 | self.net_layer_0 = nn.Sequential(net_layers_FeatureHead[0]) 95 | self.net_layer_1 = nn.Sequential(*net_layers_FeatureHead[1]) 96 | self.net_layer_2 = nn.Sequential(*net_layers_FeatureHead[2]) 97 | self.net_layer_3 = nn.Sequential(*net_layers_FeatureHead[3]) 98 | self.net_layer_4 = nn.Sequential(*net_layers_FeatureHead[4]) 99 | self.net_layer_5 = nn.Sequential(*net_layers_FeatureHead[5]) 100 | 101 | def forward(self, x): 102 | x = self.net_layer_0(x) 103 | x = self.net_layer_1(x) 104 | x = self.net_layer_2(x) 105 | x1 = self.net_layer_3(x) 106 | x2 = self.net_layer_4(x1) 107 | x3 = self.net_layer_5(x2) 108 | 109 | return x1, x2, x3 110 | 111 | 112 | class Network_Wrapper(nn.Module): 113 | def __init__(self, net_layers, num_classes, classifier): 114 | super().__init__() 115 | self.Features = Features(net_layers) 116 | self.classifier_pool = nn.Sequential(classifier[0]) 117 | self.classifier_initial = nn.Sequential(classifier[1]) 118 | self.sigmoid = nn.Sigmoid() 119 | self.lrelu = nn.LeakyReLU(negative_slope=0.1, inplace=True) 120 | 121 | self.max_pool1 = nn.MaxPool2d(kernel_size=46, stride=1) 122 | self.max_pool2 = nn.MaxPool2d(kernel_size=23, stride=1) 123 | self.max_pool3 = nn.MaxPool2d(kernel_size=12, stride=1) 124 | 125 | self.conv_block1 = nn.Sequential( 126 | BasicConv(512, 512, kernel_size=1, stride=1, padding=0, relu=True), 127 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 128 | ) 129 | self.classifier1 = nn.Sequential( 130 | nn.BatchNorm1d(1024), 131 | nn.Linear(1024, 512), 132 | nn.BatchNorm1d(512), 133 | nn.ELU(inplace=True), 134 | nn.Linear(512, num_classes) 135 | ) 136 | 137 | self.conv_block2 = nn.Sequential( 138 | BasicConv(1024, 512, kernel_size=1, stride=1, padding=0, relu=True), 139 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 140 | ) 141 | self.classifier2 = nn.Sequential( 142 | nn.BatchNorm1d(1024), 143 | nn.Linear(1024, 512), 144 | nn.BatchNorm1d(512), 145 | nn.ELU(inplace=True), 146 | nn.Linear(512, num_classes), 147 | ) 148 | 149 | self.conv_block3 = nn.Sequential( 150 | BasicConv(2048, 512, kernel_size=1, stride=1, padding=0, relu=True), 151 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 152 | ) 153 | self.classifier3 = nn.Sequential( 154 | nn.BatchNorm1d(1024), 155 | nn.Linear(1024, 512), 156 | nn.BatchNorm1d(512), 157 | nn.ELU(inplace=True), 158 | nn.Linear(512, num_classes), 159 | ) 160 | 161 | def forward(self, x): 162 | x1, x2, x3 = self.Features(x) 163 | map1 = x1.clone() 164 | map2 = x2.clone() 165 | map3 = x3.clone() 166 | 167 | classifiers = self.classifier_pool(x3).view(x3.size(0), -1) 168 | classifiers = self.classifier_initial(classifiers) 169 | 170 | x1_ = self.conv_block1(x1) 171 | x1_ = self.max_pool1(x1_) 172 | x1_f = x1_.view(x1_.size(0), -1) 173 | 174 | x1_c = self.classifier1(x1_f) 175 | 176 | x2_ = self.conv_block2(x2) 177 | x2_ = self.max_pool2(x2_) 178 | x2_f = x2_.view(x2_.size(0), -1) 179 | x2_c = self.classifier2(x2_f) 180 | 181 | x3_ = self.conv_block3(x3) 182 | x3_ = self.max_pool3(x3_) 183 | x3_f = x3_.view(x3_.size(0), -1) 184 | x3_c = self.classifier3(x3_f) 185 | 186 | return 
x1_c, x2_c, x3_c, classifiers, map1, map2, map3
187 | 
188 | 
189 | class Anti_Noise_Decoder(nn.Module):
190 |     def __init__(self, scale, in_channel):
191 |         super(Anti_Noise_Decoder, self).__init__()
192 |         self.Sigmoid = nn.Sigmoid()
193 | 
194 |         in_channel = in_channel // (scale * scale)
195 | 
196 |         self.skip = nn.Sequential(
197 |             nn.Conv2d(3, 64, 3, 1, 1, bias=False),
198 |             nn.LeakyReLU(negative_slope=0.1, inplace=True),
199 |             nn.Conv2d(64, 3, 3, 1, 1, bias=False),
200 |             nn.LeakyReLU(negative_slope=0.1, inplace=True)
201 | 
202 |         )
203 | 
204 |         self.process = nn.Sequential(
205 |             nn.PixelShuffle(scale),
206 |             nn.Conv2d(in_channel, 256, 3, 1, 1, bias=False),
207 |             nn.LeakyReLU(negative_slope=0.1, inplace=True),
208 |             nn.PixelShuffle(2),
209 |             nn.Conv2d(64, 128, 3, 1, 1, bias=False),
210 |             nn.LeakyReLU(negative_slope=0.1, inplace=True),
211 |             nn.PixelShuffle(2),
212 |             nn.Conv2d(32, 64, 3, 1, 1, bias=False),
213 |             nn.LeakyReLU(negative_slope=0.1, inplace=True),
214 |             nn.PixelShuffle(2),
215 |             nn.Conv2d(16, 3, 3, 1, 1, bias=False),
216 |             nn.LeakyReLU(negative_slope=0.1, inplace=True)
217 |         )
218 | 
219 |     def forward(self, x, map):
220 |         x_ = self.process(map)
221 |         if not (x.size() == x_.size()):
222 |             x_ = F.interpolate(x_, (x.size(2), x.size(3)), mode='bilinear')  # resize the decoded map to match the input; the original interpolated `x` to its own size, discarding `x_`
223 |         return self.skip(x) + x_
224 | 
225 | 
226 | def img_add_noise(x, transformation_seq):
227 |     x = x.permute(0, 2, 3, 1)
228 |     x = x.cpu().numpy()
229 |     x = transformation_seq(images=x)
230 |     x = torch.from_numpy(x.astype(np.float32))
231 |     x = x.permute(0, 3, 1, 2)
232 |     return x
233 | 
234 | 
235 | def CELoss(x, y):
236 |     return smooth_crossentropy(x, y, smoothing=0.1)
237 | 
238 | 
239 | def inference(batch_size=3, model_path='', num_class=0, data_path='', use_state_dict = False):
240 | 
241 |     use_cuda = torch.cuda.is_available()
242 |     print(use_cuda)
243 | 
244 | 
245 |     if use_state_dict:
246 |         model_params = {'num_classes': num_class}
247 | 
248 |         model = TResnetL368(model_params)
249 |         weights_url = \
250 |             'https://miil-public-eu.oss-eu-central-1.aliyuncs.com/model-zoo/tresnet/stanford_cars_tresnet-l-v2_96_27.pth'
251 |         weights_path = "tresnet-l-v2.pth"
252 | 
253 |         if not os.path.exists(weights_path):
254 |             print('downloading weights...')
255 |             r = requests.get(weights_url)
256 |             with open(weights_path, "wb") as code:
257 |                 code.write(r.content)
258 |         pretrained_weights = torch.load(weights_path)
259 |         model.load_state_dict(pretrained_weights['model'])
260 | 
261 |         net_layers = list(model.children())
262 |         classifier = net_layers[1:3]
263 |         net_layers = net_layers[0]
264 |         net_layers = list(net_layers.children())
265 |         net = Network_Wrapper(net_layers, num_class, classifier)
266 |         net.load_state_dict(torch.load(model_path))
267 |     else:
268 |         net = torch.load(model_path)
269 | 
270 |     device = torch.device("cuda")
271 |     net.to(device)
272 | 
273 |     val_acc_com, val_loss = test(net, CELoss, batch_size, data_path + '/test')
274 |     print("Validation Accuracy (%): {} | Validation Loss: {}".format(val_acc_com, val_loss))
275 | 
276 | 
277 | 
278 | if __name__ == '__main__':
279 |     data_path = '/mnt/ssd/LIU/car_ims_organized'
280 | 
281 |     # set model_path as:
282 |     # model_path='/Stanford_Cars_TResNet-L_Teacher_Network.pth', or
283 |     # model_path='/Stanford_Cars_TResNet-L_Teacher_Weight.pth'
284 |     model_path = "/mnt/ssd/LIU/SSS/Upload_code_for_submission/Upload_RealPrepareReady/weightsFromCloud/Stanford_Cars_TResNet-L_Teacher_Network.pth"
285 | 
286 |     model_path_file = model_path.split('/')
287 |     model_path_file = model_path_file[-1]
288 |     if 'Weight' in
model_path_file: 289 | use_state_dict = True 290 | elif 'Network' in model_path_file: 291 | use_state_dict = False 292 | else: 293 | raise Exception("Unknown Model "+model_path_file) 294 | 295 | inference(batch_size=7, 296 | model_path=model_path, 297 | num_class=196, 298 | data_path = data_path, 299 | use_state_dict = use_state_dict) 300 | 301 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. 
For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. 
You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 
202 | -------------------------------------------------------------------------------- /Stanford_Cars_TResNet_L_Distillation.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | import os 3 | os.environ["CUDA_VISIBLE_DEVICES"] = "1" 4 | import torchvision.models 5 | from sam import SAM 6 | from torch.utils.model_zoo import load_url as load_state_dict_from_url 7 | import numpy as np 8 | import torchvision 9 | from torch.autograd import Variable 10 | from torchvision import transforms 11 | from basic_conv import * 12 | from example.model.smooth_cross_entropy import smooth_crossentropy 13 | from example.utility.bypass_bn import enable_running_stats, disable_running_stats 14 | from src.models.tresnet_v2.tresnet_v2 import TResnetL_V2 as TResnetL368 15 | import requests 16 | 17 | import argparse 18 | parser = argparse.ArgumentParser(description='train_student') 19 | parser.add_argument('--from_local', required=False, action='store_true', 20 | help="By default, this is set to use the pre-trained parameters downloaded from the cloud as the teacher model. " 21 | "Set this flag to True to use the locally trained parameters instead.") 22 | 23 | args, unparsed = parser.parse_known_args() 24 | 25 | 26 | 27 | class Student_Wrapper(nn.Module): 28 | def __init__(self, net_layers, classifier): 29 | super(Student_Wrapper, self).__init__() 30 | self.net_layer_0 = nn.Sequential(net_layers[0]) 31 | self.net_layer_1 = nn.Sequential(*net_layers[1]) 32 | self.net_layer_2 = nn.Sequential(*net_layers[2]) 33 | self.net_layer_3 = nn.Sequential(*net_layers[3]) 34 | self.net_layer_4 = nn.Sequential(*net_layers[4]) 35 | self.net_layer_5 = nn.Sequential(*net_layers[5]) 36 | 37 | self.classifier_pool = nn.Sequential(classifier[0]) 38 | self.classifier_initial = nn.Sequential(classifier[1]) 39 | 40 | def forward(self, x): 41 | x = self.net_layer_0(x) 42 | x = self.net_layer_1(x) 43 | x = self.net_layer_2(x) 44 | x1 = self.net_layer_3(x) 45 | x2 = self.net_layer_4(x1) 46 | x3 = self.net_layer_5(x2) 47 | 48 | 49 | classifiers = self.classifier_pool(x3).view(x3.size(0), -1) 50 | out = self.classifier_initial(classifiers) 51 | 52 | return out, x1, x2, x3 53 | 54 | 55 | def cosine_anneal_schedule(t, nb_epoch, lr): 56 | cos_inner = np.pi * (t % (nb_epoch)) # t - 1 is used when t has 1-based indexing. 
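# Note (added annotation): this works out to lr/2 * (1 + cos(pi * (t % nb_epoch) / nb_epoch)),
# a smooth cosine decay from `lr` towards 0 over one run. Illustrative values (not from the
# paper), with lr=0.002 and nb_epoch=200: t=0 -> 0.002, t=100 -> 0.001, t=199 -> ~5e-7.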
57 | cos_inner /= (nb_epoch) 58 | cos_out = np.cos(cos_inner) + 1 59 | 60 | return float(lr / 2 * cos_out) 61 | 62 | 63 | def test(net, criterion, batch_size, test_path): 64 | net.eval() 65 | use_cuda = torch.cuda.is_available() 66 | test_loss = 0 67 | correct = 0 68 | total = 0 69 | idx = 0 70 | device = torch.device("cuda") 71 | 72 | transform_test = transforms.Compose([ 73 | transforms.Resize((421, 421)), 74 | transforms.CenterCrop(368), 75 | transforms.ToTensor(), 76 | transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)), 77 | ]) 78 | testset = torchvision.datasets.ImageFolder(root=test_path, 79 | transform=transform_test) 80 | testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=True, num_workers=4) 81 | 82 | for batch_idx, (inputs, targets) in enumerate(testloader): 83 | idx = batch_idx 84 | if use_cuda: 85 | inputs, targets = inputs.to(device), targets.to(device) 86 | inputs, targets = Variable(inputs, volatile=True), Variable(targets) 87 | output, _, _, _ = net(inputs) 88 | 89 | loss = criterion(output, targets).mean() 90 | 91 | test_loss += loss.item() 92 | _, predicted = torch.max(output.data, 1) 93 | 94 | total += targets.size(0) 95 | correct += predicted.eq(targets.data).cpu().sum() 96 | 97 | if batch_idx % 50 == 0: 98 | print('Step: %d | Loss: %.3f |Combined Acc: %.3f%% (%d/%d)' % ( 99 | batch_idx, test_loss / (batch_idx + 1), 100 | 100. * float(correct) / total, correct, total)) 101 | 102 | test_acc_en = 100. * float(correct) / total 103 | test_loss = test_loss / (idx + 1) 104 | 105 | net.train() 106 | 107 | return test_acc_en, test_loss 108 | 109 | 110 | class Features(nn.Module): 111 | def __init__(self, net_layers): 112 | super(Features, self).__init__() 113 | self.net_layer_0 = nn.Sequential(net_layers[0]) 114 | self.net_layer_1 = nn.Sequential(*net_layers[1]) 115 | self.net_layer_2 = nn.Sequential(*net_layers[2]) 116 | self.net_layer_3 = nn.Sequential(*net_layers[3]) 117 | self.net_layer_4 = nn.Sequential(*net_layers[4]) 118 | self.net_layer_5 = nn.Sequential(*net_layers[5]) 119 | 120 | def forward(self, x): 121 | x = self.net_layer_0(x) 122 | x = self.net_layer_1(x) 123 | x = self.net_layer_2(x) 124 | x1 = self.net_layer_3(x) 125 | x2 = self.net_layer_4(x1) 126 | x3 = self.net_layer_5(x2) 127 | 128 | return x1, x2, x3 129 | 130 | 131 | class Network_Wrapper(nn.Module): 132 | def __init__(self, net_layers, num_classes, classifier): 133 | super().__init__() 134 | self.Features = Features(net_layers) 135 | self.classifier_pool = nn.Sequential(classifier[0]) 136 | self.classifier_initial = nn.Sequential(classifier[1]) 137 | self.sigmoid = nn.Sigmoid() 138 | self.lrelu = nn.LeakyReLU(negative_slope=0.1, inplace=True) 139 | 140 | self.max_pool1 = nn.MaxPool2d(kernel_size=46, stride=1) 141 | self.max_pool2 = nn.MaxPool2d(kernel_size=23, stride=1) 142 | self.max_pool3 = nn.MaxPool2d(kernel_size=12, stride=1) 143 | 144 | self.conv_block1 = nn.Sequential( 145 | BasicConv(512, 512, kernel_size=1, stride=1, padding=0, relu=True), 146 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 147 | ) 148 | self.classifier1 = nn.Sequential( 149 | nn.BatchNorm1d(1024), 150 | nn.Linear(1024, 512), 151 | nn.BatchNorm1d(512), 152 | nn.ELU(inplace=True), 153 | nn.Linear(512, num_classes) 154 | ) 155 | 156 | self.conv_block2 = nn.Sequential( 157 | BasicConv(1024, 512, kernel_size=1, stride=1, padding=0, relu=True), 158 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 159 | ) 160 | self.classifier2 = nn.Sequential( 161 | 
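# (same head design as classifier1: BatchNorm1d -> Linear(1024, 512) -> BatchNorm1d -> ELU -> Linear(512, num_classes))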
nn.BatchNorm1d(1024), 162 | nn.Linear(1024, 512), 163 | nn.BatchNorm1d(512), 164 | nn.ELU(inplace=True), 165 | nn.Linear(512, num_classes), 166 | ) 167 | 168 | self.conv_block3 = nn.Sequential( 169 | BasicConv(2048, 512, kernel_size=1, stride=1, padding=0, relu=True), 170 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 171 | ) 172 | self.classifier3 = nn.Sequential( 173 | nn.BatchNorm1d(1024), 174 | nn.Linear(1024, 512), 175 | nn.BatchNorm1d(512), 176 | nn.ELU(inplace=True), 177 | nn.Linear(512, num_classes), 178 | ) 179 | 180 | def forward(self, x): 181 | x1, x2, x3 = self.Features(x) 182 | map1 = x1.clone() 183 | map2 = x2.clone() 184 | map3 = x3.clone() 185 | 186 | classifiers = self.classifier_pool(x3).view(x3.size(0), -1) 187 | classifiers = self.classifier_initial(classifiers) 188 | 189 | x1_ = self.conv_block1(x1) 190 | x1_ = self.max_pool1(x1_) 191 | x1_f = x1_.view(x1_.size(0), -1) 192 | 193 | x1_c = self.classifier1(x1_f) 194 | 195 | x2_ = self.conv_block2(x2) 196 | x2_ = self.max_pool2(x2_) 197 | x2_f = x2_.view(x2_.size(0), -1) 198 | x2_c = self.classifier2(x2_f) 199 | 200 | x3_ = self.conv_block3(x3) 201 | x3_ = self.max_pool3(x3_) 202 | x3_f = x3_.view(x3_.size(0), -1) 203 | x3_c = self.classifier3(x3_f) 204 | 205 | return x1_c, x2_c, x3_c, classifiers, map1, map2, map3 206 | 207 | 208 | def img_add_noise(x, transformation_seq): 209 | x = x.permute(0, 2, 3, 1) 210 | x = x.cpu().numpy() 211 | x = transformation_seq(images=x) 212 | x = torch.from_numpy(x.astype(np.float32)) 213 | x = x.permute(0, 3, 1, 2) 214 | return x 215 | 216 | 217 | def CELoss(x, y): 218 | return smooth_crossentropy(x, y, smoothing=0.1) 219 | 220 | 221 | def train(nb_epoch, batch_size, store_name, num_class=1, start_epoch=0, data_path=''): 222 | MSE = nn.MSELoss() 223 | 224 | exp_dir = store_name 225 | try: 226 | os.stat(exp_dir) 227 | except: 228 | os.makedirs(exp_dir) 229 | 230 | use_cuda = torch.cuda.is_available() 231 | print(use_cuda) 232 | 233 | print('==> Preparing data..') 234 | transform_train = transforms.Compose([ 235 | transforms.Resize((421, 421)), 236 | transforms.RandomCrop(368, padding=8), 237 | transforms.RandomHorizontalFlip(), 238 | transforms.ToTensor(), 239 | transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)), 240 | ]) 241 | trainset = torchvision.datasets.ImageFolder(root=data_path + '/train', transform=transform_train) 242 | trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=1) 243 | 244 | model_params = {'num_classes': num_class} 245 | model = TResnetL368(model_params) 246 | weights_url = \ 247 | 'https://miil-public-eu.oss-eu-central-1.aliyuncs.com/model-zoo/tresnet/stanford_cars_tresnet-l-v2_96_27.pth' 248 | weights_path = "tresnet-l-v2.pth" 249 | 250 | if not os.path.exists(weights_path): 251 | print('downloading weights...') 252 | r = requests.get(weights_url) 253 | with open(weights_path, "wb") as code: 254 | code.write(r.content) 255 | pretrained_weights = torch.load(weights_path) 256 | model.load_state_dict(pretrained_weights['model']) 257 | 258 | net_layers = list(model.children()) 259 | classifier = net_layers[1:3] 260 | net_layers = net_layers[0] 261 | net_layers = list(net_layers.children()) 262 | 263 | net_student = Student_Wrapper(net_layers, classifier) 264 | 265 | if not args.from_local: 266 | net_teacher = torch.load("weightsFromCloud/Stanford_Cars_TResNet-L_Teacher_Network.pth") 267 | else: 268 | net_teacher = torch.load('results/Stanford_Cars_TResNet_L_PMAL/model.pth') 269 | 270 | 271 | 272 
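# The training loop below applies SAM's two forward-backward passes to every
# sub-loss. A minimal sketch of that pattern (generic names, hypothetical
# `model`/`loss_fn`/`x`/`y`; the optimizer itself comes from https://github.com/davda54/sam):
#
#     enable_running_stats(model)             # update BN statistics on pass 1 only
#     loss_fn(model(x), y).backward()
#     optimizer.first_step(zero_grad=True)    # perturb weights towards the local worst case
#
#     disable_running_stats(model)            # freeze BN statistics on pass 2
#     loss_fn(model(x), y).backward()
#     optimizer.second_step(zero_grad=True)   # restore weights, apply the base SGD update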
| 273 | base_optimizer = torch.optim.SGD 274 | 275 | optimizer = SAM(net_student.parameters(), base_optimizer, lr=0.002, momentum=0.9, weight_decay=5e-4) 276 | 277 | device = torch.device("cuda") 278 | net_student.to(device) 279 | net_teacher.to(device) 280 | 281 | max_val_acc = 0 282 | lr = [0.002] 283 | for epoch in range(start_epoch, nb_epoch): 284 | print('\nEpoch: %d' % epoch) 285 | 286 | if epoch < (nb_epoch * (1/2)): 287 | net_student.train() 288 | train_loss = 0 289 | train_loss1 = 0 290 | train_loss2 = 0 291 | train_loss3 = 0 292 | train_loss4 = 0 293 | correct = 0 294 | total = 0 295 | idx = 0 296 | for batch_idx, (inputs, targets) in enumerate(trainloader): 297 | idx = batch_idx 298 | if inputs.shape[0] < batch_size: 299 | continue 300 | if use_cuda: 301 | inputs, targets = inputs.to(device), targets.to(device) 302 | inputs, targets = Variable(inputs), Variable(targets) 303 | 304 | for nlr in range(len(optimizer.param_groups)): 305 | optimizer.param_groups[nlr]['lr'] = cosine_anneal_schedule(epoch, nb_epoch, lr[nlr]) 306 | 307 | net_teacher.eval() 308 | with torch.no_grad(): 309 | output_1_t, output_2_t, output_3_t, output_4_t, map1_t, map2_t, map3_t = net_teacher(inputs) 310 | 311 | 312 | 313 | # h1 314 | # h1 first forward-backward step 315 | enable_running_stats(net_student) 316 | optimizer.zero_grad() 317 | _, x1, _, _ = net_student(inputs) 318 | loss1 = MSE(map1_t.detach(), x1) * 100 319 | loss1.backward() 320 | optimizer.first_step(zero_grad=True) 321 | # h1 second forward-backward step 322 | disable_running_stats(net_student) 323 | optimizer.zero_grad() 324 | _, x1, _, _ = net_student(inputs) 325 | loss1_ = MSE(map1_t.detach(), x1) * 100 326 | loss1_.backward() 327 | optimizer.second_step(zero_grad=True) 328 | 329 | # h2 330 | # h2 first forward-backward step 331 | enable_running_stats(net_student) 332 | optimizer.zero_grad() 333 | _, _, x2, _ = net_student(inputs) 334 | loss2 = MSE(map2_t.detach(), x2) * 100 335 | loss2.backward() 336 | optimizer.first_step(zero_grad=True) 337 | 338 | # h2 second forward-backward step 339 | disable_running_stats(net_student) 340 | optimizer.zero_grad() 341 | _, _, x2, _ = net_student(inputs) 342 | loss2_ = MSE(map2_t.detach(), x2) * 100 343 | loss2_.backward() 344 | optimizer.second_step(zero_grad=True) 345 | 346 | 347 | # h3 348 | # h3 first forward-backward step 349 | enable_running_stats(net_student) 350 | optimizer.zero_grad() 351 | _, _, _, x3 = net_student(inputs) 352 | loss3 = MSE(map3_t.detach(), x3) * 100 353 | loss3.backward() 354 | optimizer.first_step(zero_grad=True) 355 | # h3 second forward-backward step 356 | disable_running_stats(net_student) 357 | optimizer.zero_grad() 358 | _, _, _, x3 = net_student(inputs) 359 | loss3_ = MSE(map3_t.detach(), x3) * 100 360 | loss3_.backward() 361 | optimizer.second_step(zero_grad=True) 362 | 363 | 364 | 365 | # h4 366 | # h4 first forward-backward step 367 | enable_running_stats(net_student) 368 | optimizer.zero_grad() 369 | output, _, _, _ = net_student(inputs) 370 | 371 | loss4 = MSE(output_1_t.detach(), output) + \ 372 | MSE(output_2_t.detach(), output) + \ 373 | MSE(output_3_t.detach(), output) + \ 374 | MSE(output_4_t.detach(), output) + \ 375 | CELoss(output, targets).mean() 376 | loss4.backward() 377 | optimizer.first_step(zero_grad=True) 378 | # h4 second forward-backward step 379 | disable_running_stats(net_student) 380 | optimizer.zero_grad() 381 | output, _, _, _ = net_student(inputs) 382 | 383 | loss4_ = MSE(output_1_t.detach(), output) + \ 384 | MSE(output_2_t.detach(), 
output) + \ 385 | MSE(output_3_t.detach(), output) + \ 386 | MSE(output_4_t.detach(), output) + \ 387 | CELoss(output, targets).mean() 388 | loss4_.backward() 389 | optimizer.second_step(zero_grad=True) 390 | 391 | _, predicted = torch.max(output.data, 1) 392 | total += targets.size(0) 393 | correct += predicted.eq(targets.data).cpu().sum() 394 | 395 | train_loss += (loss1.item() + loss2.item() + loss3.item() + loss4.item()) 396 | train_loss1 += loss1.item() 397 | train_loss2 += loss2.item() 398 | train_loss3 += loss3.item() 399 | train_loss4 += loss4.item() 400 | 401 | if batch_idx % 50 == 0: 402 | print( 403 | 'Step: %d | Loss1: %.3f | Loss2: %.5f | Loss3: %.5f |Loss4: %.5f | Loss: %.3f | Acc: %.3f%% (%d/%d)' % ( 404 | batch_idx, train_loss1 / (batch_idx + 1), train_loss2 / (batch_idx + 1), 405 | train_loss3 / (batch_idx + 1), train_loss4 / (batch_idx + 1), train_loss / (batch_idx + 1), 406 | 100. * float(correct) / total, correct, total)) 407 | else: 408 | net_student.train() 409 | train_loss = 0 410 | correct = 0 411 | total = 0 412 | idx = 0 413 | for batch_idx, (inputs, targets) in enumerate(trainloader): 414 | idx = batch_idx 415 | if inputs.shape[0] < batch_size: 416 | continue 417 | if use_cuda: 418 | inputs, targets = inputs.to(device), targets.to(device) 419 | inputs, targets = Variable(inputs), Variable(targets) 420 | 421 | for nlr in range(len(optimizer.param_groups)): 422 | optimizer.param_groups[nlr]['lr'] = cosine_anneal_schedule(epoch, nb_epoch, lr[nlr]) 423 | 424 | #Fine-tune: first forward-backward step 425 | enable_running_stats(net_student) 426 | optimizer.zero_grad() 427 | output, _, _, _ = net_student(inputs) 428 | loss_f = CELoss(output, targets).mean() 429 | loss_f.backward() 430 | optimizer.first_step(zero_grad=True) 431 | 432 | # Fine-tune: second forward-backward step 433 | disable_running_stats(net_student) 434 | optimizer.zero_grad() 435 | output, _, _, _ = net_student(inputs) 436 | loss_f = CELoss(output, targets).mean() 437 | loss_f.backward() 438 | optimizer.second_step(zero_grad=True) 439 | 440 | 441 | 442 | _, predicted = torch.max(output.data, 1) 443 | total += targets.size(0) 444 | correct += predicted.eq(targets.data).cpu().sum() 445 | 446 | train_loss += loss_f.item() 447 | 448 | if batch_idx % 50 == 0: 449 | print( 450 | 'Step: %d | Loss: %.3f | Acc: %.3f%% (%d/%d)' % ( 451 | batch_idx, train_loss / (batch_idx + 1), 100. * float(correct) / total, correct, total)) 452 | 453 | if epoch < (nb_epoch * (1/2)): 454 | train_acc = 100. * float(correct) / total 455 | train_loss = train_loss / (idx + 1) 456 | with open(exp_dir + '/results_train.txt', 'a') as file: 457 | file.write( 458 | 'Iteration %d | train_acc = %.5f | train_loss = %.5f | Loss1: %.3f | Loss2: %.5f | Loss3: %.5f | Loss4: %.5f |\n' % ( 459 | epoch, train_acc, train_loss, train_loss1 / (idx + 1), train_loss2 / (idx + 1), 460 | train_loss3 / (idx + 1), 461 | train_loss4 / (idx + 1))) 462 | else: 463 | train_acc = 100. 
* float(correct) / total 464 | train_loss = train_loss / (idx + 1) 465 | with open(exp_dir + '/results_train.txt', 'a') as file: 466 | file.write( 467 | 'Iteration %d | train_acc = %.5f | train_loss = %.5f|\n' % (epoch, train_acc, train_loss)) 468 | 469 | 470 | val_acc_com, val_loss = test(net_student, CELoss, 7, data_path + '/test') 471 | if val_acc_com > max_val_acc: 472 | max_val_acc = val_acc_com 473 | net_student.cpu() 474 | torch.save(net_student, './' + store_name + '/model.pth') 475 | 476 | net_student.to(device) 477 | 478 | with open(exp_dir + '/results_test.txt', 'a') as file: 479 | file.write('Iteration %d, test_acc_combined = %.5f, test_loss = %.6f\n' % ( 480 | epoch, val_acc_com, val_loss)) 481 | 482 | 483 | if __name__ == '__main__': 484 | data_path = '/Stanford Cars' 485 | train(nb_epoch=200, # number of epoch 486 | batch_size=8, # batch size 487 | store_name='results/Stanford_Cars_TResNet_L_Distillation', # the folder for saving results 488 | num_class=196, # number of categories 489 | start_epoch=0, # the start epoch number 490 | data_path=data_path) # the path to the dataset 491 | -------------------------------------------------------------------------------- /Stanford_Cars_ResNet50_Distillation.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | import os 3 | os.environ["CUDA_VISIBLE_DEVICES"] = "1" 4 | import torchvision.models 5 | from sam import SAM 6 | from torch.utils.model_zoo import load_url as load_state_dict_from_url 7 | import cv2 8 | 9 | import numpy as np 10 | import torchvision 11 | from torch.autograd import Variable 12 | from torchvision import transforms 13 | from basic_conv import * 14 | from example.model.smooth_cross_entropy import smooth_crossentropy 15 | from example.utility.bypass_bn import enable_running_stats, disable_running_stats 16 | 17 | import argparse 18 | parser = argparse.ArgumentParser(description='train_student') 19 | parser.add_argument('--from_local', required=False, action='store_true', 20 | help="By default, this is set to use the pre-trained parameters downloaded from the cloud as the teacher model. 
" 21 | "Set this flag to True to use the locally trained parameters instead.") 22 | 23 | args, unparsed = parser.parse_known_args() 24 | 25 | 26 | class Student_Wrapper(nn.Module): 27 | def __init__(self, net_layers, classifier): 28 | super(Student_Wrapper, self).__init__() 29 | self.net_layer_0 = nn.Sequential(net_layers[0]) 30 | self.net_layer_1 = nn.Sequential(net_layers[1]) 31 | self.net_layer_2 = nn.Sequential(net_layers[2]) 32 | self.net_layer_3 = nn.Sequential(net_layers[3]) 33 | self.net_layer_4 = nn.Sequential(*net_layers[4]) 34 | self.net_layer_5 = nn.Sequential(*net_layers[5]) 35 | self.net_layer_6 = nn.Sequential(*net_layers[6]) 36 | self.net_layer_7 = nn.Sequential(*net_layers[7]) 37 | 38 | self.net_layer_8 = nn.Sequential(classifier[0]) 39 | self.net_layer_9 = nn.Sequential(classifier[1]) 40 | 41 | self.bn1 = nn.Sequential(nn.BatchNorm2d(64)) 42 | self.relu = nn.Sequential(nn.ReLU()) 43 | 44 | def forward(self, x): 45 | x = self.net_layer_0(x) 46 | x = self.net_layer_1(x) 47 | x = self.net_layer_2(x) 48 | x = self.net_layer_3(x) 49 | x = self.net_layer_4(x) 50 | x1 = self.net_layer_5(x) 51 | x2 = self.net_layer_6(x1) 52 | x3 = self.net_layer_7(x2) 53 | 54 | feat = self.net_layer_8(x3) 55 | feat = feat.view(feat.size(0), -1) 56 | out = self.net_layer_9(feat) 57 | 58 | return out, x1, x2, x3 59 | 60 | 61 | def cosine_anneal_schedule(t, nb_epoch, lr): 62 | cos_inner = np.pi * (t % (nb_epoch)) # t - 1 is used when t has 1-based indexing. 63 | cos_inner /= (nb_epoch) 64 | cos_out = np.cos(cos_inner) + 1 65 | 66 | return float(lr / 2 * cos_out) 67 | 68 | 69 | def test(net, criterion, batch_size, test_path): 70 | net.eval() 71 | use_cuda = torch.cuda.is_available() 72 | test_loss = 0 73 | correct = 0 74 | total = 0 75 | idx = 0 76 | device = torch.device("cuda") 77 | 78 | transform_test = transforms.Compose([ 79 | transforms.Resize((550, 550)), 80 | transforms.CenterCrop(448), 81 | transforms.ToTensor(), 82 | transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)), 83 | ]) 84 | testset = torchvision.datasets.ImageFolder(root=test_path, 85 | transform=transform_test) 86 | testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=True, num_workers=4) 87 | 88 | for batch_idx, (inputs, targets) in enumerate(testloader): 89 | idx = batch_idx 90 | if use_cuda: 91 | inputs, targets = inputs.to(device), targets.to(device) 92 | inputs, targets = Variable(inputs, volatile=True), Variable(targets) 93 | output, _, _, _ = net(inputs) 94 | 95 | loss = criterion(output, targets).mean() 96 | 97 | test_loss += loss.item() 98 | _, predicted = torch.max(output.data, 1) 99 | 100 | total += targets.size(0) 101 | correct += predicted.eq(targets.data).cpu().sum() 102 | 103 | if batch_idx % 50 == 0: 104 | print('Step: %d | Loss: %.3f |Combined Acc: %.3f%% (%d/%d)' % ( 105 | batch_idx, test_loss / (batch_idx + 1), 106 | 100. * float(correct) / total, correct, total)) 107 | 108 | test_acc_en = 100. 
* float(correct) / total 109 | test_loss = test_loss / (idx + 1) 110 | 111 | net.train() 112 | 113 | return test_acc_en, test_loss 114 | 115 | 116 | def show_im(image, h, w): 117 | img = (np.transpose(image.cpu().detach().numpy(), (1, 2, 0)) * (0.5, 0.5, 0.5) + (0.5, 0.5, 0.5)) 118 | 119 | img = cv2.resize(img, (w, h)) 120 | return img 121 | 122 | 123 | class Features(nn.Module): 124 | def __init__(self, net_layers): 125 | super(Features, self).__init__() 126 | self.net_layer_0 = nn.Sequential(net_layers[0]) 127 | self.net_layer_1 = nn.Sequential(net_layers[1]) 128 | self.net_layer_2 = nn.Sequential(net_layers[2]) 129 | self.net_layer_3 = nn.Sequential(net_layers[3]) 130 | self.net_layer_4 = nn.Sequential(*net_layers[4]) 131 | self.net_layer_5 = nn.Sequential(*net_layers[5]) 132 | self.net_layer_6 = nn.Sequential(*net_layers[6]) 133 | self.net_layer_7 = nn.Sequential(*net_layers[7]) 134 | 135 | def forward(self, x): 136 | x = self.net_layer_0(x) 137 | x = self.net_layer_1(x) 138 | x = self.net_layer_2(x) 139 | x = self.net_layer_3(x) 140 | x = self.net_layer_4(x) 141 | x1 = self.net_layer_5(x) 142 | x2 = self.net_layer_6(x1) 143 | x3 = self.net_layer_7(x2) 144 | return x1, x2, x3 145 | 146 | 147 | class Network_Wrapper(nn.Module): 148 | def __init__(self, net_layers, num_class, classifier): 149 | super().__init__() 150 | self.Features = Features(net_layers) 151 | self.classifier_pool = nn.Sequential(classifier[0]) 152 | self.classifier_initial = nn.Sequential(classifier[1]) 153 | self.sigmoid = nn.Sigmoid() 154 | self.lrelu = nn.LeakyReLU(negative_slope=0.1, inplace=True) 155 | 156 | self.max_pool1 = nn.MaxPool2d(kernel_size=56, stride=1) 157 | self.max_pool2 = nn.MaxPool2d(kernel_size=28, stride=1) 158 | self.max_pool3 = nn.MaxPool2d(kernel_size=14, stride=1) 159 | 160 | self.conv_block1 = nn.Sequential( 161 | BasicConv(512, 512, kernel_size=1, stride=1, padding=0, relu=True), 162 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 163 | ) 164 | self.classifier1 = nn.Sequential( 165 | nn.BatchNorm1d(1024), 166 | nn.Linear(1024, 512), 167 | nn.BatchNorm1d(512), 168 | nn.ELU(inplace=True), 169 | nn.Linear(512, num_class) 170 | ) 171 | 172 | self.conv_block2 = nn.Sequential( 173 | BasicConv(1024, 512, kernel_size=1, stride=1, padding=0, relu=True), 174 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 175 | ) 176 | self.classifier2 = nn.Sequential( 177 | nn.BatchNorm1d(1024), 178 | nn.Linear(1024, 512), 179 | nn.BatchNorm1d(512), 180 | nn.ELU(inplace=True), 181 | nn.Linear(512, num_class), 182 | ) 183 | 184 | self.conv_block3 = nn.Sequential( 185 | BasicConv(2048, 512, kernel_size=1, stride=1, padding=0, relu=True), 186 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 187 | ) 188 | self.classifier3 = nn.Sequential( 189 | nn.BatchNorm1d(1024), 190 | nn.Linear(1024, 512), 191 | nn.BatchNorm1d(512), 192 | nn.ELU(inplace=True), 193 | nn.Linear(512, num_class), 194 | ) 195 | 196 | 197 | def forward(self, x): 198 | x1, x2, x3 = self.Features(x) 199 | map1 = x1.clone() 200 | map2 = x2.clone() 201 | map3 = x3.clone() 202 | 203 | classifiers = self.classifier_pool(x3).view(x3.size(0), -1) 204 | classifiers = self.classifier_initial(classifiers) 205 | 206 | x1_ = self.conv_block1(x1) 207 | x1_ = self.max_pool1(x1_) 208 | x1_f = x1_.view(x1_.size(0), -1) 209 | x1_c = self.classifier1(x1_f) 210 | 211 | x2_ = self.conv_block2(x2) 212 | x2_ = self.max_pool2(x2_) 213 | x2_f = x2_.view(x2_.size(0), -1) 214 | x2_c = self.classifier2(x2_f) 215 | 216 | 
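# Note (added annotation): the fixed kernel sizes of max_pool1/2/3 (56/28/14)
# equal the full spatial extent of the ResNet-50 stage outputs for 448x448
# inputs, so each acts as a global max pool. A size-agnostic equivalent (an
# alternative formulation, not the original code) would be:
#
#     x3_ = torch.nn.functional.adaptive_max_pool2d(x3_, 1)   # -> (B, 1024, 1, 1)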
x3_ = self.conv_block3(x3) 217 | x3_ = self.max_pool3(x3_) 218 | x3_f = x3_.view(x3_.size(0), -1) 219 | x3_c = self.classifier3(x3_f) 220 | 221 | return x1_c, x2_c, x3_c, classifiers, map1, map2, map3 222 | 223 | 224 | def img_add_noise(x, transformation_seq): 225 | x = x.permute(0, 2, 3, 1) 226 | x = x.cpu().numpy() 227 | x = transformation_seq(images=x) 228 | x = torch.from_numpy(x.astype(np.float32)) 229 | x = x.permute(0, 3, 1, 2) 230 | return x 231 | 232 | 233 | def CELoss(x, y): 234 | return smooth_crossentropy(x, y, smoothing=0.1) 235 | 236 | 237 | def train(nb_epoch, batch_size, store_name, num_class=1, start_epoch=0, data_path=''): 238 | MSE = nn.MSELoss() 239 | 240 | exp_dir = store_name 241 | try: 242 | os.stat(exp_dir) 243 | except: 244 | os.makedirs(exp_dir) 245 | 246 | use_cuda = torch.cuda.is_available() 247 | print(use_cuda) 248 | 249 | print('==> Preparing data..') 250 | transform_train = transforms.Compose([ 251 | transforms.Resize((550, 550)), 252 | transforms.RandomCrop(448, padding=8), 253 | transforms.RandomHorizontalFlip(), 254 | transforms.ToTensor(), 255 | transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)), 256 | ]) 257 | trainset = torchvision.datasets.ImageFolder(root=data_path + '/train', transform=transform_train) 258 | trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=1) 259 | 260 | net = torchvision.models.resnet50() 261 | state_dict = load_state_dict_from_url('https://download.pytorch.org/models/resnet50-19c8e357.pth') 262 | net.load_state_dict(state_dict) 263 | fc_features = net.fc.in_features 264 | net.fc = nn.Linear(fc_features, num_class) 265 | 266 | net_layers = list(net.children()) 267 | classifier = net_layers[8:10] 268 | net_layers = net_layers[0:8] 269 | 270 | if not args.from_local: 271 | net_teacher = torch.load('weightsFromCloud/Stanford_Cars_ResNet50_Teacher_Network.pth') 272 | else: 273 | net_teacher = torch.load('results/Stanford_Cars_ResNet50_PMAL/model.pth') 274 | 275 | 276 | net_student = Student_Wrapper(net_layers, classifier) 277 | 278 | base_optimizer = torch.optim.SGD 279 | 280 | optimizer = SAM(net_student.parameters(), base_optimizer, lr=0.002, momentum=0.9, weight_decay=5e-4) 281 | 282 | device = torch.device("cuda") 283 | net_student.to(device) 284 | net_teacher.to(device) 285 | 286 | max_val_acc = 0 287 | lr = [0.002] 288 | for epoch in range(start_epoch, nb_epoch): 289 | print('\nEpoch: %d' % epoch) 290 | 291 | if epoch < (nb_epoch * (1/2)): 292 | net_student.train() 293 | train_loss = 0 294 | train_loss1 = 0 295 | train_loss2 = 0 296 | train_loss3 = 0 297 | train_loss4 = 0 298 | correct = 0 299 | total = 0 300 | idx = 0 301 | for batch_idx, (inputs, targets) in enumerate(trainloader): 302 | idx = batch_idx 303 | if inputs.shape[0] < batch_size: 304 | continue 305 | if use_cuda: 306 | inputs, targets = inputs.to(device), targets.to(device) 307 | inputs, targets = Variable(inputs), Variable(targets) 308 | 309 | for nlr in range(len(optimizer.param_groups)): 310 | optimizer.param_groups[nlr]['lr'] = cosine_anneal_schedule(epoch, nb_epoch, lr[nlr]) 311 | 312 | net_teacher.eval() 313 | with torch.no_grad(): 314 | output_1_t, output_2_t, output_3_t, output_4_t, map1_t, map2_t, map3_t = net_teacher(inputs) 315 | 316 | 317 | 318 | # h1 319 | # h1 first forward-backward step 320 | enable_running_stats(net_student) 321 | optimizer.zero_grad() 322 | _, x1, _, _ = net_student(inputs) 323 | loss1 = MSE(map1_t.detach(), x1) * 100 324 | loss1.backward() 325 | 
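# Stages h1-h3 regress the student's stage-wise feature maps onto the frozen
# teacher's (MSE, scaled by 100); h4 matches the student logits to all four
# teacher outputs and adds the label-smoothed CE term. The `.detach()` calls
# are strictly redundant here, since the teacher forward already ran under
# torch.no_grad(), but they make explicit that gradients flow only into the student.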
optimizer.first_step(zero_grad=True) 326 | # h1 second forward-backward step 327 | disable_running_stats(net_student) 328 | optimizer.zero_grad() 329 | _, x1, _, _ = net_student(inputs) 330 | loss1_ = MSE(map1_t.detach(), x1) * 100 331 | loss1_.backward() 332 | optimizer.second_step(zero_grad=True) 333 | 334 | # h2 335 | # h2 first forward-backward step 336 | enable_running_stats(net_student) 337 | optimizer.zero_grad() 338 | _, _, x2, _ = net_student(inputs) 339 | loss2 = MSE(map2_t.detach(), x2) * 100 340 | loss2.backward() 341 | optimizer.first_step(zero_grad=True) 342 | 343 | # h2 second forward-backward step 344 | disable_running_stats(net_student) 345 | optimizer.zero_grad() 346 | _, _, x2, _ = net_student(inputs) 347 | loss2_ = MSE(map2_t.detach(), x2) * 100 348 | loss2_.backward() 349 | optimizer.second_step(zero_grad=True) 350 | 351 | 352 | # h3 353 | # h3 first forward-backward step 354 | enable_running_stats(net_student) 355 | optimizer.zero_grad() 356 | _, _, _, x3 = net_student(inputs) 357 | loss3 = MSE(map3_t.detach(), x3) * 100 358 | loss3.backward() 359 | optimizer.first_step(zero_grad=True) 360 | # h3 second forward-backward step 361 | disable_running_stats(net_student) 362 | optimizer.zero_grad() 363 | _, _, _, x3 = net_student(inputs) 364 | loss3_ = MSE(map3_t.detach(), x3) * 100 365 | loss3_.backward() 366 | optimizer.second_step(zero_grad=True) 367 | 368 | 369 | 370 | # h4 371 | # h4 first forward-backward step 372 | enable_running_stats(net_student) 373 | optimizer.zero_grad() 374 | output, _, _, _ = net_student(inputs) 375 | 376 | loss4 = MSE(output_1_t.detach(), output) + \ 377 | MSE(output_2_t.detach(), output) + \ 378 | MSE(output_3_t.detach(), output) + \ 379 | MSE(output_4_t.detach(), output) + \ 380 | CELoss(output, targets).mean() 381 | loss4.backward() 382 | optimizer.first_step(zero_grad=True) 383 | # h4 second forward-backward step 384 | disable_running_stats(net_student) 385 | optimizer.zero_grad() 386 | output, _, _, _ = net_student(inputs) 387 | 388 | loss4_ = MSE(output_1_t.detach(), output) + \ 389 | MSE(output_2_t.detach(), output) + \ 390 | MSE(output_3_t.detach(), output) + \ 391 | MSE(output_4_t.detach(), output) + \ 392 | CELoss(output, targets).mean() 393 | loss4_.backward() 394 | optimizer.second_step(zero_grad=True) 395 | 396 | _, predicted = torch.max(output.data, 1) 397 | total += targets.size(0) 398 | correct += predicted.eq(targets.data).cpu().sum() 399 | 400 | train_loss += (loss1.item() + loss2.item() + loss3.item() + loss4.item()) 401 | train_loss1 += loss1.item() 402 | train_loss2 += loss2.item() 403 | train_loss3 += loss3.item() 404 | train_loss4 += loss4.item() 405 | 406 | if batch_idx % 50 == 0: 407 | print( 408 | 'Step: %d | Loss1: %.3f | Loss2: %.5f | Loss3: %.5f |Loss4: %.5f | Loss: %.3f | Acc: %.3f%% (%d/%d)' % ( 409 | batch_idx, train_loss1 / (batch_idx + 1), train_loss2 / (batch_idx + 1), 410 | train_loss3 / (batch_idx + 1), train_loss4 / (batch_idx + 1), train_loss / (batch_idx + 1), 411 | 100. 
* float(correct) / total, correct, total)) 412 | else: 413 | net_student.train() 414 | train_loss = 0 415 | correct = 0 416 | total = 0 417 | idx = 0 418 | for batch_idx, (inputs, targets) in enumerate(trainloader): 419 | idx = batch_idx 420 | if inputs.shape[0] < batch_size: 421 | continue 422 | if use_cuda: 423 | inputs, targets = inputs.to(device), targets.to(device) 424 | inputs, targets = Variable(inputs), Variable(targets) 425 | 426 | for nlr in range(len(optimizer.param_groups)): 427 | optimizer.param_groups[nlr]['lr'] = cosine_anneal_schedule(epoch, nb_epoch, lr[nlr]) 428 | 429 | #Fine-tune: first forward-backward step 430 | enable_running_stats(net_student) 431 | optimizer.zero_grad() 432 | output, _, _, _ = net_student(inputs) 433 | loss_f = CELoss(output, targets).mean() 434 | loss_f.backward() 435 | optimizer.first_step(zero_grad=True) 436 | 437 | # Fine-tune: second forward-backward step 438 | disable_running_stats(net_student) 439 | optimizer.zero_grad() 440 | output, _, _, _ = net_student(inputs) 441 | loss_f = CELoss(output, targets).mean() 442 | loss_f.backward() 443 | optimizer.second_step(zero_grad=True) 444 | 445 | 446 | 447 | _, predicted = torch.max(output.data, 1) 448 | total += targets.size(0) 449 | correct += predicted.eq(targets.data).cpu().sum() 450 | 451 | train_loss += loss_f.item() 452 | 453 | if batch_idx % 50 == 0: 454 | print( 455 | 'Step: %d | Loss: %.3f | Acc: %.3f%% (%d/%d)' % ( 456 | batch_idx, train_loss / (batch_idx + 1), 100. * float(correct) / total, correct, total)) 457 | 458 | if epoch < (nb_epoch * (1/2)): 459 | train_acc = 100. * float(correct) / total 460 | train_loss = train_loss / (idx + 1) 461 | with open(exp_dir + '/results_train.txt', 'a') as file: 462 | file.write( 463 | 'Iteration %d | train_acc = %.5f | train_loss = %.5f | Loss1: %.3f | Loss2: %.5f | Loss3: %.5f | Loss4: %.5f |\n' % ( 464 | epoch, train_acc, train_loss, train_loss1 / (idx + 1), train_loss2 / (idx + 1), 465 | train_loss3 / (idx + 1), 466 | train_loss4 / (idx + 1))) 467 | else: 468 | train_acc = 100. 
* float(correct) / total 469 | train_loss = train_loss / (idx + 1) 470 | with open(exp_dir + '/results_train.txt', 'a') as file: 471 | file.write( 472 | 'Iteration %d | train_acc = %.5f | train_loss = %.5f|\n' % (epoch, train_acc, train_loss)) 473 | 474 | 475 | val_acc_com, val_loss = test(net_student, CELoss, 7, data_path + '/test') 476 | if val_acc_com > max_val_acc: 477 | max_val_acc = val_acc_com 478 | net_student.cpu() 479 | torch.save(net_student, './' + store_name + '/model.pth') 480 | 481 | net_student.to(device) 482 | 483 | with open(exp_dir + '/results_test.txt', 'a') as file: 484 | file.write('Iteration %d, test_acc_combined = %.5f, test_loss = %.6f\n' % ( 485 | epoch, val_acc_com, val_loss)) 486 | 487 | 488 | 489 | if __name__ == '__main__': 490 | data_path = '/Stanford Cars' 491 | train(nb_epoch=200, # number of epoch 492 | batch_size=8, # batch size 493 | store_name='results/Stanford_Cars_ResNet50_Distillation', # the folder for saving results 494 | num_class=196, # number of categories 495 | start_epoch=0, # the start epoch number 496 | data_path=data_path) # the path to the dataset 497 | -------------------------------------------------------------------------------- /Stanford_Cars_ResNet50_PMAL.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | import os 3 | os.environ["CUDA_VISIBLE_DEVICES"] = "1" 4 | import torchvision.models 5 | from sam import SAM 6 | from torch.utils.model_zoo import load_url as load_state_dict_from_url 7 | import imgaug as ia 8 | import imgaug.augmenters as iaa 9 | from vic.loss import CharbonnierLoss 10 | import numpy as np 11 | import torchvision 12 | from torch.autograd import Variable 13 | from torchvision import transforms 14 | from basic_conv import * 15 | from example.model.smooth_cross_entropy import smooth_crossentropy 16 | from example.utility.bypass_bn import enable_running_stats, disable_running_stats 17 | 18 | def cosine_anneal_schedule(t, nb_epoch, lr): 19 | cos_inner = np.pi * (t % (nb_epoch)) 20 | cos_inner /= (nb_epoch) 21 | cos_out = np.cos(cos_inner) + 1 22 | 23 | return float(lr / 2 * cos_out) 24 | 25 | def test(net, criterion, batch_size, test_path): 26 | net.eval() 27 | use_cuda = torch.cuda.is_available() 28 | test_loss = 0 29 | correct = 0 30 | correct_com = 0 31 | total = 0 32 | idx = 0 33 | device = torch.device("cuda") 34 | 35 | transform_test = transforms.Compose([ 36 | transforms.Resize((550, 550)), 37 | transforms.CenterCrop(448), 38 | transforms.ToTensor(), 39 | transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)), 40 | ]) 41 | testset = torchvision.datasets.ImageFolder(root=test_path, 42 | transform=transform_test) 43 | testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=True, num_workers=4) 44 | 45 | for batch_idx, (inputs, targets) in enumerate(testloader): 46 | idx = batch_idx 47 | if use_cuda: 48 | inputs, targets = inputs.to(device), targets.to(device) 49 | inputs, targets = Variable(inputs, volatile=True), Variable(targets) 50 | output_1, output_2, output_3, output_ORI, map1, map2, map3 = net(inputs) 51 | 52 | outputs_com = output_1 + output_2 + output_3 + output_ORI 53 | 54 | loss = criterion(output_ORI, targets).mean() 55 | 56 | test_loss += loss.item() 57 | _, predicted = torch.max(output_ORI.data, 1) 58 | _, predicted_com = torch.max(outputs_com.data, 1) 59 | 60 | total += targets.size(0) 61 | correct += predicted.eq(targets.data).cpu().sum() 62 | correct_com += 
predicted_com.eq(targets.data).cpu().sum() 63 | 64 | if batch_idx % 50 == 0: 65 | print('Step: %d | Loss: %.3f |Combined Acc: %.3f%% (%d/%d)' % ( 66 | batch_idx, test_loss / (batch_idx + 1), 67 | 100. * float(correct_com) / total, correct_com, total)) 68 | 69 | test_acc_en = 100. * float(correct_com) / total 70 | test_loss = test_loss / (idx + 1) 71 | 72 | return test_acc_en, test_loss 73 | 74 | 75 | 76 | 77 | class Features(nn.Module): 78 | def __init__(self, net_layers): 79 | super(Features, self).__init__() 80 | self.net_layer_0 = nn.Sequential(net_layers[0]) 81 | self.net_layer_1 = nn.Sequential(net_layers[1]) 82 | self.net_layer_2 = nn.Sequential(net_layers[2]) 83 | self.net_layer_3 = nn.Sequential(net_layers[3]) 84 | self.net_layer_4 = nn.Sequential(*net_layers[4]) 85 | self.net_layer_5 = nn.Sequential(*net_layers[5]) 86 | self.net_layer_6 = nn.Sequential(*net_layers[6]) 87 | self.net_layer_7 = nn.Sequential(*net_layers[7]) 88 | 89 | 90 | def forward(self, x): 91 | x = self.net_layer_0(x) 92 | x = self.net_layer_1(x) 93 | x = self.net_layer_2(x) 94 | x = self.net_layer_3(x) 95 | x = self.net_layer_4(x) 96 | x1 = self.net_layer_5(x) 97 | x2 = self.net_layer_6(x1) 98 | x3 = self.net_layer_7(x2) 99 | return x1, x2, x3 100 | 101 | 102 | class Anti_Noise_Decoder(nn.Module): 103 | def __init__(self, scale, in_channel): 104 | super(Anti_Noise_Decoder, self).__init__() 105 | self.Sigmoid = nn.Sigmoid() 106 | 107 | in_channel = in_channel // (scale * scale) 108 | 109 | self.skip = nn.Sequential( 110 | nn.Conv2d(3, 64, 3, 1, 1, bias=False), 111 | nn.LeakyReLU(negative_slope=0.1, inplace=True), 112 | nn.Conv2d(64, 3, 3, 1, 1, bias=False), 113 | nn.LeakyReLU(negative_slope=0.1, inplace=True) 114 | 115 | ) 116 | 117 | self.process = nn.Sequential( 118 | nn.PixelShuffle(scale), 119 | nn.Conv2d(in_channel, 256, 3, 1, 1, bias=False), 120 | nn.LeakyReLU(negative_slope=0.1, inplace=True), 121 | nn.PixelShuffle(2), 122 | nn.Conv2d(64, 128, 3, 1, 1, bias=False), 123 | nn.LeakyReLU(negative_slope=0.1, inplace=True), 124 | nn.PixelShuffle(2), 125 | nn.Conv2d(32, 64, 3, 1, 1, bias=False), 126 | nn.LeakyReLU(negative_slope=0.1, inplace=True), 127 | nn.PixelShuffle(2), 128 | nn.Conv2d(16, 3, 3, 1, 1, bias=False), 129 | nn.LeakyReLU(negative_slope=0.1, inplace=True) 130 | ) 131 | 132 | def forward(self, x, map): 133 | return self.skip(x) + self.process(map) 134 | 135 | class Network_Wrapper(nn.Module): 136 | def __init__(self, net_layers, num_class, classifier): 137 | super().__init__() 138 | self.Features = Features(net_layers) 139 | self.classifier_pool = nn.Sequential(classifier[0]) 140 | self.classifier_initial = nn.Sequential(classifier[1]) 141 | self.sigmoid = nn.Sigmoid() 142 | self.lrelu = nn.LeakyReLU(negative_slope=0.1, inplace=True) 143 | 144 | self.max_pool1 = nn.MaxPool2d(kernel_size=56, stride=1) 145 | self.max_pool2 = nn.MaxPool2d(kernel_size=28, stride=1) 146 | self.max_pool3 = nn.MaxPool2d(kernel_size=14, stride=1) 147 | 148 | self.conv_block1 = nn.Sequential( 149 | BasicConv(512, 512, kernel_size=1, stride=1, padding=0, relu=True), 150 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 151 | ) 152 | self.classifier1 = nn.Sequential( 153 | nn.BatchNorm1d(1024), 154 | nn.Linear(1024, 512), 155 | nn.BatchNorm1d(512), 156 | nn.ELU(inplace=True), 157 | nn.Linear(512, num_class) 158 | ) 159 | 160 | self.conv_block2 = nn.Sequential( 161 | BasicConv(1024, 512, kernel_size=1, stride=1, padding=0, relu=True), 162 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 
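# (each granularity branch uses the same pattern: a 1x1 channel-reduction conv followed by a 3x3 mixing conv)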
163 | ) 164 | self.classifier2 = nn.Sequential( 165 | nn.BatchNorm1d(1024), 166 | nn.Linear(1024, 512), 167 | nn.BatchNorm1d(512), 168 | nn.ELU(inplace=True), 169 | nn.Linear(512, num_class), 170 | ) 171 | 172 | self.conv_block3 = nn.Sequential( 173 | BasicConv(2048, 512, kernel_size=1, stride=1, padding=0, relu=True), 174 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 175 | ) 176 | self.classifier3 = nn.Sequential( 177 | nn.BatchNorm1d(1024), 178 | nn.Linear(1024, 512), 179 | nn.BatchNorm1d(512), 180 | nn.ELU(inplace=True), 181 | nn.Linear(512, num_class), 182 | ) 183 | 184 | 185 | def forward(self, x): 186 | x1, x2, x3 = self.Features(x) 187 | map1 = x1.clone() 188 | map2 = x2.clone() 189 | map3 = x3.clone() 190 | 191 | classifiers = self.classifier_pool(x3).view(x3.size(0), -1) 192 | classifiers = self.classifier_initial(classifiers) 193 | 194 | x1_ = self.conv_block1(x1) 195 | x1_ = self.max_pool1(x1_) 196 | x1_f = x1_.view(x1_.size(0), -1) 197 | x1_c = self.classifier1(x1_f) 198 | 199 | x2_ = self.conv_block2(x2) 200 | x2_ = self.max_pool2(x2_) 201 | x2_f = x2_.view(x2_.size(0), -1) 202 | x2_c = self.classifier2(x2_f) 203 | 204 | 205 | x3_ = self.conv_block3(x3) 206 | x3_ = self.max_pool3(x3_) 207 | x3_f = x3_.view(x3_.size(0), -1) 208 | x3_c = self.classifier3(x3_f) 209 | 210 | return x1_c, x2_c, x3_c, classifiers, map1, map2, map3 211 | 212 | def img_add_noise(x, transformation_seq): 213 | 214 | x = x.permute(0, 2, 3, 1) 215 | x = x.cpu().numpy() 216 | x = transformation_seq(images=x) 217 | x = torch.from_numpy(x.astype(np.float32)) 218 | x = x.permute(0, 3, 1, 2) 219 | return x 220 | 221 | 222 | def CELoss(x, y): 223 | return smooth_crossentropy(x, y, smoothing=0.1) 224 | 225 | 226 | def train(nb_epoch, batch_size, store_name, num_class=0, start_epoch=0, data_path=''): 227 | 228 | alpha = 1 229 | 230 | exp_dir = store_name 231 | try: 232 | os.stat(exp_dir) 233 | except: 234 | os.makedirs(exp_dir) 235 | 236 | use_cuda = torch.cuda.is_available() 237 | print(use_cuda) 238 | 239 | 240 | print('==> Preparing data..') 241 | transform_train = transforms.Compose([ 242 | transforms.Resize((550, 550)), 243 | transforms.RandomCrop(448, padding=8), 244 | transforms.RandomHorizontalFlip(), 245 | transforms.ToTensor(), 246 | transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)), 247 | ]) 248 | trainset = torchvision.datasets.ImageFolder(root=data_path+'/train', transform=transform_train) 249 | trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=1) 250 | 251 | 252 | net = torchvision.models.resnet50() 253 | state_dict = load_state_dict_from_url('https://download.pytorch.org/models/resnet50-19c8e357.pth') 254 | net.load_state_dict(state_dict) 255 | fc_features = net.fc.in_features 256 | net.fc = nn.Linear(fc_features, num_class) 257 | 258 | net_layers = list(net.children()) 259 | classifier = net_layers[8:10] 260 | net_layers = net_layers[0:8] 261 | net = Network_Wrapper(net_layers, num_class, classifier) 262 | 263 | netp = torch.nn.DataParallel(net, device_ids=[0]) 264 | 265 | 266 | 267 | device = torch.device("cuda") 268 | net.to(device) 269 | decoder1 = Anti_Noise_Decoder(1, 512).to(device) 270 | decoder2 = Anti_Noise_Decoder(2, 1024).to(device) 271 | decoder3 = Anti_Noise_Decoder(4, 2048).to(device) 272 | 273 | 274 | 275 | CB_loss = CharbonnierLoss() 276 | 277 | base_optimizer = torch.optim.SGD 278 | 279 | 280 | optimizer = SAM([ 281 | {'params': net.classifier_initial.parameters(), 'lr': 0.002}, 282 | {'params': 
net.conv_block1.parameters(), 'lr': 0.002}, 283 | {'params': net.classifier1.parameters(), 'lr': 0.002}, 284 | {'params': net.conv_block2.parameters(), 'lr': 0.002}, 285 | {'params': net.classifier2.parameters(), 'lr': 0.002}, 286 | {'params': net.conv_block3.parameters(), 'lr': 0.002}, 287 | {'params': net.classifier3.parameters(), 'lr': 0.002}, 288 | 289 | {'params': decoder1.skip.parameters(), 'lr': 0.002}, 290 | {'params': decoder1.process.parameters(), 'lr': 0.002}, 291 | {'params': decoder2.skip.parameters(), 'lr': 0.002}, 292 | {'params': decoder2.process.parameters(), 'lr': 0.002}, 293 | {'params': decoder3.skip.parameters(), 'lr': 0.002}, 294 | {'params': decoder3.process.parameters(), 'lr': 0.002}, 295 | 296 | {'params': net.Features.parameters(), 'lr': 0.0002} 297 | 298 | ], 299 | base_optimizer, adaptive=False, momentum=0.9, weight_decay=5e-4) 300 | 301 | 302 | 303 | max_val_acc = 0 304 | lr = [0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.0002] 305 | for epoch in range(start_epoch, nb_epoch): 306 | print('\nEpoch: %d' % epoch) 307 | net.train() 308 | train_loss = 0 309 | train_loss1 = 0 310 | train_loss2 = 0 311 | train_loss3 = 0 312 | train_loss4 = 0 313 | train_loss5 = 0 314 | correct = 0 315 | total = 0 316 | idx = 0 317 | for batch_idx, (inputs, targets) in enumerate(trainloader): 318 | idx = batch_idx 319 | if inputs.shape[0] < batch_size: 320 | continue 321 | if use_cuda: 322 | inputs, targets = inputs.to(device), targets.to(device) 323 | inputs, targets = Variable(inputs), Variable(targets) 324 | 325 | 326 | for nlr in range(len(optimizer.param_groups)): 327 | optimizer.param_groups[nlr]['lr'] = cosine_anneal_schedule(epoch, nb_epoch, lr[nlr]) 328 | 329 | sometimes_1 = lambda aug: iaa.Sometimes(0.2, aug) 330 | sometimes_2 = lambda aug: iaa.Sometimes(0.5, aug) 331 | 332 | 333 | 334 | trans_seq_aug = iaa.Sequential( 335 | [ 336 | 337 | sometimes_1(iaa.Affine( 338 | scale={"x": (0.8, 1.2), "y": (0.8, 1.2)}, 339 | translate_percent={"x": (-0.2, 0.2), "y": (-0.2, 0.2)}, 340 | rotate=(-15, 15), 341 | shear=(-15, 15), 342 | order=[0, 1], 343 | cval=(0, 1), 344 | mode=ia.ALL 345 | )), 346 | sometimes_2(iaa.GaussianBlur((0, 3.0))) 347 | ], 348 | random_order=True 349 | ) 350 | 351 | trans_seq = iaa.Sequential( 352 | [ 353 | 354 | iaa.AdditiveGaussianNoise( 355 | loc=0, scale=(0.0, 0.05), per_channel=0.5 356 | ) 357 | ], 358 | random_order=True 359 | ) 360 | 361 | # H1 362 | # H1 first forward-backward step 363 | enable_running_stats(netp) 364 | optimizer.zero_grad() 365 | inputs1_gt = img_add_noise(inputs, trans_seq_aug).to(device) 366 | inputs1 = img_add_noise(inputs1_gt, trans_seq).to(device) 367 | output_1, _, _, _, map1, _, _ = netp(inputs1) 368 | loss1_c = CELoss(output_1, targets).mean() * 1 369 | 370 | inputs1_syn = decoder1(inputs1, map1) 371 | loss1_g = CB_loss(inputs1_syn, inputs1_gt) * 1 372 | 373 | output_1_syn, _, _, _, _, _, _ = netp(inputs1_syn) 374 | loss1_c_syn = CELoss(output_1_syn, targets).mean() * 1 375 | 376 | loss1 = loss1_c + (alpha * loss1_g) + loss1_c_syn 377 | loss1.backward() 378 | optimizer.first_step(zero_grad=True) 379 | 380 | 381 | # H1 second forward-backward step 382 | disable_running_stats(netp) 383 | optimizer.zero_grad() 384 | 385 | output_1, _, _, _, map1, _, _ = netp(inputs1) 386 | loss1_c = CELoss(output_1, targets).mean() * 1 387 | 388 | inputs1_syn = decoder1(inputs1, map1) 389 | loss1_g = CB_loss(inputs1_syn, inputs1_gt) * 1 390 | 391 | output_1_syn, _, _, _, _, _, _ = netp(inputs1_syn) 392 
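# Each PMAL stage combines three terms: CE on the noise-corrupted input
# (inputs1 = inputs1_gt + additive Gaussian noise, where inputs1_gt is the
# affine/blur-augmented image), a Charbonnier reconstruction loss against the
# pre-noise image inputs1_gt (weighted by alpha), and CE on the reconstruction
# passed back through the network.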
| loss1_c_syn = CELoss(output_1_syn, targets).mean() * 1 393 | 394 | loss1_ = loss1_c + (alpha * loss1_g) + loss1_c_syn 395 | loss1_.backward() 396 | optimizer.second_step(zero_grad=True) 397 | 398 | 399 | # H2 400 | # H2 first forward-backward step 401 | enable_running_stats(netp) 402 | optimizer.zero_grad() 403 | inputs2_gt = img_add_noise(inputs, trans_seq_aug).to(device) 404 | inputs2 = img_add_noise(inputs2_gt, trans_seq).to(device) 405 | _, output_2, _, _, _, map2, _ = netp(inputs2) 406 | loss2_c = CELoss(output_2, targets).mean() * 1 407 | 408 | inputs2_syn = decoder2(inputs2, map2) 409 | loss2_g = CB_loss(inputs2_syn, inputs2_gt) * 1 410 | 411 | _, output_2_syn, _, _, _, _, _ = netp(inputs2_syn) 412 | loss2_c_syn = CELoss(output_2_syn, targets).mean() * 1 413 | 414 | loss2 = loss2_c + (alpha * loss2_g) + loss2_c_syn 415 | loss2.backward() 416 | optimizer.first_step(zero_grad=True) 417 | 418 | # H2 second forward-backward step 419 | disable_running_stats(netp) 420 | optimizer.zero_grad() 421 | _, output_2, _, _, _, map2, _ = netp(inputs2) 422 | loss2_c = CELoss(output_2, targets).mean() * 1 423 | 424 | inputs2_syn = decoder2(inputs2, map2) 425 | loss2_g = CB_loss(inputs2_syn, inputs2_gt) * 1 426 | 427 | _, output_2_syn, _, _, _, _, _ = netp(inputs2_syn) 428 | loss2_c_syn = CELoss(output_2_syn, targets).mean() * 1 429 | 430 | loss2_ = loss2_c + (alpha * loss2_g) + loss2_c_syn 431 | loss2_.backward() 432 | optimizer.second_step(zero_grad=True) 433 | 434 | #H3 435 | # H3 first forward-backward step 436 | enable_running_stats(netp) 437 | optimizer.zero_grad() 438 | inputs3_gt = img_add_noise(inputs, trans_seq_aug).to(device) 439 | inputs3 = img_add_noise(inputs3_gt, trans_seq).to(device) 440 | _, _, output_3, _, _, _, map3 = netp(inputs3) 441 | loss3_c = CELoss(output_3, targets).mean() * 1 442 | 443 | inputs3_syn = decoder3(inputs3, map3) 444 | loss3_g = CB_loss(inputs3_syn, inputs3_gt) * 1 445 | 446 | _, _, output_3_syn, _, _, _, _ = netp(inputs3_syn) 447 | loss3_c_syn = CELoss(output_3_syn, targets).mean() * 1 448 | 449 | loss3 = loss3_c + (alpha * loss3_g) + loss3_c_syn 450 | loss3.backward() 451 | optimizer.first_step(zero_grad=True) 452 | 453 | # H3 second forward-backward step 454 | disable_running_stats(netp) 455 | optimizer.zero_grad() 456 | _, _, output_3, _, _, _, map3 = netp(inputs3) 457 | loss3_c = CELoss(output_3, targets).mean() * 1 458 | 459 | inputs3_syn = decoder3(inputs3, map3) 460 | loss3_g = CB_loss(inputs3_syn, inputs3_gt) * 1 461 | 462 | _, _, output_3_syn, _, _, _, _ = netp(inputs3_syn) 463 | loss3_c_syn = CELoss(output_3_syn, targets).mean() * 1 464 | 465 | loss3_ = loss3_c + (alpha * loss3_g) + loss3_c_syn 466 | loss3_.backward() 467 | optimizer.second_step(zero_grad=True) 468 | 469 | 470 | 471 | # H4 472 | # H4 first forward-backward step 473 | enable_running_stats(netp) 474 | optimizer.zero_grad() 475 | output_1_final, output_2_final, output_3_final, output_ORI, _, _, _ = netp(inputs) 476 | ORI_loss = CELoss(output_1_final, targets).mean() + \ 477 | CELoss(output_2_final, targets).mean() + \ 478 | CELoss(output_3_final, targets).mean() + \ 479 | CELoss(output_ORI, targets).mean() * 2 480 | ORI_loss.backward() 481 | optimizer.first_step(zero_grad=True) 482 | 483 | # H4 second forward-backward step 484 | disable_running_stats(netp) 485 | optimizer.zero_grad() 486 | output_1_final, output_2_final, output_3_final, output_ORI, _, _, _ = netp(inputs) 487 | ORI_loss_ = CELoss(output_1_final, targets).mean() + \ 488 | CELoss(output_2_final, targets).mean() + \ 489 | 
CELoss(output_3_final, targets).mean() + \ 490 | CELoss(output_ORI, targets).mean() * 2 491 | ORI_loss_.backward() 492 | optimizer.second_step(zero_grad=True) 493 | 494 | 495 | _, predicted = torch.max(output_ORI.data, 1) 496 | total += targets.size(0) 497 | correct += predicted.eq(targets.data).cpu().sum() 498 | 499 | train_loss += (loss1.item() + loss2.item() + loss3.item() + ORI_loss.item()) 500 | train_loss1 += loss1.item() 501 | train_loss2 += loss2.item() 502 | train_loss3 += loss3.item() 503 | train_loss4 += (loss1_g.item() + loss2_g.item() + loss3_g.item()) 504 | train_loss5 += ORI_loss.item() 505 | 506 | if batch_idx % 50 == 0: 507 | print( 508 | 'Step: %d | Loss1: %.3f | Loss2: %.5f | Loss3: %.5f | Loss_Gen: %.5f |Loss_ORI: %.5f | Loss: %.3f | Acc: %.3f%% (%d/%d)' % ( 509 | batch_idx, train_loss1 / (batch_idx + 1), train_loss2 / (batch_idx + 1), 510 | train_loss3 / (batch_idx + 1), train_loss4 / (batch_idx + 1), train_loss5/ (batch_idx + 1), train_loss / (batch_idx + 1), 511 | 100. * float(correct) / total, correct, total)) 512 | 513 | train_acc = 100. * float(correct) / total 514 | train_loss = train_loss / (idx + 1) 515 | with open(exp_dir + '/results_train.txt', 'a') as file: 516 | file.write( 517 | 'Iteration %d | train_acc = %.5f | train_loss = %.5f | Loss1: %.3f | Loss2: %.5f | Loss3: %.5f | Loss_Gen: %.5f | Loss_ORI: %.5f |\n' % ( 518 | epoch, train_acc, train_loss, train_loss1 / (idx + 1), train_loss2 / (idx + 1), train_loss3 / (idx + 1), 519 | train_loss4 / (idx + 1), train_loss5 / (idx + 1))) 520 | 521 | 522 | val_acc_com, val_loss = test(net, CELoss, 7, data_path+'/test') 523 | if val_acc_com > max_val_acc: 524 | max_val_acc = val_acc_com 525 | net.cpu() 526 | decoder1.cpu() 527 | decoder2.cpu() 528 | decoder3.cpu() 529 | torch.save(net, './' + store_name + '/model.pth') 530 | torch.save(decoder1, './' + store_name + '/decoder1.pth') 531 | torch.save(decoder2, './' + store_name + '/decoder2.pth') 532 | torch.save(decoder3, './' + store_name + '/decoder3.pth') 533 | net.to(device) 534 | decoder1.to(device) 535 | decoder2.to(device) 536 | decoder3.to(device) 537 | with open(exp_dir + '/results_test.txt', 'a') as file: 538 | file.write('Iteration %d, test_acc_combined = %.5f, test_loss = %.6f\n' % ( 539 | epoch, val_acc_com, val_loss)) 540 | 541 | 542 | if __name__ == '__main__': 543 | data_path = '/Stanford Cars' 544 | if not os.path.isdir('results'): 545 | os.mkdir('results') 546 | train(nb_epoch=200, # number of epoch 547 | batch_size=8, # batch size 548 | store_name='results/Stanford_Cars_ResNet50_PMAL', # the folder for saving results 549 | num_class=196, # number of categories 550 | start_epoch=0, # the start epoch number 551 | data_path = data_path) # the path to the dataset 552 | 553 | -------------------------------------------------------------------------------- /Stanford_Cars_TResNet_L_PMAL.py: -------------------------------------------------------------------------------- 1 | from __future__ import print_function 2 | import os 3 | 4 | os.environ["CUDA_VISIBLE_DEVICES"] = "1" 5 | import torchvision.models 6 | from sam import SAM 7 | from torch.utils.model_zoo import load_url as load_state_dict_from_url 8 | import imgaug as ia 9 | import imgaug.augmenters as iaa 10 | from vic.loss import CharbonnierLoss 11 | import numpy as np 12 | import torchvision 13 | from torch.autograd import Variable 14 | from torchvision import transforms 15 | from basic_conv import * 16 | from example.model.smooth_cross_entropy import smooth_crossentropy 17 | from 
example.utility.bypass_bn import enable_running_stats, disable_running_stats 18 | from src.models.tresnet_v2.tresnet_v2 import TResnetL_V2 as TResnetL368 19 | import requests 20 | import torch.nn.functional as F 21 | 22 | 23 | def cosine_anneal_schedule(t, nb_epoch, lr): 24 | cos_inner = np.pi * (t % (nb_epoch)) 25 | cos_inner /= (nb_epoch) 26 | cos_out = np.cos(cos_inner) + 1 27 | 28 | return float(lr / 2 * cos_out) 29 | 30 | 31 | def test(net, criterion, batch_size, test_path): 32 | net.eval() 33 | use_cuda = torch.cuda.is_available() 34 | test_loss = 0 35 | correct = 0 36 | correct_com = 0 37 | total = 0 38 | idx = 0 39 | device = torch.device("cuda") 40 | 41 | transform_test = transforms.Compose([ 42 | transforms.Resize((421, 421)), 43 | transforms.CenterCrop(368), 44 | transforms.ToTensor(), 45 | transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)), 46 | ]) 47 | testset = torchvision.datasets.ImageFolder(root=test_path, 48 | transform=transform_test) 49 | testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size, shuffle=True, num_workers=4) 50 | 51 | for batch_idx, (inputs, targets) in enumerate(testloader): 52 | idx = batch_idx 53 | if use_cuda: 54 | inputs, targets = inputs.to(device), targets.to(device) 55 | with torch.no_grad(): 56 | inputs, targets = Variable(inputs), Variable(targets) 57 | output_1, output_2, output_3, output_ORI, _, _, _ = net(inputs) 58 | 59 | outputs_com = output_1.cpu() + output_2.cpu() + output_3.cpu() + output_ORI.cpu() 60 | 61 | loss = criterion(output_ORI, targets).mean().cpu() 62 | 63 | test_loss += loss.cpu().item() 64 | _, predicted = torch.max(output_ORI.data.cpu(), 1) 65 | _, predicted_com = torch.max(outputs_com.data.cpu(), 1) 66 | 67 | total += targets.size(0) 68 | correct += predicted.eq(targets.data.cpu()).cpu().sum() 69 | correct_com += predicted_com.eq(targets.data.cpu()).cpu().sum() 70 | 71 | if batch_idx % 50 == 0: 72 | print('Step: %d | Loss: %.3f |Combined Acc: %.3f%% (%d/%d)' % ( 73 | batch_idx, test_loss / (batch_idx + 1), 74 | 100. * float(correct_com) / total, correct_com, total)) 75 | 76 | test_acc_en = 100. 
* float(correct_com) / total 77 | test_loss = test_loss / (idx + 1) 78 | del inputs 79 | del loss 80 | del targets 81 | del output_1 82 | del output_2 83 | del output_3 84 | del output_ORI 85 | torch.cuda.empty_cache() 86 | 87 | return test_acc_en, test_loss 88 | 89 | 90 | 91 | class Features(nn.Module): 92 | def __init__(self, net_layers_FeatureHead): 93 | super(Features, self).__init__() 94 | self.net_layer_0 = nn.Sequential(net_layers_FeatureHead[0]) 95 | self.net_layer_1 = nn.Sequential(*net_layers_FeatureHead[1]) 96 | self.net_layer_2 = nn.Sequential(*net_layers_FeatureHead[2]) 97 | self.net_layer_3 = nn.Sequential(*net_layers_FeatureHead[3]) 98 | self.net_layer_4 = nn.Sequential(*net_layers_FeatureHead[4]) 99 | self.net_layer_5 = nn.Sequential(*net_layers_FeatureHead[5]) 100 | 101 | def forward(self, x): 102 | x = self.net_layer_0(x) 103 | x = self.net_layer_1(x) 104 | x = self.net_layer_2(x) 105 | x1 = self.net_layer_3(x) 106 | x2 = self.net_layer_4(x1) 107 | x3 = self.net_layer_5(x2) 108 | 109 | return x1, x2, x3 110 | 111 | 112 | class Network_Wrapper(nn.Module): 113 | def __init__(self, net_layers, num_classes, classifier): 114 | super().__init__() 115 | self.Features = Features(net_layers) 116 | self.classifier_pool = nn.Sequential(classifier[0]) 117 | self.classifier_initial = nn.Sequential(classifier[1]) 118 | self.sigmoid = nn.Sigmoid() 119 | self.lrelu = nn.LeakyReLU(negative_slope=0.1, inplace=True) 120 | 121 | self.max_pool1 = nn.MaxPool2d(kernel_size=46, stride=1) 122 | self.max_pool2 = nn.MaxPool2d(kernel_size=23, stride=1) 123 | self.max_pool3 = nn.MaxPool2d(kernel_size=12, stride=1) 124 | 125 | self.conv_block1 = nn.Sequential( 126 | BasicConv(512, 512, kernel_size=1, stride=1, padding=0, relu=True), 127 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 128 | ) 129 | self.classifier1 = nn.Sequential( 130 | nn.BatchNorm1d(1024), 131 | nn.Linear(1024, 512), 132 | nn.BatchNorm1d(512), 133 | nn.ELU(inplace=True), 134 | nn.Linear(512, num_classes) 135 | ) 136 | 137 | self.conv_block2 = nn.Sequential( 138 | BasicConv(1024, 512, kernel_size=1, stride=1, padding=0, relu=True), 139 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 140 | ) 141 | self.classifier2 = nn.Sequential( 142 | nn.BatchNorm1d(1024), 143 | nn.Linear(1024, 512), 144 | nn.BatchNorm1d(512), 145 | nn.ELU(inplace=True), 146 | nn.Linear(512, num_classes), 147 | ) 148 | 149 | self.conv_block3 = nn.Sequential( 150 | BasicConv(2048, 512, kernel_size=1, stride=1, padding=0, relu=True), 151 | BasicConv(512, 1024, kernel_size=3, stride=1, padding=1, relu=True) 152 | ) 153 | self.classifier3 = nn.Sequential( 154 | nn.BatchNorm1d(1024), 155 | nn.Linear(1024, 512), 156 | nn.BatchNorm1d(512), 157 | nn.ELU(inplace=True), 158 | nn.Linear(512, num_classes), 159 | ) 160 | 161 | def forward(self, x): 162 | x1, x2, x3 = self.Features(x) 163 | map1 = x1.clone() 164 | map2 = x2.clone() 165 | map3 = x3.clone() 166 | 167 | classifiers = self.classifier_pool(x3).view(x3.size(0), -1) 168 | classifiers = self.classifier_initial(classifiers) 169 | 170 | x1_ = self.conv_block1(x1) 171 | x1_ = self.max_pool1(x1_) 172 | x1_f = x1_.view(x1_.size(0), -1) 173 | 174 | x1_c = self.classifier1(x1_f) 175 | 176 | x2_ = self.conv_block2(x2) 177 | x2_ = self.max_pool2(x2_) 178 | x2_f = x2_.view(x2_.size(0), -1) 179 | x2_c = self.classifier2(x2_f) 180 | 181 | x3_ = self.conv_block3(x3) 182 | x3_ = self.max_pool3(x3_) 183 | x3_f = x3_.view(x3_.size(0), -1) 184 | x3_c = self.classifier3(x3_f) 185 | 186 | return 
187 | 
188 | 
189 | class Anti_Noise_Decoder(nn.Module):
190 |     def __init__(self, scale, in_channel):
191 |         super(Anti_Noise_Decoder, self).__init__()
192 |         self.Sigmoid = nn.Sigmoid()
193 | 
194 |         in_channel = in_channel // (scale * scale)  # channels remaining after the initial PixelShuffle(scale)
195 | 
196 |         self.skip = nn.Sequential(
197 |             nn.Conv2d(3, 64, 3, 1, 1, bias=False),
198 |             nn.LeakyReLU(negative_slope=0.1, inplace=True),
199 |             nn.Conv2d(64, 3, 3, 1, 1, bias=False),
200 |             nn.LeakyReLU(negative_slope=0.1, inplace=True)
201 | 
202 |         )
203 | 
204 |         self.process = nn.Sequential(
205 |             nn.PixelShuffle(scale),
206 |             nn.Conv2d(in_channel, 256, 3, 1, 1, bias=False),
207 |             nn.LeakyReLU(negative_slope=0.1, inplace=True),
208 |             nn.PixelShuffle(2),  # 256 -> 64 channels, 2x spatial
209 |             nn.Conv2d(64, 128, 3, 1, 1, bias=False),
210 |             nn.LeakyReLU(negative_slope=0.1, inplace=True),
211 |             nn.PixelShuffle(2),  # 128 -> 32 channels, 2x spatial
212 |             nn.Conv2d(32, 64, 3, 1, 1, bias=False),
213 |             nn.LeakyReLU(negative_slope=0.1, inplace=True),
214 |             nn.PixelShuffle(2),  # 64 -> 16 channels, 2x spatial
215 |             nn.Conv2d(16, 3, 3, 1, 1, bias=False),
216 |             nn.LeakyReLU(negative_slope=0.1, inplace=True)
217 |         )
218 | 
219 |     def forward(self, x, feature_map):
220 |         x_ = self.process(feature_map)
221 |         if not (x.size() == x_.size()):
222 |             x_ = F.interpolate(x_, (x.size(2), x.size(3)), mode='bilinear')  # resize the decoded image, not the input (the original resized x to its own size, discarding x_)
223 |         return self.skip(x) + x_
224 | 
225 | 
226 | def img_add_noise(x, transformation_seq):
227 |     x = x.permute(0, 2, 3, 1)  # NCHW -> NHWC for imgaug
228 |     x = x.cpu().numpy()
229 |     x = transformation_seq(images=x)
230 |     x = torch.from_numpy(x.astype(np.float32))
231 |     x = x.permute(0, 3, 1, 2)  # back to NCHW
232 |     return x
233 | 
234 | 
235 | def CELoss(x, y):
236 |     return smooth_crossentropy(x, y, smoothing=0.1)  # label-smoothed cross-entropy from the SAM example code
237 | 
238 | 
239 | def train(nb_epoch, batch_size, store_name, num_class=0, start_epoch=0, data_path=''):
240 |     alpha = 1  # weight of the image-reconstruction (anti-noise) loss
241 | 
242 |     exp_dir = store_name
243 |     try:
244 |         os.stat(exp_dir)
245 |     except OSError:
246 |         os.makedirs(exp_dir)
247 | 
248 |     use_cuda = torch.cuda.is_available()
249 |     print(use_cuda)
250 | 
251 |     print('==> Preparing data..')
252 |     transform_train = transforms.Compose([
253 |         transforms.Resize((421, 421)),
254 |         transforms.RandomCrop(368, padding=8),
255 |         transforms.RandomHorizontalFlip(),
256 |         transforms.ToTensor(),
257 |         transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
258 |     ])
259 |     trainset = torchvision.datasets.ImageFolder(root=data_path + '/train', transform=transform_train)
260 |     trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=1)
261 | 
262 |     model_params = {'num_classes': num_class}
263 | 
264 |     model = TResnetL368(model_params)
265 |     weights_url = \
266 |         'https://miil-public-eu.oss-eu-central-1.aliyuncs.com/model-zoo/tresnet/stanford_cars_tresnet-l-v2_96_27.pth'
267 |     weights_path = "tresnet-l-v2.pth"
268 | 
269 |     if not os.path.exists(weights_path):
270 |         print('downloading weights...')
271 |         r = requests.get(weights_url)
272 |         with open(weights_path, "wb") as code:
273 |             code.write(r.content)
274 |     pretrained_weights = torch.load(weights_path)
275 |     model.load_state_dict(pretrained_weights['model'])
276 | 
277 |     net_layers = list(model.children())
278 |     classifier = net_layers[1:3]  # the backbone's global pool and FC head, reused as the original head
279 |     net_layers = net_layers[0]
280 |     net_layers = list(net_layers.children())  # the six stages of the backbone body
281 |     net = Network_Wrapper(net_layers, num_class, classifier)
282 | 
283 |     netp = torch.nn.DataParallel(net, device_ids=[0])  # single-GPU DataParallel wrapper
284 | 
285 |     device = torch.device("cuda")
286 |     net.to(device)
287 |     decoder1 = Anti_Noise_Decoder(1, 512).to(device)
288 |     decoder2 = Anti_Noise_Decoder(2, 1024).to(device)
289 |     decoder3 = Anti_Noise_Decoder(4, 2048).to(device)
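    # Editor's note (hypothetical shape check, assuming 368x368 crops): each
    # decoder upsamples by PixelShuffle(scale) followed by three PixelShuffle(2)
    # stages, i.e. scale*8 in total, so decoder1: 46 -> 368, decoder2: 23 -> 368,
    # and decoder3: 12 -> 384, which forward() then resizes back to 368 via
    # F.interpolate. For example:
    #   out = decoder3(torch.randn(2, 3, 368, 368).to(device),
    #                  torch.randn(2, 2048, 12, 12).to(device))
    #   assert out.shape == (2, 3, 368, 368)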
290 | 
291 |     CB_loss = CharbonnierLoss()  # reconstruction loss from the vic folder (see the README for a replacement)
292 | 
293 |     base_optimizer = torch.optim.SGD
294 | 
295 |     optimizer = SAM([  # sharpness-aware minimization wrapped around SGD
296 |         {'params': net.classifier_initial.parameters(), 'lr': 0.002},
297 |         {'params': net.conv_block1.parameters(), 'lr': 0.002},
298 |         {'params': net.classifier1.parameters(), 'lr': 0.002},
299 |         {'params': net.conv_block2.parameters(), 'lr': 0.002},
300 |         {'params': net.classifier2.parameters(), 'lr': 0.002},
301 |         {'params': net.conv_block3.parameters(), 'lr': 0.002},
302 |         {'params': net.classifier3.parameters(), 'lr': 0.002},
303 | 
304 |         {'params': decoder1.skip.parameters(), 'lr': 0.002},
305 |         {'params': decoder1.process.parameters(), 'lr': 0.002},
306 |         {'params': decoder2.skip.parameters(), 'lr': 0.002},
307 |         {'params': decoder2.process.parameters(), 'lr': 0.002},
308 |         {'params': decoder3.skip.parameters(), 'lr': 0.002},
309 |         {'params': decoder3.process.parameters(), 'lr': 0.002},
310 | 
311 |         {'params': net.Features.parameters(), 'lr': 0.0002}  # pretrained backbone: 10x smaller rate
312 | 
313 |     ],
314 |         base_optimizer, adaptive=False, momentum=0.9, weight_decay=5e-4)
315 | 
316 |     max_val_acc = 0
317 |     lr = [0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.002, 0.0002]  # one base learning rate per param group above
318 |     for epoch in range(start_epoch, nb_epoch):
319 |         torch.cuda.empty_cache()
320 | 
321 |         print('\nEpoch: %d' % epoch)
322 |         net.train()
323 |         train_loss = 0
324 |         train_loss1 = 0
325 |         train_loss2 = 0
326 |         train_loss3 = 0
327 |         train_loss4 = 0
328 |         train_loss5 = 0
329 |         correct = 0
330 |         total = 0
331 |         idx = 0
332 |         for batch_idx, (inputs, targets) in enumerate(trainloader):
333 |             idx = batch_idx
334 |             if inputs.shape[0] < batch_size:
335 |                 continue  # skip the last incomplete batch
336 |             if use_cuda:
337 |                 inputs, targets = inputs.to(device), targets.to(device)
338 |             inputs, targets = Variable(inputs), Variable(targets)  # no-op in modern PyTorch; kept for compatibility
339 | 
340 |             for nlr in range(len(optimizer.param_groups)):
341 |                 optimizer.param_groups[nlr]['lr'] = cosine_anneal_schedule(epoch, nb_epoch, lr[nlr])
342 | 
343 |             sometimes_1 = lambda aug: iaa.Sometimes(0.2, aug)
344 |             sometimes_2 = lambda aug: iaa.Sometimes(0.5, aug)
345 | 
346 |             trans_seq_aug = iaa.Sequential(  # augmentation that produces the "clean" reconstruction target
347 |                 [
348 | 
349 |                     sometimes_1(iaa.Affine(
350 |                         scale={"x": (0.8, 1.2), "y": (0.8, 1.2)},
351 |                         translate_percent={"x": (-0.2, 0.2), "y": (-0.2, 0.2)},
352 |                         rotate=(-15, 15),
353 |                         shear=(-15, 15),
354 |                         order=[0, 1],
355 |                         cval=(0, 1),
356 |                         mode=ia.ALL
357 |                     )),
358 |                     sometimes_2(iaa.GaussianBlur((0, 3.0)))
359 |                 ],
360 |                 random_order=True
361 |             )
362 | 
363 |             trans_seq = iaa.Sequential(  # the noise that the decoders learn to remove
364 |                 [
365 | 
366 |                     iaa.AdditiveGaussianNoise(
367 |                         loc=0, scale=(0.0, 0.05), per_channel=0.5
368 |                     )
369 |                 ],
370 |                 random_order=True
371 |             )
372 | 
373 |             # H1: anti-noise branch on the shallowest tapped feature map (map1)
374 |             # H1 first forward-backward step
375 |             enable_running_stats(netp)
376 |             optimizer.zero_grad()
377 |             inputs1_gt = img_add_noise(inputs, trans_seq_aug).to(device)  # augmented clean image = reconstruction target
378 |             inputs1 = img_add_noise(inputs1_gt, trans_seq).to(device)  # noisy network input
379 |             output_1, _, _, _, map1, _, _ = netp(inputs1)
380 |             loss1_c = CELoss(output_1, targets).mean() * 1
381 | 
382 |             inputs1_syn = decoder1(inputs1, map1)  # image denoised from map1
383 |             loss1_g = CB_loss(inputs1_syn, inputs1_gt) * 1
384 | 
385 |             output_1_syn, _, _, _, _, _, _ = netp(inputs1_syn)
386 |             loss1_c_syn = CELoss(output_1_syn, targets).mean() * 1
387 | 
388 |             loss1 = loss1_c + (alpha * loss1_g) + loss1_c_syn  # classification + reconstruction + classification on the denoised image
389 |             loss1.backward()
390 |             optimizer.first_step(zero_grad=True)
391 | 
392 |             # H1 second forward-backward step
393 |             disable_running_stats(netp)
394 |             optimizer.zero_grad()
395 | 
396 |             output_1, _, _, _, map1, _, _ = netp(inputs1)
397 |             loss1_c = CELoss(output_1, targets).mean() * 1
398 | 
399 |             inputs1_syn = decoder1(inputs1, map1)
400 |             loss1_g = CB_loss(inputs1_syn, inputs1_gt) * 1
401 | 
402 |             output_1_syn, _, _, _, _, _, _ = netp(inputs1_syn)
403 |             loss1_c_syn = CELoss(output_1_syn, targets).mean() * 1
404 | 
405 |             loss1_ = loss1_c + (alpha * loss1_g) + loss1_c_syn
406 |             loss1_.backward()
407 |             optimizer.second_step(zero_grad=True)
408 | 
409 |             loss1 = loss1.cpu()
410 |             loss1_g = loss1_g.cpu()
411 |             del output_1
412 |             del output_1_syn
413 |             del loss1_
414 |             del loss1_c
415 |             del loss1_c_syn
416 |             del inputs1
417 |             del inputs1_gt
418 |             del inputs1_syn
419 |             torch.cuda.empty_cache()
420 | 
421 |             # H2: same scheme on the mid-level feature map (map2)
422 |             # H2 first forward-backward step
423 |             enable_running_stats(netp)
424 |             optimizer.zero_grad()
425 |             inputs2_gt = img_add_noise(inputs, trans_seq_aug).to(device)
426 |             inputs2 = img_add_noise(inputs2_gt, trans_seq).to(device)
427 |             _, output_2, _, _, _, map2, _ = netp(inputs2)
428 |             loss2_c = CELoss(output_2, targets).mean() * 1
429 | 
430 |             inputs2_syn = decoder2(inputs2, map2)
431 |             loss2_g = CB_loss(inputs2_syn, inputs2_gt) * 1
432 | 
433 |             _, output_2_syn, _, _, _, _, _ = netp(inputs2_syn)
434 |             loss2_c_syn = CELoss(output_2_syn, targets).mean() * 1
435 | 
436 |             loss2 = loss2_c + (alpha * loss2_g) + loss2_c_syn
437 |             loss2.backward()
438 |             optimizer.first_step(zero_grad=True)
439 | 
440 |             # H2 second forward-backward step
441 |             disable_running_stats(netp)
442 |             optimizer.zero_grad()
443 |             _, output_2, _, _, _, map2, _ = netp(inputs2)
444 |             loss2_c = CELoss(output_2, targets).mean() * 1
445 | 
446 |             inputs2_syn = decoder2(inputs2, map2)
447 |             loss2_g = CB_loss(inputs2_syn, inputs2_gt) * 1
448 | 
449 |             _, output_2_syn, _, _, _, _, _ = netp(inputs2_syn)
450 |             loss2_c_syn = CELoss(output_2_syn, targets).mean() * 1
451 | 
452 |             loss2_ = loss2_c + (alpha * loss2_g) + loss2_c_syn
453 |             loss2_.backward()
454 |             optimizer.second_step(zero_grad=True)
455 | 
456 |             loss2 = loss2.cpu()
457 |             loss2_g = loss2_g.cpu()
458 |             del output_2
459 |             del output_2_syn
460 |             del loss2_
461 |             del loss2_c
462 |             del loss2_c_syn
463 |             del inputs2
464 |             del inputs2_gt
465 |             del inputs2_syn
466 |             torch.cuda.empty_cache()
467 | 
468 |             # H3: same scheme on the deepest feature map (map3)
469 |             # H3 first forward-backward step
470 |             enable_running_stats(netp)
471 |             optimizer.zero_grad()
472 |             inputs3_gt = img_add_noise(inputs, trans_seq_aug).to(device)
473 |             inputs3 = img_add_noise(inputs3_gt, trans_seq).to(device)
474 |             _, _, output_3, _, _, _, map3 = netp(inputs3)
475 |             loss3_c = CELoss(output_3, targets).mean() * 1
476 | 
477 |             inputs3_syn = decoder3(inputs3, map3)
478 |             loss3_g = CB_loss(inputs3_syn, inputs3_gt) * 1
479 | 
480 |             _, _, output_3_syn, _, _, _, _ = netp(inputs3_syn)
481 |             loss3_c_syn = CELoss(output_3_syn, targets).mean() * 1
482 | 
483 |             loss3 = loss3_c + (alpha * loss3_g) + loss3_c_syn
484 |             loss3.backward()
485 |             optimizer.first_step(zero_grad=True)
486 | 
487 |             # H3 second forward-backward step
488 |             disable_running_stats(netp)
489 |             optimizer.zero_grad()
490 |             _, _, output_3, _, _, _, map3 = netp(inputs3)
491 |             loss3_c = CELoss(output_3, targets).mean() * 1
492 | 
493 |             inputs3_syn = decoder3(inputs3, map3)
494 |             loss3_g = CB_loss(inputs3_syn, inputs3_gt) * 1
495 | 
496 |             _, _, output_3_syn, _, _, _, _ = netp(inputs3_syn)
497 |             loss3_c_syn = CELoss(output_3_syn, targets).mean() * 1
498 | 
499 |             loss3_ = loss3_c + (alpha * loss3_g) + loss3_c_syn
500 |             loss3_.backward()
501 |             optimizer.second_step(zero_grad=True)
502 | 
503 |             loss3 = loss3.cpu()
504 |             loss3_g = loss3_g.cpu()
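            # Editor's note: each of H1-H3 above follows the two-step SAM
            # pattern from https://github.com/davda54/sam -- a first pass with
            # BatchNorm running stats enabled, loss.backward(), and
            # optimizer.first_step() to move to the local worst-case weights;
            # then a second pass that recomputes the same losses with running
            # stats frozen and applies the real SGD update. Minimal sketch:
            #   enable_running_stats(model)
            #   loss_fn(model(x), y).backward()
            #   optimizer.first_step(zero_grad=True)
            #   disable_running_stats(model)
            #   loss_fn(model(x), y).backward()
            #   optimizer.second_step(zero_grad=True)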
505 |             del output_3
506 |             del output_3_syn
507 |             del loss3_
508 |             del loss3_c
509 |             del loss3_c_syn
510 |             del inputs3
511 |             del inputs3_gt
512 |             del inputs3_syn
513 |             torch.cuda.empty_cache()
514 | 
515 |             # H4: all classification heads trained jointly on the clean (noise-free) images
516 |             # H4 first forward-backward step
517 |             enable_running_stats(netp)
518 |             optimizer.zero_grad()
519 |             output_1_final, output_2_final, output_3_final, output_ORI, _, _, _ = netp(inputs)
520 |             ORI_loss = CELoss(output_1_final, targets).mean() + \
521 |                        CELoss(output_2_final, targets).mean() + \
522 |                        CELoss(output_3_final, targets).mean() + \
523 |                        CELoss(output_ORI, targets).mean() * 2  # the backbone head carries double weight
524 |             ORI_loss.backward()
525 |             optimizer.first_step(zero_grad=True)
526 | 
527 |             # H4 second forward-backward step
528 |             disable_running_stats(netp)
529 |             optimizer.zero_grad()
530 |             output_1_final, output_2_final, output_3_final, output_ORI, _, _, _ = netp(inputs)
531 |             ORI_loss_ = CELoss(output_1_final, targets).mean() + \
532 |                         CELoss(output_2_final, targets).mean() + \
533 |                         CELoss(output_3_final, targets).mean() + \
534 |                         CELoss(output_ORI, targets).mean() * 2
535 |             ORI_loss_.backward()
536 |             optimizer.second_step(zero_grad=True)
537 | 
538 |             ORI_loss = ORI_loss.cpu()
539 |             del output_1_final
540 |             del output_2_final
541 |             del output_3_final
542 |             output_ORI = output_ORI.cpu()
543 |             targets = targets.cpu()
544 |             del inputs
545 |             del ORI_loss_
546 |             torch.cuda.empty_cache()
547 | 
548 |             _, predicted = torch.max(output_ORI.data, 1)  # training accuracy is tracked on the backbone head
549 |             total += targets.size(0)
550 |             correct += predicted.eq(targets.data).cpu().sum()
551 | 
552 |             train_loss += (loss1.item() + loss2.item() + loss3.item() + ORI_loss.item())
553 |             train_loss1 += loss1.item()
554 |             train_loss2 += loss2.item()
555 |             train_loss3 += loss3.item()
556 |             train_loss4 += (loss1_g.item() + loss2_g.item() + loss3_g.item())
557 |             train_loss5 += ORI_loss.item()
558 | 
559 |             if batch_idx % 50 == 0:
560 |                 print(
561 |                     'Step: %d | Loss1: %.3f | Loss2: %.5f | Loss3: %.5f | Loss_Gen: %.5f | Loss_ORI: %.5f | Loss: %.3f | Acc: %.3f%% (%d/%d)' % (
562 |                         batch_idx, train_loss1 / (batch_idx + 1), train_loss2 / (batch_idx + 1),
563 |                         train_loss3 / (batch_idx + 1), train_loss4 / (batch_idx + 1), train_loss5 / (batch_idx + 1),
564 |                         train_loss / (batch_idx + 1),
565 |                         100. * float(correct) / total, correct, total))
566 | 
567 |         train_acc = 100. * float(correct) / total
568 |         train_loss = train_loss / (idx + 1)
569 |         with open(exp_dir + '/results_train.txt', 'a') as file:
570 |             file.write(
571 |                 'Iteration %d | train_acc = %.5f | train_loss = %.5f | Loss1: %.3f | Loss2: %.5f | Loss3: %.5f | Loss_Gen: %.5f | Loss_ORI: %.5f |\n' % (
572 |                     epoch, train_acc, train_loss, train_loss1 / (idx + 1), train_loss2 / (idx + 1),
573 |                     train_loss3 / (idx + 1),
574 |                     train_loss4 / (idx + 1), train_loss5 / (idx + 1)))
575 | 
576 | 
577 |         val_acc_com, val_loss = test(net, CELoss, 3, data_path + '/test')  # evaluate on the test split after every epoch
578 |         if val_acc_com > max_val_acc:
579 |             max_val_acc = val_acc_com
580 |             net.cpu()
581 |             decoder1.cpu()
582 |             decoder2.cpu()
583 |             decoder3.cpu()
584 |             torch.save(net, './' + store_name + '/model.pth')  # full-module saves, the xxxxx_Network.pth convention described in the README
585 |             torch.save(decoder1, './' + store_name + '/decoder1.pth')
586 |             torch.save(decoder2, './' + store_name + '/decoder2.pth')
587 |             torch.save(decoder3, './' + store_name + '/decoder3.pth')
588 |             net.to(device)
589 |             decoder1.to(device)
590 |             decoder2.to(device)
591 |             decoder3.to(device)
592 |         with open(exp_dir + '/results_test.txt', 'a') as file:
593 |             file.write('Iteration %d, test_acc_combined = %.5f, test_loss = %.6f\n' % (
594 |                 epoch, val_acc_com, val_loss))
595 | 
596 | 
597 | 
598 | if __name__ == '__main__':
599 |     data_path = '/Stanford Cars'  # modify this path to point at your dataset folder
600 |     if not os.path.isdir('results'):
601 |         os.mkdir('results')
602 |     train(nb_epoch=200,  # number of epochs
603 |           batch_size=8,  # batch size
604 |           store_name='results/Stanford_Cars_TResNet_L_PMAL',  # the folder for saving results
605 |           num_class=196,  # number of categories
606 |           start_epoch=0,  # the start epoch number
607 |           data_path=data_path)  # the path to the dataset
608 | 
609 | 
--------------------------------------------------------------------------------
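Note: `cosine_anneal_schedule`, `SAM`, `enable_running_stats`/`disable_running_stats`, `smooth_crossentropy`, and `CharbonnierLoss` are imported from the `example`, `sam.py`, and `vic` dependencies described in the README rather than defined in this file. Since the original `vic` folder is no longer available, the following minimal stand-ins may help when reproducing the code; they are assumptions consistent with how the functions are called above, not the original implementations:

    import math
    import torch
    import torch.nn as nn

    def cosine_anneal_schedule(epoch, nb_epoch, base_lr):
        # Cosine decay from base_lr toward 0 over nb_epoch epochs
        # (assumed form, matching the call cosine_anneal_schedule(epoch, nb_epoch, lr[nlr])).
        return base_lr / 2 * (1 + math.cos(math.pi * epoch / nb_epoch))

    class CharbonnierLoss(nn.Module):
        # Smooth, differentiable approximation of the L1 loss, widely used for
        # image restoration; an unofficial stand-in for the lost vic version.
        def __init__(self, eps=1e-6):
            super().__init__()
            self.eps = eps

        def forward(self, pred, target):
            return torch.mean(torch.sqrt((pred - target) ** 2 + self.eps))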