├── README.md
├── images
│   ├── LGM_loss_epoch=100.jpg
│   ├── LGM_loss_epoch=50.jpg
│   ├── LGMu_loss_epoch=100.jpg
│   ├── LGMu_loss_epoch=50.jpg
│   ├── LMCL_loss_u_epoch=100.jpg
│   ├── LMCL_loss_u_epoch=50.jpg
│   ├── center_loss_epoch=100.jpg
│   ├── center_loss_epoch=50.jpg
│   ├── coco_loss_epoch=100.jpg
│   ├── coco_loss_epoch=50.jpg
│   ├── softmax_loss_epoch=100.jpg
│   └── softmax_loss_epoch=50.jpg
├── model_utils.py
├── train_mnist_COCO_loss.py
├── train_mnist_LGM.py
├── train_mnist_LGM_u.py
├── train_mnist_LMCL.py
├── train_mnist_center_loss.py
└── train_mnist_softmax.py

/README.md:
--------------------------------------------------------------------------------
# softmax_variants
Various loss functions for softmax variants: center loss, CosFace loss, large-margin Gaussian mixture loss and COCO loss,
implemented in PyTorch 0.3.1.

The training dataset is MNIST.

You can run any train_mnist_xxx.py script directly to reproduce the results.

The reference papers are as follows:

Center loss: Yandong Wen, Kaipeng Zhang, Zhifeng Li and Yu Qiao. A Discriminative Feature Learning Approach for Deep Face Recognition. ECCV 2016

CosFace loss: Hao Wang, Yitong Wang, Zheng Zhou, Xing Ji, Dihong Gong, Jingchao Zhou, Zhifeng Li, and Wei Liu. CosFace: Large Margin Cosine Loss for Deep Face Recognition. CVPR 2018

Large-margin Gaussian mixture loss: Weitao Wan, Yuanyi Zhong, Tianpeng Li, Jiansheng Chen. Rethinking Feature Distribution for Loss Functions in Image Classification. CVPR 2018

COCO loss: Yu Liu, Hongyang Li, Xiaogang Wang. Rethinking Feature Discrimination and Polymerization for Large-Scale Recognition. NIPS Workshop 2017

The learned 2-D embedding features are:

softmax loss

![image](https://github.com/YirongMao/softmax_variants/blob/master/images/softmax_loss_epoch%3D50.jpg)

COCO loss

![image](https://github.com/YirongMao/softmax_variants/blob/master/images/coco_loss_epoch%3D50.jpg)

Center loss

![image](https://github.com/YirongMao/softmax_variants/blob/master/images/center_loss_epoch%3D50.jpg)

CosFace loss

![image](https://github.com/YirongMao/softmax_variants/blob/master/images/LMCL_loss_u_epoch%3D50.jpg)

Large-margin Gaussian mixture loss

![image](https://github.com/YirongMao/softmax_variants/blob/master/images/LGM_loss_epoch%3D50.jpg)
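All scripts share the same recipe: the small CNN in model_utils.py maps each digit to a 2-D embedding, a loss head turns that embedding into logits, and the network and the loss head are updated by two separate SGD optimizers. A minimal sketch of one training step (illustrative only; `images` and `labels` stand for one MNIST batch, names follow train_mnist_COCO_loss.py):

```python
import torch.nn as nn
import torch.optim as optim
import model_utils

model = model_utils.Net()                                # CNN with a 2-D embedding (ip1)
head = model_utils.COCOLoss(num_classes=10, feat_dim=2)  # swap in any loss head from model_utils
ce = nn.CrossEntropyLoss()

opt_nn = optim.SGD(model.parameters(), lr=0.001, momentum=0.9, weight_decay=0.0005)
opt_head = optim.SGD(head.parameters(), lr=0.01, momentum=0.9)

# one step, given a (N, 1, 28, 28) batch `images` with targets `labels`
feats, _ = model(images)
loss = ce(head(feats), labels)
opt_nn.zero_grad()
opt_head.zero_grad()
loss.backward()
opt_nn.step()
opt_head.step()
```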
--------------------------------------------------------------------------------
/images/LGM_loss_epoch=100.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YirongMao/softmax_variants/47e5419965da93f1e43968de9eb8f30e40a8bf01/images/LGM_loss_epoch=100.jpg
--------------------------------------------------------------------------------
/images/LGM_loss_epoch=50.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YirongMao/softmax_variants/47e5419965da93f1e43968de9eb8f30e40a8bf01/images/LGM_loss_epoch=50.jpg
--------------------------------------------------------------------------------
/images/LGMu_loss_epoch=100.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YirongMao/softmax_variants/47e5419965da93f1e43968de9eb8f30e40a8bf01/images/LGMu_loss_epoch=100.jpg
--------------------------------------------------------------------------------
/images/LGMu_loss_epoch=50.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YirongMao/softmax_variants/47e5419965da93f1e43968de9eb8f30e40a8bf01/images/LGMu_loss_epoch=50.jpg
--------------------------------------------------------------------------------
/images/LMCL_loss_u_epoch=100.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YirongMao/softmax_variants/47e5419965da93f1e43968de9eb8f30e40a8bf01/images/LMCL_loss_u_epoch=100.jpg
--------------------------------------------------------------------------------
/images/LMCL_loss_u_epoch=50.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YirongMao/softmax_variants/47e5419965da93f1e43968de9eb8f30e40a8bf01/images/LMCL_loss_u_epoch=50.jpg
--------------------------------------------------------------------------------
/images/center_loss_epoch=100.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YirongMao/softmax_variants/47e5419965da93f1e43968de9eb8f30e40a8bf01/images/center_loss_epoch=100.jpg
--------------------------------------------------------------------------------
/images/center_loss_epoch=50.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YirongMao/softmax_variants/47e5419965da93f1e43968de9eb8f30e40a8bf01/images/center_loss_epoch=50.jpg
--------------------------------------------------------------------------------
/images/coco_loss_epoch=100.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YirongMao/softmax_variants/47e5419965da93f1e43968de9eb8f30e40a8bf01/images/coco_loss_epoch=100.jpg
--------------------------------------------------------------------------------
/images/coco_loss_epoch=50.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YirongMao/softmax_variants/47e5419965da93f1e43968de9eb8f30e40a8bf01/images/coco_loss_epoch=50.jpg
--------------------------------------------------------------------------------
/images/softmax_loss_epoch=100.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YirongMao/softmax_variants/47e5419965da93f1e43968de9eb8f30e40a8bf01/images/softmax_loss_epoch=100.jpg
--------------------------------------------------------------------------------
/images/softmax_loss_epoch=50.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/YirongMao/softmax_variants/47e5419965da93f1e43968de9eb8f30e40a8bf01/images/softmax_loss_epoch=50.jpg
--------------------------------------------------------------------------------
/model_utils.py:
--------------------------------------------------------------------------------
import torch
import torch.nn as nn
from torch.autograd.function import Function
import torch.nn.functional as F
from torch.autograd import Variable

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1_1 = nn.Conv2d(1, 32, kernel_size=5, padding=2)
        self.prelu1_1 = nn.PReLU()
        self.conv1_2 = nn.Conv2d(32, 32, kernel_size=5, padding=2)
        self.prelu1_2 = nn.PReLU()
        self.conv2_1 = nn.Conv2d(32, 64, kernel_size=5, padding=2)
        self.prelu2_1 = nn.PReLU()
        self.conv2_2 = nn.Conv2d(64, 64, kernel_size=5, padding=2)
        self.prelu2_2 = nn.PReLU()
        self.conv3_1 = nn.Conv2d(64, 128, kernel_size=5, padding=2)
        self.prelu3_1 = nn.PReLU()
        self.conv3_2 = nn.Conv2d(128, 128, kernel_size=5, padding=2)
        self.prelu3_2 = nn.PReLU()
        self.preluip1 = nn.PReLU()
        self.ip1 = nn.Linear(128 * 3 * 3, 2)
        self.ip2 = nn.Linear(2, 10)

    def forward(self, x):
        x = self.prelu1_1(self.conv1_1(x))
        x = self.prelu1_2(self.conv1_2(x))
        x = F.max_pool2d(x, 2)
        x = self.prelu2_1(self.conv2_1(x))
        x = self.prelu2_2(self.conv2_2(x))
        x = F.max_pool2d(x, 2)
        x = self.prelu3_1(self.conv3_1(x))
        x = self.prelu3_2(self.conv3_2(x))
        x = F.max_pool2d(x, 2)
        x = x.view(-1, 128 * 3 * 3)
        ip1 = self.preluip1(self.ip1(x))
        ip2 = self.ip2(ip1)
        return ip1, ip2


class RingLoss(nn.Module):
    """
    Refer to the paper:
    Ring loss: Convex Feature Normalization for Face Recognition
    """
    def __init__(self, type='L2', loss_weight=1.0):
        super(RingLoss, self).__init__()
        self.radius = nn.Parameter(torch.Tensor(1))
        self.radius.data.fill_(-1)
        self.loss_weight = loss_weight
        self.type = type

    def forward(self, x):
        x = x.pow(2).sum(dim=1).pow(0.5)
        if self.radius.data[0] < 0:  # initialize the radius with the mean feature norm of the first iteration
            self.radius.data.fill_(x.mean().data[0])
        if self.type == 'L1':  # smooth L1 loss
            loss1 = F.smooth_l1_loss(x, self.radius.expand_as(x)).mul_(self.loss_weight)
            loss2 = F.smooth_l1_loss(self.radius.expand_as(x), x).mul_(self.loss_weight)
            ringloss = loss1 + loss2
        elif self.type == 'auto':  # divide the L2 loss by the feature's own norm
            diff = x.sub(self.radius.expand_as(x)) / (x.mean().detach().clamp(min=0.5))
            diff_sq = torch.pow(torch.abs(diff), 2).mean()
            ringloss = diff_sq.mul_(self.loss_weight)
        else:  # L2 loss, if not specified
            diff = x.sub(self.radius.expand_as(x))
            diff_sq = torch.pow(torch.abs(diff), 2).mean()
            ringloss = diff_sq.mul_(self.loss_weight)
        return ringloss


class COCOLoss(nn.Module):
    """
    Refer to the paper:
    Yu Liu, Hongyang Li, Xiaogang Wang.
    Rethinking Feature Discrimination and Polymerization for Large-Scale Recognition. NIPS Workshop 2017
    Re-implemented by Yirong Mao, 2018/07/02
    """

    def __init__(self, num_classes, feat_dim, alpha=6.25):
        super(COCOLoss, self).__init__()
        self.feat_dim = feat_dim
        self.num_classes = num_classes
        self.alpha = alpha
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, feat):
        norms = torch.norm(feat, p=2, dim=-1, keepdim=True)
        nfeat = torch.div(feat, norms)
        snfeat = self.alpha * nfeat

        norms_c = torch.norm(self.centers, p=2, dim=-1, keepdim=True)
        ncenters = torch.div(self.centers, norms_c)

        logits = torch.matmul(snfeat, torch.transpose(ncenters, 0, 1))

        return logits
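
# Usage sketch (added commentary, not part of the original file): COCOLoss
# returns alpha-scaled cosine similarities between the L2-normalized features
# and the L2-normalized class centers, so its output feeds directly into
# nn.CrossEntropyLoss, e.g.
#
#   coco = COCOLoss(num_classes=10, feat_dim=2)
#   feat = torch.randn(4, 2)       # a batch of 4 two-dimensional embeddings
#   logits = coco(feat)            # shape (4, 10), entries alpha * cos(theta)
#   loss = nn.CrossEntropyLoss()(logits, labels)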

class LMCL_loss(nn.Module):
    """
    Refer to the paper:
    Hao Wang, Yitong Wang, Zheng Zhou, Xing Ji, Dihong Gong, Jingchao Zhou, Zhifeng Li, and Wei Liu.
    CosFace: Large Margin Cosine Loss for Deep Face Recognition. CVPR 2018
    Re-implemented by Yirong Mao, 2018/07/02
    """

    def __init__(self, num_classes, feat_dim, s=7.00, m=0.2):
        super(LMCL_loss, self).__init__()
        self.feat_dim = feat_dim
        self.num_classes = num_classes
        self.s = s
        self.m = m
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, feat, label):
        batch_size = feat.shape[0]
        norms = torch.norm(feat, p=2, dim=-1, keepdim=True)
        nfeat = torch.div(feat, norms)

        norms_c = torch.norm(self.centers, p=2, dim=-1, keepdim=True)
        ncenters = torch.div(self.centers, norms_c)
        logits = torch.matmul(nfeat, torch.transpose(ncenters, 0, 1))

        y_onehot = torch.FloatTensor(batch_size, self.num_classes)
        y_onehot.zero_()
        y_onehot = Variable(y_onehot)
        if feat.is_cuda:  # keep the one-hot buffer on the same device as the features
            y_onehot = y_onehot.cuda()
        y_onehot.scatter_(1, torch.unsqueeze(label, dim=-1), self.m)
        margin_logits = self.s * (logits - y_onehot)

        return logits, margin_logits


class LGMLoss(nn.Module):
    """
    Refer to the paper:
    Weitao Wan, Yuanyi Zhong, Tianpeng Li, Jiansheng Chen.
    Rethinking Feature Distribution for Loss Functions in Image Classification. CVPR 2018
    Re-implemented by Yirong Mao, 2018/07/02
    """
    def __init__(self, num_classes, feat_dim, alpha):
        super(LGMLoss, self).__init__()
        self.feat_dim = feat_dim
        self.num_classes = num_classes
        self.alpha = alpha

        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.log_covs = nn.Parameter(torch.zeros(num_classes, feat_dim))

    def forward(self, feat, label):
        batch_size = feat.shape[0]
        log_covs = torch.unsqueeze(self.log_covs, dim=0)

        covs = torch.exp(log_covs)  # 1 * c * d
        tcovs = covs.repeat(batch_size, 1, 1)  # n * c * d
        diff = torch.unsqueeze(feat, dim=1) - torch.unsqueeze(self.centers, dim=0)
        wdiff = torch.div(diff, tcovs)
        diff = torch.mul(diff, wdiff)
        dist = torch.sum(diff, dim=-1)  # eq. (18)

        y_onehot = torch.FloatTensor(batch_size, self.num_classes)
        y_onehot.zero_()
        y_onehot = Variable(y_onehot)
        if feat.is_cuda:  # keep the one-hot buffer on the same device as the features
            y_onehot = y_onehot.cuda()
        y_onehot.scatter_(1, torch.unsqueeze(label, dim=-1), self.alpha)
        y_onehot = y_onehot + 1.0
        margin_dist = torch.mul(dist, y_onehot)

        slog_covs = torch.sum(log_covs, dim=-1)  # 1 * c
        tslog_covs = slog_covs.repeat(batch_size, 1)
        margin_logits = -0.5 * (tslog_covs + margin_dist)  # eq. (17)
        logits = -0.5 * (tslog_covs + dist)

        cdiff = feat - torch.index_select(self.centers, dim=0, index=label.long())
        cdist = cdiff.pow(2).sum(1).sum(0) / 2.0

        slog_covs = torch.squeeze(slog_covs)
        reg = 0.5 * torch.sum(torch.index_select(slog_covs, dim=0, index=label.long()))
        likelihood = (1.0 / batch_size) * (cdist + reg)

        return logits, margin_logits, likelihood
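
# Note on the margin above (added commentary): scatter_ writes alpha into the
# true-class column of y_onehot and the "+ 1.0" then makes it (1 + alpha)
# there and 1 everywhere else, so margin_dist enlarges only the true-class
# squared Mahalanobis distance to (1 + alpha) * d_k; pushing that enlarged
# distance through the softmax is what creates the classification margin.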

class LGMLoss_v0(nn.Module):
    """
    LGMLoss whose covariance is fixed as the identity matrix.
    """
    def __init__(self, num_classes, feat_dim, alpha):
        super(LGMLoss_v0, self).__init__()
        self.feat_dim = feat_dim
        self.num_classes = num_classes
        self.alpha = alpha

        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))

    def forward(self, feat, label):
        batch_size = feat.shape[0]

        diff = torch.unsqueeze(feat, dim=1) - torch.unsqueeze(self.centers, dim=0)
        diff = torch.mul(diff, diff)
        dist = torch.sum(diff, dim=-1)

        y_onehot = torch.FloatTensor(batch_size, self.num_classes)
        y_onehot.zero_()
        y_onehot = Variable(y_onehot)
        if feat.is_cuda:  # keep the one-hot buffer on the same device as the features
            y_onehot = y_onehot.cuda()
        y_onehot.scatter_(1, torch.unsqueeze(label, dim=-1), self.alpha)
        y_onehot = y_onehot + 1.0
        margin_dist = torch.mul(dist, y_onehot)
        margin_logits = -0.5 * margin_dist
        logits = -0.5 * dist

        cdiff = feat - torch.index_select(self.centers, dim=0, index=label.long())
        likelihood = (1.0 / batch_size) * cdiff.pow(2).sum(1).sum(0) / 2.0
        return logits, margin_logits, likelihood


class CenterLoss(nn.Module):
    def __init__(self, num_classes, feat_dim):
        super(CenterLoss, self).__init__()
        self.num_classes = num_classes
        self.feat_dim = feat_dim
        self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
        self.centerlossfunction = CenterlossFunction.apply

    def forward(self, y, feat):
        # To squeeze the Tensor
        batch_size = feat.size(0)
        feat = feat.view(batch_size, 1, 1, -1).squeeze()
        # To check the dim of centers and features
        if feat.size(1) != self.feat_dim:
            raise ValueError("Center's dim: {0} should be equal to input feature's dim: {1}".format(self.feat_dim, feat.size(1)))
        return self.centerlossfunction(feat, y, self.centers)


class CenterlossFunction(Function):

    @staticmethod
    def forward(ctx, feature, label, centers):
        ctx.save_for_backward(feature, label, centers)
        centers_pred = centers.index_select(0, label.long())
        return (feature - centers_pred).pow(2).sum(1).sum(0) / 2.0

    @staticmethod
    def backward(ctx, grad_output):
        feature, label, centers = ctx.saved_variables
        grad_feature = feature - centers.index_select(0, label.long())  # Eq. 3

        # init every iteration
        counts = torch.ones(centers.size(0))
        grad_centers = torch.zeros(centers.size())
        if feature.is_cuda:
            counts = counts.cuda()
            grad_centers = grad_centers.cuda()

        # Eq. 4 || need optimization !! To be vectorized, but how?
        for i in range(feature.size(0)):
            j = int(label[i].data[0])
            counts[j] += 1
            grad_centers[j] += (centers.data[j] - feature.data[i])
        grad_centers = Variable(grad_centers / counts.view(-1, 1))

        return grad_feature * grad_output, None, grad_centers
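
# A vectorized alternative to the Eq. 4 loop above (added, untested sketch;
# index_add_ accumulates repeated labels just like the in-place += updates):
#
#   label_idx = label.data.long()
#   counts = torch.ones(centers.size(0))
#   counts.index_add_(0, label_idx, torch.ones(label_idx.size(0)))
#   grad_centers = torch.zeros(centers.size())
#   grad_centers.index_add_(0, label_idx, centers.data[label_idx] - feature.data)
#   grad_centers = Variable(grad_centers / counts.view(-1, 1))
#
# (move counts and grad_centers to .cuda() first when feature.is_cuda)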

--------------------------------------------------------------------------------
/train_mnist_COCO_loss.py:
--------------------------------------------------------------------------------
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.autograd import Variable
from torch.utils.data import DataLoader
import torch.optim.lr_scheduler as lr_scheduler
import model_utils
from model_utils import Net
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

batch_size = 100


def visualize(feat, labels, epoch):
    plt.ion()
    c = ['#ff0000', '#ffff00', '#00ff00', '#00ffff', '#0000ff',
         '#ff00ff', '#990000', '#999900', '#009900', '#009999']
    plt.clf()
    for i in range(10):
        plt.plot(feat[labels == i, 0], feat[labels == i, 1], '.', c=c[i])
    plt.legend(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], loc='upper right')
    # plt.xlim(xmin=-5, xmax=5)
    # plt.ylim(ymin=-5, ymax=5)
    plt.text(-4.8, 4.6, "epoch=%d" % epoch)
    plt.savefig('./images/coco_loss_epoch=%d.jpg' % epoch)
    # plt.draw()
    # plt.pause(0.001)
    plt.close()


def test(test_loader, criterion, model, use_cuda):
    correct = 0
    total = 0
    for i, (data, target) in enumerate(test_loader):
        if use_cuda:
            data = data.cuda()
            target = target.cuda()
        data, target = Variable(data), Variable(target)

        feats, _ = model(data)
        logits = criterion[1](feats)

        _, predicted = torch.max(logits.data, 1)
        total += target.size(0)
        correct += (predicted == target.data).sum()

    print('Test Accuracy of the model on the 10000 test images: %f %%' % (100 * correct / total))


def train(train_loader, model, criterion, optimizer, epoch, use_cuda):
    ip1_loader = []
    idx_loader = []
    for i, (data, target) in enumerate(train_loader):
        if use_cuda:
            data = data.cuda()
            target = target.cuda()
        data, target = Variable(data), Variable(target)

        feats, _ = model(data)
        logits = criterion[1](feats)
        loss = criterion[0](logits, target)

        _, predicted = torch.max(logits.data, 1)
        accuracy = (target.data == predicted).float().mean()

        optimizer[0].zero_grad()
        optimizer[1].zero_grad()

        loss.backward()

        optimizer[0].step()
        optimizer[1].step()

        ip1_loader.append(feats)
        idx_loader.append(target)

        if (i + 1) % 50 == 0:
            print('Epoch [%d], Iter [%d/%d] Loss: %.4f Acc %.4f'
                  % (epoch, i + 1, len(train_loader), loss.data[0], accuracy))

    feat = torch.cat(ip1_loader, 0)
    labels = torch.cat(idx_loader, 0)
    visualize(feat.data.cpu().numpy(), labels.data.cpu().numpy(), epoch)


def main():
    use_cuda = torch.cuda.is_available()
    # Dataset
    trainset = datasets.MNIST('./data/', download=True, train=True, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))]))
    train_loader = DataLoader(trainset, batch_size=100, shuffle=True, num_workers=4)

    testset = datasets.MNIST('./data/', download=True, train=False, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))]))
    test_loader = DataLoader(testset, batch_size=100, shuffle=True, num_workers=4)

    # Model
    model = Net()

    # Cross-entropy loss
    nllloss = nn.CrossEntropyLoss()
    coco_loss = model_utils.COCOLoss(10, 2)
    if use_cuda:
        nllloss = nllloss.cuda()
        coco_loss = coco_loss.cuda()
        model = model.cuda()
    criterion = [nllloss, coco_loss]
    # optimizer for the network
    optimizer4nn = optim.SGD(model.parameters(), lr=0.001, momentum=0.9, weight_decay=0.0005)
    scheduler_4nn = lr_scheduler.StepLR(optimizer4nn, 10, gamma=0.5)

    # optimizer for the loss centers
    optimizer4center = optim.SGD(coco_loss.parameters(), lr=0.01, momentum=0.9)
    # scheduler_4center = lr_scheduler.StepLR(optimizer4center, 10, gamma=0.5)
    for epoch in range(100):
        scheduler_4nn.step()
        # scheduler_4center.step()
        # print(optimizer4nn.param_groups[0]['lr'])
        train(train_loader, model, criterion, [optimizer4nn, optimizer4center], epoch + 1, use_cuda)
        torch.save(model, './model/coco_loss_net_' + str(epoch + 1) + '.model')
        test(test_loader, criterion, model, use_cuda)


if __name__ == '__main__':
    main()

--------------------------------------------------------------------------------
/train_mnist_LGM.py:
--------------------------------------------------------------------------------
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.autograd import Variable
from torch.utils.data import DataLoader
import torch.optim.lr_scheduler as lr_scheduler
import model_utils
from model_utils import Net
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

batch_size = 100


def visualize(feat, labels, epoch):
    plt.ion()
    c = ['#ff0000', '#ffff00', '#00ff00', '#00ffff', '#0000ff',
         '#ff00ff', '#990000', '#999900', '#009900', '#009999']
    plt.clf()
    for i in range(10):
        plt.plot(feat[labels == i, 0], feat[labels == i, 1], '.', c=c[i])
    plt.legend(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], loc='upper right')
    # plt.xlim(xmin=-5, xmax=5)
    # plt.ylim(ymin=-5, ymax=5)
    plt.text(-4.8, 4.6, "epoch=%d" % epoch)
    plt.savefig('./images/LGM_loss_epoch=%d.jpg' % epoch)
    # plt.draw()
    # plt.pause(0.001)
    plt.close()


def test(test_loader, criterion, model, use_cuda):
    correct = 0
    total = 0
    for i, (data, target) in enumerate(test_loader):
        if use_cuda:
            data = data.cuda()
            target = target.cuda()
        data, target = Variable(data), Variable(target)

        feats, _ = model(data)
        logits, mlogits, likelihood = criterion[1](feats, target)
        _, predicted = torch.max(logits.data, 1)
        total += target.size(0)
        correct += (predicted == target.data).sum()

    print('Test Accuracy of the model on the 10000 test images: %f %%' % (100 * correct / total))


def train(train_loader, model, criterion, optimizer, epoch, loss_weight, use_cuda):
    ip1_loader = []
    idx_loader = []
    for i, (data, target) in enumerate(train_loader):
        if use_cuda:
            data = data.cuda()
            target = target.cuda()
        data, target = Variable(data), Variable(target)

        feats, _ = model(data)
        logits, mlogits, likelihood = criterion[1](feats, target)
        # cross_entropy = criterion[0](logits, target)
        loss = criterion[0](mlogits, target) + loss_weight * likelihood

        _, predicted = torch.max(logits.data, 1)
        accuracy = (target.data == predicted).float().mean()

        optimizer[0].zero_grad()
        optimizer[1].zero_grad()

        loss.backward()

        optimizer[0].step()
        optimizer[1].step()

        ip1_loader.append(feats)
        idx_loader.append(target)
        if (i + 1) % 50 == 0:
            print('Epoch [%d], Iter [%d/%d] Loss: %.4f Acc %.4f'
                  % (epoch, i + 1, len(train_loader), loss.data[0], accuracy))

    feat = torch.cat(ip1_loader, 0)
    labels = torch.cat(idx_loader, 0)
    visualize(feat.data.cpu().numpy(), labels.data.cpu().numpy(), epoch)


def main():
    use_cuda = torch.cuda.is_available()
    # Dataset
    trainset = datasets.MNIST('./data/', download=True, train=True, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))]))
    train_loader = DataLoader(trainset, batch_size=100, shuffle=True, num_workers=4)

    testset = datasets.MNIST('./data/', download=True, train=False, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))]))
    test_loader = DataLoader(testset, batch_size=100, shuffle=True, num_workers=4)

    # Model
    model = Net()

    # Cross-entropy loss
    nllloss = nn.CrossEntropyLoss()
    # LGM loss with identity covariance
    loss_weight = 0.1
    lgm_loss = model_utils.LGMLoss_v0(10, 2, 1.0)
    if use_cuda:
        nllloss = nllloss.cuda()
        lgm_loss = lgm_loss.cuda()
        model = model.cuda()
    criterion = [nllloss, lgm_loss]
    # optimizer for the network
    optimizer4nn = optim.SGD(model.parameters(), lr=0.001, momentum=0.9, weight_decay=0.0005)
    scheduler = lr_scheduler.StepLR(optimizer4nn, 20, gamma=0.8)

    # optimizer for the loss centers
    optimizer4center = optim.SGD(lgm_loss.parameters(), lr=0.1)

    for epoch in range(100):
        scheduler.step()
        # print(optimizer4nn.param_groups[0]['lr'])
        train(train_loader, model, criterion, [optimizer4nn, optimizer4center], epoch + 1, loss_weight, use_cuda)
        test(test_loader, criterion, model, use_cuda)


if __name__ == '__main__':
    main()
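
# Note (added): this script trains LGMLoss_v0, the variant with the covariance
# fixed to the identity, so the likelihood term reduces to a center-loss style
# squared distance; train_mnist_LGM_u.py below trains the full LGMLoss with
# learnable diagonal (log-)covariances.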

--------------------------------------------------------------------------------
/train_mnist_LGM_u.py:
--------------------------------------------------------------------------------
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.autograd import Variable
from torch.utils.data import DataLoader
import torch.optim.lr_scheduler as lr_scheduler
import model_utils
from model_utils import Net
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

batch_size = 100


def visualize(feat, labels, epoch):
    plt.ion()
    c = ['#ff0000', '#ffff00', '#00ff00', '#00ffff', '#0000ff',
         '#ff00ff', '#990000', '#999900', '#009900', '#009999']
    plt.clf()
    for i in range(10):
        plt.plot(feat[labels == i, 0], feat[labels == i, 1], '.', c=c[i])
    plt.legend(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], loc='upper right')
    # plt.xlim(xmin=-5, xmax=5)
    # plt.ylim(ymin=-5, ymax=5)
    plt.text(-4.8, 4.6, "epoch=%d" % epoch)
    plt.savefig('./images/LGMu_loss_epoch=%d.jpg' % epoch)
    # plt.draw()
    # plt.pause(0.001)
    plt.close()


def test(test_loader, criterion, model, use_cuda):
    correct = 0
    total = 0
    for i, (data, target) in enumerate(test_loader):
        if use_cuda:
            data = data.cuda()
            target = target.cuda()
        data, target = Variable(data), Variable(target)

        feats, _ = model(data)
        logits, mlogits, likelihood = criterion[1](feats, target)
        _, predicted = torch.max(logits.data, 1)
        total += target.size(0)
        correct += (predicted == target.data).sum()

    print('Test Accuracy of the model on the 10000 test images: %f %%' % (100 * correct / total))


def train(train_loader, model, criterion, optimizer, epoch, loss_weight, use_cuda):
    ip1_loader = []
    idx_loader = []
    for i, (data, target) in enumerate(train_loader):
        if use_cuda:
            data = data.cuda()
            target = target.cuda()
        data, target = Variable(data), Variable(target)

        feats, _ = model(data)
        logits, mlogits, likelihood = criterion[1](feats, target)
        # cross_entropy = criterion[0](logits, target)
        loss = criterion[0](mlogits, target) + loss_weight * likelihood

        _, predicted = torch.max(logits.data, 1)
        accuracy = (target.data == predicted).float().mean()

        optimizer[0].zero_grad()
        optimizer[1].zero_grad()

        loss.backward()

        optimizer[0].step()
        optimizer[1].step()

        ip1_loader.append(feats)
        idx_loader.append(target)

        if (i + 1) % 50 == 0:
            tmp = criterion[1].log_covs
            norm = torch.sum(torch.mul(tmp, tmp))
            print('Epoch [%d], Iter [%d/%d] Cov_norm %.4f Loss: %.4f Acc %.4f'
                  % (epoch, i + 1, len(train_loader), norm.data[0], loss.data[0], accuracy))

    feat = torch.cat(ip1_loader, 0)
    labels = torch.cat(idx_loader, 0)
    visualize(feat.data.cpu().numpy(), labels.data.cpu().numpy(), epoch)


def main():
    use_cuda = torch.cuda.is_available()
    # Dataset
    trainset = datasets.MNIST('./data/', download=True, train=True, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))]))
    train_loader = DataLoader(trainset, batch_size=100, shuffle=True, num_workers=4)

    testset = datasets.MNIST('./data/', download=True, train=False, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))]))
    test_loader = DataLoader(testset, batch_size=100, shuffle=True, num_workers=4)

    # Model
    model = Net()

    # Cross-entropy loss
    nllloss = nn.CrossEntropyLoss()
    # LGM loss with learnable covariances
    loss_weight = 0.1
    lgm_loss = model_utils.LGMLoss(10, 2, 0.00)
    if use_cuda:
        nllloss = nllloss.cuda()
        lgm_loss = lgm_loss.cuda()
        model = model.cuda()
    criterion = [nllloss, lgm_loss]
    # optimizer for the network
    optimizer4nn = optim.SGD(model.parameters(), lr=0.001, momentum=0.9, weight_decay=0.0005)
    scheduler_4nn = lr_scheduler.StepLR(optimizer4nn, 10, gamma=0.5)

    # optimizer for the loss centers and covariances
    optimizer4center = optim.SGD(lgm_loss.parameters(), lr=0.01, momentum=0.9)
    scheduler_4center = lr_scheduler.StepLR(optimizer4center, 10, gamma=0.5)
    for epoch in range(100):
        scheduler_4nn.step()
        scheduler_4center.step()
        # print(optimizer4nn.param_groups[0]['lr'])
        train(train_loader, model, criterion, [optimizer4nn, optimizer4center], epoch + 1, loss_weight, use_cuda)
        torch.save(model, './model/LGM_u_net_' + str(epoch + 1) + '.model')
        test(test_loader, criterion, model, use_cuda)


if __name__ == '__main__':
    main()
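
# Note (added): here LGMLoss is constructed with alpha=0.00, so y_onehot is
# all ones in the forward pass and margin_logits equals logits; this run
# therefore isolates the learnable log_covs, and the Cov_norm printed during
# training tracks how far they drift from the zero (identity-covariance)
# initialization.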

--------------------------------------------------------------------------------
/train_mnist_LMCL.py:
--------------------------------------------------------------------------------
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.autograd import Variable
from torch.utils.data import DataLoader
import torch.optim.lr_scheduler as lr_scheduler
import model_utils
from model_utils import Net
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

batch_size = 100


def visualize(feat, labels, epoch):
    plt.ion()
    c = ['#ff0000', '#ffff00', '#00ff00', '#00ffff', '#0000ff',
         '#ff00ff', '#990000', '#999900', '#009900', '#009999']
    plt.clf()
    for i in range(10):
        plt.plot(feat[labels == i, 0], feat[labels == i, 1], '.', c=c[i])
    plt.legend(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], loc='upper right')
    # plt.xlim(xmin=-5, xmax=5)
    # plt.ylim(ymin=-5, ymax=5)
    plt.text(-4.8, 4.6, "epoch=%d" % epoch)
    plt.savefig('./images/LMCL_loss_u_epoch=%d.jpg' % epoch)
    # plt.draw()
    # plt.pause(0.001)
    plt.close()


def test(test_loader, criterion, model, use_cuda):
    correct = 0
    total = 0
    for i, (data, target) in enumerate(test_loader):
        if use_cuda:
            data = data.cuda()
            target = target.cuda()
        data, target = Variable(data), Variable(target)

        feats, _ = model(data)
        logits, mlogits = criterion[1](feats, target)
        _, predicted = torch.max(logits.data, 1)
        total += target.size(0)
        correct += (predicted == target.data).sum()

    print('Test Accuracy of the model on the 10000 test images: %f %%' % (100 * correct / total))


def train(train_loader, model, criterion, optimizer, epoch, loss_weight, use_cuda):
    ip1_loader = []
    idx_loader = []
    for i, (data, target) in enumerate(train_loader):
        if use_cuda:
            data = data.cuda()
            target = target.cuda()
        data, target = Variable(data), Variable(target)

        feats, _ = model(data)
        logits, mlogits = criterion[1](feats, target)
        # cross_entropy = criterion[0](logits, target)
        loss = criterion[0](mlogits, target)

        _, predicted = torch.max(logits.data, 1)
        accuracy = (target.data == predicted).float().mean()

        optimizer[0].zero_grad()
        optimizer[1].zero_grad()

        loss.backward()

        optimizer[0].step()
        optimizer[1].step()

        ip1_loader.append(feats)
        idx_loader.append(target)
        if (i + 1) % 50 == 0:
            print('Epoch [%d], Iter [%d/%d] Loss: %.4f Acc %.4f'
                  % (epoch, i + 1, len(train_loader), loss.data[0], accuracy))

    feat = torch.cat(ip1_loader, 0)
    labels = torch.cat(idx_loader, 0)
    visualize(feat.data.cpu().numpy(), labels.data.cpu().numpy(), epoch)


def main():
    use_cuda = torch.cuda.is_available()
    # Dataset
    trainset = datasets.MNIST('./data/', download=True, train=True, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))]))
    train_loader = DataLoader(trainset, batch_size=100, shuffle=True, num_workers=4)

    testset = datasets.MNIST('./data/', download=True, train=False, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))]))
    test_loader = DataLoader(testset, batch_size=100, shuffle=True, num_workers=4)

    # Model
    model = Net()

    # Cross-entropy loss
    nllloss = nn.CrossEntropyLoss()
    # LMCL loss
    loss_weight = 0.1
    lmcl_loss = model_utils.LMCL_loss(num_classes=10, feat_dim=2)
    if use_cuda:
        nllloss = nllloss.cuda()
        lmcl_loss = lmcl_loss.cuda()
        model = model.cuda()
    criterion = [nllloss, lmcl_loss]
    # optimizer for the network
    optimizer4nn = optim.SGD(model.parameters(), lr=0.001, momentum=0.9, weight_decay=0.0005)
    scheduler_4nn = lr_scheduler.StepLR(optimizer4nn, 20, gamma=0.5)

    # optimizer for the loss centers
    optimizer4center = optim.SGD(lmcl_loss.parameters(), lr=0.01)
    scheduler_4center = lr_scheduler.StepLR(optimizer4center, 20, gamma=0.5)
    for epoch in range(100):
        scheduler_4nn.step()
        scheduler_4center.step()
        # print(optimizer4nn.param_groups[0]['lr'])
        train(train_loader, model, criterion, [optimizer4nn, optimizer4center], epoch + 1, loss_weight, use_cuda)
        test(test_loader, criterion, model, use_cuda)


if __name__ == '__main__':
    main()
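
# Worked margin example (added commentary): with s=7.00 and m=0.2, a
# true-class cosine of 0.80 enters margin_logits as 7 * (0.80 - 0.20) = 4.2,
# while a competing class at cosine 0.70 keeps 7 * 0.70 = 4.9; the sample is
# still penalized until its true-class cosine exceeds every other class's
# cosine by at least the margin m.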

--------------------------------------------------------------------------------
/train_mnist_center_loss.py:
--------------------------------------------------------------------------------
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.autograd import Variable
from torch.utils.data import DataLoader
import torch.optim.lr_scheduler as lr_scheduler
from model_utils import CenterLoss
from model_utils import Net
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

batch_size = 100


def visualize(feat, labels, epoch):
    plt.ion()
    c = ['#ff0000', '#ffff00', '#00ff00', '#00ffff', '#0000ff',
         '#ff00ff', '#990000', '#999900', '#009900', '#009999']
    plt.clf()
    for i in range(10):
        plt.plot(feat[labels == i, 0], feat[labels == i, 1], '.', c=c[i])
    plt.legend(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], loc='upper right')
    # plt.xlim(xmin=-5, xmax=5)
    # plt.ylim(ymin=-5, ymax=5)
    plt.text(-4.8, 4.6, "epoch=%d" % epoch)
    plt.savefig('./images/center_loss_epoch=%d.jpg' % epoch)
    # plt.draw()
    # plt.pause(0.001)
    plt.close()


def test(test_loader, model, use_cuda):
    correct = 0
    total = 0
    for i, (data, target) in enumerate(test_loader):
        if use_cuda:
            data = data.cuda()
            target = target.cuda()
        data, target = Variable(data), Variable(target)

        ip1, logits = model(data)
        _, predicted = torch.max(logits.data, 1)
        total += target.size(0)
        correct += (predicted == target.data).sum()

    print('Test Accuracy of the model on the 10000 test images: %f %%' % (100 * correct / total))


def train(train_loader, model, criterion, optimizer, epoch, loss_weight, use_cuda):
    ip1_loader = []
    idx_loader = []
    for i, (data, target) in enumerate(train_loader):
        if use_cuda:
            data = data.cuda()
            target = target.cuda()
        data, target = Variable(data), Variable(target)

        feats, logits = model(data)
        loss = criterion[0](logits, target) + loss_weight * criterion[1](target, feats)

        _, predicted = torch.max(logits.data, 1)
        accuracy = (target.data == predicted).float().mean()

        optimizer[0].zero_grad()
        optimizer[1].zero_grad()

        loss.backward()

        optimizer[0].step()
        optimizer[1].step()

        ip1_loader.append(feats)
        idx_loader.append(target)
        if (i + 1) % 50 == 0:
            print('Epoch [%d], Iter [%d/%d] Loss: %.4f Acc %.4f'
                  % (epoch, i + 1, len(train_loader), loss.data[0], accuracy))

    feat = torch.cat(ip1_loader, 0)
    labels = torch.cat(idx_loader, 0)
    visualize(feat.data.cpu().numpy(), labels.data.cpu().numpy(), epoch)


def main():
    use_cuda = torch.cuda.is_available()
    # Dataset
    trainset = datasets.MNIST('./data/', download=True, train=True, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))]))
    train_loader = DataLoader(trainset, batch_size=100, shuffle=True, num_workers=4)

    testset = datasets.MNIST('./data/', download=True, train=False, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))]))
    test_loader = DataLoader(testset, batch_size=100, shuffle=True, num_workers=4)

    # Model
    model = Net()

    # Cross-entropy loss
    nllloss = nn.CrossEntropyLoss()
    # CenterLoss
    loss_weight = 0.001
    centerloss = CenterLoss(10, 2)
    if use_cuda:
        nllloss = nllloss.cuda()
        centerloss = centerloss.cuda()
        model = model.cuda()
    criterion = [nllloss, centerloss]

    # optimizer for the network
    optimizer4nn = optim.SGD(model.parameters(), lr=0.001, momentum=0.9, weight_decay=0.0005)
    scheduler = lr_scheduler.StepLR(optimizer4nn, 20, gamma=0.8)

    # optimizer for the loss centers
    optimizer4center = optim.SGD(centerloss.parameters(), lr=0.5)

    for epoch in range(100):
        scheduler.step()
        # print(optimizer4nn.param_groups[0]['lr'])
        train(train_loader, model, criterion, [optimizer4nn, optimizer4center], epoch + 1, loss_weight, use_cuda)
        test(test_loader, model, use_cuda)


if __name__ == '__main__':
    main()
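
# Note (added): the class centers live inside CenterLoss as an nn.Parameter,
# so they receive the grad_centers computed in CenterlossFunction.backward
# and are updated by their own SGD optimizer (lr=0.5), while the network
# weights follow optimizer4nn; both optimizers are stepped every iteration.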

--------------------------------------------------------------------------------
/train_mnist_softmax.py:
--------------------------------------------------------------------------------
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.autograd import Variable
from torch.utils.data import DataLoader
import torch.optim.lr_scheduler as lr_scheduler
from model_utils import Net
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt

batch_size = 100
num_epoch = 100


def visualize(feat, labels, epoch):
    plt.ion()
    c = ['#ff0000', '#ffff00', '#00ff00', '#00ffff', '#0000ff',
         '#ff00ff', '#990000', '#999900', '#009900', '#009999']
    plt.clf()
    for i in range(10):
        plt.plot(feat[labels == i, 0], feat[labels == i, 1], '.', c=c[i])
    plt.legend(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], loc='upper right')
    # plt.xlim(xmin=-5, xmax=5)
    # plt.ylim(ymin=-5, ymax=5)
    plt.text(-4.8, 4.6, "epoch=%d" % epoch)
    plt.savefig('./images/softmax_loss_epoch=%d.jpg' % epoch)
    plt.close()


def test(test_loader, model, use_cuda):
    correct = 0
    total = 0
    for i, (data, target) in enumerate(test_loader):
        if use_cuda:
            data = data.cuda()
            target = target.cuda()
        data, target = Variable(data), Variable(target)

        ip1, logits = model(data)
        _, predicted = torch.max(logits.data, 1)
        total += target.size(0)
        correct += (predicted == target.data).sum()

    print('Test Accuracy of the model on the 10000 test images: %f %%' % (100 * correct / total))


def train(train_loader, model, criterion, optimizer, epoch, use_cuda):
    ip1_loader = []
    idx_loader = []
    for i, (data, target) in enumerate(train_loader):
        if use_cuda:
            data = data.cuda()
            target = target.cuda()
        data, target = Variable(data), Variable(target)

        feats, logits = model(data)
        loss = criterion[0](logits, target)

        _, predicted = torch.max(logits.data, 1)
        accuracy = (target.data == predicted).float().mean()

        optimizer[0].zero_grad()
        loss.backward()
        optimizer[0].step()

        ip1_loader.append(feats)
        idx_loader.append(target)
        if (i + 1) % 50 == 0:
            print('Epoch [%d], Iter [%d/%d] Loss: %.4f Acc %.4f'
                  % (epoch, i + 1, len(train_loader), loss.data[0], accuracy))

    feat = torch.cat(ip1_loader, 0)
    labels = torch.cat(idx_loader, 0)
    visualize(feat.data.cpu().numpy(), labels.data.cpu().numpy(), epoch)


def main():
    use_cuda = torch.cuda.is_available()
    # Dataset
    trainset = datasets.MNIST('./data/', download=True, train=True, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))]))
    train_loader = DataLoader(trainset, batch_size=100, shuffle=True, num_workers=4)

    testset = datasets.MNIST('./data/', download=True, train=False, transform=transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))]))
    test_loader = DataLoader(testset, batch_size=100, shuffle=True, num_workers=4)

    # Model
    model = Net()

    # Cross-entropy loss
    nllloss = nn.CrossEntropyLoss()
    if use_cuda:
        nllloss = nllloss.cuda()
        model = model.cuda()
    criterion = [nllloss]

    # optimizer for the network
    optimizer4nn = optim.SGD(model.parameters(), lr=0.001, momentum=0.9, weight_decay=0.0005)
    scheduler = lr_scheduler.StepLR(optimizer4nn, 20, gamma=0.8)

    for epoch in range(num_epoch):
        scheduler.step()
        # print(optimizer4nn.param_groups[0]['lr'])
        train(train_loader, model, criterion, [optimizer4nn], epoch + 1, use_cuda)
        torch.save(model, './model/softmax_net_' + str(epoch + 1) + '.model')
        test(test_loader, model, use_cuda)


if __name__ == '__main__':
    main()
--------------------------------------------------------------------------------