├── .idea
│   ├── .gitignore
│   ├── deployment.xml
│   ├── inspectionProfiles
│   │   └── profiles_settings.xml
│   ├── misc.xml
│   ├── modules.xml
│   ├── multilabel-energy.iml
│   └── vcs.xml
├── README.md
├── datasets
│   ├── nus-wide
│   │   ├── nus_wide_test_imglist.txt
│   │   ├── nus_wide_test_label.txt
│   │   ├── nus_wide_train_imglist.txt
│   │   └── nus_wide_train_label.txt
│   └── pascal
│       ├── voc12-test.mat
│       ├── voc12-train.mat
│       └── voc12-val.mat
├── demo_figs
│   ├── result_screenshot.png
│   └── teaser.png
├── eval.py
├── fine_tune.py
├── lib.py
├── model
│   ├── __init__.py
│   └── classifiersimple.py
├── train.py
├── utils
│   ├── __init__.py
│   ├── anom_utils.py
│   ├── coco-preprocessing.py
│   └── dataloader
│       ├── __init__.py
│       ├── coco_loader.py
│       ├── nus_wide_loader.py
│       └── pascal_voc_loader.py
└── validate.py

/.idea/.gitignore:
--------------------------------------------------------------------------------
# Default ignored files
/shelf/
/workspace.xml
# Datasource local storage ignored files
/dataSources/
/dataSources.local.xml
# Editor-based HTTP Client requests
/httpRequests/
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Can multi-label classification networks know what they don’t know?

This is a [PyTorch](http://pytorch.org) implementation of [Can multi-label classification networks know what they don't know?](......) by Haoran Wang, Weitang Liu, Alex Bocchieri, and Sharon Li.
Code is modified from
[multilabel-ood](https://github.com/xksteven/multilabel-ood),
[ODIN](https://github.com/facebookresearch/odin),
[Outlier Exposure](https://github.com/hendrycks/outlier-exposure), and
[deep Mahalanobis detector](https://github.com/pokaxpoka/deep_Mahalanobis_detector).

![teaser](demo_figs/teaser.png)

## Datasets

### In-distribution dataset

PASCAL-VOC: please download the dataset from [this mirror](https://pjreddie.com/projects/pascal-voc-dataset-mirror/). Parsed labels for PASCAL-VOC are under the ./datasets/pascal folder. Create a symlink to the location of the PASCAL dataset:
18 | 19 | ``` 20 | ln -s path/to/PASCALdataset Pascal 21 | ``` 22 | 23 | COCO: please download the MS-COCO 2014 dataset from [here](http://cocodataset.org/#download). Install the pycocotools to preprocess the dataset 24 | 25 | ``` 26 | pip3 install git+https://github.com/waleedka/coco.git#egg=pycocotools&subdirectory=PythonAPI 27 | ``` 28 | 29 | Preprocess the COCO dataset. 30 | 31 | ``` 32 | python3 utils/coco-preprocessing.py path/to/coco-dataset 33 | ``` 34 | 35 | NUS-WIDE: please download the dataset from [here](https://lms.comp.nus.edu.sg/wp-content/uploads/2019/research/nuswide/NUS-WIDE.html). Parsed labels for NUS-WIDE are under ./dataset/nus-wide folder. 36 | 37 | ### Out-of-distribution dataset 38 | 39 | OOD dataset can be downloaded 40 | [here](https://drive.google.com/drive/folders/1BGMRQz3eB_npaGD46HC6K_uzt105HPRy?usp=sharing) 41 | 42 | ## Pre-trained models 43 | Pre-trained models can be downloaded from 44 | [here](https://drive.google.com/drive/folders/1ZfWB6vSYTK004j0bmfj6W0Xs6kwDTFX0?usp=sharing). 45 | 46 | ## Training the models 47 | 48 | ### Below are the examples on COCO dataset. 49 | 50 | Train the densenet model for COCO dataset 51 | ``` 52 | python3 train.py --arch densenet --dataset coco --save_dir ./saved_models/ 53 | ``` 54 | 55 | Evaluate the trained model 56 | ``` 57 | python3 validate.py --arch densenet --dataset coco --load_path ./saved_models/ 58 | ``` 59 | 60 | ## OOD dection 61 | 62 | To reproduce the JointEnergy score for COCO dataset, please run: 63 | 64 | ``` 65 | python3 eval.py --arch densenet --dataset coco --ood_data imagenet --ood energy 66 | --method sum 67 | ``` 68 | 69 | To reproduce the scores for logit/msp/prob/lof/isol, please run: 70 | 71 | ``` 72 | python3 eval.py --arch densenet --dataset coco --ood_data imagenet --ood 73 | logit/msp/prob/lof/isol/ --method max 74 | ``` 75 | 76 | To finetune the parameters for Odin and Mahalanobis, please run: 77 | ``` 78 | python3 fine_tune.py --arch densenet --dataset coco --ood odin/M --method max 79 | ``` 80 | 81 | After getting the best_T and best_noise, please run the evaluation: 82 | ``` 83 | python3 eval.py --arch densenet --dataset coco --ood_data imagenet --ood odin/M 84 | --method max --T best_T --noise --best_noise 85 | ``` 86 | 87 | ## OOD Detection Result 88 | OOD detection performance comparison using JointEnergy vs. competitive 89 | baselines. 
## OOD Detection Results

OOD detection performance comparison using JointEnergy vs. competitive baselines.

![result](demo_figs/result_screenshot.png)

## Citation

    @article{wang2021canmulti,
      title={Can multi-label classification networks know what they don't know?},
      author={Wang, Haoran and Liu, Weitang and Bocchieri, Alex and Li, Yixuan},
      journal={Advances in Neural Information Processing Systems},
      year={2021}
    }
--------------------------------------------------------------------------------
/datasets/pascal/voc12-test.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/deeplearning-wisc/multi-label-ood/0c999beb57319016740f19b72bc87c1115c80127/datasets/pascal/voc12-test.mat
--------------------------------------------------------------------------------
/datasets/pascal/voc12-train.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/deeplearning-wisc/multi-label-ood/0c999beb57319016740f19b72bc87c1115c80127/datasets/pascal/voc12-train.mat
--------------------------------------------------------------------------------
/datasets/pascal/voc12-val.mat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/deeplearning-wisc/multi-label-ood/0c999beb57319016740f19b72bc87c1115c80127/datasets/pascal/voc12-val.mat
--------------------------------------------------------------------------------
/demo_figs/result_screenshot.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/deeplearning-wisc/multi-label-ood/0c999beb57319016740f19b72bc87c1115c80127/demo_figs/result_screenshot.png
--------------------------------------------------------------------------------
/demo_figs/teaser.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/deeplearning-wisc/multi-label-ood/0c999beb57319016740f19b72bc87c1115c80127/demo_figs/teaser.png
--------------------------------------------------------------------------------
/eval.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import argparse
3 | import torchvision
4 | import lib
5 | import numpy as np
6 | import torch.nn as nn
7 | from torch.autograd import Variable
8 | from torch.utils import data
9 | from model.classifiersimple import *
10 | from utils.dataloader.pascal_voc_loader import *
11 | from utils.dataloader.nus_wide_loader import *
12 | from utils.dataloader.coco_loader import *
13 | from utils import anom_utils
14 |
15 | def evaluation():
16 |     print("In-dis data: " + args.dataset)
17 |     print("Out-dis data: " + args.ood_data)
18 |     torch.manual_seed(0)
19 |     np.random.seed(0)
20 |     ###################### Setup Dataloader ######################
21 |     normalize = torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406],
22 |                                                  std=[0.229, 0.224, 0.225])
23 |     img_transform = torchvision.transforms.Compose([
24 |         torchvision.transforms.Resize((256, 256)),
25 |         torchvision.transforms.ToTensor(),
26 |         normalize,
27 |     ])
28 |     label_transform = torchvision.transforms.Compose([
29 |         anom_utils.ToLabel(),
30 |     ])
31 |     # in_dis
32 |     if args.dataset == 'pascal':
33 |         train_data = pascalVOCLoader('./datasets/pascal/',
34 |                                      img_transform=img_transform, label_transform=label_transform)
35 |         test_data = pascalVOCLoader('./datasets/pascal/', split="voc12-test",
36 |                                     img_transform=img_transform, label_transform=None)
37 |         val_data =
pascalVOCLoader('./datasets/pascal/', split="voc12-val", 38 | img_transform=img_transform, label_transform=label_transform) 39 | 40 | elif args.dataset == 'coco': 41 | train_data = cocoloader("./datasets/coco/", 42 | img_transform = img_transform, label_transform = label_transform) 43 | val_data = cocoloader('./datasets/coco/', split="multi-label-val2014", 44 | img_transform=img_transform, label_transform=label_transform) 45 | test_data = cocoloader('./datasets/coco/', split="test", 46 | img_transform=img_transform, label_transform=None) 47 | 48 | elif args.dataset == "nus-wide": 49 | train_data = nuswideloader("./datasets/nus-wide/", 50 | img_transform = img_transform, label_transform = label_transform) 51 | val_data = nuswideloader("./datasets/nus-wide/", split="val", 52 | img_transform = img_transform, label_transform = label_transform) 53 | test_data = nuswideloader("./datasets/nus-wide/", split="test", 54 | img_transform = img_transform, label_transform = label_transform) 55 | 56 | else: 57 | raise AssertionError 58 | 59 | args.n_classes = train_data.n_classes 60 | train_loader = data.DataLoader(train_data, batch_size=args.batch_size, num_workers=8, shuffle=True, pin_memory=True) 61 | in_test_loader = data.DataLoader(test_data, batch_size=args.batch_size, num_workers=8, shuffle=False, pin_memory=True) 62 | val_loader = data.DataLoader(val_data, batch_size=args.batch_size, num_workers=8, shuffle=False, pin_memory=True) 63 | 64 | # OOD data 65 | if args.ood_data == "imagenet": 66 | if args.dataset == "nus-wide": 67 | ood_root = "/nobackup-slow/dataset/nus-ood/" 68 | out_test_data = torchvision.datasets.ImageFolder(ood_root, transform=img_transform) 69 | else: 70 | ood_root = "/nobackup-slow/dataset/ImageNet22k/ImageNet-22K" 71 | out_test_data = torchvision.datasets.ImageFolder(ood_root, transform=img_transform) 72 | elif args.ood_data == "texture": 73 | ood_root = "/nobackup-slow/dataset/dtd/images/" 74 | out_test_data = torchvision.datasets.ImageFolder(ood_root, transform = img_transform) 75 | elif args.ood_data == "MNIST": 76 | gray_transform = torchvision.transforms.Compose([ 77 | torchvision.transforms.Resize((256, 256)), 78 | torchvision.transforms.ToTensor(), 79 | torchvision.transforms.Lambda(lambda x: x.repeat(3, 1, 1)), 80 | normalize 81 | ]) 82 | out_test_data = torchvision.datasets.MNIST('/nobackup-slow/dataset/MNIST/', 83 | train=False, transform=gray_transform) 84 | 85 | out_test_loader = data.DataLoader(out_test_data, batch_size=args.batch_size, num_workers=8, pin_memory=True) 86 | 87 | ###################### Load Models ###################### 88 | if args.arch == "resnet101": 89 | orig_resnet = torchvision.models.resnet101(pretrained=True) 90 | features = list(orig_resnet.children()) 91 | model= nn.Sequential(*features[0:8]) 92 | clsfier = clssimp(2048, args.n_classes) 93 | elif args.arch == "densenet": 94 | orig_densenet = torchvision.models.densenet121(pretrained=True) 95 | features = list(orig_densenet.features) 96 | model = nn.Sequential(*features, nn.ReLU(inplace=True)) 97 | clsfier = clssimp(1024, args.n_classes) 98 | 99 | model = model.cuda() 100 | clsfier = clsfier.cuda() 101 | if torch.cuda.device_count() > 1: 102 | print("Using",torch.cuda.device_count(), "GPUs!") 103 | model = nn.DataParallel(model) 104 | clsfier = nn.DataParallel(clsfier) 105 | 106 | model.load_state_dict(torch.load(args.load_model + args.arch + '.pth')) 107 | clsfier.load_state_dict(torch.load(args.load_model + args.arch + 'clssegsimp.pth')) 108 | print("model loaded!") 109 | 110 | # freeze 
the batchnorm and dropout layers 111 | model.eval() 112 | clsfier.eval() 113 | ###################### Compute Scores ###################### 114 | if args.ood == "odin": 115 | print("Using temperature", args.T, "noise", args.noise) 116 | in_scores = lib.get_odin_scores(in_test_loader, model, clsfier, args.method, 117 | args.T, args.noise) 118 | out_scores = lib.get_odin_scores(out_test_loader, model, clsfier, args.method, 119 | args.T, args.noise) 120 | elif args.ood == "M": 121 | ## Feature Extraction 122 | temp_x = torch.rand(2, 3, 256, 256) 123 | temp_x = Variable(temp_x.cuda()) 124 | temp_list = lib.model_feature_list(model, clsfier, temp_x, args.arch)[1] 125 | num_output = len(temp_list) 126 | feature_list = np.empty(num_output) 127 | count = 0 128 | for out in temp_list: 129 | feature_list[count] = out.size(1) 130 | count += 1 131 | print('get sample mean and covariance') 132 | sample_mean, precision = lib.sample_estimator(model, clsfier, args.n_classes, 133 | feature_list, train_loader) 134 | # Only use the 135 | pack = (sample_mean, precision) 136 | print("Using noise", args.noise) 137 | in_scores = lib.get_Mahalanobis_score(model, clsfier, in_test_loader, pack, 138 | args.noise, args.n_classes, args.method) 139 | out_scores = lib.get_Mahalanobis_score(model, clsfier, out_test_loader, pack, 140 | args.noise, args.n_classes, args.method) 141 | 142 | else: 143 | in_scores = lib.get_logits(in_test_loader, model, clsfier, args, name="in_test") 144 | out_scores = lib.get_logits(out_test_loader, model, clsfier, args, name="out_test") 145 | 146 | if args.ood == "lof": 147 | val_scores = lib.get_logits(val_loader, model, clsfier, args, name="in_val") 148 | scores = lib.get_localoutlierfactor_scores(val_scores, in_scores, out_scores) 149 | in_scores = scores[:len(in_scores)] 150 | out_scores = scores[-len(out_scores):] 151 | 152 | if args.ood == "isol": 153 | val_scores = lib.get_logits(val_loader, model, clsfier, args, name="in_val") 154 | scores = lib.get_isolationforest_scores(val_scores, in_scores, out_scores) 155 | in_scores = scores[:len(in_scores)] 156 | out_scores = scores[-len(out_scores):] 157 | ###################### Measure ###################### 158 | anom_utils.get_and_print_results(in_scores, out_scores, args.ood, args.method) 159 | 160 | 161 | if __name__ == '__main__': 162 | parser = argparse.ArgumentParser(description='Hyperparams') 163 | # ood measures 164 | parser.add_argument('--ood', type=str, default='logit', 165 | help='which measure to use odin|M|logit|energy|msp|prob|lof|isol') 166 | parser.add_argument('--method', type=str, default='max', 167 | help='which method to use max|sum') 168 | # dataset 169 | parser.add_argument('--dataset', type=str, default='pascal', 170 | help='Dataset to use pascal|coco|nus-wide') 171 | parser.add_argument('--ood_data', type=str, default='imagenet') 172 | parser.add_argument('--arch', type=str, default='densenet', 173 | help='Architecture to use densenet|resnet101') 174 | parser.add_argument('--batch_size', type=int, default=200, help='Batch Size') 175 | parser.add_argument('--n_classes', type=int, default=20, help='# of classes') 176 | # save and load 177 | parser.add_argument('--save_path', type=str, default="./logits/", help="save the logits") 178 | parser.add_argument('--load_model', type=str, default="./saved_models/", 179 | help='Path to load models') 180 | # input pre-processing 181 | parser.add_argument('--T', type=int, default=1) 182 | parser.add_argument('--noise', type=float, default=0.0) 183 | args = parser.parse_args() 184 
|     args.load_model += args.dataset + '/'
185 |
186 |     args.save_path += args.dataset + '/' + args.ood_data + '/' + args.arch + '/'
187 |     evaluation()
188 |
--------------------------------------------------------------------------------
/fine_tune.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import argparse
3 | import numpy as np
4 | import torch.nn as nn
5 | from torch.autograd import Variable
6 | from torch.utils import data
7 | import torchvision
8 | from utils.anom_utils import ToLabel
9 | from model.classifiersimple import *
10 | from utils.dataloader.pascal_voc_loader import *
11 | from utils.dataloader.nus_wide_loader import *
12 | from utils.dataloader.coco_loader import *
13 | import lib
14 | import torchvision.transforms as transforms
15 | from utils import anom_utils
16 |
17 |
18 | def tune():
19 |     # compute in-distribution scores on the validation set
20 |     torch.manual_seed(0)
21 |     np.random.seed(0)
22 |     if args.ood == 'M':
23 |         pack = (sample_mean, precision)
24 |         in_scores = lib.get_Mahalanobis_score(model, clsfier, val_loader, pack,
25 |                                               args.noise, args.n_classes, args.method)
26 |     else:
27 |         in_scores = lib.get_odin_scores(val_loader, model, clsfier, args.method,
28 |                                         args.T, args.noise)
29 |
30 |     ############################### ood ###############################
31 |     ood_num_examples = 1000
32 |     auroc_list = []
33 |     aupr_list = []
34 |     fpr_list = []
35 |     # /////////////// Gaussian Noise ///////////////
36 |     # print("Gaussian noise detection")
37 |     dummy_targets = -torch.ones(ood_num_examples, args.n_classes)
38 |     ood_data = torch.from_numpy(np.float32(np.clip(
39 |         np.random.normal(size=(ood_num_examples, 3, 256, 256), scale=0.5), -1, 1)))
40 |     ood_data = torch.utils.data.TensorDataset(ood_data, dummy_targets)
41 |     ood_loader = torch.utils.data.DataLoader(ood_data, batch_size=args.batch_size, shuffle=True,
42 |                                              num_workers=args.n_workers)
43 |     # save_name = "\nGaussian"
44 |     if args.ood == "M":
45 |         out_scores = lib.get_Mahalanobis_score(model, clsfier, ood_loader, pack,
46 |                                                args.noise, args.n_classes, args.method)
47 |
48 |     else:
49 |         out_scores = lib.get_odin_scores(ood_loader, model, clsfier, args.method,
50 |                                          args.T, args.noise)
51 |
52 |     auroc, aupr, fpr = anom_utils.get_and_print_results(in_scores, out_scores,
53 |                                                         args.ood, args.method)
54 |     auroc_list.append(auroc)
55 |     aupr_list.append(aupr)
56 |     fpr_list.append(fpr)
57 |     # f.write(save_name+'\n')
58 |     # f.write('FPR{:d}:\t\t\t{:.2f}\n'.format(int(100 * 0.95), 100 * fpr))
59 |     # f.write('AUROC: \t\t\t{:.2f}\n'.format(100 * auroc))
60 |     # f.write('AUPR: \t\t\t{:.2f}\n'.format(100 * aupr))
61 |     # f.write('\n')
62 |
63 |     # /////////////// Uniform Noise ///////////////
64 |     # print('Uniform[-1,1] Noise Detection')
65 |     dummy_targets = -torch.ones(ood_num_examples, args.n_classes)
66 |     ood_data = torch.from_numpy(
67 |         np.random.uniform(size=(ood_num_examples, 3, 256, 256),
68 |                           low=-1.0, high=1.0).astype(np.float32))
69 |     ood_data = torch.utils.data.TensorDataset(ood_data, dummy_targets)
70 |     ood_loader = torch.utils.data.DataLoader(ood_data, batch_size=args.batch_size, shuffle=True,
71 |                                              num_workers=args.n_workers)
72 |
73 |     # save_name = "\nUniform"
74 |     if args.ood == "M":
75 |         out_scores = lib.get_Mahalanobis_score(model, clsfier, ood_loader, pack,
76 |                                                args.noise, args.n_classes, args.method)
77 |     else:
78 |         out_scores = lib.get_odin_scores(ood_loader, model, clsfier, args.method,
79 |                                          args.T, args.noise)
80 |     auroc, aupr, fpr = anom_utils.get_and_print_results(in_scores,
out_scores, 81 | args.ood, args.method) 82 | auroc_list.append(auroc) 83 | aupr_list.append(aupr) 84 | fpr_list.append(fpr) 85 | # f.write(save_name+'\n') 86 | # f.write('FPR{:d}:\t\t\t{:.2f}\n'.format(int(100 * 0.95), 100 * fpr)) 87 | # f.write('AUROC: \t\t\t{:.2f}\n'.format(100 * auroc)) 88 | # f.write('AUPR: \t\t\t{:.2f}\n'.format(100 * aupr)) 89 | # f.write('\n') 90 | 91 | # #/////////////// Arithmetic Mean of Images /////////////// 92 | class AvgOfPair(torch.utils.data.Dataset): 93 | def __init__(self, dataset): 94 | self.dataset = dataset 95 | self.shuffle_indices = np.arange(len(dataset)) 96 | np.random.shuffle(self.shuffle_indices) 97 | 98 | def __getitem__(self, i): 99 | random_idx = np.random.choice(len(self.dataset)) 100 | while random_idx == i: 101 | random_idx = np.random.choice(len(self.dataset)) 102 | 103 | return self.dataset[i][0] / 2. + self.dataset[random_idx][0] / 2., 0 104 | 105 | def __len__(self): 106 | return len(self.dataset) 107 | 108 | ood_loader = torch.utils.data.DataLoader(AvgOfPair(val_data), 109 | batch_size=args.batch_size, shuffle=True, 110 | num_workers=args.n_workers, pin_memory=True) 111 | 112 | # save_name = "\nArithmetic_Mean" 113 | # print(save_name + 'Detection') 114 | if args.ood == "M": 115 | out_scores = lib.get_Mahalanobis_score(model, clsfier, ood_loader, pack, 116 | args.noise, args.n_classes, args.method) 117 | else: 118 | out_scores = lib.get_odin_scores(ood_loader, model, clsfier, args.method, 119 | args.T, args.noise) 120 | auroc, aupr, fpr = anom_utils.get_and_print_results(in_scores, out_scores, 121 | args.ood, args.method) 122 | auroc_list.append(auroc) 123 | aupr_list.append(aupr) 124 | fpr_list.append(fpr) 125 | # f.write(save_name+'\n') 126 | # f.write('FPR{:d}:\t\t\t{:.2f}\n'.format(int(100 * 0.95), 100 * fpr)) 127 | # f.write('AUROC: \t\t\t{:.2f}\n'.format(100 * auroc)) 128 | # f.write('AUPR: \t\t\t{:.2f}\n'.format(100 * aupr)) 129 | # f.write('\n') 130 | 131 | # # /////////////// Geometric Mean of Images /////////////// 132 | if args.dataset == 'pascal': 133 | modified_data = pascalVOCLoader('./datasets/pascal/', split="voc12-val", 134 | img_transform=transforms.Compose([transforms.Resize((256,256)),transforms.ToTensor()])) 135 | elif args.dataset == 'coco': 136 | modified_data = cocoloader('./datasets/coco/', split="multi-label-val2014", 137 | img_transform=transforms.Compose([transforms.Resize((256,256)),transforms.ToTensor()])) 138 | elif args.dataset == "nus-wide": 139 | modified_data = nuswideloader("./datasets/nus-wide/", split="val", 140 | img_transform=transforms.Compose([transforms.Resize((256,256)),transforms.ToTensor()])) 141 | 142 | modified_data = data.Subset(modified_data, indices) 143 | class GeomMeanOfPair(torch.utils.data.Dataset): 144 | def __init__(self, dataset): 145 | self.dataset = dataset 146 | self.shuffle_indices = np.arange(len(dataset)) 147 | np.random.shuffle(self.shuffle_indices) 148 | 149 | def __getitem__(self, i): 150 | random_idx = np.random.choice(len(self.dataset)) 151 | while random_idx == i: 152 | random_idx = np.random.choice(len(self.dataset)) 153 | 154 | return normalize(torch.sqrt(self.dataset[i][0] * self.dataset[random_idx][0])), 0 155 | 156 | def __len__(self): 157 | return len(self.dataset) 158 | 159 | ood_loader = torch.utils.data.DataLoader( 160 | GeomMeanOfPair(modified_data), batch_size=args.batch_size, shuffle=True, 161 | num_workers=args.n_workers, pin_memory=True) 162 | # save_name = "\nGeometric_Mean" 163 | # print(save_name + 'Detection') 164 | if args.ood == "M": 165 | 
out_scores = lib.get_Mahalanobis_score(model, clsfier, ood_loader, pack, 166 | args.noise, args.n_classes, args.method) 167 | else: 168 | out_scores = lib.get_odin_scores(ood_loader, model, clsfier, args.method, 169 | args.T, args.noise) 170 | auroc, aupr, fpr = anom_utils.get_and_print_results(in_scores, out_scores, 171 | args.ood, args.method) 172 | auroc_list.append(auroc) 173 | aupr_list.append(aupr) 174 | fpr_list.append(fpr) 175 | # f.write(save_name+'\n') 176 | # f.write('FPR{:d}:\t\t\t{:.2f}\n'.format(int(100 * 0.95), 100 * fpr)) 177 | # f.write('AUROC: \t\t\t{:.2f}\n'.format(100 * auroc)) 178 | # f.write('AUPR: \t\t\t{:.2f}\n'.format(100 * aupr)) 179 | # f.write('\n') 180 | 181 | # /////////////// Jigsaw Images /////////////// 182 | 183 | ood_loader = torch.utils.data.DataLoader(modified_data, batch_size=args.batch_size, shuffle=True, 184 | num_workers=args.n_workers, pin_memory=True) 185 | 186 | jigsaw = lambda x: torch.cat(( 187 | torch.cat((torch.cat((x[:, 64:128, :128], x[:, :64, :128]), 1), 188 | x[:, 128:, :128]), 2), 189 | torch.cat((x[:, 128:, 128:], 190 | torch.cat((x[:, :128, 192:], x[:, :128, 128:192]), 2)), 2), 191 | ), 1) 192 | ood_loader.dataset.transform = transforms.Compose([transforms.Resize((256,256)), 193 | transforms.ToTensor(),jigsaw, normalize]) 194 | # save_name = "\nJigsaw" 195 | # print(save_name + 'Detection') 196 | if args.ood == "M": 197 | out_scores = lib.get_Mahalanobis_score(model, clsfier, ood_loader, pack, 198 | args.noise, args.n_classes, args.method) 199 | else: 200 | out_scores = lib.get_odin_scores(ood_loader, model, clsfier, args.method, 201 | args.T, args.noise) 202 | auroc, aupr, fpr = anom_utils.get_and_print_results(in_scores, out_scores, 203 | args.ood, args.method) 204 | auroc_list.append(auroc) 205 | aupr_list.append(aupr) 206 | fpr_list.append(fpr) 207 | # f.write(save_name+'\n') 208 | # f.write('FPR{:d}:\t\t\t{:.2f}\n'.format(int(100 * 0.95), 100 * fpr)) 209 | # f.write('AUROC: \t\t\t{:.2f}\n'.format(100 * auroc)) 210 | # f.write('AUPR: \t\t\t{:.2f}\n'.format(100 * aupr)) 211 | # f.write('\n') 212 | 213 | # /////////////// Speckled Images /////////////// 214 | 215 | # speckle = lambda x: torch.clamp(x + x * torch.randn_like(x), 0, 1) 216 | # ood_loader.dataset.transform = transforms.Compose([transforms.Resize((256,256)), transforms.ToTensor(), speckle, normalize]) 217 | # save_name = "\nSpeckled" 218 | # print(save_name + 'Detection') 219 | # if args.ood == "M": 220 | # out_scores = lib.get_Mahalanobis_score(model, clsfier, ood_loader, pack, 221 | # args.noise, args.n_classes) 222 | # else: 223 | # out_scores = lib.get_odin_scores(ood_loader, model, clsfier, args.method, 224 | # args.T, args.noise) 225 | # auroc, aupr, fpr = anom_utils.get_and_print_results(in_scores, out_scores, 226 | # args.ood, args.method) 227 | # auroc_list.append(auroc) 228 | # aupr_list.append(aupr) 229 | # fpr_list.append(fpr) 230 | # # f.write(save_name+'\n') 231 | # # f.write('FPR{:d}:\t\t\t{:.2f}\n'.format(int(100 * 0.95), 100 * fpr)) 232 | # # f.write('AUROC: \t\t\t{:.2f}\n'.format(100 * auroc)) 233 | # # f.write('AUPR: \t\t\t{:.2f}\n'.format(100 * aupr)) 234 | # # f.write('\n') 235 | # 236 | # 237 | # # /////////////// Pixelated Images /////////////// 238 | # 239 | # pixelate = lambda x: x.resize((int(256 * 0.2), int(256 * 0.2)), PILImage.BOX).resize((256, 256), PILImage.BOX) 240 | # ood_loader.dataset.transform = transforms.Compose([pixelate, 241 | # transforms.ToTensor(), normalize]) 242 | # save_name = "\nPixelated" 243 | # print(save_name + 
'Detection') 244 | # if args.ood == "M": 245 | # out_scores = lib.get_Mahalanobis_score(model, clsfier, ood_loader, pack, 246 | # args.noise, args.n_classes) 247 | # else: 248 | # out_scores = lib.get_odin_scores(ood_loader, model, clsfier, args.method, 249 | # args.T, args.noise) 250 | # auroc, aupr, fpr = anom_utils.get_and_print_results(in_scores, out_scores, 251 | # args.ood, args.method) 252 | # auroc_list.append(auroc) 253 | # aupr_list.append(aupr) 254 | # fpr_list.append(fpr) 255 | # # f.write(save_name+'\n') 256 | # # f.write('FPR{:d}:\t\t\t{:.2f}\n'.format(int(100 * 0.95), 100 * fpr)) 257 | # # f.write('AUROC: \t\t\t{:.2f}\n'.format(100 * auroc)) 258 | # # f.write('AUPR: \t\t\t{:.2f}\n'.format(100 * aupr)) 259 | # # f.write('\n') 260 | # 261 | # 262 | # # /////////////// RGB Ghosted/Shifted Images /////////////// 263 | # 264 | # rgb_shift = lambda x: torch.cat((x[1:2].index_select(2, torch.LongTensor([i for i in range(256 - 1, -1, -1)])), 265 | # x[2:, :, :], x[0:1, :, :]), 0) 266 | # ood_loader.dataset.transform = transforms.Compose([transforms.Resize((256,256)),transforms.ToTensor(),rgb_shift, normalize]) 267 | # 268 | # save_name = "\nShifted" 269 | # print(save_name + 'Detection') 270 | # if args.ood == "M": 271 | # out_scores = lib.get_Mahalanobis_score(model, clsfier, ood_loader, pack, 272 | # args.noise, args.n_classes) 273 | # else: 274 | # out_scores = lib.get_odin_scores(ood_loader, model, clsfier, args.method, 275 | # args.T, args.noise) 276 | # auroc, aupr, fpr = anom_utils.get_and_print_results(in_scores, out_scores, 277 | # args.ood, args.method) 278 | # auroc_list.append(auroc) 279 | # aupr_list.append(aupr) 280 | # fpr_list.append(fpr) 281 | # # f.write(save_name + '\n') 282 | # # f.write('FPR{:d}:\t\t\t{:.2f}\n'.format(int(100 * 0.95), 100 * fpr)) 283 | # # f.write('AUROC: \t\t\t{:.2f}\n'.format(100 * auroc)) 284 | # # f.write('AUPR: \t\t\t{:.2f}\n'.format(100 * aupr)) 285 | # # f.write('\n') 286 | # 287 | # # /////////////// Inverted Images /////////////// 288 | # # not done on all channels to make image ood with higher probability 289 | # invert = lambda x: torch.cat((x[0:1, :, :], 1 - x[1:2, :, ], 1 - x[2:, :, :],), 0) 290 | # ood_loader.dataset.transform = transforms.Compose([transforms.Resize((256,256)), 291 | # transforms.ToTensor(),invert, normalize]) 292 | # 293 | # save_name = "\nInverted" 294 | # print(save_name + 'Detection') 295 | # if args.ood == "M": 296 | # out_scores = lib.get_Mahalanobis_score(model, clsfier, ood_loader, pack, 297 | # args.noise, args.n_classes) 298 | # else: 299 | # out_scores = lib.get_odin_scores(ood_loader, model, clsfier, args.method, 300 | # args.T, args.noise) 301 | # auroc, aupr, fpr = anom_utils.get_and_print_results(in_scores, out_scores, 302 | # args.ood, args.method) 303 | # auroc_list.append(auroc) 304 | # aupr_list.append(aupr) 305 | # fpr_list.append(fpr) 306 | # # f.write(save_name + '\n') 307 | # # f.write('FPR{:d}:\t\t\t{:.2f}\n'.format(int(100 * 0.95), 100 * fpr)) 308 | # # f.write('AUROC: \t\t\t{:.2f}\n'.format(100 * auroc)) 309 | # # f.write('AUPR: \t\t\t{:.2f}\n'.format(100 * aupr)) 310 | # # f.write('\n') 311 | 312 | # /////////////// Mean Results /////////////// 313 | 314 | # print('Mean Validation Results') 315 | anom_utils.print_measures(np.mean(auroc_list), np.mean(aupr_list), np.mean(fpr_list), 316 | ood="Mean", method="validation") 317 | # f.write("Mean Validation Results\n") 318 | # f.write('FPR{:d}:\t\t\t{:.2f}\n'.format(int(100 * 0.95), 100 * np.mean(fpr_list))) 319 | # f.write('AUROC: 
\t\t\t{:.2f}\n'.format(100 * np.mean(auroc_list))) 320 | # f.write('AUPR: \t\t\t{:.2f}\n'.format(100 * np.mean(aupr_list))) 321 | 322 | return np.mean(fpr_list) 323 | 324 | if __name__ == '__main__': 325 | parser = argparse.ArgumentParser(description='Hyperparams') 326 | # ood measures 327 | parser.add_argument('--ood', type=str, default='odin', 328 | help='which measure to tune odin|M') 329 | parser.add_argument('--method', type=str, default='max', 330 | help='which method to use max|sum') 331 | parser.add_argument('--dataset', type=str, default='pascal', 332 | help='Dataset to use pascal|coco|nus-wide') 333 | parser.add_argument('--arch', type=str, default='densenet', 334 | help='Architecture to use densenet|resnet101') 335 | parser.add_argument('--batch_size', type=int, default=200, help='Batch Size') 336 | parser.add_argument('--n_workers', type=int, default=4) 337 | parser.add_argument('--n_classes', type=int, default=20) 338 | 339 | parser.add_argument('--load_model', type=str, default="saved_models/", 340 | help="Path to load models") 341 | parser.add_argument('--T', type=int, default=1) 342 | parser.add_argument('--noise', type=float, default=0.0) 343 | args = parser.parse_args() 344 | 345 | args.load_model += args.dataset + '/' 346 | torch.manual_seed(0) 347 | np.random.seed(0) 348 | 349 | # Setup Dataloader 350 | normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], 351 | std=[0.229, 0.224, 0.225]) 352 | img_transform = transforms.Compose([ 353 | transforms.Resize((256, 256)), 354 | transforms.ToTensor(), 355 | normalize, 356 | ]) 357 | label_transform = transforms.Compose([ 358 | ToLabel(), 359 | ]) 360 | if args.dataset == 'pascal': 361 | train_data = pascalVOCLoader('./datasets/pascal/', 362 | img_transform=img_transform, label_transform=label_transform) 363 | val_data = pascalVOCLoader('./datasets/pascal/', split="voc12-val", 364 | img_transform=img_transform, label_transform=label_transform) 365 | 366 | elif args.dataset == 'coco': 367 | train_data = cocoloader("./datasets/coco/", 368 | img_transform=img_transform, label_transform=label_transform) 369 | val_data = cocoloader('./datasets/coco/', split="multi-label-val2014", 370 | img_transform=img_transform, label_transform=label_transform) 371 | # test_loader = data.DataLoader(test_data, batch_size=args.batch_size, num_workers=8, shuffle=False) 372 | elif args.dataset == "nus-wide": 373 | train_data = nuswideloader("./datasets/nus-wide/", 374 | img_transform=img_transform, label_transform=label_transform) 375 | val_data = nuswideloader("./datasets/nus-wide/", split="val", 376 | img_transform=img_transform, label_transform=label_transform) 377 | 378 | # To speed up the process 379 | args.n_classes = val_data.n_classes 380 | indices = np.random.randint(len(val_data), size = 1000) 381 | val_data = data.Subset(val_data, indices) 382 | 383 | val_loader = data.DataLoader(val_data, batch_size=args.batch_size, num_workers=args.n_workers, shuffle=False, pin_memory=True) 384 | train_loader = data.DataLoader(train_data, batch_size=args.batch_size, num_workers=args.n_workers, shuffle=True, pin_memory=True) 385 | 386 | # load models 387 | if args.arch == "resnet101": 388 | orig_resnet = torchvision.models.resnet101(pretrained=True) 389 | features = list(orig_resnet.children()) 390 | model = nn.Sequential(*features[0:8]) 391 | clsfier = clssimp(2048, args.n_classes) 392 | elif args.arch == "densenet": 393 | orig_densenet = torchvision.models.densenet121(pretrained=True) 394 | features = list(orig_densenet.features) 395 | model = 
nn.Sequential(*features, nn.ReLU(inplace=True)) 396 | clsfier = clssimp(1024, args.n_classes) 397 | 398 | if torch.cuda.device_count() > 1: 399 | model = nn.DataParallel(model).cuda() 400 | clsfier = nn.DataParallel(clsfier).cuda() 401 | 402 | # load model 403 | model.load_state_dict(torch.load(args.load_model + args.arch + '.pth')) 404 | clsfier.load_state_dict(torch.load(args.load_model + args.arch + 'clssegsimp.pth')) 405 | print("model loaded!") 406 | 407 | # freeze the batchnorm and dropout layers 408 | model.eval() 409 | clsfier.eval() 410 | 411 | temp = [1, 10, 100, 1000] 412 | noises = [0, 0.0002, 0.0004, 0.0008, 0.001, 0.0012, 0.0014, 413 | 0.0016, 0.002, 0.0024, 0.0028, 0.003, 414 | 0.0034, 0.0036, 0.004] 415 | if args.ood == "M": 416 | temp = [1] 417 | noises = [0, 0.002, 0.0014, 0.001, 0.0005, 0.005] 418 | # feature extraction 419 | temp_x = torch.rand(2, 3, 256, 256) 420 | temp_x = Variable(temp_x.cuda()) 421 | temp_list = lib.model_feature_list(model, clsfier, temp_x, args.arch)[1] 422 | num_output = len(temp_list) 423 | feature_list = np.empty(num_output) 424 | count = 0 425 | for out in temp_list: 426 | feature_list[count] = out.size(1) 427 | count += 1 428 | # print(feature_list) 429 | print('get sample mean and covariance') 430 | # 431 | sample_mean, precision = lib.sample_estimator(model, clsfier, args.n_classes, feature_list, train_loader) 432 | 433 | best_T = 1 434 | best_noise = 0 435 | best_fpr = 100 436 | for T in temp: 437 | for noise in noises: 438 | args.T = T 439 | args.noise = noise 440 | print("T = "+str(T)+"\tnoise = "+str(noise)) 441 | fpr = tune() 442 | if fpr < best_fpr: 443 | best_T = T 444 | best_noise = noise 445 | best_fpr = fpr 446 | 447 | f = open("./" + args.dataset + '_' + args.arch + '_' + args.ood + '_' + 448 | args.method + ".txt", 'w') 449 | f.write("Best T%d\tBest noise%.5f\n" % (best_T, best_noise)) 450 | f.close() 451 | 452 | print("Best T%d\tBest noise%.5f" % (best_T, best_noise)) 453 | -------------------------------------------------------------------------------- /lib.py: -------------------------------------------------------------------------------- 1 | import os 2 | import torch 3 | import torch.nn as nn 4 | import numpy as np 5 | import time 6 | import torch.nn.functional as F 7 | from torch.autograd import Variable 8 | 9 | to_np = lambda x: x.data.cpu().numpy() 10 | 11 | def get_odin_scores(loader, model, clsfier, method, T, noise): 12 | ## get logits 13 | bceloss = nn.BCEWithLogitsLoss(reduction="none") 14 | for i, (images, _) in enumerate(loader): 15 | images = Variable(images.cuda(), requires_grad=True) 16 | nnOutputs = clsfier(model(images)) 17 | 18 | # using temperature scaling 19 | preds = torch.sigmoid(nnOutputs / T) 20 | 21 | labels = torch.ones(preds.shape).cuda() * (preds >= 0.5) 22 | labels = Variable(labels.float()) 23 | 24 | # input pre-processing 25 | loss = bceloss(nnOutputs, labels) 26 | 27 | if method == 'max': 28 | idx = torch.max(preds, dim=1)[1].unsqueeze(-1) 29 | loss = torch.mean(torch.gather(loss, 1, idx)) 30 | elif method == 'sum': 31 | loss = torch.mean(torch.sum(loss, dim=1)) 32 | 33 | loss.backward() 34 | # calculating the perturbation 35 | gradient = torch.ge(images.grad.data, 0) 36 | gradient = (gradient.float() - 0.5) * 2 37 | gradient.index_copy_(1, torch.LongTensor([0]).cuda(), 38 | gradient.index_select(1, torch.LongTensor([0]).cuda()) / (0.229)) 39 | gradient.index_copy_(1, torch.LongTensor([1]).cuda(), 40 | gradient.index_select(1, torch.LongTensor([1]).cuda()) / (0.224)) 41 | 
gradient.index_copy_(1, torch.LongTensor([2]).cuda(), 42 | gradient.index_select(1, torch.LongTensor([2]).cuda()) / (0.225)) 43 | tempInputs = torch.add(images.data, gradient, alpha=-noise) 44 | 45 | with torch.no_grad(): 46 | nnOutputs = clsfier(model(Variable(tempInputs))) 47 | 48 | ## compute odin score 49 | outputs = torch.sigmoid(nnOutputs / T) 50 | 51 | if method == "max": 52 | score = np.max(to_np(outputs), axis=1) 53 | elif method == "sum": 54 | score = np.sum(to_np(outputs), axis=1) 55 | 56 | if i == 0: 57 | scores = score 58 | else: 59 | scores = np.concatenate((scores, score),axis=0) 60 | 61 | return scores 62 | 63 | def sample_estimator(model, clsfier, num_classes, feature_list, train_loader): 64 | """ 65 | compute sample mean and precision (inverse of covariance) 66 | return: sample_class_mean: list of class mean 67 | precision: list of precisions 68 | """ 69 | import sklearn.covariance 70 | 71 | group_lasso = sklearn.covariance.EmpiricalCovariance(assume_centered=False) 72 | num_output = len(feature_list) 73 | num_sample_per_class = np.empty(num_classes) 74 | num_sample_per_class.fill(0) 75 | list_features = [] 76 | 77 | # list_features = [] 78 | # for i in range(num_output): 79 | # temp_list = [] 80 | # for j in range(num_classes): 81 | # temp_list.append(0) 82 | # list_features.append(temp_list) 83 | 84 | for j in range(num_classes): 85 | list_features.append(0) 86 | 87 | idx = 0 88 | with torch.no_grad(): 89 | for data, target in train_loader: 90 | idx += 1 91 | print(idx) 92 | data = Variable(data.cuda()) 93 | 94 | target = target.cuda() 95 | 96 | # output, out_features = model_feature_list(model, clsfier, data) # output = size[batch_size, num_class] 97 | # get hidden features 98 | # for i in range(num_output): 99 | # out_features[i] = out_features[i].view(out_features[i].size(0), out_features[i].size(1), -1) 100 | # out_features[i] = torch.mean(out_features[i].data, 2) 101 | 102 | out_features = model(data) 103 | out_features = out_features.view(out_features.size(0), out_features.size(1), -1) 104 | out_features = torch.mean(out_features.data, 2) 105 | 106 | # construct the sample matrix 107 | # use the training set labels(multiple) or set with the one with max prob 108 | 109 | for i in range(data.size(0)): 110 | # px = 0 111 | for j in range(num_classes): 112 | if target[i][j] == 0: 113 | continue 114 | label = j 115 | if num_sample_per_class[label] == 0: 116 | # out_count = 0 117 | # for out in out_features: 118 | # list_features[out_count][label] = out[i].view(1, -1) 119 | # out_count += 1 120 | 121 | list_features[label] = out_features[i].view(1, -1) 122 | else: 123 | # out_count = 0 124 | # for out in out_features: 125 | # list_features[out_count][label] \ 126 | # = torch.cat((list_features[out_count][label], out[i].view(1, -1)), 0) 127 | # out_count += 1 128 | 129 | list_features[label] = torch.cat((list_features[label], 130 | out_features[i].view(1, -1)), 0) 131 | num_sample_per_class[label] += 1 132 | 133 | # sample_class_mean = [] 134 | # out_count = 0 135 | # for num_feature in feature_list: 136 | # temp_list = torch.Tensor(num_classes, int(num_feature)).cuda() 137 | # for j in range(num_classes): 138 | # temp_list[j] = torch.mean(list_features[out_count][j], 0) 139 | # sample_class_mean.append(temp_list) 140 | # out_count += 1 141 | 142 | num_feature = feature_list[-1] 143 | temp_list = torch.Tensor(num_classes, int(num_feature)).cuda() 144 | for j in range(num_classes): 145 | temp_list[j] = torch.mean(list_features[j], 0) 146 | sample_class_mean = temp_list 
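# At this point `sample_class_mean` holds one mean feature vector per class,
# estimated from the pooled penultimate-layer features of the training set
# (an image's feature vector contributes to every class it is labeled with).
# The commented-out blocks are the original multi-layer variant; only the
# final feature layer is used here. What follows fits a single covariance
# matrix shared across all classes with sklearn's EmpiricalCovariance and
# returns its inverse (the precision matrix), which the Mahalanobis score
# later evaluates as -0.5 * (f(x) - mu_c)^T * precision * (f(x) - mu_c)
# for each class c.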
147 | 148 | # precision = [] 149 | # for k in range(num_output): 150 | # X = 0 151 | # for i in range(num_classes): 152 | # if i == 0: 153 | # X = list_features[k][i] - sample_class_mean[k][i] 154 | # else: 155 | # X = torch.cat((X, list_features[k][i] - sample_class_mean[k][i]), 0) 156 | # 157 | # # find inverse 158 | # group_lasso.fit(X.cpu().numpy()) 159 | # temp_precision = group_lasso.precision_ 160 | # temp_precision = torch.from_numpy(temp_precision).float().cuda() 161 | # precision.append(temp_precision) 162 | 163 | X = 0 164 | for i in range(num_classes): 165 | if i == 0: 166 | X = list_features[i] - sample_class_mean[i] 167 | else: 168 | X = torch.cat((X, list_features[i] - sample_class_mean[i]), 0) 169 | # find inverse 170 | group_lasso.fit(X.cpu().numpy()) 171 | temp_precision = group_lasso.precision_ 172 | temp_precision = torch.from_numpy(temp_precision).float().cuda() 173 | precision = temp_precision 174 | 175 | return sample_class_mean, precision 176 | 177 | 178 | def get_Mahalanobis_score(model, clsfier, loader, pack, noise, num_classes, method): 179 | ''' 180 | Compute the proposed Mahalanobis confidence score on input dataset 181 | return: Mahalanobis score from layer_index 182 | ''' 183 | sample_mean, precision = pack 184 | model.eval() 185 | clsfier.eval() 186 | Mahalanobis = [] 187 | for i, (data, target) in enumerate(loader): 188 | data = Variable(data.cuda(), requires_grad=True) 189 | 190 | # out_features = model_penultimate_layer(model, clsfier, data) 191 | out_features = model(data) 192 | out_features = out_features.view(out_features.size(0), out_features.size(1), -1) 193 | out_features = torch.mean(out_features, 2) # size(batch_size, F) 194 | 195 | # compute Mahalanobis score 196 | gaussian_score = 0 197 | for i in range(num_classes): 198 | batch_sample_mean = sample_mean[i] 199 | zero_f = out_features.data - batch_sample_mean 200 | term_gau = -0.5 * torch.mm(torch.mm(zero_f, precision), zero_f.t()).diag() 201 | if i == 0: 202 | gaussian_score = term_gau.view(-1, 1) 203 | else: 204 | gaussian_score = torch.cat((gaussian_score, term_gau.view(-1, 1)), 1) 205 | 206 | # Input_processing 207 | sample_pred = gaussian_score.max(1)[1] 208 | batch_sample_mean = sample_mean.index_select(0, sample_pred) 209 | zero_f = out_features - Variable(batch_sample_mean) 210 | pure_gau = -0.5 * torch.mm(torch.mm(zero_f, Variable(precision)), zero_f.t()).diag() 211 | loss = torch.mean(-pure_gau) 212 | loss.backward() 213 | 214 | gradient = torch.ge(data.grad.data, 0) 215 | gradient = (gradient.float() - 0.5) * 2 216 | gradient.index_copy_(1, torch.LongTensor([0]).cuda(), 217 | gradient.index_select(1, torch.LongTensor([0]).cuda()) / (0.229)) 218 | gradient.index_copy_(1, torch.LongTensor([1]).cuda(), 219 | gradient.index_select(1, torch.LongTensor([1]).cuda()) / (0.224)) 220 | gradient.index_copy_(1, torch.LongTensor([2]).cuda(), 221 | gradient.index_select(1, torch.LongTensor([2]).cuda()) / (0.225)) 222 | tempInputs = torch.add(data.data, gradient, alpha=-noise) 223 | 224 | #noise_out_features = model.intermediate_forward(Variable(tempInputs, volatile=True), layer_index) 225 | with torch.no_grad(): 226 | # noise_out_features = model_penultimate_layer(model, clsfier, Variable(tempInputs)) 227 | noise_out_features = model(Variable(tempInputs)) 228 | noise_out_features = noise_out_features.view(noise_out_features.size(0), noise_out_features.size(1), -1) 229 | noise_out_features = torch.mean(noise_out_features, 2) 230 | noise_gaussian_score = 0 231 | for i in range(num_classes): 232 | 
batch_sample_mean = sample_mean[i] 233 | zero_f = noise_out_features.data - batch_sample_mean 234 | term_gau = -0.5 * torch.mm(torch.mm(zero_f, precision), zero_f.t()).diag() 235 | if i == 0: 236 | noise_gaussian_score = term_gau.view(-1, 1) 237 | else: 238 | noise_gaussian_score = torch.cat((noise_gaussian_score, term_gau.view(-1, 1)), 1) 239 | # noise_gaussion_score size([batch_size, n_classes]) 240 | 241 | if method == "max": 242 | noise_gaussian_score, _ = torch.max(noise_gaussian_score, dim=1) 243 | elif method == "sum": 244 | noise_gaussian_score = torch.sum(noise_gaussian_score, dim=1) 245 | 246 | Mahalanobis.extend(to_np(noise_gaussian_score)) 247 | 248 | return Mahalanobis 249 | 250 | 251 | def model_feature_list(model, clsfier, x, arch): 252 | out_list = [] 253 | if arch == "resnet101": 254 | out = model.module[:4](x) 255 | out_list.append(out) 256 | out = model.module[4](out) 257 | out_list.append(out) 258 | out = model.module[5](out) 259 | out_list.append(out) 260 | out = model.module[6](out) 261 | out_list.append(out) 262 | out = model.module[7](out) 263 | out_list.append(out.data) 264 | elif arch == "densenet": 265 | out = model.module[:4](x) 266 | out_list.append(out) 267 | out = model.module[4:6](out) 268 | out_list.append(out) 269 | out = model.module[6:8](out) 270 | out_list.append(out) 271 | out = model.module[8:10](out) 272 | out_list.append(out) 273 | out = model.module[10:](out) 274 | out_list.append(out.data) 275 | return clsfier(out), out_list 276 | 277 | def get_logits(loader, model, clsfier, args, k=20, name=None): 278 | print(args.save_path + name + ".npy", os.path.exists(args.save_path + name + ".npy")) 279 | if not (os.path.exists(args.save_path + name + ".npy")): 280 | logits_np = np.empty([0, args.n_classes]) 281 | 282 | with torch.no_grad(): 283 | for i, (images, labels) in enumerate(loader): 284 | 285 | images = Variable(images.cuda()) 286 | nnOutputs = model(images) 287 | nnOutputs = clsfier(nnOutputs) 288 | 289 | nnOutputs_np = to_np(nnOutputs.squeeze()) 290 | logits_np = np.vstack((logits_np, nnOutputs_np)) 291 | 292 | os.makedirs(args.save_path, exist_ok = True) 293 | np.save(args.save_path + name, logits_np) 294 | 295 | else: 296 | logits_np = np.load(args.save_path + name + ".npy") 297 | 298 | ## Compute the Score 299 | logits = torch.from_numpy(logits_np).cuda() 300 | outputs = torch.sigmoid(logits) 301 | if args.ood == "logit": 302 | if args.method == "max": scores = np.max(logits_np, axis=1) 303 | if args.method == "sum": scores = np.sum(logits_np, axis=1) 304 | elif args.ood == "energy": 305 | E_f = torch.log(1+torch.exp(logits)) 306 | if args.method == "max": scores = to_np(torch.max(E_f, dim=1)[0]) 307 | if args.method == "sum": scores = to_np(torch.sum(E_f, dim=1)) 308 | if args.method == "topk": 309 | scores = to_np(torch.sum(torch.topk(E_f, k=k, dim=1)[0], dim=1)) 310 | elif args.ood == "prob": 311 | if args.method == "max": scores = np.max(to_np(outputs), axis=1) 312 | if args.method == "sum": scores = np.sum(to_np(outputs),axis=1) 313 | elif args.ood == "msp": 314 | outputs = F.softmax(logits, dim=1) 315 | scores = np.max(to_np(outputs), axis=1) 316 | else: 317 | scores = logits_np 318 | 319 | return scores 320 | 321 | 322 | def get_localoutlierfactor_scores(val, test, out_scores): 323 | import sklearn.neighbors 324 | scorer = sklearn.neighbors.LocalOutlierFactor(novelty=True) 325 | print("fitting validation set") 326 | start = time.time() 327 | scorer.fit(val) 328 | end = time.time() 329 | print("fitting took ", end - start) 330 | val = 
np.asarray(val) 331 | test = np.asarray(test) 332 | out_scores = np.asarray(out_scores) 333 | print(val.shape, test.shape, out_scores.shape) 334 | return scorer.score_samples(np.vstack((test, out_scores))) 335 | 336 | 337 | def get_isolationforest_scores(val, test, out_scores): 338 | import sklearn.ensemble 339 | rng = np.random.RandomState(42) 340 | scorer = sklearn.ensemble.IsolationForest(random_state = rng) 341 | print("fitting validation set") 342 | start = time.time() 343 | scorer.fit(val) 344 | end = time.time() 345 | print("fitting took ", end - start) 346 | val = np.asarray(val) 347 | test = np.asarray(test) 348 | out_scores = np.asarray(out_scores) 349 | print(val.shape, test.shape, out_scores.shape) 350 | return scorer.score_samples(np.vstack((test, out_scores))) 351 | 352 | 353 | 354 | -------------------------------------------------------------------------------- /model/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/deeplearning-wisc/multi-label-ood/0c999beb57319016740f19b72bc87c1115c80127/model/__init__.py -------------------------------------------------------------------------------- /model/classifiersimple.py: -------------------------------------------------------------------------------- 1 | import re 2 | import torch 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | import torch.utils.model_zoo as model_zoo 6 | from collections import OrderedDict 7 | 8 | # GroupNorm 9 | 10 | class clssimp(nn.Module): 11 | def __init__(self, ch=2880, num_classes=20): 12 | 13 | super(clssimp, self).__init__() 14 | self.pool = nn.AdaptiveAvgPool2d(output_size=(1, 1)) 15 | self.way1 = nn.Sequential( 16 | nn.Linear(ch, 1000, bias=True), 17 | nn.BatchNorm1d(1000), 18 | nn.ReLU(inplace=True), 19 | ) 20 | 21 | self.cls= nn.Linear(1000, num_classes,bias=True) 22 | 23 | def forward(self, x): 24 | # bp() 25 | x = self.pool(x) 26 | x = x.reshape(x.size(0), -1) 27 | x = self.way1(x) 28 | logits = self.cls(x) 29 | return logits 30 | 31 | def intermediate_forward(self, x): 32 | x = self.pool(x) 33 | x = x.reshape(x.size(0), -1) 34 | x = self.way1(x) 35 | return x 36 | 37 | 38 | 39 | 40 | class segclssimp_group(nn.Module): 41 | def __init__(self, ch=2880, num_classes=21): 42 | 43 | super(segclssimp_group, self).__init__() 44 | self.depthway1 = nn.Sequential( 45 | nn.Conv2d(ch, 1000, kernel_size=1), 46 | nn.GroupNorm(4,1000), 47 | nn.ReLU(inplace=True), 48 | ) 49 | self.depthway2 = nn.Sequential( 50 | nn.Conv2d(1000, 1000, kernel_size=1), 51 | nn.GroupNorm(4,1000), 52 | nn.ReLU(inplace=True), 53 | ) 54 | self.depthway3 = nn.Sequential( 55 | nn.Conv2d(1000, 512, kernel_size=1), 56 | nn.GroupNorm(4,512), 57 | nn.ReLU(inplace=True), 58 | ) 59 | 60 | self.clsdepth = nn.Conv2d(512, num_classes, kernel_size=1) 61 | 62 | def forward(self, x): 63 | # bp() 64 | 65 | seg = self.depthway1(x) 66 | seg = self.depthway2(seg) 67 | seg = self.depthway3(seg) 68 | seg = self.clsdepth(seg) 69 | 70 | 71 | 72 | return seg 73 | -------------------------------------------------------------------------------- /train.py: -------------------------------------------------------------------------------- 1 | import os 2 | import torch 3 | import argparse 4 | import torchvision 5 | import torch.nn as nn 6 | from torch.autograd import Variable 7 | from torch.utils import data 8 | 9 | import validate 10 | from utils.dataloader.pascal_voc_loader import * 11 | from utils.dataloader.nus_wide_loader import * 12 | from 
utils.dataloader.coco_loader import * 13 | from utils.anom_utils import ToLabel 14 | from model.classifiersimple import * 15 | 16 | def train(): 17 | args.save_dir += args.dataset + '/' 18 | if not os.path.exists(args.save_dir): 19 | os.makedirs(args.save_dir) 20 | 21 | normalize = torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], 22 | std=[0.229, 0.224, 0.225]) 23 | img_transform = torchvision.transforms.Compose([ 24 | torchvision.transforms.RandomHorizontalFlip(), 25 | torchvision.transforms.RandomResizedCrop((256, 256), scale=(0.5, 2.0)), 26 | torchvision.transforms.ToTensor(), 27 | normalize, 28 | ]) 29 | 30 | label_transform = torchvision.transforms.Compose([ 31 | ToLabel(), 32 | ]) 33 | val_transform = torchvision.transforms.Compose([ 34 | torchvision.transforms.Resize((256, 256)), 35 | torchvision.transforms.ToTensor(), 36 | normalize 37 | ]) 38 | 39 | if args.dataset == "pascal": 40 | loader = pascalVOCLoader( 41 | "./datasets/pascal/", 42 | img_transform = img_transform, 43 | label_transform = label_transform) 44 | val_data = pascalVOCLoader('./datasets/pascal/', split="voc12-val", 45 | img_transform=img_transform, 46 | label_transform=label_transform) 47 | elif args.dataset == "coco": 48 | loader = cocoloader("./datasets/coco/", 49 | img_transform = img_transform, 50 | label_transform = label_transform) 51 | val_data = cocoloader("./datasets/coco/", split="multi-label-val2014", 52 | img_transform = val_transform, 53 | label_transform = label_transform) 54 | elif args.dataset == "nus-wide": 55 | loader = nuswideloader("./datasets/nus-wide/", 56 | img_transform = img_transform, 57 | label_transform = label_transform) 58 | val_data = nuswideloader("./datasets/nus-wide/", split="val", 59 | img_transform=val_transform, 60 | label_transform=label_transform) 61 | else: 62 | raise AssertionError 63 | 64 | args.n_classes = loader.n_classes 65 | trainloader = data.DataLoader(loader, batch_size=args.batch_size, num_workers=8, shuffle=True, pin_memory=True) 66 | val_loader = data.DataLoader(val_data, batch_size=args.batch_size, num_workers=8, shuffle=True, pin_memory=True) 67 | 68 | print("number of images = ", len(loader)) 69 | print("number of classes = ", args.n_classes, " architecture used = ", args.arch) 70 | 71 | if args.arch == "resnet101": 72 | orig_resnet = torchvision.models.resnet101(pretrained=True) 73 | features = list(orig_resnet.children()) 74 | model= nn.Sequential(*features[0:8]) 75 | clsfier = clssimp(2048, args.n_classes) 76 | elif args.arch == "densenet": 77 | orig_densenet = torchvision.models.densenet121(pretrained=True) 78 | features = list(orig_densenet.features) 79 | model = nn.Sequential(*features, nn.ReLU(inplace=True)) 80 | clsfier = clssimp(1024, args.n_classes) 81 | 82 | model = model.cuda() 83 | clsfier = clsfier.cuda() 84 | if torch.cuda.device_count() > 1: 85 | print("Using", torch.cuda.device_count(), "GPUs") 86 | model = nn.DataParallel(model) 87 | clsfier = nn.DataParallel(clsfier) 88 | 89 | optimizer = torch.optim.Adam([{'params': model.parameters(),'lr':args.l_rate/10},{'params': clsfier.parameters()}], lr=args.l_rate) 90 | 91 | if args.load: 92 | model.load_state_dict(torch.load(args.save_dir + args.arch + ".pth")) 93 | clsfier.load_state_dict(torch.load(args.save_dir + args.arch +'clsfier' + ".pth")) 94 | print("Model loaded!") 95 | 96 | bceloss = nn.BCEWithLogitsLoss() 97 | model.train() 98 | clsfier.train() 99 | for epoch in range(args.n_epoch): 100 | for i, (images, labels) in enumerate(trainloader): 101 | images = Variable(images.cuda()) 
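# Multi-label objective: `labels` is a multi-hot vector (several classes can
# be positive per image), so training uses an independent sigmoid per class
# via BCEWithLogitsLoss below rather than softmax cross-entropy.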
102 | labels = Variable(labels.cuda().float()) 103 | 104 | optimizer.zero_grad() 105 | 106 | outputs = model(images) 107 | outputs = clsfier(outputs) 108 | loss = bceloss(outputs, labels) 109 | 110 | loss.backward() 111 | optimizer.step() 112 | torch.save(model.module.state_dict(), args.save_dir + args.arch + ".pth") 113 | torch.save(clsfier.module.state_dict(), args.save_dir + args.arch + 'clsfier' + ".pth") 114 | mAP = validate.validate(args, model, clsfier, val_loader) 115 | 116 | print("Epoch [%d/%d] Loss: %.4f mAP: %.4f" % (epoch, args.n_epoch, loss.data, mAP)) 117 | 118 | if __name__ == '__main__': 119 | parser = argparse.ArgumentParser(description='Hyperparams') 120 | parser.add_argument('--arch', type=str, default='densenet', 121 | help='Architecture to use densenet|resnet101') 122 | parser.add_argument('--dataset', type=str, default='pascal', 123 | help='Dataset to use pascal|coco|nus-wide') 124 | parser.add_argument('--n_epoch', type=int, default=50, 125 | help='# of the epochs') 126 | parser.add_argument('--n_classes', type=int, default=20, 127 | help='# of classes') 128 | parser.add_argument('--batch_size', type=int, default=200, 129 | help='Batch Size') 130 | # batch_size 320 for resenet101 131 | parser.add_argument('--l_rate', type=float, default=1e-4, 132 | help='Learning Rate') 133 | 134 | #save and load 135 | parser.add_argument('--load', action='store_true', help='Whether to load models') 136 | parser.add_argument('--save_dir', type=str, default="./saved_models/", 137 | help='Path to save models') 138 | parser.add_argument('--load_dir', type=str, default="./saved_models", 139 | help='Path to load models') 140 | args = parser.parse_args() 141 | train() 142 | -------------------------------------------------------------------------------- /utils/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/deeplearning-wisc/multi-label-ood/0c999beb57319016740f19b72bc87c1115c80127/utils/__init__.py -------------------------------------------------------------------------------- /utils/anom_utils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch.nn as nn 3 | import sklearn.metrics as sk 4 | import sklearn.neighbors 5 | import sklearn.ensemble 6 | import time 7 | import torch 8 | from torch.autograd import Variable 9 | import os.path 10 | 11 | recall_level_default = 0.95 12 | 13 | class ToLabel(object): 14 | def __call__(self, inputs): 15 | return (torch.from_numpy(np.array(inputs)).long()) 16 | 17 | 18 | def stable_cumsum(arr, rtol=1e-05, atol=1e-08): 19 | """Use high precision for cumsum and check that final value matches sum 20 | Parameters 21 | ---------- 22 | arr : array-like 23 | To be cumulatively summed as flat 24 | rtol : float 25 | Relative tolerance, see ``np.allclose`` 26 | atol : float 27 | Absolute tolerance, see ``np.allclose`` 28 | """ 29 | out = np.cumsum(arr, dtype=np.float64) 30 | expected = np.sum(arr, dtype=np.float64) 31 | if not np.allclose(out[-1], expected, rtol=rtol, atol=atol): 32 | raise RuntimeError('cumsum was found to be unstable: ' 33 | 'its last element does not correspond to sum') 34 | return out 35 | 36 | def fpr_and_fdr_at_recall(y_true, y_score, recall_level=recall_level_default, pos_label=None): 37 | classes = np.unique(y_true) 38 | if (pos_label is None and 39 | not (np.array_equal(classes, [0, 1]) or 40 | np.array_equal(classes, [-1, 1]) or 41 | np.array_equal(classes, [0]) or 42 | 
36 | def fpr_and_fdr_at_recall(y_true, y_score, recall_level=recall_level_default, pos_label=None): 37 | classes = np.unique(y_true) 38 | if (pos_label is None and 39 | not (np.array_equal(classes, [0, 1]) or 40 | np.array_equal(classes, [-1, 1]) or 41 | np.array_equal(classes, [0]) or 42 | np.array_equal(classes, [-1]) or 43 | np.array_equal(classes, [1]))): 44 | raise ValueError("Data is not binary and pos_label is not specified") 45 | elif pos_label is None: 46 | pos_label = 1. 47 | 48 | # make y_true a boolean vector 49 | y_true = (y_true == pos_label) 50 | 51 | # sort scores and corresponding truth values 52 | desc_score_indices = np.argsort(y_score, kind="mergesort")[::-1] 53 | y_score = y_score[desc_score_indices] 54 | y_true = y_true[desc_score_indices] 55 | 56 | # y_score typically has many tied values. Here we extract 57 | # the indices associated with the distinct values. We also 58 | # concatenate a value for the end of the curve. 59 | distinct_value_indices = np.where(np.diff(y_score))[0] 60 | threshold_idxs = np.r_[distinct_value_indices, y_true.size - 1] 61 | 62 | # accumulate the true positives with decreasing threshold 63 | tps = stable_cumsum(y_true)[threshold_idxs] 64 | fps = 1 + threshold_idxs - tps # add one because of zero-based indexing 65 | 66 | thresholds = y_score[threshold_idxs] 67 | 68 | recall = tps / tps[-1] 69 | 70 | last_ind = tps.searchsorted(tps[-1]) 71 | sl = slice(last_ind, None, -1) # [last_ind::-1] 72 | recall, fps, tps, thresholds = np.r_[recall[sl], 1], np.r_[fps[sl], 0], np.r_[tps[sl], 0], thresholds[sl] 73 | 74 | cutoff = np.argmin(np.abs(recall - recall_level)) 75 | if np.array_equal(classes, [1]): 76 | return thresholds[cutoff] # only positives present: no FPR to compute, return the threshold 77 | 78 | return fps[cutoff] / (np.sum(np.logical_not(y_true))), thresholds[cutoff] 79 | 80 | def get_measures(_pos, _neg, recall_level=recall_level_default): 81 | pos = np.array(_pos[:]).reshape((-1, 1)) 82 | neg = np.array(_neg[:]).reshape((-1, 1)) 83 | examples = np.squeeze(np.vstack((pos, neg))) 84 | labels = np.zeros(len(examples), dtype=np.int32) 85 | labels[:len(pos)] += 1 86 | 87 | auroc = sk.roc_auc_score(labels, examples) 88 | aupr = sk.average_precision_score(labels, examples) 89 | fpr, threshold = fpr_and_fdr_at_recall(labels, examples, recall_level) 90 | 91 | return auroc, aupr, fpr, threshold 92 | 93 | def print_measures(auroc, aupr, fpr, ood, method, recall_level=recall_level_default): 94 | print('\t\t\t' + ood + '_' + method) 95 | print('FPR{:d}:\t\t\t{:.2f}'.format(int(100 * recall_level), 100 * fpr)) 96 | print('AUROC: \t\t\t{:.2f}'.format(100 * auroc)) 97 | print('AUPR: \t\t\t{:.2f}'.format(100 * aupr)) 98 | 99 | def get_and_print_results(out_score, in_score, ood, method): 100 | aurocs, auprs, fprs = [], [], [] 101 | measures = get_measures(out_score, in_score) 102 | aurocs.append(measures[0]); auprs.append(measures[1]); fprs.append(measures[2]) 103 | 104 | auroc = np.mean(aurocs); aupr = np.mean(auprs); fpr = np.mean(fprs) 105 | 106 | print_measures(auroc, aupr, fpr, ood, method) 107 | return auroc, aupr, fpr 108 | 109 | def get_localoutlierfactor_scores(val, test, out_scores): 110 | scorer = sklearn.neighbors.LocalOutlierFactor(novelty=True) 111 | print("fitting validation set") 112 | start = time.time() 113 | scorer.fit(val) 114 | end = time.time() 115 | print("fitting took ", end - start) 116 | val = np.asarray(val) 117 | test = np.asarray(test) 118 | out_scores = np.asarray(out_scores) 119 | print(val.shape, test.shape, out_scores.shape) 120 | return scorer.score_samples(np.vstack((test, out_scores))) 121 | 122 | 
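# Usage sketch for the two outlier scorers (variable names are illustrative,
# not from this repo): both are fit on in-distribution validation features,
# e.g. logits, and then score the in-distribution test set and the OOD set
# jointly:
#
#   scores = get_localoutlierfactor_scores(val_feats, test_feats, ood_feats)
#   in_scores = scores[:len(test_feats)]
#   out_scores = scores[len(test_feats):]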
123 | def get_isolationforest_scores(val, test, out_scores): 124 | scorer = sklearn.ensemble.IsolationForest() 125 | print("fitting validation set") 126 | start = time.time() 127 | scorer.fit(val) 128 | end = time.time() 129 | print("fitting took ", end - start) 130 | val = np.asarray(val) 131 | test = np.asarray(test) 132 | out_scores = np.asarray(out_scores) 133 | print(val.shape, test.shape, out_scores.shape) 134 | return scorer.score_samples(np.vstack((test, out_scores))) 135 | 136 | -------------------------------------------------------------------------------- /utils/coco-preprocessing.py: -------------------------------------------------------------------------------- 1 | # coding: utf-8 2 | 3 | from pycocotools.coco import COCO 4 | import argparse 5 | import numpy as np 6 | #import skimage.io as io 7 | import pylab 8 | import os, os.path 9 | import pickle 10 | from tqdm import tqdm 11 | 12 | #pylab.rcParams['figure.figsize'] = (10.0, 8.0) 13 | 14 | parser = argparse.ArgumentParser(description="Preprocess COCO Labels.") 15 | 16 | #dataDir='/share/data/vision-greg/coco' 17 | #which dataset to extract options are [all, train, val, test] 18 | #dataset = "all" 19 | parser.add_argument("--dir", type=str, default="/nobackup-slow/dataset/coco/", 20 | help="where the coco dataset is located.") 21 | parser.add_argument("--save_dir", type=str, default="./datasets/coco/", 22 | help="where to save the coco labels.") 23 | parser.add_argument("--dataset", type=str, default="all", 24 | choices=["all", "train", "val", "test"], 25 | help="which coco partition to create the multilabel set " 26 | "for; one of [all, train, val, test], default is all") 27 | args = parser.parse_args() 28 | 29 | 30 | def save_obj(obj, name): 31 | with open(name + '.pkl', 'wb') as f: 32 | pickle.dump(obj, f, pickle.HIGHEST_PROTOCOL) 33 | 34 | def load_obj(name): 35 | with open(name + '.pkl', 'rb') as f: 36 | return pickle.load(f) 37 | 38 | def write_list(fname, d): # writes one entry per line; d may be a list or an int-keyed dict 39 | fout = open(fname, 'w') 40 | for i in range(len(d)): 41 | fout.write(d[i] + '\n') 42 | fout.close() 43 | 44 | def load(fname): 45 | data = [] 46 | labels = [] 47 | for line in open(fname).readlines(): 48 | l = line.strip().split(' ') 49 | data.append(l[0]) 50 | labels.append(int(l[1])) 51 | return data, np.array(labels, dtype=np.int32) 52 | 53 | 54 | def load_labels(img_names, root_dir, dataset, coco, idmapper): 55 | labels = {} 56 | for i in tqdm(range(len(img_names))): 57 | #print(i, dataset) 58 | #print(img_names[i], img_names[i][18:-4]) 59 | # Hack to extract the image id from the image name 60 | if dataset == "val": 61 | imgIds = int(img_names[i][18:-4]) 62 | else: 63 | imgIds = int(img_names[i][19:-4]) 64 | annIds = coco.getAnnIds(imgIds=imgIds, iscrowd=None) 65 | anns = coco.loadAnns(annIds) 66 | c = [] 67 | for annot in anns: 68 | c.append(idmapper[annot['category_id']]) 69 | if not c: 70 | c = np.array(-1) 71 | labels[root_dir + '/' + img_names[i]] = np.unique(c) 72 | 73 | return labels 74 | 75 | 76 | def load_image_names(root_dir): 77 | DIR = root_dir 78 | #print(DIR) 79 | img_names = [name for name in os.listdir(DIR) if os.path.isfile(os.path.join(DIR, name))] 80 | return img_names 81 | 82 | 83 | def load_annotations(dataDir, dataType): 84 | annFile = '%sannotations/instances_%s.json' % (dataDir, dataType) 85 | 86 | # initialize COCO api for instance annotations 87 | coco = COCO(annFile) 88 | 89 | # display COCO categories and supercategories 90 | cats = coco.loadCats(coco.getCatIds()) 91 | 92 | 93 | nms = [cat['id'] for cat in cats] 94 | idmapper = {} 95 | for i in range(len(nms)): 96 | idmapper[nms[i]] = i 97 | 98 | return coco, idmapper 99 | 100 | 101 | root_dir = args.dir + "train2014" 102 | train_img_names = load_image_names(root_dir) 103 | root_dir = args.dir + "val2014" 104 | val_img_names = load_image_names(root_dir)
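# The pickled label files written below map absolute image paths to the array
# of COCO category indices present in that image (np.array(-1) marks images
# with no annotations); cocoloader in utils/dataloader/coco_loader.py later
# expands these index arrays into 80-dim multi-hot label vectors.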
105 | 106 | if args.dataset == "test" or args.dataset == "all": 107 | root_dir = args.dir + "test2014" 108 | test_img_names = load_image_names(root_dir) 109 | 110 | d = {} 111 | for i in range(len(test_img_names)): 112 | d[i] = root_dir + '/' + test_img_names[i] 113 | 114 | LIST = args.save_dir + 'test2014imgs.txt' 115 | write_list(LIST, d) 116 | 117 | 118 | if args.dataset == "all": 119 | root_dir = args.dir + "train2014" 120 | 121 | coco, idmapper = load_annotations(args.dir, "train2014") 122 | labels = load_labels(train_img_names, root_dir, "train", coco, idmapper) 123 | save_obj(labels, args.save_dir + "/multi-label-train2014") 124 | LIST = args.save_dir + "train2014imgs.txt" 125 | write_list(LIST, train_img_names) 126 | 127 | root_dir = args.dir + "val2014" 128 | 129 | coco, idmapper = load_annotations(args.dir, "val2014") 130 | labels = load_labels(val_img_names, root_dir, "val", coco, idmapper) 131 | save_obj(labels, args.save_dir + "/multi-label-val2014") 132 | LIST = args.save_dir + "/val2014imgs.txt" 133 | write_list(LIST, val_img_names) 134 | 135 | elif args.dataset == 'val': 136 | 137 | root_dir = args.dir + "val2014" 138 | 139 | coco, idmapper = load_annotations(args.dir, "val2014") # load_annotations expects (dataDir, dataType) 140 | 141 | labels = load_labels(val_img_names, root_dir, "val", coco, idmapper) 142 | save_obj(labels, args.save_dir + "/multi-label-val2014") 143 | LIST = args.save_dir + "/val2014imgs.txt" 144 | write_list(LIST, val_img_names) 145 | 146 | 147 | elif args.dataset == 'train': 148 | root_dir = args.dir + "train2014" 149 | 150 | coco, idmapper = load_annotations(args.dir, "train2014") 151 | 152 | labels = load_labels(train_img_names, root_dir, "train", coco, idmapper) 153 | save_obj(labels, args.save_dir + "/multi-label-train2014") 154 | LIST = args.save_dir + "/train2014imgs.txt" 155 | write_list(LIST, train_img_names) 156 | 157 | 158 | 159 | # For image segmentation 160 | # converting polygon and RLE to binary mask 161 | 162 | #labels = {} 163 | #for i in range(len(imgsname)): 164 | # print(i) 165 | # if val == True: 166 | # imgIds=int(imgsname[i][19:25]) 167 | # else: 168 | # imgIds=int(imgsname[i][21:27]) 169 | # annIds = coco.getAnnIds(imgIds=imgIds, iscrowd=None) 170 | # anns = coco.loadAnns(annIds) 171 | # for annot in anns: 172 | # cmask_partial = coco.annToMask(annot) 173 | # 174 | 
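# Example invocation (run from the repo root; the COCO path is a placeholder):
#   python3 utils/coco-preprocessing.py --dir /path/to/coco/ --save_dir ./datasets/coco/ --dataset all
# This writes multi-label-train2014.pkl / multi-label-val2014.pkl plus the
# *imgs.txt image lists consumed by utils/dataloader/coco_loader.py.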
-------------------------------------------------------------------------------- /utils/dataloader/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/deeplearning-wisc/multi-label-ood/0c999beb57319016740f19b72bc87c1115c80127/utils/dataloader/__init__.py -------------------------------------------------------------------------------- /utils/dataloader/coco_loader.py: -------------------------------------------------------------------------------- 1 | import os 2 | import collections 3 | import json 4 | 5 | import os.path as osp 6 | import numpy as np 7 | from PIL import Image 8 | 9 | import torch 10 | import torchvision 11 | 12 | from torch.utils import data 13 | from tqdm import tqdm 14 | 15 | import random 16 | 17 | from scipy.io import loadmat 18 | 19 | 20 | 21 | class cocoloader(data.Dataset): 22 | def __init__(self, root='./datasets/coco/', split="multi-label-train2014", img_transform=None, label_transform=None): 23 | self.root = root 24 | self.split = split 25 | self.n_classes = 80 26 | self.img_transform = img_transform 27 | self.label_transform = label_transform 28 | if split == "test": 29 | with open(root + "test2014imgs.txt") as f: 30 | tmp = f.readlines() 31 | self.Imglist = [s.rstrip() for s in tmp] 32 | else: 33 | filePath = self.root + self.split + '.pkl' 34 | datafile = np.load(filePath, allow_pickle=True) 35 | self.GT = list(datafile.values()) 36 | self.Imglist = list(datafile.keys()) 37 | 38 | 39 | def __len__(self): 40 | return len(self.Imglist) 41 | 42 | def __getitem__(self, index): 43 | img = Image.open(self.Imglist[index]).convert('RGB') 44 | if self.split == "test": 45 | lbl_num = [-1, -1] # placeholder; the test split ships without labels 46 | else: 47 | lbl_num = self.GT[index] 48 | lbl = np.zeros(self.n_classes) 49 | if lbl_num[0] != -1: 50 | lbl[lbl_num] = 1 51 | 52 | 53 | seed = np.random.randint(2147483647) 54 | random.seed(seed) 55 | if self.img_transform is not None: 56 | img_o = self.img_transform(img) 57 | # img_h = self.img_transform(self.h_flip(img)) 58 | # img_v = self.img_transform(self.v_flip(img)) 59 | imgs = img_o 60 | else: 61 | imgs = img 62 | random.seed(seed) 63 | if self.label_transform is not None: 64 | label_o = self.label_transform(lbl) 65 | # label_h = self.label_transform(self.h_flip(label)) 66 | # label_v = self.label_transform(self.v_flip(label)) 67 | lbls = label_o 68 | else: 69 | lbls = lbl 70 | 71 | 72 | 73 | return imgs, lbls 74 | 75 | 76 | -------------------------------------------------------------------------------- /utils/dataloader/nus_wide_loader.py: -------------------------------------------------------------------------------- 1 | import os 2 | import collections 3 | import json 4 | 5 | import numpy as np 6 | from PIL import Image 7 | import torch 8 | import torchvision 9 | 10 | from tqdm import tqdm 11 | from torch.utils import data 12 | import random 13 | 14 | 15 | 16 | class nuswideloader(data.Dataset): 17 | def __init__(self, root='./datasets/nus-wide/', split="train", 18 | in_dis=True, img_transform=None, label_transform=None): 19 | self.root = root 20 | self.split = split 21 | self.in_dis = in_dis 22 | self.n_classes = 81 23 | self.img_transform = img_transform 24 | self.label_transform = label_transform 25 | self.GT = [] 26 | self.Imglist = [] 27 | self.processing() 28 | if split != "train": 29 | self.partition() 30 | 31 | def processing(self): 32 | if self.split == "train": 33 | file_img = "nus_wide_train_imglist.txt" 34 | file_label = "nus_wide_train_label.txt" 35 | else: 36 | # validation and test 37 | file_img = "nus_wide_test_imglist.txt" 38 | file_label = "nus_wide_test_label.txt" 39 | f1 = open(self.root + file_img) 40 | img = f1.readlines() 41 | lbl = np.loadtxt(self.root + file_label, dtype=np.int64) 42 | # if self.in_dis: 43 | select = np.where(np.sum(lbl, axis=1) > 0)[0] 44 | # else: 45 | # select = np.where(np.sum(lbl, axis=1) == 0)[0] 46 | 47 | self.GT = lbl[select] 48 | # img = [img[i].split()[0] for i in range(len(img))] 49 | # self.Imglist = [img[i].replace('\\','/') for i in select] 50 | self.Imglist = [img[i].split()[0] for i in select] 51 | 52 | def partition(self): 53 | np.random.seed(999) 54 | state = np.random.get_state() 55 | labels = self.GT 56 | imgs = self.Imglist 57 | num = labels.shape[0] // 2 58 | # num = labels.shape[0] 59 | np.random.shuffle(labels) 60 | np.random.set_state(state) 61 | np.random.shuffle(imgs) 62 | if self.split == "val": 63 | self.GT = labels[:num] 64 | self.Imglist = imgs[:num] 65 | else: 66 | self.GT = labels[num:] 67 | self.Imglist = imgs[num:] 68 | 69 | def __len__(self): 70 | return len(self.Imglist) 71 | 
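# partition() above splits the released NUS-WIDE test list in half with a
# fixed seed (999), re-using the same permutation for images and labels, so
# the resulting val/test partition is deterministic across runs.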
72 | def __getitem__(self, index): 73 | path = "/nobackup-slow/dataset/nus-wide/" # hard-coded image root; point this at the local NUS-WIDE image directory 74 | img = Image.open(path + self.Imglist[index]).convert('RGB') 75 | if self.split == "test": 76 | lbl = -np.ones(self.n_classes) 77 | else: 78 | lbl = self.GT[index] 79 | 80 | 81 | if self.img_transform is not None: 82 | img_o = self.img_transform(img) 83 | # img_h = self.img_transform(self.h_flip(img)) 84 | # img_v = self.img_transform(self.v_flip(img)) 85 | imgs = img_o 86 | else: 87 | imgs = img 88 | # random.seed(seed) 89 | if self.label_transform is not None: 90 | label_o = self.label_transform(lbl) 91 | # label_h = self.label_transform(self.h_flip(label)) 92 | # label_v = self.label_transform(self.v_flip(label)) 93 | lbls = label_o 94 | else: 95 | lbls = lbl 96 | 97 | return imgs, lbls 98 | 99 | 100 | -------------------------------------------------------------------------------- /utils/dataloader/pascal_voc_loader.py: -------------------------------------------------------------------------------- 1 | import os 2 | import collections 3 | import json 4 | 5 | import numpy as np 6 | from PIL import Image 7 | 8 | import torch 9 | import torchvision 10 | 11 | from torch.utils import data 12 | import random 13 | 14 | from scipy.io import loadmat 15 | 16 | 17 | class pascalVOCLoader(data.Dataset): 18 | def __init__(self, root='./datasets/pascal/', split="voc12-train", img_transform=None, label_transform=None): 19 | self.root = root 20 | self.split = split 21 | self.n_classes = 20 22 | self.img_transform = img_transform 23 | self.label_transform = label_transform 24 | filePath = self.root + self.split + '.mat' 25 | datafile = loadmat(filePath) 26 | if split == "voc12-test": 27 | self.GT = None 28 | else: 29 | self.GT = datafile['labels'] 30 | self.Imglist = datafile['Imlist'] 31 | 32 | 33 | 34 | def __len__(self): 35 | return len(self.Imglist) 36 | 37 | def __getitem__(self, index): 38 | img = Image.open(self.Imglist[index].strip()).convert('RGB') 39 | if self.GT is not None: 40 | lbl = self.GT[index] 41 | else: 42 | lbl = -1 43 | 44 | seed = np.random.randint(2147483647) 45 | random.seed(seed) 46 | if self.img_transform is not None: 47 | img_o = self.img_transform(img) 48 | imgs = img_o 49 | else: 50 | imgs = img 51 | random.seed(seed) 52 | if self.label_transform is not None: 53 | label_o = self.label_transform(lbl) 54 | lbls = label_o 55 | else: 56 | lbls = lbl 57 | 58 | return imgs, lbls 59 | 60 | -------------------------------------------------------------------------------- /validate.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import argparse 3 | import torchvision 4 | import numpy as np 5 | from sklearn import metrics 6 | import torch.nn as nn 7 | from torch.autograd import Variable 8 | from torch.utils import data 9 | import torch.nn.functional as F # needed for F.relu below 10 | from utils.dataloader.coco_loader import * 11 | from utils.dataloader.nus_wide_loader import * 12 | from utils.dataloader.pascal_voc_loader import * 13 | from utils.anom_utils import ToLabel 14 | 15 | from model.classifiersimple import * 16 | 17 | print("Using", torch.cuda.device_count(), "GPUs") 18 | def validate(args, model, clsfier, val_loader): 19 | model.eval() 20 | clsfier.eval() 21 | 22 | gts = {i: [] for i in range(0, args.n_classes)} 23 | preds = {i: [] for i in range(0, args.n_classes)} 24 | with torch.no_grad(): 25 | for images, labels in val_loader: 26 | images = Variable(images.cuda()) 27 | labels = Variable(labels.cuda().float()) 28 | outputs = model(images) 29 | outputs = F.relu(outputs, inplace=True) 30 | outputs = clsfier(outputs) 31 | outputs = torch.sigmoid(outputs) 32 | pred = outputs.data.cpu().numpy() # keep the batch dimension even when the final batch has a single image
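# pred and gt are (batch, n_classes) arrays of sigmoid scores and {0, 1}
# targets; the loop below buckets them per class so that per-class average
# precision (area under the precision-recall curve) can be computed at the end.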
33 | gt = labels.data.cpu().numpy() 34 | 35 | for label in range(0, args.n_classes): 36 | gts[label].extend(gt[:, label]) 37 | preds[label].extend(pred[:, label]) 38 | 39 | FinalMAPs = [] 40 | for i in range(0, args.n_classes): 41 | precision, recall, thresholds = metrics.precision_recall_curve(gts[i], preds[i]) 42 | FinalMAPs.append(metrics.auc(recall, precision)) 43 | # print(FinalMAPs) 44 | 45 | return np.mean(FinalMAPs) 46 | 47 | 48 | if __name__ == '__main__': 49 | parser = argparse.ArgumentParser(description='Hyperparams') 50 | parser.add_argument('--arch', nargs='?', type=str, default='resnet101', 51 | help='Architecture to use: densenet|resnet101') 52 | parser.add_argument('--dataset', nargs='?', type=str, default='pascal', 53 | help='Dataset to use: pascal|coco|nus-wide') 54 | parser.add_argument('--load_path', nargs='?', type=str, default='./saved_models/', 55 | help='Model path') 56 | parser.add_argument('--batch_size', nargs='?', type=int, default=20, 57 | help='Batch Size') 58 | parser.add_argument('--n_classes', nargs='?', type=int, default=20) 59 | args = parser.parse_args() 60 | 61 | # Setup Dataloader 62 | normalize = torchvision.transforms.Normalize(mean=[0.485, 0.456, 0.406], 63 | std=[0.229, 0.224, 0.225]) 64 | img_transform = torchvision.transforms.Compose([ 65 | torchvision.transforms.Resize((256, 256)), 66 | torchvision.transforms.ToTensor(), 67 | normalize, 68 | ]) 69 | 70 | label_transform = torchvision.transforms.Compose([ 71 | ToLabel(), 72 | ]) 73 | 74 | if args.dataset == 'pascal': 75 | loader = pascalVOCLoader('./datasets/pascal/', split="voc12-val", 76 | img_transform=img_transform, 77 | label_transform=label_transform) 78 | elif args.dataset == 'coco': 79 | loader = cocoloader('./datasets/coco/', split="multi-label-val2014", 80 | img_transform=img_transform, 81 | label_transform=label_transform) 82 | elif args.dataset == "nus-wide": 83 | loader = nuswideloader("./datasets/nus-wide/", split="val", 84 | img_transform=img_transform, 85 | label_transform=label_transform) 86 | else: 87 | raise ValueError("unsupported dataset: " + args.dataset) 88 | 89 | args.n_classes = loader.n_classes 90 | val_loader = data.DataLoader(loader, batch_size=args.batch_size, num_workers=8, shuffle=False) 91 | 92 | if args.arch == "resnet101": 93 | orig_resnet = torchvision.models.resnet101(pretrained=True) 94 | features = list(orig_resnet.children()) 95 | model = nn.Sequential(*features[0:8]) 96 | clsfier = clssimp(2048, args.n_classes) 97 | 98 | elif args.arch == "densenet": 99 | orig_densenet = torchvision.models.densenet121(pretrained=True) 100 | features = list(orig_densenet.features) 101 | model = nn.Sequential(*features, nn.ReLU(inplace=True)) 102 | clsfier = clssimp(1024, args.n_classes) 103 | 104 | model.load_state_dict(torch.load(args.load_path + args.dataset + '/' + 105 | args.arch + ".pth", map_location="cpu")) 106 | clsfier.load_state_dict(torch.load(args.load_path + args.dataset + '/' + 107 | args.arch + 'clsfier' + ".pth", map_location="cpu")) 108 | 109 | model = model.cuda() 110 | clsfier = clsfier.cuda() 111 | if torch.cuda.device_count() > 1: 112 | model = nn.DataParallel(model) 113 | clsfier = nn.DataParallel(clsfier) 114 | 115 | mAP = validate(args, model, clsfier, val_loader) 116 | print("mAP on validation set: %.4f" % (mAP * 100)) 117 | --------------------------------------------------------------------------------