├── .gitignore ├── LICENSE ├── Readme.md ├── anomaly_detection ├── __init__.py ├── anomalyDetection.py ├── base │ ├── __init__.py │ ├── base_dataset.py │ ├── base_net.py │ ├── base_trainer.py │ └── torchvision_dataset.py ├── datasets │ ├── __init__.py │ ├── main.py │ ├── selfsupervised_images.py │ └── selfsupervised_patches.py ├── networks │ ├── __init__.py │ ├── main.py │ ├── real_nvp.py │ └── stack_conv_net.py ├── optim │ ├── __init__.py │ ├── ae_trainer.py │ └── anomalyDetection_trainer.py └── utils │ ├── __init__.py │ ├── config.py │ ├── eval_functions.py │ ├── generate_incremental_table.py │ ├── generate_overview_table.py │ ├── generate_table.py │ └── visualization │ └── plot_images_grid.py ├── dataset_labeller.py ├── evaluate_dense_svdd.py ├── get_dataset.sh ├── train_all_combinations.py ├── train_anomaly_detection.py ├── train_anomaly_detection.sh ├── train_incrementally.py └── visualize_positive_labels.py /.gitignore: -------------------------------------------------------------------------------- 1 | runs/ 2 | __pycache__ 3 | erfnet_stuff/ 4 | save/ 5 | log/ 6 | data/ 7 | *.zip 8 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 Lukas Ruff 4 | Copyright (c) 2020 Lorenz Wellhausen 5 | 6 | Permission is hereby granted, free of charge, to any person obtaining a copy 7 | of this software and associated documentation files (the "Software"), to deal 8 | in the Software without restriction, including without limitation the rights 9 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 10 | copies of the Software, and to permit persons to whom the Software is 11 | furnished to do so, subject to the following conditions: 12 | 13 | The above copyright notice and this permission notice shall be included in all 14 | copies or substantial portions of the Software. 
15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 17 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 18 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 19 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 20 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 21 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 22 | SOFTWARE. 23 | -------------------------------------------------------------------------------- /Readme.md: -------------------------------------------------------------------------------- 1 | # Anomaly Navigation - ANNA 2 | 3 | ## Overview 4 | 5 | This is code to train and evaluate anomaly detection algorithms using multi-modal sensor information. 6 | 7 | Based in part on [Deep-SVDD-PyTorch](https://github.com/lukasruff/Deep-SVDD-PyTorch). 8 | 9 | **Author:** Lorenz Wellhausen, [lorenwel@ethz.ch](mailto:lorenwel@ethz.ch) 10 | 11 | **Affiliation:** [Robotic Systems Lab](https://rsl.ethz.ch/), ETH Zürich 12 | 13 | ## Publications 14 | 15 | If you use this work in an academic context, please cite the following publication: 16 | 17 | > L. Wellhausen, R. Ranftl and M. Hutter, 18 | > **"Safe Robot Navigation via Multi-Modal Anomaly Detection"**, 19 | > in IEEE Robotics and Automation Letters (RA-L), 2020 20 | 21 | ## Installation 22 | 23 | Tested on Ubuntu 18.04 using Python 3 and Pytorch 1.2.0/1.3.1. 24 | 25 | Install dependencies: 26 | 27 | `sudo apt install git-lfs virtualenv unzip` 28 | 29 | Create and activate virtual environment with Python 3 as default version: 30 | 31 | ``` 32 | virtualenv --system-site-packages -p python3 ~/venv 33 | source ~/venv/bin/activate 34 | ``` 35 | 36 | [Install Pytorch using Pip](https://pytorch.org/get-started/locally/) (Follow instructions for Ubuntu and your CUDA version). 
37 | 38 | Install additional Python dependencies: 39 | 40 | `pip install click numpy matplotlib opencv-python scikit-learn tensorboard` 41 | 42 | ## Usage 43 | 44 | All of the commands below assume that you have your terminal open in the base directory of this repo. 45 | 46 | ### ANNA Dataset 47 | 48 | We use our anomaly navigation (ANNA) dataset to evaluate the performance of different methods and sensor configurations. 49 | You can automatically download and extract the ANNA dataset into the appropriate folders by running the provided script: 50 | 51 | `./get_dataset.sh` 52 | 53 | You can also download the ANNA dataset yourself from the [ETH Research Collection](https://www.research-collection.ethz.ch/handle/20.500.11850/389950). 54 | 55 | ### Single Network 56 | 57 | To train a network using Real-NVP, with RGB, gravity-aligned depth and surface normal information (the highest performing method evaluated in the publication), simply call the provided script: 58 | 59 | `./train_anomaly_detection.sh` 60 | 61 | ### All Combinations 62 | 63 | To train all possible combinations (reproducing Table I from the publication), call 64 | 65 | `python train_all_combinations.py` 66 | 67 | This will take multiple days to train if you want to average over 10 runs, as we did for the publication. 68 | You can also adjust the number of iterations in the code, if you want to train more quickly. 69 | 70 | We also provide our script to generate the pretty table we use in our publication. 71 | 72 | `python anomaly_detection/utils/generate_overview_table.py` 73 | 74 | ### Incremental Data 75 | 76 | To train with incrementally more data from different environmental conditions (reproducing Fig. 5 from the publication), call 77 | 78 | `python train_incrementally.py` 79 | 80 | ### Track Progress 81 | 82 | Network training logs progress via Tensorboard so that you can track AUC and loss performance during training. 
83 | To launch Tensorboard, run 84 | 85 | `tensorboard --logdir log` 86 | 87 | Then, navigate to `http://127.0.0.1:6006/` in your browser. 88 | -------------------------------------------------------------------------------- /anomaly_detection/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leggedrobotics/anomaly_navigation/e4d87a1b67904e0537de3fa4b3e53bc4932d1681/anomaly_detection/__init__.py -------------------------------------------------------------------------------- /anomaly_detection/anomalyDetection.py: -------------------------------------------------------------------------------- 1 | import json 2 | import torch 3 | 4 | from anomaly_detection.base.base_dataset import BaseADDataset 5 | from anomaly_detection.networks.main import build_network, build_autoencoder 6 | from anomaly_detection.optim.anomalyDetection_trainer import AnomalyDetectionTrainer 7 | from anomaly_detection.optim.ae_trainer import AETrainer 8 | 9 | 10 | class AnomalyDetection(object): 11 | 12 | def __init__(self, writer, objective: str = 'one-class', nu: float = 0.1): 13 | """Inits anomaly detection with one of the supported objectives and hyperparameter nu.""" 14 | 15 | assert objective in ('one-class', 'soft-boundary', 'real-nvp'), "Objective must be one of 'one-class', 'soft-boundary', or 'real-nvp'." 16 | self.objective = objective 17 | assert (0 < nu) & (nu <= 1), "For hyperparameter nu, it must hold: 0 < nu <= 1." 
18 | self.nu = nu 19 | self.R = 0.0 # hypersphere radius R 20 | self.Thres = None 21 | self.c = None # hypersphere center c 22 | 23 | self.net_name = None 24 | self.net = None # neural network \phi 25 | 26 | self.trainer = None 27 | self.optimizer_name = None 28 | 29 | self.ae_net = None # autoencoder network for pretraining 30 | self.ae_trainer = None 31 | self.ae_optimizer_name = None 32 | 33 | self.auc_result = None 34 | 35 | self.writer = writer 36 | 37 | self.results = { 38 | 'train_time': None, 39 | 'test_auc': None, 40 | 'test_auc_ae': None, 41 | 'test_time': None, 42 | 'test_scores': None, 43 | 'thres_ae': 0, 44 | } 45 | 46 | def set_network(self, net_name, cfg): 47 | """Builds the neural network \phi.""" 48 | self.net_name = net_name 49 | self.net = build_network(net_name, cfg) 50 | 51 | def train(self, dataset: BaseADDataset, augment: bool, optimizer_name: str = 'adam', lr: float = 0.001, n_epochs: int = 50, 52 | lr_milestones: tuple = (), batch_size: int = 128, weight_decay: float = 1e-6, device: str = 'cuda', 53 | n_jobs_dataloader: int = 0, fix_encoder: bool = False): 54 | """Trains the anomaly detection model on the training data.""" 55 | 56 | self.optimizer_name = optimizer_name 57 | self.trainer = AnomalyDetectionTrainer(self.writer, self.objective, self.R, self.c, self.nu, optimizer_name, lr=lr, 58 | n_epochs=n_epochs, lr_milestones=lr_milestones, batch_size=batch_size, 59 | weight_decay=weight_decay, device=device, n_jobs_dataloader=n_jobs_dataloader, 60 | fix_encoder=fix_encoder) 61 | # Get the model 62 | self.net = self.trainer.train(dataset, self.net, augment) 63 | self.R = float(self.trainer.R.cpu().data.numpy()) # get float 64 | self.best_R = float(self.trainer.best_R.cpu().data.numpy()) 65 | self.best_weights = self.trainer.best_weights 66 | self.c = self.trainer.c.cpu().data.numpy().tolist() # get list 67 | self.results['train_time'] = self.trainer.train_time 68 | 69 | def test(self, dataset: BaseADDataset, device: str = 'cuda', 
n_jobs_dataloader: int = 0): 70 | """Tests the anomaly detection model on the test data.""" 71 | 72 | if self.trainer is None: 73 | self.trainer = AnomalyDetectionTrainer(self.objective, self.R, self.c, self.nu, 74 | device=device, n_jobs_dataloader=n_jobs_dataloader) 75 | 76 | self.trainer.test(dataset, self.net) 77 | # Get results 78 | self.results['thres'] = self.trainer.thres 79 | self.results['test_auc'] = self.trainer.test_auc 80 | self.results['test_fpr5'] = self.trainer.test_fpr5 81 | self.results['test_time'] = self.trainer.test_time 82 | self.results['test_scores'] = self.trainer.test_scores 83 | 84 | def testInputSensitivity(self, dataset: BaseADDataset, device: str = 'cuda', n_jobs_dataloader: int = 0): 85 | if self.trainer is None: 86 | self.trainer = AnomalyDetectionTrainer(self.objective, self.R, self.c, self.nu, 87 | device=device, n_jobs_dataloader=n_jobs_dataloader) 88 | 89 | self.trainer.testInputSensitivity(dataset, self.net) 90 | 91 | 92 | def pretrain(self, dataset: BaseADDataset, cfg, optimizer_name: str = 'adam', lr: float = 0.001, n_epochs: int = 100, 93 | lr_milestones: tuple = (), batch_size: int = 128, weight_decay: float = 1e-6, device: str = 'cuda', 94 | n_jobs_dataloader: int = 0): 95 | """Pretrains the weights for the anomaly detection network \phi via autoencoder.""" 96 | 97 | self.ae_net = build_autoencoder(self.net_name, cfg) 98 | self.ae_optimizer_name = optimizer_name 99 | self.ae_trainer = AETrainer(self.writer, optimizer_name, lr=lr, n_epochs=n_epochs, 100 | lr_milestones=lr_milestones, batch_size=batch_size, 101 | weight_decay=weight_decay, device=device, 102 | n_jobs_dataloader=n_jobs_dataloader) 103 | self.ae_net = self.ae_trainer.train(dataset, self.ae_net) 104 | self.results['test_auc_ae'], _ = self.ae_trainer.test(dataset, self.ae_net) 105 | self.results['thres_ae'] = self.ae_trainer.thres 106 | self.best_ae_weights = self.ae_trainer.best_weights 107 | self.init_network_weights_from_pretraining() 108 | 109 | def 
init_network_weights_from_pretraining(self): 110 | """Initialize the anomaly detection network weights from the encoder weights of the pretraining autoencoder.""" 111 | 112 | if self.objective == 'real-nvp': 113 | net_dict = self.net.encoder.state_dict() 114 | else: 115 | net_dict = self.net.state_dict() 116 | ae_net_dict = self.ae_net.state_dict() 117 | 118 | # Filter out decoder network keys 119 | ae_net_dict = {k: v for k, v in ae_net_dict.items() if k in net_dict} 120 | # Overwrite values in the existing state_dict 121 | net_dict.update(ae_net_dict) 122 | # Load the new state_dict 123 | 124 | if self.objective == 'real-nvp': 125 | self.net.encoder.load_state_dict(net_dict) 126 | else: 127 | self.net.load_state_dict(net_dict) 128 | 129 | def save_model(self, export_model, export_best_model, save_ae=True): 130 | """Save anomaly detection model to export_model.""" 131 | 132 | net_dict = self.net.state_dict() 133 | ae_net_dict = self.ae_net.state_dict() if save_ae else None 134 | ae_best_net_dict = self.best_ae_weights if save_ae else None 135 | print('Best threshold: ' + str(self.results['thres'])) 136 | print('Best AE threshold: ' + str(self.results['thres_ae'])) 137 | 138 | torch.save({'R': self.R, 139 | 'thres': self.results['thres'], 140 | 'thres_ae': self.results['thres_ae'], 141 | 'c': self.c, 142 | 'net_dict': net_dict, 143 | 'ae_net_dict': ae_net_dict}, export_model) 144 | torch.save({'R': self.best_R, 145 | 'c': self.c, 146 | 'net_dict': self.best_weights, 147 | 'ae_net_dict': ae_best_net_dict}, export_best_model) 148 | 149 | def load_model(self, model_path, cfg, load_ae=False): 150 | """Load anomaly detection model from model_path.""" 151 | 152 | model_dict = torch.load(model_path) 153 | 154 | self.R = model_dict['R'] 155 | self.c = model_dict['c'] 156 | self.net.load_state_dict(model_dict['net_dict']) 157 | if load_ae: 158 | if self.ae_net is None: 159 | self.ae_net = build_autoencoder(self.net_name, cfg) 160 | 
self.ae_net.load_state_dict(model_dict['ae_net_dict']) 161 | 162 | def save_results(self, export_json): 163 | """Save results dict to a JSON-file.""" 164 | with open(export_json, 'w') as fp: 165 | json.dump(self.results, fp) 166 | -------------------------------------------------------------------------------- /anomaly_detection/base/__init__.py: -------------------------------------------------------------------------------- 1 | from .base_dataset import * 2 | from .torchvision_dataset import * 3 | from .base_net import * 4 | from .base_trainer import * 5 | -------------------------------------------------------------------------------- /anomaly_detection/base/base_dataset.py: -------------------------------------------------------------------------------- 1 | from abc import ABC, abstractmethod 2 | from torch.utils.data import DataLoader 3 | 4 | 5 | class BaseADDataset(ABC): 6 | """Anomaly detection dataset base class.""" 7 | 8 | def __init__(self, root: str): 9 | super().__init__() 10 | self.root = root # root path to data 11 | 12 | self.n_classes = 2 # 0: normal, 1: outlier 13 | self.normal_classes = None # tuple with original class labels that define the normal class 14 | self.outlier_classes = None # tuple with original class labels that define the outlier class 15 | 16 | self.train_set = None # must be of type torch.utils.data.Dataset 17 | self.test_set = None # must be of type torch.utils.data.Dataset 18 | 19 | @abstractmethod 20 | def loaders(self, batch_size: int, shuffle_train=True, shuffle_test=False, num_workers: int = 0) -> ( 21 | DataLoader, DataLoader): 22 | """Implement data loaders of type torch.utils.data.DataLoader for train_set and test_set.""" 23 | pass 24 | 25 | def __repr__(self): 26 | return self.__class__.__name__ 27 | -------------------------------------------------------------------------------- /anomaly_detection/base/base_net.py: -------------------------------------------------------------------------------- 1 | import logging 2 | 
import torch.nn as nn 3 | import numpy as np 4 | 5 | 6 | class BaseNet(nn.Module): 7 | """Base class for all neural networks.""" 8 | 9 | def __init__(self): 10 | super().__init__() 11 | self.logger = logging.getLogger(self.__class__.__name__) 12 | self.rep_dim = None # representation dimensionality, i.e. dim of the last layer 13 | 14 | def forward(self, *input): 15 | """ 16 | Forward pass logic 17 | :return: Network output 18 | """ 19 | raise NotImplementedError 20 | 21 | def summary(self): 22 | """Network summary.""" 23 | net_parameters = filter(lambda p: p.requires_grad, self.parameters()) 24 | params = sum([np.prod(p.size()) for p in net_parameters]) 25 | self.logger.info('Trainable parameters: {}'.format(params)) 26 | self.logger.info(self) 27 | -------------------------------------------------------------------------------- /anomaly_detection/base/base_trainer.py: -------------------------------------------------------------------------------- 1 | from abc import ABC, abstractmethod 2 | from .base_dataset import BaseADDataset 3 | from .base_net import BaseNet 4 | 5 | 6 | class BaseTrainer(ABC): 7 | """Trainer base class.""" 8 | 9 | def __init__(self, optimizer_name: str, lr: float, n_epochs: int, lr_milestones: tuple, batch_size: int, 10 | weight_decay: float, device: str, n_jobs_dataloader: int, writer): 11 | super().__init__() 12 | self.optimizer_name = optimizer_name 13 | self.lr = lr 14 | self.n_epochs = n_epochs 15 | self.lr_milestones = lr_milestones 16 | self.batch_size = batch_size 17 | self.weight_decay = weight_decay 18 | self.device = device 19 | self.n_jobs_dataloader = n_jobs_dataloader 20 | self.writer = writer 21 | 22 | @abstractmethod 23 | def train(self, dataset: BaseADDataset, net: BaseNet) -> BaseNet: 24 | """ 25 | Implement train method that trains the given network using the train_set of dataset. 
26 | :return: Trained net 27 | """ 28 | pass 29 | 30 | @abstractmethod 31 | def test(self, dataset: BaseADDataset, net: BaseNet): 32 | """ 33 | Implement test method that evaluates the test_set of dataset on the given network. 34 | """ 35 | pass 36 | -------------------------------------------------------------------------------- /anomaly_detection/base/torchvision_dataset.py: -------------------------------------------------------------------------------- 1 | from .base_dataset import BaseADDataset 2 | from torch.utils.data import DataLoader 3 | 4 | 5 | class TorchvisionDataset(BaseADDataset): 6 | """TorchvisionDataset class for datasets already implemented in torchvision.datasets.""" 7 | 8 | def __init__(self, root: str): 9 | super().__init__(root) 10 | 11 | def loaders(self, batch_size: int, shuffle_train=True, shuffle_test=False, num_workers: int = 0) -> ( 12 | DataLoader, DataLoader): 13 | train_loader = DataLoader(dataset=self.train_set, batch_size=batch_size, shuffle=shuffle_train, 14 | num_workers=num_workers) 15 | test_loader = DataLoader(dataset=self.test_set, batch_size=batch_size, shuffle=shuffle_test, 16 | num_workers=num_workers) 17 | return train_loader, test_loader 18 | -------------------------------------------------------------------------------- /anomaly_detection/datasets/__init__.py: -------------------------------------------------------------------------------- 1 | from .main import load_dataset 2 | -------------------------------------------------------------------------------- /anomaly_detection/datasets/main.py: -------------------------------------------------------------------------------- 1 | from .selfsupervised_patches import SelfSupervisedDataset 2 | 3 | 4 | def load_dataset(dataset_name, data_path, normal_class, cfg): 5 | """Loads the dataset.""" 6 | 7 | implemented_datasets = ('selfsupervised',) # trailing comma makes this a tuple, so `in` tests membership rather than substring match 8 | assert dataset_name in implemented_datasets 9 | 10 | dataset = None 11 | 12 | if dataset_name == 'selfsupervised': 13 | dataset = 
SelfSupervisedDataset(root=data_path, 14 | train=cfg.settings['train_folder'], 15 | val_pos=cfg.settings['val_pos_folder'], 16 | val_neg=cfg.settings['val_neg_folder'], 17 | rgb=cfg.settings['rgb'], 18 | ir=cfg.settings['ir'], 19 | depth=cfg.settings['depth'], 20 | depth_3d=cfg.settings['depth_3d'], 21 | normals=cfg.settings['normals'], 22 | normal_angle=cfg.settings['normal_angle']) 23 | 24 | return dataset 25 | -------------------------------------------------------------------------------- /anomaly_detection/datasets/selfsupervised_images.py: -------------------------------------------------------------------------------- 1 | import math 2 | import numpy as np 3 | import random 4 | import os 5 | import cv2 6 | 7 | import torch 8 | 9 | from PIL import Image 10 | 11 | from numpy import genfromtxt, count_nonzero 12 | 13 | from torch.utils.data import Dataset 14 | from torchvision.transforms import ToTensor 15 | from torchvision.transforms import Compose, CenterCrop, Normalize, Resize, Pad, ColorJitter 16 | from torchvision.transforms import ToTensor, ToPILImage 17 | from torchvision.transforms import functional 18 | 19 | 20 | 21 | image_transform = ToPILImage() 22 | 23 | IMG_EXTENSIONS = ['.bmp', '.png'] 24 | 25 | def isImage(file): 26 | return os.path.splitext(file)[1] in IMG_EXTENSIONS 27 | 28 | 29 | 30 | class ToLabel: 31 | 32 | def __call__(self, image): 33 | return torch.from_numpy(np.array(image)).long() 34 | 35 | 36 | class FloatToLongLabel: 37 | 38 | def __call__(self, image): 39 | return torch.from_numpy(np.array(image)*1000000).long().unsqueeze(0) 40 | 41 | 42 | class ToFloatLabel: 43 | 44 | def __call__(self, image): 45 | return torch.from_numpy(np.array(image)).float() 46 | 47 | 48 | 49 | def extractStamp(file_name): 50 | split_name = os.path.basename(file_name).split('_') 51 | return split_name[0] 52 | 53 | 54 | 55 | def splitByFileBaseName(files): 56 | active_stamp = -1. 
57 | files.sort() 58 | files_split = {} 59 | cur_files = [] 60 | 61 | longest_file_list = 0 62 | 63 | for file in files: 64 | # Get stamp. 65 | stamp = extractStamp(file) 66 | # Handle new time stamp. 67 | if stamp != active_stamp: 68 | if cur_files: 69 | files_split[active_stamp] = cur_files 70 | if len(cur_files) > longest_file_list: 71 | longest_file_list = len(cur_files) 72 | cur_files = [] 73 | active_stamp = stamp 74 | # Append current file. 75 | cur_files.append(file) 76 | 77 | # One final append after the loop to store the last group. 78 | if cur_files: 79 | files_split[active_stamp] = cur_files 80 | 81 | # Remove lists which are missing some image. 82 | n_removed = 0 83 | remove_keys = [] 84 | for key in files_split: 85 | if len(files_split[key]) != longest_file_list: 86 | n_removed += 1 87 | remove_keys.append(key) 88 | for key in remove_keys: 89 | files_split.pop(key) 90 | if n_removed: 91 | print('Removed ' + str(n_removed) + ' file lists because they are missing some image type.') 92 | 93 | return files_split 94 | 95 | 96 | 97 | class SelfSupervisedDataset(Dataset): 98 | 99 | def isLabel(self, file): 100 | return file.endswith(self.file_format) 101 | 102 | def __init__(self, root, file_format="npy", subsample=1, tensor_type='long'): 103 | self.root = root 104 | self.file_format = file_format 105 | self.tensor_type = tensor_type 106 | 107 | print ("Image root is: " + self.root) 108 | print ("Load files with extension: " + self.file_format) 109 | if subsample > 1: 110 | print("Using every ", subsample, "th image") 111 | 112 | filenames_img = [os.path.join(dp, f) for dp, dn, fn in os.walk(os.path.expanduser(self.root), followlinks=True) for f in fn if isImage(f)] 113 | filenames_img = splitByFileBaseName(filenames_img) 114 | 115 | filenames_label = [os.path.join(dp, f) for dp, dn, fn in os.walk(os.path.expanduser(self.root), followlinks=True) for f in fn if self.isLabel(f)] 116 | filenames_label = splitByFileBaseName(filenames_label) 117 | 118 | # Make sure all images have labels and vice-versa. 
119 | keys_remove = [] 120 | for key in filenames_img: 121 | if key not in filenames_label: 122 | keys_remove.append(key) 123 | for key in keys_remove: 124 | filenames_img.pop(key) 125 | keys_remove.clear() 126 | for key in filenames_label: 127 | if key not in filenames_img: 128 | keys_remove.append(key) 129 | for key in keys_remove: 130 | filenames_label.pop(key) 131 | 132 | # Sanity check: image and label keys must line up one-to-one. 133 | for key1, key2 in zip(filenames_img, filenames_label): 134 | if key1 != key2: 135 | raise Exception('Time stamps of images and labels are not identical') 136 | 137 | self.filenames_img = list(filenames_img.values()) 138 | self.filenames_label = list(filenames_label.values()) 139 | 140 | if subsample > 1: 141 | self.filenames_img = [val for ind, val in enumerate(self.filenames_img) if ind % subsample == 0] # Subsample. 142 | self.filenames_label = [val for ind, val in enumerate(self.filenames_label) if ind % subsample == 0] # Subsample. 143 | 144 | print ("Found " + str(len(self.filenames_img)) + " images.") 145 | print ("Found " + str(len(self.filenames_label)) + " labels.") 146 | 147 | 148 | 149 | def __getitem__(self, index): 150 | filenames_image = self.filenames_img[index] 151 | filename_labels = self.filenames_label[index] 152 | 153 | image_depth = cv2.imread(filenames_image[0], cv2.IMREAD_UNCHANGED) 154 | image_depth.dtype=np.float32 155 | # Remove negative depth which comes from projection. 156 | image_depth[image_depth < 0.0] = 0.0 157 | # Remove far away depth. 
158 | image_depth[image_depth > 10.0] = 0.0 159 | depth_mask = image_depth == 0.0 160 | image_depth = np.transpose(image_depth, (2, 0, 1)) 161 | 162 | 163 | image_depth_3d_x = cv2.imread(filenames_image[1], cv2.IMREAD_UNCHANGED) 164 | image_depth_3d_y = cv2.imread(filenames_image[2], cv2.IMREAD_UNCHANGED) 165 | image_depth_3d_z = cv2.imread(filenames_image[3], cv2.IMREAD_UNCHANGED) 166 | image_depth_3d = [image_depth_3d_x, image_depth_3d_y, image_depth_3d_z] 167 | for i, img in enumerate(image_depth_3d): 168 | img.dtype=np.float32 169 | img[depth_mask] = 0.0 170 | image_depth_3d[i] = np.transpose(img, (2, 0, 1)) 171 | 172 | image_bgr = cv2.imread(filenames_image[4], cv2.IMREAD_UNCHANGED).astype(np.float32)/255 173 | image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB) 174 | image_rgb = np.transpose(image_rgb, (2, 0, 1)) 175 | 176 | image_ir = cv2.imread(filenames_image[5], cv2.IMREAD_UNCHANGED).astype(np.float32)/255 177 | image_ir = np.expand_dims(image_ir, 0) 178 | 179 | image_normal_x = cv2.imread(filenames_image[6], cv2.IMREAD_UNCHANGED).astype(np.float32) # Normals are uint8 scaled 180 | image_normal_y = cv2.imread(filenames_image[7], cv2.IMREAD_UNCHANGED).astype(np.float32) 181 | image_normal_z = cv2.imread(filenames_image[8], cv2.IMREAD_UNCHANGED).astype(np.float32) 182 | image_normal = [image_normal_x, image_normal_y, image_normal_z] 183 | for i, img in enumerate(image_normal): 184 | img /= 127 185 | img -= 1 186 | image_normal[i] = np.expand_dims(img, 0) 187 | 188 | # cv2.imshow('rgb', image_rgb) 189 | # cv2.imshow('ir', image_ir) 190 | # cv2.imshow('depth', image_depth) 191 | # cv2.imshow('normal z', image_normal[2]) 192 | # cv2.waitKey(0) 193 | 194 | label_array = None 195 | if self.file_format == "npy": 196 | label_array = np.load(os.path.join(self.root, filename_labels[0])) 197 | elif self.file_format == "csv": 198 | label_array = genfromtxt(os.path.join(self.root, filename_labels[0]), delimiter=',', dtype="float32") 199 | else: 200 | 
print("Unsupported file format " + self.file_format) 201 | 202 | label = Image.fromarray(label_array, 'F') 203 | 204 | # Convert to tensor 205 | if self.tensor_type=='long': 206 | label = ToLabel()(label) 207 | elif self.tensor_type=='float': 208 | label = ToFloatLabel()(label) 209 | 210 | # Sanitize labels. 211 | if self.file_format == "csv": 212 | label[label != label] = -1 213 | 214 | n_nan = np.count_nonzero(np.isnan(label.numpy())) 215 | if n_nan > 0: 216 | print("File " + filename_labels[0] + " produces nan " + str(n_nan)) 217 | 218 | label = np.expand_dims(label, 0) 219 | 220 | return (image_rgb, image_ir, image_depth, image_depth_3d[0], image_depth_3d[1], image_depth_3d[2], image_normal[0], image_normal[1], image_normal[2]), label 221 | 222 | def __len__(self): 223 | return len(self.filenames_img) 224 | 225 | 226 | 227 | class Transform(object): 228 | def __init__(self, augment=True, height=512): 229 | self.augment = augment 230 | self.height = height 231 | 232 | self.rotation_angle = 5.0 233 | self.affine_angle = 5.0 234 | self.shear_angle = 5.0 235 | # self.crop_ratio = 0.7 236 | self.gaussian_noise = 0.03 * 255 237 | 238 | self.color_augmentation = ColorJitter(brightness=0.4, 239 | contrast=0.4, 240 | saturation=0.4, 241 | hue=0.06) 242 | pass 243 | 244 | def transform_augmentation(self, image, flip, rotation, affine_angle, affine_shear): 245 | # Horizontal flip 246 | if flip: 247 | image = functional.hflip(image) 248 | # Rotate image. 
249 | image = functional.rotate(image, rotation) 250 | # Affine transformation 251 | # image = functional.affine(image, affine_angle, (0,0), affine_shear) # Affine not available in this pytorch version 252 | 253 | return image 254 | 255 | 256 | def __call__(self, input, target): 257 | # valid_mask = Image.fromarray(np.ones((target.size[1], target.size[0]), dtype=np.uint8),'L') 258 | # Crop needs to happen here to avoid cropping out all footsteps 259 | # while True: 260 | # # Generate parameters for image transforms 261 | # rotation_angle = random.uniform(-self.rotation_angle, self.rotation_angle) 262 | # tan_ang = abs(math.tan(math.radians(rotation_angle))) 263 | # y_bound_pix = tan_ang*320 264 | # x_bound_pix = tan_ang*240 265 | # # crop_val = random.uniform(self.crop_ratio, 1.0-(y_bound_pix/240)) 266 | # affine_angle = random.uniform(-self.affine_angle, self.affine_angle) 267 | # shear_angle = random.uniform(-self.shear_angle, self.shear_angle) 268 | # flip = random.random() < 0.5 269 | # # img_size = np.array([640, 480]) * crop_val 270 | # # hor_pos = int(random.uniform(tan_ang, 1-tan_ang) * (640 - img_size[0])) 271 | # # Do other transform. 272 | # input_crop = self.transform_augmentation(input, flip, rotation_angle, affine_angle, shear_angle) 273 | # target_crop = self.transform_augmentation(target, flip, rotation_angle, affine_angle, shear_angle) 274 | # mask_crop = self.transform_augmentation(valid_mask, flip, rotation_angle, affine_angle, shear_angle) 275 | # # Do crop 276 | # # crop_tuple = (hor_pos, 480 - img_size[1]-y_bound_pix, hor_pos + img_size[0], 480-y_bound_pix) 277 | # # input_crop = input_crop.crop(crop_tuple) 278 | # # target_crop = target_crop.crop(crop_tuple) 279 | # target_test = np.array(target_crop, dtype="float32") 280 | # # Make this condition proper for regression where we want > 0.0. Or fix border issues?! 
281 | # if np.any(target_test != -1): 282 | # input = input_crop.resize((640,480)) 283 | # target = target_crop.resize((640,480)) 284 | # valid_mask = mask_crop.resize((640,480)) 285 | # break 286 | 287 | # # Set black parts from transform to invalid. 288 | # target_np = np.array(target) 289 | # target_np[np.array(valid_mask)!=1] = -1 290 | # target = Image.fromarray(target_np) 291 | # Color transformation 292 | input_augment = self.color_augmentation(input) 293 | # Add noise. Since PIL sucks, this is the best way. 294 | noise = self.gaussian_noise * np.random.randn(input_augment.size[1],input_augment.size[0],len(input_augment.getbands())) 295 | input_augment = Image.fromarray(np.uint8(np.clip(np.array(input_augment) + noise, 0, 255))) 296 | 297 | return input_augment, input, target 298 | -------------------------------------------------------------------------------- /anomaly_detection/datasets/selfsupervised_patches.py: -------------------------------------------------------------------------------- 1 | from torch.utils.data import Subset, Dataset, random_split 2 | from PIL import Image 3 | from anomaly_detection.base.torchvision_dataset import TorchvisionDataset 4 | 5 | import os 6 | import torchvision.transforms as transforms 7 | from cv2 import imread, IMREAD_UNCHANGED, IMREAD_GRAYSCALE 8 | import numpy as np 9 | 10 | 11 | 12 | # DEPTH_MEAN=1.8502674 13 | # DEPTH_STD=1.566148 14 | DEPTH_MEAN=0.0 15 | DEPTH_STD=10.0 16 | 17 | 18 | 19 | class DatasetCombiner(Dataset): 20 | def __init__(self, datasets): 21 | image_list = [dataset.images for dataset in datasets] 22 | self.images = np.concatenate(tuple(image_list)) 23 | self.labels = [] 24 | for dataset in datasets: 25 | self.labels += [dataset.label for val in dataset.images] 26 | 27 | def __len__(self): 28 | return len(self.labels) 29 | 30 | def __getitem__(self, ind): 31 | return self.images[ind], self.labels[ind], ind 32 | 33 | 34 | 35 | class SelfSupervisedDataset(TorchvisionDataset): 36 | 37 | def 
__init__(self, root: str, train=None, val_pos=None, val_neg=None, rgb=True, ir=False, depth=False, depth_3d=False, normals=False, normal_angle=False, normal_class = 1): 38 | super().__init__(root) 39 | 40 | if train is None: 41 | train = 'train' 42 | print('Set train folder to: ' + train) 43 | if val_pos is None: 44 | val_pos = 'wangen_sun_3_pos' 45 | print('Set val_pos folder to: ' + val_pos) 46 | if val_neg is None: 47 | val_neg = 'wangen_sun_3_neg' 48 | print('Set val_neg folder to: ' + val_neg) 49 | 50 | self.n_classes = 2 51 | self.normal_class = tuple([normal_class]) 52 | self.outlier_classes = list(range(0, self.n_classes)) 53 | self.outlier_classes.remove(normal_class) 54 | 55 | # transform = transforms.Compose([transforms.ToTensor()]) 56 | 57 | # target_transform = transforms.Lambda(lambda x: int(x in self.outlier_classes)) 58 | 59 | if not isinstance(train, list) and not isinstance(train, tuple): 60 | train = (train,) 61 | if not isinstance(val_pos, list) and not isinstance(val_pos, tuple): 62 | val_pos = (val_pos,) 63 | if not isinstance(val_neg, list) and not isinstance(val_neg, tuple): 64 | val_neg = (val_neg,) 65 | 66 | positive_data_train = MySelfSupervised([os.path.join(root, folder) for folder in train], rgb=rgb, ir=ir, depth=depth, depth_3d=depth_3d, normals=normals, normal_angle=normal_angle, label=1) 67 | positive_data_val = MySelfSupervised([os.path.join(root, folder) for folder in val_pos], rgb=rgb, ir=ir, depth=depth, depth_3d=depth_3d, normals=normals, normal_angle=normal_angle, label=1) 68 | negative_data_val = MySelfSupervised([os.path.join(root, folder) for folder in val_neg], rgb=rgb, ir=ir, depth=depth, depth_3d=depth_3d, normals=normals, normal_angle=normal_angle, label=0) 69 | 70 | self.train_set = positive_data_train 71 | # self.train_set = DatasetCombiner([positive_data_train, unlabelled_data_train]) 72 | # self.test_set = positive_data_val 73 | self.test_set = DatasetCombiner([negative_data_val, positive_data_val]) 74 | 75 | 76 | 
77 | class MySelfSupervised(Dataset): 78 | 79 | def loadRGBImages(self): 80 | filenames_rgb = [os.path.join(dp, f) for dp, dn, fn in os.walk(os.path.expanduser(self.root), followlinks=True) 81 | for f in fn if f.endswith('rgb.png')] 82 | filenames_rgb.sort() 83 | print('Found ' + str(len(filenames_rgb)) + ' RGB images in ' + self.root) 84 | images_rgb = [imread(f) for f in filenames_rgb] 85 | return images_rgb 86 | 87 | 88 | 89 | def loadIrImages(self): 90 | filenames_ir = [os.path.join(dp, f) for dp, dn, fn in os.walk(os.path.expanduser(self.root), followlinks=True) 91 | for f in fn if f.endswith('ir.png')] 92 | filenames_ir.sort() 93 | print('Found ' + str(len(filenames_ir)) + ' IR images in ' + self.root) 94 | images_ir = [imread(f, IMREAD_GRAYSCALE) for f in filenames_ir] 95 | return images_ir 96 | 97 | 98 | 99 | def loadDepthImages(self): 100 | filenames_depth = [os.path.join(dp, f) for dp, dn, fn in os.walk(os.path.expanduser(self.root), followlinks=True) 101 | for f in fn if f.endswith('depth.png')] 102 | filenames_depth.sort() 103 | print('Found ' + str(len(filenames_depth)) + ' depth images in ' + self.root) 104 | images_depth = [imread(f, IMREAD_UNCHANGED) for f in filenames_depth] 105 | 106 | # Convert depth to float. 
107 | for image in images_depth: 108 | if image is not None: 109 | image.dtype=np.float32 110 | 111 | return images_depth 112 | 113 | 114 | 115 | def loadDepth3dImages(self): 116 | filenames_depth_x = [os.path.join(dp, f) for dp, dn, fn in os.walk(os.path.expanduser(self.root), followlinks=True) 117 | for f in fn if f.endswith('depth_3d_x.png')] 118 | filenames_depth_y = [os.path.join(dp, f) for dp, dn, fn in os.walk(os.path.expanduser(self.root), followlinks=True) 119 | for f in fn if f.endswith('depth_3d_y.png')] 120 | filenames_depth_z = [os.path.join(dp, f) for dp, dn, fn in os.walk(os.path.expanduser(self.root), followlinks=True) 121 | for f in fn if f.endswith('depth_3d_z.png')] 122 | print('Found ' + str(len(filenames_depth_x)) + ' depth_3d images in ' + self.root) 123 | images_depth_x = [imread(f, IMREAD_UNCHANGED) for f in filenames_depth_x] 124 | images_depth_y = [imread(f, IMREAD_UNCHANGED) for f in filenames_depth_y] 125 | images_depth_z = [imread(f, IMREAD_UNCHANGED) for f in filenames_depth_z] 126 | 127 | for image in images_depth_x: 128 | if image is not None: 129 | image.dtype=np.float32 130 | for image in images_depth_y: 131 | if image is not None: 132 | image.dtype=np.float32 133 | for image in images_depth_z: 134 | if image is not None: 135 | image.dtype=np.float32 136 | 137 | images_depth_x = np.stack([(np.squeeze(img)-DEPTH_MEAN)/DEPTH_STD for img in images_depth_x if img is not None]) 138 | images_depth_x = np.expand_dims(images_depth_x, axis=1) 139 | images_depth_y = np.stack([(np.squeeze(img)-DEPTH_MEAN)/DEPTH_STD for img in images_depth_y if img is not None]) 140 | images_depth_y = np.expand_dims(images_depth_y, axis=1) 141 | images_depth_z = np.stack([(np.squeeze(img)-DEPTH_MEAN)/DEPTH_STD for img in images_depth_z if img is not None]) 142 | images_depth_z = np.expand_dims(images_depth_z, axis=1) 143 | 144 | images_depth_horz = (images_depth_x**2 + images_depth_y**2)**(0.5) 145 | 146 | invalid_mask = (images_depth_horz > 1.0) | 
(images_depth_z > 1.0) | (images_depth_horz < 0) 147 | images_depth_horz[invalid_mask] = 0.0 148 | images_depth_z[invalid_mask] = 0.0 149 | print('Zeroed out ' + str(np.sum(invalid_mask)) + ' depth 3d values') 150 | images_depth_3d = np.concatenate([images_depth_horz, images_depth_z], axis=1) 151 | 152 | return images_depth_3d 153 | 154 | 155 | 156 | def loadNormalsImages(self): 157 | filenames_normals_x = [os.path.join(dp, f) for dp, dn, fn in os.walk(os.path.expanduser(self.root), followlinks=True) 158 | for f in fn if f.endswith('normals_x.png')] 159 | filenames_normals_y = [os.path.join(dp, f) for dp, dn, fn in os.walk(os.path.expanduser(self.root), followlinks=True) 160 | for f in fn if f.endswith('normals_y.png')] 161 | filenames_normals_z = [os.path.join(dp, f) for dp, dn, fn in os.walk(os.path.expanduser(self.root), followlinks=True) 162 | for f in fn if f.endswith('normals_z.png')] 163 | print('Found ' + str(len(filenames_normals_x)) + ' normals images in ' + self.root) 164 | images_normals_x = [imread(f, IMREAD_UNCHANGED) for f in filenames_normals_x] 165 | images_normals_y = [imread(f, IMREAD_UNCHANGED) for f in filenames_normals_y] 166 | images_normals_z = [imread(f, IMREAD_UNCHANGED) for f in filenames_normals_z] 167 | 168 | for image in images_normals_x: 169 | if image is not None: 170 | image.dtype=np.float32 171 | for image in images_normals_y: 172 | if image is not None: 173 | image.dtype=np.float32 174 | for image in images_normals_z: 175 | if image is not None: 176 | image.dtype=np.float32 177 | 178 | images_normals_x = np.stack([np.squeeze(img) for img in images_normals_x if img is not None]) 179 | images_normals_x = np.expand_dims(images_normals_x, axis=1) 180 | images_normals_y = np.stack([np.squeeze(img) for img in images_normals_y if img is not None]) 181 | images_normals_y = np.expand_dims(images_normals_y, axis=1) 182 | images_normals_z = np.stack([np.squeeze(img) for img in images_normals_z if img is not None]) 183 | images_normals_z = 
np.expand_dims(images_normals_z, axis=1) 184 | 185 | images_normals_horz = (images_normals_x**2 + images_normals_y**2)**0.5 186 | 187 | images_normals = np.concatenate([images_normals_horz, images_normals_z], axis=1) 188 | 189 | return images_normals 190 | 191 | 192 | 193 | def loadImagesFromFiles(self, label, rgb, ir, depth, depth_3d, normals, normal_angle, indices): 194 | if rgb: 195 | images_rgb = self.loadRGBImages() 196 | 197 | if ir: 198 | images_ir = self.loadIrImages() 199 | 200 | if depth: 201 | images_depth = self.loadDepthImages() 202 | 203 | if depth_3d: 204 | images_depth_3d = self.loadDepth3dImages() 205 | 206 | if normals or normal_angle: 207 | images_normals = self.loadNormalsImages() 208 | 209 | if indices is not None: 210 | if rgb: 211 | images_rgb = [images_rgb[ind] for ind in indices] 212 | if ir: 213 | images_ir = [images_ir[ind] for ind in indices] 214 | if depth: 215 | images_depth = [images_depth[ind] for ind in indices] 216 | 217 | images = [] 218 | if rgb: 219 | images_rgb = np.stack([img.transpose((2,0,1)).astype(np.float32)/255 for img in images_rgb if img is not None]) 220 | images.append(images_rgb) 221 | if ir: 222 | images_ir = np.stack([img.astype(np.float32)/255 for img in images_ir if img is not None]) 223 | images_ir = np.expand_dims(images_ir, axis=1) 224 | images.append(images_ir) 225 | if depth: 226 | images_depth = np.stack([(np.squeeze(img)-DEPTH_MEAN)/DEPTH_STD for img in images_depth if img is not None]) 227 | images_depth = np.expand_dims(images_depth, axis=1) 228 | images.append(images_depth) 229 | if depth_3d: 230 | images.append(images_depth_3d) 231 | if normals: 232 | images.append(images_normals) 233 | if normal_angle: 234 | normal_angle = np.arctan2(images_normals[:,1:2,:,:], images_normals[:,0:1,:,:])/np.pi 235 | images.append(normal_angle) 236 | 237 | self.images.append(np.concatenate(images, axis=1)) 238 | 239 | 240 | self.label = label 241 | 242 | 243 | 244 | def __init__(self, roots, label, rgb, ir, depth, 
                 depth_3d, normals, normal_angle, indices=None):
        # Zero-argument super() binds to this instance; super(MySelfSupervised) alone is unbound.
        super().__init__()

        self.images = []

        for root in roots:
            self.root = root

            self.loadImagesFromFiles(label, rgb, ir, depth, depth_3d, normals, normal_angle, indices)

        self.images = np.concatenate(self.images, axis=0)
        print(str(self.images.shape[0]), 'images total.')



    def __len__(self):
        return len(self.images)



    def __getitem__(self, ind):
        return self.images[ind], self.label, ind

--------------------------------------------------------------------------------
/anomaly_detection/networks/__init__.py:
--------------------------------------------------------------------------------
from .main import build_network, build_autoencoder
from .stack_conv_net import StackConvNet, StackConvNet_Autoencoder

--------------------------------------------------------------------------------
/anomaly_detection/networks/main.py:
--------------------------------------------------------------------------------
from .stack_conv_net import StackConvNet, StackConvNet_Autoencoder
from .real_nvp import RealNVP, EncoderRealNVP


def build_network(net_name, cfg):
    """Builds the neural network."""

    # The trailing comma makes this a tuple; without it, `in` is a substring test on a str.
    implemented_networks = ('StackConvNet',)
    assert net_name in implemented_networks

    net = None

    if net_name == 'StackConvNet':
        # Count input channels from the enabled modalities.
        n_channel = 0
        if cfg.settings['rgb']:
            n_channel += 3
        if cfg.settings['ir']:
            n_channel += 1
        if cfg.settings['depth']:
            n_channel += 1
        if cfg.settings['depth_3d']:
            n_channel += 2
        if cfg.settings['normals']:
            n_channel += 2
        if cfg.settings['normal_angle']:
            n_channel += 1
        net = StackConvNet(in_channels=n_channel,
                           use_bn=cfg.settings['batchnorm'],
                           use_dropout=cfg.settings['dropout'])

    if cfg.settings['objective'] == 'real-nvp':
        net_nvp =
            RealNVP(in_dim=net.rep_dim, mid_dim=2*net.rep_dim)
        net = EncoderRealNVP(net, net_nvp)

    return net


def build_autoencoder(net_name, cfg):
    """Builds the corresponding autoencoder network."""

    # The trailing comma makes this a tuple; without it, `in` is a substring test on a str.
    implemented_networks = ('StackConvNet',)
    assert net_name in implemented_networks

    ae_net = None

    if net_name == 'StackConvNet':
        # Count input channels from the enabled modalities.
        n_channel = 0
        if cfg.settings['rgb']:
            n_channel += 3
        if cfg.settings['ir']:
            n_channel += 1
        if cfg.settings['depth']:
            n_channel += 1
        if cfg.settings['depth_3d']:
            n_channel += 2
        if cfg.settings['normals']:
            n_channel += 2
        if cfg.settings['normal_angle']:
            n_channel += 1
        ae_net = StackConvNet_Autoencoder(in_channels=n_channel,
                                          use_bn=cfg.settings['batchnorm'],
                                          use_dropout=cfg.settings['dropout'])

    return ae_net

--------------------------------------------------------------------------------
/anomaly_detection/networks/real_nvp.py:
--------------------------------------------------------------------------------
import torch
import torch.nn as nn
import torch.nn.functional as F




class STNetwork(nn.Module):
    def __init__(self, in_dim, mid_dim):
        super().__init__()

        self.layers = nn.Sequential(
            nn.Conv2d(in_dim, mid_dim, 1, bias=True),
            nn.LeakyReLU(),
            nn.Conv2d(mid_dim, mid_dim, 1, bias=True),
            nn.LeakyReLU(),
            nn.Conv2d(mid_dim, 2*in_dim, 1, bias=True),
            nn.Tanh()
        )
        # NOTE: the layers above are nn.Conv2d, so this nn.Linear check never matches
        # and the convolutions keep PyTorch's default initialization.
        for m in self.modules():
            if isinstance(m, nn.Linear):
                nn.init.xavier_uniform_(m.weight)
                nn.init.constant_(m.bias, 0)

    def forward(self, x):
        x = self.layers(x)
        # First half of the channels is the log-scale s, second half the translation t.
        return x.chunk(2, dim=1)



class RealNVPBlock(nn.Module):
    def __init__(self, in_dim, mid_dim):
        super().__init__()
        self.st_net = STNetwork(in_dim, mid_dim)

    def forward(self, x, mask, inv_mask):
        # Split channels.
38 | x_fix = F.conv2d(x, inv_mask) 39 | x_change = F.conv2d(x, mask) 40 | 41 | # Compute scaling and translation. 42 | s, t = self.st_net(x_fix) 43 | x_change = x_change * s.exp() + t 44 | 45 | # Write changes back to channels. 46 | mask_nochange = inv_mask.sum(dim=0) 47 | mask_change = mask.sum(dim=0) 48 | x = x * mask_nochange 49 | x_change_long = F.conv2d(x_change, mask.transpose(0, 1)) 50 | x = x + x_change_long 51 | 52 | return x, s.sum(dim=1, keepdim=True) 53 | 54 | 55 | 56 | class RealNVP(nn.Module): 57 | def __init__(self, in_dim, mid_dim): 58 | super().__init__() 59 | eye = torch.eye(in_dim//2) 60 | zeros = torch.zeros(in_dim//2, in_dim//2) 61 | a1 = torch.cat((eye, zeros), dim=1).unsqueeze_(-1).unsqueeze_(-1) 62 | a2 = torch.cat((zeros, eye), dim=1).unsqueeze_(-1).unsqueeze_(-1) 63 | b1 = torch.stack((eye, zeros), dim=-1).reshape(in_dim//2, in_dim).unsqueeze_(-1).unsqueeze_(-1) 64 | b2 = torch.stack((zeros, eye), dim=-1).reshape(in_dim//2, in_dim).unsqueeze_(-1).unsqueeze_(-1) 65 | self.masks = [(a1, a2), 66 | (a2, a1), 67 | (b1, b2), 68 | (b2, b1), 69 | (a1, a2), 70 | (a2, a1), 71 | (b1, b2), 72 | (b2, b1)] 73 | self.nets = nn.ModuleList([RealNVPBlock(in_dim//2, mid_dim) for _ in self.masks]) 74 | 75 | def to(self, device): 76 | super().to(device) 77 | self.masks = [(mask.to(device), inv_mask.to(device)) for mask, inv_mask in self.masks] 78 | return self 79 | 80 | def forward(self, x): 81 | log_det_J = torch.zeros((x.shape[0], 1, x.shape[2], x.shape[3]), device=x.device) 82 | for net, (mask, inv_mask) in zip(self.nets, self.masks): 83 | x, cur_log_det_J = net(x, mask, inv_mask) 84 | log_det_J = log_det_J + cur_log_det_J 85 | return x, log_det_J 86 | 87 | 88 | 89 | class EncoderRealNVP(nn.Module): 90 | def __init__(self, encoder, nvp): 91 | super().__init__() 92 | self.encoder = encoder 93 | self.nvp = nvp 94 | self.rep_dim = encoder.rep_dim 95 | 96 | def to(self, device): 97 | super().to(device) 98 | self.encoder.to(device) 99 | self.nvp.to(device) 100 | 
return self 101 | 102 | def forward(self, x): 103 | x = self.encoder(x) 104 | return self.nvp(x) 105 | 106 | 107 | -------------------------------------------------------------------------------- /anomaly_detection/networks/stack_conv_net.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | 5 | from anomaly_detection.base.base_net import BaseNet 6 | 7 | 8 | class StackConvNet(BaseNet): 9 | 10 | def __init__(self, in_channels=3, use_bn=False, use_dropout=True): 11 | super().__init__() 12 | 13 | self.rep_dim = 128 14 | self.pool = nn.MaxPool2d(2, 2) 15 | self.use_bn = use_bn 16 | self.use_dropout = use_dropout 17 | 18 | if use_dropout: 19 | self.drop = nn.Dropout2d(p=0.05) 20 | 21 | self.conv1 = nn.Conv2d(in_channels, 32, 5, bias=False) 22 | self.conv2 = nn.Conv2d(32, 64, 5, bias=False) 23 | self.conv3 = nn.Conv2d(64, 128, 5, bias=False) 24 | self.fconv1 = nn.Conv2d(128, self.rep_dim, 1, bias=False) 25 | if use_bn: 26 | self.bn2d1 = nn.BatchNorm2d(32, eps=1e-04, affine=False) 27 | self.bn2d2 = nn.BatchNorm2d(64, eps=1e-04, affine=False) 28 | self.bn2d3 = nn.BatchNorm2d(128, eps=1e-04, affine=False) 29 | 30 | def forward(self, x): 31 | x = self.conv1(x) 32 | if self.use_bn: 33 | x = self.bn2d1(x) 34 | x = F.leaky_relu(x) 35 | if self.use_dropout: 36 | x = self.drop(x) 37 | x = self.pool(x) 38 | 39 | x = self.conv2(x) 40 | if self.use_bn: 41 | x = self.bn2d2(x) 42 | x = F.leaky_relu(x) 43 | if self.use_dropout: 44 | x = self.drop(x) 45 | x = self.pool(x) 46 | 47 | x = self.conv3(x) 48 | if self.use_bn: 49 | x = self.bn2d3(x) 50 | x = F.leaky_relu(x) 51 | if self.use_dropout: 52 | x = self.drop(x) 53 | 54 | x = self.fconv1(x) 55 | return x 56 | 57 | 58 | class StackConvNet_Autoencoder(BaseNet): 59 | 60 | def __init__(self, in_channels=3, use_bn=False, use_dropout=True): 61 | super().__init__() 62 | 63 | self.rep_dim = 128 64 | self.pool = nn.MaxPool2d(2, 2) 
65 | self.use_bn = use_bn 66 | self.use_dropout = use_dropout 67 | 68 | if use_dropout: 69 | self.drop = nn.Dropout2d(p=0.05) 70 | 71 | # Encoder (must match the network above) 72 | self.conv1 = nn.Conv2d(in_channels, 32, 5, bias=False) 73 | nn.init.xavier_uniform_(self.conv1.weight, gain=nn.init.calculate_gain('leaky_relu')) 74 | self.conv2 = nn.Conv2d(32, 64, 5, bias=False) 75 | nn.init.xavier_uniform_(self.conv2.weight, gain=nn.init.calculate_gain('leaky_relu')) 76 | self.conv3 = nn.Conv2d(64, 128, 5, bias=False) 77 | nn.init.xavier_uniform_(self.conv3.weight, gain=nn.init.calculate_gain('leaky_relu')) 78 | self.fconv1 = nn.Conv2d(128, self.rep_dim, 1, bias=False) 79 | if use_bn: 80 | self.bn2d1 = nn.BatchNorm2d(32, eps=1e-04, affine=False) 81 | self.bn2d2 = nn.BatchNorm2d(64, eps=1e-04, affine=False) 82 | self.bn2d3 = nn.BatchNorm2d(128, eps=1e-04, affine=False) 83 | self.bn2d = nn.BatchNorm2d(self.rep_dim, eps=1e-04, affine=False) 84 | 85 | # Decoder 86 | self.deconv1 = nn.ConvTranspose2d(self.rep_dim, 128, 1, bias=False) 87 | nn.init.xavier_uniform_(self.deconv1.weight, gain=nn.init.calculate_gain('leaky_relu')) 88 | self.deconv2 = nn.ConvTranspose2d(128, 64, 5, bias=False) 89 | nn.init.xavier_uniform_(self.deconv2.weight, gain=nn.init.calculate_gain('leaky_relu')) 90 | self.deconv3 = nn.ConvTranspose2d(64, 32, 5, bias=False) 91 | nn.init.xavier_uniform_(self.deconv3.weight, gain=nn.init.calculate_gain('leaky_relu')) 92 | self.deconv4 = nn.ConvTranspose2d(32, in_channels, 5, bias=False) 93 | nn.init.xavier_uniform_(self.deconv4.weight, gain=nn.init.calculate_gain('leaky_relu')) 94 | if use_bn: 95 | self.bn2d4 = nn.BatchNorm2d(128, eps=1e-04, affine=False) 96 | self.bn2d5 = nn.BatchNorm2d(64, eps=1e-04, affine=False) 97 | self.bn2d6 = nn.BatchNorm2d(32, eps=1e-04, affine=False) 98 | 99 | def forward(self, x): 100 | x = self.conv1(x) 101 | if self.use_bn: 102 | x = self.bn2d1(x) 103 | x = F.leaky_relu(x) 104 | if self.use_dropout: 105 | x = self.drop(x) 106 | x 
= self.pool(x) 107 | 108 | x = self.conv2(x) 109 | if self.use_bn: 110 | x = self.bn2d2(x) 111 | x = F.leaky_relu(x) 112 | if self.use_dropout: 113 | x = self.drop(x) 114 | x = self.pool(x) 115 | 116 | x = self.conv3(x) 117 | if self.use_bn: 118 | x = self.bn2d3(x) 119 | x = F.leaky_relu(x) 120 | if self.use_dropout: 121 | x = self.drop(x) 122 | 123 | x = self.fconv1(x) 124 | if self.use_bn: 125 | x = self.bn2d(x) 126 | x = F.leaky_relu(x) 127 | if self.use_dropout: 128 | x = self.drop(x) 129 | 130 | x = self.deconv1(x) 131 | if self.use_bn: 132 | x = self.bn2d4(x) 133 | x = F.leaky_relu(x) 134 | if self.use_dropout: 135 | x = self.drop(x) 136 | 137 | x = self.deconv2(x) 138 | if self.use_bn: 139 | x = self.bn2d5(x) 140 | x = F.leaky_relu(x) 141 | if self.use_dropout: 142 | x = self.drop(x) 143 | 144 | x = F.interpolate(x, scale_factor=2) 145 | x = self.deconv3(x) 146 | if self.use_bn: 147 | x = self.bn2d6(x) 148 | x = F.leaky_relu(x) 149 | if self.use_dropout: 150 | x = self.drop(x) 151 | 152 | x = F.interpolate(x, scale_factor=2) 153 | x = self.deconv4(x) 154 | x = torch.tanh(x) 155 | 156 | return x 157 | -------------------------------------------------------------------------------- /anomaly_detection/optim/__init__.py: -------------------------------------------------------------------------------- 1 | from .anomalyDetection_trainer import AnomalyDetectionTrainer 2 | from .ae_trainer import AETrainer 3 | -------------------------------------------------------------------------------- /anomaly_detection/optim/ae_trainer.py: -------------------------------------------------------------------------------- 1 | from anomaly_detection.base.base_trainer import BaseTrainer 2 | from anomaly_detection.base.base_dataset import BaseADDataset 3 | from anomaly_detection.base.base_net import BaseNet 4 | from anomaly_detection.utils.eval_functions import computeMaxYoudensIndex 5 | from sklearn.metrics import roc_auc_score, roc_curve 6 | 7 | import logging 8 | import time 9 | 
import torch 10 | import torch.optim as optim 11 | import numpy as np 12 | 13 | 14 | class AETrainer(BaseTrainer): 15 | 16 | def __init__(self, writer, optimizer_name: str = 'adam', lr: float = 0.001, n_epochs: int = 150, lr_milestones: tuple = (), 17 | batch_size: int = 128, weight_decay: float = 1e-6, device: str = 'cuda', n_jobs_dataloader: int = 0): 18 | super().__init__(optimizer_name, lr, n_epochs, lr_milestones, batch_size, weight_decay, device, 19 | n_jobs_dataloader, writer) 20 | 21 | def train(self, dataset: BaseADDataset, ae_net: BaseNet): 22 | logger = logging.getLogger() 23 | 24 | # Set device for network 25 | ae_net = ae_net.to(self.device) 26 | 27 | # Get train data loader 28 | train_loader, _ = dataset.loaders(batch_size=self.batch_size, num_workers=self.n_jobs_dataloader) 29 | 30 | # Set optimizer (Adam optimizer for now) 31 | optimizer = optim.Adam(ae_net.parameters(), lr=self.lr, weight_decay=self.weight_decay) 32 | 33 | # Set learning rate scheduler 34 | scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=self.lr_milestones, gamma=0.1) 35 | 36 | best_auc = 0 37 | 38 | # Training 39 | logger.info('Starting pretraining...') 40 | start_time = time.time() 41 | ae_net.train() 42 | for epoch in range(self.n_epochs): 43 | 44 | if epoch in self.lr_milestones: 45 | logger.info(' LR scheduler: new learning rate is %g' % float(scheduler.get_lr()[0])) 46 | 47 | loss_epoch = 0.0 48 | n_batches = 0 49 | epoch_start_time = time.time() 50 | for data in train_loader: 51 | inputs, _, _ = data 52 | inputs = inputs.to(self.device) 53 | 54 | # Zero the network parameter gradients 55 | optimizer.zero_grad() 56 | 57 | # Update network parameters via backpropagation: forward + backward + optimize 58 | outputs = ae_net(inputs) 59 | scores = torch.sum((outputs - inputs) ** 2, dim=tuple(range(1, outputs.dim()))) 60 | loss = torch.mean(scores) 61 | loss.backward() 62 | optimizer.step() 63 | 64 | loss_epoch += loss.item() 65 | n_batches += 1 66 | 67 | 
scheduler.step() 68 | 69 | # Test AUC. 70 | auc, (val_loss_pos, val_loss_neg) = self.test(dataset, ae_net, verbose=False) 71 | 72 | if auc > best_auc: 73 | best_auc = auc 74 | self.best_weights = {} 75 | for key in ae_net.state_dict(): 76 | self.best_weights[key] = ae_net.state_dict()[key].cpu() 77 | 78 | avg_loss = loss_epoch / n_batches 79 | 80 | # Do Tensorboard logging. 81 | self.writer.add_scalar('ae/loss', avg_loss, epoch) 82 | self.writer.add_scalar('ae/AUC', auc, epoch) 83 | self.writer.add_scalar('ae/loss_val_pos', val_loss_pos, epoch) 84 | self.writer.add_scalar('ae/loss_val_neg', val_loss_neg, epoch) 85 | 86 | # log epoch statistics 87 | epoch_train_time = time.time() - epoch_start_time 88 | logger.info(' Epoch {}/{}\t Time: {:.3f}\t Loss: {:.8f}\t AUC: {:.8f}' 89 | .format(epoch + 1, self.n_epochs, epoch_train_time, avg_loss, auc)) 90 | 91 | pretrain_time = time.time() - start_time 92 | logger.info('Pretraining time: %.3f' % pretrain_time) 93 | logger.info('Finished pretraining.') 94 | 95 | return ae_net 96 | 97 | def test(self, dataset: BaseADDataset, ae_net: BaseNet, verbose=True): 98 | logger = logging.getLogger() 99 | 100 | # Set device for network 101 | ae_net = ae_net.to(self.device) 102 | 103 | # Get test data loader 104 | _, test_loader = dataset.loaders(batch_size=self.batch_size, num_workers=self.n_jobs_dataloader) 105 | 106 | # Testing 107 | if verbose: 108 | logger.info('Testing autoencoder...') 109 | loss_epoch = 0.0 110 | n_batches = 0 111 | start_time = time.time() 112 | idx_label_score = [] 113 | ae_net.eval() 114 | with torch.no_grad(): 115 | for data in test_loader: 116 | inputs, labels, idx = data 117 | inputs = inputs.to(self.device) 118 | outputs = ae_net(inputs) 119 | scores = torch.sum((outputs - inputs) ** 2, dim=tuple(range(1, outputs.dim()))) 120 | loss = torch.mean(scores) 121 | 122 | # Save triple of (idx, label, score) in a list 123 | idx_label_score += list(zip(idx.cpu().data.numpy().tolist(), 124 | 
labels.cpu().data.numpy().tolist(), 125 | scores.cpu().data.numpy().tolist())) 126 | 127 | loss_epoch += loss.item() 128 | n_batches += 1 129 | 130 | if verbose: 131 | logger.info('Test set Loss: {:.8f}'.format(loss_epoch / n_batches)) 132 | 133 | n_pos = 0; loss_pos = 0 134 | n_neg = 0; loss_neg = 0 135 | for idx, label, score in idx_label_score: 136 | if label: 137 | n_pos += 1 138 | loss_pos += score 139 | else: 140 | n_neg += 1 141 | loss_neg += score 142 | loss_pos /= n_pos 143 | loss_neg /= n_neg 144 | 145 | _, labels, scores = zip(*idx_label_score) 146 | labels = np.array(labels) 147 | scores = -np.array(scores) 148 | 149 | 150 | auc = roc_auc_score(labels, scores) 151 | if verbose: 152 | roc_x, roc_y, roc_thres = roc_curve(labels, scores) 153 | thres_best = computeMaxYoudensIndex(roc_x, roc_y, roc_thres) 154 | self.thres = -thres_best 155 | logger.info('Test set AUC: {:.2f}%'.format(100. * auc)) 156 | 157 | test_time = time.time() - start_time 158 | logger.info('Autoencoder testing time: %.3f' % test_time) 159 | logger.info('Finished testing autoencoder.') 160 | 161 | return auc, (loss_pos, loss_neg) 162 | -------------------------------------------------------------------------------- /anomaly_detection/optim/anomalyDetection_trainer.py: -------------------------------------------------------------------------------- 1 | from anomaly_detection.base.base_trainer import BaseTrainer 2 | from anomaly_detection.base.base_dataset import BaseADDataset 3 | from anomaly_detection.base.base_net import BaseNet 4 | from anomaly_detection.utils.eval_functions import computeMaxYoudensIndex, computeTprAt5Fpr 5 | from torch.utils.data.dataloader import DataLoader 6 | from sklearn.metrics import roc_auc_score, roc_curve 7 | import matplotlib 8 | matplotlib.use('Agg') # or 'PS', 'PDF', 'SVG' 9 | import matplotlib.pyplot as plt 10 | import cv2 11 | 12 | import logging 13 | import time 14 | import torch 15 | import torch.optim as optim 16 | import numpy as np 17 | 18 | 19 | 
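```python
# Hedged sketch, not part of the repository: the trainer below perturbs each
# input by EPS_AUGMENT times the sign of the loss gradient (a fast-gradient-
# sign style step). This toy version uses plain Python lists and takes the
# "network" to be the identity map, so the one-class loss reduces to the
# squared distance to the center and its gradient has a closed form. The
# names `sign` and `adversarial_augment` are hypothetical.


def sign(value):
    """Return -1.0, 0.0 or 1.0, matching the convention of torch.sign."""
    return float((value > 0) - (value < 0))


def adversarial_augment(inputs, center, eps=2.5e-2):
    """Shift each input element by eps in the direction that increases the loss."""
    # Gradient of sum((x - c)^2) w.r.t. x is 2 * (x - c); only its sign is used.
    grads = [2.0 * (x - c) for x, c in zip(inputs, center)]
    return [x + eps * sign(g) for x, g in zip(inputs, grads)]


# The last element already sits on the center, so it is left unchanged.
augmented = adversarial_augment([0.5, -0.2, 0.1], [0.0, 0.0, 0.1])
```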
# Adversarial training params.
EPS_AUGMENT = 2.5e-2




class AnomalyDetectionTrainer(BaseTrainer):

    def __init__(self, writer, objective, R, c, nu: float, optimizer_name: str = 'adam', lr: float = 0.001, n_epochs: int = 150,
                 lr_milestones: tuple = (), batch_size: int = 128, weight_decay: float = 1e-6, device: str = 'cuda',
                 n_jobs_dataloader: int = 0, fix_encoder: bool = False):
        super().__init__(optimizer_name, lr, n_epochs, lr_milestones, batch_size, weight_decay, device,
                         n_jobs_dataloader, writer)

        assert objective in ('one-class', 'soft-boundary', 'real-nvp'), "Objective must be one of 'one-class', 'soft-boundary' or 'real-nvp'."
        self.objective = objective
        self.fix_encoder = fix_encoder

        # Deep SVDD parameters
        self.R = torch.tensor(R, device=self.device)  # radius R initialized with 0 by default.
        self.c = torch.tensor(c, device=self.device) if c is not None else None
        self.nu = nu
        self.thres = None

        # Do Real NVP initializations.
45 | 46 | 47 | # Optimization parameters 48 | self.warm_up_n_epochs = 10 # number of training epochs for soft-boundary Deep SVDD before radius R gets updated 49 | 50 | # Results 51 | self.train_time = None 52 | self.test_auc = None 53 | self.test_time = None 54 | self.test_scores = None 55 | 56 | def train(self, dataset: BaseADDataset, net: BaseNet, augment: bool): 57 | logger = logging.getLogger() 58 | 59 | # Set device for network 60 | net = net.to(self.device) 61 | 62 | # Get train data loader 63 | train_loader, _ = dataset.loaders(batch_size=self.batch_size, num_workers=self.n_jobs_dataloader, shuffle_train=False, shuffle_test=False) 64 | 65 | # Set optimizer (Adam optimizer for now) 66 | if self.fix_encoder: 67 | params = [param for name, param in net.named_parameters() if 'nvp' in name] 68 | param_names = [name for name, param in net.named_parameters() if 'nvp' in name] 69 | print('Optimizing params: ' + str(param_names)) 70 | else: 71 | params = net.parameters() 72 | optimizer = optim.Adam(params, lr=self.lr, weight_decay=self.weight_decay) 73 | 74 | # Set learning rate scheduler 75 | scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=self.lr_milestones, gamma=0.1) 76 | 77 | # Initialize hypersphere center c (if c not loaded) 78 | if self.c is None: 79 | logger.info('Initializing center c...') 80 | self.c = self.init_center_c(train_loader, net) 81 | logger.info('Center c initialized.') 82 | 83 | # Set prior with current center. 
        self.prior = torch.distributions.MultivariateNormal(self.c,
                                                             torch.eye(net.rep_dim, device=self.device))

        # Training
        best_auc = 0
        logger.info('Starting training...')
        start_time = time.time()
        for epoch in range(self.n_epochs):
            net.train()

            if epoch in self.lr_milestones:
                logger.info(' LR scheduler: new learning rate is %g' % float(scheduler.get_lr()[0]))

            loss_epoch = 0.0
            n_batches = 0
            epoch_start_time = time.time()
            for data in train_loader:
                inputs, _, _ = data
                inputs = inputs.to(self.device)

                if augment:
                    # Adversarial augmentation.
                    inputs.requires_grad = True
                    outputs = net(inputs)
                    if self.objective == 'real-nvp':
                        z, log_det_J = outputs
                        log_prob_z = self.prior.log_prob(z)
                        loss = -log_prob_z.mean() - log_det_J.mean()
                    else:
                        dist = torch.sum((outputs - self.c) ** 2, dim=1)
                        if self.objective == 'soft-boundary':
                            scores = dist - self.R ** 2
                            loss = self.R ** 2 + (1 / self.nu) * torch.mean(torch.max(torch.zeros_like(scores), scores))
                        elif self.objective == 'one-class':
                            loss = torch.mean(dist)
                    loss.backward()

                    # Step each input element by EPS_AUGMENT in the direction that increases the loss.
                    r_adv = EPS_AUGMENT * inputs.grad.sign()
                    inputs.requires_grad = False
                    inputs_aug = inputs + r_adv
                else:
                    inputs_aug = inputs

                # Debug visualization, disabled together with the imshow calls it belongs to.
                # cv2.imshow('rgb',inputs_aug[0,0:3,...].squeeze().cpu().numpy().transpose((1,2,0)))
                # cv2.imshow('ir',inputs_aug[3,0,...].squeeze().cpu().numpy())
                # cv2.imshow('depth',(inputs_aug[4,0,...]).squeeze().cpu().numpy())
                # cv2.waitKey(0)

                # Zero the network parameter gradients
                optimizer.zero_grad()

                # Update network parameters via backpropagation: forward + backward + optimize
                outputs = net(inputs_aug)
                if self.objective == 'real-nvp':
                    z, log_det_J = outputs
                    log_prob_z = self.prior.log_prob(z)
                    loss = -log_prob_z.mean() - log_det_J.mean()
else: 142 | dist = torch.sum((outputs - self.c) ** 2, dim=1) 143 | if self.objective == 'soft-boundary': 144 | scores = dist - self.R ** 2 145 | loss = self.R ** 2 + (1 / self.nu) * torch.mean(torch.max(torch.zeros_like(scores), scores)) 146 | else: # 'one-class' 147 | loss = torch.mean(dist) 148 | loss.backward() 149 | optimizer.step() 150 | 151 | # Update hypersphere radius R on mini-batch distances 152 | if (self.objective == 'soft-boundary') and (epoch >= self.warm_up_n_epochs): 153 | self.R.data = torch.tensor(get_radius(dist, self.nu), device=self.device) 154 | 155 | loss_epoch += loss.item() 156 | n_batches += 1 157 | 158 | scheduler.step() 159 | 160 | # Test on val set. 161 | self.test(dataset, net, verbose=False) 162 | if self.test_auc > best_auc: 163 | best_auc = self.test_auc 164 | self.best_weights = {} 165 | for key in net.state_dict(): 166 | self.best_weights[key] = net.state_dict()[key].cpu() 167 | self.best_R = self.R.detach().clone() 168 | 169 | avg_loss = loss_epoch / n_batches 170 | 171 | # Do Tensorboard logging. 
172 | self.writer.add_scalar('anomaly/loss', avg_loss, epoch) 173 | self.writer.add_scalar('anomaly/auc', self.test_auc, epoch) 174 | self.writer.add_scalar('anomaly/loss_val_pos', self.val_loss[0], epoch) 175 | self.writer.add_scalar('anomaly/loss_val_neg', self.val_loss[1], epoch) 176 | 177 | # log epoch statistics 178 | epoch_train_time = time.time() - epoch_start_time 179 | logger.info(' Epoch {}/{}\t Time: {:.3f}\t Loss: {:.8f}\t AUC: {:.8f}' 180 | .format(epoch + 1, self.n_epochs, epoch_train_time, avg_loss, self.test_auc)) 181 | 182 | self.train_time = time.time() - start_time 183 | logger.info('Training time: %.3f' % self.train_time) 184 | 185 | logger.info('Finished training.') 186 | 187 | return net 188 | 189 | def test(self, dataset: BaseADDataset, net: BaseNet, verbose=True): 190 | logger = logging.getLogger() 191 | 192 | # Set device for network 193 | net = net.to(self.device) 194 | 195 | # Get test data loader 196 | _, test_loader = dataset.loaders(batch_size=self.batch_size, num_workers=self.n_jobs_dataloader) 197 | 198 | # Testing 199 | if verbose: 200 | logger.info('Starting testing...') 201 | start_time = time.time() 202 | idx_label_score = [] 203 | net.eval() 204 | with torch.no_grad(): 205 | for data in test_loader: 206 | inputs, labels, idx = data 207 | inputs = inputs.to(self.device) 208 | outputs = net(inputs) 209 | if self.objective == 'real-nvp': 210 | z, log_det_J = outputs 211 | log_prob_z = self.prior.log_prob(z.squeeze()) 212 | scores = -log_prob_z - log_det_J.squeeze() # We negate here, because we negate again later. 
213 | else: 214 | dist = torch.sum((outputs.squeeze() - self.c) ** 2, dim=1) 215 | if self.objective == 'soft-boundary': 216 | scores = dist - self.R ** 2 217 | elif self.objective == 'one-class': 218 | scores = dist 219 | 220 | # Save triples of (idx, label, score) in a list 221 | idx_label_score += list(zip(idx.cpu().data.numpy().tolist(), 222 | labels.cpu().data.numpy().tolist(), 223 | scores.cpu().data.numpy().tolist())) 224 | 225 | # Compute pos and neg loss. 226 | n_pos = 0; loss_pos = 0 227 | n_neg = 0; loss_neg = 0 228 | for idx, label, score in idx_label_score: 229 | if label: 230 | n_pos += 1 231 | loss_pos += score 232 | else: 233 | n_neg += 1 234 | loss_neg += score 235 | loss_pos /= n_pos 236 | loss_neg /= n_neg 237 | 238 | 239 | if verbose: 240 | self.test_time = time.time() - start_time 241 | logger.info('Testing time: %.3f' % self.test_time) 242 | 243 | self.test_scores = idx_label_score 244 | self.val_loss = (loss_pos, loss_neg) 245 | 246 | # Compute AUC 247 | _, labels, scores = zip(*idx_label_score) 248 | labels = np.array(labels) 249 | scores = np.array(scores) 250 | # Invert scores because we compute error, not class probability. 251 | scores = -scores 252 | 253 | self.test_auc = roc_auc_score(labels, scores) 254 | if verbose: 255 | logger.info('Test set AUC: {:.2f}%'.format(100. * self.test_auc)) 256 | 257 | # Plot ROC curve. 
258 | roc_x, roc_y, roc_thres = roc_curve(labels, scores) 259 | thres_best = computeMaxYoudensIndex(roc_x, roc_y, roc_thres) 260 | self.test_fpr5 = computeTprAt5Fpr(roc_x, roc_y) 261 | if self.objective == 'real-nvp': 262 | thres_best = -thres_best 263 | self.thres = thres_best 264 | fig, ax = plt.subplots() 265 | ax.set(xlabel='FPR', ylabel='TPR', title='ROC curve: {:.2f}% AUC'.format(100.*self.test_auc)) 266 | ax.plot(roc_x, roc_y) 267 | ax.grid() 268 | self.roc_plt = (fig, ax) 269 | # plt.show() 270 | plt.close() 271 | 272 | logger.info('Finished testing.') 273 | 274 | def init_center_c(self, train_loader: DataLoader, net: BaseNet, eps=0.1): 275 | """Initialize hypersphere center c as the mean from an initial forward pass on the data.""" 276 | n_samples = 0 277 | c = torch.zeros(net.rep_dim, device=self.device) 278 | 279 | net.eval() 280 | with torch.no_grad(): 281 | for data in train_loader: 282 | # get the inputs of the batch 283 | inputs, _, _ = data 284 | inputs = inputs.to(self.device) 285 | outputs = net(inputs) 286 | if isinstance(outputs, tuple): 287 | outputs = outputs[0] 288 | n_samples += outputs.shape[0] 289 | c += torch.sum(outputs, dim=0).squeeze() 290 | 291 | c /= n_samples 292 | 293 | # If c_i is too close to 0, set to +-eps. Reason: a zero unit can be trivially matched with zero weights. 294 | c[(abs(c) < eps) & (c < 0)] = -eps 295 | c[(abs(c) < eps) & (c > 0)] = eps 296 | 297 | return c 298 | 299 | def testInputSensitivity(self, dataset: BaseADDataset, net: BaseNet): 300 | # Set device for network 301 | net = net.to(self.device) 302 | 303 | # Get test data loader 304 | _, test_loader = dataset.loaders(batch_size=self.batch_size, num_workers=self.n_jobs_dataloader) 305 | 306 | net.eval() 307 | prop = GuidedBackProp(net) 308 | 309 | for inputs, labels, idx in test_loader: 310 | inputs = inputs.to(self.device) 311 | # Enable gradient wrt input. 312 | inputs.requires_grad = True 313 | outputs = net(inputs).squeeze() 314 | # Do backward pass. 
315 | outputs.mean().backward(gradient=torch.ones(outputs.mean().size(), device=self.device)) 316 | # Average over all but channels. 317 | sensitivity = inputs.grad 318 | sensitivity = sensitivity.abs().mean((0, 2, 3)) 319 | print(sensitivity/sensitivity.norm()) 320 | 321 | 322 | 323 | 324 | 325 | def get_radius(dist: torch.Tensor, nu: float): 326 | """Optimally solve for radius R via the (1-nu)-quantile of distances.""" 327 | return np.quantile(np.sqrt(dist.clone().data.cpu().numpy()), 1 - nu) 328 | -------------------------------------------------------------------------------- /anomaly_detection/utils/__init__.py: -------------------------------------------------------------------------------- 1 | from .config import Config 2 | -------------------------------------------------------------------------------- /anomaly_detection/utils/config.py: -------------------------------------------------------------------------------- 1 | import json 2 | 3 | 4 | class Config(object): 5 | """Base class for experimental setting/configuration.""" 6 | 7 | def __init__(self, settings): 8 | self.settings = settings 9 | 10 | def load_config(self, import_json): 11 | """Load settings dict from import_json (path/filename.json) JSON-file.""" 12 | 13 | with open(import_json, 'r') as fp: 14 | settings = json.load(fp) 15 | 16 | for key, value in settings.items(): 17 | self.settings[key] = value 18 | 19 | def save_config(self, export_json): 20 | """Save settings dict to export_json (path/filename.json) JSON-file.""" 21 | 22 | with open(export_json, 'w') as fp: 23 | json.dump(self.settings, fp) 24 | -------------------------------------------------------------------------------- /anomaly_detection/utils/eval_functions.py: -------------------------------------------------------------------------------- 1 | def computeMaxYoudensIndex(fpr, tpr, thr): 2 | max_val = 0 3 | max_ind = None 4 | for i, (fp, tp) in enumerate(zip(fpr, tpr)): 5 | cur_val = tp + (1-fp) 6 | if cur_val > max_val: 7 | 
max_val = cur_val 8 | max_ind = i 9 | print('FPR: ' + str(fpr[max_ind])) 10 | print('TPR: ' + str(tpr[max_ind])) 11 | print('Thr: ' + str(thr[max_ind])) 12 | return thr[max_ind] 13 | 14 | 15 | 16 | def computeTprAt5Fpr(fpr, tpr): 17 | return computeTprAtXFpr(fpr, tpr, 0.05) 18 | 19 | 20 | 21 | def computeTprAtXFpr(fpr, tpr, fpr_thres): 22 | crit_ind = 0 23 | for ind, (fp, tp) in enumerate(zip(fpr, tpr)): 24 | if fp > fpr_thres: 25 | crit_ind = ind 26 | break 27 | if crit_ind == 0: 28 | fp_lower = 0.0 29 | tp_lower = 0.0 30 | else: 31 | fp_lower = fpr[crit_ind-1] 32 | tp_lower = tpr[crit_ind-1] 33 | fp_upper = fpr[crit_ind] 34 | tp_upper = tpr[crit_ind] 35 | 36 | tp_crit = tp_lower + (tp_upper-tp_lower)*(fpr_thres-fp_lower)/(fp_upper-fp_lower) 37 | return tp_crit 38 | 39 | -------------------------------------------------------------------------------- /anomaly_detection/utils/generate_incremental_table.py: -------------------------------------------------------------------------------- 1 | from generate_table import printTable 2 | import numpy as np 3 | 4 | FILE_AUC='anomaly_detection/log_incremental_rgbd_10/auc.npy' 5 | FILE_FPR5='anomaly_detection/log_incremental_rgbd_10/fpr5.npy' 6 | 7 | train_dat = [ 8 | 'Base', # 0 9 | '+Sun', # 1 10 | '+Twilight', # 2 11 | '+Rain', # 3 12 | ] 13 | 14 | val_dat = [ 15 | 'Sun', # 0 16 | 'Fire', # 1 17 | 'Rain', # 2 18 | 'Wet', # 3 19 | 'Twilight', # 4 20 | ] 21 | 22 | val_order = [ 23 | 0,1,4,2,3 24 | ] 25 | 26 | 27 | auc = np.transpose(np.load(FILE_AUC), (0,2,1)) 28 | fpr5 = np.transpose(np.load(FILE_FPR5), (0,2,1)) 29 | 30 | printTable(auc, val_dat, train_dat) 31 | 32 | print('\n\n\n\n') 33 | 34 | printTable(fpr5, val_dat, train_dat) 35 | -------------------------------------------------------------------------------- /anomaly_detection/utils/generate_overview_table.py: -------------------------------------------------------------------------------- 1 | from generate_table import printTable 2 | import numpy as np 3 | 4 | 
FILE='anomaly_detection/log/9/auc.npy' 5 | 6 | methods = [ 7 | 'NVP Fixed\\\\Features', # 0 8 | 'SVDD Soft\\\\Pretrained', # 1 9 | 'SVDD Hard\\\\Pretrained', # 2 10 | 'NVP\\\\No Pretraining', # 3 11 | 'NVP\\\\Pretrained', # 4 12 | 'SVDD Hard\\\\No Pretrained', # 5 13 | 'SVDD Soft\\\\No Pretrained', # 6 14 | 'Autoencoder' # 7 15 | ] 16 | 17 | method_order = [ 18 | 7, 6, 5, 1, 2, 3, 4, 0 19 | ] 20 | 21 | modalities = [ 22 | 'RGB+G+A', # 0 23 | 'RGB', # 1 24 | 'IR', # 2 25 | 'D', # 3 26 | 'RGB+D', # 4 27 | 'IR+D', # 5 28 | 'RGB+IR+D', # 6 29 | 'RGB+IR+D+N', # 7 30 | 'RGB+IR+G+A', # 8 31 | 'RGB+D+N', # 9 32 | 'RGB+N', # 10 33 | 'RGB+G', # 11 34 | 'RGB+A', # 12 35 | 'D+N', # 13 36 | 'G+A', # 14 37 | 'RGB+D+A', # 15 38 | 'RGB+G+N', # 16 39 | 'D+A', # 17 40 | ] 41 | 42 | mod_order = [ 43 | 1, 3, 4, 11, 10, 12, 13, 17, 14, 9, 15, 16, 0, # No IR 44 | 2, 5, 6, 7, 8 #IR 45 | ] 46 | 47 | val = np.load(FILE) 48 | 49 | printTable(val, methods, modalities, lambda x: 'IR' not in x, method_order, mod_order) 50 | 51 | print('\n\n\n\n') 52 | 53 | printTable(val, methods, modalities, lambda x: 'IR' in x, method_order, mod_order) 54 | -------------------------------------------------------------------------------- /anomaly_detection/utils/generate_table.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | def printTable(val, col_val, row_val, func=None, col_order=None, row_order=None): 4 | if col_order is None: 5 | col_order = [i for i in range(len(col_val))] 6 | if row_order is None: 7 | row_order = [i for i in range(len(row_val))] 8 | if func is None: 9 | func = lambda x: True 10 | 11 | for i in range(val.shape[0]): 12 | if val[i,0,0] == 0: 13 | val = val[0:i] 14 | break 15 | mean = np.mean(val, axis=0) 16 | std = np.std(val, axis=0) 17 | 18 | # Find max. 
19 | max_ind = np.unravel_index(mean.argmax(), mean.shape) 20 | 21 | def isMax(col_ind, row_ind): 22 | return col_ind == max_ind[0] and row_ind == max_ind[1] 23 | 24 | row_mean = np.mean(mean, axis=0) 25 | col_mean = np.mean(mean, axis=1) 26 | 27 | tab = '' 28 | for col_ind in col_order: 29 | tab += ' & \\pbox{20cm}{' + col_val[col_ind] + '}' 30 | tab += '\\\\ \\hline\\hline' 31 | print (tab) 32 | grey = True 33 | for row_ind in row_order: 34 | if func(row_val[row_ind]): 35 | tab = row_val[row_ind] 36 | for col_ind in col_order: 37 | tab += ' & ' 38 | if isMax(col_ind, row_ind): 39 | tab += '\\textbf{' 40 | tab += '{:.2f}'.format(mean[col_ind, row_ind]*100) + '$\\pm$' 41 | tab += '{:.2f}'.format(std[col_ind, row_ind]*100) 42 | if isMax(col_ind, row_ind): 43 | tab += '}' 44 | # tab += '{:.2f}'.format(row_mean[row_ind]*100) + ' \\\\' 45 | tab += ' \\\\' 46 | if grey: 47 | tab += ' \\rowcolor{lightgray}' 48 | grey = False 49 | else: 50 | grey = True 51 | print(tab) 52 | print('\\hline\\hline') 53 | # tab = 'Avg' 54 | # for col_ind in col_order: 55 | # tab += ' & {:.2f}'.format(col_mean[col_ind]*100) 56 | # tab += '\\\\' 57 | # print(tab) 58 | -------------------------------------------------------------------------------- /anomaly_detection/utils/visualization/plot_images_grid.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import matplotlib 3 | matplotlib.use('Agg') # or 'PS', 'PDF', 'SVG' 4 | 5 | import matplotlib.pyplot as plt 6 | import numpy as np 7 | from torchvision.utils import make_grid 8 | 9 | 10 | def plot_images_grid(x: torch.tensor, export_img, title: str = '', nrow=8, padding=2, normalize=False, pad_value=0): 11 | """Plot 4D Tensor of images of shape (B x C x H x W) as a grid.""" 12 | 13 | grid = make_grid(x, nrow=nrow, padding=padding, normalize=normalize, pad_value=pad_value) 14 | npgrid = grid.cpu().numpy() 15 | 16 | plt.imshow(np.transpose(npgrid, (1, 2, 0)), interpolation='nearest') 17 | 18 | 
ax = plt.gca() 19 | ax.xaxis.set_visible(False) 20 | ax.yaxis.set_visible(False) 21 | 22 | if not (title == ''): 23 | plt.title(title) 24 | 25 | plt.savefig(export_img, bbox_inches='tight', pad_inches=0.1) 26 | plt.clf() 27 | -------------------------------------------------------------------------------- /dataset_labeller.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import matplotlib.pyplot as plt 4 | from matplotlib.widgets import Slider, Button 5 | from argparse import ArgumentParser 6 | from torch.utils.data import DataLoader 7 | import cv2 8 | 9 | from anomaly_detection.datasets.selfsupervised_images import SelfSupervisedDataset 10 | 11 | 12 | 13 | SQ_SIZE=32 14 | 15 | 16 | 17 | def getPermutedNumpyPatchesFromPytorch(images, x_ind, y_ind): 18 | patches = [] 19 | for img in images: 20 | img = img.numpy() 21 | patch = img[0, :, x_ind[0]:x_ind[1], y_ind[0]:y_ind[1]] 22 | if len(patch.shape) == 3: 23 | patch = np.transpose(patch, (1, 2, 0)) 24 | else: 25 | raise Exception('I don\'t think this should ever happen') 26 | patches.append(np.ascontiguousarray(patch)) 27 | 28 | return patches 29 | 30 | 31 | 32 | def makeIndexValid(x_ind, y_ind, img_shape): 33 | x_shape = img_shape[0] 34 | y_shape = img_shape[1] 35 | if x_ind[0] < 0: 36 | x_ind -= x_ind[0] 37 | if x_ind[1] > x_shape: 38 | x_ind -= (x_ind[1]-x_shape) 39 | 40 | if y_ind[0] < 0: 41 | y_ind -= y_ind[0] 42 | if y_ind[1] > y_shape: 43 | y_ind -= (y_ind[1]-y_shape) 44 | 45 | 46 | 47 | def savePatches(patches, step): 48 | cv2.imwrite(os.path.join(args.outdir, "{:05d}".format(step)+'_rgb.png'), patches[0]*255) 49 | 50 | cv2.imwrite(os.path.join(args.outdir, "{:05d}".format(step)+'_ir.png'), patches[1]*255) 51 | 52 | patches[2].dtype = np.uint8 53 | cv2.imwrite(os.path.join(args.outdir, "{:05d}".format(step)+'_depth.png'), patches[2]) 54 | 55 | patches[3].dtype = np.uint8 56 | cv2.imwrite(os.path.join(args.outdir, 
"{:05d}".format(step)+'_depth_3d_x.png'), patches[3]) 57 | 58 | patches[4].dtype = np.uint8 59 | cv2.imwrite(os.path.join(args.outdir, "{:05d}".format(step)+'_depth_3d_y.png'), patches[4]) 60 | 61 | patches[5].dtype = np.uint8 62 | cv2.imwrite(os.path.join(args.outdir, "{:05d}".format(step)+'_depth_3d_z.png'), patches[5]) 63 | 64 | patches[6].dtype = np.uint8 65 | cv2.imwrite(os.path.join(args.outdir, "{:05d}".format(step)+'_normals_x.png'), patches[6]) 66 | patches[7].dtype = np.uint8 67 | cv2.imwrite(os.path.join(args.outdir, "{:05d}".format(step)+'_normals_y.png'), patches[7]) 68 | patches[8].dtype = np.uint8 69 | cv2.imwrite(os.path.join(args.outdir, "{:05d}".format(step)+'_normals_z.png'), patches[8]) 70 | 71 | 72 | 73 | def label(args): 74 | dataset = SelfSupervisedDataset(args.datadir, file_format='csv', subsample=args.subsample, tensor_type='float') 75 | loader = DataLoader(dataset, shuffle=False) 76 | 77 | if not os.path.exists(args.outdir): 78 | os.makedirs(args.outdir) 79 | print('Create directory: ' + args.outdir) 80 | 81 | n_steps = len(loader) 82 | 83 | if args.manually: 84 | for step, (images, labels) in enumerate(loader): 85 | print(str(step+1) + '/' + str(n_steps),end='\r') 86 | 87 | label_mask = labels != 0 88 | img_footholds = images[0].clone() 89 | img_footholds.masked_fill_(label_mask, 0.0) 90 | img_footholds = img_footholds[0].permute(1,2,0).numpy() 91 | label_mask = label_mask.squeeze().cpu().numpy() 92 | 93 | fig = plt.figure(figsize=(10, 7), dpi=100) 94 | ax_img = plt.axes([0.05, 0.15, 0.8, 0.8]) 95 | img_plt = ax_img.imshow(img_footholds) 96 | ax_patch = plt.axes([0.875, 0.45, 0.1, 0.1]) 97 | 98 | def onclick(event): 99 | # Crop patch. 100 | # Event data gives images coordinates, whereas we save in matrix coordinates. 
101 | # That's why we swap x and y 102 | y_ind = np.array([int(event.xdata)-SQ_SIZE//2, int(event.xdata)+SQ_SIZE//2]) 103 | x_ind = np.array([int(event.ydata)-SQ_SIZE//2, int(event.ydata)+SQ_SIZE//2]) 104 | makeIndexValid(x_ind, y_ind, label_mask.shape) 105 | patches = getPermutedNumpyPatchesFromPytorch(images, x_ind, y_ind) 106 | # Draw patch. 107 | # ax_patch.imshow(patch_rgb) 108 | # fig.canvas.draw_idle() 109 | # Save patches. 110 | savePatches(patches, step) 111 | plt.close(fig) 112 | 113 | fig.canvas.mpl_connect('button_release_event', onclick) 114 | 115 | plt.show() 116 | else: 117 | for step, (images, labels) in enumerate(loader): 118 | print(str(step+1) + '/' + str(n_steps),end='\r') 119 | 120 | label_mask = (labels!=0).squeeze().numpy() 121 | foot_ind = np.where(label_mask) 122 | # Make sure we have a foothold in the image. 123 | if foot_ind[0].shape[0] == 0: 124 | print('Encountered empty foothold mask') 125 | continue 126 | 127 | samp_ind = int(np.random.rand()*foot_ind[0].shape[0]) 128 | indices = np.array([foot_ind[0][samp_ind], foot_ind[1][samp_ind]]) 129 | 130 | x_ind = np.array([int(indices[0]-SQ_SIZE/2), int(indices[0]+SQ_SIZE/2)]) 131 | y_ind = np.array([int(indices[1]-SQ_SIZE/2), int(indices[1]+SQ_SIZE/2)]) 132 | makeIndexValid(x_ind, y_ind, label_mask.shape) 133 | 134 | patches = getPermutedNumpyPatchesFromPytorch(images, x_ind, y_ind) 135 | 136 | # Save everything in the appropriate format. 
137 | savePatches(patches, step) 138 | 139 | 140 | 141 | if __name__ == '__main__': 142 | parser = ArgumentParser() 143 | parser.add_argument('--datadir', required=True, help='Directory for dataset') 144 | parser.add_argument('--outdir', required=True, help='Output directory for patches') 145 | parser.add_argument('--subsample', type=int, default=1, help='Only use every nth image of the dataset') 146 | parser.add_argument('--manually', action='store_true', default=False) 147 | 148 | args = parser.parse_args() 149 | 150 | label(args) -------------------------------------------------------------------------------- /evaluate_dense_svdd.py: -------------------------------------------------------------------------------- 1 | import click 2 | import os 3 | import random 4 | import time 5 | import numpy as np 6 | import torch 7 | import torch.nn as nn 8 | import torch.nn.functional as F 9 | import math 10 | import matplotlib.pyplot as plt 11 | from matplotlib.widgets import Slider, Button 12 | import cv2 13 | 14 | from argparse import ArgumentParser 15 | 16 | from torch.optim import Adam, lr_scheduler 17 | from torch.utils.data import DataLoader 18 | from tensorboardX import SummaryWriter 19 | 20 | from anomaly_detection.datasets.selfsupervised_images import SelfSupervisedDataset 21 | from anomaly_detection.networks.main import build_network, build_autoencoder 22 | from anomaly_detection.utils.config import Config 23 | 24 | 25 | MAX = 1.0 26 | MIN = 0.5 27 | INVALID = -123 28 | 29 | 30 | 31 | class Upsampler: 32 | def __init__(self, shape): 33 | # self.upsampler = nn.Upsample(shape) 34 | self.shape = shape 35 | self.upsampler = nn.Upsample(scale_factor=4) 36 | 37 | def getPadding(self, tensor): 38 | cur_shape = tensor.shape 39 | w_dif = self.shape[-1] - tensor.shape[-1] 40 | pad_left = w_dif//2 41 | pad_right = w_dif - pad_left 42 | h_dif = self.shape[-2] - tensor.shape[-2] 43 | pad_top = h_dif//2 44 | pad_bot = h_dif - pad_top 45 | return (pad_left, pad_right, pad_top, 
pad_bot) 46 | 47 | def __call__(self, tensor): 48 | # return self.upsampler(tensor.unsqueeze(0).unsqueeze(0)).squeeze() 49 | n_missing_dim = 0 50 | while len(tensor.shape) < 4: 51 | tensor = tensor.unsqueeze(0) 52 | n_missing_dim = n_missing_dim + 1 53 | tensor = self.upsampler(tensor) 54 | pad = self.getPadding(tensor) 55 | mean_val = tensor.mean() 56 | tensor = F.pad(tensor, pad, mode='constant', value=mean_val) 57 | for i in range(0, n_missing_dim): 58 | tensor = tensor.squeeze(0) 59 | return tensor, mean_val 60 | 61 | 62 | 63 | def getOverlayImage(image, mask, invalid_mask): 64 | mask = mask.squeeze().unsqueeze(-1) 65 | invalid_mask = invalid_mask.squeeze().unsqueeze(-1) 66 | 67 | mask_red = mask&(~invalid_mask) 68 | mask_green = ~mask&(~invalid_mask) 69 | 70 | image_out = torch.asin(image.clone()-1)/(math.pi/2)+1.0 71 | image_out.masked_scatter_(mask_red, (image_out*torch.Tensor(np.array([MAX,MIN,MIN]))).masked_select(mask_red)) 72 | image_out.masked_scatter_(mask_green, (image_out*torch.Tensor(np.array([MIN,MAX,MIN]))).masked_select(mask_green)) 73 | return image_out 74 | 75 | 76 | 77 | def computeObjective(outputs, objective, center): 78 | if objective == 'real-nvp': 79 | log_det_J = outputs[1] 80 | outputs = outputs[0] 81 | if objective in ['real-nvp']: 82 | # Reshape so that we can compute log_probs. 
83 | outs_reshape = outputs.permute(0,2,3,1).reshape(-1, outputs.shape[1]) 84 | prior = torch.distributions.MultivariateNormal(center.squeeze(), 85 | torch.eye(outputs.shape[1], 86 | device=outputs.device)) 87 | log_prob_reshape = prior.log_prob(outs_reshape) 88 | log_prob = log_prob_reshape.reshape(outputs.shape[0], 1, outputs.shape[2], outputs.shape[3]) 89 | if objective == 'real-nvp': 90 | print(log_prob.shape) 91 | print(log_det_J.shape) 92 | log_prob = log_prob + log_det_J 93 | return -log_prob 94 | elif objective in ['one-class', 'soft-boundary']: 95 | return ((outputs - center)**2).mean(dim=1) 96 | else: 97 | raise Exception('Unknown objective function') 98 | 99 | 100 | 101 | 102 | def train(cfg, model): 103 | assert os.path.exists(cfg.settings['data_path']), "Error: datadir (dataset directory) could not be loaded" 104 | 105 | # Dataset stuff. 106 | dataset = SelfSupervisedDataset(cfg.settings['data_path'], file_format='csv', subsample=cfg.settings['subsample'], tensor_type='float') 107 | loader = DataLoader(dataset, num_workers=0, batch_size=cfg.settings['batch_size'], shuffle=False) 108 | 109 | # Figure out base latent weight assuming dense input mask. 110 | in_shape = dataset[0][0][0].shape 111 | 112 | with torch.no_grad(): 113 | # Determine feature center. 114 | feature_center = cfg.settings['center'] 115 | print('Features center is') 116 | print(feature_center) 117 | # Compute helper matrix with feature center in channel dimension. 118 | feature_center_2d = feature_center.unsqueeze(0).unsqueeze(2).unsqueeze(3) 119 | 120 | # Check for CUDA. 121 | if torch.cuda.is_available(): 122 | device = 'cuda:0' 123 | print('Using CUDA') 124 | else: 125 | device = 'cpu' 126 | print('Using CPU') 127 | model = model.to(device) 128 | feature_center_2d = feature_center_2d.to(device) 129 | 130 | upsampler = Upsampler((in_shape[-2], in_shape[-1])) 131 | 132 | last_loss_masks = [] 133 | 134 | # Get threshold. 
135 | THRES = (cfg.settings['radius']-10, cfg.settings['radius'], cfg.settings['radius']+10) 136 | 137 | ####### Training #################### 138 | 139 | epoch_steps = len(loader) 140 | for step, ((images_rgb, images_ir, images_depth, images_depth_3d_x, images_depth_3d_y, images_depth_3d_z, images_normals_x, images_normals_y, images_normals_z), labels) in enumerate(loader): 141 | print(str(step+1) + '/' + str(epoch_steps),end='\r') 142 | 143 | images_rgb = images_rgb.to(device) 144 | images_ir = images_ir.to(device) 145 | images_depth = images_depth.to(device) 146 | images_depth_3d_x = images_depth_3d_x.to(device) 147 | images_depth_3d_y = images_depth_3d_y.to(device) 148 | images_depth_3d_z = images_depth_3d_z.to(device) 149 | images_normals_x = images_normals_x.to(device) 150 | images_normals_y = images_normals_y.to(device) 151 | images_normals_z = images_normals_z.to(device) 152 | labels = labels.to(device) 153 | 154 | # Normalize depth. 155 | images_depth = images_depth / 10.0 156 | images_depth_3d_x = images_depth_3d_x / 10.0 157 | images_depth_3d_y = images_depth_3d_y / 10.0 158 | images_depth_3d_z = images_depth_3d_z / 10.0 159 | 160 | # Reconstruct the actual inputs. 
161 | images_normals_horz = (images_normals_x**2 + images_normals_y**2)**0.5 162 | images_depth_3d_horz = (images_depth_3d_x**2 + images_depth_3d_y**2)**0.5 163 | images_depth_3d = torch.cat((images_depth_3d_horz, images_depth_3d_z), dim=1) 164 | images_normals = torch.cat((images_normals_horz, images_normals_z),dim=1) 165 | images_normal_angle = torch.atan2(images_normals_z, images_normals_horz)/np.pi 166 | 167 | label_mask = (labels != 0) 168 | 169 | images = () 170 | if cfg.settings['rgb']: 171 | images += (images_rgb,) 172 | if cfg.settings['ir']: 173 | images += (images_ir,) 174 | if cfg.settings['depth']: 175 | images += (images_depth,) 176 | if cfg.settings['depth_3d']: 177 | images += (images_depth_3d,) 178 | if cfg.settings['normals']: 179 | images += (images_normals,) 180 | if cfg.settings['normal_angle']: 181 | images += (images_normal_angle,) 182 | images_in = torch.cat(images, dim=1) 183 | 184 | features = model(images_in) 185 | 186 | if cfg.settings['objective'] == 'ae': 187 | loss_unmasked = ((features - images_in)**2).sum(dim=1) 188 | loss_unmasked = F.avg_pool2d(loss_unmasked, 32, 1)*32 189 | else: 190 | loss_unmasked = computeObjective(features, 191 | cfg.settings['objective'], 192 | feature_center_2d) 193 | 194 | # Mask image with footholds. 195 | images_rgb.masked_fill_(label_mask, 0) 196 | 197 | # Get first batch element. 
198 | img = images_in[0,0:3].permute(1,2,0).cpu() 199 | img_np = img.numpy() 200 | 201 | img_ir = images_depth[0].expand(3, -1, -1).permute(1,2,0).cpu() 202 | img_ir_np = img_ir.numpy() 203 | 204 | if cfg.settings['objective'] == 'ae': 205 | img_depth = features[0,0:3].permute(1,2,0).cpu() 206 | else: 207 | img_depth = (images_normal_angle[0].expand(3, -1, -1)).permute(1,2,0).cpu() 208 | img_depth_np = img_depth.numpy() 209 | 210 | loss_img_orig = loss_unmasked[0].squeeze().cpu() 211 | print(loss_img_orig.shape) 212 | loss_img_orig, invalid_val = upsampler(loss_img_orig) 213 | loss_img_np = loss_img_orig.numpy() 214 | 215 | loss_mask = loss_img_orig > THRES[1] 216 | 217 | N_HIST=3 218 | 219 | if cfg.settings['temporal_filter']: 220 | loss_avg = loss_mask.clone() 221 | for mask in last_loss_masks: 222 | loss_avg &= mask 223 | last_loss_masks.append(loss_mask) 224 | if len(last_loss_masks) > N_HIST: 225 | last_loss_masks.pop(0) 226 | loss_mask = loss_avg 227 | 228 | invalid_mask = loss_img_orig == invalid_val 229 | 230 | img_overlay_np = getOverlayImage(img, loss_mask, invalid_mask).numpy() 231 | 232 | if cfg.settings['save_dir'] is not None: 233 | # Permute axes and flip RGB. 234 | print(img_overlay_np.shape) 235 | img_save = np.flip(img_overlay_np, axis=2) 236 | file_name = "{:05d}".format(step)+'_inf.png' 237 | cv2.imwrite(os.path.join(cfg.settings['save_dir'], file_name), img_save*255) 238 | 239 | else: 240 | # Set up plot with slider. 
241 | fig = plt.figure(figsize=(24, 13), dpi=100) 242 | 243 | ax_loss_img = plt.axes([0.15, 0.1, 0.3, 0.4]) 244 | loss_img_plt = ax_loss_img.imshow(loss_img_orig.numpy()) 245 | 246 | ax_img = plt.axes([0.025, 0.55, 0.3, 0.4]) 247 | img_plt = ax_img.imshow(img_np) 248 | 249 | ax_img_ir = plt.axes([0.35, 0.55, 0.3, 0.4]) 250 | img_ir_plt = ax_img_ir.imshow(img_ir_np) 251 | 252 | ax_img_depth = plt.axes([0.675, 0.55, 0.3, 0.4]) 253 | img_depth_plt = ax_img_depth.imshow(img_depth_np) 254 | 255 | ax_img_overlay = plt.axes([0.55, 0.1, 0.3, 0.4]) 256 | 257 | img_overlay_plt = ax_img_overlay.imshow(img_overlay_np) 258 | 259 | ax_thres = plt.axes([0.25, 0.025, 0.65, 0.03], facecolor='lightgoldenrodyellow') 260 | s_thres = Slider(ax_thres, 'Threshold', THRES[0], THRES[2], valinit=THRES[1], valstep=(THRES[2]-THRES[0])/1000) 261 | 262 | def update(val): 263 | threshold_sq = s_thres.val 264 | loss_img = (loss_img_orig > threshold_sq) 265 | loss_img_np = loss_img_orig.numpy() 266 | loss_img_plt.set_data(loss_img_np) 267 | img_overlay_plt.set_data(getOverlayImage(img, loss_img, invalid_mask).numpy()) 268 | fig.canvas.draw_idle() 269 | s_thres.on_changed(update) 270 | 271 | plt.show() 272 | 273 | 274 | 275 | @click.command() 276 | @click.argument('net_name', type=click.Choice(['StackConvNet', 'fusion'])) 277 | @click.argument('data_path', type=click.Path(exists=True)) 278 | @click.option('--load_model', type=click.Path(exists=True), default=None, 279 | help='Model file path (default: None).') 280 | @click.option('--objective', type=click.Choice(['one-class', 'soft-boundary', 'real-nvp', 'ae']), default='one-class', 281 | help='Specify the objective ("one-class", "soft-boundary", "real-nvp" or "ae").') 282 | @click.option('--nu', type=float, default=0.1, help='Deep SVDD hyperparameter nu (must be 0 < nu <= 1).') 283 | @click.option('--batch_size', type=int, default=1, help='Batch size for mini-batch training.') 284 | @click.option('--subsample', type=int, default=1, help='Subsample dataset.') 285 | 
@click.option('--rgb', is_flag=True) 286 | @click.option('--ir', is_flag=True) 287 | @click.option('--depth', is_flag=True) 288 | @click.option('--depth_3d', is_flag=True) 289 | @click.option('--normals', is_flag=True) 290 | @click.option('--normal_angle', is_flag=True) 291 | @click.option('--batchnorm', is_flag=True) 292 | @click.option('--dropout', is_flag=True) 293 | @click.option('--save_dir', type=click.Path(exists=True), default=None) 294 | @click.option('--temporal_filter', default=None) 295 | def main(net_name, data_path, load_model, objective, nu, batch_size, subsample, 296 | rgb, ir, depth, depth_3d, normals, normal_angle, batchnorm, dropout, 297 | save_dir, temporal_filter): 298 | cfg = Config(locals().copy()) 299 | 300 | assert objective in ['one-class','soft-boundary','real-nvp','ae'] 301 | assert net_name in ['StackConvNet', 'fusion'] 302 | 303 | if objective == 'ae': 304 | model = build_autoencoder(net_name, cfg) 305 | else: 306 | model = build_network(net_name, cfg) 307 | 308 | test = torch.load(load_model) 309 | if objective == 'ae': 310 | out = model.load_state_dict(test['ae_net_dict'], strict=True) 311 | else: 312 | out = model.load_state_dict(test['net_dict'], strict=True) 313 | print('Loading weights output:') 314 | print(out) 315 | model = model.cpu() 316 | # Set radius. 317 | if objective == 'ae': 318 | cfg.settings['radius'] = test['thres_ae'] 319 | else: 320 | cfg.settings['radius'] = test['thres'] 321 | print('Decision radius is', cfg.settings['radius']) 322 | # Set center. 
323 | cfg.settings['center'] = torch.Tensor(test['c']) 324 | model.eval() 325 | 326 | train(cfg, model) 327 | 328 | 329 | 330 | if __name__ == '__main__': 331 | main() 332 | -------------------------------------------------------------------------------- /get_dataset.sh: -------------------------------------------------------------------------------- 1 | wget https://www.research-collection.ethz.ch/bitstream/handle/20.500.11850/389950/anomaly_navigation_dataset.zip 2 | unzip anomaly_navigation_dataset.zip -d data 3 | rm anomaly_navigation_dataset.zip 4 | -------------------------------------------------------------------------------- /train_all_combinations.py: -------------------------------------------------------------------------------- 1 | import click 2 | from train_anomaly_detection import main_func 3 | import numpy as np 4 | import os 5 | 6 | # Define base parameters. 7 | dataset_name = 'selfsupervised' 8 | net_name = 'StackConvNet' 9 | xp_path_base = 'log' 10 | data_path = 'data/full' 11 | train_folder = 'train' 12 | val_pos_folder = 'val/wangen_sun_3_pos' 13 | val_neg_folder = 'val/wangen_sun_3_neg' 14 | load_config = None 15 | load_model = None 16 | nu = 0.1 17 | device = 'cuda' 18 | seed = -1 19 | optimizer_name = 'adam' 20 | lr = 0.0001 21 | n_epochs = 150 22 | lr_milestone = (100,) 23 | batch_size = 200 24 | weight_decay = 0.5e-6 25 | ae_optimizer_name = 'adam' 26 | ae_lr = 0.0001 27 | ae_n_epochs = 350 28 | ae_lr_milestone = (250,) 29 | ae_batch_size = 200 30 | ae_weight_decay = 0.5e-6 31 | n_jobs_dataloader = 0 32 | normal_class = 1 33 | batchnorm = False 34 | dropout = False 35 | augment = False 36 | 37 | objectives = [ 38 | {'objective': 'real-nvp', 'pretrain': True, 'fix_encoder': True}, # 0 39 | {'objective': 'soft-boundary', 'pretrain': True, 'fix_encoder': False}, # 1 40 | {'objective': 'one-class', 'pretrain': True, 'fix_encoder': False}, # 2 41 | {'objective': 'real-nvp', 'pretrain': False, 'fix_encoder': False}, # 3 42 | {'objective': 
'real-nvp', 'pretrain': True, 'fix_encoder': False}, # 4 43 | {'objective': 'one-class', 'pretrain': False, 'fix_encoder': False}, # 5 44 | {'objective': 'soft-boundary', 'pretrain': False, 'fix_encoder': False} # 6 45 | ] 46 | 47 | modalities = [ 48 | {'rgb': True , 'ir': False, 'depth': False, 'depth_3d': True , 'normals': False, 'normal_angle': True }, 49 | {'rgb': True , 'ir': False, 'depth': False, 'depth_3d': False, 'normals': False, 'normal_angle': False}, 50 | {'rgb': False, 'ir': True , 'depth': False, 'depth_3d': False, 'normals': False, 'normal_angle': False}, 51 | {'rgb': False, 'ir': False, 'depth': True , 'depth_3d': False, 'normals': False, 'normal_angle': False}, 52 | {'rgb': True , 'ir': False, 'depth': True , 'depth_3d': False, 'normals': False, 'normal_angle': False}, 53 | {'rgb': False, 'ir': True , 'depth': True , 'depth_3d': False, 'normals': False, 'normal_angle': False}, 54 | {'rgb': True , 'ir': True , 'depth': True , 'depth_3d': False, 'normals': False, 'normal_angle': False}, 55 | {'rgb': True , 'ir': True , 'depth': True , 'depth_3d': False, 'normals': True , 'normal_angle': False}, 56 | {'rgb': True , 'ir': True , 'depth': False, 'depth_3d': True , 'normals': False, 'normal_angle': True }, 57 | {'rgb': True , 'ir': False, 'depth': True , 'depth_3d': False, 'normals': True , 'normal_angle': False}, 58 | {'rgb': True , 'ir': False, 'depth': False, 'depth_3d': False, 'normals': True , 'normal_angle': False}, 59 | {'rgb': True , 'ir': False, 'depth': False, 'depth_3d': True , 'normals': False, 'normal_angle': False}, 60 | {'rgb': True , 'ir': False, 'depth': False, 'depth_3d': False, 'normals': False, 'normal_angle': True }, 61 | {'rgb': False, 'ir': False, 'depth': True , 'depth_3d': False, 'normals': True , 'normal_angle': False}, 62 | {'rgb': False, 'ir': False, 'depth': False, 'depth_3d': True , 'normals': False, 'normal_angle': True }, 63 | {'rgb': True , 'ir': False, 'depth': True , 'depth_3d': False, 'normals': False, 'normal_angle': 
True }, 64 | {'rgb': True , 'ir': False, 'depth': False, 'depth_3d': True , 'normals': True , 'normal_angle': False}, 65 | {'rgb': False, 'ir': False, 'depth': True , 'depth_3d': False, 'normals': False, 'normal_angle': True } 66 | ] 67 | 68 | N_ITER = 10 69 | 70 | auc_mat = np.zeros((N_ITER, len(objectives)+1, len(modalities))) # +1 for Autoencoder 71 | 72 | for it in range(N_ITER): 73 | xp_path = os.path.join(xp_path_base, str(it)) 74 | for i, obj in enumerate(objectives): 75 | for j, mod in enumerate(modalities): 76 | train_obj = main_func(dataset_name, net_name, xp_path, data_path, train_folder, 77 | val_pos_folder, val_neg_folder, load_config, load_model, obj['objective'], nu, 78 | device, seed, optimizer_name, lr, n_epochs, lr_milestone, batch_size, 79 | weight_decay, obj['pretrain'], ae_optimizer_name, ae_lr, ae_n_epochs, 80 | ae_lr_milestone, ae_batch_size, ae_weight_decay, n_jobs_dataloader, normal_class, 81 | mod['rgb'], mod['ir'], mod['depth'], mod['depth_3d'], mod['normals'], 82 | mod['normal_angle'], batchnorm, dropout, augment, obj['fix_encoder']) 83 | auc = train_obj.results['test_auc'] 84 | auc_ae = train_obj.results['test_auc_ae'] 85 | auc_mat[it, i,j] = auc 86 | if auc_ae is not None: 87 | auc_mat[it, -1,j] = auc_ae 88 | 89 | np.save(os.path.join(xp_path, 'auc.npy'), auc_mat) 90 | 91 | np.save(os.path.join(xp_path_base, 'auc.npy'), auc_mat) 92 | print('avg') 93 | print(np.mean(auc_mat, axis=0)) 94 | print('std') 95 | print(np.std(auc_mat, axis=0)) -------------------------------------------------------------------------------- /train_anomaly_detection.py: -------------------------------------------------------------------------------- 1 | import click 2 | import torch 3 | import logging 4 | import random 5 | import numpy as np 6 | import shutil 7 | import os 8 | 9 | from anomaly_detection.utils.config import Config 10 | from anomaly_detection.utils.visualization.plot_images_grid import plot_images_grid 11 | from anomaly_detection.anomalyDetection 
import AnomalyDetection 12 | from anomaly_detection.datasets.main import load_dataset 13 | from torch.utils.tensorboard import SummaryWriter 14 | 15 | 16 | ################################################################################ 17 | # Settings 18 | ################################################################################ 19 | @click.command() 20 | @click.argument('dataset_name', type=click.Choice(['selfsupervised'])) 21 | @click.argument('net_name', type=click.Choice(['StackConvNet'])) 22 | @click.argument('xp_path', type=click.Path(exists=False)) 23 | @click.argument('data_path', type=click.Path(exists=True)) 24 | @click.option('--train_folder', '-t', type=str, default=None, multiple=True) 25 | @click.option('--val_pos_folder', '-vp', type=str, default=None, multiple=True) 26 | @click.option('--val_neg_folder', '-vn', type=str, default=None, multiple=True) 27 | @click.option('--load_config', type=click.Path(exists=True), default=None, 28 | help='Config JSON-file path (default: None).') 29 | @click.option('--load_model', type=click.Path(exists=True), default=None, 30 | help='Model file path (default: None).') 31 | @click.option('--objective', type=click.Choice(['one-class', 'soft-boundary', 'real-nvp']), default='one-class', 32 | help='Specify Anomaly Detection objective ("one-class", "soft-boundary", or "real-nvp").') 33 | @click.option('--nu', type=float, default=0.1, help='Deep SVDD hyperparameter nu (must be 0 < nu <= 1).') 34 | @click.option('--device', type=str, default='cuda', help='Computation device to use ("cpu", "cuda", "cuda:2", etc.).') 35 | @click.option('--seed', type=int, default=-1, help='Set seed. If -1, use randomization.') 36 | @click.option('--optimizer_name', type=click.Choice(['adam']), default='adam', 37 | help='Name of the optimizer to use for network training.') 38 | @click.option('--lr', type=float, default=0.001, 39 | help='Initial learning rate for network training. 
Default=0.001') 40 | @click.option('--n_epochs', type=int, default=50, help='Number of epochs to train.') 41 | @click.option('--lr_milestone', type=int, default=0, multiple=True, 42 | help='Lr scheduler milestones at which lr is multiplied by 0.1. Can be multiple and must be increasing.') 43 | @click.option('--batch_size', type=int, default=128, help='Batch size for mini-batch training.') 44 | @click.option('--weight_decay', type=float, default=1e-6, 45 | help='Weight decay (L2 penalty) hyperparameter for objective.') 46 | @click.option('--pretrain', type=bool, default=True, 47 | help='Pretrain neural network parameters via autoencoder.') 48 | @click.option('--ae_optimizer_name', type=click.Choice(['adam']), default='adam', 49 | help='Name of the optimizer to use for autoencoder pretraining.') 50 | @click.option('--ae_lr', type=float, default=0.001, 51 | help='Initial learning rate for autoencoder pretraining. Default=0.001') 52 | @click.option('--ae_n_epochs', type=int, default=100, help='Number of epochs to train autoencoder.') 53 | @click.option('--ae_lr_milestone', type=int, default=0, multiple=True, 54 | help='Lr scheduler milestones at which lr is multiplied by 0.1. Can be multiple and must be increasing.') 55 | @click.option('--ae_batch_size', type=int, default=128, help='Batch size for mini-batch autoencoder training.') 56 | @click.option('--ae_weight_decay', type=float, default=1e-6, 57 | help='Weight decay (L2 penalty) hyperparameter for autoencoder objective.') 58 | @click.option('--n_jobs_dataloader', type=int, default=0, 59 | help='Number of workers for data loading. 
0 means that the data will be loaded in the main process.') 60 | @click.option('--normal_class', type=int, default=0, 61 | help='Specify the normal class of the dataset (all other classes are considered anomalous).') 62 | @click.option('--rgb', is_flag=True) 63 | @click.option('--ir', is_flag=True) 64 | @click.option('--depth', is_flag=True) 65 | @click.option('--depth_3d', is_flag=True) 66 | @click.option('--normals', is_flag=True) 67 | @click.option('--normal_angle', is_flag=True) 68 | @click.option('--batchnorm', is_flag=True) 69 | @click.option('--dropout', is_flag=True) 70 | @click.option('--augment', is_flag=True) 71 | @click.option('--fix_encoder', is_flag=True) 72 | def main(dataset_name, net_name, xp_path, data_path, train_folder, val_pos_folder, val_neg_folder, load_config, 73 | load_model, objective, nu, device, seed, optimizer_name, lr, n_epochs, lr_milestone, batch_size, 74 | weight_decay, pretrain, ae_optimizer_name, ae_lr, ae_n_epochs, ae_lr_milestone, ae_batch_size, 75 | ae_weight_decay, n_jobs_dataloader, normal_class, rgb, ir, depth, depth_3d, normals, normal_angle, 76 | batchnorm, dropout, augment, fix_encoder): 77 | main_func(dataset_name, net_name, xp_path, data_path, train_folder, val_pos_folder, val_neg_folder, load_config, 78 | load_model, objective, nu, device, seed, optimizer_name, lr, n_epochs, lr_milestone, batch_size, 79 | weight_decay, pretrain, ae_optimizer_name, ae_lr, ae_n_epochs, ae_lr_milestone, ae_batch_size, 80 | ae_weight_decay, n_jobs_dataloader, normal_class, rgb, ir, depth, depth_3d, normals, normal_angle, 81 | batchnorm, dropout, augment, fix_encoder) 82 | 83 | 84 | 85 | def main_func(dataset_name, net_name, xp_path, data_path, train_folder, val_pos_folder, val_neg_folder, load_config, 86 | load_model, objective, nu, device, seed, optimizer_name, lr, n_epochs, lr_milestone, batch_size, 87 | weight_decay, pretrain, ae_optimizer_name, ae_lr, ae_n_epochs, ae_lr_milestone, ae_batch_size, 88 | ae_weight_decay, n_jobs_dataloader, 
normal_class, rgb, ir, depth, depth_3d, normals, normal_angle, 89 | batchnorm, dropout, augment, fix_encoder): 90 | 91 | # Get configuration 92 | cfg = Config(locals().copy()) 93 | 94 | assert rgb or ir or depth or depth_3d or normals or normal_angle, 'Need to select at least one input channel' 95 | 96 | # Get logging name based on settings. 97 | if net_name=='StackConvNet': 98 | log_folder = 'stack' 99 | if rgb: 100 | log_folder += '_rgb' 101 | if depth: 102 | log_folder += '_depth' 103 | if ir: 104 | log_folder += '_ir' 105 | if depth_3d: 106 | log_folder += '_3d' 107 | if normals: 108 | log_folder += '_normals' 109 | if normal_angle: 110 | log_folder += '_ang' 111 | if batchnorm: 112 | log_folder += '_bn' 113 | if dropout: 114 | log_folder += '_drop' 115 | if augment: 116 | log_folder += '_aug' 117 | if not pretrain: 118 | log_folder += '_nopre' 119 | if objective == 'one-class': 120 | log_folder += '_hard' 121 | elif objective == 'soft-boundary': 122 | log_folder += '_soft' 123 | elif objective == 'real-nvp': 124 | log_folder += '_nvp' 125 | if fix_encoder: 126 | log_folder += '_fix' 127 | if ae_n_epochs != 350 or n_epochs != 150: 128 | log_folder += '_' + str(ae_n_epochs) + '_' + str(n_epochs) 129 | 130 | tb_path = os.path.join(xp_path, 'tb', log_folder) 131 | if not os.path.exists(tb_path): 132 | os.makedirs(tb_path) 133 | else: 134 | for file in os.listdir(tb_path): 135 | file_path = os.path.join(tb_path, file) 136 | if os.path.isfile(file_path): 137 | os.remove(file_path) 138 | 139 | xp_path = os.path.join(xp_path, log_folder) 140 | 141 | if not os.path.exists(xp_path): 142 | os.makedirs(xp_path) 143 | 144 | 145 | writer = SummaryWriter(tb_path) 146 | 147 | # Copy executed script to log folder. 
148 | shutil.copyfile('train_anomaly_detection.sh', os.path.join(xp_path, 'train_anomaly_detection.sh')) 149 | 150 | # Set up logging 151 | logging.basicConfig(level=logging.INFO) 152 | logger = logging.getLogger() 153 | logger.setLevel(logging.INFO) 154 | formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s') 155 | log_file = xp_path + '/log.txt' 156 | file_handler = logging.FileHandler(log_file) 157 | file_handler.setLevel(logging.INFO) 158 | file_handler.setFormatter(formatter) 159 | logger.addHandler(file_handler) 160 | 161 | # Print arguments 162 | logger.info('Log file is %s.' % log_file) 163 | logger.info('Data path is %s.' % data_path) 164 | logger.info('Export path is %s.' % xp_path) 165 | 166 | logger.info('Dataset: %s' % dataset_name) 167 | logger.info('Normal class: %d' % normal_class) 168 | logger.info('Network: %s' % net_name) 169 | 170 | # If specified, load experiment config from JSON-file 171 | if load_config: 172 | cfg.load_config(import_json=load_config) 173 | logger.info('Loaded configuration from %s.' % load_config) 174 | 175 | # Print configuration 176 | logger.info('Anomaly detection objective: %s' % cfg.settings['objective']) 177 | logger.info('Nu-parameter: %.2f' % cfg.settings['nu']) 178 | 179 | # Set seed 180 | if cfg.settings['seed'] != -1: 181 | random.seed(cfg.settings['seed']) 182 | np.random.seed(cfg.settings['seed']) 183 | torch.manual_seed(cfg.settings['seed']) 184 | logger.info('Set seed to %d.' 
% cfg.settings['seed']) 185 | 186 | # Default device to 'cpu' if cuda is not available 187 | if not torch.cuda.is_available(): 188 | device = 'cpu' 189 | logger.info('Computation device: %s' % device) 190 | logger.info('Number of dataloader workers: %d' % n_jobs_dataloader) 191 | 192 | # Load data 193 | dataset = load_dataset(dataset_name, data_path, normal_class, cfg) 194 | 195 | # Initialize model and set neural network \phi 196 | anomaly_detection = AnomalyDetection(writer, cfg.settings['objective'], cfg.settings['nu']) 197 | anomaly_detection.set_network(net_name, cfg) 198 | # If specified, load model (radius R, center c, network weights, and possibly autoencoder weights) 199 | if load_model: 200 | anomaly_detection.load_model(model_path=load_model, cfg=cfg, load_ae=pretrain) 201 | logger.info('Loading model from %s.' % load_model) 202 | 203 | logger.info('Pretraining: %s' % pretrain) 204 | if pretrain: 205 | # Log pretraining details 206 | logger.info('Pretraining optimizer: %s' % cfg.settings['ae_optimizer_name']) 207 | logger.info('Pretraining learning rate: %g' % cfg.settings['ae_lr']) 208 | logger.info('Pretraining epochs: %d' % cfg.settings['ae_n_epochs']) 209 | logger.info('Pretraining learning rate scheduler milestones: %s' % (cfg.settings['ae_lr_milestone'],)) 210 | logger.info('Pretraining batch size: %d' % cfg.settings['ae_batch_size']) 211 | logger.info('Pretraining weight decay: %g' % cfg.settings['ae_weight_decay']) 212 | 213 | # Pretrain model on dataset (via autoencoder) 214 | anomaly_detection.pretrain(dataset, 215 | cfg, 216 | optimizer_name=cfg.settings['ae_optimizer_name'], 217 | lr=cfg.settings['ae_lr'], 218 | n_epochs=cfg.settings['ae_n_epochs'], 219 | lr_milestones=cfg.settings['ae_lr_milestone'], 220 | batch_size=cfg.settings['ae_batch_size'], 221 | weight_decay=cfg.settings['ae_weight_decay'], 222 | device=device, 223 | n_jobs_dataloader=n_jobs_dataloader) 224 | 225 | # Log training details 226 | logger.info('Training optimizer: %s' % 
cfg.settings['optimizer_name']) 227 | logger.info('Training learning rate: %g' % cfg.settings['lr']) 228 | logger.info('Training epochs: %d' % cfg.settings['n_epochs']) 229 | logger.info('Training learning rate scheduler milestones: %s' % (cfg.settings['lr_milestone'],)) 230 | logger.info('Training batch size: %d' % cfg.settings['batch_size']) 231 | logger.info('Training weight decay: %g' % cfg.settings['weight_decay']) 232 | 233 | # Train model on dataset 234 | anomaly_detection.train(dataset, 235 | augment=cfg.settings['augment'], 236 | optimizer_name=cfg.settings['optimizer_name'], 237 | lr=cfg.settings['lr'], 238 | n_epochs=cfg.settings['n_epochs'], 239 | lr_milestones=cfg.settings['lr_milestone'], 240 | batch_size=cfg.settings['batch_size'], 241 | weight_decay=cfg.settings['weight_decay'], 242 | device=device, 243 | n_jobs_dataloader=n_jobs_dataloader, 244 | fix_encoder=cfg.settings['fix_encoder']) 245 | 246 | # Test model 247 | anomaly_detection.test(dataset, device=device, n_jobs_dataloader=n_jobs_dataloader) 248 | 249 | # Plot most anomalous and most normal (within-class) test samples 250 | indices, labels, scores = zip(*anomaly_detection.results['test_scores']) 251 | indices, labels, scores = np.array(indices), np.array(labels), np.array(scores) 252 | idx_sorted = indices[labels == 0][np.argsort(scores[labels == 0])] # sorted from lowest to highest anomaly score 253 | 254 | if dataset_name in ('selfsupervised',): 255 | 256 | if dataset_name == 'selfsupervised': 257 | if rgb: 258 | X_normals = torch.tensor(dataset.test_set.images[idx_sorted[:32], 0:3, ...]) 259 | X_outliers = torch.tensor(dataset.test_set.images[idx_sorted[-32:], 0:3, ...]) 260 | else: 261 | X_normals = torch.tensor(dataset.test_set.images[idx_sorted[:32], 0:1, ...]) 262 | X_outliers = torch.tensor(dataset.test_set.images[idx_sorted[-32:], 0:1, ...]) 263 | 264 | plot_images_grid(X_normals, export_img=xp_path + '/normals', title='Most normal examples', padding=2) 265 | 
plot_images_grid(X_outliers, export_img=xp_path + '/outliers', title='Most anomalous examples', padding=2) 266 | anomaly_detection.trainer.roc_plt[0].savefig(os.path.join(xp_path, 'roc_curve.svg')) 267 | 268 | # Save results, model, and configuration 269 | anomaly_detection.save_results(export_json=xp_path + '/results.json') 270 | anomaly_detection.save_model(export_model=xp_path + '/model.tar', export_best_model=os.path.join(xp_path, 'model_best.tar'), save_ae=pretrain) 271 | cfg.save_config(export_json=xp_path + '/config.json') 272 | 273 | return anomaly_detection 274 | 275 | 276 | if __name__ == '__main__': 277 | main() 278 | -------------------------------------------------------------------------------- /train_anomaly_detection.sh: -------------------------------------------------------------------------------- 1 | python train_anomaly_detection.py selfsupervised StackConvNet log \ 2 | data/full --objective real-nvp \ 3 | --lr 0.0001 --n_epochs 150 --lr_milestone 100 --batch_size 200 --weight_decay 0.5e-6 \ 4 | --pretrain True --ae_lr 0.0001 --ae_n_epochs 350 --ae_lr_milestone 250 --ae_batch_size 200 \ 5 | --ae_weight_decay 0.5e-6 --normal_class 1 --rgb --depth_3d --normals \ 6 | --train_folder train --val_pos_folder val/wangen_sun_3_pos --val_neg_folder val/wangen_sun_3_neg \ 7 | --fix_encoder -------------------------------------------------------------------------------- /train_incrementally.py: -------------------------------------------------------------------------------- 1 | import click 2 | from train_anomaly_detection import main_func 3 | from anomaly_detection.datasets.main import load_dataset 4 | import numpy as np 5 | import os 6 | 7 | # Define base parameters. 
8 | dataset_name = 'selfsupervised' 9 | net_name = 'StackConvNet' 10 | xp_path_base = 'log' 11 | data_path = 'data/incremental' 12 | train_folders = ('train/base', 'train/wangen_sun_1_pos','train/wangen_twilight_2_pos','train/wangen_rain_2_pos') 13 | val_pos_folders = ('val/wangen_sun_3_pos','val/wangen_fire_4_pos','val/wangen_rain_1_pos','val/wangen_wet_1_pos','val/wangen_twilight_1_pos') 14 | val_neg_folders = ('val/wangen_sun_3_neg','val/wangen_fire_4_neg','val/wangen_rain_1_neg','val/wangen_wet_1_neg','val/wangen_twilight_1_neg') 15 | load_config = None 16 | load_model = None 17 | nu = 0.1 18 | device = 'cuda' 19 | seed = -1 20 | optimizer_name = 'adam' 21 | lr = 0.0001 22 | # n_epochs = 150 23 | n_epochs = 10 24 | lr_milestone = (100,) 25 | batch_size = 200 26 | weight_decay = 0.5e-6 27 | ae_optimizer_name = 'adam' 28 | ae_lr = 0.0001 29 | ae_n_epochs = 350 30 | ae_lr_milestone = (250,) 31 | ae_batch_size = 200 32 | ae_weight_decay = 0.5e-6 33 | n_jobs_dataloader = 0 34 | normal_class = 1 35 | 36 | batchnorm = False 37 | dropout = False 38 | augment = False 39 | 40 | objective = 'real-nvp' 41 | pretrain = True 42 | fix_encoder = True 43 | rgb = True 44 | ir = False 45 | depth = True 46 | depth_3d = False 47 | normals = False 48 | normal_angle = False 49 | 50 | class Config: 51 | def __init__(self): 52 | self.settings={} 53 | 54 | cfg = Config() 55 | cfg.settings = { 56 | # 'train_folder': train_folder 57 | # 'val_pos_folder': val_pos_folder 58 | # 'val_neg_folder': val_neg_folder 59 | 'rgb': rgb, 60 | 'ir': ir, 61 | 'depth': depth, 62 | 'depth_3d': depth_3d, 63 | 'normals': normals, 64 | 'normal_angle': normal_angle, 65 | } 66 | 67 | 68 | N_ITER = 10 69 | 70 | auc_mat = np.zeros((N_ITER, len(train_folders), len(val_pos_folders))) 71 | fpr5_mat = np.zeros((N_ITER, len(train_folders), len(val_pos_folders))) 72 | 73 | for it in range(N_ITER): 74 | xp_path = os.path.join(xp_path_base, str(it)) 75 | for i, _ in enumerate(train_folders): 76 | train_folder = 
train_folders[:i+1] 77 | cfg.settings['train_folder'] = train_folder 78 | 79 | train_obj = main_func(dataset_name, net_name, xp_path, data_path, train_folder, 80 | val_pos_folders[0], val_neg_folders[0], load_config, load_model, objective, nu, 81 | device, seed, optimizer_name, lr, n_epochs, lr_milestone, batch_size, 82 | weight_decay, pretrain, ae_optimizer_name, ae_lr, ae_n_epochs, 83 | ae_lr_milestone, ae_batch_size, ae_weight_decay, n_jobs_dataloader, normal_class, 84 | rgb, ir, depth, depth_3d, normals, 85 | normal_angle, batchnorm, dropout, augment, fix_encoder) 86 | for j, (val_pos_folder, val_neg_folder) in enumerate(zip(val_pos_folders, val_neg_folders)): 87 | cfg.settings['val_pos_folder'] = val_pos_folder 88 | cfg.settings['val_neg_folder'] = val_neg_folder 89 | dataset_val = load_dataset(dataset_name, data_path, normal_class, cfg) 90 | train_obj.test(dataset_val, device, n_jobs_dataloader) 91 | 92 | auc_mat[it, i,j] = train_obj.results['test_auc'] 93 | fpr5_mat[it, i,j] = train_obj.results['test_fpr5'] 94 | # Get ROC curve file name. 
95 | train_name = train_folder[-1].split('/')[-1] 96 | test_name = val_pos_folder.split('/')[-1] 97 | file_name = 'roc_curve_' + train_name + '_' + test_name + '.svg' 98 | train_obj.trainer.roc_plt[0].savefig(os.path.join(xp_path, file_name)) 99 | 100 | # print(auc_mat[it]) 101 | np.save(os.path.join(xp_path, 'auc.npy'), auc_mat) 102 | np.save(os.path.join(xp_path, 'fpr5.npy'), fpr5_mat) 103 | 104 | np.save(os.path.join(xp_path_base, 'auc.npy'), auc_mat) 105 | np.save(os.path.join(xp_path_base, 'fpr5.npy'), fpr5_mat) 106 | print('AUC avg') 107 | print(np.mean(auc_mat, axis=0)) 108 | print('AUC std') 109 | print(np.std(auc_mat, axis=0)) 110 | print('FPR5 avg') 111 | print(np.mean(fpr5_mat, axis=0)) 112 | print('FPR5 std') 113 | print(np.std(fpr5_mat, axis=0)) 114 | -------------------------------------------------------------------------------- /visualize_positive_labels.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import matplotlib 4 | matplotlib.use('Agg') # or 'PS', 'PDF', 'SVG' 5 | import matplotlib.pyplot as plt 6 | from matplotlib.widgets import Slider, Button 7 | from argparse import ArgumentParser 8 | from torch.utils.data import DataLoader 9 | import torch 10 | import cv2 11 | 12 | from anomaly_detection.datasets.selfsupervised_images import SelfSupervisedDataset 13 | 14 | 15 | 16 | SQ_SIZE=32 17 | 18 | 19 | 20 | def correctColor(image): 21 | image = np.transpose(image, (1, 2, 0)) 22 | image = np.flip(image, axis=2) 23 | return image 24 | 25 | 26 | 27 | def makeIndexValid(x_ind, y_ind, img_shape): 28 | x_shape = img_shape[0] 29 | y_shape = img_shape[1] 30 | if x_ind[0] < 0: 31 | x_ind -= x_ind[0] 32 | if x_ind[1] > x_shape: 33 | x_ind -= (x_ind[1]-x_shape) 34 | 35 | if y_ind[0] < 0: 36 | y_ind -= y_ind[0] 37 | if y_ind[1] > y_shape: 38 | y_ind -= (y_ind[1]-y_shape) 39 | 40 | 41 | 42 | def saveImages(img_rgb, img_target, step): 43 | img_rgb = correctColor(img_rgb) 44 | 
img_target = correctColor(img_target) 45 | cv2.imwrite(os.path.join(args.outdir, "{:05d}".format(step)+'_rgb.png'), img_rgb*255) 46 | cv2.imwrite(os.path.join(args.outdir, "{:05d}".format(step)+'_target.png'), img_target*255) 47 | 48 | 49 | 50 | def label(args): 51 | dataset = SelfSupervisedDataset(args.datadir, file_format='csv', subsample=args.subsample, tensor_type='float') 52 | loader = DataLoader(dataset, shuffle=False) 53 | 54 | if not os.path.exists(args.outdir): 55 | os.makedirs(args.outdir) 56 | print('Create directory: ' + args.outdir) 57 | 58 | n_steps = len(loader) 59 | 60 | for step, (images, labels) in enumerate(loader): 61 | print(str(step+1) + '/' + str(n_steps),end='\r') 62 | 63 | img_rgb = images[0].squeeze().numpy() 64 | img_target = np.zeros(img_rgb.shape) 65 | 66 | label_mask = (labels!=0).squeeze().numpy() 67 | foot_ind = np.where(label_mask) 68 | # Make sure we have a foothold in the image. 69 | if foot_ind[0].shape[0] == 0: 70 | print('Encountered empty foothold mask') 71 | 72 | for i in range(foot_ind[0].shape[0]): 73 | indices = np.array([foot_ind[0][i], foot_ind[1][i]]) 74 | x_ind = np.array([int(indices[0]-SQ_SIZE/2), int(indices[0]+SQ_SIZE/2)]) 75 | y_ind = np.array([int(indices[1]-SQ_SIZE/2), int(indices[1]+SQ_SIZE/2)]) 76 | makeIndexValid(x_ind, y_ind, np.squeeze(img_rgb.shape[1:])) 77 | patch = img_rgb[:, x_ind[0]:x_ind[1], y_ind[0]:y_ind[1]] 78 | img_target[:, x_ind[0]:x_ind[1], y_ind[0]:y_ind[1]] = patch 79 | 80 | img_rgb[:, label_mask] = 0.0 81 | # Erode. 82 | label_mask = cv2.erode(label_mask.astype(np.uint8)*255, np.ones([5,5], np.uint8),iterations=1).astype(bool) 83 | img_rgb[:, label_mask] = 1.0 84 | # Save everything in the appropriate format. 
85 | saveImages(img_rgb, img_target, step) 86 | 87 | 88 | 89 | if __name__ == '__main__': 90 | parser = ArgumentParser() 91 | parser.add_argument('--datadir', required=True, help='Directory for dataset') 92 | parser.add_argument('--outdir', required=True, help='Output directory for patches') 93 | parser.add_argument('--subsample', type=int, default=1, help='Only use every nth image of the dataset') 94 | 95 | args = parser.parse_args() 96 | 97 | label(args) 98 | --------------------------------------------------------------------------------