├── README.md ├── install.sh ├── main.py ├── run.sh └── src ├── __init__.py ├── data ├── SlidingWindowDataset.py ├── Transforms.py └── __init__.py ├── datasets ├── __init__.py ├── demo.py ├── files │ ├── .gitignore │ ├── processed │ │ └── demo │ │ │ ├── labels.pt │ │ │ ├── list.txt │ │ │ ├── meta.json │ │ │ ├── test.pt │ │ │ └── train.pt │ └── raw │ │ └── demo │ │ ├── test.csv │ │ └── train.csv ├── from_csv.py ├── swat.py └── wadi.py ├── layers ├── __init__.py ├── attention.py └── embedding.py ├── models ├── ConvSeqAttention.py ├── GNNLSTM.py ├── LSTM.py ├── Linear.py ├── MTGNN.py └── __init__.py ├── utils ├── __init__.py ├── device.py ├── evaluate.py ├── metrics.py ├── nn_trainer.py └── utils.py └── visualization ├── error_distribution.py ├── graph_plot.py ├── loss_plot.py └── tensorboard.py /README.md: -------------------------------------------------------------------------------- 1 | # Multivariate Time Series Anomaly Detection with GNNs and Latent Graph Inference 2 | Implementation of different graph neural network (GNN) based models for anomaly detection in multivariate timeseries in sensor networks.
3 | An explicit graph structure modelling the interrelations between sensors is inferred during training and used for time series forecasting. Anomaly detection is based on the error between the predicted and actual values at each time step. 4 | 5 | ## Installation 6 | ### Requirements 7 | * Python == 3.7 8 | * cuda == 10.2 9 | * [pytorch==1.8.1](https://pytorch.org/) 10 | * [torch-geometric==1.7.2](https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html)
11 | 12 | Additional package files for torch-geometric (Python 3.7 & PyTorch 1.8.1) are provided in '/whl/' in case they are unavailable.
13 | Refer to https://pytorch-geometric.com/whl/ for other versions. 14 | ### Install python 15 | Install a Python environment, for example with conda: 16 | ``` 17 | conda create -n py37 python=3.7 18 | ``` 19 | ### Install packages 20 | Run the install bash script with either the cpu or the cuda flag, depending on the intended use. 21 | ``` 22 | # run after installing python 23 | bash install.sh cpu 24 | 25 | # or 26 | bash install.sh cuda 27 | ``` 28 | 29 | ## Models 30 | The repository contains several models. GNN-LSTM is used by default and achieved the best performance. 31 | 32 | ### GNN-LSTM 33 | Model with GNN feature expansion before a multi-layer LSTM. A single node embedding is used to infer the latent graph through vector similarity, 34 | and as node positional embeddings added to the GNN features before they are passed to the recurrent network. 35 | 36 | ### Convolutional-Attention Sequence-To-Sequence 37 | Spatial-Temporal Convolution GNN with attention. Data is split into an encoder and decoder. The encoder creates a feature representation for each time step while the decoder creates a single representation. Encoder-Decoder attention is concatenated with the decoder output before being passed to the prediction layer. 38 | Uses multiple embedding layers so that the latent graph is parameterized directly by the network.
39 | Inspired by: https://arxiv.org/pdf/1705.03122.pdf. 40 | 41 | ### MTGNN 42 | Spatial-Temporal Convolution GNN with attention and graph mix-hop propagation.
43 | Taken from: https://arxiv.org/pdf/2005.11650.pdf. 44 | 45 | ### LSTM 46 | Vanilla multi-layer LSTM used for benchmarking. 47 | 48 | 49 | ## Data 50 | ### SWaT, WADI, Demo 51 | A test dataset ('demo') is included in the repository under '/datasets/files/'.
52 | SWaT and WADI datasets can be requested from [iTrust](https://itrust.sutd.edu.sg/).
53 | The files should be opened in e.g. Excel to remove the first empty rows and save as a .csv file.
54 | The CSV files should be placed in a folder with the same name as the dataset ('swat' or 'wadi') in '/datasets/files/raw/', for example '/datasets/files/raw/swat/'.
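
The manual clean-up step above can also be scripted. The following is a minimal sketch, assuming the file has already been exported to CSV and that the only problem is a run of blank rows before the header. The helper name `drop_leading_empty_rows` and the sample column names are illustrative and not part of this repository.

```python
import csv
import io


def drop_leading_empty_rows(src, dst):
    """Copy CSV rows from src to dst, skipping blank rows before the header."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header_seen = False
    for row in reader:
        # A row counts as empty if it has no cells or only whitespace cells.
        if not header_seen and not any(cell.strip() for cell in row):
            continue
        header_seen = True
        writer.writerow(row)


# Illustrative input: two empty rows, then a header and one data row.
raw = io.StringIO("\n,,\nTimestamp,FIT101,LIT101\n1,2.5,520.0\n")
cleaned = io.StringIO()
drop_leading_empty_rows(raw, cleaned)
print(cleaned.getvalue())
```

For real files, the two `io.StringIO` objects would be replaced by file handles opened with `newline=''`, as recommended by the `csv` module documentation.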
55 | 56 | ### Other 57 | Additional datasets can either be loaded directly from CSV file using the dataset 'from_csv'
58 | or by creating a custom dataset following the examples found in the '/datasets/' folder.
59 | If 'from_csv' is used, the data should come in the same format as the demo data included in this repository, 60 | with individual time series for each sensor represented by a single column. (Only) the test data should have 61 | anomaly labels included in the last column.
62 | The first column is assumed to be the timestamp. The files are to be placed in '/datasets/files/raw/from_csv/'. 63 | If this option is chosen, data normalization is not available. Any preprocessing should be done manually. 64 | 65 | ## Usage 66 | 67 | ### Bash Script 68 | Suitable parameters for the SWaT, WADI, and Demo datasets can be found in the bash script 'run.sh', 69 | which is the most convenient way to run the models. The script uses bash-specific syntax, so it should be run with bash. 70 | ``` 71 | # run from terminal 72 | bash run.sh [dataset] 73 | ``` 74 | 75 | *Examples:* 76 | ``` 77 | # example 1 78 | bash run.sh swat 79 | 80 | # example 2 81 | bash run.sh wadi 82 | 83 | # example 3 84 | bash run.sh demo 85 | ``` 86 | 87 | ### Python File 88 | Run the *main.py* script from your terminal (bash, powershell, etc).
89 | To change the default model and training hyperparameters, flags can be included.
90 | Alternatively, those parameters can be changed within the file (argparse default values). 91 | ``` 92 | # run from terminal 93 | python main.py -[flags] 94 | ``` 95 | *Examples:* 96 | ``` 97 | # example 1 98 | python main.py -dataset demo -batch_size 4 -epochs 10 99 | 100 | # example 2 101 | python main.py -dataset swat -epochs 10 -topk 20 -embed_dim 128 102 | 103 | # example 3 104 | python main.py -dataset from_csv 105 | ``` 106 | 107 | **Available flags:**
108 | `-dataset` The dataset.
109 | `-window_size` Number of historical timesteps used in each sample.
110 | `-horizon` Number of prediction steps.
111 | `-val_split` Amount of data used for the validation dataset. Value between 0 and 1.
112 | `-transform` Sampling transform applied to the model input data (e.g. median).
113 | `-target_transform` Sampling transform applied to the model target values. (e.g. median, max).
114 | `-normalize` Boolean value if data normalization should be applied.
115 | `-shuffle_train` Boolean value if training data should be shuffled.
116 | `-batch_size` Number of samples in each batch.
117 | 118 | `-embed_dim` Number of node embedding dimensions (Disabled for GNN-LSTM).
119 | `-topk` Number of allowed neighbors for each node.
120 | 121 | `-smoothing` Error smoothing kernel size.
122 | `-smoothing_method` Error smoothing kernel type (*mean* or *exp*).
123 | `-thresholding` Thresholding method (*mean*, *max*, *best* (best performs an exhaustive search for theoretical performance evaluation)).
124 | 125 | `-epochs` Number of training epochs.
126 | `-early_stopping` Patience: number of epochs without improvement before training is stopped early.
127 | `-lr` Learning rate.
128 | `-betas` Adam optimizer parameter.
129 | `-weight_decay` Adam optimizer weight regularization parameter.
130 | `-device` Computing device (*cpu* or *cuda*).
131 | 132 | `-log_graph` Boolean for logging of learned graphs.
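
To illustrate how the smoothing and thresholding flags interact, here is a minimal sketch of error-based anomaly scoring. It assumes that *mean* smoothing is a trailing moving average and that *max* thresholding uses the largest smoothed error observed on anomaly-free data. The function names and numbers are illustrative, and the repository's own `evaluate.py` may implement these steps differently.

```python
def smooth_mean(errors, kernel_size):
    """Trailing moving average over the last `kernel_size` errors (assumed behaviour)."""
    smoothed = []
    for i in range(len(errors)):
        window = errors[max(0, i - kernel_size + 1): i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def max_threshold(normal_errors):
    """Threshold set to the largest smoothed error seen on anomaly-free data."""
    return max(normal_errors)


def detect(test_errors, threshold):
    """Flag every time step whose smoothed error exceeds the threshold."""
    return [e > threshold for e in test_errors]


# Illustrative per-step forecasting errors (not taken from any dataset).
normal = smooth_mean([0.1, 0.2, 0.1, 0.3, 0.2], kernel_size=2)
test = smooth_mean([0.1, 0.2, 0.9, 1.1, 0.2], kernel_size=2)
flags = detect(test, max_threshold(normal))
print(flags)  # [False, False, True, True, True]
```

Note that the last step is still flagged although its raw error is small: a larger smoothing kernel suppresses isolated spikes but also delays the return to normal.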
133 | 134 | ## Results 135 | ### Logs 136 | After the initial run, a '/runs/' folder will be automatically created.
137 | A copy of the model state dict, a loss plot, plots for the learned graph representation
138 | and some additional information will be saved for each run of the model. 139 | 140 | ### Example Plots 141 | 142 | Visualization of a t-SNE embedding of the learned undirected graph representation for the SWaT dataset
143 | with 15 neighbors per node.
144 | 145 | 146 | Plot of a directly parameterized uni-directional graph adjacency matrix with a single neighbor per node.
147 | 148 | 149 | Node colors and labels indicate type of sensor. 150 | 151 | 152 | **P:** Pump
153 | **MV:** Motorized valve
154 | **UV:** Dechlorinator
155 | **LIT:** Level in tank
156 | **PIT:** Pressure in tank
157 | **FIT:** Flow in tank
158 | **AIT:** Analyzer in tank (different chemical analyzers; NaCl, HCl, ORP meters, etc)
159 | **DPIT:** Differential pressure indicating transmitter
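
As a rough illustration of how a neighbor graph like the ones plotted above can be derived from node embeddings, the sketch below keeps, for each node, the `k` most similar other nodes. It assumes cosine similarity and non-zero embedding vectors. The function names and toy embeddings are hypothetical, and the models in this repository may compute similarity and select neighbors differently.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def topk_edges(embeddings, k):
    """Return directed edges (source, target): each target keeps its k most similar sources."""
    edges = []
    for i, e_i in enumerate(embeddings):
        # Score every other node against node i, excluding self-loops.
        scores = [(cosine(e_i, e_j), j) for j, e_j in enumerate(embeddings) if j != i]
        scores.sort(reverse=True)
        edges.extend((j, i) for _, j in scores[:k])
    return edges


# Toy embeddings: nodes 0 and 1 point in a similar direction, as do nodes 2 and 3.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
edges = topk_edges(embeddings, k=1)
print(edges)  # [(1, 0), (0, 1), (3, 2), (2, 3)]
```

In this sketch `k` plays the role of the `-topk` flag: a larger value gives each node more incoming edges and therefore a denser graph.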
160 | -------------------------------------------------------------------------------- /install.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | DIR=$1 4 | 5 | # torch 6 | pip install torch==1.8.1 7 | pip install torch-geometric==1.7.2 8 | 9 | # extra torch geometric packages 10 | pip install --no-index torch-scatter -f whl/$DIR/ 11 | pip install --no-index torch-sparse -f whl/$DIR/ 12 | pip install --no-index torch-cluster -f whl/$DIR/ 13 | pip install --no-index torch-spline-conv -f whl/$DIR/ 14 | 15 | # extra packages 16 | pip install matplotlib==3.4.3 17 | pip install networkx==2.6.2 18 | pip install scikit-learn==0.24.2 19 | pip install scipy==1.7.1 20 | pip install seaborn==0.11.2 21 | pip install numpy==1.21.2 22 | pip install pandas==1.3.2 23 | pip install pyvis==0.1.9 -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import os 3 | 4 | import argparse 5 | import importlib 6 | from distutils.util import strtobool 7 | 8 | from datetime import datetime 9 | 10 | import numpy as np 11 | import torch 12 | from torch import nn 13 | from torch_geometric.data import DataLoader 14 | 15 | from src.data.Transforms import MedianSampling2d, MaxSampling1d, MedianSampling1d 16 | from src.models import GNNLSTM, ConvSeqAttentionModel, MTGNNModel, RecurrentModel 17 | from src.utils.device import get_device 18 | from src.utils import Trainer 19 | from src.utils.evaluate import evaluate_performance 20 | from src.visualization.graph_plot import plot_embedding, plot_adjacency 21 | from src.visualization.loss_plot import get_loss_plot 22 | # from src.visualization.error_distribution import get_error_distribution_plot 23 | 24 | def main(args): 25 | 26 | print() 27 | # check python and torch versions 28 | print(f'Python v.{sys.version.split()[0]}') 29 | print(f'PyTorch 
v.{torch.__version__}') 30 | 31 | # get device 32 | device = args.device 33 | print(f'Device status: {device}') 34 | 35 | dataset = args.dataset 36 | # dataset import 37 | p, m = 'src.datasets.' + dataset, dataset.capitalize() 38 | mod = importlib.import_module(p) 39 | dataset_class = getattr(mod, m) 40 | 41 | # dataset transforms 42 | transform_dict = {'median': MedianSampling2d} 43 | target_transform_dict = {'median': MedianSampling1d, 'max': MaxSampling1d} 44 | 45 | transform = transform_dict.get(args.transform, None) 46 | if transform is not None: 47 | transform = transform(10) 48 | 49 | target_transform = target_transform_dict.get(args.target_transform, None) 50 | if target_transform is not None: 51 | target_transform = target_transform(10) 52 | 53 | # training / test data set definitions 54 | lags = args.window_size 55 | stride = args.stride 56 | horizon = args.horizon 57 | train_ds = dataset_class(lags, stride=stride, horizon=horizon, train=True, transform=transform, normalize=args.normalize, device=device) 58 | test_ds = dataset_class(lags, stride=stride, horizon=horizon, train=False, transform=transform, target_transform=target_transform, normalize=args.normalize, device=device) 59 | 60 | # get train and validation data split at random index 61 | val_split = args.val_split 62 | val_len = int(len(train_ds)*val_split) 63 | split_idx = np.random.randint(0, len(train_ds) - val_len) # exclude beginning of dataset for stability 64 | a, b = split_idx, split_idx+val_len # split interval 65 | 66 | train_parition = train_ds[:a] + train_ds[b:] 67 | val_parition = train_ds[a:b] 68 | 69 | # data loaders 70 | batch_size = args.batch_size 71 | # train, val, test partitions 72 | train_loader = DataLoader(train_parition[int(len(train_ds)*0.0):], batch_size=batch_size, shuffle=args.shuffle_train) 73 | if len(val_parition) > 0: 74 | val_loader = DataLoader(val_parition, batch_size=batch_size, shuffle=True) 75 | thresholding_loader = DataLoader(val_parition, 
batch_size=batch_size, shuffle=False) 76 | else: 77 | val_loader = None 78 | thresholding_loader = DataLoader(train_ds[int(len(train_ds)*0.0):], batch_size=batch_size, shuffle=False) 79 | test_loader = DataLoader(test_ds, batch_size=batch_size, shuffle=False) 80 | 81 | # node meta data 82 | num_nodes = train_ds.num_nodes 83 | args.num_nodes = num_nodes 84 | try: 85 | node_names = train_ds.node_names 86 | except: 87 | node_names = str(list(range(num_nodes))) 88 | 89 | print(f'\nDataset <{train_ds.name.capitalize()}> loaded...') 90 | print(f' Number of nodes: {num_nodes}') 91 | print(f' Training samples: {len(train_parition)}') 92 | if val_loader: 93 | print(f' Validation samples: {len(val_parition)}') 94 | print(f' Test samples: {len(test_ds)}\n') 95 | 96 | ### MODEL 97 | # model = ConvSeqAttentionModel(args).to(device) 98 | # model = RecurrentModel(args).to(device) 99 | # model = MTGNNModel(args).to(device) 100 | model = GNNLSTM(args).to(device) 101 | 102 | optimizer = torch.optim.Adam(model.parameters(), lr=args.lr, betas=args.betas, weight_decay=args.weight_decay) 103 | criterion = nn.MSELoss(reduction='mean') 104 | 105 | # torch.autograd.set_detect_anomaly(True) # uncomment for debugging 106 | 107 | print('Training...') 108 | trainer = Trainer(model, optimizer, criterion) 109 | stamp = datetime.now().strftime("%Y%m%d-%H%M%S") 110 | 111 | # log directory 112 | logdir = os.path.join('runs/', stamp + f' - {args.dataset}') 113 | os.makedirs(logdir, exist_ok=True) 114 | 115 | if args.log_graph: 116 | # save randomly initialised graph for plotting 117 | init_edge_index, init_embedding = model.get_embedding() 118 | init_graph = model.get_graph() 119 | 120 | # TRAINING ### 121 | train_loss_history, val_loss_history, best_model_state = trainer.train( 122 | train_loader, 123 | val_loader, 124 | epochs=args.epochs, 125 | early_stopping=args.early_stopping, 126 | return_model_state=True, 127 | ) 128 | ### TESTING ### 129 | print('Testing...') 130 | # best model parameters 
131 | model.load_state_dict(best_model_state) 132 | 133 | thresholding_results, final_train_loss = trainer.test(thresholding_loader) 134 | test_ds_results, test_loss = trainer.test(test_loader) 135 | print(f' Tresholding Data MSE: {final_train_loss:.6f}') 136 | print(f' Test MSE: {test_loss:.4f}\n') 137 | 138 | with open(os.path.join(logdir, 'loss.txt'), 'w') as f: 139 | f.write(f'Tresholding MSE: {final_train_loss:.6f}\nTest MSE: {test_loss:.6f}\n') 140 | 141 | print('Evaluating Performance...') 142 | 143 | results = evaluate_performance( 144 | thresholding_results, 145 | test_ds_results, 146 | threshold_method=args.thresholding, 147 | smoothing=args.smoothing, 148 | smoothing_method=args.smoothing_method, 149 | ) 150 | result_str = f' {str(results["method"]).capitalize()} thresholding:\n \n' + \ 151 | f' | Normal | Adjusted |\n' + \ 152 | f' ----------|----------------|----------------|\n' + \ 153 | f' Precision | {results["prec"]:>13.3f} | {results["a_prec"]:>13.3f} |\n' + \ 154 | f' Recall | {results["rec"]:>13.3f} | {results["a_rec"]:>13.3f} |\n' + \ 155 | f' F1 / F2 | {results["f1"]:>6.3f} / {results["f2"]:>5.3f} | {results["a_f1"]:>6.3f} / {results["a_f2"]:>5.3f} |\n' + \ 156 | f' ----------|----------------|----------------|----------------\n' + \ 157 | f' | Latency: {results["latency"]:.2f}\n' 158 | 159 | print(result_str) 160 | with open(os.path.join(logdir, f'results_{str(results["method"])}.txt'), 'w') as f: 161 | f.write(result_str) 162 | 163 | ### Uncomment for exhaustive threshold / smoothing parameter search 164 | # precision, recall, f1, f2 = -1, -1, -1, -1 165 | # best_method = None 166 | # j = 0 167 | # for i in range(1, 25+1): 168 | # results = evaluate_performance( 169 | # thresholding_results, 170 | # test_ds_results, 171 | # threshold_method='best', 172 | # smoothing=i, 173 | # smoothing_method=args.smoothing_method, 174 | # ) 175 | # if 1 >= results["f1"] > f1 : 176 | # precision, recall, f1, f2 = results["prec"], results["rec"], 
results["f1"], results["f2"] 177 | # best_method = results["method"] 178 | # j = i 179 | # print(f' Best method: {best_method}') 180 | # print(f' Best smoothing parameter: {j}') 181 | # print(f' Precision: {precision:.4f}') 182 | # print(f' Recall: {recall:.4f}') 183 | # print(f' F1 | F2 scores: {f1:.4f} | {f2:.4f}\n') 184 | 185 | ### RESULTS PLOTS ### 186 | print('Logging Results...') 187 | 188 | with open(os.path.join(logdir, 'model.txt'), 'w') as f: 189 | f.write(str(model)) 190 | 191 | # learned graph 192 | if args.log_graph: 193 | learned_edges, learned_embedding = model.get_embedding() 194 | learned_graph = model.get_graph() 195 | for i in range(len(learned_embedding)): 196 | plot_embedding(init_edge_index, init_embedding[i], node_names, os.path.join(logdir, f'init_emb_{i}.html')) 197 | plot_embedding(learned_edges, learned_embedding[i], node_names, os.path.join(logdir, f'trained_emd_{i}.html')) 198 | 199 | plot_adjacency(init_graph, node_names, os.path.join(logdir, f'init_A.html')) 200 | plot_adjacency(learned_graph, node_names, os.path.join(logdir, f'learned_A.html')) 201 | 202 | # loss 203 | fig = get_loss_plot(train_loss_history, val_loss_history) 204 | fig.savefig(os.path.join(logdir, 'loss_plot.png')) 205 | 206 | # # error distributions 207 | # results_dict = {'Validation': thresholding_results, 'Testing': test_ds_results} 208 | # fig = get_error_distribution_plot(results_dict) 209 | # fig.savefig(os.path.join(logdir, 'error_distribution.png')) 210 | 211 | ### SAVE MODEL ### 212 | torch.save(best_model_state, os.path.join(logdir, 'model.pt')) 213 | 214 | print() # script end 215 | 216 | if __name__ == '__main__': 217 | 218 | device = get_device() 219 | 220 | parser = argparse.ArgumentParser() 221 | 222 | ### -- Data params --- ### 223 | parser.add_argument("-dataset", type=str.lower, default="swat") 224 | parser.add_argument("-window_size", type=int, default=30) 225 | parser.add_argument("-stride", type=int, default=1) 226 | 
parser.add_argument("-horizon", type=int, default=10) 227 | parser.add_argument("-val_split", type=float, default=0.2) 228 | parser.add_argument("-transform", type=str, default='median') 229 | parser.add_argument("-target_transform", type=str, default='median') 230 | parser.add_argument("-normalize", type=lambda x:strtobool(x), default=False) 231 | parser.add_argument("-shuffle_train", type=lambda x:strtobool(x), default=True) 232 | parser.add_argument("-batch_size", type=int, default=64) 233 | 234 | ### -- Model params --- ### 235 | # Sensor embedding 236 | parser.add_argument("-embed_dim", type=int, default=16) 237 | parser.add_argument("-topk", type=int, default=5) 238 | 239 | ### --- Thresholding params --- ### 240 | parser.add_argument("-smoothing", type=int, default=1) 241 | parser.add_argument("-smoothing_method", type=str, default='exp') # exp or mean 242 | parser.add_argument("-thresholding", type=str, default='max') # max or mean 243 | 244 | ### --- Training params --- ### 245 | parser.add_argument("-epochs", type=int, default=50) 246 | parser.add_argument("-early_stopping", type=int, default=20) 247 | parser.add_argument("-lr", type=float, default=1e-3) 248 | parser.add_argument("-betas", nargs=2, type=float, default=(0.9, 0.999)) 249 | parser.add_argument("-weight_decay", type=float, default=0) 250 | parser.add_argument("-device", type=torch.device, default=device) # cpu or cuda 251 | 252 | ### --- Logging params --- ### 253 | parser.add_argument("-log_tensorboard", type=lambda x:strtobool(x), default=False) 254 | parser.add_argument("-log_graph", type=lambda x:strtobool(x), default=True) 255 | 256 | args = parser.parse_args() 257 | 258 | main(args) 259 | -------------------------------------------------------------------------------- /run.sh: -------------------------------------------------------------------------------- 1 | DATASET=$1 2 | 3 | if [[ "$DATASET" == "swat" ]]; then 4 | WINDOW_SIZE=50 5 | HORIZON=1 6 | STRIDE=1 7 | BATCH_SIZE=64 8 | 
EMBED_DIM=32 9 | TOPK=5 10 | EPOCHS=50 11 | VAL_SPLIT=0.2 12 | EARLY_STOPPING=10 13 | SMOOTHING=1 14 | SMOOTHING_METHOD="exp" 15 | THRESHOLDING="best" 16 | TRANSFORM="median" 17 | TARGET_TRANSFORM="median" 18 | NORMALIZE="True" 19 | 20 | elif [[ "$DATASET" == "wadi" ]]; then 21 | WINDOW_SIZE=50 22 | STRIDE=1 23 | HORIZON=1 24 | BATCH_SIZE=64 25 | EMBED_DIM=32 26 | TOPK=8 27 | EPOCHS=50 28 | VAL_SPLIT=0.1 29 | EARLY_STOPPING=10 30 | SMOOTHING=1 31 | SMOOTHING_METHOD="exp" 32 | THRESHOLDING="best" 33 | TRANSFORM="median" 34 | TARGET_TRANSFORM="median" 35 | NORMALIZE="True" 36 | 37 | elif [[ "$DATASET" == "demo" ]]; then 38 | WINDOW_SIZE=25 39 | STRIDE=1 40 | HORIZON=1 41 | BATCH_SIZE=32 42 | EMBED_DIM=16 43 | TOPK=3 44 | EPOCHS=50 45 | VAL_SPLIT=0 46 | EARLY_STOPPING=20 47 | SMOOTHING=1 48 | SMOOTHING_METHOD="mean" 49 | THRESHOLDING="best" 50 | TRANSFORM="none" 51 | TARGET_TRANSFORM="none" 52 | NORMALIZE="False" 53 | fi 54 | 55 | python main.py \ 56 | -dataset $DATASET \ 57 | -window_size $WINDOW_SIZE \ 58 | -horizon $HORIZON \ 59 | -stride $STRIDE \ 60 | -val_split $VAL_SPLIT \ 61 | -batch_size $BATCH_SIZE \ 62 | -embed_dim $EMBED_DIM \ 63 | -topk $TOPK \ 64 | -epochs $EPOCHS \ 65 | -early_stopping $EARLY_STOPPING \ 66 | -smoothing $SMOOTHING \ 67 | -smoothing_method $SMOOTHING_METHOD \ 68 | -thresholding $THRESHOLDING \ 69 | -normalize $NORMALIZE \ 70 | -------------------------------------------------------------------------------- /src/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/timbrockmeyer/mulivariate-time-series-anomaly-detection/bef170c5a20e00e5316002afdc3fdd445aa43777/src/__init__.py -------------------------------------------------------------------------------- /src/data/SlidingWindowDataset.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch.utils.data import Dataset 3 | from torch_geometric.data import Data 
4 | from sklearn.preprocessing import MinMaxScaler 5 | 6 | class SlidingWindowDataset(Dataset): 7 | ''' 8 | Dataset class for multivariate time series data. 9 | Each returned sample of the dataset is a sliding window of specific length 10 | given as a pytorch geometric data objects. 11 | https://pytorch-geometric.readthedocs.io/en/latest/modules/data.html#torch_geometric.data.Data 12 | 13 | This should serve as the base class for specific datasets 14 | 15 | Args: 16 | data (Tensor): 2d tensor with one timesteps in the rows and sensors as columns. 17 | window_size (int): Length of the sliding window of each sample. 18 | stride (int, optional): Stride length a which the dataset is sampled. 19 | horizon (int, optional): Number of timesteps used as prediction target. 20 | labels (Tensor, optional): Anomaly labels (None during training). 21 | transform (callable, optional): Transform appplied to the data. 22 | target_transform (callable, optional): Transform applied to the labels. 23 | device (str, optional): Device where data will be held (cpu, cuda). 
24 | ''' 25 | def __init__(self, data, window_size, stride=1, horizon=1, labels=None, transform=None, target_transform=None, normalize=False, device='cpu'): 26 | 27 | self.window_size = window_size 28 | self.stride = stride 29 | self.horizon = horizon 30 | self.normalize = normalize 31 | self.device = torch.device(device) 32 | 33 | self.dataset = self._process(data, labels, transform, target_transform) 34 | 35 | def _process(self, data, labels, transform, target_transform): 36 | assert isinstance(data, torch.Tensor) 37 | assert isinstance(labels, (type(None), torch.Tensor)) 38 | 39 | _, info = self.meta 40 | if self.normalize: 41 | train_meta = info['train'] 42 | min_ = torch.tensor(train_meta['min'], requires_grad=False) 43 | max_ = torch.tensor(train_meta['max'], requires_grad=False) 44 | 45 | fit_data = torch.stack([min_, max_], dim=0).detach().cpu().numpy() 46 | 47 | normalizer = MinMaxScaler(feature_range=(0,1)).fit(fit_data) 48 | 49 | data = torch.tensor(normalizer.transform(data.cpu().numpy())).to(self.device) 50 | 51 | data = data.to(self.device).T.float() 52 | 53 | if transform is not None: 54 | data = transform(data) 55 | 56 | self.num_nodes = data.size(0) 57 | 58 | if labels is not None: 59 | labels = labels.to(self.device) 60 | 61 | if target_transform is not None: 62 | labels = target_transform(labels) 63 | 64 | self._len = ((data.size(1) - self.window_size - self.horizon) // self.stride) + 1 65 | 66 | dataset = [] 67 | for idx in range(self._len): 68 | id = idx 69 | idx *= self.stride 70 | x = data[:, idx : idx + self.window_size] 71 | y = data[:, idx + self.window_size : idx + self.window_size + self.horizon] 72 | 73 | if labels == None: 74 | y_label = None 75 | else: 76 | y_label = labels[idx + self.window_size : idx + self.window_size + self.horizon] 77 | 78 | window = Data(x=x, edge_idx=None, edge_attr=None, y=y, y_label=y_label, id=id) 79 | dataset.append(window) 80 | 81 | return dataset 82 | 83 | def __getitem__(self, idx): 84 | return 
self.dataset[idx] 85 | 86 | def __iter__(self): 87 | self._idx = 0 88 | return self 89 | 90 | def __next__(self): 91 | if self._idx >= self._len: 92 | raise StopIteration 93 | 94 | item = self.dataset[self._idx] 95 | self._idx += 1 96 | return item 97 | 98 | def __repr__(self): 99 | return f'{self.__class__.__name__}(num_nodes={self.num_nodes}, window_size={self.window_size}, stride={self.stride}, horizon={self.horizon})' 100 | 101 | def __len__(self): 102 | return self._len 103 | -------------------------------------------------------------------------------- /src/data/Transforms.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | 4 | # Transforms applied to datasets, e.g. for downsampling to speed up training 5 | 6 | class BaseSampling(torch.nn.Module): 7 | ''' 8 | Base class for sampling transforms applied to tensors. 9 | 10 | Args: 11 | k (int): Number of samples to be aggregated. 12 | ''' 13 | def __init__(self, k): 14 | super().__init__() 15 | 16 | self._k = k 17 | 18 | def forward(self, x): 19 | dims = len(x.shape) 20 | p1d = 0, (self._k - len(x) % self._k) % self._k 21 | x = F.pad(x, p1d, "constant", 0) 22 | x = x.unfold(dims-1, self._k, self._k) 23 | x = self.sampling(x) 24 | return x 25 | 26 | def sampling(self, x): 27 | raise NotImplementedError 28 | 29 | class MedianSampling2d(BaseSampling): 30 | ''' 31 | Returns a 2d tensor where each row is downsampled with the median of k values. 32 | Only for 2d tensors. 33 | ''' 34 | def __init__(self, k): 35 | super().__init__(k) 36 | 37 | def sampling(self, x): 38 | assert len(x.shape) == 3 39 | x, _ = x.median(dim=2) 40 | return x 41 | 42 | class MedianSampling1d(BaseSampling): 43 | ''' 44 | Returns a 1d tensor that is downsampled with the median of k values. 45 | Only for 1d tensors. 
46 | ''' 47 | def __init__(self, k): 48 | super().__init__(k) 49 | 50 | def sampling(self, x): 51 | assert len(x.shape) == 2 52 | x, _ = x.median(dim=1) 53 | return x 54 | 55 | class MaxSampling1d(BaseSampling): 56 | ''' 57 | Returns a 1d tensor that is downsampled with the maximum of k values. 58 | Only for 1d tensors. 59 | ''' 60 | def __init__(self, k): 61 | super().__init__(k) 62 | 63 | def sampling(self, x): 64 | assert len(x.shape) == 2 65 | x, _ = x.max(dim=1) 66 | return x 67 | 68 | -------------------------------------------------------------------------------- /src/data/__init__.py: -------------------------------------------------------------------------------- 1 | from .SlidingWindowDataset import SlidingWindowDataset -------------------------------------------------------------------------------- /src/datasets/__init__.py: -------------------------------------------------------------------------------- 1 | from .demo import Demo 2 | from .swat import Swat 3 | from .wadi import Wadi 4 | from .from_csv import From_csv -------------------------------------------------------------------------------- /src/datasets/demo.py: -------------------------------------------------------------------------------- 1 | import os 2 | import json 3 | import numpy as np 4 | import torch 5 | from shutil import rmtree 6 | 7 | from ..data import SlidingWindowDataset 8 | 9 | class Demo(SlidingWindowDataset): 10 | 11 | ''' 12 | Small excerpt from the MSL dataset used for testing 13 | ''' 14 | 15 | def __init__(self, window_size=1, stride=1, horizon=1, train=True, transform=None, target_transform=None, normalize=False, device='cpu'): 16 | 17 | self.device = device 18 | 19 | self.name = 'demo' 20 | 21 | train_file = 'train.csv' 22 | test_file = 'test.csv' 23 | 24 | root = os.path.dirname(__file__) 25 | raw_dir = os.path.join(root, f'files/raw/{self.name}') 26 | self.processed_dir = os.path.join(root, f'files/processed/{self.name}') 27 | 28 | self.raw_paths = 
[os.path.join(raw_dir, ending) for ending in [train_file, test_file]] 29 | self.processed_paths = [os.path.join(self.processed_dir, ending) for ending in ['train.pt', 'test.pt', 'labels.pt', 'list.txt', 'meta.json']] 30 | 31 | data, labels, node_names = self.load(train) 32 | 33 | self.node_names = node_names 34 | 35 | super().__init__(data, window_size, stride=stride, horizon=horizon, labels=labels, transform=transform, target_transform=target_transform, normalize=normalize, device=device) 36 | 37 | def load(self, train): 38 | 39 | # process csv files if not done 40 | if not all(map(lambda x: os.path.isfile(x), self.processed_paths)): 41 | self.process() 42 | 43 | # check if processed and load 44 | if all(map(lambda x: os.path.isfile(x), self.processed_paths)): 45 | if train: 46 | data = torch.load(self.processed_paths[0], map_location=self.device) 47 | labels = None 48 | else: 49 | data = torch.load(self.processed_paths[1], map_location=self.device) 50 | labels = torch.load(self.processed_paths[2], map_location=self.device) 51 | sensor_list = np.loadtxt(self.processed_paths[3], dtype=str) 52 | with open(self.processed_paths[4], 'r') as f: 53 | self.meta = json.load(f) 54 | else: 55 | raise Exception(f'{self.name} dataset file processing failed') 56 | 57 | return data, labels, sensor_list 58 | 59 | def process(self): 60 | 61 | # purge old files if any exist 62 | if os.path.exists(self.processed_dir): 63 | rmtree(self.processed_dir) 64 | 65 | # load csv file 66 | train_csv = np.genfromtxt(self.raw_paths[0], delimiter=",") 67 | train_data = torch.from_numpy(train_csv[1:,1:]).float().to(self.device) 68 | 69 | test_csv = np.genfromtxt(self.raw_paths[1], delimiter=",") 70 | test_data = torch.from_numpy(test_csv[1:,1:]).float().to(self.device) 71 | test_data, test_labels = test_data[:,:-1], test_data[:,-1] 72 | 73 | with open(self.raw_paths[0], 'r') as f: 74 | line = f.readline().split(',')[1:] 75 | sensor_list = np.array(list(map(str.strip, line)), dtype=str) 76 | 77 | 
meta = [self.name, { 78 | 'num_nodes': train_data.size(1), 79 | 'train': { 80 | 'samples': train_data.size(0), 81 | 'min': train_data.min(dim=0)[0].tolist(), 82 | 'max': train_data.max(dim=0)[0].tolist(), 83 | }, 84 | 'test': { 85 | 'samples': test_data.size(0), 86 | 'min': test_data.min(dim=0)[0].tolist(), 87 | 'max': test_data.max(dim=0)[0].tolist(), 88 | } 89 | }] 90 | 91 | os.makedirs(self.processed_dir) 92 | torch.save(train_data, self.processed_paths[0]) 93 | torch.save(test_data, self.processed_paths[1]) 94 | torch.save(test_labels, self.processed_paths[2]) 95 | np.savetxt(self.processed_paths[3], sensor_list, delimiter='\n', fmt='%s') 96 | dump = json.dumps(meta, indent=4) 97 | with open(self.processed_paths[4], 'w') as f: 98 | f.write(dump) 99 | -------------------------------------------------------------------------------- /src/datasets/files/.gitignore: -------------------------------------------------------------------------------- 1 | raw/* 2 | processed/* 3 | !raw/demo/ 4 | !processed/demo/ -------------------------------------------------------------------------------- /src/datasets/files/processed/demo/labels.pt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/timbrockmeyer/mulivariate-time-series-anomaly-detection/bef170c5a20e00e5316002afdc3fdd445aa43777/src/datasets/files/processed/demo/labels.pt -------------------------------------------------------------------------------- /src/datasets/files/processed/demo/list.txt: -------------------------------------------------------------------------------- 1 | M-06 2 | M-01 3 | M-02 4 | S-02 5 | P-10 6 | T-04 7 | T-05 8 | F-07 9 | M-03 10 | M-04 11 | M-05 12 | P-15 13 | C-01 14 | C-02 15 | T-12 16 | T-13 17 | F-04 18 | F-05 19 | D-14 20 | T-09 21 | P-14 22 | T-08 23 | P-11 24 | D-15 25 | D-16 26 | M-07 27 | F-08 28 | -------------------------------------------------------------------------------- 
/src/datasets/files/processed/demo/meta.json: -------------------------------------------------------------------------------- 1 | [ 2 | "demo", 3 | { 4 | "num_nodes": 27, 5 | "train": { 6 | "samples": 1565, 7 | "min": [ 8 | -1.0, 9 | -0.6427881717681885, 10 | -1.2107261419296265, 11 | -1.0, 12 | 0.987858235836029, 13 | -1.0, 14 | -1.0, 15 | -1.0, 16 | -1.4772167205810547, 17 | -1.4654639959335327, 18 | -1.255005955696106, 19 | 0.7308394908905029, 20 | -1.0, 21 | -1.0, 22 | -1.0, 23 | -1.0, 24 | -1.0, 25 | -1.116377592086792, 26 | -1.0, 27 | -1.0, 28 | 0.9991111159324646, 29 | -1.0, 30 | 0.3225919306278229, 31 | -1.0, 32 | -1.0, 33 | -1.0020240545272827, 34 | -1.0 35 | ], 36 | "max": [ 37 | -1.0, 38 | 2.4922983646392822, 39 | 0.6932034492492676, 40 | 0.0, 41 | 0.9988705515861511, 42 | 0.0, 43 | -1.0, 44 | 1.0, 45 | 1.0000708103179932, 46 | 1.0000054836273193, 47 | 0.981587827205658, 48 | 1.0036537647247314, 49 | 2.1934478282928467, 50 | 0.0, 51 | 1.0, 52 | 1.0, 53 | 0.7545157074928284, 54 | 4.162651062011719, 55 | -1.0, 56 | 1.0, 57 | 1.0, 58 | 1.029411792755127, 59 | 0.9952331185340881, 60 | 1.1915780305862427, 61 | 1.0088798999786377, 62 | 0.4032494127750397, 63 | 1.08695650100708 64 | ] 65 | }, 66 | "test": { 67 | "samples": 2049, 68 | "min": [ 69 | -1.0, 70 | -1.0764756202697754, 71 | -1.0846891403198242, 72 | -1.0, 73 | 0.9903995394706726, 74 | -1.0, 75 | -1.0, 76 | -1.0, 77 | -1.4241974353790283, 78 | -1.3314414024353027, 79 | -1.3089860677719116, 80 | -1.0, 81 | -1.0, 82 | -1.0, 83 | -1.0, 84 | -1.0, 85 | -1.0, 86 | -1.0768855810165405, 87 | -1.0, 88 | -1.0, 89 | 0.9903995394706726, 90 | -1.0, 91 | -1.0, 92 | -1.0, 93 | -1.0, 94 | -1.0, 95 | -1.0 96 | ], 97 | "max": [ 98 | 258.10809326171875, 99 | 2.2688539028167725, 100 | 1.3352758884429932, 101 | 1.0, 102 | 0.9983057975769043, 103 | 1.0, 104 | 1.0, 105 | 1.0, 106 | 1.0000708103179932, 107 | 1.17589271068573, 108 | 1.0951875448226929, 109 | 1.0, 110 | 1.0, 111 | 1.0, 112 | 1.0, 113 | 0.8645181059837341, 114 
| 1.46547269821167, 115 | 1.806624412536621, 116 | 1.0, 117 | 1.0, 118 | 0.9983057975769043, 119 | 1.0196079015731812, 120 | 0.9705088138580322, 121 | 1.3461360931396484, 122 | 1.0, 123 | 1.0, 124 | 1.3043478727340698 125 | ] 126 | } 127 | } 128 | ] -------------------------------------------------------------------------------- /src/datasets/files/processed/demo/test.pt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/timbrockmeyer/mulivariate-time-series-anomaly-detection/bef170c5a20e00e5316002afdc3fdd445aa43777/src/datasets/files/processed/demo/test.pt -------------------------------------------------------------------------------- /src/datasets/files/processed/demo/train.pt: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/timbrockmeyer/mulivariate-time-series-anomaly-detection/bef170c5a20e00e5316002afdc3fdd445aa43777/src/datasets/files/processed/demo/train.pt -------------------------------------------------------------------------------- /src/datasets/from_csv.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | import torch 4 | 5 | from ..data import SlidingWindowDataset 6 | 7 | class From_csv(SlidingWindowDataset): 8 | 9 | ''' 10 | Requires CSV files for training and testing. 11 | Test file is expected to have its labels in the last column, train file to be without labels. 
12 | Naming: 13 | Training data: 'train.csv' 14 | Test data: 'test.csv' 15 | ''' 16 | 17 | def __init__(self, window_size=1, stride=1, horizon=1, train=True, transform=None, target_transform=None, normalize=False, device='cpu'): 18 | 19 | self.device = device 20 | 21 | self.name = 'from_csv' 22 | 23 | train_file = 'train.csv' 24 | test_file = 'test.csv' 25 | 26 | self.meta = (None, None) 27 | self.normalize = False 28 | 29 | root = os.path.dirname(__file__) 30 | raw_dir = os.path.join(root, f'files/raw/{self.name}') 31 | raw_paths = [os.path.join(raw_dir, ending) for ending in [train_file, test_file]] 32 | 33 | if train: 34 | data = np.genfromtxt(raw_paths[0], delimiter=",")[1:,1:] 35 | data = torch.from_numpy(data).float().to(self.device) 36 | labels = None 37 | 38 | else: 39 | data = np.genfromtxt(raw_paths[1], delimiter=",")[1:,1:] 40 | data = torch.from_numpy(data).float().to(self.device) 41 | data, labels = data[:,:-1], data[:,-1] 42 | 43 | super().__init__(data, window_size, stride=stride, horizon=horizon, labels=labels, transform=transform, target_transform=target_transform, normalize=False, device=device) -------------------------------------------------------------------------------- /src/datasets/swat.py: -------------------------------------------------------------------------------- 1 | import os 2 | import json 3 | import numpy as np 4 | import pandas as pd 5 | import torch 6 | from shutil import rmtree 7 | 8 | from ..data import SlidingWindowDataset 9 | 10 | 11 | class Swat(SlidingWindowDataset): 12 | 13 | ''' 14 | LOAD ORIGINAL FILES IN EXCEL FIRST!!! 15 | delete the unnecessary first rows and save as a CSV file. 
16 | 17 | Dataset can be requested from 18 | https://itrust.sutd.edu.sg/testbeds/secure-water-treatment-swat/ 19 | ''' 20 | 21 | def __init__(self, window_size=1, stride=1, horizon=1, train=True, transform=None, target_transform=None, normalize=False, device='cpu'): 22 | 23 | self.device = device 24 | 25 | self.name = 'swat' 26 | 27 | train_file = 'SWaT_Dataset_Normal_v0.csv' 28 | test_file = 'SWaT_Dataset_Attack_v0.csv' 29 | 30 | root = os.path.dirname(__file__) 31 | raw_dir = os.path.join(root, f'files/raw/{self.name}') 32 | self.processed_dir = os.path.join(root, f'files/processed/{self.name}') 33 | 34 | self.raw_paths = [os.path.join(raw_dir, ending) for ending in [train_file, test_file]] 35 | self.processed_paths = [os.path.join(self.processed_dir, ending) for ending in ['train.pt', 'test.pt', 'labels.pt', 'list.txt', 'meta.json']] 36 | 37 | data, labels, node_names = self.load(train) 38 | 39 | self.node_names = node_names 40 | 41 | super().__init__(data, window_size, stride=stride, horizon=horizon, labels=labels, transform=transform, target_transform=target_transform, normalize=normalize, device=device) 42 | 43 | def load(self, train): 44 | 45 | # process csv files if not done 46 | if not all(map(lambda x: os.path.isfile(x), self.processed_paths)): 47 | self.process() 48 | 49 | # check if processed and load 50 | if all(map(lambda x: os.path.isfile(x), self.processed_paths)): 51 | if train: 52 | data = torch.load(self.processed_paths[0], map_location=self.device) 53 | labels = None 54 | else: 55 | data = torch.load(self.processed_paths[1], map_location=self.device) 56 | labels = torch.load(self.processed_paths[2], map_location=self.device) 57 | sensor_list = np.loadtxt(self.processed_paths[3], dtype=str) 58 | with open(self.processed_paths[4], 'r') as f: 59 | self.meta = json.load(f) 60 | else: 61 | raise Exception(f'{self.name} dataset raw file processing failed') 62 | 63 | return data, labels, sensor_list 64 | 65 | def process(self): 66 | 67 | # purge old 
files if any exist 68 | if os.path.exists(self.processed_dir): 69 | rmtree(self.processed_dir) 70 | 71 | files = {'train': self.raw_paths[0], 'test': self.raw_paths[1]} 72 | 73 | for key, file in files.items(): 74 | 75 | df = pd.read_csv(file) 76 | 77 | # strip whitespace from column names 78 | df = df.rename(columns=lambda x: x.strip()) 79 | 80 | # timestamp column to index 81 | df.iloc[:,0] = df.index 82 | df = df.set_index(df.columns[0]) 83 | 84 | if key == 'train': 85 | # drop label column for training data 86 | df = df.drop(df.columns[-1], axis=1) 87 | 88 | column_names = df.columns.to_numpy() 89 | 90 | train_data = df.to_numpy() 91 | train_data = torch.from_numpy(train_data).float().to(self.device) 92 | 93 | else: 94 | # categorical labels to numerical values 95 | vocab = {'Normal': 0, 'Attack': 1, 'A ttack': 1} 96 | df.iloc[:,-1] = df.iloc[:,-1].apply(lambda x: vocab[x]) 97 | 98 | test_data = df.to_numpy() 99 | test_data = torch.from_numpy(test_data).float().to(self.device) 100 | test_data, test_labels = test_data[:,:-1], test_data[:,-1] 101 | 102 | meta = [self.name, { 103 | 'num_nodes': train_data.size(1), 104 | 'train': { 105 | 'samples': train_data.size(0), 106 | 'min': train_data.min(dim=0)[0].tolist(), 107 | 'max': train_data.max(dim=0)[0].tolist(), 108 | }, 109 | 'test': { 110 | 'samples': test_data.size(0), 111 | 'min': test_data.min(dim=0)[0].tolist(), 112 | 'max': test_data.max(dim=0)[0].tolist(), 113 | } 114 | }] 115 | 116 | os.makedirs(self.processed_dir, exist_ok=True) 117 | torch.save(train_data, self.processed_paths[0]) 118 | torch.save(test_data, self.processed_paths[1]) 119 | torch.save(test_labels, self.processed_paths[2]) 120 | np.savetxt(self.processed_paths[3], column_names, fmt = "%s") 121 | dump = json.dumps(meta, indent=4) 122 | with open(self.processed_paths[4], 'w') as f: 123 | f.write(dump) 124 | -------------------------------------------------------------------------------- /src/datasets/wadi.py: 
-------------------------------------------------------------------------------- 1 | import os 2 | import json 3 | import numpy as np 4 | import pandas as pd 5 | import torch 6 | from shutil import rmtree 7 | 8 | from ..data import SlidingWindowDataset 9 | 10 | 11 | class Wadi(SlidingWindowDataset): 12 | 13 | ''' LOAD ORIGINAL FILES IN EXCEL FIRST!!! 14 | delete the unnecessary first rows and save as a CSV file. 15 | 16 | The dataset includes a PDF with descriptions of 15 anomaly events, 17 | including start and end dates (m/d/y) and times. 18 | -> copy the following table and save as "WADI_attacktimes.csv": 19 | 20 | Start_Date Start_Time End_Date End_Time 21 | 10/9/2017 19:25:00 10/9/2017 19:50:16 22 | 10/10/2017 10:24:10 10/10/2017 10:34:00 23 | 10/10/2017 10:55:00 10/10/2017 11:24:00 24 | 10/10/2017 11:30:40 10/10/2017 11:44:50 25 | 10/10/2017 13:39:30 10/10/2017 13:50:40 26 | 10/10/2017 14:48:17 10/10/2017 14:59:55 27 | 10/10/2017 17:40:00 10/10/2017 17:49:40 28 | 10/10/2017 10:55:00 10/10/2017 10:56:27 29 | 10/11/2017 11:17:54 10/11/2017 11:31:20 30 | 10/11/2017 11:36:31 10/11/2017 11:47:00 31 | 10/11/2017 11:59:00 10/11/2017 12:05:00 32 | 10/11/2017 12:07:30 10/11/2017 12:10:52 33 | 10/11/2017 12:16:00 10/11/2017 12:25:36 34 | 10/11/2017 15:26:30 10/11/2017 15:37:00 35 | 36 | Data can be requested from 37 | https://itrust.sutd.edu.sg/itrust-labs-home/itrust-labs_wadi/ 38 | ''' 39 | 40 | def __init__(self, window_size=1, stride=1, horizon=1, train=True, transform=None, target_transform=None, normalize=False, device='cpu'): 41 | 42 | self.device = device 43 | 44 | self.name = 'wadi' 45 | 46 | train_file = 'WADI_14days.csv' 47 | test_file = 'WADI_attackdata.csv' 48 | label_file = 'WADI_attacktimes.csv' 49 | 50 | root = os.path.dirname(__file__) 51 | raw_dir = os.path.join(root, f'files/raw/{self.name}') 52 | self.processed_dir = os.path.join(root, f'files/processed/{self.name}') 53 | 54 | self.raw_paths = [os.path.join(raw_dir, ending) for ending in 
[train_file, test_file, label_file]] 55 | self.processed_paths = [os.path.join(self.processed_dir, ending) for ending in ['train.pt', 'test.pt', 'labels.pt', 'list.txt', 'meta.json']] 56 | 57 | data, labels, sensor_names = self.load(train) 58 | 59 | self.node_names = sensor_names 60 | 61 | super().__init__(data, window_size, stride=stride, horizon=horizon, labels=labels, transform=transform, target_transform=target_transform, normalize=normalize, device=device) 62 | 63 | def load(self, train): 64 | 65 | # process csv files if not done 66 | if not all(map(lambda x: os.path.isfile(x), self.processed_paths)): 67 | self.process() 68 | 69 | # check if processed and load 70 | if all(map(lambda x: os.path.isfile(x), self.processed_paths)): 71 | if train: 72 | data = torch.load(self.processed_paths[0], map_location=self.device) 73 | labels = None 74 | else: 75 | data = torch.load(self.processed_paths[1], map_location=self.device) 76 | labels = torch.load(self.processed_paths[2], map_location=self.device) 77 | sensor_list = np.loadtxt(self.processed_paths[3], dtype=str) 78 | with open(self.processed_paths[4], 'r') as f: 79 | self.meta = json.load(f) 80 | else: 81 | raise Exception(f'{self.name} dataset raw file processing failed') 82 | 83 | return data, labels, sensor_list 84 | 85 | def process(self): 86 | 87 | # purge old files if any exist 88 | if os.path.exists(self.processed_dir): 89 | rmtree(self.processed_dir) 90 | 91 | df_train = pd.read_csv(self.raw_paths[0]) 92 | df_test = pd.read_csv(self.raw_paths[1]) 93 | anomaly_timeframes = pd.read_csv(self.raw_paths[2]) 94 | 95 | def drop_columns(columns): 96 | df_train.drop(columns, axis=1, inplace=True) 97 | df_test.drop(columns, axis=1, inplace=True) 98 | 99 | assert list(df_train.columns) == list(df_test.columns) 100 | 101 | # find row indices of anomaly interval 102 | start_indices = pd.merge(df_test, anomaly_timeframes, left_on=['Date', 'Time'], right_on=['Start_Date', 'Start_Time'])['Row'] 103 | end_indices = 
pd.merge(df_test, anomaly_timeframes, left_on=['Date', 'Time'], right_on=['End_Date', 'End_Time'])['Row'] 104 | assert start_indices.shape == end_indices.shape 105 | 106 | # add anomaly labels to test data 107 | labels = pd.Series(np.zeros(len(df_test))) 108 | for a,b in zip(start_indices,end_indices): 109 | labels[a:b+1] = np.ones((b-a)+1) 110 | df_test['label'] = labels 111 | 112 | # drop date and time columns 113 | datetime_cols = ['Date', 'Time'] 114 | drop_columns(datetime_cols) 115 | 116 | # fix columns 117 | for df in [df_train, df_test]: 118 | # set index column 119 | df.rename(columns={df.columns[0]:'timestamp'}, inplace=True) 120 | df.iloc[:,0] = df.index 121 | df.set_index(df.columns[0], inplace=True) 122 | # strip column names 123 | df.rename(columns=lambda x: x.strip(), inplace=True) 124 | # shorten column names 125 | df.columns = [x.split('\\')[-1] for x in df.columns] 126 | 127 | # account for missing data 128 | # completely empty columns in training or test data 129 | empty_columns = [col for col in df_train.columns if df_train[col].isnull().all() or df_test[col].isnull().all()] 130 | drop_columns(empty_columns) 131 | # other missing values 132 | assert not df_test.isnull().any().any() 133 | df_train = df_train.interpolate(method='nearest') 134 | 135 | # columns with zero variance in test data 136 | zero_var_columns_test = [col for col in df_test.columns if df_test[col].var() == 0] 137 | # columns with extremely high variance in training data 138 | extreme_var_columns_train = [col for col in df_train.columns if df_train[col].var() > 10000] 139 | drop_columns(zero_var_columns_test + extreme_var_columns_train) 140 | 141 | assert list(df_train.columns) == list(df_test.columns)[:-1] 142 | 143 | column_names = df_train.columns.to_numpy() 144 | 145 | train_data = df_train.to_numpy() 146 | train_data = torch.from_numpy(train_data).float().to(self.device) 147 | 148 | test_data = df_test.to_numpy() 149 | test_data = 
torch.from_numpy(test_data).float().to(self.device) 150 | 151 | meta = [self.name, { 152 | 'num_nodes': train_data.size(1), 153 | 'train': { 154 | 'samples': train_data.size(0), 155 | 'min': train_data.min(dim=0)[0].tolist(), 156 | 'max': train_data.max(dim=0)[0].tolist(), 157 | }, 158 | 'test': { 159 | 'samples': test_data.size(0), 160 | 'min': test_data.min(dim=0)[0].tolist(), 161 | 'max': test_data.max(dim=0)[0].tolist(), 162 | } 163 | }] 164 | 165 | test_data, test_labels = test_data[:,:-1], test_data[:,-1] 166 | 167 | assert train_data.size(1) == test_data.size(1) 168 | 169 | os.makedirs(self.processed_dir, exist_ok=True) 170 | torch.save(train_data, self.processed_paths[0]) 171 | torch.save(test_data, self.processed_paths[1]) 172 | torch.save(test_labels, self.processed_paths[2]) 173 | np.savetxt(self.processed_paths[3], column_names, fmt = "%s") 174 | dump = json.dumps(meta, indent=4) 175 | with open(self.processed_paths[4], 'w') as f: 176 | f.write(dump) 177 | 178 | 179 | 180 | 181 | 182 | 183 | 184 | 185 | 186 | 187 | 188 | 189 | 190 | 191 | -------------------------------------------------------------------------------- /src/layers/__init__.py: -------------------------------------------------------------------------------- 1 | from .embedding import SingleEmbedding, DoubleEmbedding 2 | from .attention import EmbeddingAttention 3 | -------------------------------------------------------------------------------- /src/layers/attention.py: -------------------------------------------------------------------------------- 1 | import math 2 | import torch 3 | import torch.nn.functional as F 4 | from torch.nn import Parameter, Linear 5 | from torch.autograd import Variable 6 | from torch_geometric.nn.conv import MessagePassing 7 | from torch_geometric.utils import remove_self_loops, add_self_loops, softmax 8 | from torch_geometric.utils.to_dense_adj import to_dense_adj 9 | 10 | from torch_geometric.nn.inits import glorot, zeros 11 | 12 | 13 | class 
EmbeddingAttention(MessagePassing): 14 | ''' 15 | GATConv layer that computes concatenated attention scores for a graph time series window 16 | and corresponding node embedding values. 17 | Modification of the implementation of GATConv in PyTorch Geometric. 18 | https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#torch_geometric.nn.conv.GATConv 19 | ''' 20 | 21 | def __init__(self, in_channels, out_channels, heads=1, concat=True, 22 | negative_slope=0.2, dropout=0.0, 23 | add_self_loops=True, bias=True, **kwargs): 24 | kwargs.setdefault('aggr', 'add') 25 | super().__init__(node_dim=0, **kwargs) 26 | 27 | self.in_channels = in_channels 28 | self.out_channels = out_channels 29 | self.heads = heads 30 | self.concat = concat 31 | self.negative_slope = negative_slope 32 | self.dropout = dropout 33 | self.add_self_loops = add_self_loops 34 | 35 | # transformations on source and target nodes (will be the same in sensor network) 36 | self.lin_src = Linear(in_channels, heads * out_channels, bias=False) 37 | self.lin_dst = self.lin_src 38 | 39 | # learnable parameters to compute attention coefficients 40 | # double number of parameters; for node features and sensor embedding 41 | self.att_src = Parameter(torch.Tensor(1, heads, 2*out_channels)) 42 | self.att_dst = Parameter(torch.Tensor(1, heads, 2*out_channels)) 43 | 44 | if bias and concat: 45 | self.bias = Parameter(torch.Tensor(heads * out_channels)) 46 | elif bias and not concat: 47 | self.bias = Parameter(torch.Tensor(out_channels)) 48 | else: 49 | self.register_parameter('bias', None) 50 | 51 | self._alpha = None 52 | 53 | self.reset_parameters() 54 | 55 | def reset_parameters(self): 56 | self.lin_src.reset_parameters() 57 | self.lin_dst.reset_parameters() 58 | glorot(self.lin_src.weight) 59 | glorot(self.att_src) 60 | glorot(self.att_dst) 61 | zeros(self.bias) 62 | 63 | def forward(self, x, edge_index, embedding, size=None, return_attention_weights=False): 64 | 65 | H, C = self.heads, self.out_channels 
66 | 67 | # transform input node features 68 | assert x.dim() == 2, "Static graphs not supported in 'GATConv'" 69 | x_src = x_dst = self.lin_src(x).view(-1, H, C) 70 | x = (x_src, x_dst) 71 | 72 | # shape [num_nodes*batch_size, embed_dim] -> [num_nodes*batch_size, heads, embed_dim] 73 | assert embedding.size(1) == C 74 | emb_src = emb_dst = embedding.unsqueeze(1).expand(-1, H, C) 75 | 76 | # combined representation of node features and embedding 77 | src = torch.cat([x_src, emb_src], dim=2) 78 | dst = torch.cat([x_dst, emb_dst], dim=2) 79 | # compute node-level attention coefficients 80 | alpha_src = (src * self.att_src).sum(dim=-1) 81 | alpha_dst = (dst * self.att_dst).sum(dim=-1) 82 | alpha = (alpha_src, alpha_dst) 83 | 84 | if self.add_self_loops: 85 | num_nodes = x_src.size(0) 86 | edge_index, _ = remove_self_loops(edge_index) 87 | edge_index, _ = add_self_loops(edge_index, num_nodes=num_nodes) 88 | 89 | # propagate_type: (x: OptPairTensor, alpha: OptPairTensor) 90 | out = self.propagate(edge_index, x=x, alpha=alpha, size=size) 91 | 92 | alpha = self._alpha 93 | assert alpha is not None 94 | self._alpha = None 95 | 96 | if self.concat: 97 | out = out.view(-1, self.heads * self.out_channels) 98 | else: 99 | out = out.mean(dim=1) 100 | 101 | if self.bias is not None: 102 | out += self.bias 103 | 104 | if return_attention_weights: 105 | return out, (edge_index, alpha) 106 | else: 107 | return out 108 | 109 | def message(self, x_j, alpha_j, alpha_i, index, ptr, size_i): 110 | 111 | alpha = alpha_j if alpha_i is None else alpha_j + alpha_i 112 | 113 | alpha = F.leaky_relu(alpha, self.negative_slope) 114 | alpha = softmax(alpha, index, ptr, size_i) 115 | self._alpha = alpha # Save for later use. 
116 | alpha = F.dropout(alpha, p=self.dropout, training=self.training) 117 | 118 | msg = x_j * alpha.unsqueeze(-1) 119 | 120 | return msg 121 | 122 | def __repr__(self): 123 | return '{}({}, {}, heads={})'.format(self.__class__.__name__, 124 | self.in_channels, 125 | self.out_channels, self.heads) -------------------------------------------------------------------------------- /src/layers/embedding.py: -------------------------------------------------------------------------------- 1 | from torch import nn 2 | import torch.nn.functional as F 3 | import torch 4 | import math 5 | from ..utils.device import get_device 6 | 7 | class SingleEmbedding(nn.Module): 8 | r''' Layer for graph representation learning 9 | using a linear embedding layer and cosine similarity 10 | to produce an index list of edges for a fixed number of 11 | neighbors for each node. 12 | 13 | Args: 14 | num_nodes (int): Number of nodes. 15 | embed_dim (int): Dimension of embedding. 16 | topk (int, optional): Number of neighbors per node. 17 | ''' 18 | 19 | def __init__(self, num_nodes, embed_dim, topk=15, warmup_epochs = 20): 20 | super().__init__() 21 | 22 | self.device = get_device() 23 | 24 | self.topk = topk 25 | self.embed_dim = embed_dim 26 | self.num_nodes = num_nodes 27 | 28 | self.embedding = nn.Embedding(num_nodes, embed_dim) 29 | nn.init.kaiming_uniform_(self.embedding.weight, a=math.sqrt(5)) 30 | 31 | self._A = None 32 | self._edges = None 33 | 34 | ### pre-computed index matrices 35 | # square matrix for adjacency matrix indexing 36 | self._edge_indices = torch.arange(num_nodes).to(self.device).expand(num_nodes, num_nodes) # [[1,2,3,4,5], [1,2,3,4,5], ...] 
37 | # matrix containing column indices for the right side of a matrix - will be used to remove all but topk entries 38 | self._i = torch.arange(self.num_nodes).unsqueeze(1).expand(self.num_nodes, self.num_nodes - self.topk).flatten() 39 | 40 | # fully connected graph 41 | self._fc_edge_indices = torch.stack([self._edge_indices.T.flatten(), self._edge_indices.flatten()], dim=0) 42 | 43 | self.warmup_counter = 0 44 | self.warmup_durantion = warmup_epochs 45 | 46 | def get_A(self): 47 | if self._A is None: 48 | self.forward() 49 | return self._A 50 | 51 | def get_E(self): 52 | if self._edges is None: 53 | self.forward() 54 | return self._edges, [self.embedding.weight.clone()] 55 | 56 | def forward(self): 57 | W = self.embedding.weight.clone() # row vector represents sensor embedding 58 | 59 | eps = 1e-8 # avoid division by 0 60 | W_norm = W / torch.clamp(W.norm(dim=1)[:, None], min=eps) 61 | A = W_norm @ W_norm.t() 62 | 63 | # remove self loops 64 | A.fill_diagonal_(0) 65 | 66 | # remove negative scores 67 | A = A.clamp(0) 68 | 69 | if self.warmup_counter < self.warmup_durantion: 70 | edge_indices = self._fc_edge_indices 71 | edge_attr = A.flatten() 72 | 73 | self.warmup_counter += 1 74 | else: 75 | 76 | # topk entries 77 | _, topk_idx = A.sort(descending=True) 78 | 79 | j = topk_idx[:, self.topk:].flatten() 80 | A[self._i, j] = 0 81 | 82 | # # row degree 83 | # row_degree = A.sum(1).view(-1, 1) + 1e-8 # column vector 84 | # col_degree = A.sum(0) + 1e-8 # row vector 85 | 86 | # # normalized adjacency matrix 87 | # A /= torch.sqrt(row_degree) 88 | # A /= torch.sqrt(col_degree) 89 | 90 | msk = A > 0 # boolean mask 91 | 92 | edge_idx_src = self._edge_indices.T[msk] # source edge indices 93 | edge_idx_dst = self._edge_indices[msk] # target edge indices 94 | edge_attr = A[msk].flatten() # edge weights 95 | 96 | # shape [2, topk*num_nodes] tensor holding topk edge-index-pairs for each node 97 | edge_indices = torch.stack([edge_idx_src, edge_idx_dst], dim=0) 98 | 99 | # 
save for later 100 | self._A = A 101 | self._edges = edge_indices 102 | 103 | return edge_indices, edge_attr, A 104 | 105 | 106 | class DoubleEmbedding(nn.Module): 107 | r"""An implementation of the graph learning layer to construct an adjacency matrix. 108 | For details see the paper 109 | "Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks." 110 | 111 | Args: 112 | num_nodes (int): Number of nodes in the graph. 113 | embed_dim (int): Dimension of the node embedding. 114 | topk (int, optional): Number of largest values to consider in constructing the neighbourhood of a node (pick the "nearest" k nodes). 115 | alpha (float, optional): Tanh alpha for generating adjacency matrix, alpha controls the saturation rate 116 | """ 117 | 118 | def __init__(self, num_nodes, embed_dim, topk=5, alpha=3, type='uni', warmup_epochs=20): 119 | 120 | super(DoubleEmbedding, self).__init__() 121 | 122 | self.device = get_device() 123 | 124 | assert type in ['bi', 'uni', 'sym'] 125 | self.graph_type = type 126 | 127 | self.alpha = alpha 128 | 129 | self._embedding1 = nn.Embedding(num_nodes, embed_dim) 130 | self._embedding2 = nn.Embedding(num_nodes, embed_dim) 131 | self._linear1 = nn.Linear(embed_dim, embed_dim) 132 | self._linear2 = nn.Linear(embed_dim, embed_dim) 133 | 134 | nn.init.kaiming_uniform_(self._embedding1.weight, a=math.sqrt(5)) 135 | nn.init.kaiming_uniform_(self._embedding2.weight, a=math.sqrt(5)) 136 | 137 | self._topk = topk 138 | self._num_nodes = num_nodes 139 | 140 | # placeholders 141 | self._A = None 142 | self._edges = None 143 | self._M1 = self._embedding1.weight.clone() 144 | self._M2 = self._embedding2.weight.clone() 145 | 146 | ### pre-computed index matrices 147 | # square matrix for adjacency matrix indexing 148 | self._edge_indices = torch.arange(num_nodes).to(self.device).expand(num_nodes, num_nodes) # [[1,2,3,4,5], [1,2,3,4,5], ...] 
149 | # row indices for entries that will be removed from adjacency matrix 150 | self._i = torch.arange(self._num_nodes).unsqueeze(1).expand(self._num_nodes, self._num_nodes - self._topk).flatten() 151 | 152 | # fully connected graph 153 | self._fc_edge_indices = torch.stack([self._edge_indices.T.flatten(), self._edge_indices.flatten()], dim=0) 154 | 155 | self.warmup_counter = 0 156 | self.warmup_durantion = warmup_epochs 157 | 158 | def get_A(self): 159 | if self._A is None: 160 | self.forward() 161 | return self._A 162 | 163 | def get_E(self): 164 | if self._edges is None: 165 | self.forward() 166 | return self._edges, [self._M1, self._M2] 167 | 168 | def forward(self) -> torch.FloatTensor: 169 | """ 170 | ... 171 | """ 172 | 173 | M1 = self._embedding1.weight.clone() 174 | M2 = self._embedding2.weight.clone() 175 | 176 | self._M1 = M1.data.clone() 177 | self._M2 = M2.data.clone() 178 | 179 | M1 = torch.tanh(self.alpha * self._linear1(M1)) 180 | M2 = torch.tanh(self.alpha * self._linear2(M2)) 181 | 182 | if self.graph_type == 'uni': 183 | A = M1 @ M2.T - M2 @ M1.T # skew symmetric matrix (uni-directed) 184 | 185 | elif self.graph_type == 'bi': # unordered matrix (directed, unconstrained) 186 | A = M1 @ M2.T 187 | 188 | elif self.graph_type == 'sym': # symmetric matrix (undirected) 189 | A = M1 @ M1.T - M2 @ M2.T 190 | # A = A.triu() 191 | 192 | # set negative values to zero 193 | A = F.relu(A) 194 | # no self loops 195 | A.fill_diagonal_(0) 196 | 197 | if self.warmup_counter < self.warmup_durantion: 198 | edge_indices = self._fc_edge_indices 199 | edge_attr = A.flatten() 200 | 201 | self.warmup_counter += 1 202 | else: 203 | # topk entries 204 | _, idx = A.sort(descending=True) 205 | j = idx[:, self._topk:].flatten() # column indices of all entries outside the top k 206 | # remove all but topk 207 | A[self._i, j] = 0 208 | 209 | # # node degrees (num incoming edges) 210 | # row_degree = A.sum(1).view(-1, 1) + 1e-8 # column vector 211 | # col_degree = A.sum(0) + 1e-8 # row vector 212 | 213 
| # # normalized adjacency matrix 214 | # A /= torch.sqrt(row_degree) 215 | # A /= torch.sqrt(col_degree) 216 | 217 | msk = A > 0 # boolean mask 218 | 219 | edge_idx_src = self._edge_indices.T[msk] # source edge indices 220 | edge_idx_dst = self._edge_indices[msk] # target edge indices 221 | edge_attr = A[msk].flatten() # edge weights 222 | 223 | # shape [2, topk*num_nodes] tensor holding topk edge-index-pairs for each node 224 | edge_indices = torch.stack([edge_idx_src, edge_idx_dst], dim=0) 225 | 226 | # save for later 227 | self._A = A 228 | self._edges = edge_indices 229 | 230 | return edge_indices, edge_attr, A 231 | 232 | 233 | class ProjectedEmbedding(nn.Module): 234 | r''' Layer for graph representation learning 235 | using a linear embedding layer and cosine similarity 236 | to produce an index list of edges for a fixed number of 237 | neighbors for each node. 238 | 239 | Args: 240 | num_nodes (int): Number of nodes. 241 | embed_dim (int): Dimension of embedding. 242 | topk (int, optional): Number of neighbors per node. 243 | ''' 244 | 245 | def __init__(self, num_nodes, num_node_features, embed_dim, topk=15): 246 | super().__init__() 247 | 248 | self.topk = topk 249 | self.embed_dim = embed_dim 250 | self.in_features = num_node_features 251 | 252 | self.device = get_device() 253 | 254 | self.embedding_projection = nn.ModuleList([ 255 | nn.Sequential( 256 | nn.Linear(num_node_features, 64), 257 | nn.ReLU(), 258 | nn.Linear(64, embed_dim) 259 | ) for _ in range(num_nodes)] 260 | ) 261 | 262 | self.prev_embed = torch.empty((num_nodes, embed_dim), dtype=torch.float, requires_grad=True) 263 | nn.init.kaiming_uniform_(self.prev_embed, a=math.sqrt(5)) 264 | 265 | self._A = None 266 | self._edges = None 267 | 268 | ### pre-computed index matrices 269 | # square matrix for adjacency matrix indexing 270 | self._edge_indices = torch.arange(num_nodes).to(self.device).expand(num_nodes, num_nodes) # [[1,2,3,4,5], [1,2,3,4,5], ...] 
271 | # matrix containing column indices for the right side of a matrix - will be used to remove all but topk entries 272 | self._i = torch.arange(num_nodes).unsqueeze(1).expand(num_nodes, num_nodes - topk).flatten() 273 | 274 | def get_A(self): 275 | if self._A is None: 276 | self.forward() 277 | return self._A 278 | 279 | def get_E(self): 280 | if self._edges is None: 281 | self.forward() 282 | return self._edges, [self.embedding.weight.clone()] 283 | 284 | def forward(self, x): 285 | # x shape, B, N, F 286 | proj = [] 287 | for i, func in enumerate(self.embedding_projection): 288 | proj.append(func(x[..., i, :])) 289 | M1 = self.prev_embed 290 | M2 = torch.stack(proj, dim=0) 291 | 292 | A = F.relu(M1 @ M2.T - M2 @ M1.T) 293 | 294 | self.prev_embed = M2 295 | 296 | # topk entries 297 | _, topk_idx = A.sort(descending=True) 298 | 299 | j = topk_idx[:, self.topk:].flatten() 300 | A[self._i, j] = 0 301 | 302 | msk = A > 0 # boolean mask 303 | 304 | edge_idx_src = self._edge_indices.T[msk] # source edge indices 305 | edge_idx_dst = self._edge_indices[msk] # target edge indices 306 | edge_attr = A[msk].flatten() # edge weights 307 | 308 | # shape [2, topk*num_nodes] tensor holding topk edge-index-pairs for each node 309 | edge_indices = torch.stack([edge_idx_src, edge_idx_dst], dim=0) 310 | 311 | # save for later 312 | self._A = A 313 | self._edges = edge_indices 314 | 315 | return edge_indices, edge_attr, A 316 | 317 | class ConvEmbedding(nn.Module): 318 | r''' Layer for graph representation learning 319 | using a linear embedding layer and cosine similarity 320 | to produce an index list of edges for a fixed number of 321 | neighbors for each node. 322 | 323 | Args: 324 | num_nodes (int): Number of nodes. 325 | embed_dim (int): Dimension of embedding. 326 | topk (int, optional): Number of neighbors per node. 
327 | ''' 328 | 329 | def __init__(self, num_nodes, num_node_features, embed_dim, topk=15): 330 | super().__init__() 331 | 332 | self.topk = topk 333 | self.embed_dim = embed_dim 334 | self.in_features = num_node_features 335 | 336 | self.device = get_device() 337 | 338 | # INPUT SIZE 25 339 | self.embedding_conv = nn.Sequential( 340 | nn.Conv1d(1, 8, 7), 341 | nn.ReLU(), 342 | nn.BatchNorm1d(8), 343 | nn.Conv1d(8, 16, 5), 344 | nn.ReLU(), 345 | nn.BatchNorm1d(16), 346 | nn.Conv1d(16, 32, 5), 347 | nn.ReLU(), 348 | nn.BatchNorm1d(32), 349 | nn.Flatten(1, -1), 350 | nn.Linear(32*11, 2*32), 351 | nn.ReLU(), 352 | ) 353 | 354 | self._A = None 355 | self._edges = None 356 | 357 | ### pre-computed index matrices 358 | # square matrix for adjacency matrix indexing 359 | self._edge_indices = torch.arange(num_nodes).to(self.device).expand(num_nodes, num_nodes) # [[1,2,3,4,5], [1,2,3,4,5], ...] 360 | # matrix containing column indices for the right side of a matrix - will be used to remove all but topk entries 361 | self._i = torch.arange(num_nodes).unsqueeze(1).expand(num_nodes, num_nodes - topk).flatten() 362 | 363 | def get_A(self): 364 | if self._A is None: 365 | self.forward() 366 | return self._A 367 | 368 | def get_E(self): 369 | if self._edges is None: 370 | self.forward() 371 | return self._edges, [self.embedding.weight.clone()] 372 | 373 | def forward(self, x): 374 | # x shape, B, N, F 375 | 376 | M1, M2 = self.embedding_conv(x.unsqueeze(-2)).chunk() 377 | 378 | A = F.relu(M1 @ M2.T - M2 @ M1.T) 379 | 380 | self.prev_embed = M2 381 | 382 | # topk entries 383 | _, topk_idx = A.sort(descending=True) 384 | 385 | j = topk_idx[:, self.topk:].flatten() 386 | A[self._i, j] = 0 387 | 388 | msk = A > 0 # boolean mask 389 | 390 | edge_idx_src = self._edge_indices.T[msk] # source edge indices 391 | edge_idx_dst = self._edge_indices[msk] # target edge indices 392 | edge_attr = A[msk].flatten() # edge weights 393 | 394 | # shape [2, topk*num_nodes] tensor holding topk 
edge-index-pairs for each node 395 | edge_indices = torch.stack([edge_idx_src, edge_idx_dst], dim=0) 396 | 397 | # save for later 398 | self._A = A 399 | self._edges = edge_indices 400 | 401 | return edge_indices, edge_attr, A 402 | 403 | -------------------------------------------------------------------------------- /src/models/ConvSeqAttention.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | from torch import nn 4 | import math 5 | 6 | from ..utils.device import get_device 7 | 8 | from torch_geometric.nn import ARMAConv 9 | from torch_geometric.nn import Sequential 10 | 11 | from ..layers import DoubleEmbedding 12 | 13 | 14 | class ConvSeqAttentionModel(torch.nn.Module): 15 | ''' 16 | Anomaly detection neural network model for multivariate sensor time series. 17 | Graph structure is randomly initialized and learned during training. 18 | Uses an attention layer that scores the attention weights for the input 19 | time series window and the sensor embedding vector. 20 | 21 | Args: 22 | args (dict): Argparser with config information. 
23 | ''' 24 | def __init__(self, args): 25 | super().__init__() 26 | 27 | self.device = args.device 28 | 29 | self.num_nodes = args.num_nodes 30 | self.horizon = args.horizon 31 | self.topk = args.topk 32 | self.embed_dim = args.embed_dim 33 | self.lags = args.window_size 34 | 35 | # learned graph embeddings 36 | self.graph_embedding = DoubleEmbedding(self.num_nodes, self.embed_dim, topk=self.topk, type='uni', warmup_epochs=50).to(self.device) 37 | 38 | # model parameters 39 | kernels = [5, 3] 40 | channels = [16, 32] 41 | hidden_dim = 64 42 | 43 | # GNN ENCODER ::: outputs one hidden state for each time step 44 | self.conv_encoder = Sequential('x, idx, attr', [ 45 | (STConv(1, channels[0], channels[0], kernels[0], p=0.2, padding=True, residual=True), 'x, idx, attr -> x'), 46 | (STConv(channels[0], channels[1], channels[1], kernels[1], p=0.2, padding=True, residual=True), 'x, idx, attr -> x'), 47 | (STConv(channels[1], hidden_dim, hidden_dim, kernels[1], p=0.2, padding=True, residual=True), 'x, idx, attr -> x') 48 | ]) 49 | 50 | # linear transformation of encoder hidden states for alignment scores 51 | self.alignment_W = nn.Linear(hidden_dim, hidden_dim) 52 | 53 | # GNN DECODER ::: outputs single vector hidden state 54 | self.decoder_window_length = sum(kernels) - len(kernels) + 1 55 | self.conv_decoder = Sequential('x, idx, attr', [ 56 | (nn.Sequential( 57 | nn.Conv1d(1, 2*channels[0], kernels[0]), 58 | nn.BatchNorm1d(2*channels[0]), 59 | nn.GLU(dim=1), 60 | nn.Dropout(0.2), 61 | nn.Conv1d(channels[0], 2*channels[1], kernels[1]), 62 | nn.BatchNorm1d(2*channels[1]), 63 | nn.GLU(dim=1), 64 | nn.Dropout(0.2), 65 | nn.Flatten(1, -1),), 'x, -> x'), 66 | (ARMAConv( 67 | in_channels=channels[1], 68 | out_channels=hidden_dim, 69 | num_stacks=1, 70 | num_layers=1, 71 | act=nn.GELU(), 72 | dropout=0.2,), 'x, idx, attr -> x'), 73 | (nn.LayerNorm(hidden_dim), 'x -> x'), 74 | ]) 75 | 76 | # prediction layer 77 | pred_channels = 2*hidden_dim 78 | self.pred = Sequential('x, idx, 
attr', [ 79 | (nn.Linear(pred_channels, 1), 'x -> x'), 80 | ]) 81 | 82 | # absolute positional embeddings based on sine and cosine functions 83 | position = torch.arange(1000).unsqueeze(1) 84 | div_term = torch.exp(torch.arange(0, hidden_dim, 2) * (-math.log(1e5) / hidden_dim)) 85 | pe = torch.zeros(1000, hidden_dim) 86 | pe[:, 0::2] = torch.sin(position * div_term) / 1e9 87 | pe[:, 1::2] = torch.cos(position * div_term) / 1e9 88 | self.positional_embedding = F.dropout(pe[:self.lags - self.decoder_window_length], 0.05).to(self.device) # vector with PEs for each input timestep 89 | 90 | # cached offsets for batch stacking for each batch_size and number of edges 91 | self.batch_edge_offset_cache = {} 92 | 93 | # initial graph 94 | self._edge_index, self.edge_attr, self.A = self.graph_embedding() 95 | 96 | def get_graph(self): 97 | return self.graph_embedding.get_A() 98 | 99 | def get_embedding(self): 100 | return self.graph_embedding.get_E() 101 | 102 | def forward(self, window): 103 | # batch stacked window; input shape: [num_nodes*batch_size, lags] 104 | N = self.num_nodes # number of nodes 105 | T = self.lags # number of input time steps 106 | B = window.size(0) // N # batch size 107 | 108 | # get learned graph representation 109 | edge_index, edge_attr, _ = self.graph_embedding() 110 | 111 | # batching works by stacking graphs; creates a mega graph with disjointed subgraphs 112 | # for each input sample. E.g. for a batch of B inputs with 51 nodes each; 113 | # samples i in {0, ..., B} => node indices [0...50], [51...101], [102...152], ... 
, [51*(B-1)...51*B-1] 114 | # => node indices for sample i = [0, ..., num_nodes-1] + (i*num_nodes) 115 | num_edges = len(edge_attr) 116 | try: 117 | batch_offset = self.batch_edge_offset_cache[(B, num_edges)] 118 | except KeyError: 119 | batch_offset = torch.arange(0, N * B, N).view(1, B, 1).expand(2, B, num_edges).flatten(1,-1).to(self.device) 120 | self.batch_edge_offset_cache[(B, num_edges)] = batch_offset 121 | # repeat edge indices B times and add i*num_nodes where i is the input index 122 | batched_edge_index = edge_index.unsqueeze(1).expand(2, B, -1).flatten(1, -1) + batch_offset 123 | # repeat edge weights B times 124 | batched_edge_attr = edge_attr.unsqueeze(0).expand(B, -1).flatten() 125 | 126 | # add node feature dimension to input 127 | window = window.unsqueeze(1) # (B, 1, T) 128 | encoder_window = window[..., :-self.decoder_window_length] # encoder takes beginning of input window 129 | decoder_window = window[..., -self.decoder_window_length:] # decoder takes the end 130 | 131 | # hidden states for all input time steps 132 | h_encoder = self.conv_encoder(encoder_window, batched_edge_index, batched_edge_attr) # (B, C, T) 133 | 134 | # add small positional encoding value 135 | h_encoder += self.positional_embedding.T 136 | 137 | # multistep prediction 138 | predictions = [] 139 | for _ in range(self.horizon): 140 | # decoder hidden state 141 | h_decoder = self.conv_decoder(decoder_window, batched_edge_index, batched_edge_attr).unsqueeze(1) # -> (B, 1, C) 142 | # transformation of encoder states 143 | a = self.alignment_W(h_encoder.permute(0,2,1)).permute(0,2,1) # W @ H_encoder, shape -> (B, C, T) 144 | # compute alignment scores from the decoder state and the transformed encoder hidden states 145 | score = h_decoder @ a # (B, 1, C) @ (B, C, T) -> (B, 1, T) 146 | # attention weights for each time step 147 | alpha = F.softmax(score, dim=2) # -> (B, 1, T) 148 | # context vector 149 | context = torch.sum(alpha * h_encoder, dim=2) # -> (B, C) 150 | # concatenation of context vector and 
decoder hidden state 151 | context = torch.cat([context, h_decoder.squeeze(1)], dim=1) # -> (B, 2C) 152 | # layer normalization after adding all components 153 | context = F.layer_norm(context, tuple(context.shape[1:])) 154 | # single step prediction 155 | y_pred = self.pred(context, batched_edge_index, batched_edge_attr).view(-1, 1) # column vector 156 | predictions.append(y_pred) 157 | # decoder input for the next step 158 | decoder_window = torch.cat([decoder_window[..., 1:], y_pred.detach().unsqueeze(1)], dim=-1) 159 | 160 | # full output prediction vector 161 | pred = torch.cat(predictions, dim=1) # row = node, column = time 162 | 163 | return pred 164 | 165 | # return window[..., -1].view(-1, 1).repeat(1, self.horizon) 166 | 167 | 168 | class STConv(nn.Module): 169 | r'''Spatio-Temporal convolution block. 170 | 171 | Args: 172 | in_channels (int): Number of input features. 173 | out_channels (int): Number of output features. 174 | kernel_size (int): Convolutional kernel size. 175 | ''' 176 | 177 | def __init__(self, in_channels: int, temporal_channels: int, spatial_channels: int, kernel_size: int = 3, padding: bool = True, residual: bool = True, p: float = 0.0): 178 | super(STConv, self).__init__() 179 | 180 | self.padding = padding 181 | self.residual = residual 182 | 183 | self.device = get_device() 184 | 185 | if residual: 186 | self.res = nn.Conv1d(in_channels, spatial_channels, 1) 187 | 188 | if padding: 189 | self.p1d = (kernel_size-1, 0) 190 | 191 | # absolute positional embeddings based on sine and cosine functions 192 | position = torch.arange(1000).unsqueeze(1) 193 | div_term = torch.exp(torch.arange(0, temporal_channels, 2) * (-math.log(1e5) / temporal_channels)) 194 | pe = torch.zeros(1000, temporal_channels) 195 | pe[:, 0::2] = torch.sin(position * div_term) / 100 196 | pe[:, 1::2] = torch.cos(position * div_term) / 100 197 | self.positional_embedding = F.dropout(pe, 0.05).to(self.device) # vector with PEs for each input timestep 198 | 199 | 
self.temporal_conv = nn.Sequential( 200 | nn.Conv1d(in_channels, 2*temporal_channels, kernel_size), 201 | nn.BatchNorm1d(2*temporal_channels), 202 | nn.GLU(dim=1), 203 | nn.Dropout(p), 204 | ) 205 | self.graph_conv = ARMAConv( 206 | in_channels=temporal_channels, 207 | out_channels=spatial_channels, 208 | num_stacks=1, 209 | num_layers=1, 210 | act=nn.GELU(), 211 | dropout=p, 212 | ) 213 | 214 | # cached offsets for temporal batch stacking for each batch_size and number of edges 215 | self.batch_edge_offset_cache = {} 216 | 217 | def forward(self, x: torch.FloatTensor, edge_index: torch.FloatTensor, edge_attr: torch.FloatTensor = None) -> torch.FloatTensor: 218 | '''Forward pass through temporal convolution block. 219 | 220 | Input data of shape: (batch, in_channels, time_steps). 221 | Output data of shape: (batch, out_channels, time_steps). 222 | ''' 223 | 224 | # input shape (batch*num_nodes, in_channels, time) 225 | 226 | if self.residual: 227 | res = self.res(x) 228 | 229 | if self.padding: 230 | x = F.pad(x, self.p1d, "constant", 0) 231 | 232 | # temporal aggregation 233 | x = self.temporal_conv(x) 234 | # dims after temporal convolution 235 | BN, C, T = x.shape # (batch*nodes, out_channels, time) 236 | N = edge_index.max().item() + 1 # number of nodes in the batch stack 237 | 238 | # positional encoding for every time step 239 | pe = self.positional_embedding[:T].T 240 | # print(x.mean().item(), pe.mean().item()) # balance layer output and embeddings 241 | x += pe 242 | 243 | # batch stacking the temporal dimension to create a mega giga graph consisting of batched temporally-stacked graphs 244 | # analogous to batch stacking in main GNN, see description there. 
245 | x = x.view(-1, C) # (B*N*T, C) 246 | 247 | # create temporal batch edge and weight lists 248 | num_edges = len(edge_attr) 249 | try: 250 | batch_offset = self.batch_edge_offset_cache[(BN, num_edges)] 251 | except KeyError: 252 | batch_offset = torch.arange(0, BN*T, N).view(1, T, 1).expand(2, T, num_edges).flatten(1,-1).to(x.device) 253 | self.batch_edge_offset_cache[(BN, num_edges)] = batch_offset 254 | # repeat edge indices T times and add offset for the edge indices 255 | temporal_batched_edge_index = edge_index.unsqueeze(1).expand(2, T, -1).flatten(1, -1) + batch_offset 256 | # repeat edge weights T times 257 | temporal_batched_edge_attr = edge_attr.unsqueeze(0).expand(T, -1).flatten() 258 | 259 | x = self.graph_conv(x, temporal_batched_edge_index, temporal_batched_edge_attr) 260 | x = x.view(BN, -1, T) 261 | 262 | # add residual connection 263 | x = x + res if self.residual else x 264 | 265 | # layer normalization 266 | return F.layer_norm(x, tuple(x.shape[1:])) 267 | -------------------------------------------------------------------------------- /src/models/GNNLSTM.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | from torch import nn 4 | from torch.nn import LSTM, LSTMCell 5 | import math 6 | 7 | from ..utils.device import get_device 8 | 9 | from torch_geometric.nn import ARMAConv 10 | from torch_geometric.nn import Sequential 11 | 12 | from ..layers import DoubleEmbedding, SingleEmbedding 13 | 14 | 15 | class GNNLSTM(torch.nn.Module): 16 | ''' 17 | Anomaly detection neural network model for multivariate sensor time series. 18 | Graph structure is randomly initialized and learned during training. 19 | A GNN expands the features of every time step before a multi-layer LSTM; a single node embedding 20 | infers the latent graph and is added to the GNN features as node positional embedding. 21 | 22 | Args: 23 | args (Namespace): Parsed argparse arguments with config information. 
24 | ''' 25 | def __init__(self, args): 26 | super().__init__() 27 | 28 | self.device = args.device 29 | 30 | self.num_nodes = args.num_nodes 31 | self.horizon = args.horizon 32 | self.topk = args.topk 33 | self.embed_dim = args.embed_dim 34 | self.lags = args.window_size 35 | 36 | # model parameters 37 | channels = 32 # channel == node embedding size because they are added 38 | hidden_size = 512 39 | 40 | # learned graph embeddings 41 | self.graph_embedding = SingleEmbedding(self.num_nodes, channels, topk=self.topk, warmup_epochs=10) 42 | 43 | # encoder 44 | self.tgconv = TGConv(1, channels) 45 | self.lstm = LSTM(channels*self.num_nodes, hidden_size, 2, batch_first=True, dropout=0.20) 46 | 47 | # decoder 48 | self.gnn = ARMAConv( 49 | in_channels=1, 50 | out_channels=channels, 51 | num_stacks=1, 52 | num_layers=1, 53 | act=nn.GELU(), 54 | dropout=0.2, 55 | ) 56 | self.cell1 = LSTMCell(self.num_nodes*channels, hidden_size) 57 | self.cell2 = LSTMCell(hidden_size, hidden_size) 58 | 59 | # linear prediction layer 60 | self.pred = nn.Linear(hidden_size, self.num_nodes) 61 | 62 | # cached offsets for batch stacking for each batch_size and number of edges 63 | self.batch_edge_offset_cache = {} 64 | 65 | # initial graph 66 | self._edge_index, self.edge_attr, self.A = self.graph_embedding() 67 | 68 | def get_graph(self): 69 | return self.graph_embedding.get_A() 70 | 71 | def get_embedding(self): 72 | return self.graph_embedding.get_E() 73 | 74 | def forward(self, window): 75 | # batch stacked window; input shape: [num_nodes*batch_size, lags] 76 | N = self.num_nodes # number of nodes 77 | T = self.lags # number of input time steps 78 | B = window.size(0) // N # batch size 79 | 80 | # get learned graph representation 81 | edge_index, edge_attr, _ = self.graph_embedding() 82 | _, W = self.get_embedding() 83 | W = W.pop() 84 | 85 | # batching works by stacking graphs; creates a mega graph with disjointed subgraphs 86 | # for each input sample. E.g. 
for a batch of B inputs with 51 nodes each; 87 | # samples i in {0, ..., B-1} => node indices [0...50], [51...101], [102...152], ... , [51*(B-1)...51*B-1] 88 | # => node indices for sample i = [0, ..., num_nodes-1] + (i*num_nodes) 89 | num_edges = len(edge_attr) 90 | try: 91 | batch_offset = self.batch_edge_offset_cache[(B, num_edges)] 92 | except KeyError: 93 | batch_offset = torch.arange(0, N * B, N).view(1, B, 1).expand(2, B, num_edges).flatten(1,-1).to(self.device) 94 | self.batch_edge_offset_cache[(B, num_edges)] = batch_offset 95 | # repeat edge indices B times and add i*num_nodes where i is the input index 96 | batched_edge_index = edge_index.unsqueeze(1).expand(2, B, -1).flatten(1, -1) + batch_offset 97 | # repeat edge weights B times 98 | batched_edge_attr = edge_attr.unsqueeze(0).expand(B, -1).flatten() 99 | 100 | # add node feature dimension to input 101 | x = window.unsqueeze(-1) # (B*N, T, 1) 102 | 103 | ### ENCODER 104 | # GNN layer; batch stacked output with C feature channels for each time step 105 | x = self.tgconv(x, batched_edge_index, batched_edge_attr) # (B*N, T, C) 106 | x = x.view(B, N, T, -1).permute(0, 2, 1, 3).contiguous() # -> (B, T, N, C) 107 | # add node embeddings to feature vector as node positional embeddings 108 | x = x + W # (B, T, N, C) + (N, C) 109 | # concatenate node features for LSTM input 110 | x = x.view(B, T, -1) # -> (B, T, N*C) 111 | # LSTM layer; returns the output sequence plus the final hidden and cell states 112 | h, (h_n, c_n) = self.lstm(x) # -> (B, T, H), (2, B, H), (2, B, H) 113 | # get hidden and cell states for each layer 114 | h1 = h_n[0, ...].squeeze(0) 115 | h2 = h_n[1, ...].squeeze(0) 116 | c1 = c_n[0, ...].squeeze(0) 117 | c2 = c_n[1, ...].squeeze(0) 118 | 119 | # TODO: try attention on h 120 | 121 | ### DECODER 122 | predictions = [] 123 | # if prediction horizon > 1, iterate through decoder LSTM step by step 124 | for _ in range(self.horizon-1): 125 | # single decoder step per loop iteration 126 | pred = self.pred(h2).view(-1, 1) 127 | predictions.append(pred) 128 | 129 | # GNN layer 
analogous to encoder without time dimension 130 | x = self.gnn(pred, batched_edge_index, batched_edge_attr) 131 | x = x.view(B, N, -1) + W 132 | x = x.view(B, -1) 133 | # LSTM layer 1 134 | h1, c1 = self.cell1(x, (h1, c1)) 135 | # dropout between the decoder layers, active only in training mode 136 | h1 = F.dropout(h1, 0.2, training=self.training) 137 | c1 = F.dropout(c1, 0.2, training=self.training) # LSTM layer 2 follows 138 | h2, c2 = self.cell2(h1, (h2, c2)) 139 | # final prediction 140 | pred = self.pred(h2).view(-1, 1) 141 | predictions.append(pred) 142 | 143 | return torch.cat(predictions, dim=1) 144 | 145 | 146 | class TGConv(nn.Module): 147 | r''' 148 | Parallel graph convolution for multiple time steps. 149 | 150 | Args: 151 | in_channels (int): Number of input features. 152 | out_channels (int): Number of output features. 153 | p (float): Dropout value between 0 and 1 154 | ''' 155 | 156 | def __init__(self, in_channels: int, out_channels: int, p: float = 0.0): 157 | super(TGConv, self).__init__() 158 | 159 | self.device = get_device() 160 | 161 | self.graph_conv = ARMAConv( 162 | in_channels=in_channels, 163 | out_channels=out_channels, 164 | num_stacks=1, 165 | num_layers=1, 166 | act=nn.GELU(), 167 | dropout=p, 168 | ) 169 | 170 | # cached offsets for temporal batch stacking for each batch_size and number of edges 171 | self.batch_edge_offset_cache = {} 172 | 173 | def forward(self, x: torch.FloatTensor, edge_index: torch.LongTensor, edge_attr: torch.FloatTensor = None) -> torch.FloatTensor: 174 | ''' 175 | Forward pass through the parallel graph convolution. 176 | 177 | Input data of shape: (batch*nodes, time_steps, in_channels). 178 | Output data of shape: (batch*nodes, time_steps, out_channels). 179 | ''' 180 | 181 | # input dims 182 | BN, T, C = x.shape # (batch*nodes, time, in_channels) 183 | N = edge_index.max().item() + 1 # number of nodes in the batch stack 184 | 185 | # batch stacking the temporal dimension to create a mega giga graph consisting of batched temporally-stacked graphs 186 | # analogous to batch stacking in main GNN, see description there. 
187 | x = x.contiguous().view(-1, C) # (B*N*T, C) 188 | 189 | # create temporal batch edge and weight lists 190 | num_edges = len(edge_attr) 191 | try: 192 | batch_offset = self.batch_edge_offset_cache[(BN, num_edges)] 193 | except: 194 | batch_offset = torch.arange(0, BN*T, N).view(1, T, 1).expand(2, T, num_edges).flatten(1,-1).to(x.device) 195 | self.batch_edge_offset_cache[(BN, num_edges)] = batch_offset 196 | # repeat edge indices T times and add offset for the edge indices 197 | temporal_batched_edge_index = edge_index.unsqueeze(1).expand(2, T, -1).flatten(1, -1) + batch_offset 198 | # repeat edge weights T times 199 | temporal_batched_edge_attr = edge_attr.unsqueeze(0).expand(T, -1).flatten() 200 | 201 | # GNN with C output channels 202 | x = self.graph_conv(x, temporal_batched_edge_index, temporal_batched_edge_attr) # (B*N*T, C) 203 | x = x.view(BN, T, -1) # -> (B*N, T, C) 204 | 205 | return x 206 | -------------------------------------------------------------------------------- /src/models/LSTM.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | from torch import nn 4 | 5 | from ..layers import SingleEmbedding 6 | 7 | class RecurrentModel(torch.nn.Module): 8 | def __init__(self, config): 9 | super().__init__() 10 | 11 | self.device = config.device 12 | 13 | self.num_nodes = config.num_nodes 14 | self.horizon = config.horizon 15 | self.topk = config.topk 16 | self.embed_dim = config.embed_dim 17 | self.lags = config.window_size 18 | 19 | # dummy 20 | self.embedding = SingleEmbedding(1, 1, 1).to(self.device) 21 | 22 | # encoder lstm 23 | self.lstm = nn.LSTM(self.num_nodes, 512, 2, batch_first=True, dropout=0.25) 24 | 25 | # decoder lstm 26 | self.cell1 = nn.LSTMCell(self.num_nodes, 512) 27 | self.cell2 = nn.LSTMCell(512, 512) 28 | 29 | # linear prediction layer 30 | self.pred = nn.Linear(512, self.num_nodes) 31 | 32 | def get_graph(self): 33 | return 
self.embedding.get_A() 34 | 35 | def get_embedding(self): 36 | return self.embedding.get_E() 37 | 38 | 39 | def forward(self, window): 40 | # batch stacked window; input shape: [num_nodes*batch_size, lags] 41 | N = self.num_nodes # number of nodes 42 | T = self.lags # number of input time steps 43 | B = window.size(0) // N # batch size 44 | 45 | x = window.view(B, T, N) 46 | 47 | # encoder 48 | _, (h, c) = self.lstm(x) # -> (B, T, H), (2, B, H), (2, B, H) 49 | # get hidden and cell states for each layer 50 | h1 = h[0, ...].squeeze(0) 51 | h2 = h[1, ...].squeeze(0) 52 | c1 = c[0, ...].squeeze(0) 53 | c2 = c[1, ...].squeeze(0) 54 | 55 | # decoder 56 | predictions = [] 57 | for _ in range(self.horizon-1): 58 | pred = self.pred(h2) 59 | predictions.append(pred.view(-1, 1)) 60 | # layer 1 61 | h1, c1 = self.cell1(pred, (h1, c1)) 62 | h1 = F.dropout(h1, 0.2) 63 | c1 = F.dropout(c1, 0.2) 64 | # layer 2 65 | h2, c2 = self.cell2(h1, (h2, c2)) 66 | # final prediction 67 | pred = self.pred(h2).view(-1, 1) 68 | predictions.append(pred) 69 | 70 | return torch.cat(predictions, dim=1) 71 | 72 | 73 | -------------------------------------------------------------------------------- /src/models/Linear.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch import nn 3 | 4 | from ..layers import SingleEmbedding 5 | 6 | class LinearModel(torch.nn.Module): 7 | def __init__(self, args): 8 | super().__init__() 9 | 10 | self.device = args.device 11 | 12 | self.num_nodes = args.num_nodes 13 | self.horizon = args.horizon 14 | self.topk = args.topk 15 | self.embed_dim = args.embed_dim 16 | self.lags = args.window_size 17 | 18 | self.embedding = SingleEmbedding(self.num_nodes, self.embed_dim, topk=self.topk) 19 | 20 | self.lin = nn.Sequential( 21 | nn.Linear(self.lags, 1024), 22 | nn.BatchNorm1d(1024), 23 | nn.ReLU(), 24 | ) 25 | 26 | self.pred = nn.Sequential( 27 | nn.Linear(1024, self.horizon) 28 | ) 29 | 30 | # initial graph 31 | 
self._edge_index, self.edge_attr, self.A = self.embedding() 32 | 33 | def get_graph(self): 34 | return self.embedding.get_A() 35 | 36 | def get_embedding(self): 37 | return self.embedding.get_E() 38 | 39 | def forward(self, x): 40 | # input sizes 41 | N = self.num_nodes 42 | B = x.size(0) // N # batch size 43 | 44 | x = self.lin(x) 45 | pred = self.pred(x).view(B*N, -1) 46 | 47 | return pred 48 | -------------------------------------------------------------------------------- /src/models/MTGNN.py: -------------------------------------------------------------------------------- 1 | from __future__ import division 2 | 3 | from typing import Optional 4 | 5 | import torch 6 | import torch.nn.functional as F 7 | from torch.nn import init 8 | from torch import nn 9 | 10 | from ..layers import SingleEmbedding 11 | 12 | class MTGNNModel(torch.nn.Module): 13 | def __init__(self, config): 14 | super().__init__() 15 | 16 | self.device = config.device 17 | 18 | self.num_nodes = config.num_nodes 19 | self.horizon = config.horizon 20 | self.topk = config.topk 21 | self.embed_dim = config.embed_dim 22 | self.lags = config.window_size 23 | 24 | # layer definitions 25 | self.embedding = SingleEmbedding( 26 | self.num_nodes, 27 | self.embed_dim, 28 | topk=self.topk 29 | ).to(self.device) 30 | 31 | conv_channels = 30 32 | residual_channels = 30 33 | skip_channels = 128 34 | end_channels = 256 35 | 36 | self.gnn = MTGNN( 37 | gcn_true=True, 38 | build_adj=True, 39 | gcn_depth=3, 40 | num_nodes=self.num_nodes, 41 | kernel_set=[3,3,3], 42 | kernel_size=3, 43 | dropout=0.2, 44 | subgraph_size=self.topk, 45 | node_dim=1, 46 | dilation_exponential=2, 47 | conv_channels=conv_channels, 48 | residual_channels=residual_channels, 49 | skip_channels=skip_channels, 50 | end_channels=end_channels, 51 | seq_length=self.lags, 52 | in_dim=1, 53 | out_dim=self.horizon, 54 | layers=3, 55 | propalpha=0.4, 56 | tanhalpha=3, 57 | layer_norm_affline=True, 58 | xd=None 59 | ) 60 | 61 | # initial graph 62 | 
self._edge_index, self.edge_attr, self.A = self.embedding() 63 | 64 | def get_graph(self): 65 | return self.embedding.get_A() 66 | 67 | def get_embedding(self): 68 | return self.embedding.get_E() 69 | 70 | def forward(self, window): 71 | x = window.view(-1, 1, self.num_nodes, self.lags) # (batch, 1, num_nodes, lags) 72 | pred = self.gnn(x) # (batch, out_channels, num_nodes, 1) 73 | pred = pred.squeeze(-1).permute(0,2,1).contiguous().view(-1, self.horizon) # batch stacked 74 | 75 | return pred 76 | 77 | 78 | class Linear(nn.Module): 79 | r"""An implementation of the linear layer, conducting 2D convolution. 80 | For details see this paper: `"Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks." 81 | `_ 82 | 83 | Args: 84 | c_in (int): Number of input channels. 85 | c_out (int): Number of output channels. 86 | bias (bool, optional): Whether to have bias. Default: True. 87 | """ 88 | 89 | def __init__(self, c_in: int, c_out: int, bias: bool = True): 90 | super(Linear, self).__init__() 91 | self._mlp = torch.nn.Conv2d( 92 | c_in, c_out, kernel_size=(1, 1), padding=(0, 0), stride=(1, 1), bias=bias 93 | ) 94 | 95 | self._reset_parameters() 96 | 97 | def _reset_parameters(self): 98 | for p in self.parameters(): 99 | if p.dim() > 1: 100 | nn.init.xavier_uniform_(p) 101 | else: 102 | nn.init.uniform_(p) 103 | 104 | def forward(self, X: torch.FloatTensor) -> torch.FloatTensor: 105 | """ 106 | Making a forward pass of the linear layer. 107 | 108 | Arg types: 109 | * **X** (PyTorch Float Tensor) - Input tensor, with shape (batch_size, c_in, num_nodes, seq_len). 110 | 111 | Return types: 112 | * **X** (PyTorch Float Tensor) - Output tensor, with shape (batch_size, c_out, num_nodes, seq_len). 113 | """ 114 | return self._mlp(X) 115 | 116 | 117 | class MixProp(nn.Module): 118 | r"""An implementation of the dynamic mix-hop propagation layer. 
119 | For details see this paper: `"Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks." 120 | `_ 121 | 122 | Args: 123 | c_in (int): Number of input channels. 124 | c_out (int): Number of output channels. 125 | gdep (int): Depth of graph convolution. 126 | dropout (float): Dropout rate. 127 | alpha (float): Ratio of retaining the root node's original state, a value between 0 and 1. 128 | """ 129 | 130 | def __init__(self, c_in: int, c_out: int, gdep: int, dropout: float, alpha: float): 131 | super(MixProp, self).__init__() 132 | self._mlp = Linear((gdep + 1) * c_in, c_out) 133 | self._gdep = gdep 134 | self._dropout = dropout 135 | self._alpha = alpha 136 | 137 | self._reset_parameters() 138 | 139 | def _reset_parameters(self): 140 | for p in self.parameters(): 141 | if p.dim() > 1: 142 | nn.init.xavier_uniform_(p) 143 | else: 144 | nn.init.uniform_(p) 145 | 146 | def forward(self, X: torch.FloatTensor, A: torch.FloatTensor) -> torch.FloatTensor: 147 | """ 148 | Making a forward pass of mix-hop propagation. 149 | 150 | Arg types: 151 | * **X** (PyTorch Float Tensor) - Input feature Tensor, with shape (batch_size, c_in, num_nodes, seq_len). 152 | * **A** (PyTorch Float Tensor) - Adjacency matrix, with shape (num_nodes, num_nodes). 153 | 154 | Return types: 155 | * **H_0** (PyTorch Float Tensor) - Hidden representation for all nodes, with shape (batch_size, c_out, num_nodes, seq_len). 156 | """ 157 | A = A + torch.eye(A.size(0)).to(X.device) 158 | d = A.sum(1) 159 | H = X 160 | H_0 = X 161 | A = A / d.view(-1, 1) 162 | for _ in range(self._gdep): 163 | H = self._alpha * X + (1 - self._alpha) * torch.einsum( 164 | "ncwl,vw->ncvl", (H, A) 165 | ) 166 | H_0 = torch.cat((H_0, H), dim=1) 167 | H_0 = self._mlp(H_0) 168 | return H_0 169 | 170 | 171 | class DilatedInception(nn.Module): 172 | r"""An implementation of the dilated inception layer. 
173 | For details see this paper: `"Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks." 174 | `_ 175 | 176 | Args: 177 | c_in (int): Number of input channels. 178 | c_out (int): Number of output channels. 179 | kernel_set (list of int): List of kernel sizes. 180 | dilation_factor (int): Dilation factor. 181 | """ 182 | 183 | def __init__(self, c_in: int, c_out: int, kernel_set: list, dilation_factor: int): 184 | super(DilatedInception, self).__init__() 185 | self._time_conv = nn.ModuleList() 186 | self._kernel_set = kernel_set 187 | c_out = int(c_out / len(self._kernel_set)) 188 | for kern in self._kernel_set: 189 | self._time_conv.append( 190 | nn.Conv2d(c_in, c_out, (1, kern), dilation=(1, dilation_factor)) 191 | ) 192 | self._reset_parameters() 193 | 194 | def _reset_parameters(self): 195 | for p in self.parameters(): 196 | if p.dim() > 1: 197 | nn.init.xavier_uniform_(p) 198 | else: 199 | nn.init.uniform_(p) 200 | 201 | def forward(self, X_in: torch.FloatTensor) -> torch.FloatTensor: 202 | """ 203 | Making a forward pass of dilated inception. 204 | 205 | Arg types: 206 | * **X_in** (PyTorch Float Tensor) - Input feature Tensor, with shape (batch_size, c_in, num_nodes, seq_len). 207 | 208 | Return types: 209 | * **X** (PyTorch Float Tensor) - Hidden representation for all nodes, 210 | with shape (batch_size, c_out, num_nodes, seq_len-6). 211 | """ 212 | X = [] 213 | for i in range(len(self._kernel_set)): 214 | X.append(self._time_conv[i](X_in)) 215 | for i in range(len(self._kernel_set)): 216 | X[i] = X[i][..., -X[-1].size(3) :] 217 | X = torch.cat(X, dim=1) 218 | return X 219 | 220 | 221 | class GraphConstructor(nn.Module): 222 | r"""An implementation of the graph learning layer to construct an adjacency matrix. 223 | For details see this paper: `"Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks." 224 | `_ 225 | 226 | Args: 227 | nnodes (int): Number of nodes in the graph. 
228 | k (int): Number of largest values to consider in constructing the neighbourhood of a node (pick the "nearest" k nodes). 229 | dim (int): Dimension of the node embedding. 230 | alpha (float, optional): Tanh alpha for generating adjacency matrix, alpha controls the saturation rate 231 | xd (int, optional): Static feature dimension, default None. 232 | """ 233 | 234 | def __init__( 235 | self, nnodes: int, k: int, dim: int, alpha: float, xd: Optional[int] = None 236 | ): 237 | super(GraphConstructor, self).__init__() 238 | if xd is not None: 239 | self._static_feature_dim = xd 240 | self._linear1 = nn.Linear(xd, dim) 241 | self._linear2 = nn.Linear(xd, dim) 242 | else: 243 | self._embedding1 = nn.Embedding(nnodes, dim) 244 | self._embedding2 = nn.Embedding(nnodes, dim) 245 | self._linear1 = nn.Linear(dim, dim) 246 | self._linear2 = nn.Linear(dim, dim) 247 | 248 | self._k = k 249 | self._alpha = alpha 250 | 251 | self._reset_parameters() 252 | 253 | def _reset_parameters(self): 254 | for p in self.parameters(): 255 | if p.dim() > 1: 256 | nn.init.xavier_uniform_(p) 257 | else: 258 | nn.init.uniform_(p) 259 | 260 | def forward( 261 | self, idx: torch.LongTensor, FE: Optional[torch.FloatTensor] = None 262 | ) -> torch.FloatTensor: 263 | """ 264 | Making a forward pass to construct an adjacency matrix from node embeddings. 265 | 266 | Arg types: 267 | * **idx** (Pytorch Long Tensor) - Input indices, a permutation of the number of nodes, default None (no permutation). 268 | * **FE** (Pytorch Float Tensor, optional) - Static feature, default None. 269 | Return types: 270 | * **A** (PyTorch Float Tensor) - Adjacency matrix constructed from node embeddings. 
271 | """ 272 | 273 | if FE is None: 274 | nodevec1 = self._embedding1(idx) 275 | nodevec2 = self._embedding2(idx) 276 | else: 277 | assert FE.shape[1] == self._static_feature_dim 278 | nodevec1 = FE[idx, :] 279 | nodevec2 = nodevec1 280 | 281 | nodevec1 = torch.tanh(self._alpha * self._linear1(nodevec1)) 282 | nodevec2 = torch.tanh(self._alpha * self._linear2(nodevec2)) 283 | 284 | a = torch.mm(nodevec1, nodevec2.transpose(1, 0)) - torch.mm( 285 | nodevec2, nodevec1.transpose(1, 0) 286 | ) 287 | A = F.relu(torch.tanh(self._alpha * a)) 288 | mask = torch.zeros(idx.size(0), idx.size(0)).to(A.device) 289 | mask.fill_(float("0")) 290 | s1, t1 = A.topk(self._k, 1) 291 | mask.scatter_(1, t1, s1.fill_(1)) 292 | A = A * mask 293 | return A 294 | 295 | 296 | class LayerNormalization(nn.Module): 297 | __constants__ = ["normalized_shape", "weight", "bias", "eps", "elementwise_affine"] 298 | r"""An implementation of the layer normalization layer. 299 | For details see this paper: `"Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks." 300 | `_ 301 | 302 | Args: 303 | normalized_shape (int): Input shape from an expected input of size. 304 | eps (float, optional): Value added to the denominator for numerical stability. Default: 1e-5. 305 | elementwise_affine (bool, optional): Whether to conduct elementwise affine transformation or not. Default: True. 
306 | """ 307 | 308 | def __init__( 309 | self, normalized_shape: int, eps: float = 1e-5, elementwise_affine: bool = True 310 | ): 311 | super(LayerNormalization, self).__init__() 312 | self._normalized_shape = tuple(normalized_shape) 313 | self._eps = eps 314 | self._elementwise_affine = elementwise_affine 315 | if self._elementwise_affine: 316 | self._weight = nn.Parameter(torch.Tensor(*normalized_shape)) 317 | self._bias = nn.Parameter(torch.Tensor(*normalized_shape)) 318 | else: 319 | self.register_parameter("_weight", None) 320 | self.register_parameter("_bias", None) 321 | self._reset_parameters() 322 | 323 | def _reset_parameters(self): 324 | if self._elementwise_affine: 325 | init.ones_(self._weight) 326 | init.zeros_(self._bias) 327 | 328 | def forward(self, X: torch.FloatTensor, idx: torch.LongTensor) -> torch.FloatTensor: 329 | """ 330 | Making a forward pass of layer normalization. 331 | 332 | Arg types: 333 | * **X** (Pytorch Float Tensor) - Input tensor, 334 | with shape (batch_size, feature_dim, num_nodes, seq_len). 335 | * **idx** (Pytorch Long Tensor) - Input indices. 336 | 337 | Return types: 338 | * **X** (PyTorch Float Tensor) - Output tensor, 339 | with shape (batch_size, feature_dim, num_nodes, seq_len). 340 | """ 341 | if self._elementwise_affine: 342 | return F.layer_norm( 343 | X, 344 | tuple(X.shape[1:]), 345 | self._weight[:, idx, :], 346 | self._bias[:, idx, :], 347 | self._eps, 348 | ) 349 | else: 350 | return F.layer_norm( 351 | X, tuple(X.shape[1:]), self._weight, self._bias, self._eps 352 | ) 353 | 354 | 355 | class MTGNNLayer(nn.Module): 356 | r"""An implementation of the MTGNN layer. 357 | For details see this paper: `"Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks." 358 | `_ 359 | 360 | Args: 361 | dilation_exponential (int): Dilation exponential. 362 | rf_size_i (int): Size of receptive field. 363 | kernel_size (int): Size of kernel for convolution, to calculate receptive field size. 
364 | j (int): Iteration index. 365 | residual_channels (int): Residual channels. 366 | conv_channels (int): Convolution channels. 367 | skip_channels (int): Skip channels. 368 | kernel_set (list of int): List of kernel sizes. 369 | new_dilation (int): Dilation. 370 | layer_norm_affline (bool): Whether to do elementwise affine in Layer Normalization. 371 | gcn_true (bool): Whether to add graph convolution layer. 372 | seq_length (int): Length of input sequence. 373 | receptive_field (int): Receptive field. 374 | dropout (float): Dropout rate. 375 | gcn_depth (int): Graph convolution depth. 376 | num_nodes (int): Number of nodes in the graph. 377 | propalpha (float): Prop alpha, ratio of retaining the root node's original states in mix-hop propagation, a value between 0 and 1. 378 | 379 | """ 380 | 381 | def __init__( 382 | self, 383 | dilation_exponential: int, 384 | rf_size_i: int, 385 | kernel_size: int, 386 | j: int, 387 | residual_channels: int, 388 | conv_channels: int, 389 | skip_channels: int, 390 | kernel_set: list, 391 | new_dilation: int, 392 | layer_norm_affline: bool, 393 | gcn_true: bool, 394 | seq_length: int, 395 | receptive_field: int, 396 | dropout: float, 397 | gcn_depth: int, 398 | num_nodes: int, 399 | propalpha: float, 400 | ): 401 | super(MTGNNLayer, self).__init__() 402 | self._dropout = dropout 403 | self._gcn_true = gcn_true 404 | 405 | if dilation_exponential > 1: 406 | rf_size_j = int( 407 | rf_size_i 408 | + (kernel_size - 1) 409 | * (dilation_exponential ** j - 1) 410 | / (dilation_exponential - 1) 411 | ) 412 | else: 413 | rf_size_j = rf_size_i + j * (kernel_size - 1) 414 | 415 | self._filter_conv = DilatedInception( 416 | residual_channels, 417 | conv_channels, 418 | kernel_set=kernel_set, 419 | dilation_factor=new_dilation, 420 | ) 421 | 422 | self._gate_conv = DilatedInception( 423 | residual_channels, 424 | conv_channels, 425 | kernel_set=kernel_set, 426 | dilation_factor=new_dilation, 427 | ) 428 | 429 | self._residual_conv = 
nn.Conv2d( 430 | in_channels=conv_channels, 431 | out_channels=residual_channels, 432 | kernel_size=(1, 1), 433 | ) 434 | 435 | if seq_length > receptive_field: 436 | self._skip_conv = nn.Conv2d( 437 | in_channels=conv_channels, 438 | out_channels=skip_channels, 439 | kernel_size=(1, seq_length - rf_size_j + 1), 440 | ) 441 | else: 442 | self._skip_conv = nn.Conv2d( 443 | in_channels=conv_channels, 444 | out_channels=skip_channels, 445 | kernel_size=(1, receptive_field - rf_size_j + 1), 446 | ) 447 | 448 | if gcn_true: 449 | self._mixprop_conv1 = MixProp( 450 | conv_channels, residual_channels, gcn_depth, dropout, propalpha 451 | ) 452 | 453 | self._mixprop_conv2 = MixProp( 454 | conv_channels, residual_channels, gcn_depth, dropout, propalpha 455 | ) 456 | 457 | if seq_length > receptive_field: 458 | self._normalization = LayerNormalization( 459 | (residual_channels, num_nodes, seq_length - rf_size_j + 1), 460 | elementwise_affine=layer_norm_affline, 461 | ) 462 | 463 | else: 464 | self._normalization = LayerNormalization( 465 | (residual_channels, num_nodes, receptive_field - rf_size_j + 1), 466 | elementwise_affine=layer_norm_affline, 467 | ) 468 | self._reset_parameters() 469 | 470 | def _reset_parameters(self): 471 | for p in self.parameters(): 472 | if p.dim() > 1: 473 | nn.init.xavier_uniform_(p) 474 | else: 475 | nn.init.uniform_(p) 476 | 477 | def forward( 478 | self, 479 | X: torch.FloatTensor, 480 | X_skip: torch.FloatTensor, 481 | A_tilde: Optional[torch.FloatTensor], 482 | idx: torch.LongTensor, 483 | training: bool, 484 | ) -> torch.FloatTensor: 485 | """ 486 | Making a forward pass of MTGNN layer. 487 | 488 | Arg types: 489 | * **X** (PyTorch FloatTensor) - Input feature tensor, 490 | with shape (batch_size, in_dim, num_nodes, seq_len). 491 | * **X_skip** (PyTorch FloatTensor) - Input feature tensor for skip connection, 492 | with shape (batch_size, in_dim, num_nodes, seq_len). 
493 | * **A_tilde** (PyTorch FloatTensor or None) - Predefined adjacency matrix. 494 | * **idx** (PyTorch LongTensor) - Input indices. 495 | * **training** (bool) - Whether in training mode. 496 | 497 | Return types: 498 | * **X** (PyTorch FloatTensor) - Output sequence tensor, 499 | with shape (batch_size, seq_len, num_nodes, seq_len). 500 | * **X_skip** (PyTorch FloatTensor) - Output feature tensor for skip connection, 501 | with shape (batch_size, in_dim, num_nodes, seq_len). 502 | """ 503 | X_residual = X 504 | X_filter = self._filter_conv(X) 505 | X_filter = torch.tanh(X_filter) 506 | X_gate = self._gate_conv(X) 507 | X_gate = torch.sigmoid(X_gate) 508 | X = X_filter * X_gate 509 | X = F.dropout(X, self._dropout, training=training) 510 | X_skip = self._skip_conv(X) + X_skip 511 | if self._gcn_true: 512 | X = self._mixprop_conv1(X, A_tilde) + self._mixprop_conv2( 513 | X, A_tilde.transpose(1, 0) 514 | ) 515 | else: 516 | X = self._residual_conv(X) 517 | 518 | X = X + X_residual[:, :, :, -X.size(3) :] 519 | X = self._normalization(X, idx) 520 | return X, X_skip 521 | 522 | 523 | class MTGNN(nn.Module): 524 | r"""An implementation of the Multivariate Time Series Forecasting Graph Neural Networks. 525 | For details see this paper: `"Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks." 526 | `_ 527 | 528 | Args: 529 | gcn_true (bool): Whether to add graph convolution layer. 530 | build_adj (bool): Whether to construct adaptive adjacency matrix. 531 | gcn_depth (int): Graph convolution depth. 532 | num_nodes (int): Number of nodes in the graph. 533 | kernel_set (list of int): List of kernel sizes. 534 | kernel_size (int): Size of kernel for convolution, to calculate receptive field size. 535 | dropout (float): Dropout rate. 536 | subgraph_size (int): Size of subgraph. 537 | node_dim (int): Dimension of nodes. 538 | dilation_exponential (int): Dilation exponential. 539 | conv_channels (int): Convolution channels. 
540 | residual_channels (int): Residual channels. 541 | skip_channels (int): Skip channels. 542 | end_channels (int): End channels. 543 | seq_length (int): Length of input sequence. 544 | in_dim (int): Input dimension. 545 | out_dim (int): Output dimension. 546 | layers (int): Number of layers. 547 | propalpha (float): Prop alpha, ratio of retaining the root node's original states in mix-hop propagation, a value between 0 and 1. 548 | tanhalpha (float): Tanh alpha for generating adjacency matrix, alpha controls the saturation rate. 549 | layer_norm_affline (bool): Whether to do elementwise affine in Layer Normalization. 550 | xd (int, optional): Static feature dimension, default None. 551 | """ 552 | 553 | def __init__( 554 | self, 555 | gcn_true: bool, 556 | build_adj: bool, 557 | gcn_depth: int, 558 | num_nodes: int, 559 | kernel_set: list, 560 | kernel_size: int, 561 | dropout: float, 562 | subgraph_size: int, 563 | node_dim: int, 564 | dilation_exponential: int, 565 | conv_channels: int, 566 | residual_channels: int, 567 | skip_channels: int, 568 | end_channels: int, 569 | seq_length: int, 570 | in_dim: int, 571 | out_dim: int, 572 | layers: int, 573 | propalpha: float, 574 | tanhalpha: float, 575 | layer_norm_affline: bool, 576 | xd: Optional[int] = None, 577 | ): 578 | super(MTGNN, self).__init__() 579 | 580 | self._gcn_true = gcn_true 581 | self._build_adj_true = build_adj 582 | self._num_nodes = num_nodes 583 | self._dropout = dropout 584 | self._seq_length = seq_length 585 | self._layers = layers 586 | self._idx = torch.arange(self._num_nodes) 587 | 588 | self._mtgnn_layers = nn.ModuleList() 589 | 590 | self._graph_constructor = GraphConstructor( 591 | num_nodes, subgraph_size, node_dim, alpha=tanhalpha, xd=xd 592 | ) 593 | 594 | self._set_receptive_field(dilation_exponential, kernel_size, layers) 595 | 596 | new_dilation = 1 597 | for j in range(1, layers + 1): 598 | self._mtgnn_layers.append( 599 | MTGNNLayer( 600 | 
dilation_exponential=dilation_exponential, 601 | rf_size_i=1, 602 | kernel_size=kernel_size, 603 | j=j, 604 | residual_channels=residual_channels, 605 | conv_channels=conv_channels, 606 | skip_channels=skip_channels, 607 | kernel_set=kernel_set, 608 | new_dilation=new_dilation, 609 | layer_norm_affline=layer_norm_affline, 610 | gcn_true=gcn_true, 611 | seq_length=seq_length, 612 | receptive_field=self._receptive_field, 613 | dropout=dropout, 614 | gcn_depth=gcn_depth, 615 | num_nodes=num_nodes, 616 | propalpha=propalpha, 617 | ) 618 | ) 619 | 620 | new_dilation *= dilation_exponential 621 | 622 | self._setup_conv( 623 | in_dim, skip_channels, end_channels, residual_channels, out_dim 624 | ) 625 | 626 | self._reset_parameters() 627 | 628 | def _setup_conv( 629 | self, in_dim, skip_channels, end_channels, residual_channels, out_dim 630 | ): 631 | 632 | self._start_conv = nn.Conv2d( 633 | in_channels=in_dim, out_channels=residual_channels, kernel_size=(1, 1) 634 | ) 635 | 636 | if self._seq_length > self._receptive_field: 637 | 638 | self._skip_conv_0 = nn.Conv2d( 639 | in_channels=in_dim, 640 | out_channels=skip_channels, 641 | kernel_size=(1, self._seq_length), 642 | bias=True, 643 | ) 644 | 645 | self._skip_conv_E = nn.Conv2d( 646 | in_channels=residual_channels, 647 | out_channels=skip_channels, 648 | kernel_size=(1, self._seq_length - self._receptive_field + 1), 649 | bias=True, 650 | ) 651 | 652 | else: 653 | self._skip_conv_0 = nn.Conv2d( 654 | in_channels=in_dim, 655 | out_channels=skip_channels, 656 | kernel_size=(1, self._receptive_field), 657 | bias=True, 658 | ) 659 | 660 | self._skip_conv_E = nn.Conv2d( 661 | in_channels=residual_channels, 662 | out_channels=skip_channels, 663 | kernel_size=(1, 1), 664 | bias=True, 665 | ) 666 | 667 | self._end_conv_1 = nn.Conv2d( 668 | in_channels=skip_channels, 669 | out_channels=end_channels, 670 | kernel_size=(1, 1), 671 | bias=True, 672 | ) 673 | 674 | self._end_conv_2 = nn.Conv2d( 675 | in_channels=end_channels, 676 
| out_channels=out_dim, 677 | kernel_size=(1, 1), 678 | bias=True, 679 | ) 680 | 681 | def _reset_parameters(self): 682 | for p in self.parameters(): 683 | if p.dim() > 1: 684 | nn.init.xavier_uniform_(p) 685 | else: 686 | nn.init.uniform_(p) 687 | 688 | def _set_receptive_field(self, dilation_exponential, kernel_size, layers): 689 | if dilation_exponential > 1: 690 | self._receptive_field = int( 691 | 1 692 | + (kernel_size - 1) 693 | * (dilation_exponential ** layers - 1) 694 | / (dilation_exponential - 1) 695 | ) 696 | else: 697 | self._receptive_field = layers * (kernel_size - 1) + 1 698 | 699 | def forward( 700 | self, 701 | X_in: torch.FloatTensor, 702 | A_tilde: Optional[torch.FloatTensor] = None, 703 | idx: Optional[torch.LongTensor] = None, 704 | FE: Optional[torch.FloatTensor] = None, 705 | ) -> torch.FloatTensor: 706 | """ 707 | Making a forward pass of MTGNN. 708 | 709 | Arg types: 710 | * **X_in** (PyTorch FloatTensor) - Input sequence, with shape (batch_size, in_dim, num_nodes, seq_len). 711 | * **A_tilde** (Pytorch FloatTensor, optional) - Predefined adjacency matrix, default None. 712 | * **idx** (Pytorch LongTensor, optional) - Input indices, a permutation of the num_nodes, default None (no permutation). 713 | * **FE** (Pytorch FloatTensor, optional) - Static feature, default None. 714 | 715 | Return types: 716 | * **X** (PyTorch FloatTensor) - Output sequence for prediction, with shape (batch_size, seq_len, num_nodes, 1). 717 | """ 718 | seq_len = X_in.size(3) 719 | assert ( 720 | seq_len == self._seq_length 721 | ), "Input sequence length not equal to preset sequence length." 
722 | 723 | if self._seq_length < self._receptive_field: 724 | X_in = nn.functional.pad( 725 | X_in, (self._receptive_field - self._seq_length, 0, 0, 0) 726 | ) 727 | 728 | if self._gcn_true: 729 | if self._build_adj_true: 730 | if idx is None: 731 | A_tilde = self._graph_constructor(self._idx.to(X_in.device), FE=FE) 732 | else: 733 | A_tilde = self._graph_constructor(idx, FE=FE) 734 | 735 | X = self._start_conv(X_in) 736 | X_skip = self._skip_conv_0( 737 | F.dropout(X_in, self._dropout, training=self.training) 738 | ) 739 | if idx is None: 740 | for mtgnn in self._mtgnn_layers: 741 | X, X_skip = mtgnn( 742 | X, X_skip, A_tilde, self._idx.to(X_in.device), self.training 743 | ) 744 | else: 745 | for mtgnn in self._mtgnn_layers: 746 | X, X_skip = mtgnn(X, X_skip, A_tilde, idx, self.training) 747 | 748 | X_skip = self._skip_conv_E(X) + X_skip 749 | X = F.relu(X_skip) 750 | X = F.relu(self._end_conv_1(X)) 751 | X = self._end_conv_2(X) 752 | return X 753 | -------------------------------------------------------------------------------- /src/models/__init__.py: -------------------------------------------------------------------------------- 1 | from .Linear import LinearModel 2 | from .ConvSeqAttention import ConvSeqAttentionModel 3 | from .LSTM import RecurrentModel 4 | from .MTGNN import MTGNNModel 5 | from .GNNLSTM import GNNLSTM -------------------------------------------------------------------------------- /src/utils/__init__.py: -------------------------------------------------------------------------------- 1 | from .nn_trainer import Trainer -------------------------------------------------------------------------------- /src/utils/device.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | # Dirty file that holds the device (cpu or cuda). 4 | # Import the setter or getter functions to access it. 5 | # get_device determines the device from availability 6 | # if none is set manually. 
7 | 8 | _device = None 9 | 10 | def get_device(): 11 | ''' Returns the currently used computing device.''' 12 | if _device is None: 13 | device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') 14 | set_device(device) 15 | return _device 16 | 17 | def set_device(device): 18 | ''' Sets the computing device. ''' 19 | global _device 20 | _device = device 21 | -------------------------------------------------------------------------------- /src/utils/evaluate.py: -------------------------------------------------------------------------------- 1 | from .metrics import precision_recall, F_score 2 | from . import utils 3 | import numpy as np 4 | 5 | def evaluate_performance(train_res, test_res, threshold_method='max', smoothing=4, smoothing_method='mean'): 6 | ''' 7 | Returns a dict with precision, recall, f1 and f2 scores. 8 | Determines anomaly threshold from normalized and smoothed validation data. 9 | Normalization is performed on each 1d sensor time series. 10 | Anomaly predictions are calculated as smoothed and normalized test error scores that exceed the threshold. 11 | 12 | Args: 13 | train_res (list): List of length two holding the prediction errors from validation and the anomaly labels. The second 14 | entry is assumed to be None, since validation data has no anomaly labels. 15 | test_res (list): List of length two holding the prediction errors and anomaly labels from testing. 
16 | ''' 17 | train_pred_err, _ = train_res 18 | test_pred_err, anomaly_labels = test_res 19 | 20 | assert test_pred_err.size(1) == anomaly_labels.size(0) 21 | 22 | # row-wise normalization (within each sensor) and subsequent 1d smoothing 23 | train_error = utils.normalize_with_median_iqr(train_pred_err) 24 | test_error = utils.normalize_with_median_iqr(test_pred_err) 25 | if smoothing > 0: 26 | train_error = utils.weighted_average_smoothing(train_error, k=smoothing, mode=smoothing_method) 27 | test_error = utils.weighted_average_smoothing(test_error, k=smoothing, mode=smoothing_method) 28 | 29 | if threshold_method == 'max': 30 | anomaly_predictions = _max_thresholding(train_error, test_error) 31 | elif threshold_method == 'mean': 32 | anomaly_predictions = _mean_thresholding(train_error, test_error) 33 | elif threshold_method == 'best': 34 | anomaly_predictions, threshold_method = _best_thresholding(test_error, anomaly_labels) 35 | 36 | # evaluate test performance 37 | assert anomaly_predictions.shape == anomaly_labels.shape 38 | precision, recall = precision_recall(anomaly_predictions, anomaly_labels) 39 | f1 = F_score(precision, recall, beta=1) 40 | f2 = F_score(precision, recall, beta=2) 41 | 42 | # adjusted performance 43 | adjusted_predictions, latency = adjust_predicts(anomaly_predictions, anomaly_labels, calc_latency=True) 44 | precision_adj, recall_adj = precision_recall(adjusted_predictions, anomaly_labels) 45 | f1_adj = F_score(precision_adj, recall_adj, beta=1) 46 | f2_adj = F_score(precision_adj, recall_adj, beta=2) 47 | 48 | results_dict = { 49 | 'method': threshold_method, 50 | 'prec': precision, 51 | 'rec': recall, 52 | 'f1': f1, 53 | 'f2': f2, 54 | 'a_prec': precision_adj, 55 | 'a_rec': recall_adj, 56 | 'a_f1': f1_adj, 57 | 'a_f2': f2_adj, 58 | 'latency': latency 59 | } 60 | return results_dict 61 | 62 | def adjust_predicts(pred, label, calc_latency=False): 63 | """ 64 | Calculate adjusted predict labels using given `score`, `threshold` (or given 
`pred`) and `label`. 65 | Args: 66 | pred (Tensor or np.ndarray): Binary anomaly predictions; modified in place. 67 | label (Tensor or np.ndarray): The ground-truth labels. 68 | calc_latency (bool): If true, also return the average detection latency, 69 | i.e. the number of missed time steps before the first detection, per detected anomaly. 70 | Note: `score` and `threshold` from the original OmniAnomaly signature are not 71 | parameters of this function; only `pred` is adjusted. 72 | Returns: 73 | The adjusted predictions, and the average latency if `calc_latency` is true. 74 | 75 | Method from OmniAnomaly (https://github.com/NetManAIOps/OmniAnomaly) 76 | """ 77 | anomaly_state = False 78 | anomaly_count = 0 # number of anomalies found 79 | latency = 0 80 | 81 | for i in range(len(pred)): 82 | if label[i] and pred[i] and not anomaly_state: # if correctly found anomaly 83 | anomaly_state = True 84 | anomaly_count += 1 85 | for j in range(i, 0, -1): # go backward until beginning of anomaly 86 | if not label[j]: # BEGINNING of anomaly 87 | break 88 | else: 89 | if not pred[j]: # set prediction to true 90 | pred[j] = True 91 | latency += 1 92 | elif not label[i]: # END of anomaly 93 | anomaly_state = False 94 | if anomaly_state: # still in anomaly and was already found 95 | pred[i] = True 96 | if calc_latency: 97 | return pred, latency / (anomaly_count + 1e-8) 98 | else: 99 | return pred 100 | 101 | def _max_thresholding(train_errors, test_errors): 102 | ''' 103 | Returns anomaly predictions on test errors based on threshold 104 | calculated on the validation errors. 105 | Threshold is the largest validation error within the entire validation data. 106 | ''' 107 | 108 | # set threshold as global max of validation errors 109 | threshold = train_errors.max().item() 110 | 111 | # set test scores as max error in one time tick 112 | score, _ = test_errors.max(dim=0) 113 | 114 | return score > threshold 115 | 116 | def _mean_thresholding(train_errors, test_errors): 117 | ''' 118 | Returns anomaly predictions on test errors based on threshold 119 | calculated on the validation errors. 
120 | Threshold is the largest mean validation error (averaged across sensors) at any time tick. 121 | ''' 122 | 123 | # set threshold as max over time of the mean validation error across sensors 124 | threshold = train_errors.mean(dim=0).max().item() 125 | 126 | # set test scores as mean error across sensors in one time tick 127 | score = test_errors.mean(dim=0) 128 | 129 | return score > threshold 130 | 131 | def _best_thresholding(test_errors, test_labels): 132 | ''' 133 | Returns anomaly predictions on test errors based on the threshold 134 | and scoring method (max or mean) that maximize the F1 score on the test labels. 135 | Thresholds are searched on a grid of 1000 values between the smallest and largest score. 136 | 137 | ONLY USE TO TEST THEORETICAL PERFORMANCE, NOT FOR REAL EVALUATION! 138 | ''' 139 | 140 | # candidate scores per time tick: max and mean error across sensors 141 | 142 | max_score, _ = test_errors.max(dim=0) 143 | mean_score = test_errors.mean(dim=0) 144 | scores = {'max': max_score, 'mean': mean_score} 145 | 146 | best_f1 = 0 147 | best_method = None 148 | best_predictions = None 149 | 150 | lower_bound = min(min(max_score), min(mean_score)).item() 151 | upper_bound = max(max(max_score), max(mean_score)).item() 152 | 153 | thresholds = np.linspace(lower_bound, upper_bound, 1000) 154 | for threshold in thresholds: 155 | for method, score in scores.items(): 156 | anomaly_predictions = score > threshold 157 | precision, recall = precision_recall(anomaly_predictions, test_labels) 158 | f1 = F_score(precision, recall, beta=1) 159 | if f1 > best_f1: 160 | best_f1 = f1 161 | best_method = method 162 | best_predictions = anomaly_predictions 163 | 164 | return best_predictions, best_method -------------------------------------------------------------------------------- /src/utils/metrics.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from . import utils 3 | 4 | def precision_recall(pred, labels): 5 | ''' 6 | Calculates precision and recall. 7 | Precision = (TP / (TP+FP)). 8 | Recall = (TP / (TP+FN)). 
9 | 10 | Args: 11 | pred (Tensor): 1-dimensional tensor of predictions. 12 | labels (Tensor): 1-dimensional tensor of ground truth observations. 13 | ''' 14 | pred, labels = utils.cast(torch.bool, pred, labels) 15 | 16 | # precision 17 | hits = labels[pred] 18 | precision = hits.sum() / pred.sum() 19 | 20 | # recall 21 | hits = pred[labels] 22 | recall = hits.sum() / labels.sum() 23 | 24 | return precision.item(), recall.item() 25 | 26 | def F_score(precision, recall, beta=1): 27 | ''' 28 | Calculates F-scores. 29 | 30 | Args: 31 | precision (int, float): Precision score. 32 | recall (int, float): Recall score. 33 | beta (int, float, optional): Positive number. 34 | ''' 35 | div = (beta**2 * precision) + recall 36 | if div > 0: 37 | return ((1 + beta**2) * (precision * recall)) / div 38 | else: 39 | return 0 40 | -------------------------------------------------------------------------------- /src/utils/nn_trainer.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from time import time 3 | from copy import deepcopy 4 | 5 | from .utils import format_time 6 | 7 | 8 | class Trainer: 9 | ''' 10 | Class for model training, validation and testing. 11 | 12 | Args: 13 | model (callable): Pytorch nn.module class object that defines the neural network model. 14 | optimizer (callable): Pytorch optim object, e.g. Adam. 15 | criterion (func): Loss function that takes two arguments. 16 | ''' 17 | 18 | def __init__(self, model, optimizer, criterion): 19 | 20 | self.model = model 21 | self.optimizer = optimizer 22 | self.criterion = criterion 23 | 24 | def _train_iteration(self, loader): 25 | ''' 26 | Returns the average training loss of one iteration over the training dataloader. 
27 | ''' 28 | self.model.train() 29 | 30 | avg_pred_loss = 0 31 | for i, window in enumerate(loader): 32 | self.optimizer.zero_grad() 33 | 34 | x = window.x 35 | y = window.y 36 | 37 | # forward step 38 | pred = self.model(x) 39 | 40 | assert pred.shape == y.shape 41 | 42 | loss = self.criterion(pred, y) 43 | 44 | # backward step 45 | loss.backward() 46 | torch.nn.utils.clip_grad_norm_(self.model.parameters(), max_norm=2.0, norm_type=2) 47 | self.optimizer.step() 48 | 49 | avg_pred_loss += loss.item() 50 | avg_pred_loss /= i+1 51 | 52 | return avg_pred_loss 53 | 54 | def test(self, loader, return_errors=True): 55 | ''' 56 | Returns the average loss over the test data. 57 | Optionally also returns a list holding the squared prediction errors 58 | and the anomaly labels (None for validation data). 59 | ''' 60 | self.model.eval() 61 | 62 | avg_pred_loss = 0 63 | pred_errors = [] 64 | y_labels = [] 65 | with torch.no_grad(): 66 | for i, window in enumerate(loader): 67 | x = window.x 68 | y = window.y 69 | batch_size = len(window.ptr) - 1 70 | 71 | pred = self.model(x) 72 | 73 | assert pred.shape == y.shape 74 | 75 | pred_loss = self.criterion(pred, y) 76 | 77 | if return_errors: 78 | y_label = window.y_label 79 | if y_label is not None: 80 | y_labels.append(y_label[::pred.size(1)]) 81 | else: 82 | y_labels.append(y_label) # NoneType labels for validation data 83 | 84 | pred_error = ((pred[:, -1] - y[:, -1]) ** 2).detach() 85 | pred_errors.append(pred_error.T.view(batch_size, -1).T) 86 | 87 | avg_pred_loss += pred_loss.item() 88 | 89 | avg_pred_loss /= i+1 90 | 91 | # results to be returned 92 | re = [] 93 | if return_errors: 94 | pred_errors = torch.cat(pred_errors, dim=1) 95 | 96 | if isinstance(y_labels[0], torch.Tensor): 97 | anomaly_labels = torch.cat(y_labels) 98 | else: # during validation 99 | anomaly_labels = None 100 | 101 | re.append([pred_errors, anomaly_labels]) 102 | 103 | re.append(avg_pred_loss) 104 | 105 | if len(re) == 1: 106 | return re.pop() 107 | else: 108 | return 
tuple(re) 109 | 110 | def train(self, train_loader, val_loader=None, epochs=10, early_stopping=10, return_model_state=False, return_val_results=False, verbose=True): 111 | ''' 112 | Main function of the Trainer class. Handles the training procedure, 113 | including the training and validation steps for each epoch 114 | and early stopping. 115 | 116 | Args: 117 | train_loader (iterable): Dataloader holding the (batched) training samples. 118 | val_loader (iterable, optional): Dataloader holding the (batched) validation samples. 119 | epochs (int, optional): Number of epochs for training. 120 | early_stopping (int, optional): Number of epochs without improvement on the validation data until training is stopped. 121 | return_model_state (bool, optional): If true, returns the model state dict. 122 | return_val_results (bool, optional): If true, returns the prediction errors and anomaly labels from validation. 123 | verbose (bool, optional): If true, prints updates on training and validation loss each epoch. 
124 | ''' 125 | 126 | train_loss_history = [] 127 | val_loss_history = [] 128 | early_stopping_counter = 0 129 | early_stopping_point = early_stopping 130 | best_train_loss = float('inf') 131 | best_val_loss = float('inf') 132 | val_results = None # dummy variable for optional return values 133 | indicator = '' 134 | for i in range(epochs): 135 | start = time() 136 | # train 137 | train_loss = self._train_iteration(train_loader) 138 | train_loss_history.append(train_loss) 139 | 140 | if val_loader is not None: 141 | # validate if validation loader is provided 142 | if return_val_results: 143 | val_results, val_loss = self.test(val_loader, return_errors=return_val_results) 144 | else: 145 | val_loss = self.test(val_loader, return_errors=return_val_results) 146 | val_loss_history.append(val_loss) 147 | else: 148 | # use training loss for early stopping if no validation data 149 | val_loss = train_loss 150 | 151 | # check for early stopping 152 | if val_loss < best_val_loss: 153 | best_val_loss = val_loss 154 | best_val_results = val_results 155 | best_train_loss = train_loss 156 | best_model_state = deepcopy(self.model.state_dict()) 157 | early_stopping_counter = 0 158 | indicator = '*' 159 | else: 160 | early_stopping_counter += 1 161 | indicator = '' 162 | 163 | if verbose: 164 | # print loss of epoch 165 | time_elapsed = format_time(time() - start) 166 | train_print_string = f'Train Loss: {train_loss:>9.5f}' 167 | val_print_string = f' || Validation Loss: {val_loss:>9.5f}' if val_loader is not None else '' 168 | print(f' Epoch {i+1:>2}/{epochs} ({time_elapsed}/it) -- ({train_print_string}{val_print_string}) {indicator}') 169 | 170 | # stop training if early stopping criterion is fulfilled 171 | if early_stopping_counter == early_stopping_point and not epochs == i+1: 172 | if verbose: 173 | print(f' ...Stopping early after {i+1} epochs...') 174 | break 175 | # end of epoch 176 | # end of training loop 177 | 178 | if verbose: 179 | # print loss after training 180 | 
print(' Training Results:') 181 | print(f' Train MSE: {best_train_loss:.5f}') 182 | if val_loader is not None: 183 | print(f' Validation MSE: {best_val_loss:.5f}\n') 184 | 185 | # return values: loss histories, model_state_dict (optional), validation results (optional) 186 | if val_loader is None: 187 | re = [train_loss_history, None] 188 | best_val_results = None 189 | else: 190 | re = [train_loss_history, val_loss_history] 191 | self.model.load_state_dict(best_model_state) 192 | if return_model_state: 193 | re.append(best_model_state) 194 | if return_val_results: 195 | re.append(best_val_results) 196 | 197 | if len(re) == 1: 198 | return re.pop() 199 | else: 200 | return tuple(re) 201 | 202 | 203 | -------------------------------------------------------------------------------- /src/utils/utils.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn.functional as F 3 | from .device import get_device 4 | 5 | def normalize_with_median_iqr(x): 6 | ''' 7 | Row normalization with median and interquartile range for 2d tensors. 8 | 9 | Args: 10 | x (Tensor): 2-dimensional input tensor. 11 | ''' 12 | assert isinstance(x, torch.Tensor) 13 | 14 | device = get_device() 15 | 16 | quantiles = torch.tensor([.25, .5, .75]).to(device) 17 | q1, median, q3, = torch.quantile(x, quantiles, dim=1) 18 | iqr = q3 - q1 19 | 20 | return (x - median.unsqueeze(0).T) / (1 + iqr.unsqueeze(0).T) 21 | 22 | def weighted_average_smoothing(x, k, mode='mean'): 23 | ''' 24 | Average (weighted) smoothing of rows of a 2d tensor with 1d kernel, padding='same'. 25 | 26 | Args: 27 | x (Tensor): 2-dimensional input tensor. 28 | k (int): Size of the smoothing kernel. 29 | mode (str): Weighting of the average. 
Can be: 30 | 'mean' : no weighting 31 | 'exp' : exponentially higher weights towards the right side of a row 32 | 33 | ''' 34 | assert isinstance(x, torch.Tensor) 35 | 36 | device = get_device() 37 | 38 | n = x.size(0) 39 | div, mod = divmod(k, 2) 40 | p1d = (div, div - (mod ^ 1)) # (left, right) padding, k-1 zeros in total 41 | x = torch.constant_pad_nd(x, p1d, value=0.0) 42 | x = x.view(n, 1, -1) 43 | 44 | if mode == 'mean': 45 | kernel = torch.full(size=(1,1,k), fill_value=1/k, requires_grad=False) 46 | elif mode == 'exp': 47 | kernel = torch.logspace(-k+1, 0, k, base=1.5, requires_grad=False) 48 | kernel /= kernel.sum() 49 | kernel = kernel.view(1,1,k) 50 | 51 | return F.conv1d(x, kernel.to(device)).squeeze() 52 | 53 | def cast(dtype, *args): 54 | ''' 55 | Casts arbitrary number of tensors to specified type. 56 | 57 | Args: 58 | dtype (type or string): The desired type. 59 | *args: Tensors to be type-cast. 60 | 61 | ''' 62 | a = [x.type(dtype) for x in args] 63 | if len(a) == 1: 64 | return a.pop() 65 | else: 66 | return a 67 | 68 | def equalize_len(t1, t2, value=0): 69 | ''' 70 | Returns new tensors with equal length according to max(len(t1), len(t2)). 71 | 72 | Args: 73 | t1 (Tensor): Input tensor 74 | t2 (Tensor): Input tensor 75 | value (int, float, optional): Fill value for new entries in shorter tensor. 76 | ''' 77 | 78 | assert isinstance(t1, torch.Tensor) 79 | assert isinstance(t2, torch.Tensor) 80 | 81 | if len(t1) == len(t2): 82 | return t1, t2 83 | 84 | diff = abs(len(t2) - len(t1)) 85 | p1d = (0, diff) 86 | if len(t1) > len(t2): 87 | t2 = F.pad(t2, p1d, 'constant', value) 88 | return t1, t2 89 | else: 90 | t1 = F.pad(t1, p1d, 'constant', value) 91 | return t1, t2 92 | 93 | def format_time(t): 94 | ''' 95 | Format seconds to days, hours, minutes, and seconds. 96 | -> Output format example: "01d-09h-24m-54s" 97 | 98 | Args: 99 | t (float, int): Time in seconds. 
100 | ''' 101 | assert isinstance(t, (float, int)) 102 | 103 | h, r = divmod(t,3600) 104 | d, h = divmod(h, 24) 105 | m, r = divmod(r, 60) 106 | s, r = divmod(r, 1) 107 | 108 | values = [d, h, m, s] 109 | symbols = ['d', 'h', 'm', 's'] 110 | for i, val in enumerate(values): 111 | if val > 0: 112 | symbols[i] = ''.join([f'{int(val):02d}', symbols[i]]) 113 | else: 114 | symbols[i] = '' 115 | return '-'.join(s for s in symbols if s) if any(symbols) else '<1s' -------------------------------------------------------------------------------- /src/visualization/error_distribution.py: -------------------------------------------------------------------------------- 1 | ### TODO:NEEDS REWORK !!! 2 | 3 | 4 | import numpy as np 5 | import torch 6 | import seaborn as sns 7 | import pandas as pd 8 | from ..utils.utils import normalize_with_median_iqr 9 | 10 | def get_error_distribution_plot(results_dict): 11 | ''' 12 | Returns a plot of the error distribution derived from predictions 13 | and groundtruth values. 14 | 15 | Args: 16 | results_dict (dict): Dictionary of test results. 
17 | ''' 18 | 19 | errors = [] 20 | for key, value in results_dict.items(): 21 | y_pred, y, _ = value 22 | err = torch.abs(y_pred - y).cpu().numpy() 23 | err = normalize_with_median_iqr(err) 24 | s = pd.Series(err, index=[key]*len(err)) 25 | errors.append(s) 26 | if key == 'Validation': 27 | threshold = err.max() 28 | 29 | errors = pd.Series(dtype=np.float64).append(errors) 30 | errors = errors.apply(lambda x: np.nan if x < threshold*0.75 else x) 31 | df = pd.DataFrame({'normalized_error': errors}) 32 | df = df.dropna() 33 | df.index.name = 'Mode' 34 | df = df.reset_index() 35 | 36 | error_plot = sns.displot(df, x="normalized_error", hue="Mode", kind="kde", fill=True) 37 | return error_plot 38 | 39 | -------------------------------------------------------------------------------- /src/visualization/graph_plot.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import networkx as nx 3 | from sklearn.manifold import TSNE 4 | from pyvis.network import Network 5 | import matplotlib.pyplot as plt 6 | 7 | def plot_embedding(edge_indices, W, labels, path=None, notebook=False): 8 | ''' 9 | Creates a plot of the given graph. Layout is determined by t-SNE dimensionality reduction to 2d. 10 | Saves a plot of the 2d t-SNE of the embedding space. 11 | Can directly display plot if used within notebook. 
12 | ''' 13 | 14 | assert isinstance(edge_indices, torch.Tensor) 15 | assert isinstance(W, torch.Tensor) 16 | 17 | edge_indices = edge_indices.cpu() 18 | W = W.detach().cpu() 19 | 20 | num_nodes = int(edge_indices.max()) + 1 # plain Python int for range() and list repetition 21 | 22 | # compute list of edge index pairs from sparse adj matrix shape [2, num_edges] 23 | edge_list = zip(*edge_indices.detach().tolist()) 24 | edge_list = [(a,b) for a,b in edge_list if not a == b] # remove self loops 25 | 26 | # generate graph from edge list 27 | G = nx.from_edgelist(edge_list) 28 | 29 | ### PARAMETERS FOR DRAWING 30 | # node ids 31 | node_keys = range(num_nodes) 32 | def node_dict(x): return dict(zip(node_keys, x)) 33 | 34 | # embedding mapping from Nd space to 2d 35 | W2D = TSNE(n_components=2).fit_transform(W) 36 | 37 | xs, ys = W2D.T.tolist() 38 | 39 | # node coordinates 40 | x_map = node_dict(xs) 41 | y_map = node_dict(ys) 42 | 43 | # node labels 44 | node_labels = node_dict(labels) 45 | 46 | # node sizes 47 | sizes = [12] * num_nodes 48 | size_map = node_dict(sizes) 49 | 50 | # node colours 51 | string_split = [string.split('_') for string in labels] 52 | if len(string_split[0]) == 1: # swat 53 | sensor_types = [str(*string)[:-3] for string in string_split] 54 | else: # wadi 55 | sensor_types = ['_'.join(string[:2]) for string in string_split] 56 | sensor_set = set(sensor_types) 57 | mapping = dict(zip(sensor_set, range(len(sensor_set)))) 58 | node_color_map = dict(zip(node_keys, [mapping[key] for key in sensor_types])) 59 | 60 | nx.set_node_attributes(G, node_labels, 'label') 61 | nx.set_node_attributes(G, node_color_map, 'group') 62 | nx.set_node_attributes(G, size_map, 'size') 63 | nx.set_node_attributes(G, x_map, 'x') 64 | nx.set_node_attributes(G, y_map, 'y') 65 | 66 | # pyvis network from networkx graph 67 | net = Network('1000px', '100%', bgcolor='#222222', font_color='white', notebook=notebook) 68 | # net = Network('1000px', '100%', bgcolor='#ffffff', font_color='black', notebook=notebook) 69 | net.from_nx(G) 70 
| # gravity model for plot layout 71 | net.force_atlas_2based(gravity=-30, central_gravity=0.1, spring_length=50, spring_strength=0.001, damping=0.09, overlap=0.1) 72 | net.show_buttons(filter_=['physics']) 73 | if path is not None: 74 | net.save_graph(path) 75 | if notebook: 76 | net.show('graph.html') 77 | 78 | # plot of t-SNE 79 | _, axes = plt.subplots(1,2) 80 | for ax in axes: 81 | ax.scatter(xs, ys, c=list(node_color_map.values()), alpha=0.7) 82 | for i, label in enumerate(labels): 83 | axes[1].annotate(label, (xs[i], ys[i])) 84 | if path is not None: # path defaults to None; strip only the file extension 85 | plt.savefig(path.rsplit('.', 1)[0] + '_tSNE.png') 86 | 87 | 88 | def plot_adjacency(A, labels, path=None, notebook=False): 89 | ''' 90 | Creates an interactive plot of the graph defined by the adjacency matrix A. 91 | Layout is determined by the force-directed physics model configured below. 92 | Can directly display plot if used within notebook. 93 | ''' 94 | assert isinstance(A, torch.Tensor) 95 | 96 | A = A.detach().clone() # clone so the caller's tensor is not modified in place 97 | A.fill_diagonal_(0) 98 | A = A.cpu().numpy() 99 | 100 | num_nodes = A.shape[0] 101 | 102 | # generate graph from adjacency matrix 103 | directed = (A != A.T).any() 104 | if directed: 105 | G = nx.from_numpy_matrix(A, create_using=nx.DiGraph) 106 | else: 107 | G = nx.from_numpy_matrix(A) 108 | 109 | ### PARAMETERS FOR DRAWING 110 | # node ids 111 | node_keys = range(num_nodes) 112 | def node_dict(x): return dict(zip(node_keys, x)) 113 | 114 | # node labels 115 | node_labels = node_dict(labels) 116 | 117 | # node sizes 118 | sizes = [12] * num_nodes 119 | size_map = node_dict(sizes) 120 | 121 | # node colours 122 | string_split = [string.split('_') for string in labels] 123 | if len(string_split[0]) == 1: # swat 124 | sensor_types = [str(*string)[:-3] for string in string_split] 125 | else: # wadi 126 | sensor_types = ['_'.join(string[:2]) for string in string_split] 127 | sensor_set = set(sensor_types) 128 | mapping = dict(zip(sensor_set, range(len(sensor_set)))) 129 | node_color_map = 
dict(zip(node_keys, [mapping[key] for key in sensor_types])) 130 | 131 | nx.set_node_attributes(G, node_labels, 'label') 132 | nx.set_node_attributes(G, node_color_map, 'group') 133 | nx.set_node_attributes(G, size_map, 'size') 134 | 135 | # pyvis network from networkx graph 136 | directed = (A != A.T).any() 137 | net = Network('1000px', '100%', directed=directed, bgcolor='#222222', font_color='white', notebook=notebook) 138 | # net = Network('1000px', '100%', directed=directed, bgcolor='#ffffff', font_color='black', notebook=notebook) 139 | net.from_nx(G) 140 | # gravity model for plot layout 141 | net.force_atlas_2based(gravity=-30, central_gravity=0.1, spring_length=50, spring_strength=0.001, damping=0.09, overlap=0.1) 142 | net.show_buttons(filter_=['physics']) 143 | if path is not None: 144 | net.save_graph(path) 145 | if notebook: 146 | net.show('graph.html') -------------------------------------------------------------------------------- /src/visualization/loss_plot.py: -------------------------------------------------------------------------------- 1 | import matplotlib.pyplot as plt 2 | 3 | def get_loss_plot(train_loss_history, val_loss_history): 4 | ''' 5 | Returns a pyplot figure object with the plot of the 6 | train and validation loss lists obtained in training. 
7 | ''' 8 | 9 | colors = ['#2300a8', '#8400a8'] # '#8400a8', '#00A658' 10 | plot_dict = {'Training': (train_loss_history, colors[0]), 'Validation': (val_loss_history, colors[1])} 11 | 12 | n = len(train_loss_history) 13 | 14 | # plot train and val losses and fill area under the curve 15 | fig, ax = plt.subplots() 16 | x_axis = list(range(1, n+1)) 17 | for key, (data, color) in plot_dict.items(): 18 | if data is not None: 19 | ax.plot(x_axis, data, 20 | label=key, 21 | linewidth=2, 22 | linestyle='-', 23 | marker='o', 24 | alpha=1, 25 | color=color) 26 | ax.fill_between(x_axis, data, 27 | alpha=0.3, 28 | color=color) 29 | 30 | # x axis ticks 31 | n_x_ticks = 10 32 | k = max(1, n // n_x_ticks) 33 | x_ticks = list(range(1, n+1, k)) 34 | ax.set_xticks(x_ticks) 35 | 36 | # figure labels 37 | ax.set_title('Loss over time', fontweight='bold') 38 | ax.set_xlabel('Epochs', fontweight='bold') 39 | ax.set_ylabel('Mean Squared Error', fontweight='bold') 40 | ax.legend(loc='upper right') 41 | 42 | # remove top and right borders 43 | ax.spines['top'].set_visible(False) 44 | ax.spines['right'].set_visible(False) 45 | 46 | # adds major gridlines 47 | ax.grid(color='grey', linestyle='-', linewidth=0.35, alpha=0.8) 48 | 49 | # log scale of y-axis 50 | ax.set_yscale('log') 51 | 52 | return fig -------------------------------------------------------------------------------- /src/visualization/tensorboard.py: -------------------------------------------------------------------------------- 1 | 2 | # from torch.utils.tensorboard import SummaryWriter 3 | 4 | # stamp = datetime.now().strftime("%Y%m%d-%H%M%S") 5 | # logdir = 'logs/%s' % stamp 6 | 7 | # writer = SummaryWriter(logdir) 8 | # writer.add_graph(model, [x[:,:window], *[torch.tensor(np.nan)] * 2]) 9 | # writer.close() 10 | --------------------------------------------------------------------------------
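The 'same' padding and the exponential weights used by `weighted_average_smoothing` in `src/utils/utils.py` can be checked without PyTorch. The following is a minimal sketch in plain Python; `same_padding` and `exp_kernel` are hypothetical helper names that only mirror the arithmetic of that function and are not part of this repository.

```python
# Hypothetical helpers (not in the repository): they reproduce the padding
# split and the 'exp' kernel weights of weighted_average_smoothing.

def same_padding(k):
    """Return (left, right) zero padding so a size-k kernel keeps the row length."""
    div, mod = divmod(k, 2)
    # odd k: (div, div); even k: (div, div - 1)
    return div, div - (mod ^ 1)


def exp_kernel(k, base=1.5):
    """Normalized weights base**(-k+1), ..., base**0, with the largest weight last."""
    weights = [base ** e for e in range(-k + 1, 1)]
    total = sum(weights)
    return [w / total for w in weights]


for k in (4, 5):
    left, right = same_padding(k)
    # left + right equals k - 1, so a valid convolution returns the input length
    print(k, (left, right), [round(w, 3) for w in exp_kernel(k)])
```

Because the left side never receives less padding than the right and the weights grow towards the end of the kernel, the 'exp' mode weights the samples on the right side of each window most heavily, as the docstring of the original function describes.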