├── README.md
├── install.sh
├── main.py
├── run.sh
└── src
    ├── __init__.py
    ├── data
    │   ├── SlidingWindowDataset.py
    │   ├── Transforms.py
    │   └── __init__.py
    ├── datasets
    │   ├── __init__.py
    │   ├── demo.py
    │   ├── files
    │   │   ├── .gitignore
    │   │   ├── processed
    │   │   │   └── demo
    │   │   │       ├── labels.pt
    │   │   │       ├── list.txt
    │   │   │       ├── meta.json
    │   │   │       ├── test.pt
    │   │   │       └── train.pt
    │   │   └── raw
    │   │       └── demo
    │   │           ├── test.csv
    │   │           └── train.csv
    │   ├── from_csv.py
    │   ├── swat.py
    │   └── wadi.py
    ├── layers
    │   ├── __init__.py
    │   ├── attention.py
    │   └── embedding.py
    ├── models
    │   ├── ConvSeqAttention.py
    │   ├── GNNLSTM.py
    │   ├── LSTM.py
    │   ├── Linear.py
    │   ├── MTGNN.py
    │   └── __init__.py
    ├── utils
    │   ├── __init__.py
    │   ├── device.py
    │   ├── evaluate.py
    │   ├── metrics.py
    │   ├── nn_trainer.py
    │   └── utils.py
    └── visualization
        ├── error_distribution.py
        ├── graph_plot.py
        ├── loss_plot.py
        └── tensorboard.py
/README.md:
--------------------------------------------------------------------------------
1 | # Multivariate Time Series Anomaly Detection with GNNs and Latent Graph Inference
2 | Implementation of different graph neural network (GNN) based models for anomaly detection in multivariate timeseries in sensor networks.
3 | An explicit graph structure modelling the interrelations between sensors is inferred during training and used for time series forecasting. Anomaly detection is based on the error between the predicted and actual values at each time step.
4 |
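The error-based detection described above can be illustrated with a minimal, library-free sketch. It is not the repository's `evaluate.py` (which is not shown here): the per-step score (largest absolute error across sensors) and the helper names `anomaly_scores`, `max_threshold`, and `flag_anomalies` are assumptions made for illustration, loosely following the `max` thresholding option.

```python
def anomaly_scores(predicted, actual):
    """One score per time step: the largest absolute error across sensors (assumed scoring rule)."""
    return [
        max(abs(p - a) for p, a in zip(p_row, a_row))
        for p_row, a_row in zip(predicted, actual)
    ]


def max_threshold(normal_scores):
    """Threshold set to the largest score seen on anomaly-free data."""
    return max(normal_scores)


def flag_anomalies(scores, threshold):
    """Mark every time step whose score exceeds the threshold."""
    return [int(s > threshold) for s in scores]


# Hypothetical forecasts and observations for two sensors.
val_pred = [[0.0, 1.0], [0.5, 1.0]]
val_true = [[0.1, 1.0], [0.5, 1.2]]
threshold = max_threshold(anomaly_scores(val_pred, val_true))

test_pred = [[0.0, 1.0], [0.0, 1.0]]
test_true = [[0.1, 1.1], [0.0, 3.0]]
flags = flag_anomalies(anomaly_scores(test_pred, test_true), threshold)
print(flags)
```

Only the second test step is flagged, because its error on the second sensor is far larger than anything observed on the anomaly-free data.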
5 | ## Installation
6 | ### Requirements
7 | * Python == 3.7
8 | * cuda == 10.2
9 | * [pytorch==1.8.1](https://pytorch.org/)
10 | * [torch-geometric==1.7.2](https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html)
11 |
12 | Additional package files for torch-geometric (Python 3.7 & PyTorch 1.8.1) are provided in '/whl/' in case they are unavailable.
13 | Refer to https://pytorch-geometric.com/whl/ for other versions.
14 | ### Install python
15 | Create a Python environment, for example with conda:
16 | ```
17 | conda create -n py37 python=3.7
18 | ```
19 | ### Install packages
20 | Run the install bash script with either the cpu or the cuda flag, depending on the intended use.
21 | ```
22 | # run after installing python
23 | bash install.sh cpu
24 |
25 | # or
26 | bash install.sh cuda
27 | ```
28 |
29 | ## Models
30 | The repository contains several models. GNN-LSTM is used by default and achieved the best performance.
31 |
32 | ### GNN-LSTM
33 | Model with GNN feature expansion before a multi-layer LSTM. A single node embedding serves two purposes: it is used to infer the latent graph through vector similarity,
34 | and it is added as a positional node embedding to the GNN features before they are passed to the recurrent network.
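As a rough illustration of inferring a graph from node embeddings, the sketch below connects each node to its `k` most similar other nodes. The choice of cosine similarity and the helper names `cosine` and `topk_edges` are assumptions; the repository's `embedding.py` and `GNNLSTM.py` are not shown here and may compute similarity differently.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def topk_edges(embeddings, k):
    """Directed edges (source, target): every target keeps its k most similar other nodes."""
    edges = []
    for i, emb_i in enumerate(embeddings):
        sims = [(cosine(emb_i, emb_j), j) for j, emb_j in enumerate(embeddings) if j != i]
        sims.sort(key=lambda pair: pair[0], reverse=True)
        edges.extend((j, i) for _, j in sims[:k])
    return edges


# Four hypothetical sensors with 2-dimensional embeddings.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
edges = topk_edges(embeddings, k=1)
print(edges)
```

With `k=1`, sensors 0 and 1 point at each other, as do sensors 2 and 3, mirroring the two clusters in the embedding space. Because the embeddings would be trainable parameters in the model, the neighbor sets can change as training progresses.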
35 |
36 | ### Convolutional-Attention Sequence-To-Sequence
37 | Spatial-Temporal Convolution GNN with attention. Data is split into an encoder and a decoder. The encoder creates a feature representation for each time step, while the decoder creates a single representation. Encoder-decoder attention is concatenated with the decoder output before being passed to the prediction layer.
38 | Uses multiple embedding layers so that the latent graph is parameterized directly by the network.
39 | Inspired by: https://arxiv.org/pdf/1705.03122.pdf.
40 |
41 | ### MTGNN
42 | Spatial-Temporal Convolution GNN with attention and graph mix-hop propagation.
43 | Taken from: https://arxiv.org/pdf/2005.11650.pdf.
44 |
45 | ### LSTM
46 | Vanilla multi-layer LSTM used for benchmarking.
47 |
48 |
49 | ## Data
50 | ### SWaT, WADI, Demo
51 | A test dataset ('demo') is included in the repository (see '/datasets/files/').
52 | SWaT and WADI datasets can be requested from [iTrust](https://itrust.sutd.edu.sg/).
53 | The files should be opened in e.g. Excel to remove the first empty rows and then saved as .csv files.
54 | The CSV files should be placed in a folder with the same name as the dataset ('swat' or 'wadi') in '/datasets/files/raw/', e.g. '/datasets/files/raw/swat/'.
55 |
56 | ### Other
57 | Additional datasets can either be loaded directly from CSV file using the dataset 'from_csv'
58 | or by creating a custom dataset following the examples found in the '/datasets/' folder.
59 | If 'from_csv' is used, the data should come in the same format as the demo data included in this repository,
60 | with individual time series for each sensor represented by a single column. Only the test data should have
61 | anomaly labels included in the last column.
62 | The first column is assumed to be the timestamp. The files are to be placed in '/datasets/files/raw/from_csv/'.
63 | If this option is chosen, data normalization is not available. Any preprocessing should be done manually.
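The expected CSV layout (timestamp first, one column per sensor, labels last in the test file only) can be sketched with the standard library alone. This is not the repository's `from_csv.py`; the function name `parse_sensor_csv` and the column names in the sample are hypothetical.

```python
import csv
import io


def parse_sensor_csv(text, has_labels):
    """Split CSV text into sensor names, per-row sensor values, and optional labels."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    last = -1 if has_labels else None  # drop the label column only if it exists
    sensor_names = [name.strip() for name in header[1:last]]  # column 0 is the timestamp
    values = [[float(v) for v in row[1:last]] for row in body]
    labels = [int(float(row[-1])) for row in body] if has_labels else None
    return sensor_names, values, labels


# Hypothetical test file: timestamp, two sensors, anomaly label.
test_csv = "timestamp,S-01,S-02,label\n0,0.1,0.5,0\n1,0.2,0.4,1\n"
names, values, labels = parse_sensor_csv(test_csv, has_labels=True)
print(names, values, labels)

# Hypothetical training file: same layout without the label column.
train_csv = "timestamp,S-01,S-02\n0,0.1,0.5\n"
train_names, train_values, train_labels = parse_sensor_csv(train_csv, has_labels=False)
```

Since normalization is unavailable for this option, any scaling would have to be applied to the values before they are written to the CSV files.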
64 |
65 | ## Usage
66 |
67 | ### Bash Script
68 | Suitable parameters for the SWaT, WADI, and Demo datasets can be found in the bash script (*run.sh*),
69 | which is the most convenient way to run the models.
70 | ```
71 | # run from terminal
72 | bash run.sh [dataset]
73 | ```
74 |
75 | *Examples:*
76 | ```
77 | # example 1
78 | bash run.sh swat
79 |
80 | # example 2
81 | bash run.sh wadi
82 |
83 | # example 3
84 | bash run.sh demo
85 | ```
86 |
87 | ### Python File
88 | Run the *main.py* script from your terminal (bash, PowerShell, etc.).
89 | Flags can be passed to override the default model and training hyperparameters.
90 | Alternatively, the defaults can be changed within the file (argparse default values).
91 | ```
92 | # run from terminal
93 | python main.py -[flags]
94 | ```
95 | *Examples:*
96 | ```
97 | # example 1
98 | python main.py -dataset demo -batch_size 4 -epochs 10
99 |
100 | # example 2
101 | python main.py -dataset swat -epochs 10 -topk 20 -embed_dim 128
102 |
103 | # example 3
104 | python main.py -dataset from_csv
105 | ```
106 |
107 | **Available flags:**
108 | `-dataset` The dataset.
109 | `-window_size` Number of historical timesteps used in each sample.
110 | `-horizon` Number of prediction steps.
111 | `-val_split` Fraction of the training data used for the validation dataset. Value between 0 and 1.
112 | `-transform` Sampling transform applied to the model input data (e.g. median).
113 | `-target_transform` Sampling transform applied to the model target values. (e.g. median, max).
114 | `-normalize` Boolean value if data normalization should be applied.
115 | `-shuffle_train` Boolean value if training data should be shuffled.
116 | `-batch_size` Number of samples in each batch.
117 |
118 | `-embed_dim` Number of node embedding dimensions (Disabled for GNN-LSTM).
119 | `-topk` Number of allowed neighbors for each node.
120 |
121 | `-smoothing` Error smoothing kernel size.
122 | `-smoothing_method` Error smoothing kernel type (*mean* or *exp*).
123 | `-thresholding` Thresholding method (*mean*, *max*, *best* (best performs an exhaustive search for theoretical performance evaluation)).
124 |
125 | `-epochs` Number of training epochs.
126 | `-early_stopping` Patience: number of epochs without improvement before training stops early.
127 | `-lr` Learning rate.
128 | `-betas` Adam optimizer beta coefficients (two values).
129 | `-weight_decay` Adam optimizer weight regularization parameter.
130 | `-device` Computing device (*cpu* or *cuda*).
131 |
132 | `-log_graph` Boolean for logging of learned graphs.
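How `-window_size`, `-horizon`, and `-stride` interact can be shown on a plain Python list. The sample count and slicing follow the arithmetic in `SlidingWindowDataset._process`; the function name `sliding_windows` is only illustrative, and the real class operates on tensors of shape (sensors, time) rather than on lists.

```python
def sliding_windows(series, window_size, horizon=1, stride=1):
    """Return (input window, prediction target) pairs cut from a sequence."""
    # Same sample count as the dataset class uses.
    n = (len(series) - window_size - horizon) // stride + 1
    samples = []
    for i in range(max(n, 0)):
        start = i * stride
        x = series[start:start + window_size]
        y = series[start + window_size:start + window_size + horizon]
        samples.append((x, y))
    return samples


samples = sliding_windows(list(range(10)), window_size=4, horizon=2, stride=2)
for x, y in samples:
    print(x, "->", y)
```

Ten time steps with a window of 4, a horizon of 2, and a stride of 2 give three samples. A larger stride reduces overlap between consecutive samples and therefore the dataset size.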
133 |
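The `median` option of `-transform` and `-target_transform` downsamples by aggregating blocks of `k` values (`main.py` uses `k = 10`). Below is a pure-Python sketch of the same idea for a single series. The name `median_downsample` is hypothetical, and `statistics.median_low` is used because `torch.median` returns the lower of the two middle values for even-sized blocks.

```python
import statistics


def median_downsample(values, k):
    """Zero-pad the series to a multiple of k, then take the median of each block of k values."""
    pad = (k - len(values) % k) % k
    padded = list(values) + [0] * pad
    return [statistics.median_low(padded[i:i + k]) for i in range(0, len(padded), k)]


downsampled = median_downsample([1, 5, 3, 2, 8, 4, 7], k=3)
print(downsampled)
```

The final block here is `[7, 0, 0]`, so its median is 0: zero padding can distort the last downsampled value when the series length is not a multiple of `k`.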
134 | ## Results
135 | ### Logs
136 | After the initial run, a '/runs/' folder will be automatically created.
137 | A copy of the model state dict, a loss plot, plots for the learned graph representation
138 | and some additional information will be saved for each run of the model.
139 |
140 | ### Example Plots
141 |
142 | Visualization of a t-SNE embedding of the learned undirected graph representation for the SWaT dataset
143 | with 15 neighbors per node.
144 |
145 |
146 | Plot of a directly parameterized uni-directional graph adjacency matrix with a single neighbor per node.
147 |
148 |
149 | Node colors and labels indicate type of sensor.
150 |
151 |
152 | **P:** Pump
153 | **MV:** Motorized valve
154 | **UV:** Dechlorinator
155 | **LIT:** Level in tank
156 | **PIT:** Pressure in tank
157 | **FIT:** Flow in tank
158 | **AIT:** Analyzer in tank (different chemical analyzers; NaCl, HCl, ORP meters, etc)
159 | **DPIT:** Differential pressure indicating transmitter
160 |
--------------------------------------------------------------------------------
/install.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | DIR=$1
4 |
5 | # torch
6 | pip install torch==1.8.1
7 | pip install torch-geometric==1.7.2
8 |
9 | # extra torch geometric packages
10 | pip install --no-index torch-scatter -f whl/$DIR/
11 | pip install --no-index torch-sparse -f whl/$DIR/
12 | pip install --no-index torch-cluster -f whl/$DIR/
13 | pip install --no-index torch-spline-conv -f whl/$DIR/
14 |
15 | # extra packages
16 | pip install matplotlib==3.4.3
17 | pip install networkx==2.6.2
18 | pip install scikit-learn==0.24.2
19 | pip install scipy==1.7.1
20 | pip install seaborn==0.11.2
21 | pip install numpy==1.21.2
22 | pip install pandas==1.3.2
23 | pip install pyvis==0.1.9
--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
1 | import sys
2 | import os
3 |
4 | import argparse
5 | import importlib
6 | from distutils.util import strtobool
7 |
8 | from datetime import datetime
9 |
10 | import numpy as np
11 | import torch
12 | from torch import nn
13 | from torch_geometric.data import DataLoader
14 |
15 | from src.data.Transforms import MedianSampling2d, MaxSampling1d, MedianSampling1d
16 | from src.models import GNNLSTM, ConvSeqAttentionModel, MTGNNModel, RecurrentModel
17 | from src.utils.device import get_device
18 | from src.utils import Trainer
19 | from src.utils.evaluate import evaluate_performance
20 | from src.visualization.graph_plot import plot_embedding, plot_adjacency
21 | from src.visualization.loss_plot import get_loss_plot
22 | # from src.visualization.error_distribution import get_error_distribution_plot
23 |
24 | def main(args):
25 |
26 |     print()
27 |     # check python and torch versions
28 |     print(f'Python v.{sys.version.split()[0]}')
29 |     print(f'PyTorch v.{torch.__version__}')
30 |
31 |     # get device
32 |     device = args.device
33 |     print(f'Device status: {device}')
34 |
35 |     dataset = args.dataset
36 |     # dataset import
37 |     p, m = 'src.datasets.' + dataset, dataset.capitalize()
38 |     mod = importlib.import_module(p)
39 |     dataset_class = getattr(mod, m)
40 |
41 |     # dataset transforms
42 |     transform_dict = {'median': MedianSampling2d}
43 |     target_transform_dict = {'median': MedianSampling1d, 'max': MaxSampling1d}
44 |
45 |     transform = transform_dict.get(args.transform, None)
46 |     if transform is not None:
47 |         transform = transform(10)
48 |
49 |     target_transform = target_transform_dict.get(args.target_transform, None)
50 |     if target_transform is not None:
51 |         target_transform = target_transform(10)
52 |
53 |     # training / test data set definitions
54 |     lags = args.window_size
55 |     stride = args.stride
56 |     horizon = args.horizon
57 |     train_ds = dataset_class(lags, stride=stride, horizon=horizon, train=True, transform=transform, normalize=args.normalize, device=device)
58 |     test_ds = dataset_class(lags, stride=stride, horizon=horizon, train=False, transform=transform, target_transform=target_transform, normalize=args.normalize, device=device)
59 |
60 |     # get train and validation data split at random index
61 |     val_split = args.val_split
62 |     val_len = int(len(train_ds)*val_split)
63 |     split_idx = np.random.randint(0, len(train_ds) - val_len) # random start index of the validation interval
64 |     a, b = split_idx, split_idx+val_len # split interval
65 |
66 |     train_partition = train_ds[:a] + train_ds[b:]
67 |     val_partition = train_ds[a:b]
68 |
69 |     # data loaders
70 |     batch_size = args.batch_size
71 |     # train, val, test partitions
72 |     train_loader = DataLoader(train_partition[int(len(train_ds)*0.0):], batch_size=batch_size, shuffle=args.shuffle_train)
73 |     if len(val_partition) > 0:
74 |         val_loader = DataLoader(val_partition, batch_size=batch_size, shuffle=True)
75 |         thresholding_loader = DataLoader(val_partition, batch_size=batch_size, shuffle=False)
76 |     else:
77 |         val_loader = None
78 |         thresholding_loader = DataLoader(train_ds[int(len(train_ds)*0.0):], batch_size=batch_size, shuffle=False)
79 |     test_loader = DataLoader(test_ds, batch_size=batch_size, shuffle=False)
80 |
81 |     # node meta data
82 |     num_nodes = train_ds.num_nodes
83 |     args.num_nodes = num_nodes
84 |     try:
85 |         node_names = train_ds.node_names
86 |     except AttributeError:
87 |         node_names = [str(i) for i in range(num_nodes)] # fall back to index labels, one per node
88 |
89 |     print(f'\nDataset <{train_ds.name.capitalize()}> loaded...')
90 |     print(f' Number of nodes: {num_nodes}')
91 |     print(f' Training samples: {len(train_partition)}')
92 |     if val_loader:
93 |         print(f' Validation samples: {len(val_partition)}')
94 |     print(f' Test samples: {len(test_ds)}\n')
95 |
96 |     ### MODEL
97 |     # model = ConvSeqAttentionModel(args).to(device)
98 |     # model = RecurrentModel(args).to(device)
99 |     # model = MTGNNModel(args).to(device)
100 |     model = GNNLSTM(args).to(device)
101 |
102 |     optimizer = torch.optim.Adam(model.parameters(), lr=args.lr, betas=args.betas, weight_decay=args.weight_decay)
103 |     criterion = nn.MSELoss(reduction='mean')
104 |
105 |     # torch.autograd.set_detect_anomaly(True) # uncomment for debugging
106 |
107 |     print('Training...')
108 |     trainer = Trainer(model, optimizer, criterion)
109 |     stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
110 |
111 |     # log directory
112 |     logdir = os.path.join('runs/', stamp + f' - {args.dataset}')
113 |     os.makedirs(logdir, exist_ok=True)
114 |
115 |     if args.log_graph:
116 |         # save randomly initialised graph for plotting
117 |         init_edge_index, init_embedding = model.get_embedding()
118 |         init_graph = model.get_graph()
119 |
120 |     ### TRAINING ###
121 |     train_loss_history, val_loss_history, best_model_state = trainer.train(
122 |         train_loader,
123 |         val_loader,
124 |         epochs=args.epochs,
125 |         early_stopping=args.early_stopping,
126 |         return_model_state=True,
127 |     )
128 |     ### TESTING ###
129 |     print('Testing...')
130 |     # best model parameters
131 |     model.load_state_dict(best_model_state)
132 |
133 |     thresholding_results, final_train_loss = trainer.test(thresholding_loader)
134 |     test_ds_results, test_loss = trainer.test(test_loader)
135 |     print(f' Thresholding Data MSE: {final_train_loss:.6f}')
136 |     print(f' Test MSE: {test_loss:.4f}\n')
137 |
138 |     with open(os.path.join(logdir, 'loss.txt'), 'w') as f:
139 |         f.write(f'Thresholding MSE: {final_train_loss:.6f}\nTest MSE: {test_loss:.6f}\n')
140 |
141 |     print('Evaluating Performance...')
142 |
143 |     results = evaluate_performance(
144 |         thresholding_results,
145 |         test_ds_results,
146 |         threshold_method=args.thresholding,
147 |         smoothing=args.smoothing,
148 |         smoothing_method=args.smoothing_method,
149 |     )
150 |     result_str = f' {str(results["method"]).capitalize()} thresholding:\n \n' + \
151 |         f'           |     Normal     |    Adjusted    |\n' + \
152 |         f' ----------|----------------|----------------|\n' + \
153 |         f' Precision |  {results["prec"]:>13.3f} |  {results["a_prec"]:>13.3f} |\n' + \
154 |         f' Recall    |  {results["rec"]:>13.3f} |  {results["a_rec"]:>13.3f} |\n' + \
155 |         f' F1 / F2   | {results["f1"]:>6.3f} / {results["f2"]:>5.3f} | {results["a_f1"]:>6.3f} / {results["a_f2"]:>5.3f} |\n' + \
156 |         f' ----------|----------------|----------------|----------------\n' + \
157 |         f'           | Latency: {results["latency"]:.2f}\n'
158 |
159 |     print(result_str)
160 |     with open(os.path.join(logdir, f'results_{str(results["method"])}.txt'), 'w') as f:
161 |         f.write(result_str)
162 |
163 |     ### Uncomment for exhaustive threshold / smoothing parameter search
164 |     # precision, recall, f1, f2 = -1, -1, -1, -1
165 |     # best_method = None
166 |     # j = 0
167 |     # for i in range(1, 25+1):
168 |     #     results = evaluate_performance(
169 |     #         thresholding_results,
170 |     #         test_ds_results,
171 |     #         threshold_method='best',
172 |     #         smoothing=i,
173 |     #         smoothing_method=args.smoothing_method,
174 |     #     )
175 |     #     if 1 >= results["f1"] > f1:
176 |     #         precision, recall, f1, f2 = results["prec"], results["rec"], results["f1"], results["f2"]
177 |     #         best_method = results["method"]
178 |     #         j = i
179 |     # print(f' Best method: {best_method}')
180 |     # print(f' Best smoothing parameter: {j}')
181 |     # print(f' Precision: {precision:.4f}')
182 |     # print(f' Recall: {recall:.4f}')
183 |     # print(f' F1 | F2 scores: {f1:.4f} | {f2:.4f}\n')
184 |
185 |     ### RESULTS PLOTS ###
186 |     print('Logging Results...')
187 |
188 |     with open(os.path.join(logdir, 'model.txt'), 'w') as f:
189 |         f.write(str(model))
190 |
191 |     # learned graph
192 |     if args.log_graph:
193 |         learned_edges, learned_embedding = model.get_embedding()
194 |         learned_graph = model.get_graph()
195 |         for i in range(len(learned_embedding)):
196 |             plot_embedding(init_edge_index, init_embedding[i], node_names, os.path.join(logdir, f'init_emb_{i}.html'))
197 |             plot_embedding(learned_edges, learned_embedding[i], node_names, os.path.join(logdir, f'trained_emb_{i}.html'))
198 |
199 |         plot_adjacency(init_graph, node_names, os.path.join(logdir, 'init_A.html'))
200 |         plot_adjacency(learned_graph, node_names, os.path.join(logdir, 'learned_A.html'))
201 |
202 |     # loss
203 |     fig = get_loss_plot(train_loss_history, val_loss_history)
204 |     fig.savefig(os.path.join(logdir, 'loss_plot.png'))
205 |
206 |     # # error distributions
207 |     # results_dict = {'Validation': thresholding_results, 'Testing': test_ds_results}
208 |     # fig = get_error_distribution_plot(results_dict)
209 |     # fig.savefig(os.path.join(logdir, 'error_distribution.png'))
210 |
211 |     ### SAVE MODEL ###
212 |     torch.save(best_model_state, os.path.join(logdir, 'model.pt'))
213 |
214 |     print() # script end
215 |
216 | if __name__ == '__main__':
217 |
218 |     device = get_device()
219 |
220 |     parser = argparse.ArgumentParser()
221 |
222 |     ### -- Data params --- ###
223 |     parser.add_argument("-dataset", type=str.lower, default="swat")
224 |     parser.add_argument("-window_size", type=int, default=30)
225 |     parser.add_argument("-stride", type=int, default=1)
226 |     parser.add_argument("-horizon", type=int, default=10)
227 |     parser.add_argument("-val_split", type=float, default=0.2)
228 |     parser.add_argument("-transform", type=str, default='median')
229 |     parser.add_argument("-target_transform", type=str, default='median')
230 |     parser.add_argument("-normalize", type=lambda x:strtobool(x), default=False)
231 |     parser.add_argument("-shuffle_train", type=lambda x:strtobool(x), default=True)
232 |     parser.add_argument("-batch_size", type=int, default=64)
233 |
234 |     ### -- Model params --- ###
235 |     # Sensor embedding
236 |     parser.add_argument("-embed_dim", type=int, default=16)
237 |     parser.add_argument("-topk", type=int, default=5)
238 |
239 |     ### --- Thresholding params --- ###
240 |     parser.add_argument("-smoothing", type=int, default=1)
241 |     parser.add_argument("-smoothing_method", type=str, default='exp') # exp or mean
242 |     parser.add_argument("-thresholding", type=str, default='max') # max or mean
243 |
244 |     ### --- Training params --- ###
245 |     parser.add_argument("-epochs", type=int, default=50)
246 |     parser.add_argument("-early_stopping", type=int, default=20)
247 |     parser.add_argument("-lr", type=float, default=1e-3)
248 |     parser.add_argument("-betas", nargs=2, type=float, default=(0.9, 0.999))
249 |     parser.add_argument("-weight_decay", type=float, default=0)
250 |     parser.add_argument("-device", type=torch.device, default=device) # cpu or cuda
251 |
252 |     ### --- Logging params --- ###
253 |     parser.add_argument("-log_tensorboard", type=lambda x:strtobool(x), default=False)
254 |     parser.add_argument("-log_graph", type=lambda x:strtobool(x), default=True)
255 |
256 |     args = parser.parse_args()
257 |
258 |     main(args)
259 |
--------------------------------------------------------------------------------
/run.sh:
--------------------------------------------------------------------------------
1 | DATASET=$1
2 |
3 | if [[ "$DATASET" == "swat" ]]; then
4 | WINDOW_SIZE=50
5 | HORIZON=1
6 | STRIDE=1
7 | BATCH_SIZE=64
8 | EMBED_DIM=32
9 | TOPK=5
10 | EPOCHS=50
11 | VAL_SPLIT=0.2
12 | EARLY_STOPPING=10
13 | SMOOTHING=1
14 | SMOOTHING_METHOD="exp"
15 | THRESHOLDING="best"
16 | TRANSFORM="median"
17 | TARGET_TRANSFORM="median"
18 | NORMALIZE="True"
19 |
20 | elif [[ "$DATASET" == "wadi" ]]; then
21 | WINDOW_SIZE=50
22 | STRIDE=1
23 | HORIZON=1
24 | BATCH_SIZE=64
25 | EMBED_DIM=32
26 | TOPK=8
27 | EPOCHS=50
28 | VAL_SPLIT=0.1
29 | EARLY_STOPPING=10
30 | SMOOTHING=1
31 | SMOOTHING_METHOD="exp"
32 | THRESHOLDING="best"
33 | TRANSFORM="median"
34 | TARGET_TRANSFORM="median"
35 | NORMALIZE="True"
36 |
37 | elif [[ "$DATASET" == "demo" ]]; then
38 | WINDOW_SIZE=25
39 | STRIDE=1
40 | HORIZON=1
41 | BATCH_SIZE=32
42 | EMBED_DIM=16
43 | TOPK=3
44 | EPOCHS=50
45 | VAL_SPLIT=0
46 | EARLY_STOPPING=20
47 | SMOOTHING=1
48 | SMOOTHING_METHOD="mean"
49 | THRESHOLDING="best"
50 | TRANSFORM="none"
51 | TARGET_TRANSFORM="none"
52 | NORMALIZE="False"
53 | fi
54 |
55 | python main.py \
56 | -dataset $DATASET \
57 | -window_size $WINDOW_SIZE \
58 | -horizon $HORIZON \
59 | -stride $STRIDE \
60 | -val_split $VAL_SPLIT \
61 | -batch_size $BATCH_SIZE \
62 | -embed_dim $EMBED_DIM \
63 | -topk $TOPK \
64 | -epochs $EPOCHS \
65 | -early_stopping $EARLY_STOPPING \
66 | -smoothing $SMOOTHING \
67 | -smoothing_method $SMOOTHING_METHOD \
68 | -thresholding $THRESHOLDING \
69 | -normalize $NORMALIZE \
70 | -transform $TRANSFORM \
71 | -target_transform $TARGET_TRANSFORM
--------------------------------------------------------------------------------
/src/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/timbrockmeyer/mulivariate-time-series-anomaly-detection/bef170c5a20e00e5316002afdc3fdd445aa43777/src/__init__.py
--------------------------------------------------------------------------------
/src/data/SlidingWindowDataset.py:
--------------------------------------------------------------------------------
1 | import torch
2 | from torch.utils.data import Dataset
3 | from torch_geometric.data import Data
4 | from sklearn.preprocessing import MinMaxScaler
5 |
6 | class SlidingWindowDataset(Dataset):
7 |     '''
8 |     Dataset class for multivariate time series data.
9 |     Each returned sample of the dataset is a sliding window of specific length
10 |     given as a pytorch geometric data object.
11 |     https://pytorch-geometric.readthedocs.io/en/latest/modules/data.html#torch_geometric.data.Data
12 |
13 |     This should serve as the base class for specific datasets.
14 |
15 |     Args:
16 |         data (Tensor): 2d tensor with timesteps as rows and sensors as columns.
17 |         window_size (int): Length of the sliding window of each sample.
18 |         stride (int, optional): Stride length at which the dataset is sampled.
19 |         horizon (int, optional): Number of timesteps used as prediction target.
20 |         labels (Tensor, optional): Anomaly labels (None during training).
21 |         transform (callable, optional): Transform applied to the data.
22 |         target_transform (callable, optional): Transform applied to the labels.
23 |         device (str, optional): Device where data will be held (cpu, cuda).
24 |     '''
25 |     def __init__(self, data, window_size, stride=1, horizon=1, labels=None, transform=None, target_transform=None, normalize=False, device='cpu'):
26 |
27 |         self.window_size = window_size
28 |         self.stride = stride
29 |         self.horizon = horizon
30 |         self.normalize = normalize
31 |         self.device = torch.device(device)
32 |
33 |         self.dataset = self._process(data, labels, transform, target_transform)
34 |
35 |     def _process(self, data, labels, transform, target_transform):
36 |         assert isinstance(data, torch.Tensor)
37 |         assert isinstance(labels, (type(None), torch.Tensor))
38 |
39 |         _, info = self.meta  # self.meta must be set by the subclass before calling super().__init__()
40 |         if self.normalize:
41 |             train_meta = info['train']
42 |             min_ = torch.tensor(train_meta['min'], requires_grad=False)
43 |             max_ = torch.tensor(train_meta['max'], requires_grad=False)
44 |
45 |             fit_data = torch.stack([min_, max_], dim=0).detach().cpu().numpy()
46 |
47 |             normalizer = MinMaxScaler(feature_range=(0,1)).fit(fit_data)
48 |
49 |             data = torch.tensor(normalizer.transform(data.cpu().numpy())).to(self.device)
50 |
51 |         data = data.to(self.device).T.float()
52 |
53 |         if transform is not None:
54 |             data = transform(data)
55 |
56 |         self.num_nodes = data.size(0)
57 |
58 |         if labels is not None:
59 |             labels = labels.to(self.device)
60 |
61 |             if target_transform is not None:
62 |                 labels = target_transform(labels)
63 |
64 |         self._len = ((data.size(1) - self.window_size - self.horizon) // self.stride) + 1
65 |
66 |         dataset = []
67 |         for idx in range(self._len):
68 |             id = idx
69 |             idx *= self.stride
70 |             x = data[:, idx : idx + self.window_size]
71 |             y = data[:, idx + self.window_size : idx + self.window_size + self.horizon]
72 |
73 |             if labels is None:
74 |                 y_label = None
75 |             else:
76 |                 y_label = labels[idx + self.window_size : idx + self.window_size + self.horizon]
77 |
78 |             window = Data(x=x, edge_idx=None, edge_attr=None, y=y, y_label=y_label, id=id)
79 |             dataset.append(window)
80 |
81 |         return dataset
82 |
83 |     def __getitem__(self, idx):
84 |         return self.dataset[idx]
85 |
86 |     def __iter__(self):
87 |         self._idx = 0
88 |         return self
89 |
90 |     def __next__(self):
91 |         if self._idx >= self._len:
92 |             raise StopIteration
93 |
94 |         item = self.dataset[self._idx]
95 |         self._idx += 1
96 |         return item
97 |
98 |     def __repr__(self):
99 |         return f'{self.__class__.__name__}(num_nodes={self.num_nodes}, window_size={self.window_size}, stride={self.stride}, horizon={self.horizon})'
100 |
101 |     def __len__(self):
102 |         return self._len
103 |
--------------------------------------------------------------------------------
/src/data/Transforms.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn.functional as F
3 |
4 | # Transforms applied to datasets, e.g. for downsampling to speed up training
5 |
6 | class BaseSampling(torch.nn.Module):
7 |     '''
8 |     Base class for sampling transforms applied to tensors.
9 |
10 |     Args:
11 |         k (int): Number of samples to be aggregated.
12 |     '''
13 |     def __init__(self, k):
14 |         super().__init__()
15 |
16 |         self._k = k
17 |
18 |     def forward(self, x):
19 |         dims = len(x.shape)
20 |         p1d = 0, (self._k - x.shape[-1] % self._k) % self._k  # right-pad the last (time) dimension to a multiple of k
21 |         x = F.pad(x, p1d, "constant", 0)
22 |         x = x.unfold(dims-1, self._k, self._k)
23 |         x = self.sampling(x)
24 |         return x
25 |
26 |     def sampling(self, x):
27 |         raise NotImplementedError
28 |
29 | class MedianSampling2d(BaseSampling):
30 |     '''
31 |     Returns a 2d tensor where each row is downsampled with the median of k values.
32 |     Only for 2d tensors.
33 |     '''
34 |     def __init__(self, k):
35 |         super().__init__(k)
36 |
37 |     def sampling(self, x):
38 |         assert len(x.shape) == 3
39 |         x, _ = x.median(dim=2)
40 |         return x
41 |
42 | class MedianSampling1d(BaseSampling):
43 |     '''
44 |     Returns a 1d tensor that is downsampled with the median of k values.
45 |     Only for 1d tensors.
46 |     '''
47 |     def __init__(self, k):
48 |         super().__init__(k)
49 |
50 |     def sampling(self, x):
51 |         assert len(x.shape) == 2
52 |         x, _ = x.median(dim=1)
53 |         return x
54 |
55 | class MaxSampling1d(BaseSampling):
56 |     '''
57 |     Returns a 1d tensor that is downsampled with the maximum of k values.
58 |     Only for 1d tensors.
59 |     '''
60 |     def __init__(self, k):
61 |         super().__init__(k)
62 |
63 |     def sampling(self, x):
64 |         assert len(x.shape) == 2
65 |         x, _ = x.max(dim=1)
66 |         return x
67 |
68 |
--------------------------------------------------------------------------------
/src/data/__init__.py:
--------------------------------------------------------------------------------
1 | from .SlidingWindowDataset import SlidingWindowDataset
--------------------------------------------------------------------------------
/src/datasets/__init__.py:
--------------------------------------------------------------------------------
1 | from .demo import Demo
2 | from .swat import Swat
3 | from .wadi import Wadi
4 | from .from_csv import From_csv
--------------------------------------------------------------------------------
/src/datasets/demo.py:
--------------------------------------------------------------------------------
1 | import os
2 | import json
3 | import numpy as np
4 | import torch
5 | from shutil import rmtree
6 |
7 | from ..data import SlidingWindowDataset
8 |
9 | class Demo(SlidingWindowDataset):
10 |
11 |     '''
12 |     Small excerpt from the MSL dataset used for testing
13 |     '''
14 |
15 |     def __init__(self, window_size=1, stride=1, horizon=1, train=True, transform=None, target_transform=None, normalize=False, device='cpu'):
16 |
17 |         self.device = device
18 |
19 |         self.name = 'demo'
20 |
21 |         train_file = 'train.csv'
22 |         test_file = 'test.csv'
23 |
24 |         root = os.path.dirname(__file__)
25 |         raw_dir = os.path.join(root, f'files/raw/{self.name}')
26 |         self.processed_dir = os.path.join(root, f'files/processed/{self.name}')
27 |
28 |         self.raw_paths = [os.path.join(raw_dir, ending) for ending in [train_file, test_file]]
29 |         self.processed_paths = [os.path.join(self.processed_dir, ending) for ending in ['train.pt', 'test.pt', 'labels.pt', 'list.txt', 'meta.json']]
30 |
31 |         data, labels, node_names = self.load(train)
32 |
33 |         self.node_names = node_names
34 |
35 |         super().__init__(data, window_size, stride=stride, horizon=horizon, labels=labels, transform=transform, target_transform=target_transform, normalize=normalize, device=device)
36 |
37 |     def load(self, train):
38 |
39 |         # process csv files if not done
40 |         if not all(map(lambda x: os.path.isfile(x), self.processed_paths)):
41 |             self.process()
42 |
43 |         # check if processed and load
44 |         if all(map(lambda x: os.path.isfile(x), self.processed_paths)):
45 |             if train:
46 |                 data = torch.load(self.processed_paths[0], map_location=self.device)
47 |                 labels = None
48 |             else:
49 |                 data = torch.load(self.processed_paths[1], map_location=self.device)
50 |                 labels = torch.load(self.processed_paths[2], map_location=self.device)
51 |             sensor_list = np.loadtxt(self.processed_paths[3], dtype=str)
52 |             with open(self.processed_paths[4], 'r') as f:
53 |                 self.meta = json.load(f)
54 |         else:
55 |             raise Exception(f'{self.name} dataset file processing failed')
56 |
57 |         return data, labels, sensor_list
58 |
59 |     def process(self):
60 |
61 |         # purge old files if any exist
62 |         if os.path.exists(self.processed_dir):
63 |             rmtree(self.processed_dir)
64 |
65 |         # load csv file
66 |         train_csv = np.genfromtxt(self.raw_paths[0], delimiter=",")
67 |         train_data = torch.from_numpy(train_csv[1:,1:]).float().to(self.device)
68 |
69 |         test_csv = np.genfromtxt(self.raw_paths[1], delimiter=",")
70 |         test_data = torch.from_numpy(test_csv[1:,1:]).float().to(self.device)
71 |         test_data, test_labels = test_data[:,:-1], test_data[:,-1]
72 |
73 |         with open(self.raw_paths[0], 'r') as f:
74 |             line = f.readline().split(',')[1:]
75 |         sensor_list = np.array(list(map(str.strip, line)), dtype=str)
76 |
77 |         meta = [self.name, {
78 |             'num_nodes': train_data.size(1),
79 |             'train': {
80 |                 'samples': train_data.size(0),
81 |                 'min': train_data.min(dim=0)[0].tolist(),
82 |                 'max': train_data.max(dim=0)[0].tolist(),
83 |             },
84 |             'test': {
85 |                 'samples': test_data.size(0),
86 |                 'min': test_data.min(dim=0)[0].tolist(),
87 |                 'max': test_data.max(dim=0)[0].tolist(),
88 |             }
89 |         }]
90 |
91 |         os.makedirs(self.processed_dir)
92 |         torch.save(train_data, self.processed_paths[0])
93 |         torch.save(test_data, self.processed_paths[1])
94 |         torch.save(test_labels, self.processed_paths[2])
95 |         np.savetxt(self.processed_paths[3], sensor_list, delimiter='\n', fmt='%s')
96 |         dump = json.dumps(meta, indent=4)
97 |         with open(self.processed_paths[4], 'w') as f:
98 |             f.write(dump)
99 |
--------------------------------------------------------------------------------
/src/datasets/files/.gitignore:
--------------------------------------------------------------------------------
1 | raw/*
2 | processed/*
3 | !raw/demo/
4 | !processed/demo/
--------------------------------------------------------------------------------
/src/datasets/files/processed/demo/labels.pt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/timbrockmeyer/mulivariate-time-series-anomaly-detection/bef170c5a20e00e5316002afdc3fdd445aa43777/src/datasets/files/processed/demo/labels.pt
--------------------------------------------------------------------------------
/src/datasets/files/processed/demo/list.txt:
--------------------------------------------------------------------------------
1 | M-06
2 | M-01
3 | M-02
4 | S-02
5 | P-10
6 | T-04
7 | T-05
8 | F-07
9 | M-03
10 | M-04
11 | M-05
12 | P-15
13 | C-01
14 | C-02
15 | T-12
16 | T-13
17 | F-04
18 | F-05
19 | D-14
20 | T-09
21 | P-14
22 | T-08
23 | P-11
24 | D-15
25 | D-16
26 | M-07
27 | F-08
28 |
--------------------------------------------------------------------------------
/src/datasets/files/processed/demo/meta.json:
--------------------------------------------------------------------------------
1 | [
2 | "demo",
3 | {
4 | "num_nodes": 27,
5 | "train": {
6 | "samples": 1565,
7 | "min": [
8 | -1.0,
9 | -0.6427881717681885,
10 | -1.2107261419296265,
11 | -1.0,
12 | 0.987858235836029,
13 | -1.0,
14 | -1.0,
15 | -1.0,
16 | -1.4772167205810547,
17 | -1.4654639959335327,
18 | -1.255005955696106,
19 | 0.7308394908905029,
20 | -1.0,
21 | -1.0,
22 | -1.0,
23 | -1.0,
24 | -1.0,
25 | -1.116377592086792,
26 | -1.0,
27 | -1.0,
28 | 0.9991111159324646,
29 | -1.0,
30 | 0.3225919306278229,
31 | -1.0,
32 | -1.0,
33 | -1.0020240545272827,
34 | -1.0
35 | ],
36 | "max": [
37 | -1.0,
38 | 2.4922983646392822,
39 | 0.6932034492492676,
40 | 0.0,
41 | 0.9988705515861511,
42 | 0.0,
43 | -1.0,
44 | 1.0,
45 | 1.0000708103179932,
46 | 1.0000054836273193,
47 | 0.981587827205658,
48 | 1.0036537647247314,
49 | 2.1934478282928467,
50 | 0.0,
51 | 1.0,
52 | 1.0,
53 | 0.7545157074928284,
54 | 4.162651062011719,
55 | -1.0,
56 | 1.0,
57 | 1.0,
58 | 1.029411792755127,
59 | 0.9952331185340881,
60 | 1.1915780305862427,
61 | 1.0088798999786377,
62 | 0.4032494127750397,
63 | 1.08695650100708
64 | ]
65 | },
66 | "test": {
67 | "samples": 2049,
68 | "min": [
69 | -1.0,
70 | -1.0764756202697754,
71 | -1.0846891403198242,
72 | -1.0,
73 | 0.9903995394706726,
74 | -1.0,
75 | -1.0,
76 | -1.0,
77 | -1.4241974353790283,
78 | -1.3314414024353027,
79 | -1.3089860677719116,
80 | -1.0,
81 | -1.0,
82 | -1.0,
83 | -1.0,
84 | -1.0,
85 | -1.0,
86 | -1.0768855810165405,
87 | -1.0,
88 | -1.0,
89 | 0.9903995394706726,
90 | -1.0,
91 | -1.0,
92 | -1.0,
93 | -1.0,
94 | -1.0,
95 | -1.0
96 | ],
97 | "max": [
98 | 258.10809326171875,
99 | 2.2688539028167725,
100 | 1.3352758884429932,
101 | 1.0,
102 | 0.9983057975769043,
103 | 1.0,
104 | 1.0,
105 | 1.0,
106 | 1.0000708103179932,
107 | 1.17589271068573,
108 | 1.0951875448226929,
109 | 1.0,
110 | 1.0,
111 | 1.0,
112 | 1.0,
113 | 0.8645181059837341,
114 | 1.46547269821167,
115 | 1.806624412536621,
116 | 1.0,
117 | 1.0,
118 | 0.9983057975769043,
119 | 1.0196079015731812,
120 | 0.9705088138580322,
121 | 1.3461360931396484,
122 | 1.0,
123 | 1.0,
124 | 1.3043478727340698
125 | ]
126 | }
127 | }
128 | ]
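The `min` and `max` lists in `meta.json` hold one value per sensor, in the order given by `list.txt`. A plausible use for them is min-max scaling with the training statistics. The sketch below assumes that use (the normalization code itself is not part of this listing), and `min_max_scale` is a hypothetical helper name. Several training columns are constant (the first sensor has min and max both at -1.0), so the division needs a guard, and test values can lie far outside the training range (the same sensor reaches about 258.1 in the test split).

```python
def min_max_scale(row, mins, maxs, eps=1e-8):
    """Scale one sample per sensor using training-set statistics.

    Hypothetical helper for illustration. Constant sensors (max == min)
    map to 0.0 instead of dividing by zero.
    """
    scaled = []
    for value, lo, hi in zip(row, mins, maxs):
        span = hi - lo
        scaled.append((value - lo) / span if span > eps else 0.0)
    return scaled


# First three training sensors from meta.json (values rounded).
train_min = [-1.0, -0.6428, -1.2107]
train_max = [-1.0, 2.4923, 0.6932]

sample = [-1.0, 2.4923, -1.2107]
print(min_max_scale(sample, train_min, train_max))
```

Values outside the training range are not clipped here, so a scaled test value may fall outside [0, 1].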
--------------------------------------------------------------------------------
/src/datasets/files/processed/demo/test.pt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/timbrockmeyer/mulivariate-time-series-anomaly-detection/bef170c5a20e00e5316002afdc3fdd445aa43777/src/datasets/files/processed/demo/test.pt
--------------------------------------------------------------------------------
/src/datasets/files/processed/demo/train.pt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/timbrockmeyer/mulivariate-time-series-anomaly-detection/bef170c5a20e00e5316002afdc3fdd445aa43777/src/datasets/files/processed/demo/train.pt
--------------------------------------------------------------------------------
/src/datasets/from_csv.py:
--------------------------------------------------------------------------------
1 | import os
2 | import numpy as np
3 | import torch
4 |
5 | from ..data import SlidingWindowDataset
6 |
7 | class From_csv(SlidingWindowDataset):
8 |
9 | '''
10 | Requires CSV files for training and testing.
11 | Test file is expected to have its labels in the last column, train file to be without labels.
12 | Naming:
13 | Training data: 'train.csv'
14 | Test data: 'test.csv'
15 | '''
16 |
17 | def __init__(self, window_size=1, stride=1, horizon=1, train=True, transform=None, target_transform=None, normalize=False, device='cpu'):
18 |
19 | self.device = device
20 |
21 | self.name = 'from_csv'
22 |
23 | train_file = 'train.csv'
24 | test_file = 'test.csv'
25 |
26 | self.meta = (None, None)
27 | self.normalize = False
28 |
29 | root = os.path.dirname(__file__)
30 | raw_dir = os.path.join(root, f'files/raw/{self.name}')
31 | raw_paths = [os.path.join(raw_dir, ending) for ending in [train_file, test_file]]
32 |
33 | if train:
34 | data = np.genfromtxt(raw_paths[0], delimiter=",")[1:,1:]
35 | data = torch.from_numpy(data).float().to(self.device)
36 | labels = None
37 |
38 | else:
39 | data = np.genfromtxt(raw_paths[1], delimiter=",")[1:,1:]
40 | data = torch.from_numpy(data).float().to(self.device)
41 | data, labels = data[:,:-1], data[:,-1]
42 |
43 | super().__init__(data, window_size, stride=stride, horizon=horizon, labels=labels, transform=transform, target_transform=target_transform, normalize=False, device=device)
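The slicing in `From_csv` implies a specific CSV layout: a header row, an index or timestamp in the first column, and (for the test file only) the labels in the last column. The following sketch reproduces that split with the standard library only, so it runs without NumPy or torch. The column names `timestamp` and `label` and the helper `split_test_csv` are assumptions made for the example.

```python
import csv
import io


def split_test_csv(text):
    """Parse a test CSV: drop the header row and first column, split off labels."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    # sensor names sit between the index column and the label column
    sensor_names = [name.strip() for name in header[1:-1]]
    values = [[float(v) for v in row[1:]] for row in body]
    data = [row[:-1] for row in values]
    labels = [row[-1] for row in values]
    return sensor_names, data, labels


example = """timestamp, M-06, M-01, label
0, 0.5, -1.0, 0
1, 0.7, -0.9, 1
"""

names, data, labels = split_test_csv(example)
print(names)
print(data)
print(labels)
```

A training file would use the same layout without the final label column.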
--------------------------------------------------------------------------------
/src/datasets/swat.py:
--------------------------------------------------------------------------------
1 | import os
2 | import json
3 | import numpy as np
4 | import pandas as pd
5 | import torch
6 | from shutil import rmtree
7 |
8 | from ..data import SlidingWindowDataset
9 |
10 |
11 | class Swat(SlidingWindowDataset):
12 |
13 | '''
14 | LOAD ORIGINAL FILES IN EXCEL FIRST!!!
15 | delete the unnecessary first rows and save as a CSV file.
16 |
17 | Dataset can be requested from
18 | https://itrust.sutd.edu.sg/testbeds/secure-water-treatment-swat/
19 | '''
20 |
21 | def __init__(self, window_size=1, stride=1, horizon=1, train=True, transform=None, target_transform=None, normalize=False, device='cpu'):
22 |
23 | self.device = device
24 |
25 | self.name = 'swat'
26 |
27 | train_file = 'SWaT_Dataset_Normal_v0.csv'
28 | test_file = 'SWaT_Dataset_Attack_v0.csv'
29 |
30 | root = os.path.dirname(__file__)
31 | raw_dir = os.path.join(root, f'files/raw/{self.name}')
32 | self.processed_dir = os.path.join(root, f'files/processed/{self.name}')
33 |
34 | self.raw_paths = [os.path.join(raw_dir, ending) for ending in [train_file, test_file]]
35 | self.processed_paths = [os.path.join(self.processed_dir, ending) for ending in ['train.pt', 'test.pt', 'labels.pt', 'list.txt', 'meta.json']]
36 |
37 | data, labels, node_names = self.load(train)
38 |
39 | self.node_names = node_names
40 |
41 | super().__init__(data, window_size, stride=stride, horizon=horizon, labels=labels, transform=transform, target_transform=target_transform, normalize=normalize, device=device)
42 |
43 | def load(self, train):
44 |
45 | # process csv files if not done
46 | if not all(map(lambda x: os.path.isfile(x), self.processed_paths)):
47 | self.process()
48 |
49 | # check if processed and load
50 | if all(map(lambda x: os.path.isfile(x), self.processed_paths)):
51 | if train:
52 | data = torch.load(self.processed_paths[0], map_location=self.device)
53 | labels = None
54 | else:
55 | data = torch.load(self.processed_paths[1], map_location=self.device)
56 | labels = torch.load(self.processed_paths[2], map_location=self.device)
57 | sensor_list = np.loadtxt(self.processed_paths[3], dtype=str)
58 | with open(self.processed_paths[4], 'r') as f:
59 | self.meta = json.load(f)
60 | else:
61 | raise Exception(f'{self.name} dataset raw file processing failed')
62 |
63 | return data, labels, sensor_list
64 |
65 | def process(self):
66 |
67 | # purge old files if any exist
68 | if os.path.exists(self.processed_dir):
69 | rmtree(self.processed_dir)
70 |
71 | files = {'train': self.raw_paths[0], 'test': self.raw_paths[1]}
72 |
73 | for key, file in files.items():
74 |
75 | df = pd.read_csv(file)
76 |
77 | # strip white spaces from column names
78 | df = df.rename(columns=lambda x: x.strip())
79 |
80 | # timestamp column to index
81 | df.iloc[:,0] = df.index
82 | df = df.set_index(df.columns[0])
83 |
84 | if key == 'train':
85 | # drop label column for training data
86 | df = df.drop(df.columns[-1], axis=1)
87 |
88 | column_names = df.columns.to_numpy()
89 |
90 | train_data = df.to_numpy()
91 | train_data = torch.from_numpy(train_data).float().to(self.device)
92 |
93 | else:
94 | # categorical labels to numerical values
95 | vocab = {'Normal': 0, 'Attack': 1, 'A ttack': 1}
96 | df.iloc[:,-1] = df.iloc[:,-1].apply(lambda x: vocab[x])
97 |
98 | test_data = df.to_numpy()
99 | test_data = torch.from_numpy(test_data).float().to(self.device)
100 | test_data, test_labels = test_data[:,:-1], test_data[:,-1]
101 |
102 | meta = [self.name, {
103 | 'num_nodes': train_data.size(1),
104 | 'train': {
105 | 'samples': train_data.size(0),
106 | 'min': train_data.min(dim=0)[0].tolist(),
107 | 'max': train_data.max(dim=0)[0].tolist(),
108 | },
109 | 'test': {
110 | 'samples': test_data.size(0),
111 | 'min': test_data.min(dim=0)[0].tolist(),
112 | 'max': test_data.max(dim=0)[0].tolist(),
113 | }
114 | }]
115 |
116 | os.makedirs(self.processed_dir, exist_ok=True)
117 | torch.save(train_data, self.processed_paths[0])
118 | torch.save(test_data, self.processed_paths[1])
119 | torch.save(test_labels, self.processed_paths[2])
120 | np.savetxt(self.processed_paths[3], column_names, fmt = "%s")
121 | dump = json.dumps(meta, indent=4)
122 | with open(self.processed_paths[4], 'w') as f:
123 | f.write(dump)
124 |
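The `vocab` dictionary in `process` lists the misspelled label `'A ttack'` explicitly. An alternative, shown here only as a sketch and not as the repository's method, is to remove whitespace before the lookup so that spacing variants resolve to the same key. `encode_label` is a hypothetical helper.

```python
def encode_label(raw):
    """Map a raw SWaT label string to 0 (normal) or 1 (attack).

    All whitespace is removed before the lookup, so 'A ttack' and
    ' Attack ' both resolve to 'Attack'. Unknown labels raise KeyError.
    """
    vocab = {'Normal': 0, 'Attack': 1}
    return vocab[''.join(raw.split())]


print([encode_label(s) for s in ['Normal', 'Attack', 'A ttack']])
```

Raising on unknown labels keeps unexpected values in the raw file from being silently encoded.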
--------------------------------------------------------------------------------
/src/datasets/wadi.py:
--------------------------------------------------------------------------------
1 | import os
2 | import json
3 | import numpy as np
4 | import pandas as pd
5 | import torch
6 | from shutil import rmtree
7 |
8 | from ..data import SlidingWindowDataset
9 |
10 |
11 | class Wadi(SlidingWindowDataset):
12 |
13 | ''' LOAD ORIGINAL FILES IN EXCEL FIRST!!!
14 | delete the unnecessary first rows and save as a CSV file.
15 |
16 | The dataset includes a PDF with descriptions of 15 anomaly events,
17 | including start and end dates (m/d/y) and times.
18 | -> copy the following table and save as "WADI_attacktimes.csv":
19 |
20 | Start_Date Start_Time End_Date End_Time
21 | 10/9/2017 19:25:00 10/9/2017 19:50:16
22 | 10/10/2017 10:24:10 10/10/2017 10:34:00
23 | 10/10/2017 10:55:00 10/10/2017 11:24:00
24 | 10/10/2017 11:30:40 10/10/2017 11:44:50
25 | 10/10/2017 13:39:30 10/10/2017 13:50:40
26 | 10/10/2017 14:48:17 10/10/2017 14:59:55
27 | 10/10/2017 17:40:00 10/10/2017 17:49:40
28 | 10/10/2017 10:55:00 10/10/2017 10:56:27
29 | 10/11/2017 11:17:54 10/11/2017 11:31:20
30 | 10/11/2017 11:36:31 10/11/2017 11:47:00
31 | 10/11/2017 11:59:00 10/11/2017 12:05:00
32 | 10/11/2017 12:07:30 10/11/2017 12:10:52
33 | 10/11/2017 12:16:00 10/11/2017 12:25:36
34 | 10/11/2017 15:26:30 10/11/2017 15:37:00
35 |
36 | Data can be requested from
37 | https://itrust.sutd.edu.sg/itrust-labs-home/itrust-labs_wadi/
38 | '''
39 |
40 | def __init__(self, window_size=1, stride=1, horizon=1, train=True, transform=None, target_transform=None, normalize=False, device='cpu'):
41 |
42 | self.device = device
43 |
44 | self.name = 'wadi'
45 |
46 | train_file = 'WADI_14days.csv'
47 | test_file = 'WADI_attackdata.csv'
48 | label_file = 'WADI_attacktimes.csv'
49 |
50 | root = os.path.dirname(__file__)
51 | raw_dir = os.path.join(root, f'files/raw/{self.name}')
52 | self.processed_dir = os.path.join(root, f'files/processed/{self.name}')
53 |
54 | self.raw_paths = [os.path.join(raw_dir, ending) for ending in [train_file, test_file, label_file]]
55 | self.processed_paths = [os.path.join(self.processed_dir, ending) for ending in ['train.pt', 'test.pt', 'labels.pt', 'list.txt', 'meta.json']]
56 |
57 | data, labels, sensor_names = self.load(train)
58 |
59 | self.node_names = sensor_names
60 |
61 | super().__init__(data, window_size, stride=stride, horizon=horizon, labels=labels, transform=transform, target_transform=target_transform, normalize=normalize, device=device)
62 |
63 | def load(self, train):
64 |
65 | # process csv files if not done
66 | if not all(map(lambda x: os.path.isfile(x), self.processed_paths)):
67 | self.process()
68 |
69 | # check if processed and load
70 | if all(map(lambda x: os.path.isfile(x), self.processed_paths)):
71 | if train:
72 | data = torch.load(self.processed_paths[0], map_location=self.device)
73 | labels = None
74 | else:
75 | data = torch.load(self.processed_paths[1], map_location=self.device)
76 | labels = torch.load(self.processed_paths[2], map_location=self.device)
77 | sensor_list = np.loadtxt(self.processed_paths[3], dtype=str)
78 | with open(self.processed_paths[4], 'r') as f:
79 | self.meta = json.load(f)
80 | else:
81 | raise Exception(f'{self.name} dataset raw file processing failed')
82 |
83 | return data, labels, sensor_list
84 |
85 | def process(self):
86 |
87 | # purge old files if any exist
88 | if os.path.exists(self.processed_dir):
89 | rmtree(self.processed_dir)
90 |
91 | df_train = pd.read_csv(self.raw_paths[0])
92 | df_test = pd.read_csv(self.raw_paths[1])
93 | anomaly_timeframes = pd.read_csv(self.raw_paths[2])
94 |
95 | def drop_columns(columns):
96 | df_train.drop(columns, axis=1, inplace=True)
97 | df_test.drop(columns, axis=1, inplace=True)
98 |
99 | assert list(df_train.columns) == list(df_test.columns)
100 |
101 | # find row indices of anomaly interval
102 | start_indices = pd.merge(df_test, anomaly_timeframes, left_on=['Date', 'Time'], right_on=['Start_Date', 'Start_Time'])['Row']
103 | end_indices = pd.merge(df_test, anomaly_timeframes, left_on=['Date', 'Time'], right_on=['End_Date', 'End_Time'])['Row']
104 | assert start_indices.shape == end_indices.shape
105 |
106 | # add anomaly labels to test data
107 | labels = pd.Series(np.zeros(len(df_test)))
108 | for a,b in zip(start_indices,end_indices):
109 | labels[a:b+1] = np.ones((b-a)+1)
110 | df_test['label'] = labels
111 |
112 | # drop date and time columns
113 | datetime_cols = ['Date', 'Time']
114 | drop_columns(datetime_cols)
115 |
116 | # fix columns
117 | for df in [df_train, df_test]:
118 | # set index column
119 | df.rename(columns={df.columns[0]:'timestamp'}, inplace=True)
120 | df.iloc[:,0] = df.index
121 | df.set_index(df.columns[0], inplace=True)
122 | # strip column names
123 | df.rename(columns=lambda x: x.strip(), inplace=True)
124 | # shorten column names
125 | df.columns = [x.split('\\')[-1] for x in df.columns]
126 |
127 | # account for missing data
128 | # completely empty columns in training or test data
129 | empty_columns = [col for col in df_train.columns if df_train[col].isnull().all() or df_test[col].isnull().all()]
130 | drop_columns(empty_columns)
131 | # other missing values
132 | assert not df_test.isnull().any().any()
133 | df_train = df_train.interpolate(method='nearest')
134 |
135 | # columns with zero variance in test data
136 | zero_var_columns_test = [col for col in df_test.columns if df_test[col].var() == 0]
137 | # columns with extremely high variance in training data
138 | extreme_var_columns_train = [col for col in df_train.columns if df_train[col].var() > 10000]
139 | drop_columns(zero_var_columns_test + extreme_var_columns_train)
140 |
141 | assert list(df_train.columns) == list(df_test.columns)[:-1]
142 |
143 | column_names = df_train.columns.to_numpy()
144 |
145 | train_data = df_train.to_numpy()
146 | train_data = torch.from_numpy(train_data).float().to(self.device)
147 |
148 | test_data = df_test.to_numpy()
149 | test_data = torch.from_numpy(test_data).float().to(self.device)
150 |
151 | meta = [self.name, {
152 | 'num_nodes': train_data.size(1),
153 | 'train': {
154 | 'samples': train_data.size(0),
155 | 'min': train_data.min(dim=0)[0].tolist(),
156 | 'max': train_data.max(dim=0)[0].tolist(),
157 | },
158 | 'test': {
159 | 'samples': test_data.size(0),
160 | 'min': test_data.min(dim=0)[0].tolist(),
161 | 'max': test_data.max(dim=0)[0].tolist(),
162 | }
163 | }]
164 |
165 | test_data, test_labels = test_data[:,:-1], test_data[:,-1]
166 |
167 | assert train_data.size(1) == test_data.size(1)
168 |
169 | os.makedirs(self.processed_dir, exist_ok=True)
170 | torch.save(train_data, self.processed_paths[0])
171 | torch.save(test_data, self.processed_paths[1])
172 | torch.save(test_labels, self.processed_paths[2])
173 | np.savetxt(self.processed_paths[3], column_names, fmt = "%s")
174 | dump = json.dumps(meta, indent=4)
175 | with open(self.processed_paths[4], 'w') as f:
176 | f.write(dump)
177 |
178 |
179 |
180 |
181 |
182 |
183 |
184 |
185 |
186 |
187 |
188 |
189 |
190 |
191 |
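The labelling step in `Wadi.process` turns pairs of start and end rows into a 0/1 label vector, with the end row included (`labels[a:b+1]`). A minimal sketch of that step in plain Python follows. It assumes zero-based row positions, whereas the original uses the values of the `Row` column directly, and `label_intervals` is a hypothetical name.

```python
def label_intervals(num_rows, intervals):
    """Return a 0/1 label list with every row in [start, end] set to 1.0."""
    labels = [0.0] * num_rows
    for start, end in intervals:
        # end is inclusive, matching labels[a:b+1] in the original
        for i in range(start, end + 1):
            labels[i] = 1.0
    return labels


print(label_intervals(10, [(2, 4), (7, 8)]))
```

Overlapping intervals are harmless here, since a row is simply set to 1.0 more than once.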
--------------------------------------------------------------------------------
/src/layers/__init__.py:
--------------------------------------------------------------------------------
1 | from .embedding import SingleEmbedding, DoubleEmbedding
2 | from .attention import EmbeddingAttention
3 |
--------------------------------------------------------------------------------
/src/layers/attention.py:
--------------------------------------------------------------------------------
1 | import math
2 | import torch
3 | import torch.nn.functional as F
4 | from torch.nn import Parameter, Linear
5 | from torch.autograd import Variable
6 | from torch_geometric.nn.conv import MessagePassing
7 | from torch_geometric.utils import remove_self_loops, add_self_loops, softmax
8 | from torch_geometric.utils.to_dense_adj import to_dense_adj
9 |
10 | from torch_geometric.nn.inits import glorot, zeros
11 |
12 |
13 | class EmbeddingAttention(MessagePassing):
14 | '''
15 | GATConv-style layer that computes attention scores from the concatenation of
16 | each node's transformed time series window and its node embedding.
17 | Modified from the GATConv implementation in PyTorch Geometric.
18 | https://pytorch-geometric.readthedocs.io/en/latest/modules/nn.html#torch_geometric.nn.conv.GATConv
19 | '''
20 |
21 | def __init__(self, in_channels, out_channels, heads=1, concat=True,
22 | negative_slope=0.2, dropout=0.0,
23 | add_self_loops=True, bias=True, **kwargs):
24 | kwargs.setdefault('aggr', 'add')
25 | super().__init__(node_dim=0, **kwargs)
26 |
27 | self.in_channels = in_channels
28 | self.out_channels = out_channels
29 | self.heads = heads
30 | self.concat = concat
31 | self.negative_slope = negative_slope
32 | self.dropout = dropout
33 | self.add_self_loops = add_self_loops
34 |
35 | # transformations on source and target nodes (will be the same in sensor network)
36 | self.lin_src = Linear(in_channels, heads * out_channels, bias=False)
37 | self.lin_dst = self.lin_src
38 |
39 | # learnable parameters to compute attention coefficients
40 | # double number of parameters; for node features and sensor embedding
41 | self.att_src = Parameter(torch.Tensor(1, heads, 2*out_channels))
42 | self.att_dst = Parameter(torch.Tensor(1, heads, 2*out_channels))
43 |
44 | if bias and concat:
45 | self.bias = Parameter(torch.Tensor(heads * out_channels))
46 | elif bias and not concat:
47 | self.bias = Parameter(torch.Tensor(out_channels))
48 | else:
49 | self.register_parameter('bias', None)
50 |
51 | self._alpha = None
52 |
53 | self.reset_parameters()
54 |
55 | def reset_parameters(self):
56 | self.lin_src.reset_parameters()
57 | self.lin_dst.reset_parameters()
58 | glorot(self.lin_src.weight)
59 | glorot(self.att_src)
60 | glorot(self.att_dst)
61 | zeros(self.bias)
62 |
63 | def forward(self, x, edge_index, embedding, size=None, return_attention_weights=False):
64 |
65 | H, C = self.heads, self.out_channels
66 |
67 | # transform input node features
68 | assert x.dim() == 2, "Static graphs not supported in 'GATConv'"
69 | x_src = x_dst = self.lin_src(x).view(-1, H, C)
70 | x = (x_src, x_dst)
71 |
72 | # shape [num_nodes*batch_size, embed_dim] -> [num_nodes*batch_size, heads, embed_dim]
73 | assert embedding.size(1) == C
74 | emb_src = emb_dst = embedding.unsqueeze(1).expand(-1, H, C)
75 |
76 | # combined representation of node features and embedding
77 | src = torch.cat([x_src, emb_src], dim=2)
78 | dst = torch.cat([x_dst, emb_dst], dim=2)
79 | # compute node-level attention coefficients
80 | alpha_src = (src * self.att_src).sum(dim=-1)
81 | alpha_dst = (dst * self.att_dst).sum(dim=-1)
82 | alpha = (alpha_src, alpha_dst)
83 |
84 | if self.add_self_loops:
85 | num_nodes = x_src.size(0)
86 | edge_index, _ = remove_self_loops(edge_index)
87 | edge_index, _ = add_self_loops(edge_index, num_nodes=num_nodes)
88 |
89 | # propagate_type: (x: OptPairTensor, alpha: OptPairTensor)
90 | out = self.propagate(edge_index, x=x, alpha=alpha, size=size)
91 |
92 | alpha = self._alpha
93 | assert alpha is not None
94 | self._alpha = None
95 |
96 | if self.concat:
97 | out = out.view(-1, self.heads * self.out_channels)
98 | else:
99 | out = out.mean(dim=1)
100 |
101 | if self.bias is not None:
102 | out += self.bias
103 |
104 | if return_attention_weights:
105 | return out, (edge_index, alpha)
106 | else:
107 | return out
108 |
109 | def message(self, x_j, alpha_j, alpha_i, index, ptr, size_i):
110 |
111 | alpha = alpha_j if alpha_i is None else alpha_j + alpha_i
112 |
113 | alpha = F.leaky_relu(alpha, self.negative_slope)
114 | alpha = softmax(alpha, index, ptr, size_i)
115 | self._alpha = alpha # Save for later use.
116 | alpha = F.dropout(alpha, p=self.dropout, training=self.training)
117 |
118 | msg = x_j * alpha.unsqueeze(-1)
119 |
120 | return msg
121 |
122 | def __repr__(self):
123 | return '{}({}, {}, heads={})'.format(self.__class__.__name__,
124 | self.in_channels,
125 | self.out_channels, self.heads)
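The score computed in `forward` and `message` can be traced for a single head without torch. For an edge from neighbor j to target i, the unnormalized score is LeakyReLU(att_src · [x_j ‖ v_j] + att_dst · [x_i ‖ v_i]), where x is the transformed feature vector and v the sensor embedding, followed by a softmax over the neighbors of i. The sketch below is a simplified illustration under those assumptions: one head, no dropout, no added self loops, and hypothetical helper names.

```python
import math


def leaky_relu(value, negative_slope=0.2):
    return value if value >= 0 else negative_slope * value


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def attention_weights(x, emb, att_src, att_dst, target, neighbors):
    """Attention of `target` over `neighbors` for a single head.

    x[n] is the transformed feature vector of node n and emb[n] its
    embedding; `x[n] + emb[n]` is list concatenation, mirroring torch.cat.
    """
    target_score = dot(x[target] + emb[target], att_dst)
    scores = [
        leaky_relu(dot(x[j] + emb[j], att_src) + target_score)
        for j in neighbors
    ]
    # numerically stable softmax over the neighbors of `target`
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


x = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
emb = [[0.2, 0.1], [0.1, 0.2], [0.3, 0.3]]
att_src = [0.5, -0.5, 1.0, 1.0]
att_dst = [0.1, 0.1, 0.1, 0.1]

weights = attention_weights(x, emb, att_src, att_dst, target=0, neighbors=[1, 2])
print(weights)
```

Because the attention vectors span both halves of the concatenation, they have length `2*out_channels`, which is why `att_src` and `att_dst` are allocated with that size in `__init__`.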
--------------------------------------------------------------------------------
/src/layers/embedding.py:
--------------------------------------------------------------------------------
1 | from torch import nn
2 | import torch.nn.functional as F
3 | import torch
4 | import math
5 | from ..utils.device import get_device
6 |
7 | class SingleEmbedding(nn.Module):
8 | r''' Layer for graph representation learning
9 | using a linear embedding layer and cosine similarity
10 | to produce an index list of edges for a fixed number of
11 | neighbors for each node.
12 |
13 | Args:
14 | num_nodes (int): Number of nodes.
15 | embed_dim (int): Dimension of embedding.
16 | topk (int, optional): Number of neighbors per node.
17 | '''
18 |
19 | def __init__(self, num_nodes, embed_dim, topk=15, warmup_epochs = 20):
20 | super().__init__()
21 |
22 | self.device = get_device()
23 |
24 | self.topk = topk
25 | self.embed_dim = embed_dim
26 | self.num_nodes = num_nodes
27 |
28 | self.embedding = nn.Embedding(num_nodes, embed_dim)
29 | nn.init.kaiming_uniform_(self.embedding.weight, a=math.sqrt(5))
30 |
31 | self._A = None
32 | self._edges = None
33 |
34 | ### pre-computed index matrices
35 | # square matrix for adjacency matrix indexing
36 | self._edge_indices = torch.arange(num_nodes).to(self.device).expand(num_nodes, num_nodes) # [[0,1,2,3,4], [0,1,2,3,4], ...]
37 | # row indices, each repeated (num_nodes - topk) times - paired with the column indices of the non-topk entries to zero them out
38 | self._i = torch.arange(self.num_nodes).unsqueeze(1).expand(self.num_nodes, self.num_nodes - self.topk).flatten()
39 |
40 | # fully connected graph
41 | self._fc_edge_indices = torch.stack([self._edge_indices.T.flatten(), self._edge_indices.flatten()], dim=0)
42 |
43 | self.warmup_counter = 0
44 | self.warmup_durantion = warmup_epochs
45 |
46 | def get_A(self):
47 | if self._A is None:
48 | self.forward()
49 | return self._A
50 |
51 | def get_E(self):
52 | if self._edges is None:
53 | self.forward()
54 | return self._edges, [self.embedding.weight.clone()]
55 |
56 | def forward(self):
57 | W = self.embedding.weight.clone() # row vector represents sensor embedding
58 |
59 | eps = 1e-8 # avoid division by 0
60 | W_norm = W / torch.clamp(W.norm(dim=1)[:, None], min=eps)
61 | A = W_norm @ W_norm.t()
62 |
63 | # remove self loops
64 | A.fill_diagonal_(0)
65 |
66 | # remove negative scores
67 | A = A.clamp(0)
68 |
69 | if self.warmup_counter < self.warmup_durantion:
70 | edge_indices = self._fc_edge_indices
71 | edge_attr = A.flatten()
72 |
73 | self.warmup_counter += 1
74 | else:
75 |
76 | # topk entries
77 | _, topk_idx = A.sort(descending=True)
78 |
79 | j = topk_idx[:, self.topk:].flatten()
80 | A[self._i, j] = 0
81 |
82 | # # row degree
83 | # row_degree = A.sum(1).view(-1, 1) + 1e-8 # column vector
84 | # col_degree = A.sum(0) + 1e-8 # row vector
85 |
86 | # # normalized adjacency matrix
87 | # A /= torch.sqrt(row_degree)
88 | # A /= torch.sqrt(col_degree)
89 |
90 | msk = A > 0 # boolean mask
91 |
92 | edge_idx_src = self._edge_indices.T[msk] # source edge indices
93 | edge_idx_dst = self._edge_indices[msk] # target edge indices
94 | edge_attr = A[msk].flatten() # edge weights
95 |
96 | # shape [2, topk*num_nodes] tensor holding topk edge-index-pairs for each node
97 | edge_indices = torch.stack([edge_idx_src, edge_idx_dst], dim=0)
98 |
99 | # save for later
100 | self._A = A
101 | self._edges = edge_indices
102 |
103 | return edge_indices, edge_attr, A
104 |
105 |
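The graph construction in `SingleEmbedding.forward` (after warmup) amounts to: compute pairwise cosine similarity between sensor embeddings, drop self loops and non-positive scores, and keep the `topk` largest entries per row. A torch-free sketch of those steps is given below for illustration. `topk_cosine_edges` is a hypothetical name, and the edge direction (row index as source, column index as target) follows the indexing used in the class.

```python
import math


def topk_cosine_edges(embeddings, k):
    """Return (source, target, weight) edges, keeping the k most similar
    neighbors per node, without self loops or non-positive scores."""
    # clamp norms to avoid division by zero, as in the original
    norms = [max(math.sqrt(sum(v * v for v in e)), 1e-8) for e in embeddings]
    edges = []
    for i, e_i in enumerate(embeddings):
        scores = []
        for j, e_j in enumerate(embeddings):
            if i == j:
                continue  # no self loops
            cos = sum(a * b for a, b in zip(e_i, e_j)) / (norms[i] * norms[j])
            if cos > 0:  # keep strictly positive scores only
                scores.append((cos, j))
        scores.sort(reverse=True)
        edges.extend((i, j, cos) for cos, j in scores[:k])
    return edges


embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
for source, target, weight in topk_cosine_edges(embeddings, k=1):
    print(source, target, round(weight, 3))
```

A node whose similarities are all non-positive ends up with no outgoing edges, so the edge count can be smaller than `topk * num_nodes`.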
106 | class DoubleEmbedding(nn.Module):
107 | r"""An implementation of the graph learning layer to construct an adjacency matrix.
108 | For details see the paper "Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks".
109 |
110 |
111 | Args:
112 | num_nodes (int): Number of nodes in the graph.
113 | embed_dim (int): Dimension of the node embedding.
114 | topk (int, optional): Number of largest values kept per node when constructing its neighbourhood (the "nearest" k nodes).
115 | alpha (float, optional): Tanh alpha for generating the adjacency matrix; controls the saturation rate.
116 | """
117 |
118 | def __init__(self, num_nodes, embed_dim, topk=5, alpha=3, type='uni', warmup_epochs=20):
119 |
120 | super(DoubleEmbedding, self).__init__()
121 |
122 | self.device = get_device()
123 |
124 | assert type in ['bi', 'uni', 'sym']
125 | self.graph_type = type
126 |
127 | self.alpha = alpha
128 |
129 | self._embedding1 = nn.Embedding(num_nodes, embed_dim)
130 | self._embedding2 = nn.Embedding(num_nodes, embed_dim)
131 | self._linear1 = nn.Linear(embed_dim, embed_dim)
132 | self._linear2 = nn.Linear(embed_dim, embed_dim)
133 |
134 | nn.init.kaiming_uniform_(self._embedding1.weight, a=math.sqrt(5))
135 | nn.init.kaiming_uniform_(self._embedding2.weight, a=math.sqrt(5))
136 |
137 | self._topk = topk
138 | self._num_nodes = num_nodes
139 |
140 | # placeholders
141 | self._A = None
142 | self._edges = None
143 | self._M1 = self._embedding1.weight.clone()
144 | self._M2 = self._embedding2.weight.clone()
145 |
146 | ### pre-computed index matrices
147 | # square matrix for adjacency matrix indexing
148 | self._edge_indices = torch.arange(num_nodes).to(self.device).expand(num_nodes, num_nodes) # [[0,1,2,3,4], [0,1,2,3,4], ...]
149 | # row indices for entries that will be removed from adjacency matrix
150 | self._i = torch.arange(self._num_nodes).unsqueeze(1).expand(self._num_nodes, self._num_nodes - self._topk).flatten()
151 |
152 | # fully connected graph
153 | self._fc_edge_indices = torch.stack([self._edge_indices.T.flatten(), self._edge_indices.flatten()], dim=0)
154 |
155 | self.warmup_counter = 0
156 | self.warmup_durantion = warmup_epochs
157 |
158 | def get_A(self):
159 | if self._A is None:
160 | self.forward()
161 | return self._A
162 |
163 | def get_E(self):
164 | if self._edges is None:
165 | self.forward()
166 | return self._edges, [self._M1, self._M2]
167 |
168 | def forward(self) -> torch.FloatTensor:
169 | """
170 | ...
171 | """
172 |
173 | M1 = self._embedding1.weight.clone()
174 | M2 = self._embedding2.weight.clone()
175 |
176 | self._M1 = M1.data.clone()
177 | self._M2 = M2.data.clone()
178 |
179 | M1 = torch.tanh(self.alpha * self._linear1(M1))
180 | M2 = torch.tanh(self.alpha * self._linear2(M2))
181 |
182 | if self.graph_type == 'uni':
183 | A = M1 @ M2.T - M2 @ M1.T # skew symmetric matrix (uni-directed)
184 |
185 | elif self.graph_type == 'bi': # unordered matrix (directed, unconstrained)
186 | A = M1 @ M2.T
187 |
188 | elif self.graph_type == 'sym': # symmetric matrix (undirected)
189 | A = M1 @ M1.T - M2 @ M2.T
190 | # A = A.triu()
191 |
192 | # set negative values to zero
193 | A = F.relu(A)
194 | # no self loops
195 | A.fill_diagonal_(0)
196 |
197 | if self.warmup_counter < self.warmup_durantion:
198 | edge_indices = self._fc_edge_indices
199 | edge_attr = A.flatten()
200 |
201 | self.warmup_counter += 1
202 | else:
203 | # topk entries
204 | _, idx = A.sort(descending=True)
205 | j = idx[:, self._topk:].flatten() # column indices of all entries outside the top k
206 | # remove all but topk
207 | A[self._i, j] = 0
208 |
209 | # # node degrees (num incoming edges)
210 | # row_degree = A.sum(1).view(-1, 1) + 1e-8 # column vector
211 | # col_degree = A.sum(0) + 1e-8 # row vector
212 |
213 | # # normalized adjacency matrix
214 | # A /= torch.sqrt(row_degree)
215 | # A /= torch.sqrt(col_degree)
216 |
217 | msk = A > 0 # boolean mask
218 |
219 | edge_idx_src = self._edge_indices.T[msk] # source edge indices
220 | edge_idx_dst = self._edge_indices[msk] # target edge indices
221 | edge_attr = A[msk].flatten() # edge weights
222 |
223 | # shape [2, topk*num_nodes] tensor holding topk edge-index-pairs for each node
224 | edge_indices = torch.stack([edge_idx_src, edge_idx_dst], dim=0)
225 |
226 | # save for later
227 | self._A = A
228 | self._edges = edge_indices
229 |
230 | return edge_indices, edge_attr, A
231 |
232 |
233 | class ProjectedEmbedding(nn.Module):
234 | r''' Layer for graph representation learning
235 | that projects each node's input features with a per-node MLP
236 | and scores edges by the skew-symmetric product of the previous and current
237 | projections, keeping a fixed number of neighbors for each node.
238 |
239 | Args:
240 | num_nodes (int): Number of nodes.
241 | embed_dim (int): Dimension of embedding.
242 | topk (int, optional): Number of neighbors per node.
243 | '''
244 |
245 | def __init__(self, num_nodes, num_node_features, embed_dim, topk=15):
246 | super().__init__()
247 |
248 | self.topk = topk
249 | self.embed_dim = embed_dim
250 | self.in_features = num_node_features
251 |
252 | self.device = get_device()
253 |
254 | self.embedding_projection = nn.ModuleList([
255 | nn.Sequential(
256 | nn.Linear(num_node_features, 64),
257 | nn.ReLU(),
258 | nn.Linear(64, embed_dim)
259 | ) for _ in range(num_nodes)]
260 | )
261 |
262 | self.prev_embed = torch.empty((num_nodes, embed_dim), dtype=torch.float, requires_grad=True)
263 | nn.init.kaiming_uniform_(self.prev_embed, a=math.sqrt(5))
264 |
265 | self._A = None
266 | self._edges = None
267 |
268 | ### pre-computed index matrices
269 | # square matrix for adjacency matrix indexing
270 | self._edge_indices = torch.arange(num_nodes).to(self.device).expand(num_nodes, num_nodes) # [[0,1,2,3,4], [0,1,2,3,4], ...]
271 | # row indices, each repeated (num_nodes - topk) times - paired with the column indices of the non-topk entries to zero them out
272 | self._i = torch.arange(num_nodes).unsqueeze(1).expand(num_nodes, num_nodes - topk).flatten()
273 |
274 | def get_A(self):
275 | if self._A is None:
276 | self.forward()
277 | return self._A
278 |
279 | def get_E(self):
280 | if self._edges is None:
281 | self.forward()
282 | return self._edges, [self.prev_embed.clone()] # this class has no `embedding` attribute; the stored projections are returned instead
283 |
284 | def forward(self, x):
285 | # x shape, B, N, F
286 | proj = []
287 | for i, func in enumerate(self.embedding_projection):
288 | proj.append(func(x[..., i, :]))
289 | M1 = self.prev_embed
290 | M2 = torch.stack(proj, dim=0)
291 |
292 | A = F.relu(M1 @ M2.T - M2 @ M1.T)
293 |
294 | self.prev_embed = M2
295 |
296 | # topk entries
297 | _, topk_idx = A.sort(descending=True)
298 |
299 | j = topk_idx[:, self.topk:].flatten()
300 | A[self._i, j] = 0
301 |
302 | msk = A > 0 # boolean mask
303 |
304 | edge_idx_src = self._edge_indices.T[msk] # source edge indices
305 | edge_idx_dst = self._edge_indices[msk] # target edge indices
306 | edge_attr = A[msk].flatten() # edge weights
307 |
308 | # shape [2, topk*num_nodes] tensor holding topk edge-index-pairs for each node
309 | edge_indices = torch.stack([edge_idx_src, edge_idx_dst], dim=0)
310 |
311 | # save for later
312 | self._A = A
313 | self._edges = edge_indices
314 |
315 | return edge_indices, edge_attr, A
316 |
317 | class ConvEmbedding(nn.Module):
318 | r''' Layer for graph representation learning
319 | using a 1D convolutional encoder and an asymmetric similarity score
320 | to produce an index list of edges for a fixed number of
321 | neighbors for each node.
322 |
323 | Args:
324 | num_nodes (int): Number of nodes.
325 | embed_dim (int): Dimension of embedding.
326 | topk (int, optional): Number of neighbors per node.
327 | '''
328 |
329 | def __init__(self, num_nodes, num_node_features, embed_dim, topk=15):
330 | super().__init__()
331 |
332 | self.topk = topk
333 | self.embed_dim = embed_dim
334 | self.in_features = num_node_features
335 |
336 | self.device = get_device()
337 |
338 | # INPUT SIZE 25
339 | self.embedding_conv = nn.Sequential(
340 | nn.Conv1d(1, 8, 7),
341 | nn.ReLU(),
342 | nn.BatchNorm1d(8),
343 | nn.Conv1d(8, 16, 5),
344 | nn.ReLU(),
345 | nn.BatchNorm1d(16),
346 | nn.Conv1d(16, 32, 5),
347 | nn.ReLU(),
348 | nn.BatchNorm1d(32),
349 | nn.Flatten(1, -1),
350 | nn.Linear(32*11, 2*32),
351 | nn.ReLU(),
352 | )
353 |
354 | self._A = None
355 | self._edges = None
356 |
357 | ### pre-computed index matrices
358 | # square matrix for adjacency matrix indexing
359 | self._edge_indices = torch.arange(num_nodes).to(self.device).expand(num_nodes, num_nodes) # [[0,1,2,3,4], [0,1,2,3,4], ...]
360 | # matrix containing column indices for the right side of a matrix - will be used to remove all but topk entries
361 | self._i = torch.arange(num_nodes).unsqueeze(1).expand(num_nodes, num_nodes - topk).flatten()
362 |
363 | def get_A(self):
364 | if self._A is None:
365 | self.forward()
366 | return self._A
367 |
368 | def get_E(self):
369 | if self._edges is None:
370 | self.forward()
371 | return self._edges, [self.embedding.weight.clone()]
372 |
373 | def forward(self, x):
374 | # x shape, B, N, F
375 |
376 | M1, M2 = self.embedding_conv(x.unsqueeze(-2)).chunk(2, dim=-1) # split the 2*32 output into two embeddings
377 |
378 | A = F.relu(M1 @ M2.T - M2 @ M1.T)
379 |
380 | self.prev_embed = M2
381 |
382 | # topk entries
383 | _, topk_idx = A.sort(descending=True)
384 |
385 | j = topk_idx[:, self.topk:].flatten()
386 | A[self._i, j] = 0
387 |
388 | msk = A > 0 # boolean mask
389 |
390 | edge_idx_src = self._edge_indices.T[msk] # source edge indices
391 | edge_idx_dst = self._edge_indices[msk] # target edge indices
392 | edge_attr = A[msk].flatten() # edge weights
393 |
394 | # shape [2, topk*num_nodes] tensor holding topk edge-index-pairs for each node
395 | edge_indices = torch.stack([edge_idx_src, edge_idx_dst], dim=0)
396 |
397 | # save for later
398 | self._A = A
399 | self._edges = edge_indices
400 |
401 | return edge_indices, edge_attr, A
402 |
403 |
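The top-k step used by the embedding layers above keeps the `topk` largest entries in each row of the adjacency matrix and zeroes the rest. The following is a minimal standalone sketch of that idea, using `torch.topk` and a scatter mask instead of the sort-and-index approach in the layers. The helper name `sparsify_topk` and the toy matrix are hypothetical and not part of this repository.

```python
import torch


def sparsify_topk(A: torch.Tensor, k: int):
    """Keep the k largest entries per row of A and zero out all others.

    Hypothetical helper for illustration only; not part of the repository.
    Returns (edge_index, edge_attr, A_sparse), the same triple that the
    embedding layers return from forward().
    """
    # column indices of the k largest values in every row
    _, idx = A.topk(k, dim=1)
    # binary mask with ones at the kept positions
    mask = torch.zeros_like(A).scatter_(1, idx, 1.0)
    A_sparse = A * mask

    # strictly positive entries become weighted edges
    rows, cols = torch.nonzero(A_sparse > 0, as_tuple=True)
    # same orientation as in the layers: row index = source, column index = target
    edge_index = torch.stack([rows, cols], dim=0)
    edge_attr = A_sparse[rows, cols]
    return edge_index, edge_attr, A_sparse


# toy adjacency scores for 4 nodes (made-up values)
A = torch.tensor([[0.0, 0.9, 0.2, 0.5],
                  [0.3, 0.0, 0.8, 0.1],
                  [0.7, 0.4, 0.0, 0.6],
                  [0.2, 0.0, 0.9, 0.0]])
edge_index, edge_attr, A_sparse = sparsify_topk(A, k=2)
```

With `k=2`, every row of `A_sparse` keeps two nonzero weights, so `edge_index` has shape `[2, 2 * num_nodes]`, provided all kept scores are positive.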
--------------------------------------------------------------------------------
/src/models/ConvSeqAttention.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn.functional as F
3 | from torch import nn
4 | import math
5 |
6 | from ..utils.device import get_device
7 |
8 | from torch_geometric.nn import ARMAConv
9 | from torch_geometric.nn import Sequential
10 |
11 | from ..layers import DoubleEmbedding
12 |
13 |
14 | class ConvSeqAttentionModel(torch.nn.Module):
15 | '''
16 | Anomaly detection neural network model for multivariate sensor time series.
17 | Graph structure is randomly initialized and learned during training.
18 | Uses an attention layer that scores the attention weights for the input
19 | time series window and the sensor embedding vector.
20 |
21 | Args:
22 | args (dict): Argparser with config information.
23 | '''
24 | def __init__(self, args):
25 | super().__init__()
26 |
27 | self.device = args.device
28 |
29 | self.num_nodes = args.num_nodes
30 | self.horizon = args.horizon
31 | self.topk = args.topk
32 | self.embed_dim = args.embed_dim
33 | self.lags = args.window_size
34 |
35 | # learned graph embeddings
36 | self.graph_embedding = DoubleEmbedding(self.num_nodes, self.embed_dim, topk=self.topk, type='uni', warmup_epochs=50).to(self.device)
37 |
38 | # model parameters
39 | kernels = [5, 3]
40 | channels = [16, 32]
41 | hidden_dim = 64
42 |
43 | # GNN ENCODER ::: outputs one hidden state for each time step
44 | self.conv_encoder = Sequential('x, idx, attr', [
45 | (STConv(1, channels[0], channels[0], kernels[0], p=0.2, padding=True, residual=True), 'x, idx, attr -> x'),
46 | (STConv(channels[0], channels[1], channels[1], kernels[1], p=0.2, padding=True, residual=True), 'x, idx, attr -> x'),
47 | (STConv(channels[1], hidden_dim, hidden_dim, kernels[1], p=0.2, padding=True, residual=True), 'x, idx, attr -> x')
48 | ])
49 |
50 | # linear transformation of encoder hidden states for alignment scores
51 | self.alignment_W = nn.Linear(hidden_dim, hidden_dim)
52 |
53 | # GNN DECODER ::: outputs single vector hidden state
54 | self.decoder_window_length = sum(kernels) - len(kernels) + 1
55 | self.conv_decoder = Sequential('x, idx, attr', [
56 | (nn.Sequential(
57 | nn.Conv1d(1, 2*channels[0], kernels[0]),
58 | nn.BatchNorm1d(2*channels[0]),
59 | nn.GLU(dim=1),
60 | nn.Dropout(0.2),
61 | nn.Conv1d(channels[0], 2*channels[1], kernels[1]),
62 | nn.BatchNorm1d(2*channels[1]),
63 | nn.GLU(dim=1),
64 | nn.Dropout(0.2),
65 | nn.Flatten(1, -1),), 'x, -> x'),
66 | (ARMAConv(
67 | in_channels=channels[1],
68 | out_channels=hidden_dim,
69 | num_stacks=1,
70 | num_layers=1,
71 | act=nn.GELU(),
72 | dropout=0.2,), 'x, idx, attr -> x'),
73 | (nn.LayerNorm(hidden_dim), 'x -> x'),
74 | ])
75 |
76 | # prediction layer
77 | pred_channels = 2*hidden_dim
78 | self.pred = Sequential('x, idx, attr', [
79 | (nn.Linear(pred_channels, 1), 'x -> x'),
80 | ])
81 |
82 | # absolute positional embeddings based on sine and cosine functions
83 | position = torch.arange(1000).unsqueeze(1)
84 | div_term = torch.exp(torch.arange(0, hidden_dim, 2) * (-math.log(1e5) / hidden_dim))
85 | pe = torch.zeros(1000, hidden_dim)
86 | pe[:, 0::2] = torch.sin(position * div_term) / 1e9
87 | pe[:, 1::2] = torch.cos(position * div_term) / 1e9
88 | self.positional_embedding = F.dropout(pe[:self.lags - self.decoder_window_length], 0.05).to(self.device) # vector with PEs for each input timestep
89 |
90 | # cached offsets for batch stacking for each batch_size and number of edges
91 | self.batch_edge_offset_cache = {}
92 |
93 | # initial graph
94 | self._edge_index, self.edge_attr, self.A = self.graph_embedding()
95 |
96 | def get_graph(self):
97 | return self.graph_embedding.get_A()
98 |
99 | def get_embedding(self):
100 | return self.graph_embedding.get_E()
101 |
102 | def forward(self, window):
103 | # batch stacked window; input shape: [num_nodes*batch_size, lags]
104 | N = self.num_nodes # number of nodes
105 | T = self.lags # number of input time steps
106 | B = window.size(0) // N # batch size
107 |
108 | # get learned graph representation
109 | edge_index, edge_attr, _ = self.graph_embedding()
110 |
111 | # batching works by stacking graphs; creates a mega graph with disjoint subgraphs
112 | # for each input sample. E.g. for a batch of B inputs with 51 nodes each;
113 | # samples i in {0, ..., B-1} => node indices [0...50], [51...101], [102...152], ... ,[51*(B-1)...51*B-1]
114 | # => node indices for sample i = [0, ..., num_nodes-1] + (i*num_nodes)
115 | num_edges = len(edge_attr)
116 | try:
117 | batch_offset = self.batch_edge_offset_cache[(B, num_edges)]
118 | except KeyError:
119 | batch_offset = torch.arange(0, N * B, N).view(1, B, 1).expand(2, B, num_edges).flatten(1,-1).to(self.device)
120 | self.batch_edge_offset_cache[(B, num_edges)] = batch_offset
121 | # repeat edge indices B times and add i*num_nodes where i is the input index
122 | batched_edge_index = edge_index.unsqueeze(1).expand(2, B, -1).flatten(1, -1) + batch_offset
123 | # repeat edge weights B times
124 | batched_edge_attr = edge_attr.unsqueeze(0).expand(B, -1).flatten()
125 |
126 | # add node feature dimension to input
127 | window = window.unsqueeze(1) # (B*N, 1, T)
128 | encoder_window = window[..., :-self.decoder_window_length] # encoder takes beginning of input window
129 | decoder_window = window[..., -self.decoder_window_length:] # decoder takes the end
130 |
131 | # hidden states for all input time steps
132 | h_encoder = self.conv_encoder(encoder_window, batched_edge_index, batched_edge_attr) # (B, C, T)
133 |
134 | # add small positional encoding value
135 | h_encoder += self.positional_embedding.T
136 |
137 | # multistep prediction
138 | predictions = []
139 | for _ in range(self.horizon):
140 | # decoder hidden state
141 | h_decoder = self.conv_decoder(decoder_window, batched_edge_index, batched_edge_attr).unsqueeze(1) # -> (B, 1, C)
142 | # transformation of encoder states
143 | a = self.alignment_W(h_encoder.permute(0,2,1)).permute(0,2,1) # W @ H_encoder, shape -> (B, C, T)
144 | # compute alignment vector from decoder transformed encoder hidden states
145 | score = h_decoder @ a # (B, 1, C) @ (B, C, T) -> (B, 1, T)
146 | # attention weights for each time step
147 | alpha = F.softmax(score, dim=2) # -> (B, 1, T)
148 | # context vector
149 | context = torch.sum(alpha * h_encoder, dim=2) # -> (B, C)
150 | # concatenation of context vector and decoder hidden state
151 | context = torch.cat([context, h_decoder.squeeze(1)], dim=1) # -> (B, 2C)
152 | # layer normalization after adding all components
153 | context = F.layer_norm(context, tuple(context.shape[1:]))
154 | # single step prediction
155 | y_pred = self.pred(context, batched_edge_index, batched_edge_attr).view(-1, 1) # column vector
156 | predictions.append(y_pred)
157 | # decoder input for the next step
158 | decoder_window = torch.cat([decoder_window[..., 1:], y_pred.detach().unsqueeze(1)], dim=-1)
159 |
160 | # full output prediction vector
161 | pred = torch.cat(predictions, dim=1) # row = node, column = time
162 |
163 | return pred
164 |
165 | # return window[..., -1].view(-1, 1).repeat(1, self.horizon)
166 |
167 |
168 | class STConv(nn.Module):
169 | r'''Spatio-Temporal convolution block.
170 |
171 | Args:
172 | in_channels (int): Number of input features.
173 | temporal_channels (int), spatial_channels (int): Number of output features of the temporal and the graph convolution.
174 | kernel_size (int): Convolutional kernel size.
175 | '''
176 |
177 | def __init__(self, in_channels: int, temporal_channels: int, spatial_channels: int, kernel_size: int = 3, padding: bool = True, residual: bool = True, p: float = 0.0):
178 | super(STConv, self).__init__()
179 |
180 | self.padding = padding
181 | self.residual = residual
182 |
183 | self.device = get_device()
184 |
185 | if residual:
186 | self.res = nn.Conv1d(in_channels, spatial_channels, 1)
187 |
188 | if padding:
189 | self.p1d = (kernel_size-1, 0)
190 |
191 | # absolute positional embeddings based on sine and cosine functions
192 | position = torch.arange(1000).unsqueeze(1)
193 | div_term = torch.exp(torch.arange(0, temporal_channels, 2) * (-math.log(1e5) / temporal_channels))
194 | pe = torch.zeros(1000, temporal_channels)
195 | pe[:, 0::2] = torch.sin(position * div_term) / 100
196 | pe[:, 1::2] = torch.cos(position * div_term) / 100
197 | self.positional_embedding = F.dropout(pe, 0.05).to(self.device) # vector with PEs for each input timestep
198 |
199 | self.temporal_conv = nn.Sequential(
200 | nn.Conv1d(in_channels, 2*temporal_channels, kernel_size),
201 | nn.BatchNorm1d(2*temporal_channels),
202 | nn.GLU(dim=1),
203 | nn.Dropout(p),
204 | )
205 | self.graph_conv = ARMAConv(
206 | in_channels=temporal_channels,
207 | out_channels=spatial_channels,
208 | num_stacks=1,
209 | num_layers=1,
210 | act=nn.GELU(),
211 | dropout=p,
212 | )
213 |
214 | # cached offsets for temporal batch stacking for each batch_size and number of edges
215 | self.batch_edge_offset_cache = {}
216 |
217 | def forward(self, x: torch.FloatTensor, edge_index: torch.FloatTensor, edge_attr: torch.FloatTensor = None) -> torch.FloatTensor:
218 | '''Forward pass through temporal convolution block.
219 |
220 | Input data of shape: (batch, in_channels, time_steps).
221 | Output data of shape: (batch, out_channels, time_steps).
222 | '''
223 |
224 | # input shape (batch*num_nodes, in_channels, time)
225 |
226 | if self.residual:
227 | res = self.res(x)
228 |
229 | if self.padding:
230 | x = F.pad(x, self.p1d, "constant", 0)
231 |
232 | # temporal aggregation
233 | x = self.temporal_conv(x)
234 | # dims after temporal convolution
235 | BN, C, T = x.shape # (batch*nodes, out_channels, time)
236 | N = edge_index.max().item() + 1 # number of nodes in the batch stack
237 |
238 | # positional encoding for every time step
239 | pe = self.positional_embedding[:T].T
240 | # print(x.mean().item(), pe.mean().item()) # balance layer output and embeddings
241 | x += pe
242 |
243 | # batch stacking the temporal dimension to create a mega giga graph consisting of batched temporally-stacked graphs
244 | # analogous to batch stacking in main GNN, see description there.
245 | x = x.view(-1, C) # (B*N*T, C)
246 |
247 | # create temporal batch edge and weight lists
248 | num_edges = len(edge_attr)
249 | try:
250 | batch_offset = self.batch_edge_offset_cache[(BN, num_edges)]
251 | except KeyError:
252 | batch_offset = torch.arange(0, BN*T, N).view(1, T, 1).expand(2, T, num_edges).flatten(1,-1).to(x.device)
253 | self.batch_edge_offset_cache[(BN, num_edges)] = batch_offset
254 | # repeat edge indices T times and add offset for the edge indices
255 | temporal_batched_edge_index = edge_index.unsqueeze(1).expand(2, T, -1).flatten(1, -1) + batch_offset
256 | # repeat edge weights T times
257 | temporal_batched_edge_attr = edge_attr.unsqueeze(0).expand(T, -1).flatten()
258 |
259 | x = self.graph_conv(x, temporal_batched_edge_index, temporal_batched_edge_attr)
260 | x = x.view(BN, -1, T)
261 |
262 | # add residual connection
263 | x = x + res if self.residual else x
264 |
265 | # layer normalization
266 | return F.layer_norm(x, tuple(x.shape[1:]))
267 |
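The attention step in `ConvSeqAttentionModel.forward` scores every encoder time step against the decoder state and builds a context vector as the weighted sum of encoder states. The following is a minimal shape-level sketch with random tensors. The sizes are arbitrary, and `alignment_W` here is a freshly initialized layer rather than the trained one.

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)
BN, C, T = 4, 8, 10                      # stacked batch*nodes, channels, time steps

h_encoder = torch.randn(BN, C, T)        # one hidden state per input time step
h_decoder = torch.randn(BN, 1, C)        # single decoder hidden state
alignment_W = nn.Linear(C, C)            # illustrative, untrained alignment layer

# nn.Linear acts on the last dimension, so move C there and back again
a = alignment_W(h_encoder.permute(0, 2, 1)).permute(0, 2, 1)   # (BN, C, T)
score = h_decoder @ a                                          # (BN, 1, T)
alpha = F.softmax(score, dim=2)                                # weights over time
context = torch.sum(alpha * h_encoder, dim=2)                  # (BN, C)
# concatenate context and decoder state as input to the prediction layer
features = torch.cat([context, h_decoder.squeeze(1)], dim=1)   # (BN, 2C)
```

The attention weights in `alpha` sum to one over the time dimension, which is why the prediction layer in the model takes `2*hidden_dim` input channels.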
--------------------------------------------------------------------------------
/src/models/GNNLSTM.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn.functional as F
3 | from torch import nn
4 | from torch.nn import LSTM, LSTMCell
5 | import math
6 |
7 | from ..utils.device import get_device
8 |
9 | from torch_geometric.nn import ARMAConv
10 | from torch_geometric.nn import Sequential
11 |
12 | from ..layers import DoubleEmbedding, SingleEmbedding
13 |
14 |
15 | class GNNLSTM(torch.nn.Module):
16 | '''
17 | Anomaly detection neural network model for multivariate sensor time series.
18 | Graph structure is randomly initialized and learned during training.
19 | Encodes every time step with a graph convolution followed by a two-layer LSTM,
20 | then decodes the forecast step by step with LSTM cells.
21 |
22 | Args:
23 | args (dict): Argparser with config information.
24 | '''
25 | def __init__(self, args):
26 | super().__init__()
27 |
28 | self.device = args.device
29 |
30 | self.num_nodes = args.num_nodes
31 | self.horizon = args.horizon
32 | self.topk = args.topk
33 | self.embed_dim = args.embed_dim
34 | self.lags = args.window_size
35 |
36 | # model parameters
37 | channels = 32 # channel == node embedding size because they are added
38 | hidden_size = 512
39 |
40 | # learned graph embeddings
41 | self.graph_embedding = SingleEmbedding(self.num_nodes, channels, topk=self.topk, warmup_epochs=10)
42 |
43 | # encoder
44 | self.tgconv = TGConv(1, channels)
45 | self.lstm = LSTM(channels*self.num_nodes, hidden_size, 2, batch_first=True, dropout=0.20)
46 |
47 | # decoder
48 | self.gnn = ARMAConv(
49 | in_channels=1,
50 | out_channels=channels,
51 | num_stacks=1,
52 | num_layers=1,
53 | act=nn.GELU(),
54 | dropout=0.2,
55 | )
56 | self.cell1 = LSTMCell(self.num_nodes*channels, hidden_size)
57 | self.cell2 = LSTMCell(hidden_size, hidden_size)
58 |
59 | # linear prediction layer
60 | self.pred = nn.Linear(hidden_size, self.num_nodes)
61 |
62 | # cached offsets for batch stacking for each batch_size and number of edges
63 | self.batch_edge_offset_cache = {}
64 |
65 | # initial graph
66 | self._edge_index, self.edge_attr, self.A = self.graph_embedding()
67 |
68 | def get_graph(self):
69 | return self.graph_embedding.get_A()
70 |
71 | def get_embedding(self):
72 | return self.graph_embedding.get_E()
73 |
74 | def forward(self, window):
75 | # batch stacked window; input shape: [num_nodes*batch_size, lags]
76 | N = self.num_nodes # number of nodes
77 | T = self.lags # number of input time steps
78 | B = window.size(0) // N # batch size
79 |
80 | # get learned graph representation
81 | edge_index, edge_attr, _ = self.graph_embedding()
82 | _, W = self.get_embedding()
83 | W = W.pop()
84 |
85 | # batching works by stacking graphs; creates a mega graph with disjoint subgraphs
86 | # for each input sample. E.g. for a batch of B inputs with 51 nodes each;
87 | # samples i in {0, ..., B-1} => node indices [0...50], [51...101], [102...152], ... ,[51*(B-1)...51*B-1]
88 | # => node indices for sample i = [0, ..., num_nodes-1] + (i*num_nodes)
89 | num_edges = len(edge_attr)
90 | try:
91 | batch_offset = self.batch_edge_offset_cache[(B, num_edges)]
92 | except KeyError:
93 | batch_offset = torch.arange(0, N * B, N).view(1, B, 1).expand(2, B, num_edges).flatten(1,-1).to(self.device)
94 | self.batch_edge_offset_cache[(B, num_edges)] = batch_offset
95 | # repeat edge indices B times and add i*num_nodes where i is the input index
96 | batched_edge_index = edge_index.unsqueeze(1).expand(2, B, -1).flatten(1, -1) + batch_offset
97 | # repeat edge weights B times
98 | batched_edge_attr = edge_attr.unsqueeze(0).expand(B, -1).flatten()
99 |
100 | # add node feature dimension to input
101 | x = window.unsqueeze(-1) # (B*N, T, 1)
102 |
103 | ### ENCODER
104 | # GNN layer; batch stacked output with C feature channels for each time step
105 | x = self.tgconv(x, batched_edge_index, batched_edge_attr) # (B*N, T, C)
106 | x = x.view(B, N, T, -1).permute(0, 2, 1, 3).contiguous() # -> (B, T, N, C)
107 | # add node embeddings to feature vector as node positional embeddings
108 | x = x + W # (B, T, N, C) + (N, C)
109 | # concatenate node features for LSTM input
110 | x = x.view(B, T, -1) # -> (B, T, N*C)
111 | # LSTM layer
112 | h, (h_n, c_n) = self.lstm(x) # -> (B, T, H), (2, B, H), (2, B, H)
113 | # get hidden and cell states for each layer
114 | h1 = h_n[0, ...].squeeze(0)
115 | h2 = h_n[1, ...].squeeze(0)
116 | c1 = c_n[0, ...].squeeze(0)
117 | c2 = c_n[1, ...].squeeze(0)
118 |
119 | # TODO: try attention on h
120 |
121 | ### DECODER
122 | predictions = []
123 | # if prediction horizon > 1, iterate through decoder LSTM step by step
124 | for _ in range(self.horizon-1):
125 | # single decoder step per loop iteration
126 | pred = self.pred(h2).view(-1, 1)
127 | predictions.append(pred)
128 |
129 | # GNN layer analogous to encoder without time dimension
130 | x = self.gnn(pred, batched_edge_index, batched_edge_attr)
131 | x = x.view(B, N, -1) + W
132 | x = x.view(B, -1)
133 | # LSTM layer 1
134 | h1, c1 = self.cell1(x, (h1, c1))
135 | h1 = F.dropout(h1, 0.2, training=self.training) # dropout only in training mode
136 | c1 = F.dropout(c1, 0.2, training=self.training)
137 | # LSTM layer 2
138 | h2, c2 = self.cell2(h1, (h2, c2))
139 | # final prediction
140 | pred = self.pred(h2).view(-1, 1)
141 | predictions.append(pred)
142 |
143 | return torch.cat(predictions, dim=1)
144 |
145 |
146 | class TGConv(nn.Module):
147 | r'''
148 | Parallel graph convolution for multiple time steps.
149 |
150 | Args:
151 | in_channels (int): Number of input features.
152 | out_channels (int): Number of output features.
153 | p (float): Dropout value between 0 and 1
154 | '''
155 |
156 | def __init__(self, in_channels: int, out_channels: int, p: float = 0.0):
157 | super(TGConv, self).__init__()
158 |
159 | self.device = get_device()
160 |
161 | self.graph_conv = ARMAConv(
162 | in_channels=in_channels,
163 | out_channels=out_channels,
164 | num_stacks=1,
165 | num_layers=1,
166 | act=nn.GELU(),
167 | dropout=p,
168 | )
169 |
170 | # cached offsets for temporal batch stacking for each batch_size and number of edges
171 | self.batch_edge_offset_cache = {}
172 |
173 | def forward(self, x: torch.FloatTensor, edge_index: torch.FloatTensor, edge_attr: torch.FloatTensor = None) -> torch.FloatTensor:
174 | '''
175 | Forward pass through temporal convolution block.
176 |
177 | Input data of shape: (batch, time_steps, in_channels).
178 | Output data of shape: (batch, time_steps, out_channels).
179 | '''
180 |
181 | # input dims
182 | BN, T, C = x.shape # (batch*nodes, time, in_channels)
183 | N = edge_index.max().item() + 1 # number of nodes in the batch stack
184 |
185 | # batch stacking the temporal dimension to create a mega giga graph consisting of batched temporally-stacked graphs
186 | # analogous to batch stacking in main GNN, see description there.
187 | x = x.contiguous().view(-1, C) # (B*N*T, C)
188 |
189 | # create temporal batch edge and weight lists
190 | num_edges = len(edge_attr)
191 | try:
192 | batch_offset = self.batch_edge_offset_cache[(BN, num_edges)]
193 | except KeyError:
194 | batch_offset = torch.arange(0, BN*T, N).view(1, T, 1).expand(2, T, num_edges).flatten(1,-1).to(x.device)
195 | self.batch_edge_offset_cache[(BN, num_edges)] = batch_offset
196 | # repeat edge indices T times and add offset for the edge indices
197 | temporal_batched_edge_index = edge_index.unsqueeze(1).expand(2, T, -1).flatten(1, -1) + batch_offset
198 | # repeat edge weights T times
199 | temporal_batched_edge_attr = edge_attr.unsqueeze(0).expand(T, -1).flatten()
200 |
201 | # GNN with C output channels
202 | x = self.graph_conv(x, temporal_batched_edge_index, temporal_batched_edge_attr) # (B*N*T, C)
203 | x = x.view(BN, T, -1) # -> (B*N, T, C)
204 |
205 | return x
206 |
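The batch stacking described in the comments of `forward` can be illustrated on a toy graph. The following is a minimal sketch, assuming a graph with 3 nodes and 2 edges that is repeated for a batch of 2 samples. The variable names mirror the model code, and the values are made up for illustration.

```python
import torch

N, B = 3, 2                               # nodes per graph, batch size
edge_index = torch.tensor([[0, 1],        # source nodes
                           [1, 2]])       # target nodes
edge_attr = torch.tensor([0.5, 0.25])     # one weight per edge
num_edges = edge_attr.numel()

# offset i*N for every edge of sample i, for both rows of edge_index
batch_offset = (torch.arange(0, N * B, N)
                .view(1, B, 1)
                .expand(2, B, num_edges)
                .flatten(1, -1))

# repeat the edge list B times and shift each copy into its own index range
batched_edge_index = edge_index.unsqueeze(1).expand(2, B, -1).flatten(1, -1) + batch_offset
# repeat the edge weights B times
batched_edge_attr = edge_attr.unsqueeze(0).expand(B, -1).flatten()

print(batched_edge_index.tolist())
```

The second sample reuses the same two edges with node indices shifted by `N`, so the stacked graph consists of disjoint copies that a single graph convolution call can process at once.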
--------------------------------------------------------------------------------
/src/models/LSTM.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn.functional as F
3 | from torch import nn
4 |
5 | from ..layers import SingleEmbedding
6 |
7 | class RecurrentModel(torch.nn.Module):
8 | def __init__(self, config):
9 | super().__init__()
10 |
11 | self.device = config.device
12 |
13 | self.num_nodes = config.num_nodes
14 | self.horizon = config.horizon
15 | self.topk = config.topk
16 | self.embed_dim = config.embed_dim
17 | self.lags = config.window_size
18 |
19 | # dummy
20 | self.embedding = SingleEmbedding(1, 1, 1).to(self.device)
21 |
22 | # encoder lstm
23 | self.lstm = nn.LSTM(self.num_nodes, 512, 2, batch_first=True, dropout=0.25)
24 |
25 | # decoder lstm
26 | self.cell1 = nn.LSTMCell(self.num_nodes, 512)
27 | self.cell2 = nn.LSTMCell(512, 512)
28 |
29 | # linear prediction layer
30 | self.pred = nn.Linear(512, self.num_nodes)
31 |
32 | def get_graph(self):
33 | return self.embedding.get_A()
34 |
35 | def get_embedding(self):
36 | return self.embedding.get_E()
37 |
38 |
39 | def forward(self, window):
40 | # batch stacked window; input shape: [num_nodes*batch_size, lags]
41 | N = self.num_nodes # number of nodes
42 | T = self.lags # number of input time steps
43 | B = window.size(0) // N # batch size
44 |
45 | x = window.view(B, N, T).permute(0, 2, 1).contiguous() # (B*N, T) -> (B, T, N)
46 |
47 | # encoder
48 | _, (h, c) = self.lstm(x) # -> (B, T, H), (2, B, H), (2, B, H)
49 | # get hidden and cell states for each layer
50 | h1 = h[0, ...].squeeze(0)
51 | h2 = h[1, ...].squeeze(0)
52 | c1 = c[0, ...].squeeze(0)
53 | c2 = c[1, ...].squeeze(0)
54 |
55 | # decoder
56 | predictions = []
57 | for _ in range(self.horizon-1):
58 | pred = self.pred(h2)
59 | predictions.append(pred.view(-1, 1))
60 | # layer 1
61 | h1, c1 = self.cell1(pred, (h1, c1))
62 | h1 = F.dropout(h1, 0.2, training=self.training) # dropout only in training mode
63 | c1 = F.dropout(c1, 0.2, training=self.training)
64 | # layer 2
65 | h2, c2 = self.cell2(h1, (h2, c2))
66 | # final prediction
67 | pred = self.pred(h2).view(-1, 1)
68 | predictions.append(pred)
69 |
70 | return torch.cat(predictions, dim=1)
71 |
72 |
73 |
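The window passed to `forward` is batch stacked, with one row per node and sample. The following is a minimal sketch of how such a `[B*N, T]` tensor can be rearranged into the `(B, T, N)` layout that a `batch_first` LSTM expects. It assumes the rows are ordered sample by sample, as in the graph models of this repository, and the toy sizes are arbitrary.

```python
import torch

B, N, T = 2, 3, 4                         # batch size, nodes, time steps
# row r holds the series of node r % N in sample r // N
window = torch.arange(B * N * T, dtype=torch.float).view(B * N, T)

# split the stacked rows into (sample, node), then move time before nodes
x = window.view(B, N, T).permute(0, 2, 1).contiguous()   # (B, T, N)

# x[b, t, n] is the value of node n in sample b at time step t
print(x[0, :, 1].tolist())   # series of node 1 in sample 0
```

A plain `window.view(B, T, N)` would yield the same shape but mix values of different nodes and time steps, because `view` only reinterprets the memory layout without transposing.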
--------------------------------------------------------------------------------
/src/models/Linear.py:
--------------------------------------------------------------------------------
1 | import torch
2 | from torch import nn
3 |
4 | from ..layers import SingleEmbedding
5 |
6 | class LinearModel(torch.nn.Module):
7 | def __init__(self, args):
8 | super().__init__()
9 |
10 | self.device = args.device
11 |
12 | self.num_nodes = args.num_nodes
13 | self.horizon = args.horizon
14 | self.topk = args.topk
15 | self.embed_dim = args.embed_dim
16 | self.lags = args.window_size
17 |
18 | self.embedding = SingleEmbedding(self.num_nodes, self.embed_dim, topk=self.topk)
19 |
20 | self.lin = nn.Sequential(
21 | nn.Linear(self.lags, 1024),
22 | nn.BatchNorm1d(1024),
23 | nn.ReLU(),
24 | )
25 |
26 | self.pred = nn.Sequential(
27 | nn.Linear(1024, self.horizon)
28 | )
29 |
30 | # initial graph
31 | self._edge_index, self.edge_attr, self.A = self.embedding()
32 |
33 | def get_graph(self):
34 | return self.embedding.get_A()
35 |
36 | def get_embedding(self):
37 | return self.embedding.get_E()
38 |
39 | def forward(self, x):
40 | # input sizes
41 | N = self.num_nodes
42 | B = x.size(0) // N # batch size
43 |
44 | x = self.lin(x)
45 | pred = self.pred(x).view(B*N, -1)
46 |
47 | return pred
48 |
--------------------------------------------------------------------------------
/src/models/MTGNN.py:
--------------------------------------------------------------------------------
1 | from __future__ import division
2 |
3 | from typing import Optional
4 |
5 | import torch
6 | import torch.nn.functional as F
7 | from torch.nn import init
8 | from torch import nn
9 |
10 | from ..layers import SingleEmbedding
11 |
12 | class MTGNNModel(torch.nn.Module):
13 | def __init__(self, config):
14 | super().__init__()
15 |
16 | self.device = config.device
17 |
18 | self.num_nodes = config.num_nodes
19 | self.horizon = config.horizon
20 | self.topk = config.topk
21 | self.embed_dim = config.embed_dim
22 | self.lags = config.window_size
23 |
24 | # layer definitions
25 | self.embedding = SingleEmbedding(
26 | self.num_nodes,
27 | self.embed_dim,
28 | topk=self.topk
29 | ).to(self.device)
30 |
31 | conv_channels = 30
32 | residual_channels = 30
33 | skip_channels = 128
34 | end_channels = 256
35 |
36 | self.gnn = MTGNN(
37 | gcn_true=True,
38 | build_adj=True,
39 | gcn_depth=3,
40 | num_nodes=self.num_nodes,
41 | kernel_set=[3,3,3],
42 | kernel_size=3,
43 | dropout=0.2,
44 | subgraph_size=self.topk,
45 | node_dim=1,
46 | dilation_exponential=2,
47 | conv_channels=conv_channels,
48 | residual_channels=residual_channels,
49 | skip_channels=skip_channels,
50 | end_channels=end_channels,
51 | seq_length=self.lags,
52 | in_dim=1,
53 | out_dim=self.horizon,
54 | layers=3,
55 | propalpha=0.4,
56 | tanhalpha=3,
57 | layer_norm_affline=True,
58 | xd=None
59 | )
60 |
61 | # initial graph
62 | self._edge_index, self.edge_attr, self.A = self.embedding()
63 |
64 | def get_graph(self):
65 | return self.embedding.get_A()
66 |
67 | def get_embedding(self):
68 | return self.embedding.get_E()
69 |
70 | def forward(self, window):
71 | x = window.view(-1, 1, self.num_nodes, self.lags) # (batch, 1, num_nodes, lags)
72 | pred = self.gnn(x) # (batch, out_channels, num_nodes, 1)
73 | pred = pred.squeeze(-1).permute(0,2,1).contiguous().view(-1, self.horizon) # batch stacked
74 |
75 | return pred
76 |
77 |
78 | class Linear(nn.Module):
79 | r"""An implementation of the linear layer, conducting 2D convolution.
80 | For details see this paper:
81 | "Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks."
82 |
83 | Args:
84 | c_in (int): Number of input channels.
85 | c_out (int): Number of output channels.
86 | bias (bool, optional): Whether to have bias. Default: True.
87 | """
88 |
89 | def __init__(self, c_in: int, c_out: int, bias: bool = True):
90 | super(Linear, self).__init__()
91 | self._mlp = torch.nn.Conv2d(
92 | c_in, c_out, kernel_size=(1, 1), padding=(0, 0), stride=(1, 1), bias=bias
93 | )
94 |
95 | self._reset_parameters()
96 |
97 | def _reset_parameters(self):
98 | for p in self.parameters():
99 | if p.dim() > 1:
100 | nn.init.xavier_uniform_(p)
101 | else:
102 | nn.init.uniform_(p)
103 |
104 | def forward(self, X: torch.FloatTensor) -> torch.FloatTensor:
105 | """
106 | Making a forward pass of the linear layer.
107 |
108 | Arg types:
109 | * **X** (PyTorch Float Tensor) - Input tensor, with shape (batch_size, c_in, num_nodes, seq_len).
110 |
111 | Return types:
112 | * **X** (PyTorch Float Tensor) - Output tensor, with shape (batch_size, c_out, num_nodes, seq_len).
113 | """
114 | return self._mlp(X)
115 |
116 |
117 | class MixProp(nn.Module):
118 | r"""An implementation of the dynamic mix-hop propagation layer.
119 | For details see this paper:
120 | "Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks."
121 |
122 | Args:
123 | c_in (int): Number of input channels.
124 | c_out (int): Number of output channels.
125 | gdep (int): Depth of graph convolution.
126 | dropout (float): Dropout rate.
127 | alpha (float): Ratio of retaining the root node's original state, a value between 0 and 1.
128 | """
129 |
130 | def __init__(self, c_in: int, c_out: int, gdep: int, dropout: float, alpha: float):
131 | super(MixProp, self).__init__()
132 | self._mlp = Linear((gdep + 1) * c_in, c_out)
133 | self._gdep = gdep
134 | self._dropout = dropout
135 | self._alpha = alpha
136 |
137 | self._reset_parameters()
138 |
139 | def _reset_parameters(self):
140 | for p in self.parameters():
141 | if p.dim() > 1:
142 | nn.init.xavier_uniform_(p)
143 | else:
144 | nn.init.uniform_(p)
145 |
146 | def forward(self, X: torch.FloatTensor, A: torch.FloatTensor) -> torch.FloatTensor:
147 | """
148 | Making a forward pass of mix-hop propagation.
149 |
150 | Arg types:
151 | * **X** (PyTorch Float Tensor) - Input feature Tensor, with shape (batch_size, c_in, num_nodes, seq_len).
152 | * **A** (PyTorch Float Tensor) - Adjacency matrix, with shape (num_nodes, num_nodes).
153 |
154 | Return types:
155 | * **H_0** (PyTorch Float Tensor) - Hidden representation for all nodes, with shape (batch_size, c_out, num_nodes, seq_len).
156 | """
157 | A = A + torch.eye(A.size(0)).to(X.device)
158 | d = A.sum(1)
159 | H = X
160 | H_0 = X
161 | A = A / d.view(-1, 1)
162 | for _ in range(self._gdep):
163 | H = self._alpha * X + (1 - self._alpha) * torch.einsum(
164 | "ncwl,vw->ncvl", (H, A)
165 | )
166 | H_0 = torch.cat((H_0, H), dim=1)
167 | H_0 = self._mlp(H_0)
168 | return H_0
169 |
170 |
171 | class DilatedInception(nn.Module):
172 | r"""An implementation of the dilated inception layer.
173 | For details see this paper:
174 | "Connecting the Dots: Multivariate Time Series Forecasting with Graph Neural Networks."
175 |
176 | Args:
177 | c_in (int): Number of input channels.
178 | c_out (int): Number of output channels.
179 | kernel_set (list of int): List of kernel sizes.
180 | dilation_factor (int): Dilation factor.
181 | """
182 |
183 | def __init__(self, c_in: int, c_out: int, kernel_set: list, dilation_factor: int):
184 | super(DilatedInception, self).__init__()
185 | self._time_conv = nn.ModuleList()
186 | self._kernel_set = kernel_set
187 | c_out = int(c_out / len(self._kernel_set))
188 | for kern in self._kernel_set:
189 | self._time_conv.append(
190 | nn.Conv2d(c_in, c_out, (1, kern), dilation=(1, dilation_factor))
191 | )
192 | self._reset_parameters()
193 |
194 | def _reset_parameters(self):
195 | for p in self.parameters():
196 | if p.dim() > 1:
197 | nn.init.xavier_uniform_(p)
198 | else:
199 | nn.init.uniform_(p)
200 |
201 | def forward(self, X_in: torch.FloatTensor) -> torch.FloatTensor:
202 | """
203 | Making a forward pass of dilated inception.
204 |
205 | Arg types:
206 | * **X_in** (PyTorch Float Tensor) - Input feature Tensor, with shape (batch_size, c_in, num_nodes, seq_len).
207 |
208 | Return types:
209 | * **X** (PyTorch Float Tensor) - Hidden representation for all nodes,
210 | with shape (batch_size, c_out, num_nodes, seq_len-6).
211 | """
212 | X = []
213 | for i in range(len(self._kernel_set)):
214 | X.append(self._time_conv[i](X_in))
215 | for i in range(len(self._kernel_set)):
216 | X[i] = X[i][..., -X[-1].size(3) :]
217 | X = torch.cat(X, dim=1)
218 | return X
219 |
220 |
221 | class GraphConstructor(nn.Module):
222 | r"""An implementation of the graph learning layer to construct an adjacency matrix.
223 | For details see the paper "Connecting the Dots: Multivariate Time Series Forecasting
224 | with Graph Neural Networks".
225 |
226 | Args:
227 | nnodes (int): Number of nodes in the graph.
228 | k (int): Number of largest values to consider in constructing the neighbourhood of a node (pick the "nearest" k nodes).
229 | dim (int): Dimension of the node embedding.
230 | alpha (float): Tanh alpha for generating the adjacency matrix; controls the saturation rate.
231 | xd (int, optional): Static feature dimension, default None.
232 | """
233 |
234 | def __init__(
235 | self, nnodes: int, k: int, dim: int, alpha: float, xd: Optional[int] = None
236 | ):
237 | super(GraphConstructor, self).__init__()
238 | if xd is not None:
239 | self._static_feature_dim = xd
240 | self._linear1 = nn.Linear(xd, dim)
241 | self._linear2 = nn.Linear(xd, dim)
242 | else:
243 | self._embedding1 = nn.Embedding(nnodes, dim)
244 | self._embedding2 = nn.Embedding(nnodes, dim)
245 | self._linear1 = nn.Linear(dim, dim)
246 | self._linear2 = nn.Linear(dim, dim)
247 |
248 | self._k = k
249 | self._alpha = alpha
250 |
251 | self._reset_parameters()
252 |
253 | def _reset_parameters(self):
254 | for p in self.parameters():
255 | if p.dim() > 1:
256 | nn.init.xavier_uniform_(p)
257 | else:
258 | nn.init.uniform_(p)
259 |
260 | def forward(
261 | self, idx: torch.LongTensor, FE: Optional[torch.FloatTensor] = None
262 | ) -> torch.FloatTensor:
263 | """
264 | Making a forward pass to construct an adjacency matrix from node embeddings.
265 |
266 | Arg types:
267 | * **idx** (PyTorch Long Tensor) - Input indices, a permutation of the node indices.
268 | * **FE** (PyTorch Float Tensor, optional) - Static feature, default None.
269 | Return types:
270 | * **A** (PyTorch Float Tensor) - Adjacency matrix constructed from node embeddings.
271 | """
272 |
273 | if FE is None:
274 | nodevec1 = self._embedding1(idx)
275 | nodevec2 = self._embedding2(idx)
276 | else:
277 | assert FE.shape[1] == self._static_feature_dim
278 | nodevec1 = FE[idx, :]
279 | nodevec2 = nodevec1
280 |
281 | nodevec1 = torch.tanh(self._alpha * self._linear1(nodevec1))
282 | nodevec2 = torch.tanh(self._alpha * self._linear2(nodevec2))
283 |
284 | a = torch.mm(nodevec1, nodevec2.transpose(1, 0)) - torch.mm(
285 | nodevec2, nodevec1.transpose(1, 0)
286 | )
287 | A = F.relu(torch.tanh(self._alpha * a))
288 | mask = torch.zeros(idx.size(0), idx.size(0)).to(A.device)
289 | mask.fill_(float("0"))
290 | s1, t1 = A.topk(self._k, 1)
291 | mask.scatter_(1, t1, s1.fill_(1))
292 | A = A * mask
293 | return A
294 |
295 |
296 | class LayerNormalization(nn.Module):
297 | r"""An implementation of the layer normalization layer.
298 | For details see the paper "Connecting the Dots: Multivariate Time Series Forecasting
299 | with Graph Neural Networks".
300 |
301 | Args:
302 | normalized_shape (tuple of int): Shape of the trailing input dimensions to normalize over, (feature_dim, num_nodes, seq_len).
303 | eps (float, optional): Value added to the denominator for numerical stability. Default: 1e-5.
304 | elementwise_affine (bool, optional): Whether to conduct elementwise affine transformation or not. Default: True.
305 | """
306 | __constants__ = ["normalized_shape", "weight", "bias", "eps", "elementwise_affine"]
307 |
308 | def __init__(
309 | self, normalized_shape: tuple, eps: float = 1e-5, elementwise_affine: bool = True
310 | ):
311 | super(LayerNormalization, self).__init__()
312 | self._normalized_shape = tuple(normalized_shape)
313 | self._eps = eps
314 | self._elementwise_affine = elementwise_affine
315 | if self._elementwise_affine:
316 | self._weight = nn.Parameter(torch.Tensor(*normalized_shape))
317 | self._bias = nn.Parameter(torch.Tensor(*normalized_shape))
318 | else:
319 | self.register_parameter("_weight", None)
320 | self.register_parameter("_bias", None)
321 | self._reset_parameters()
322 |
323 | def _reset_parameters(self):
324 | if self._elementwise_affine:
325 | init.ones_(self._weight)
326 | init.zeros_(self._bias)
327 |
328 | def forward(self, X: torch.FloatTensor, idx: torch.LongTensor) -> torch.FloatTensor:
329 | """
330 | Making a forward pass of layer normalization.
331 |
332 | Arg types:
333 | * **X** (PyTorch Float Tensor) - Input tensor,
334 | with shape (batch_size, feature_dim, num_nodes, seq_len).
335 | * **idx** (PyTorch Long Tensor) - Input indices.
336 |
337 | Return types:
338 | * **X** (PyTorch Float Tensor) - Output tensor,
339 | with shape (batch_size, feature_dim, num_nodes, seq_len).
340 | """
341 | if self._elementwise_affine:
342 | return F.layer_norm(
343 | X,
344 | tuple(X.shape[1:]),
345 | self._weight[:, idx, :],
346 | self._bias[:, idx, :],
347 | self._eps,
348 | )
349 | else:
350 | return F.layer_norm(
351 | X, tuple(X.shape[1:]), self._weight, self._bias, self._eps
352 | )
353 |
354 |
355 | class MTGNNLayer(nn.Module):
356 | r"""An implementation of the MTGNN layer.
357 | For details see the paper "Connecting the Dots: Multivariate Time Series Forecasting
358 | with Graph Neural Networks".
359 |
360 | Args:
361 | dilation_exponential (int): Dilation exponential.
362 | rf_size_i (int): Size of receptive field.
363 | kernel_size (int): Size of kernel for convolution, to calculate receptive field size.
364 | j (int): Iteration index.
365 | residual_channels (int): Residual channels.
366 | conv_channels (int): Convolution channels.
367 | skip_channels (int): Skip channels.
368 | kernel_set (list of int): List of kernel sizes.
369 | new_dilation (int): Dilation.
370 | layer_norm_affline (bool): Whether to do elementwise affine in Layer Normalization.
371 | gcn_true (bool): Whether to add graph convolution layer.
372 | seq_length (int): Length of input sequence.
373 | receptive_field (int): Receptive field.
374 | dropout (float): Dropout rate.
375 | gcn_depth (int): Graph convolution depth.
376 | num_nodes (int): Number of nodes in the graph.
377 | propalpha (float): Prop alpha, the ratio of retaining the root node's original state in mix-hop propagation, a value between 0 and 1.
378 |
379 | """
380 |
381 | def __init__(
382 | self,
383 | dilation_exponential: int,
384 | rf_size_i: int,
385 | kernel_size: int,
386 | j: int,
387 | residual_channels: int,
388 | conv_channels: int,
389 | skip_channels: int,
390 | kernel_set: list,
391 | new_dilation: int,
392 | layer_norm_affline: bool,
393 | gcn_true: bool,
394 | seq_length: int,
395 | receptive_field: int,
396 | dropout: float,
397 | gcn_depth: int,
398 | num_nodes: int,
399 | propalpha: float,
400 | ):
401 | super(MTGNNLayer, self).__init__()
402 | self._dropout = dropout
403 | self._gcn_true = gcn_true
404 |
405 | if dilation_exponential > 1:
406 | rf_size_j = int(
407 | rf_size_i
408 | + (kernel_size - 1)
409 | * (dilation_exponential ** j - 1)
410 | / (dilation_exponential - 1)
411 | )
412 | else:
413 | rf_size_j = rf_size_i + j * (kernel_size - 1)
414 |
415 | self._filter_conv = DilatedInception(
416 | residual_channels,
417 | conv_channels,
418 | kernel_set=kernel_set,
419 | dilation_factor=new_dilation,
420 | )
421 |
422 | self._gate_conv = DilatedInception(
423 | residual_channels,
424 | conv_channels,
425 | kernel_set=kernel_set,
426 | dilation_factor=new_dilation,
427 | )
428 |
429 | self._residual_conv = nn.Conv2d(
430 | in_channels=conv_channels,
431 | out_channels=residual_channels,
432 | kernel_size=(1, 1),
433 | )
434 |
435 | if seq_length > receptive_field:
436 | self._skip_conv = nn.Conv2d(
437 | in_channels=conv_channels,
438 | out_channels=skip_channels,
439 | kernel_size=(1, seq_length - rf_size_j + 1),
440 | )
441 | else:
442 | self._skip_conv = nn.Conv2d(
443 | in_channels=conv_channels,
444 | out_channels=skip_channels,
445 | kernel_size=(1, receptive_field - rf_size_j + 1),
446 | )
447 |
448 | if gcn_true:
449 | self._mixprop_conv1 = MixProp(
450 | conv_channels, residual_channels, gcn_depth, dropout, propalpha
451 | )
452 |
453 | self._mixprop_conv2 = MixProp(
454 | conv_channels, residual_channels, gcn_depth, dropout, propalpha
455 | )
456 |
457 | if seq_length > receptive_field:
458 | self._normalization = LayerNormalization(
459 | (residual_channels, num_nodes, seq_length - rf_size_j + 1),
460 | elementwise_affine=layer_norm_affline,
461 | )
462 |
463 | else:
464 | self._normalization = LayerNormalization(
465 | (residual_channels, num_nodes, receptive_field - rf_size_j + 1),
466 | elementwise_affine=layer_norm_affline,
467 | )
468 | self._reset_parameters()
469 |
470 | def _reset_parameters(self):
471 | for p in self.parameters():
472 | if p.dim() > 1:
473 | nn.init.xavier_uniform_(p)
474 | else:
475 | nn.init.uniform_(p)
476 |
477 | def forward(
478 | self,
479 | X: torch.FloatTensor,
480 | X_skip: torch.FloatTensor,
481 | A_tilde: Optional[torch.FloatTensor],
482 | idx: torch.LongTensor,
483 | training: bool,
484 | ) -> torch.FloatTensor:
485 | """
486 | Making a forward pass of MTGNN layer.
487 |
488 | Arg types:
489 | * **X** (PyTorch FloatTensor) - Input feature tensor,
490 | with shape (batch_size, residual_channels, num_nodes, seq_len).
491 | * **X_skip** (PyTorch FloatTensor) - Accumulated skip-connection tensor,
492 | with shape (batch_size, skip_channels, num_nodes, 1).
493 | * **A_tilde** (PyTorch FloatTensor or None) - Adjacency matrix, predefined or learned.
494 | * **idx** (PyTorch LongTensor) - Input indices.
495 | * **training** (bool) - Whether in training mode.
496 |
497 | Return types:
498 | * **X** (PyTorch FloatTensor) - Output feature tensor,
499 | with shape (batch_size, residual_channels, num_nodes, seq_len - new_dilation * (kernel_set[-1] - 1)).
500 | * **X_skip** (PyTorch FloatTensor) - Updated skip-connection tensor,
501 | with shape (batch_size, skip_channels, num_nodes, 1).
502 | """
503 | X_residual = X
504 | X_filter = self._filter_conv(X)
505 | X_filter = torch.tanh(X_filter)
506 | X_gate = self._gate_conv(X)
507 | X_gate = torch.sigmoid(X_gate)
508 | X = X_filter * X_gate
509 | X = F.dropout(X, self._dropout, training=training)
510 | X_skip = self._skip_conv(X) + X_skip
511 | if self._gcn_true:
512 | X = self._mixprop_conv1(X, A_tilde) + self._mixprop_conv2(
513 | X, A_tilde.transpose(1, 0)
514 | )
515 | else:
516 | X = self._residual_conv(X)
517 |
518 | X = X + X_residual[:, :, :, -X.size(3) :]
519 | X = self._normalization(X, idx)
520 | return X, X_skip
521 |
522 |
523 | class MTGNN(nn.Module):
524 | r"""An implementation of the Multivariate Time Series Forecasting Graph Neural Networks.
525 | For details see the paper "Connecting the Dots: Multivariate Time Series Forecasting
526 | with Graph Neural Networks".
527 |
528 | Args:
529 | gcn_true (bool): Whether to add graph convolution layer.
530 | build_adj (bool): Whether to construct adaptive adjacency matrix.
531 | gcn_depth (int): Graph convolution depth.
532 | num_nodes (int): Number of nodes in the graph.
533 | kernel_set (list of int): List of kernel sizes.
534 | kernel_size (int): Size of kernel for convolution, to calculate receptive field size.
535 | dropout (float): Dropout rate.
536 | subgraph_size (int): Size of subgraph.
537 | node_dim (int): Dimension of nodes.
538 | dilation_exponential (int): Dilation exponential.
539 | conv_channels (int): Convolution channels.
540 | residual_channels (int): Residual channels.
541 | skip_channels (int): Skip channels.
542 | end_channels (int): End channels.
543 | seq_length (int): Length of input sequence.
544 | in_dim (int): Input dimension.
545 | out_dim (int): Output dimension.
546 | layers (int): Number of layers.
547 | propalpha (float): Prop alpha, the ratio of retaining the root node's original state in mix-hop propagation, a value between 0 and 1.
548 | tanhalpha (float): Tanh alpha for generating adjacency matrix, alpha controls the saturation rate.
549 | layer_norm_affline (bool): Whether to do elementwise affine in Layer Normalization.
550 | xd (int, optional): Static feature dimension, default None.
551 | """
552 |
553 | def __init__(
554 | self,
555 | gcn_true: bool,
556 | build_adj: bool,
557 | gcn_depth: int,
558 | num_nodes: int,
559 | kernel_set: list,
560 | kernel_size: int,
561 | dropout: float,
562 | subgraph_size: int,
563 | node_dim: int,
564 | dilation_exponential: int,
565 | conv_channels: int,
566 | residual_channels: int,
567 | skip_channels: int,
568 | end_channels: int,
569 | seq_length: int,
570 | in_dim: int,
571 | out_dim: int,
572 | layers: int,
573 | propalpha: float,
574 | tanhalpha: float,
575 | layer_norm_affline: bool,
576 | xd: Optional[int] = None,
577 | ):
578 | super(MTGNN, self).__init__()
579 |
580 | self._gcn_true = gcn_true
581 | self._build_adj_true = build_adj
582 | self._num_nodes = num_nodes
583 | self._dropout = dropout
584 | self._seq_length = seq_length
585 | self._layers = layers
586 | self._idx = torch.arange(self._num_nodes)
587 |
588 | self._mtgnn_layers = nn.ModuleList()
589 |
590 | self._graph_constructor = GraphConstructor(
591 | num_nodes, subgraph_size, node_dim, alpha=tanhalpha, xd=xd
592 | )
593 |
594 | self._set_receptive_field(dilation_exponential, kernel_size, layers)
595 |
596 | new_dilation = 1
597 | for j in range(1, layers + 1):
598 | self._mtgnn_layers.append(
599 | MTGNNLayer(
600 | dilation_exponential=dilation_exponential,
601 | rf_size_i=1,
602 | kernel_size=kernel_size,
603 | j=j,
604 | residual_channels=residual_channels,
605 | conv_channels=conv_channels,
606 | skip_channels=skip_channels,
607 | kernel_set=kernel_set,
608 | new_dilation=new_dilation,
609 | layer_norm_affline=layer_norm_affline,
610 | gcn_true=gcn_true,
611 | seq_length=seq_length,
612 | receptive_field=self._receptive_field,
613 | dropout=dropout,
614 | gcn_depth=gcn_depth,
615 | num_nodes=num_nodes,
616 | propalpha=propalpha,
617 | )
618 | )
619 |
620 | new_dilation *= dilation_exponential
621 |
622 | self._setup_conv(
623 | in_dim, skip_channels, end_channels, residual_channels, out_dim
624 | )
625 |
626 | self._reset_parameters()
627 |
628 | def _setup_conv(
629 | self, in_dim, skip_channels, end_channels, residual_channels, out_dim
630 | ):
631 |
632 | self._start_conv = nn.Conv2d(
633 | in_channels=in_dim, out_channels=residual_channels, kernel_size=(1, 1)
634 | )
635 |
636 | if self._seq_length > self._receptive_field:
637 |
638 | self._skip_conv_0 = nn.Conv2d(
639 | in_channels=in_dim,
640 | out_channels=skip_channels,
641 | kernel_size=(1, self._seq_length),
642 | bias=True,
643 | )
644 |
645 | self._skip_conv_E = nn.Conv2d(
646 | in_channels=residual_channels,
647 | out_channels=skip_channels,
648 | kernel_size=(1, self._seq_length - self._receptive_field + 1),
649 | bias=True,
650 | )
651 |
652 | else:
653 | self._skip_conv_0 = nn.Conv2d(
654 | in_channels=in_dim,
655 | out_channels=skip_channels,
656 | kernel_size=(1, self._receptive_field),
657 | bias=True,
658 | )
659 |
660 | self._skip_conv_E = nn.Conv2d(
661 | in_channels=residual_channels,
662 | out_channels=skip_channels,
663 | kernel_size=(1, 1),
664 | bias=True,
665 | )
666 |
667 | self._end_conv_1 = nn.Conv2d(
668 | in_channels=skip_channels,
669 | out_channels=end_channels,
670 | kernel_size=(1, 1),
671 | bias=True,
672 | )
673 |
674 | self._end_conv_2 = nn.Conv2d(
675 | in_channels=end_channels,
676 | out_channels=out_dim,
677 | kernel_size=(1, 1),
678 | bias=True,
679 | )
680 |
681 | def _reset_parameters(self):
682 | for p in self.parameters():
683 | if p.dim() > 1:
684 | nn.init.xavier_uniform_(p)
685 | else:
686 | nn.init.uniform_(p)
687 |
688 | def _set_receptive_field(self, dilation_exponential, kernel_size, layers):
689 | if dilation_exponential > 1:
690 | self._receptive_field = int(
691 | 1
692 | + (kernel_size - 1)
693 | * (dilation_exponential ** layers - 1)
694 | / (dilation_exponential - 1)
695 | )
696 | else:
697 | self._receptive_field = layers * (kernel_size - 1) + 1
698 |
699 | def forward(
700 | self,
701 | X_in: torch.FloatTensor,
702 | A_tilde: Optional[torch.FloatTensor] = None,
703 | idx: Optional[torch.LongTensor] = None,
704 | FE: Optional[torch.FloatTensor] = None,
705 | ) -> torch.FloatTensor:
706 | """
707 | Making a forward pass of MTGNN.
708 |
709 | Arg types:
710 | * **X_in** (PyTorch FloatTensor) - Input sequence, with shape (batch_size, in_dim, num_nodes, seq_len).
711 | * **A_tilde** (PyTorch FloatTensor, optional) - Predefined adjacency matrix, default None.
712 | * **idx** (PyTorch LongTensor, optional) - Input indices, a permutation of the num_nodes, default None (no permutation).
713 | * **FE** (PyTorch FloatTensor, optional) - Static feature, default None.
714 |
715 | Return types:
716 | * **X** (PyTorch FloatTensor) - Output sequence for prediction, with shape (batch_size, out_dim, num_nodes, 1).
717 | """
718 | seq_len = X_in.size(3)
719 | assert (
720 | seq_len == self._seq_length
721 | ), "Input sequence length not equal to preset sequence length."
722 |
723 | if self._seq_length < self._receptive_field:
724 | X_in = nn.functional.pad(
725 | X_in, (self._receptive_field - self._seq_length, 0, 0, 0)
726 | )
727 |
728 | if self._gcn_true:
729 | if self._build_adj_true:
730 | if idx is None:
731 | A_tilde = self._graph_constructor(self._idx.to(X_in.device), FE=FE)
732 | else:
733 | A_tilde = self._graph_constructor(idx, FE=FE)
734 |
735 | X = self._start_conv(X_in)
736 | X_skip = self._skip_conv_0(
737 | F.dropout(X_in, self._dropout, training=self.training)
738 | )
739 | if idx is None:
740 | for mtgnn in self._mtgnn_layers:
741 | X, X_skip = mtgnn(
742 | X, X_skip, A_tilde, self._idx.to(X_in.device), self.training
743 | )
744 | else:
745 | for mtgnn in self._mtgnn_layers:
746 | X, X_skip = mtgnn(X, X_skip, A_tilde, idx, self.training)
747 |
748 | X_skip = self._skip_conv_E(X) + X_skip
749 | X = F.relu(X_skip)
750 | X = F.relu(self._end_conv_1(X))
751 | X = self._end_conv_2(X)
752 | return X
753 |
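The receptive-field arithmetic used by `_set_receptive_field` and `MTGNNLayer` can be checked without PyTorch. The following is a minimal sketch in plain Python; the helper names `receptive_field` and `temporal_lengths` are illustrative and not part of this repository. It assumes that `kernel_size` equals the largest entry of `kernel_set` and that layer `j` uses dilation `dilation_exponential ** (j - 1)`, as in the constructor loop above.

```python
def receptive_field(dilation_exponential: int, kernel_size: int, layers: int) -> int:
    """Receptive field after `layers` stacked dilated temporal convolutions."""
    if dilation_exponential > 1:
        # Geometric series: 1 + (k - 1) * (q**0 + q**1 + ... + q**(layers - 1)).
        growth = (dilation_exponential ** layers - 1) // (dilation_exponential - 1)
        return 1 + (kernel_size - 1) * growth
    # Constant dilation of 1: every layer adds (k - 1) time steps.
    return 1 + layers * (kernel_size - 1)


def temporal_lengths(seq_length: int, dilation_exponential: int,
                     kernel_size: int, layers: int) -> list:
    """Temporal length of the feature map after each layer (hypothetical helper)."""
    total = receptive_field(dilation_exponential, kernel_size, layers)
    # Inputs shorter than the receptive field are left-padded up to it.
    width = max(seq_length, total)
    return [
        width - receptive_field(dilation_exponential, kernel_size, j) + 1
        for j in range(1, layers + 1)
    ]


print(receptive_field(2, 7, 3))       # three layers, dilations 1, 2, 4
print(temporal_lengths(50, 2, 7, 3))  # feature-map width after each layer
```

The per-layer widths are the values used as kernel widths for the skip convolutions and as the last dimension of the layer normalization shape, which is why every skip output collapses to a temporal length of 1.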
--------------------------------------------------------------------------------
/src/models/__init__.py:
--------------------------------------------------------------------------------
1 | from .Linear import LinearModel
2 | from .ConvSeqAttention import ConvSeqAttentionModel
3 | from .LSTM import RecurrentModel
4 | from .MTGNN import MTGNNModel
5 | from .GNNLSTM import GNNLSTM
--------------------------------------------------------------------------------
/src/utils/__init__.py:
--------------------------------------------------------------------------------
1 | from .nn_trainer import Trainer
--------------------------------------------------------------------------------
/src/utils/device.py:
--------------------------------------------------------------------------------
1 | import torch
2 |
3 | # Module-level holder for the computing device (cpu or cuda).
4 | # Import the setter or getter function to access it.
5 | # get_device determines the device from availability
6 | # if none has been set manually.
7 |
8 | _device = None
9 |
10 | def get_device():
11 | ''' Returns the currently used computing device.'''
12 | if _device is None:
13 | device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
14 | set_device(device)
15 | return _device
16 |
17 | def set_device(device):
18 | ''' Sets the computing device. '''
19 | global _device
20 | _device = device
21 |
--------------------------------------------------------------------------------
/src/utils/evaluate.py:
--------------------------------------------------------------------------------
1 | from .metrics import precision_recall, F_score
2 | from . import utils
3 | import numpy as np
4 |
5 | def evaluate_performance(train_res, test_res, threshold_method='max', smoothing=4, smoothing_method='mean'):
6 | '''
7 | Returns a dict with precision, recall, F1 and F2 scores, their point-adjusted variants and the detection latency.
8 | Determines the anomaly threshold from normalized and smoothed validation errors.
9 | Normalization is performed on each 1d sensor time series.
10 | Anomaly predictions are calculated as smoothed and normalized test error scores that exceed the threshold.
11 |
12 | Args:
13 | train_res (list): List of length two holding the validation prediction errors and the anomaly labels.
14 | The labels are assumed to be None, since validation data has no anomaly labels.
15 | test_res (list): List of length two holding the test prediction errors and the anomaly labels.
16 | '''
17 | train_pred_err, _ = train_res
18 | test_pred_err, anomaly_labels = test_res
19 |
20 | assert test_pred_err.size(1) == anomaly_labels.size(0)
21 |
22 | # row-wise normalization (within each sensor) and subsequent 1d smoothing
23 | train_error = utils.normalize_with_median_iqr(train_pred_err)
24 | test_error = utils.normalize_with_median_iqr(test_pred_err)
25 | if smoothing > 0:
26 | train_error = utils.weighted_average_smoothing(train_error, k=smoothing, mode=smoothing_method)
27 | test_error = utils.weighted_average_smoothing(test_error, k=smoothing, mode=smoothing_method)
28 |
29 | if threshold_method == 'max':
30 | anomaly_predictions = _max_thresholding(train_error, test_error)
31 | elif threshold_method == 'mean':
32 | anomaly_predictions = _mean_thresholding(train_error, test_error)
33 | elif threshold_method == 'best':
34 | anomaly_predictions, threshold_method = _best_thresholding(test_error, anomaly_labels)
35 |
36 | # evaluate test performance
37 | assert anomaly_predictions.shape == anomaly_labels.shape
38 | precision, recall = precision_recall(anomaly_predictions, anomaly_labels)
39 | f1 = F_score(precision, recall, beta=1)
40 | f2 = F_score(precision, recall, beta=2)
41 |
42 | # adjusted performance
43 | adjusted_predictions, latency = adjust_predicts(anomaly_predictions, anomaly_labels, calc_latency=True)
44 | precision_adj, recall_adj = precision_recall(adjusted_predictions, anomaly_labels)
45 | f1_adj = F_score(precision_adj, recall_adj, beta=1)
46 | f2_adj = F_score(precision_adj, recall_adj, beta=2)
47 |
48 | results_dict = {
49 | 'method': threshold_method,
50 | 'prec': precision,
51 | 'rec': recall,
52 | 'f1': f1,
53 | 'f2': f2,
54 | 'a_prec': precision_adj,
55 | 'a_rec': recall_adj,
56 | 'a_f1': f1_adj,
57 | 'a_f2': f2_adj,
58 | 'latency': latency
59 | }
60 | return results_dict
61 |
62 | def adjust_predicts(pred, label, calc_latency=False):
63 | """
64 | Calculate point-adjusted prediction labels from `pred` and `label`.
65 | Args:
66 | pred (np.ndarray or Tensor): Binary anomaly predictions; modified in place.
67 | label (np.ndarray or Tensor): The ground-truth anomaly labels.
68 | calc_latency (bool): If True, also return the average detection latency, i.e. the mean number
69 | of labelled anomalous points that precede the first detection in each detected segment.
70 | Returns:
71 | The adjusted predictions, followed by the average latency if `calc_latency` is True.
72 |
73 | Once any point of a labelled anomaly segment is detected, the whole segment counts as detected.
74 |
75 | Method from OmniAnomaly (https://github.com/NetManAIOps/OmniAnomaly)
76 | """
77 | anomaly_state = False
78 | anomaly_count = 0 # number of anomalies found
79 | latency = 0
80 |
81 | for i in range(len(pred)):
82 | if label[i] and pred[i] and not anomaly_state: # if correctly found anomaly
83 | anomaly_state = True
84 | anomaly_count += 1
85 | for j in range(i, 0, -1): # go backward until beginning of anomaly
86 | if not label[j]: # BEGINNING of anomaly
87 | break
88 | else:
89 | if not pred[j]: # set prediction to true
90 | pred[j] = True
91 | latency += 1
92 | elif not label[i]: # END of anomaly
93 | anomaly_state = False
94 | if anomaly_state: # still in anomaly and was already found
95 | pred[i] = True
96 | if calc_latency:
97 | return pred, latency / (anomaly_count + 1e-8)
98 | else:
99 | return pred
100 |
101 | def _max_thresholding(train_errors, test_errors):
102 | '''
103 | Returns anomaly predictions on test errors based on threshold
104 | calculated on the validation errors.
105 | Threshold is the largest validation error within the entire validation data.
106 | '''
107 |
108 | # set threshold as global max of validation errors
109 | threshold = train_errors.max().item()
110 |
111 | # set test scores as max error in one time tick
112 | score, _ = test_errors.max(dim=0)
113 |
114 | return score > threshold
115 |
116 | def _mean_thresholding(train_errors, test_errors):
117 | '''
118 | Returns anomaly predictions on test errors based on a threshold
119 | calculated on the validation errors.
120 | Threshold is the largest per-time-tick mean (across sensors) of the validation errors.
121 | '''
122 |
123 | # set threshold as max over time of the mean validation error across sensors
124 | threshold = train_errors.mean(dim=0).max().item()
125 |
126 | # set test scores as mean error across sensors in one time tick
127 | score = test_errors.mean(dim=0)
128 |
129 | return score > threshold
130 |
131 | def _best_thresholding(test_errors, test_labels):
132 | '''
133 | Returns anomaly predictions on test errors and the scoring method ('max' or 'mean')
134 | that give the best F1 score on the test labels.
135 | Searches 1000 evenly spaced thresholds between the smallest and largest test score.
136 |
137 | ONLY USE TO TEST THEORETICAL PERFORMANCE, NOT FOR REAL EVALUATION!
138 | '''
139 |
140 | # candidate scores per time tick: max and mean error across sensors
141 |
142 | max_score, _ = test_errors.max(dim=0)
143 | mean_score = test_errors.mean(dim=0)
144 | scores = {'max': max_score, 'mean': mean_score}
145 |
146 | best_f1 = 0
147 | best_method = None
148 | best_predictions = None
149 |
150 | lower_bound = min(max_score.min(), mean_score.min()).item()
151 | upper_bound = max(max_score.max(), mean_score.max()).item()
152 |
153 | thresholds = np.linspace(lower_bound, upper_bound, 1000)
154 | for threshold in thresholds:
155 | for method, score in scores.items():
156 | anomaly_predictions = score > threshold
157 | precision, recall = precision_recall(anomaly_predictions, test_labels)
158 | f1 = F_score(precision, recall, beta=1)
159 | if f1 > best_f1:
160 | best_f1 = f1
161 | best_method = method
162 | best_predictions = anomaly_predictions
163 |
164 | return best_predictions, best_method
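The point-adjustment step performed by `adjust_predicts` can be illustrated on plain Python lists. This is a simplified sketch, not the repository's function: the name `point_adjust` is hypothetical, it returns a new list instead of modifying `pred` in place, and it does not compute latency.

```python
def point_adjust(pred, label):
    """Mark a whole labelled anomaly segment as detected if any point in it was predicted."""
    pred = [bool(p) for p in pred]
    label = [bool(l) for l in label]
    adjusted = list(pred)
    n = len(label)
    i = 0
    while i < n:
        if label[i]:
            # Find the end (exclusive) of the current labelled segment.
            j = i
            while j < n and label[j]:
                j += 1
            # One hit anywhere in the segment counts for the entire segment.
            if any(pred[i:j]):
                adjusted[i:j] = [True] * (j - i)
            i = j
        else:
            i += 1
    return adjusted


label = [0, 1, 1, 1, 0, 0]
pred = [0, 0, 1, 0, 0, 0]
print(point_adjust(pred, label))
```

Predictions outside labelled segments are left untouched, so false positives still lower the adjusted precision, while a segment that is never hit stays entirely undetected.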
--------------------------------------------------------------------------------
/src/utils/metrics.py:
--------------------------------------------------------------------------------
1 | import torch
2 | from . import utils
3 |
4 | def precision_recall(pred, labels):
5 | '''
6 | Calculates precision and recall.
7 | Precision = (TP / (TP+FP)).
8 | Recall = (TP / (TP+FN)).
9 |
10 | Args:
11 | pred (Tensor): 1-dimensional tensor of predictions.
12 | labels (Tensor): 1-dimensional tensor of ground truth observations.
13 | '''
14 | pred, labels = utils.cast(torch.bool, pred, labels)
15 |
16 | # precision
17 | hits = labels[pred]
18 | precision = hits.sum() / pred.sum()
19 |
20 | # recall
21 | hits = pred[labels]
22 | recall = hits.sum() / labels.sum()
23 |
24 | return precision.item(), recall.item()
25 |
26 | def F_score(precision, recall, beta=1):
27 | '''
28 | Calculates F-scores.
29 |
30 | Args:
31 | precision (int, float): Precision score.
32 | recall (int, float): Recall score.
33 | beta (int, float, optional): Positive number.
34 | '''
35 | div = (beta**2 * precision) + recall
36 | if div > 0:
37 | return ((1 + beta**2) * (precision * recall)) / div
38 | else:
39 | return 0
40 |
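As a worked example of the formulas above, the following sketch computes precision, recall and F-beta on plain Python lists. It is an illustration only: the names `precision_recall_lists` and `f_beta` are hypothetical, and unlike the tensor version this sketch returns 0.0 when a denominator is zero.

```python
def precision_recall_lists(pred, labels):
    """Precision = TP / (TP + FP), recall = TP / (TP + FN) for binary sequences."""
    tp = sum(1 for p, l in zip(pred, labels) if p and l)
    fp = sum(1 for p, l in zip(pred, labels) if p and not l)
    fn = sum(1 for p, l in zip(pred, labels) if not p and l)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f_beta(precision, recall, beta=1.0):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    denom = beta ** 2 * precision + recall
    return (1 + beta ** 2) * precision * recall / denom if denom > 0 else 0.0


# One true positive, two false positives, one false negative.
p, r = precision_recall_lists([1, 1, 1, 0, 0], [1, 0, 0, 1, 0])
print(p, r, f_beta(p, r, beta=1), f_beta(p, r, beta=2))
```

Here precision is 1/3 and recall is 1/2, so F2 (5/11) is higher than F1 (0.4) because it rewards the comparatively better recall.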
--------------------------------------------------------------------------------
/src/utils/nn_trainer.py:
--------------------------------------------------------------------------------
1 | import torch
2 | from time import time
3 | from copy import deepcopy
4 |
5 | from .utils import format_time
6 |
7 |
8 | class Trainer:
9 | '''
10 | Class for model training, validation and testing.
11 |
12 | Args:
13 | model (nn.Module): PyTorch nn.Module instance that defines the neural network model.
14 | optimizer (torch.optim.Optimizer): PyTorch optimizer instance, e.g. Adam.
15 | criterion (func): Loss function that takes two arguments.
16 | '''
17 |
18 | def __init__(self, model, optimizer, criterion):
19 |
20 | self.model = model
21 | self.optimizer = optimizer
22 | self.criterion = criterion
23 |
24 | def _train_iteration(self, loader):
25 | '''
26 | Returns the average training loss of one iteration over the training dataloader.
27 | '''
28 | self.model.train()
29 |
30 | avg_pred_loss = 0
31 | for i, window in enumerate(loader):
32 | self.optimizer.zero_grad()
33 |
34 | x = window.x
35 | y = window.y
36 |
37 | # forward step
38 | pred = self.model(x)
39 |
40 | assert pred.shape == y.shape
41 |
42 | loss = self.criterion(pred, y)
43 |
44 | # backward step
45 | loss.backward()
46 | torch.nn.utils.clip_grad_norm_(self.model.parameters(), max_norm=2.0, norm_type=2)
47 | self.optimizer.step()
48 |
49 | avg_pred_loss += loss.item()
50 | avg_pred_loss /= i+1
51 |
52 | return avg_pred_loss
53 |
54 | def test(self, loader, return_errors=True):
55 | '''
56 | Returns the average loss over the test data.
57 | Optionally also returns the squared prediction errors of the last prediction step
58 | and the corresponding anomaly labels (None for validation data).
59 | '''
60 | self.model.eval()
61 |
62 | avg_pred_loss = 0
63 | pred_errors = []
64 | y_labels = []
65 | with torch.no_grad():
66 | for i, window in enumerate(loader):
67 | x = window.x
68 | y = window.y
69 | batch_size = len(window.ptr) - 1
70 |
71 | pred = self.model(x)
72 |
73 | assert pred.shape == y.shape
74 |
75 | pred_loss = self.criterion(pred, y)
76 |
77 | if return_errors:
78 | y_label = window.y_label
79 | if y_label is not None:
80 | y_labels.append(y_label[::pred.size(1)])
81 | else:
82 | y_labels.append(y_label) # NoneType labels for validation data
83 |
84 | pred_error = ((pred[:, -1] - y[:, -1]) ** 2).detach()
85 | pred_errors.append(pred_error.T.view(batch_size, -1).T)
86 |
87 | avg_pred_loss += pred_loss.item()
88 |
89 | avg_pred_loss /= i+1
90 |
91 | # results to be returned
92 | re = []
93 | if return_errors:
94 | pred_errors = torch.cat(pred_errors, dim=1)
95 |
96 | if isinstance(y_labels[0], torch.Tensor):
97 | anomaly_labels = torch.cat(y_labels)
98 | else: # during validation
99 | anomaly_labels = None
100 |
101 | re.append([pred_errors, anomaly_labels])
102 |
103 | re.append(avg_pred_loss)
104 |
105 | if len(re) == 1:
106 | return re.pop()
107 | else:
108 | return tuple(re)
109 |
110 | def train(self, train_loader, val_loader=None, epochs=10, early_stopping=10, return_model_state=False, return_val_results=False, verbose=True):
111 | '''
112 | Main function of the Trainer class. Handles the training procedure,
113 | including the training and validation steps for each epoch and early stopping.
114 | The best model state (lowest validation loss) is tracked across epochs.
115 |
116 | Args:
117 | train_loader (iterable): Dataloader holding the (batched) training samples.
118 | val_loader (iterable, optional): Dataloader holding the (batched) validation samples.
119 | epochs (int, optional): Number of epochs for training.
120 | early_stopping (int, optional): Number of epochs without improvement on the validation data until training is stopped.
121 | return_model_state (bool, optional): If true, returns the model state dict.
122 | return_val_results (bool, optional): If true, returns the prediction errors and anomaly labels of the best validation epoch.
123 | verbose (bool, optional): If true, prints updates on training and validation loss each epoch.
124 | '''
125 |
126 | train_loss_history = []
127 | val_loss_history = []
128 | early_stopping_counter = 0
129 | early_stopping_point = early_stopping
130 | best_train_loss = float('inf')
131 | best_val_loss = float('inf')
132 | val_results = None # dummy variable for optional return values
133 | indicator = ''
134 | for i in range(epochs):
135 | start = time()
136 | # train
137 | train_loss = self._train_iteration(train_loader)
138 | train_loss_history.append(train_loss)
139 |
140 | if val_loader is not None:
141 | # validate if validation loader is provided
142 | if return_val_results:
143 | val_results, val_loss = self.test(val_loader, return_errors=return_val_results)
144 | else:
145 | val_loss = self.test(val_loader, return_errors=return_val_results)
146 | val_loss_history.append(val_loss)
147 | else:
148 | # use training loss for early stopping if no validation data
149 | val_loss = train_loss
150 |
151 | # check for early stopping
152 | if val_loss < best_val_loss:
153 | best_val_loss = val_loss
154 | best_val_results = val_results
155 | best_train_loss = train_loss
156 | best_model_state = deepcopy(self.model.state_dict())
157 | early_stopping_counter = 0
158 | indicator = '*'
159 | else:
160 | early_stopping_counter += 1
161 | indicator = ''
162 |
163 | if verbose:
164 | # print loss of epoch
165 | time_elapsed = format_time(time() - start)
166 | train_print_string = f'Train Loss: {train_loss:>9.5f}'
167 | val_print_string = f' || Validation Loss: {val_loss:>9.5f}' if val_loader is not None else ''
168 | print(f' Epoch {i+1:>2}/{epochs} ({time_elapsed}/it) -- ({train_print_string}{val_print_string}) {indicator}')
169 |
170 | # stop training if early stopping criterion is fulfilled
171 | if early_stopping_counter == early_stopping_point and not epochs == i+1:
172 | if verbose:
173 | print(f' ...Stopping early after {i+1} epochs...')
174 | break
175 | # end of epoch
176 | # end of training loop
177 |
178 | if verbose:
179 | # print loss after training
180 | print(' Training Results:')
181 | print(f' Train MSE: {best_train_loss:.5f}')
182 | if val_loader is not None:
183 | print(f' Validation MSE: {best_val_loss:.5f}\n')
184 |
185 | # return values: loss for each epoch, validation results (optional), model_state_dict (optional)
186 | if val_loader is None:
187 | re = [train_loss_history, None]
188 | best_val_results = None
189 | else:
190 | re = [train_loss_history, val_loss_history]
191 | self.model.load_state_dict(best_model_state)
192 | if return_model_state:
193 | re.append(best_model_state)
194 | if return_val_results:
195 | re.append(best_val_results)
196 |
197 | if len(re) == 1:
198 | return re.pop()
199 | else:
200 | return tuple(re)
201 |
202 |
203 |
--------------------------------------------------------------------------------
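The early stopping rule in `train` can be sketched in isolation as follows. This is a minimal illustration, not part of the repository: `run_with_early_stopping` is a hypothetical helper that replays a list of per-epoch losses and mirrors only the counter logic (reset on improvement, stop once `patience` epochs pass without one).

```python
def run_with_early_stopping(losses, patience):
    """Replay per-epoch losses and report where training would stop.

    Hypothetical helper for illustration only; it mirrors the counter
    logic of `train` but does not train anything.
    Returns (best_epoch, best_loss, epochs_run), epochs counted from 1.
    """
    best_loss = float('inf')
    best_epoch = None
    counter = 0
    epochs_run = 0
    for epoch, loss in enumerate(losses, start=1):
        epochs_run = epoch
        if loss < best_loss:
            # improvement: remember this epoch and reset the counter
            best_loss, best_epoch = loss, epoch
            counter = 0
        else:
            counter += 1
        if counter == patience:
            # no improvement for `patience` consecutive epochs
            break
    return best_epoch, best_loss, epochs_run


if __name__ == '__main__':
    print(run_with_early_stopping([0.9, 0.5, 0.6, 0.55, 0.7, 0.4], patience=3))
```

With a patience of 3 the run above stops after epoch 5 and never sees the lower loss in epoch 6, which is the usual trade-off of a small patience value.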
/src/utils/utils.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn.functional as F
3 | from .device import get_device
4 |
5 | def normalize_with_median_iqr(x):
6 | '''
7 |     Row normalization with median and interquartile range for 2d tensors.
8 |
9 | Args:
10 | x (Tensor): 2-dimensional input tensor.
11 | '''
12 | assert isinstance(x, torch.Tensor)
13 |
14 | device = get_device()
15 |
16 | quantiles = torch.tensor([.25, .5, .75]).to(device)
17 | q1, median, q3, = torch.quantile(x, quantiles, dim=1)
18 | iqr = q3 - q1
19 |
20 | return (x - median.unsqueeze(0).T) / (1 + iqr.unsqueeze(0).T)
21 |
22 | def weighted_average_smoothing(x, k, mode='mean'):
23 | '''
24 |     Average (weighted) smoothing of rows of a 2d tensor with 1d kernel, padding='same'.
25 |
26 | Args:
27 | x (Tensor): 2-dimensional input tensor.
28 | k (int): Size of the smoothing kernel.
29 | mode (str): Weighting of the average. Can be:
30 | 'mean' : no weighting
31 |             'exp'  : exponentially increasing weights towards the right side of a row
32 |
33 | '''
34 | assert isinstance(x, torch.Tensor)
35 |
36 | device = get_device()
37 |
38 | n = x.size(0)
39 | div, mod = divmod(k, 2)
40 | p1d = (div, div - (mod ^ 1)) # padding size
41 | x = torch.constant_pad_nd(x, p1d, value=0.0)
42 | x = x.view(n, 1, -1)
43 |
44 | if mode == 'mean':
45 | kernel = torch.full(size=(1,1,k), fill_value=1/k, requires_grad=False)
46 | elif mode == 'exp':
47 | kernel = torch.logspace(-k+1, 0, k, base=1.5, requires_grad=False)
48 | kernel /= kernel.sum()
49 | kernel = kernel.view(1,1,k)
50 |
51 | return F.conv1d(x, kernel.to(device)).squeeze()
52 |
53 | def cast(dtype, *args):
54 | '''
55 | Casts arbitrary number of tensors to specified type.
56 |
57 | Args:
58 | dtype (type or string): The desired type.
59 | *args: Tensors to be type-cast.
60 |
61 | '''
62 | a = [x.type(dtype) for x in args]
63 | if len(a) == 1:
64 | return a.pop()
65 | else:
66 | return a
67 |
68 | def equalize_len(t1, t2, value=0):
69 | '''
70 | Returns new tensors with equal length according to max(len(t1), len(t2)).
71 |
72 | Args:
73 | t1 (Tensor): Input tensor
74 | t2 (Tensor): Input tensor
75 | value (int, float, optional): Fill value for new entries in shorter tensor.
76 | '''
77 |
78 | assert isinstance(t1, torch.Tensor)
79 | assert isinstance(t2, torch.Tensor)
80 |
81 | if len(t1) == len(t2):
82 | return t1, t2
83 |
84 | diff = abs(len(t2) - len(t1))
85 | p1d = (0, diff)
86 | if len(t1) > len(t2):
87 | t2 = F.pad(t2, p1d, 'constant', value)
88 | return t1, t2
89 | else:
90 | t1 = F.pad(t1, p1d, 'constant', value)
91 | return t1, t2
92 |
93 | def format_time(t):
94 | '''
95 | Format seconds to days, hours, minutes, and seconds.
96 | -> Output format example: "01d-09h-24m-54s"
97 |
98 | Args:
99 | t (float, int): Time in seconds.
100 | '''
101 | assert isinstance(t, (float, int))
102 |
103 | h, r = divmod(t,3600)
104 | d, h = divmod(h, 24)
105 | m, r = divmod(r, 60)
106 | s, r = divmod(r, 1)
107 |
108 | values = [d, h, m, s]
109 | symbols = ['d', 'h', 'm', 's']
110 | for i, val in enumerate(values):
111 | if val > 0:
112 | symbols[i] = ''.join([f'{int(val):02d}', symbols[i]])
113 | else:
114 | symbols[i] = ''
115 | return '-'.join(s for s in symbols if s) if any(symbols) else '<1s'
--------------------------------------------------------------------------------
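The padding arithmetic and the two kernels used by `weighted_average_smoothing` can be checked without torch. The sketch below is a plain-Python approximation for a single row, assuming zero padding and a sliding dot product (cross-correlation) as in `F.conv1d`; the helper names `same_padding`, `exp_kernel` and `smooth_row` are made up for this illustration.

```python
def same_padding(k):
    """Left/right zero padding so that a size-k kernel keeps the row length."""
    div, mod = divmod(k, 2)
    # even k: one element less on the right, so left + right == k - 1
    return div, div - (mod ^ 1)


def exp_kernel(k, base=1.5):
    """Normalized weights base**(-k+1), ..., base**0 (largest weight last)."""
    weights = [base ** e for e in range(-k + 1, 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_row(row, k, mode='mean'):
    """Smooth one row with a size-k kernel and zero padding."""
    left, right = same_padding(k)
    padded = [0.0] * left + list(row) + [0.0] * right
    kernel = [1.0 / k] * k if mode == 'mean' else exp_kernel(k)
    # sliding dot product: kernel[0] meets the leftmost element of each window
    return [sum(w * v for w, v in zip(kernel, padded[i:i + k]))
            for i in range(len(row))]


if __name__ == '__main__':
    print(same_padding(4), same_padding(5))
    print(smooth_row([3, 3, 3, 3, 3], k=3))
```

Because the padding value is zero, the smoothed values at both ends of a row are pulled towards zero; this is visible in the first and last entry of the printed result.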
/src/visualization/error_distribution.py:
--------------------------------------------------------------------------------
1 | ### TODO:NEEDS REWORK !!!
2 |
3 |
4 | import numpy as np
5 | import torch
6 | import seaborn as sns
7 | import pandas as pd
8 | from ..utils.utils import normalize_with_median_iqr
9 |
10 | def get_error_distribution_plot(results_dict):
11 | '''
12 | Returns a plot of the error distribution derived from predictions
13 | and groundtruth values.
14 |
15 | Args:
16 | results_dict (dict): Dictionary of test results.
17 | '''
18 |
19 | errors = []
20 | for key, value in results_dict.items():
21 | y_pred, y, _ = value
22 |         err = torch.abs(y_pred - y)
23 |         err = normalize_with_median_iqr(err).flatten().cpu().numpy()  # normalize the 2d tensor first, then flatten to 1d for pd.Series
24 | s = pd.Series(err, index=[key]*len(err))
25 | errors.append(s)
26 | if key == 'Validation':
27 | threshold = err.max()
28 |
29 |     errors = pd.concat(errors)
30 | errors = errors.apply(lambda x: np.nan if x < threshold*0.75 else x)
31 | df = pd.DataFrame({'normalized_error': errors})
32 | df = df.dropna()
33 | df.index.name = 'Mode'
34 | df = df.reset_index()
35 |
36 | error_plot = sns.displot(df, x="normalized_error", hue="Mode", kind="kde", fill=True)
37 | return error_plot
38 |
39 |
--------------------------------------------------------------------------------
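The normalization and thresholding applied to the errors can be illustrated on plain lists. This is a sketch under assumptions: quantiles are computed with linear interpolation between neighbouring order statistics, and the helper names `quantile`, `normalize_row` and `keep_large` are hypothetical. It shows why median and IQR are used: a single outlier barely changes the scaling of the other values.

```python
def quantile(sorted_vals, q):
    """Quantile of an ascending list, with linear interpolation."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])


def normalize_row(row):
    """(x - median) / (1 + IQR) for one row, as in normalize_with_median_iqr."""
    s = sorted(row)
    q1, median, q3 = (quantile(s, q) for q in (0.25, 0.5, 0.75))
    return [(v - median) / (1 + (q3 - q1)) for v in row]


def keep_large(errors, threshold, factor=0.75):
    """Keep only errors of at least factor * threshold (the rest is dropped)."""
    return [e for e in errors if e >= threshold * factor]


if __name__ == '__main__':
    print(normalize_row([1, 2, 3, 4, 5]))
    print(normalize_row([1, 2, 3, 4, 100]))  # outlier leaves median and IQR unchanged
    print(keep_large([0.1, 0.8, 1.2], threshold=1.0))
```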
/src/visualization/graph_plot.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import networkx as nx
3 | from sklearn.manifold import TSNE
4 | from pyvis.network import Network
5 | import matplotlib.pyplot as plt
6 |
7 | def plot_embedding(edge_indices, W, labels, path=None, notebook=False):
8 | '''
9 | Creates a plot of the given graph. Layout is determined by t-SNE dimensionality reduction to 2d.
10 | Saves a plot of the 2d t-SNE of the embedding space.
11 | Can directly display plot if used within notebook.
12 | '''
13 |
14 | assert isinstance(edge_indices, torch.Tensor)
15 | assert isinstance(W, torch.Tensor)
16 |
17 | edge_indices = edge_indices.cpu()
18 | W = W.detach().cpu()
19 |
20 |     num_nodes = int(edge_indices.max()) + 1
21 |
22 | # compute list of edge index pairs from sparse adj matrix shape [2, num_edges]
23 | edge_list = zip(*edge_indices.detach().tolist())
24 | edge_list = [(a,b) for a,b in edge_list if not a == b] # remove self loops
25 |
26 | # generate graph from edge list
27 | G = nx.from_edgelist(edge_list)
28 |
29 | ### PARAMETERS FOR DRAWING
30 | # node ids
31 | node_keys = range(num_nodes)
32 | def node_dict(x): return dict(zip(node_keys, x))
33 |
34 | # embedding mapping from Nd space to 2d
35 | W2D = TSNE(n_components=2).fit_transform(W)
36 |
37 | xs, ys = W2D.T.tolist()
38 |
39 | # node coordinates
40 | x_map = node_dict(xs)
41 | y_map = node_dict(ys)
42 |
43 | # node labels
44 | node_labels = node_dict(labels)
45 |
46 | # node sizes
47 | sizes = [12] * num_nodes
48 | size_map = node_dict(sizes)
49 |
50 | # node colours
51 | string_split = [string.split('_') for string in labels]
52 | if len(string_split[0]) == 1: # swat
53 | sensor_types = [str(*string)[:-3] for string in string_split]
54 | else: # wadi
55 | sensor_types = ['_'.join(string[:2]) for string in string_split]
56 | sensor_set = set(sensor_types)
57 | mapping = dict(zip(sensor_set, range(len(sensor_set))))
58 | node_color_map = dict(zip(node_keys, [mapping[key] for key in sensor_types]))
59 |
60 | nx.set_node_attributes(G, node_labels, 'label')
61 | nx.set_node_attributes(G, node_color_map, 'group')
62 | nx.set_node_attributes(G, size_map, 'size')
63 | nx.set_node_attributes(G, x_map, 'x')
64 | nx.set_node_attributes(G, y_map, 'y')
65 |
66 | # pyvis network from networkx graph
67 | net = Network('1000px', '100%', bgcolor='#222222', font_color='white', notebook=notebook)
68 | # net = Network('1000px', '100%', bgcolor='#ffffff', font_color='black', notebook=notebook)
69 | net.from_nx(G)
70 | # gravity model for plot layout
71 | net.force_atlas_2based(gravity=-30, central_gravity=0.1, spring_length=50, spring_strength=0.001, damping=0.09, overlap=0.1)
72 | net.show_buttons(filter_=['physics'])
73 | if path is not None:
74 | net.save_graph(path)
75 | if notebook:
76 | net.show('graph.html')
77 |
78 | # plot of t-SNE
79 | _, axes = plt.subplots(1,2)
80 | for ax in axes:
81 | ax.scatter(xs, ys, c=list(node_color_map.values()), alpha=0.7)
82 | for i, label in enumerate(labels):
83 | axes[1].annotate(label, (xs[i], ys[i]))
84 |     if path is not None:
85 |         plt.savefig(path.rsplit('.', 1)[0] + '_tSNE.png')
86 |
87 |
88 | def plot_adjacency(A, labels, path=None, notebook=False):
89 | '''
90 |     Creates an interactive plot of the graph given by the adjacency matrix A.
91 |     Layout is determined by a force-directed (ForceAtlas2-based) physics model.
92 |     Saves the plot as HTML if a path is given; can directly display it if used within notebook.
93 | '''
94 | assert isinstance(A, torch.Tensor)
95 |
96 |     A = A.detach().clone()  # copy, so that the caller's tensor is not modified in place
97 | A.fill_diagonal_(0)
98 | A = A.cpu().numpy()
99 |
100 | num_nodes = A.shape[0]
101 |
102 | # generate graph from adjacency matrix
103 | directed = (A != A.T).any()
104 | if directed:
105 |         G = nx.from_numpy_array(A, create_using=nx.DiGraph)
106 |     else:
107 |         G = nx.from_numpy_array(A)
108 |
109 | ### PARAMETERS FOR DRAWING
110 | # node ids
111 | node_keys = range(num_nodes)
112 | def node_dict(x): return dict(zip(node_keys, x))
113 |
114 | # node labels
115 | node_labels = node_dict(labels)
116 |
117 | # node sizes
118 | sizes = [12] * num_nodes
119 | size_map = node_dict(sizes)
120 |
121 | # node colours
122 | string_split = [string.split('_') for string in labels]
123 | if len(string_split[0]) == 1: # swat
124 | sensor_types = [str(*string)[:-3] for string in string_split]
125 | else: # wadi
126 | sensor_types = ['_'.join(string[:2]) for string in string_split]
127 | sensor_set = set(sensor_types)
128 | mapping = dict(zip(sensor_set, range(len(sensor_set))))
129 | node_color_map = dict(zip(node_keys, [mapping[key] for key in sensor_types]))
130 |
131 | nx.set_node_attributes(G, node_labels, 'label')
132 | nx.set_node_attributes(G, node_color_map, 'group')
133 | nx.set_node_attributes(G, size_map, 'size')
134 |
135 | # pyvis network from networkx graph
136 | directed = (A != A.T).any()
137 | net = Network('1000px', '100%', directed=directed, bgcolor='#222222', font_color='white', notebook=notebook)
138 | # net = Network('1000px', '100%', directed=directed, bgcolor='#ffffff', font_color='black', notebook=notebook)
139 | net.from_nx(G)
140 | # gravity model for plot layout
141 | net.force_atlas_2based(gravity=-30, central_gravity=0.1, spring_length=50, spring_strength=0.001, damping=0.09, overlap=0.1)
142 | net.show_buttons(filter_=['physics'])
143 | if path is not None:
144 | net.save_graph(path)
145 | if notebook:
146 | net.show('graph.html')
--------------------------------------------------------------------------------
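Both plotting functions colour nodes by a sensor type derived from the node label. The sketch below isolates that step. It is an illustration, not repository code: `sensor_groups` is a hypothetical helper, the example labels are only meant to resemble SWaT-style and WADI-style names, and the groups are numbered in sorted order here (the original iterates over a `set`, so its group numbers can differ between runs).

```python
def sensor_groups(labels):
    """Map each sensor label to an integer group id based on its type prefix."""
    parts = [label.split('_') for label in labels]
    if len(parts[0]) == 1:
        # no underscore (SWaT-style): drop the trailing three characters
        types = [p[0][:-3] for p in parts]
    else:
        # underscore-separated (WADI-style): the first two fields form the type
        types = ['_'.join(p[:2]) for p in parts]
    # sorted() makes the numbering deterministic (an assumption of this sketch)
    mapping = {t: i for i, t in enumerate(sorted(set(types)))}
    return [mapping[t] for t in types]


if __name__ == '__main__':
    print(sensor_groups(['FIT101', 'LIT101', 'FIT201', 'MV101']))
    print(sensor_groups(['1_AIT_001_PV', '1_AIT_002_PV', '2_FIT_001_PV']))
```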
/src/visualization/loss_plot.py:
--------------------------------------------------------------------------------
1 | import matplotlib.pyplot as plt
2 |
3 | def get_loss_plot(train_loss_history, val_loss_history):
4 | '''
5 | Returns a pyplot figure object with the plot of the
6 | train and validation loss lists obtained in training.
7 | '''
8 |
9 | colors = ['#2300a8', '#8400a8'] # '#8400a8', '#00A658'
10 | plot_dict = {'Training': (train_loss_history, colors[0]), 'Validation': (val_loss_history, colors[1])}
11 |
12 | n = len(train_loss_history)
13 |
14 | # plot train and val losses and fill area under the curve
15 | fig, ax = plt.subplots()
16 | x_axis = list(range(1, n+1))
17 | for key, (data, color) in plot_dict.items():
18 | if data is not None:
19 | ax.plot(x_axis, data,
20 | label=key,
21 | linewidth=2,
22 | linestyle='-',
23 | marker='o',
24 | alpha=1,
25 | color=color)
26 | ax.fill_between(x_axis, data,
27 | alpha=0.3,
28 | color=color)
29 |
30 | # x axis ticks
31 | n_x_ticks = 10
32 | k = max(1, n // n_x_ticks)
33 | x_ticks = list(range(1, n+1, k))
34 | ax.set_xticks(x_ticks)
35 |
36 | # figure labels
37 | ax.set_title('Loss over time', fontweight='bold')
38 | ax.set_xlabel('Epochs', fontweight='bold')
39 | ax.set_ylabel('Mean Squared Error', fontweight='bold')
40 | ax.legend(loc='upper right')
41 |
42 | # remove top and right borders
43 | ax.spines['top'].set_visible(False)
44 | ax.spines['right'].set_visible(False)
45 |
46 | # adds major gridlines
47 | ax.grid(color='grey', linestyle='-', linewidth=0.35, alpha=0.8)
48 |
49 | # log scale of y-axis
50 | ax.set_yscale('log')
51 |
52 | return fig
--------------------------------------------------------------------------------
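The x-axis tick spacing in `get_loss_plot` can be reproduced without matplotlib. A small sketch, with `epoch_ticks` as a hypothetical helper name: because the step is an integer division, `n_x_ticks` is only an approximate target and the actual number of ticks can be larger.

```python
def epoch_ticks(n_epochs, n_x_ticks=10):
    """Tick positions for the epoch axis, roughly n_x_ticks of them."""
    step = max(1, n_epochs // n_x_ticks)
    return list(range(1, n_epochs + 1, step))


if __name__ == '__main__':
    for n in (5, 25, 100):
        ticks = epoch_ticks(n)
        print(n, len(ticks), ticks)
```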
/src/visualization/tensorboard.py:
--------------------------------------------------------------------------------
1 |
2 | # from torch.utils.tensorboard import SummaryWriter
3 |
4 | # stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
5 | # logdir = 'logs/%s' % stamp
6 |
7 | # writer = SummaryWriter(logdir)
8 | # writer.add_graph(model, [x[:,:window], *[torch.tensor(np.nan)] * 2])
9 | # writer.close()
10 |
--------------------------------------------------------------------------------