├── LICENSE ├── README.md ├── checkpoint ├── fpdm_cnn.h5 ├── fpdm_cnn_lstm.h5 ├── fpdm_cnn_transformer.h5 ├── fpdm_transformer.h5 ├── fpdm_xtm.h5 ├── images │ ├── barplot.jpg │ └── roc_curve.jpg └── ldm_xtm.h5 ├── main.py └── scripts ├── Data.py ├── FPDM.py ├── FPDM_Models.py ├── LDM_Data.py ├── LDM_Model.py ├── Utility_Functions.py └── __pycache__ ├── Data.cpython-39.pyc ├── FPDM.cpython-39.pyc ├── FPDM_Models.cpython-39.pyc ├── LDM_Data.cpython-39.pyc ├── LDM_Model.cpython-39.pyc └── Utility_Functions.cpython-39.pyc /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 Gobinda Chandra Sarker 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # XTM 2 |
This repository accompanies the paper titled,
3 | 4 | > ***XTM: A Novel Transformer and LSTM-Based Model for Detection and Localization of Formally Verified FDI Attack in Smart Grid***. 5 | 6 |False data injection (FDI) attacks on smart grids can have a catastrophic impact on energy management and distribution. Here, a novel hybrid model combining the state-of-the-art transformer and LSTM is developed to detect both the presence of FDI and the location of the attack in the smart grid.
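For orientation, the following is a condensed sketch of the XTM forecasting backbone implemented in *scripts/FPDM_Models.py*: a transformer encoder block followed by stacked LSTMs, with one regression head per sensor measurement. Hyperparameter values mirror the repository defaults; dropout is omitted for brevity.

```python
from tensorflow import keras
from tensorflow.keras import layers

def xtm_sketch(lookback=48, n_sensors=54, head_size=9, num_heads=6, ff_dim=128):
    inputs = keras.Input(shape=(lookback, n_sensors))
    # Transformer encoder: self-attention over the lookback window,
    # then a feed-forward block, each with a residual connection
    x = layers.MultiHeadAttention(key_dim=head_size, num_heads=num_heads)(inputs, inputs)
    x = layers.LayerNormalization(epsilon=1e-6)(x)
    res = x + inputs
    x = layers.Dense(ff_dim, activation="relu")(res)
    x = layers.Dense(n_sensors, activation="relu")(x)
    x = layers.LayerNormalization(epsilon=1e-6)(x) + res
    # LSTM stack summarizes the attended sequence into a single state
    x = layers.LSTM(128, activation="tanh", return_sequences=True)(x)
    x = layers.LSTM(128, activation="tanh")(x)
    x = layers.Dense(128, activation="relu")(x)
    # One forecast head per sensor measurement, as in the repository
    outputs = [layers.Dense(1)(x) for _ in range(n_sensors)]
    return keras.Model(inputs, outputs)
```

The per-sensor heads give an independent one-step-ahead forecast for each of the 54 measurements; the detection module then flags FDI when the norm of the forecast error exceeds a threshold.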
7 | 8 | 9 | ## Dataset: 10 | *** 11 |The initial dataset consists of one year of hourly historical sensor measurements. We tested our model on the IEEE-14 bus system, so we use 54 sensor measurements. We generated the attack vectors in accordance with this article. We then extended the hourly dataset to a minutely dataset. The hourly data, the minutely data, and the generated attack vectors can all be accessed from Google Drive.
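To make the attack-vector format concrete, here is a minimal illustration (the values are made up) of the injection logic used later in *scripts/Data.py*: each attack vector has one entry per sensor measurement, nonzero entries are added to the corresponding readings, and the nonzero positions define the location labels.

```python
import numpy as np

n_sensors = 54
benign = np.random.rand(8, n_sensors)            # 8 timesteps of sensor measurements
atk_vector = np.zeros(n_sensors)
atk_vector[[3, 17, 40]] = [0.2, -0.15, 0.3]      # hypothetical attacked sensors

injected = benign + atk_vector                   # vector broadcasts over all timesteps
location_labels = (atk_vector != 0).astype(int)  # 1 where a sensor is attacked
```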
12 | 13 | ## Setting Up 14 | *** 15 | The model was developed in the following environment: 16 | - python 3.8 17 | - tensorflow 2.7 18 | 19 | Use the following steps to run the model; this README will be updated as the project evolves. 20 | - clone the repository onto your local machine. 21 | - download the data files from the Drive link above into the *dataset* directory. 22 | - run *main.py* (a condensed programmatic equivalent is sketched after the citation below). 23 | 24 | ## Citation 25 | 26 |Baul, A.; Sarker, G.C.; Sadhu, P.K.; Yanambaka, V.P.; Abdelgawad, A. XTM: A Novel Transformer and LSTM-Based Model for Detection and Localization of Formally Verified FDI Attack in Smart Grid. Electronics 2023, 12, 797. https://doi.org/10.3390/electronics12040797
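For reference, the steps above boil down to roughly the following, condensed from *main.py* (paths assume the repository layout; with the training flags off, the pretrained checkpoints under *checkpoint/* are loaded):

```python
import os
from sklearn.preprocessing import MinMaxScaler
from scripts.Data import Data
from scripts.FPDM import FPDM
from scripts.FPDM_Models import FPDM_Models
from scripts.LDM_Data import LDM_Data

# Windows-style paths, as used throughout the repository
data = Data(
    data_path=os.path.join(os.getcwd(), "dataset\\hourly_dataset.csv"),
    atk_vector_path=os.path.join(os.getcwd(), "dataset\\fdi_attack_vector.xlsx"),
    scaler=MinMaxScaler((0, 1)),
)
# Restore the pretrained XTM forecaster and build the test-set predictions
fpdm_model = FPDM_Models(input_shape=(48, 54), algorithm="xtm", training=False)
ldm_data = LDM_Data(data, fpdm_model.model, algorithm="xtm", training=False, load_data_=False)

# Detection: threshold the norm of the forecast error
fpdm = FPDM(data, fpdm_model, 48, threshold=0.4)
prf, cm = fpdm.get_prf(ldm_data.test_set, ldm_data.test_predictions, threshold=0.4)
print("precision = {}, recall = {}, f1-score = {}".format(*prf))
```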
27 | 28 | 29 | 30 | -------------------------------------------------------------------------------- /checkpoint/fpdm_cnn.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gcsarker/XTM/72f77e702378f906caaedbb07e1481ed1e205a77/checkpoint/fpdm_cnn.h5 -------------------------------------------------------------------------------- /checkpoint/fpdm_cnn_lstm.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gcsarker/XTM/72f77e702378f906caaedbb07e1481ed1e205a77/checkpoint/fpdm_cnn_lstm.h5 -------------------------------------------------------------------------------- /checkpoint/fpdm_cnn_transformer.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gcsarker/XTM/72f77e702378f906caaedbb07e1481ed1e205a77/checkpoint/fpdm_cnn_transformer.h5 -------------------------------------------------------------------------------- /checkpoint/fpdm_transformer.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gcsarker/XTM/72f77e702378f906caaedbb07e1481ed1e205a77/checkpoint/fpdm_transformer.h5 -------------------------------------------------------------------------------- /checkpoint/fpdm_xtm.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gcsarker/XTM/72f77e702378f906caaedbb07e1481ed1e205a77/checkpoint/fpdm_xtm.h5 -------------------------------------------------------------------------------- /checkpoint/images/barplot.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gcsarker/XTM/72f77e702378f906caaedbb07e1481ed1e205a77/checkpoint/images/barplot.jpg -------------------------------------------------------------------------------- /checkpoint/images/roc_curve.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gcsarker/XTM/72f77e702378f906caaedbb07e1481ed1e205a77/checkpoint/images/roc_curve.jpg -------------------------------------------------------------------------------- /checkpoint/ldm_xtm.h5: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gcsarker/XTM/72f77e702378f906caaedbb07e1481ed1e205a77/checkpoint/ldm_xtm.h5 -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | import os 2 | from sklearn.preprocessing import MinMaxScaler 3 | 4 | from scripts.Data import Data 5 | from scripts.FPDM import FPDM 6 | from scripts.FPDM_Models import FPDM_Models 7 | from scripts.LDM_Data import LDM_Data 8 | from scripts.LDM_Model import LDM_Model 9 | from scripts.Utility_Functions import utility_functions 10 | 11 | data_path = os.path.join(os.getcwd(), "dataset\\hourly_dataset.csv") 12 | atk_vector_path = os.path.join(os.getcwd(), "dataset\\fdi_attack_vector.xlsx") 13 | mm_scaler = MinMaxScaler((0,1)) 14 | lookback = 48 # look back over the past 48 datapoints 15 | delay = 1 # forecast 1 datapoint into the future 16 | batch_size=32 # batch size for training the FPDM model 17 | steps=1 # sampling stride within the lookback window 18 | 19 | n_sensor_measurement = 54 # number of sensor measurements 20 | fpdm_input_shape = (lookback,n_sensor_measurement) # input shape for the FDI presence detection module 21 | fpdm_training = False # 
To specify whether to train the FPDM 22 | algorithm = 'xtm' # Choose which algorithm to use for the FPDM 23 | checkpoint_path = "checkpoint\\" 24 | 25 | ldm_training = False # To specify whether to train the LDM 26 | load_data_ = False # To specify whether to load the prepared randomly injected data for LDM training from memory 27 | 28 | data = Data(data_path = data_path, atk_vector_path = atk_vector_path, scaler = mm_scaler) 29 | fpdm_model = FPDM_Models(input_shape = fpdm_input_shape, algorithm = algorithm, training = fpdm_training) 30 | 31 | if fpdm_training: 32 | steps_per_epoch = (data.len_train_set - lookback)//batch_size 33 | validation_steps = (data.len_val_set - lookback)//batch_size 34 | fpdm_model.train(data.train_gen, data.val_gen, steps_per_epoch, validation_steps, epochs = 50) 35 | 36 | ldm_data = LDM_Data(data, fpdm_model.model, algorithm = algorithm, training = ldm_training, load_data_ = load_data_) 37 | 38 | # FDI presence detection module 39 | fpdm = FPDM(data, fpdm_model, 48, threshold = 0.4) 40 | 41 | print('\nplotting histogram showing how the FPDM can detect attack vector 5 from our attack vector dataset') 42 | fpdm.barplot_for_specific_fdi(real_data = ldm_data.test_set, predicted_data = ldm_data.test_predictions, atk_vector_indx = 5) 43 | 44 | print('\ncalculating MAE, MSE and RMSE scores of the FPDM forecasting model on test data') 45 | fpdm.get_forecasting_errors(ldm_data.test_set,ldm_data.test_predictions) 46 | 47 | prf, cm = fpdm.get_prf(ldm_data.test_set, ldm_data.test_predictions, threshold = 0.4) 48 | print('\nCalculating precision, recall and f1-score of the FPDM on test data') 49 | print('precision = {}, recall = {}, f1-score = {}'.format(prf[0], prf[1], prf[2])) 50 | 51 | #ldm_data = LDM_Data(data, fpdm_model.model, algorithm = algorithm, training = ldm_training) 52 | ldm_model = LDM_Model(ldm_data = ldm_data, algorithm = algorithm, training= ldm_training) 53 | 54 | if ldm_training: 55 | ldm_data.generate_randomly_injected_fdi_data() 56 | ldm_model.train() 57 | else: 58 | # for injecting only the test set 59 | injected_test_data, fdi_location_test = ldm_data.inject_fdi(ldm_data.test_set) 60 | 61 | # To get the ROC curve of location detection on test data 62 | print("\nThe roc curve for the location detection module") 63 | ldm_model.get_roc_curve(injected_test_data, ldm_data.test_predictions, fdi_location_test) 64 | 65 | # location prediction and calculating precision, recall and f1-score on test data 66 | print('\ncalculating precision, recall and f1 score of the LDM') 67 | location_prediction = ldm_model.predict(real_data = injected_test_data, predicted_data = ldm_data.test_predictions) 68 | ldm_model.get_ldm_prf(real_location_data = fdi_location_test, predicted_location_data = location_prediction) 69 | 70 | -------------------------------------------------------------------------------- /scripts/Data.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import pandas as pd 3 | 4 | class Data: 5 | 6 | def __init__(self, data_path, atk_vector_path, scaler, lookback = 48, delay = 1, batch_size=32, step=1): 7 | self.data_path = data_path 8 | self.atk_vector_path = atk_vector_path 9 | self.mm = scaler 10 | self.lookback = lookback 11 | self.delay = delay 12 | self.step = step 13 | self.batch_size = batch_size 14 | 15 | self.training_set, self.columns = Data.load_benign_data(self.data_path) 16 | self.atk_vectors, self.label_set = Data.load_atk_vectors(self.atk_vector_path) 17 | 18 | self.mm = self.mm.fit(self.training_set) 19 | self.scaled = 
self.scale(self.training_set) 20 | 21 | self.train_set, self.val_set, self.test_set = self.train_val_test_split() 22 | self.len_train_set, self.len_val_set, self.len_test_set = len(self.train_set), len(self.val_set), len(self.test_set) 23 | 24 | self.train_gen = self.generator(self.train_set, 25 | lookback=self.lookback, 26 | delay=self.delay, 27 | min_index=0, 28 | max_index=None, 29 | shuffle=True, 30 | step=self.step, 31 | batch_size = self.batch_size) 32 | 33 | self.val_gen = self.generator(self.val_set, 34 | lookback=self.lookback, 35 | delay=self.delay, 36 | min_index=0, 37 | max_index=None, 38 | shuffle=True, 39 | step=self.step, 40 | batch_size = self.batch_size) 41 | 42 | @staticmethod 43 | def load_benign_data(data_path): 44 | print("\nLoading Benign Hourly Data...") 45 | df = pd.read_csv(data_path) 46 | columns = df.columns[1:] 47 | 48 | print("\nFeatures Selected {}".format(columns)) 49 | training_set = df[columns].astype(float).to_numpy() 50 | 51 | return training_set, columns 52 | 53 | 54 | @staticmethod 55 | def load_atk_vectors(atk_vector_path): 56 | print("\nLoading False Data Injection Attack Vectors...") 57 | atk_vectors = pd.read_excel(atk_vector_path) 58 | atk_vectors.drop("Unnamed: 0", axis=1, inplace = True) 59 | atk_vectors = atk_vectors.to_numpy() 60 | label_set = [] 61 | for i in atk_vectors: 62 | label_set.append([0 if k==0 else 1 for k in i]) 63 | label_set = np.array(label_set) 64 | 65 | print("\nshape of Attack vector", atk_vectors.shape) 66 | print("shape of labels ", label_set.shape) 67 | 68 | return atk_vectors, label_set 69 | 70 | 71 | def scale(self, data): 72 | return self.mm.transform(data) 73 | 74 | def inv_scale(self, scaled_data): 75 | return self.mm.inverse_transform(scaled_data) 76 | 77 | def train_val_test_split(self, test_set_percentage= 0.2, val_set_percentage = 0.5): 78 | test_split = int(self.scaled.shape[0]*test_set_percentage) 79 | test_set = self.scaled[-test_split:] 80 | train_set = self.scaled[:-test_split] 81 | 82 | val_split = int(test_set.shape[0]*val_set_percentage) 83 | final_test_set = test_set[-val_split:] 84 | val_set = test_set[:-val_split] 85 | 86 | print("\nTraining set shape: ",train_set.shape) 87 | print("Validation set shape ",val_set.shape) 88 | print("Test Set Shape ",final_test_set.shape) 89 | 90 | return train_set, val_set, final_test_set 91 | 92 | def inject_fixed_attackvec(self, data, atk_vector_indx, scale = True): 93 | 94 | atk_vector = self.atk_vectors[atk_vector_indx] 95 | for dt_point in data: 96 | for i in range(len(atk_vector)): 97 | if atk_vector[i] !=0: 98 | dt_point[i] = dt_point[i]+atk_vector[i] 99 | return data 100 | 101 | def inject_random_attackvec(self, data): 102 | location_set = [] 103 | for row,sample in enumerate(data): 104 | indx = np.squeeze(np.random.choice(self.atk_vectors.shape[0],size = 1, replace = True)) 105 | atk_vector = self.atk_vectors[indx,] 106 | data[row] = sample+atk_vector 107 | location_set.append(self.label_set[indx,]) 108 | return data, np.array(location_set) 109 | 110 | def generator(self, data, lookback, delay, min_index, max_index, shuffle=False, batch_size=32, step=1): 111 | if max_index is None: 112 | max_index = len(data)- delay #8 113 | i = min_index + lookback 114 | while 1: 115 | if shuffle: 116 | rows = np.random.randint(min_index + lookback, max_index+1, size=batch_size) 117 | else: 118 | if i + batch_size-1 > max_index: #+ batch_size 119 | i = min_index + lookback 120 | rows = np.arange(i, min(i + batch_size, max_index+1)) 121 | i += 1 122 | samples = 
np.zeros((len(rows), lookback // step, data.shape[-1])) 123 | targets = np.zeros((len(rows),data.shape[-1])) 124 | for j, row in enumerate(rows): 125 | indices = range(rows[j] - lookback, rows[j], step) 126 | samples[j] = data[indices] 127 | targets[j] = data[rows[j]+ delay-1][:] # + delay-1 128 | yield samples, [i for i in targets.T] 129 | -------------------------------------------------------------------------------- /scripts/FPDM.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import matplotlib.pyplot as plt 3 | from numpy.linalg import norm 4 | from sklearn.metrics import accuracy_score, confusion_matrix 5 | 6 | from scripts.Utility_Functions import utility_functions 7 | 8 | class FPDM: 9 | 10 | def __init__(self, data, fpdm_model, lookback = 48, threshold = 0.4): 11 | self.data = data 12 | self.fpdm_model = fpdm_model 13 | self.lookback = lookback 14 | self.threshold = threshold 15 | 16 | 17 | def predict(self, model, batch): 18 | pass 19 | 20 | def barplot_for_specific_fdi(self, real_data, predicted_data, atk_vector_indx): 21 | inv_scaled_real_data = self.data.inv_scale(real_data) 22 | injected_data = self.data.inject_fixed_attackvec(inv_scaled_real_data, atk_vector_indx) 23 | injected_data = self.data.scale(injected_data) 24 | #injected_data = injected_data[self.lookback:] 25 | print('injected_data shape == {}'.format(injected_data.shape)) 26 | errors = abs(real_data - predicted_data) 27 | errors_fdi = abs(injected_data - predicted_data) 28 | t_err = [norm(i) for i in errors] 29 | t_err_fdi = [norm(i) for i in errors_fdi] 30 | utility_functions.show_barplot(data_list = [t_err, t_err_fdi], label = ['real','anomaly'], n_bins= 50) 31 | 32 | def get_forecasting_errors(self, real_data, predicted_data): 33 | predictions = self.data.inv_scale(predicted_data) 34 | real = self.data.inv_scale(real_data) 35 | mae = utility_functions.MAE(real, predictions) 36 | mse = utility_functions.MSE(real,predictions) 37 | rmse = utility_functions.RMSE(real,predictions) 38 | print('mae : {}, mse : {}, rmse : {}'.format(mae,mse,rmse)) 39 | 40 | 41 | def is_fdi(self, real_data, predicted_data, threshold): 42 | errors = abs(real_data - predicted_data) 43 | fdi = [] 44 | for i in errors: 45 | if norm(i)> threshold: 46 | fdi.append(1) 47 | else: 48 | fdi.append(0) 49 | return np.array(fdi) 50 | 51 | 52 | def get_prf(self, real_data, predicted_data, threshold): 53 | 54 | fdi = self.is_fdi(real_data, predicted_data, threshold) 55 | actual = np.zeros((real_data.shape[0],)) # calculating confusion matrix with only real data 56 | cm0 = confusion_matrix(actual,fdi, labels = [0,1]) 57 | 58 | inv_scaled_real_data = self.data.inv_scale(real_data) 59 | injected_data, _ = self.data.inject_random_attackvec(inv_scaled_real_data) 60 | injected_data = self.data.scale(injected_data) 61 | 62 | fdi = self.is_fdi(injected_data, predicted_data, threshold) 63 | actual = np.ones((real_data.shape[0],)) 64 | cm1 = confusion_matrix(actual,fdi, labels = [0,1]) 65 | 66 | cm = cm0 + cm1 # adding the confusion matrix from the FDI-injected data 67 | return np.array(utility_functions.cm2prf(cm.T)), cm.T 68 | 69 | 70 | -------------------------------------------------------------------------------- /scripts/FPDM_Models.py: -------------------------------------------------------------------------------- 1 | import os 2 | import pandas as pd 3 | import tensorflow as tf 4 | from tensorflow import keras 5 | from tensorflow.keras import layers 6 | 7 | class FPDM_Models: 8 | 9 | def __init__(self, input_shape, 
#input_shape = (lookback,54) 10 | lookback = 48, 11 | algorithm = 'xtm', 12 | head_size = 9, 13 | num_heads = 6, 14 | ff_dim = 128, 15 | num_transformer_blocks = 1, 16 | mlp_units=[128], 17 | mlp_dropout = 0.1, 18 | dropout=0.1, 19 | training = False, 20 | loss_function = 'mse', 21 | optimizer = keras.optimizers.Adam(learning_rate=1e-4), 22 | checkpoint_path = "checkpoint\\", 23 | ): 24 | 25 | self.training = training 26 | self.input_shape = input_shape 27 | self.checkpoint_path = checkpoint_path 28 | self.algorithm = algorithm 29 | self.head_size = head_size 30 | self.num_heads = num_heads 31 | self.ff_dim = ff_dim 32 | self.num_transformer_blocks = num_transformer_blocks 33 | self.mlp_units = mlp_units 34 | self.mlp_dropout = mlp_dropout 35 | self.dropout = dropout 36 | self.loss_function = loss_function 37 | self.optimizer = optimizer 38 | self.lookback = lookback 39 | print('Loading '+ self.algorithm+ 'model for FDI presence detection module ...') 40 | if self.training: 41 | 42 | if self.algorithm == 'xtm': 43 | self.model = self.build_xtm_model(self.input_shape, 44 | self.head_size, 45 | self.num_heads, 46 | self.ff_dim, 47 | self.num_transformer_blocks, 48 | self.mlp_units, 49 | self.mlp_dropout, 50 | self.dropout 51 | ) 52 | elif self.algorithm == 'cnn_transformer': 53 | self.model = self.build_cnn_transformer_model(self.input_shape, 54 | self.head_size, 55 | self.num_heads, 56 | self.ff_dim, 57 | self.num_transformer_blocks, 58 | self.mlp_units, 59 | self.mlp_dropout, 60 | self.dropout 61 | ) 62 | 63 | elif self.algorithm == 'cnn': 64 | self.model = self.build_cnn_model(self.input_shape, 65 | self.mlp_units, 66 | self.mlp_dropout, 67 | self.dropout 68 | ) 69 | elif self.algorithm == 'cnn_lstm': 70 | self.model = self.build_cnn_lstm_model(self.input_shape, 71 | self.mlp_units, 72 | self.mlp_dropout, 73 | self.dropout 74 | ) 75 | 76 | elif self.algorithm == 'transformer': 77 | self.model = self.build_transformer_model(self.input_shape, 78 | self.head_size, 79 | self.num_heads, 80 | self.ff_dim, 81 | self.num_transformer_blocks, 82 | self.mlp_units, 83 | self.mlp_dropout, 84 | self.dropout 85 | ) 86 | 87 | else: 88 | self.model = self.build_xtm_model(self.input_shape, 89 | self.head_size, 90 | self.num_heads, 91 | self.ff_dim, 92 | self.num_transformer_blocks, 93 | self.mlp_units, 94 | self.mlp_dropout, 95 | self.dropout 96 | ) 97 | 98 | self.model = self.compile_model(self.model,self.input_shape,self.loss_function, self.optimizer) 99 | 100 | self.cpt_path = self.checkpoint_path+self.algorithm+"\\" 101 | #self.cpt_path = os.path.join(os.getcwd(), "checkpoint\\"+algorithm+"\\") 102 | #if not os.path.exists(self.cpt_path): 103 | # os.mkdir(self.cpt_path) 104 | 105 | self.callbacks = [keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True), 106 | keras.callbacks.ModelCheckpoint(filepath = self.cpt_path, save_weights_only=True, monitor= 'val_loss',mode = 'min', save_best_only=True)] 107 | else: 108 | #self.checkpoint_file_path = os.path.join(os.getcwd(), "checkpoint\\"+"fpdm_"+self.algorithm+".h5") 109 | self.checkpoint_file_path = self.checkpoint_path+"fpdm_"+self.algorithm+".h5" 110 | print("loading "+self.algorithm+" from saved models...") 111 | self.model = keras.models.load_model(self.checkpoint_file_path) 112 | 113 | def transformer_encoder(self, inputs, head_size, num_heads, ff_dim, dropout=0): 114 | # Attention and Normalization 115 | x = layers.MultiHeadAttention( 116 | key_dim=head_size, num_heads=num_heads, dropout=dropout 117 | )(inputs, inputs) 118 | x = 
layers.Dropout(dropout)(x) 119 | x = layers.LayerNormalization(epsilon=1e-6)(x) 120 | res = x + inputs 121 | 122 | # Feed Forward Part 123 | x = layers.Dense(ff_dim, activation = 'relu')(res) 124 | x = layers.Dropout(dropout)(x) 125 | x = layers.Dense(inputs.shape[-1], activation = 'relu')(x) 126 | x = layers.LayerNormalization(epsilon=1e-6)(x) 127 | return x+res 128 | 129 | # Feed Forward Part 130 | #x = layers.Conv1D(filters=ff_dim, kernel_size=1, activation="relu")(res) 131 | #x = layers.Dropout(dropout)(x) 132 | #x = layers.Conv1D(filters=inputs.shape[-1], kernel_size=1)(x) 133 | #x = layers.LayerNormalization(epsilon=1e-6)(x) 134 | #return x + res 135 | 136 | 137 | def build_xtm_model(self,input_shape,head_size,num_heads,ff_dim,num_transformer_blocks,mlp_units,mlp_dropout,dropout): 138 | inputs = keras.Input(shape=input_shape) 139 | x = inputs 140 | 141 | for _ in range(num_transformer_blocks): 142 | x = self.transformer_encoder(x, head_size, num_heads, ff_dim, dropout) 143 | 144 | #x = layers.GlobalAveragePooling1D(data_format="channels_last")(x) 145 | x = layers.LSTM(128, activation = 'tanh',return_sequences=True)(x) 146 | x = layers.LSTM(128, activation = 'tanh')(x) 147 | for dim in mlp_units: 148 | x = layers.Dense(dim, activation="relu")(x) 149 | x = layers.Dropout(mlp_dropout)(x) 150 | outputs = [layers.Dense(1)(x) for i in range(54)] 151 | return keras.Model(inputs, outputs) 152 | 153 | 154 | def build_cnn_transformer_model(self,input_shape,head_size,num_heads,ff_dim,num_transformer_blocks,mlp_units,dropout,mlp_dropout): 155 | inputs = keras.Input(shape=input_shape) 156 | x = inputs 157 | x = layers.Conv1D(128, 9, activation='relu')(x) 158 | x = layers.MaxPooling1D(2)(x) 159 | #x = layers.Conv1D(128, 9, activation='relu')(x) 160 | 161 | for _ in range(num_transformer_blocks): 162 | x = self.transformer_encoder(x, head_size, num_heads, ff_dim, dropout) 163 | 164 | x = layers.GlobalAveragePooling1D(data_format="channels_last")(x) 165 | for dim in mlp_units: 166 | x = layers.Dense(dim, activation="relu")(x) 167 | x = layers.Dropout(mlp_dropout)(x) 168 | #outputs = [layers.Dense(1)(x) for i in range(input_shape[1])] 169 | outputs = [layers.Dense(1)(x) for i in range(54)] 170 | return keras.Model(inputs, outputs) 171 | 172 | 173 | def build_cnn_model(self,input_shape, mlp_units, mlp_dropout, dropout): 174 | inputs = keras.Input(shape=input_shape) 175 | x = inputs 176 | x = layers.Conv1D(128, 9, activation='relu')(x) 177 | x = layers.MaxPooling1D(2)(x) 178 | x = layers.Conv1D(128, 9, activation='relu')(x) 179 | x = layers.GlobalAveragePooling1D(data_format="channels_last")(x) 180 | for dim in mlp_units: 181 | x = layers.Dense(dim, activation="relu")(x) 182 | x = layers.Dropout(mlp_dropout)(x) 183 | outputs = [layers.Dense(1)(x) for i in range(54)] 184 | return keras.Model(inputs, outputs) 185 | 186 | 187 | def build_cnn_lstm_model(self,input_shape, mlp_units, mlp_dropout, dropout): 188 | inputs = keras.Input(shape=input_shape) 189 | x = inputs 190 | x = layers.Conv1D(128, 9, activation='relu')(x) 191 | x = layers.MaxPooling1D(3)(x) 192 | x = layers.LSTM(128, activation = 'tanh',return_sequences=True)(x) 193 | x = layers.LSTM(128, activation = 'tanh')(x) 194 | for dim in mlp_units: 195 | x = layers.Dense(dim, activation="relu")(x) 196 | x = layers.Dropout(mlp_dropout)(x) 197 | outputs = [layers.Dense(1)(x) for i in range(54)] 198 | return keras.Model(inputs, outputs) 199 | 200 | 201 | def 
build_transformer_model(self,input_shape,head_size,num_heads,ff_dim,num_transformer_blocks,mlp_units,mlp_dropout,dropout): 202 | inputs = keras.Input(shape=input_shape) 203 | x = inputs 204 | for _ in range(num_transformer_blocks): 205 | x = self.transformer_encoder(x, head_size, num_heads, ff_dim, dropout) 206 | 207 | x = layers.GlobalAveragePooling1D(data_format="channels_last")(x) 208 | for dim in mlp_units: 209 | x = layers.Dense(dim, activation="relu")(x) 210 | x = layers.Dropout(mlp_dropout)(x) 211 | outputs = [layers.Dense(1)(x) for i in range(54)] 212 | return keras.Model(inputs, outputs) 213 | 214 | 215 | def compile_model(self, model, input_shape, loss_function, optimizer): 216 | losses = [loss_function for i in range(input_shape[-1])] 217 | model.compile(loss = losses, optimizer = optimizer) 218 | return model 219 | 220 | def train(self, train_gen, val_gen, steps_per_epoch, validation_steps, epochs = 50, save_model = False): 221 | training_history= self.model.fit(train_gen, 222 | steps_per_epoch = steps_per_epoch, 223 | validation_data = val_gen, 224 | validation_steps = validation_steps, 225 | epochs=epochs, 226 | callbacks=self.callbacks 227 | ) 228 | if save_model: 229 | print('saving model ...') 230 | self.model.save(self.checkpoint_file_path) 231 | history_df = pd.DataFrame(training_history.history) 232 | #history_df.to_excel(os.path.join(os.getcwd(), 'checkpoint\\training_history\\'+self.algorithm+'_training_history.xlsx')) 233 | print('saving training history...') 234 | history_df.to_excel(self.checkpoint_path+ "training_history\\"+self.algorithm+"_training_history.xlsx") 235 | 236 | 237 | def get_model_summary(self): 238 | return self.model.summary() 239 | 240 | def plot_model(self): 241 | return keras.utils.plot_model(self.model) 242 | 243 | 244 | def predict(self): 245 | pass 246 | 247 | def is_fdi(self): 248 | pass 249 | -------------------------------------------------------------------------------- /scripts/LDM_Data.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | from tqdm import tqdm 4 | 5 | class LDM_Data: 6 | 7 | def __init__(self, data, model, algorithm, lookback = 48, delay = 1, batch_size= 50, training = False, load_data_ = True): 8 | 9 | self.data = data 10 | self.model = model 11 | self.algorithm = algorithm 12 | self.lookback = lookback 13 | self.delay = delay 14 | self.batch_size = batch_size 15 | self.training = training 16 | self.load_data_ = load_data_ 17 | self.data_path = os.path.join(os.getcwd(),"checkpoint\\") 18 | 19 | self.train_gen = self.model2_generator(self.data.train_set, 20 | lookback=self.lookback, 21 | delay=self.delay, 22 | min_index=0, 23 | max_index=None, 24 | shuffle=False, 25 | step=1, 26 | batch_size = self.batch_size) 27 | 28 | self.val_gen = self.model2_generator(self.data.val_set, 29 | lookback=self.lookback, 30 | delay=self.delay, 31 | min_index=0, 32 | max_index=None, 33 | shuffle=False, 34 | step=1, 35 | batch_size = self.batch_size) 36 | 37 | self.test_gen = self.model2_generator(self.data.test_set, 38 | lookback=self.lookback, 39 | delay=self.delay, 40 | min_index=0, 41 | max_index=None, 42 | shuffle=False, 43 | step=1, 44 | batch_size = self.batch_size) 45 | if self.training: 46 | self.train_set, self.train_predictions = self.prepare_dataset_for_ldm(self.data.train_set, self.train_gen, filename = 'LDM_'+self.algorithm+'_train') 47 | self.val_set, self.val_predictions = self.prepare_dataset_for_ldm(self.data.val_set, self.val_gen, filename = 
'LDM_'+self.algorithm+'_val') 48 | 49 | self.test_set, self.test_predictions = self.prepare_dataset_for_ldm(self.data.test_set, self.test_gen, filename = 'LDM_'+self.algorithm+'_test') 50 | 51 | def inject_fdi(self, data): 52 | # injecting random attack vectors to data 53 | inv_scaled_data = self.data.inv_scale(data) 54 | injected_data, fdi_location = self.data.inject_random_attackvec(inv_scaled_data) 55 | injected_data = self.data.scale(injected_data) 56 | return injected_data, fdi_location 57 | 58 | def generate_randomly_injected_fdi_data(self): 59 | # injecting random attack vectors to train, val, test for training LDM. This function needs to run manually 60 | print('\ninjecting datasets with random attack vector...') 61 | print("\nPreparing train, validation test set for ldm training...") 62 | self.train_set, self.fdi_location_train = self.inject_fdi(self.train_set) 63 | self.val_set, self.fdi_location_val = self.inject_fdi(self.val_set) 64 | self.test_set, self.fdi_location_test = self.inject_fdi(self.test_set) 65 | 66 | def randomize_ldm_data(self): 67 | pass 68 | 69 | def save_data(self,data, filename): 70 | np.savetxt(os.path.join(self.data_path, filename),data) 71 | 72 | def load_data(self, filename): 73 | return np.loadtxt(os.path.join(self.data_path, filename)) 74 | 75 | def model2_generator(self, data, lookback, delay, min_index, max_index, shuffle=False, batch_size=50, step=1): 76 | if max_index is None: 77 | max_index = len(data)- delay #8 78 | i = min_index + lookback 79 | while 1: 80 | if shuffle: 81 | rows = np.random.randint(min_index + lookback, max_index+1, size=batch_size) 82 | else: 83 | if i + batch_size > max_index: #+ batch_size 84 | rows = np.arange(i,max_index+1) 85 | i = min_index + lookback 86 | else: 87 | rows = np.arange(i, min(i + batch_size, max_index+1)) 88 | i += len(rows) 89 | samples = np.zeros((len(rows), lookback // step, data.shape[-1])) 90 | targets = np.zeros((len(rows),data.shape[-1])) 91 | for j, row in enumerate(rows): 92 | indices = range(rows[j] - lookback, rows[j], step) 93 | samples[j] = data[indices] 94 | targets[j] = data[rows[j]+ delay-1][:] # + delay-1 95 | yield samples, np.array([k for k in targets.T]) #[i for i in targets.T] 96 | 97 | def prepare_dataset_for_ldm(self, dataset, gen, filename): 98 | if self.load_data_: 99 | print("\nLoading data...") 100 | real = self.load_data(filename+'.txt') 101 | prediction = self.load_data(filename+'_predictions.txt') 102 | return real, prediction 103 | 104 | real = np.zeros((len(dataset)-self.lookback, dataset.shape[-1])) 105 | prediction = np.zeros((len(dataset)-self.lookback, dataset.shape[-1])) 106 | 107 | print("\nPreparing test set for LDM (real and Predicted from FPDM) ...") 108 | for i in tqdm(range(0,len(dataset)-self.lookback, self.batch_size)): 109 | test_data = next(gen) 110 | pred = np.array(self.model.predict_on_batch(test_data[0])) 111 | pred = np.reshape(pred,(pred.shape[0],pred.shape[1])).T 112 | rows = np.arange(i,min(len(dataset)-self.lookback,i+self.batch_size)) 113 | for j, row in enumerate(rows): 114 | real[row] = test_data[1].T[j] 115 | prediction[row] = pred[j] 116 | 117 | print("\nSaving Data...") 118 | self.save_data(real, filename+'.txt') 119 | self.save_data(prediction, filename+'_predictions.txt') 120 | return real, prediction -------------------------------------------------------------------------------- /scripts/LDM_Model.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | from sklearn.metrics 
import accuracy_score, confusion_matrix 4 | 5 | import tensorflow as tf 6 | from tensorflow import keras 7 | from tensorflow.keras import layers 8 | 9 | from scripts.Utility_Functions import utility_functions 10 | 11 | class LDM_Model: 12 | 13 | def __init__(self, ldm_data, 14 | input_shape = (54,), 15 | output_heads = 54, 16 | algorithm = 'xtm', 17 | epochs = 100, 18 | batch_size = 32, 19 | loss_function= 'binary_crossentropy', 20 | optimizer = keras.optimizers.Adam(learning_rate=1e-3), 21 | training= False): 22 | self.ldm_data = ldm_data 23 | self.input_shape = input_shape 24 | self.output_heads = output_heads 25 | self.algorithm = algorithm 26 | self.epochs = epochs 27 | self.batch_size = batch_size 28 | self.loss_function = loss_function 29 | self.optimizer = optimizer 30 | self.training = training 31 | self.model = self.ldm_model() 32 | if self.training: 33 | self.compile_model() 34 | self.callbacks = [keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True)] 35 | 36 | else: 37 | self.checkpoint_file_path = os.path.join(os.getcwd(), "checkpoint\\"+"ldm_"+self.algorithm+".h5") 38 | print("loading ldm model trained with predictions from "+self.algorithm+" from the saved models...") 39 | self.model = keras.models.load_model(self.checkpoint_file_path) 40 | 41 | 42 | 43 | def ldm_model(self): 44 | pred_inp = keras.Input(shape=self.input_shape) 45 | fdi_inp = keras.Input(shape = self.input_shape) 46 | concat = layers.concatenate([pred_inp,fdi_inp]) 47 | dense1 = layers.Dense(128, activation = 'relu')(concat) 48 | dense2 = layers.Dense(128,activation = 'relu')(dense1) 49 | dense3 = layers.Dense(128, activation = 'relu')(dense2) 50 | outputs = [layers.Dense(1, activation = 'sigmoid')(dense3) for i in range(self.output_heads)] 51 | 52 | return keras.Model([pred_inp,fdi_inp], outputs) 53 | 54 | def compile_model(self): 55 | losses = [self.loss_function for i in range(self.input_shape[-1])] 56 | self.model.compile(loss = losses, optimizer = self.optimizer) 57 | 58 | 59 | def train(self, save_model = False): 60 | training_history= self.model.fit([self.ldm_data.train_predictions,self.ldm_data.train_set], 61 | [i for i in self.ldm_data.fdi_location_train.T], 62 | validation_data = ([self.ldm_data.val_predictions,self.ldm_data.val_set],[k for k in self.ldm_data.fdi_location_val.T]), 63 | epochs=self.epochs, 64 | batch_size = self.batch_size, 65 | callbacks=self.callbacks 66 | ) 67 | 68 | if save_model: 69 | self.model.save(self.checkpoint_file_path) 70 | history_df = pd.DataFrame(training_history.history) 71 | history_df.to_excel(os.path.join(os.getcwd(), 'checkpoint\\training_history\\'+self.algorithm+'_location_training_history.xlsx')) 72 | 73 | 74 | def get_model_summary(self): 75 | return self.model.summary() 76 | 77 | def plot_model(self): 78 | return keras.utils.plot_model(self.model) 79 | 80 | def predict(self, real_data, predicted_data, threshold = 0.5): 81 | raw_location_prediction = np.squeeze(self.model.predict([predicted_data,real_data])) 82 | raw_location_prediction = raw_location_prediction.T 83 | location_prediction = [] 84 | for i in raw_location_prediction: 85 | temp = [] 86 | for k in i: 87 | if k> threshold: 88 | temp.append(1) 89 | elif (threshold == 1.0) and (k == 1.0): 90 | temp.append(0) 91 | elif (threshold == 0) and (k == 0): 92 | temp.append(1) 93 | else: 94 | temp.append(0) 95 | location_prediction.append(temp) 96 | return np.array(location_prediction) 97 | 98 | def roc_curve(self, real_data , predicted_data, threshold, location_set): 99 | cms = [] 100 | tprs0 
= [] 101 | fprs0 = [] 102 | tprs1 = [] 103 | fprs1 = [] 104 | loc = self.predict(real_data, predicted_data, threshold) 105 | for i in range(54): 106 | cms.append(confusion_matrix(location_set.T[i],loc.T[i], labels = [0,1]).T) 107 | 108 | for i in cms: 109 | #class 0 110 | tp = i[0][0] 111 | fp = i[0][1] 112 | fn = i[1][0] 113 | tn = i[1][1] 114 | 115 | tpr0 = tp/(tp+fn) 116 | fpr0 = fp/(fp+tn) 117 | 118 | #class 1 119 | tp = i[1][1] 120 | fp = i[1][0] 121 | fn = i[0][1] 122 | tn = i[0][0] 123 | 124 | tpr1 = tp/(tp+fn) 125 | fpr1 = fp/(fp+tn) 126 | 127 | tprs0.append(tpr0) 128 | fprs0.append(fpr0) 129 | tprs1.append(tpr1) 130 | fprs1.append(fpr1) 131 | tprs0,fprs0,tprs1,fprs1 = np.array(tprs0),np.array(fprs0),np.array(tprs1),np.array(fprs1) 132 | return np.mean(tprs0), np.mean(fprs0), np.mean(tprs1), np.mean(fprs1) 133 | 134 | def get_roc_curve(self, real_data , predicted_data, location_set): 135 | thresholds = [] 136 | tpr_class0 = [] 137 | fpr_class0 = [] 138 | tpr_class1 = [] 139 | fpr_class1 = [] 140 | 141 | ts = np.linspace(0,1.0, num = 11) 142 | 143 | for i in ts: 144 | temp = self.roc_curve(real_data = real_data, predicted_data = predicted_data ,threshold = i, location_set = location_set) 145 | thresholds.append(i) 146 | tpr_class0.append(temp[0]) 147 | fpr_class0.append(temp[1]) 148 | tpr_class1.append(temp[2]) 149 | fpr_class1.append(temp[3]) 150 | 151 | utility_functions.plot_roc_curve([tpr_class0,tpr_class1],[fpr_class0,fpr_class1], n_classes=2) 152 | 153 | def get_ldm_prf(self, real_location_data, predicted_location_data): 154 | prf_loc = [] 155 | for i in range(54): 156 | cm = confusion_matrix(real_location_data.T[i],predicted_location_data.T[i], labels = [0,1]) 157 | prf_loc.append(np.array(utility_functions.cm2prf(cm))) 158 | 159 | prf_loc = np.array(prf_loc) 160 | prf_loc = np.mean(prf_loc, axis = 0) 161 | print('precision = {}, recall = {}, f1 score = {}'.format(prf_loc[0],prf_loc[1],prf_loc[2])) 162 | return prf_loc -------------------------------------------------------------------------------- /scripts/Utility_Functions.py: -------------------------------------------------------------------------------- 1 | import os 2 | import matplotlib.pyplot as plt 3 | import numpy as np 4 | from numpy.linalg import norm 5 | 6 | 7 | class utility_functions: 8 | def __init__(self): 9 | pass 10 | 11 | @staticmethod 12 | def MAE(real, predictions): 13 | # real = (samples x num of sensors) 14 | # predictions = (predictions on samples x num of sensors) 15 | return np.mean(np.mean(abs(real-predictions), axis = 0)) 16 | 17 | @staticmethod 18 | def MSE(real, predictions): 19 | # real = (samples x num of sensors) 20 | # predictions = (predictions on samples x num of sensors) 21 | return np.mean(np.mean(np.square(abs(real-predictions)),axis = 0)) 22 | 23 | @staticmethod 24 | def RMSE(real, predictions): 25 | # real = (samples x num of sensors) 26 | # predictions = (predictions on samples x num of sensors) 27 | return np.mean(np.sqrt(np.mean(np.square(abs(real-predictions)),axis = 0))) 28 | 29 | @staticmethod 30 | def show_barplot(data_list, label, n_bins = 50): 31 | # data_list = list, data on the plot 32 | # n_bins = number of bins 33 | # label = list, labels indicating the data 34 | for i in range(len(data_list)): 35 | plt.hist(data_list[i] ,bins=n_bins, label = label[i]) 36 | plt.legend() 37 | plt.savefig(os.path.join(os.getcwd(), 'checkpoint\\images\\barplot.jpg')) 38 | plt.show() 39 | 40 | 41 | @staticmethod 42 | def plot_roc_curve(tprs, fprs, n_classes): 43 | #list of tprs [tpr for class 
0, tpr for class 1, ....] 44 | #list of fprs [fpr for class 0, fpr for class 1, ...] 45 | for i in range(n_classes): 46 | plt.plot(fprs[i],tprs[i], label = 'class '+str(i)) 47 | 48 | plt.xlabel('False Positive Rate') 49 | plt.ylabel('True Positive Rate') 50 | plt.title('ROC Curve') 51 | plt.legend() 52 | plt.savefig(os.path.join(os.getcwd(), 'checkpoint\\images\\roc_curve.jpg')) 53 | plt.show() 54 | 55 | 56 | @staticmethod 57 | def cm2prf(cm): 58 | # for class 0 59 | tp0 = cm[0][0] 60 | fp0 = cm[0][1] 61 | fn0 = cm[1][0] 62 | tn0 = cm[1][1] 63 | pr0 = tp0/(tp0+fp0) 64 | re0 = tp0/(tp0+fn0) 65 | f10 = 2*((pr0*re0)/(pr0+re0)) 66 | 67 | #for class 1 68 | tp1 = cm[1][1] 69 | fp1 = cm[1][0] 70 | fn1 = cm[0][1] 71 | tn1 = cm[0][0] 72 | pr1 = tp1/(tp1+fp1) 73 | re1 = tp1/(tp1+fn1) 74 | f11 = 2*((pr1*re1)/(pr1+re1)) 75 | 76 | return (pr0+pr1)/2 , (re0+re1)/2, (f10+f11)/2 # returning macro average 77 | 78 | -------------------------------------------------------------------------------- /scripts/__pycache__/Data.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gcsarker/XTM/72f77e702378f906caaedbb07e1481ed1e205a77/scripts/__pycache__/Data.cpython-39.pyc -------------------------------------------------------------------------------- /scripts/__pycache__/FPDM.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gcsarker/XTM/72f77e702378f906caaedbb07e1481ed1e205a77/scripts/__pycache__/FPDM.cpython-39.pyc -------------------------------------------------------------------------------- /scripts/__pycache__/FPDM_Models.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gcsarker/XTM/72f77e702378f906caaedbb07e1481ed1e205a77/scripts/__pycache__/FPDM_Models.cpython-39.pyc -------------------------------------------------------------------------------- /scripts/__pycache__/LDM_Data.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gcsarker/XTM/72f77e702378f906caaedbb07e1481ed1e205a77/scripts/__pycache__/LDM_Data.cpython-39.pyc -------------------------------------------------------------------------------- /scripts/__pycache__/LDM_Model.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gcsarker/XTM/72f77e702378f906caaedbb07e1481ed1e205a77/scripts/__pycache__/LDM_Model.cpython-39.pyc -------------------------------------------------------------------------------- /scripts/__pycache__/Utility_Functions.cpython-39.pyc: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gcsarker/XTM/72f77e702378f906caaedbb07e1481ed1e205a77/scripts/__pycache__/Utility_Functions.cpython-39.pyc --------------------------------------------------------------------------------
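As a closing sanity check of `utility_functions.cm2prf` above: it takes a 2×2 confusion matrix oriented so that `cm[0][1]` counts class-0 predictions whose true class is 1 (`FPDM.get_prf`, for example, passes sklearn's confusion matrix transposed to match) and returns the macro-averaged precision, recall, and F1. The counts below are illustrative.

```python
from scripts.Utility_Functions import utility_functions

# Rows = predicted class, columns = actual class (illustrative counts)
cm = [[90, 10],   # predicted benign: 90 truly benign, 10 missed attacks
      [5, 95]]    # predicted FDI: 5 false alarms, 95 detected attacks

precision, recall, f1 = utility_functions.cm2prf(cm)
print(precision, recall, f1)  # macro averages: ~0.925, ~0.926, ~0.925
```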