├── INDRNN_(V)AE
│   ├── dataset
│   │   ├── readme.md
│   │   ├── test.npy
│   │   └── train.npy
│   ├── graph.png
│   ├── ind_rnn_cell.py
│   ├── indrnn_ae_vae.py
│   └── readme.md
├── LSTM_VAE
│   ├── LSTM_VAE.png
│   ├── LSTM_VAE.py
│   ├── dataset
│   │   ├── data0.csv
│   │   ├── lstm_test.npy
│   │   ├── lstm_test_label.npy
│   │   └── lstm_train.npy
│   ├── readme.md
│   └── utils.py
├── MLP_VAE
│   ├── MLP_VAE.py
│   ├── data
│   │   ├── a.py
│   │   ├── test.npy
│   │   ├── test_label.npy
│   │   └── train.npy
│   ├── img
│   │   ├── MLP_VAE.png
│   │   ├── a.txt
│   │   ├── iforest.png
│   │   └── lof.png
│   └── readme.md
└── README.md

--------------------------------------------------------------------------------
/INDRNN_(V)AE/dataset/readme.md:
--------------------------------------------------------------------------------
1 | 
2 | 
--------------------------------------------------------------------------------
/INDRNN_(V)AE/dataset/test.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SchindlerLiang/VAE-for-Anomaly-Detection/061b6a68d8e4918c23ac154dcd1948d9941e5802/INDRNN_(V)AE/dataset/test.npy
--------------------------------------------------------------------------------
/INDRNN_(V)AE/dataset/train.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SchindlerLiang/VAE-for-Anomaly-Detection/061b6a68d8e4918c23ac154dcd1948d9941e5802/INDRNN_(V)AE/dataset/train.npy
--------------------------------------------------------------------------------
/INDRNN_(V)AE/graph.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SchindlerLiang/VAE-for-Anomaly-Detection/061b6a68d8e4918c23ac154dcd1948d9941e5802/INDRNN_(V)AE/graph.png
--------------------------------------------------------------------------------
/INDRNN_(V)AE/ind_rnn_cell.py:
--------------------------------------------------------------------------------
1 | """Module implementing the IndRNN cell"""
2 | 
3 | from tensorflow.python.ops import math_ops
4 | from tensorflow.python.ops import init_ops
5 | from tensorflow.python.ops import nn_ops
6 | from tensorflow.python.ops import clip_ops
7 | from tensorflow.python.layers import base as base_layer
8 | 
9 | try:
10 |   # TF 1.7+
11 |   from tensorflow.python.ops.rnn_cell_impl import LayerRNNCell
12 | except ImportError:
13 |   from tensorflow.python.ops.rnn_cell_impl import _LayerRNNCell as LayerRNNCell
14 | 
15 | 
16 | class IndRNNCell(LayerRNNCell):
17 |   """Independently RNN Cell. Adapted from `rnn_cell_impl.BasicRNNCell`.
18 | 
19 |   Each unit has a single recurrent weight connected to its last hidden state.
20 | 
21 |   The implementation is based on:
22 | 
23 |     https://arxiv.org/abs/1803.04831
24 | 
25 |   Shuai Li, Wanqing Li, Chris Cook, Ce Zhu, Yanbo Gao
26 |   "Independently Recurrent Neural Network (IndRNN): Building A Longer and
27 |   Deeper RNN"
28 | 
29 |   The default initialization values for recurrent weights, input weights and
30 |   biases are taken from:
31 | 
32 |     https://arxiv.org/abs/1504.00941
33 | 
34 |   Quoc V. Le, Navdeep Jaitly, Geoffrey E. Hinton
35 |   "A Simple Way to Initialize Recurrent Networks of Rectified Linear Units"
36 | 
37 |   Args:
38 |     num_units: int, The number of units in the RNN cell.
39 |     recurrent_min_abs: float, minimum absolute value of each recurrent weight.
40 |     recurrent_max_abs: (optional) float, maximum absolute value of each
41 |       recurrent weight. For `relu` activation, `pow(2, 1/timesteps)` is
42 |       recommended. If None, recurrent weights will not be clipped.
43 |       Default: None.
44 |     recurrent_kernel_initializer: (optional) The initializer to use for the
45 |       recurrent weights. If None, every recurrent weight is initially set to 1.
46 |       Default: None.
47 |     input_kernel_initializer: (optional) The initializer to use for the input
48 |       weights. If None, the input weights are initialized from a random normal
49 |       distribution with `mean=0` and `stddev=0.001`. Default: None.
50 |     activation: Nonlinearity to use. Default: `relu`.
51 |     reuse: (optional) Python boolean describing whether to reuse variables
52 |       in an existing scope. If not `True`, and the existing scope already has
53 |       the given variables, an error is raised.
54 |     name: String, the name of the layer. Layers with the same name will
55 |       share weights, but to avoid mistakes we require reuse=True in such
56 |       cases.
57 |   """
58 | 
59 |   def __init__(self,
60 |                num_units,
61 |                recurrent_min_abs=0,
62 |                recurrent_max_abs=None,
63 |                recurrent_kernel_initializer=None,
64 |                input_kernel_initializer=None,
65 |                activation=None,
66 |                reuse=None,
67 |                name=None):
68 |     super(IndRNNCell, self).__init__(_reuse=reuse, name=name)
69 | 
70 |     # Inputs must be 2-dimensional.
71 |     self.input_spec = base_layer.InputSpec(ndim=2)
72 | 
73 |     self._num_units = num_units
74 |     self._recurrent_min_abs = recurrent_min_abs
75 |     self._recurrent_max_abs = recurrent_max_abs
76 |     self._recurrent_initializer = recurrent_kernel_initializer
77 |     self._input_initializer = input_kernel_initializer
78 |     self._activation = activation or nn_ops.relu
79 | 
80 |   @property
81 |   def state_size(self):
82 |     return self._num_units
83 | 
84 |   @property
85 |   def output_size(self):
86 |     return self._num_units
87 | 
88 |   def build(self, inputs_shape):
89 |     if inputs_shape[1].value is None:
90 |       raise ValueError("Expected inputs.shape[-1] to be known, saw shape: %s"
91 |                        % inputs_shape)
92 | 
93 |     input_depth = inputs_shape[1].value
94 |     if self._input_initializer is None:
95 |       self._input_initializer = init_ops.random_normal_initializer(mean=0.0,
96 |                                                                    stddev=0.001)
97 |     self._input_kernel = self.add_variable(
98 |         "input_kernel",
99 |         shape=[input_depth, self._num_units],
100 |         initializer=self._input_initializer)
101 | 
102 |     if self._recurrent_initializer is None:
103 |       self._recurrent_initializer = init_ops.constant_initializer(1.)
104 |     self._recurrent_kernel = self.add_variable(
105 |         "recurrent_kernel",
106 |         shape=[self._num_units],
107 |         initializer=self._recurrent_initializer)
108 | 
109 |     # Clip the absolute values of the recurrent weights to the specified minimum
110 |     if self._recurrent_min_abs:
111 |       abs_kernel = math_ops.abs(self._recurrent_kernel)
112 |       min_abs_kernel = math_ops.maximum(abs_kernel, self._recurrent_min_abs)
113 |       self._recurrent_kernel = math_ops.multiply(
114 |           math_ops.sign(self._recurrent_kernel),
115 |           min_abs_kernel
116 |       )
117 | 
118 |     # Clip the absolute values of the recurrent weights to the specified maximum
119 |     if self._recurrent_max_abs:
120 |       self._recurrent_kernel = clip_ops.clip_by_value(self._recurrent_kernel,
121 |                                                       -self._recurrent_max_abs,
122 |                                                       self._recurrent_max_abs)
123 | 
124 |     self._bias = self.add_variable(
125 |         "bias",
126 |         shape=[self._num_units],
127 |         initializer=init_ops.zeros_initializer(dtype=self.dtype))
128 | 
129 |     self.built = True
130 | 
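# --- Added note (not part of the original file): the two clips in build()
# keep each recurrent weight u_k sign-preserving within
# [recurrent_min_abs, recurrent_max_abs]; in NumPy terms, roughly:
#
#     u = np.sign(u) * np.maximum(np.abs(u), min_abs)  # raise |u| to at least min_abs
#     u = np.clip(u, -max_abs, max_abs)                # cap |u| at max_abs
#
# With `relu`, choosing max_abs = pow(2, 1/time_steps) bounds the T-step
# recurrent gain |u|**T by 2, which is why that value is recommended in the
# docstring above.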
131 |   def call(self, inputs, state):
132 |     """Run one time step of the IndRNN.
133 | 
134 |     Calculates the output and new hidden state using the IndRNN equation
135 | 
136 |       `output = new_state = act(W * input + u (*) state + b)`
137 | 
138 |     where `*` is the matrix multiplication and `(*)` is the Hadamard product.
139 | 
140 |     Args:
141 |       inputs: Tensor, 2-D tensor of shape `[batch, num_units]`.
142 |       state: Tensor, 2-D tensor of shape `[batch, num_units]` containing the
143 |         previous hidden state.
144 | 
145 |     Returns:
146 |       A tuple containing the output and new hidden state. Both are the same
147 |       2-D tensor of shape `[batch, num_units]`.
148 |     """
149 |     gate_inputs = math_ops.matmul(inputs, self._input_kernel)
150 |     recurrent_update = math_ops.multiply(state, self._recurrent_kernel)
151 |     gate_inputs = math_ops.add(gate_inputs, recurrent_update)
152 |     gate_inputs = nn_ops.bias_add(gate_inputs, self._bias)
153 |     output = self._activation(gate_inputs)
154 |     return output, output
155 | 
--------------------------------------------------------------------------------
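A minimal usage sketch of the cell above (not from the repository; shapes and hyper-parameters are illustrative assumptions): it wires a single `IndRNNCell` into `tf.nn.dynamic_rnn` with the `pow(2, 1/timesteps)` cap recommended for `relu` in the docstring.

```python
import numpy as np
import tensorflow as tf
from ind_rnn_cell import IndRNNCell

TIME_STEPS, INPUT_DIM, NUM_UNITS = 16, 2, 8
cell = IndRNNCell(NUM_UNITS, recurrent_max_abs=pow(2, 1 / TIME_STEPS))

x = tf.placeholder(tf.float32, [None, TIME_STEPS, INPUT_DIM])
outputs, _ = tf.nn.dynamic_rnn(cell, x, dtype=tf.float32)  # [batch, TIME_STEPS, NUM_UNITS]

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(outputs, {x: np.random.rand(4, TIME_STEPS, INPUT_DIM)}).shape)  # (4, 16, 8)
```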
/INDRNN_(V)AE/indrnn_ae_vae.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | A simple implementation of an INDRNN_(V)AE-based algorithm
4 | for Anomaly (Novelty) Detection in Multivariate Time Series;
5 | We also present a health-judge mechanism for assessing the state of
6 | the input Multivariate Time Series, which might be useful in machine maintenance;
7 | 
8 | 
9 | A special note on LSTM_VAE versus INDRNN_(V)AE: INDRNN_(V)AE
10 | can be adopted in high-frequency scenarios (industrial sensors, for example).
11 | 
12 | Author: Schindler Liang
13 | 
14 | Reference:
15 |     https://github.com/twairball/keras_lstm_vae
16 |     https://github.com/batzner/indrnn
17 | """
18 | 
19 | import numpy as np
20 | import tensorflow as tf
21 | from tensorflow.nn.rnn_cell import MultiRNNCell
22 | from ind_rnn_cell import IndRNNCell
23 | 
24 | xavier_init = tf.contrib.layers.xavier_initializer(seed=2019)
25 | zero_init = tf.zeros_initializer()
26 | 
27 | def _INDRNNCells(unit_list,time_steps):
28 |     recurrent_max = pow(2, 1 / time_steps)
29 |     return MultiRNNCell([IndRNNCell(unit,recurrent_max_abs=recurrent_max)
30 |                          for unit in unit_list],state_is_tuple=True)
31 | 
32 | class Data_Hanlder:
33 |     def __init__(self,train_file):
34 |         self.train_data = np.load(train_file)
35 | 
36 |     def fetch_data(self,batch_size):
37 |         indices = np.random.choice(self.train_data.shape[0],batch_size)
38 |         return self.train_data[indices]
39 | 
40 | class INDRNN_VAE(object):
41 |     def __init__(self,train_file,
42 |                  z_dim=10,
43 |                  encoder_layers=2,
44 |                  decode_layers=2,
45 |                  outlier_fraction=0.01
46 |                  ):
47 | 
48 |         self.outlier_fraction = outlier_fraction
49 |         self.data_source = Data_Hanlder(train_file)
50 |         self.n_hidden = 16
51 |         self.batch_size = 128
52 |         self.learning_rate = 0.0005
53 |         self.train_iters = 7000
54 |         self.encoder_layers = encoder_layers
55 |         self.decode_layers = decode_layers
56 |         self.time_steps = self.data_source.train_data.shape[1]
57 |         self.input_dim = self.data_source.train_data.shape[2]
58 |         self.z_dim = z_dim
59 |         self.anomaly_score = 0
60 |         self.sess = tf.Session()
61 |         self._build_network()
62 |         self.sess.run(tf.global_variables_initializer())
63 | 
64 |     def _build_network(self):
65 |         with tf.variable_scope('ph'):
66 |             self.X = tf.placeholder(tf.float32,shape=[None,self.time_steps,self.input_dim],name='input_X')
67 | 
68 |         with tf.variable_scope('encoder',initializer=xavier_init):
69 |             with tf.variable_scope('AE'):
70 |                 ae_fw_lstm_cells = _INDRNNCells([self.n_hidden]*self.encoder_layers,self.time_steps)
71 |                 ae_bw_lstm_cells = _INDRNNCells([self.n_hidden]*self.encoder_layers,self.time_steps)
72 |                 (ae_fw_outputs,ae_bw_outputs),_ = tf.nn.bidirectional_dynamic_rnn(
73 |                     ae_fw_lstm_cells,
74 |                     ae_bw_lstm_cells,
75 |                     self.X, dtype=tf.float32)
76 |                 ae_outputs = tf.add(ae_fw_outputs,ae_bw_outputs)
77 | 
78 |             with tf.variable_scope('lat_Z'):
79 |                 z_fw_lstm_cells = _INDRNNCells([self.n_hidden]*self.encoder_layers,
80 |                                                self.time_steps)
81 |                 z_bw_lstm_cells = _INDRNNCells([self.n_hidden]*self.encoder_layers,
82 |                                                self.time_steps)
83 |                 (z_fw_outputs,z_bw_outputs),_ = tf.nn.bidirectional_dynamic_rnn(
84 |                     z_fw_lstm_cells,
85 |                     z_bw_lstm_cells,
86 |                     self.X, dtype=tf.float32)
87 |                 z_outputs = tf.reduce_mean( (z_fw_outputs+z_bw_outputs),axis=1 )
88 | 
89 |                 mu_outputs = tf.layers.dense(z_outputs,self.z_dim,activation=tf.nn.tanh)
90 |                 log_sigma_outputs = tf.layers.dense(z_outputs,self.z_dim)
91 | 
92 |                 sample_Z = mu_outputs + tf.exp(log_sigma_outputs/2) * tf.random_normal(
93 |                     tf.shape(mu_outputs),
94 |                     0,1,dtype=tf.float32)
95 | 
96 | 
97 |         with tf.variable_scope('decoder'):
98 |             sample_Z = tf.expand_dims(sample_Z,axis=1)
99 |             sample_Z = tf.tile(sample_Z,[1,self.time_steps,1])
100 |             decoder_input = tf.concat([ae_outputs,sample_Z],axis=-1)
101 | 
102 |             recons_fw_lstm_cells = _INDRNNCells([self.n_hidden]*self.decode_layers + [self.input_dim],
103 |                                                 self.time_steps)
104 |             recons_bw_lstm_cells = _INDRNNCells([self.n_hidden]*self.decode_layers + [self.input_dim],
105 |                                                 self.time_steps)
106 |             (recons_fw_outputs,recons_bw_outputs),_ = tf.nn.bidirectional_dynamic_rnn(
107 |                 recons_fw_lstm_cells,
108 |                 recons_bw_lstm_cells,
109 |                 decoder_input, dtype=tf.float32)
110 |             self.recons_X = tf.add(recons_fw_outputs,recons_bw_outputs)
111 | 
112 |         with tf.variable_scope('loss'):
113 |             reduce_dims = np.arange(1,tf.keras.backend.ndim(self.X))
114 |             recons_loss = tf.losses.mean_squared_error(self.X, self.recons_X)
115 |             kl_loss = - 0.5 * tf.reduce_mean(1 + log_sigma_outputs - tf.square(mu_outputs) - tf.exp(log_sigma_outputs))
116 |             self.opt_loss = recons_loss + kl_loss
117 |             self.all_losses = tf.reduce_sum(tf.square(self.X - self.recons_X),axis=reduce_dims)
118 | 
119 |         with tf.variable_scope('train'):
120 |             self.uion_train_op = tf.train.AdamOptimizer(self.learning_rate).minimize(self.opt_loss)
121 | 
122 | 
123 |     def train(self):
124 |         for i in range(self.train_iters):
125 |             this_X = self.data_source.fetch_data(self.batch_size)
126 |             self.sess.run([self.uion_train_op],feed_dict={
127 |                 self.X: this_X
128 |             })
129 |             if i % 200 ==0:
130 |                 mse_loss = self.sess.run([self.opt_loss],feed_dict={
131 |                     self.X: self.data_source.train_data
132 |                 })
133 |                 print('round {}: with loss: {}'.format(i,mse_loss))
134 |         self._arange_score(self.data_source.train_data)
135 | 
136 |     def _arange_score(self,input_data):
137 |         all_losses = self.sess.run(self.all_losses,feed_dict={
138 |             self.X: input_data
139 |         })
140 |         self.sorted_loss = np.sort(all_losses).ravel()
141 |         self.anomaly_score = np.percentile(self.sorted_loss,(1-self.outlier_fraction)*100)
142 | 
143 | 
144 |     def judge_health(self,test):
145 |         all_losses = self.sess.run(self.all_losses,feed_dict={
146 |             self.X: test
147 |         }).ravel()
148 |         percentile_95 = self.sorted_loss[int(self.sorted_loss.shape[0]*0.95)]
149 |         value_gap = self.sorted_loss[-1] - percentile_95
150 |         def _get_health(loss):
151 |             min_index = np.argmin(np.abs(self.sorted_loss-loss))
152 |             if min_index < self.sorted_loss.shape[0] - 1:
153 |                 minus_ratio = min_index / self.sorted_loss.shape[0]
154 |             else:
155 |                 exceed_loss = loss - self.sorted_loss[-1]
156 |                 minus_ratio = exceed_loss / value_gap * 0.05 + 1
157 |             return 100.0 - 40 * minus_ratio
158 |         all_health = list(map(lambda x:_get_health(x),all_losses))
159 |         return all_health
160 | 
161 |     def judge_anomaly(self,test):
162 |         all_losses = self.sess.run(self.all_losses,feed_dict={
163 |             self.X: test
164 |         }).ravel()
165 |         judge_label = list( map(lambda x: -1 if x>self.anomaly_score else 1,all_losses) )
166 |         return judge_label
167 | 
168 | 
169 | indrnn_ae = INDRNN_VAE(train_file='dataset/train.npy',z_dim=10,outlier_fraction=0.04)
170 | indrnn_ae.train()
171 | 
172 | test = np.load('dataset/test.npy')
173 | z1 = indrnn_ae.judge_health(test)
174 | z2 = indrnn_ae.judge_anomaly(test)
175 | 
176 | import matplotlib.pyplot as plt
177 | plt.plot(z1)
178 | plt.show()
--------------------------------------------------------------------------------
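To make the mapping inside `judge_health()` concrete, here is a self-contained re-statement of the same formula on synthetic losses (illustrative only; the random array stands in for the model's sorted training losses):

```python
import numpy as np

sorted_loss = np.sort(np.random.rand(1000))  # stand-in for self.sorted_loss
p95 = sorted_loss[int(sorted_loss.shape[0] * 0.95)]
gap = sorted_loss[-1] - p95

def health(loss):
    idx = np.argmin(np.abs(sorted_loss - loss))
    if idx < sorted_loss.shape[0] - 1:
        ratio = idx / sorted_loss.shape[0]  # rank of the loss among training losses
    else:
        ratio = (loss - sorted_loss[-1]) / gap * 0.05 + 1  # penalty grows past the training maximum
    return 100.0 - 40 * ratio

print(health(np.median(sorted_loss)))  # ~80: a median-level loss scores mid-range health
print(health(sorted_loss[-1] + gap))   # ~58: one (max - p95) gap beyond the maximum drops below 60
```

Scores therefore land roughly in [60, 100] for in-distribution inputs and sink below 60 as the reconstruction error leaves the training range.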
/INDRNN_(V)AE/readme.md:
--------------------------------------------------------------------------------
1 | ## Reference
2 | High-frequency Multivariate Time Series Anomaly Detection based on IndRNN with AutoEncoder (both AE and VAE);
3 | [reference1](https://github.com/twairball/keras_lstm_vae). The IndRNN implementation is from
4 | [reference2](https://github.com/batzner/indrnn);
5 | 
6 | 
7 | ## Prerequisites
8 | * Python 3.3+
9 | * Tensorflow 1.12.0
10 | * Sklearn 0.20.1
11 | * Numpy 1.15.4
12 | * Pandas 0.23.4
13 | * Matplotlib 3.0.2
14 | 
15 | ## Dataset and Preprocessing
16 | The dataset used is the [MTSAD](https://github.com/jsonbruce/MTSAnomalyDetection), which has 2 feature dimensions.
17 | We then reshape the dataset into 3-dimensional samples with time_steps of 16. The detailed preprocessing can be found in
18 | the LSTM_VAE chapter ([reference3](https://github.com/SchindlerLiang/VAE-for-Anomaly-Detection/blob/master/LSTM_VAE/utils.py)).
19 | 
20 | The IndRNN_(V)AE algorithm should be trained on Normal samples only. We provide two score functions for assessing the test data: judge_anomaly() for anomaly detection and judge_health() for health assessment, which may be of use for high-frequency industrial sensors.
21 | 
22 | 
23 | ## Network Structure
24 | The structure of the network is presented here:
25 | 
26 | ![Network Structure for IndRNN_(V)AE](https://github.com/SchindlerLiang/VAE-for-Anomaly-Detection/blob/master/INDRNN_(V)AE/graph.png)
27 | 
28 | Note that we use both the AE and VAE structures, with the idea of keeping time-dependent information through the AE branch and maintaining variability through the VAE branch.
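## Usage

A minimal sketch of the intended entry points, mirroring the script at the bottom of `indrnn_ae_vae.py` (note that the module currently runs that script at import time; `train.npy`/`test.npy` must be 3-dimensional arrays shaped `[samples, time_steps, features]`):

```python
import numpy as np
from indrnn_ae_vae import INDRNN_VAE

model = INDRNN_VAE(train_file='dataset/train.npy', z_dim=10, outlier_fraction=0.04)
model.train()

test = np.load('dataset/test.npy')
health = model.judge_health(test)   # per-sample health scores, roughly in [60, 100]
labels = model.judge_anomaly(test)  # -1 = anomaly, 1 = normal
```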
29 | 
--------------------------------------------------------------------------------
/LSTM_VAE/LSTM_VAE.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SchindlerLiang/VAE-for-Anomaly-Detection/061b6a68d8e4918c23ac154dcd1948d9941e5802/LSTM_VAE/LSTM_VAE.png
--------------------------------------------------------------------------------
/LSTM_VAE/LSTM_VAE.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | A simple implementation of an LSTM_VAE-based algorithm for Anomaly Detection in Multivariate Time Series;
4 | 
5 | Author: Schindler Liang
6 | 
7 | Reference:
8 |     https://www.researchgate.net/publication/304758073_LSTM-based_Encoder-Decoder_for_Multi-sensor_Anomaly_Detection
9 |     https://github.com/twairball/keras_lstm_vae
10 |     https://arxiv.org/pdf/1711.00614.pdf
11 | """
12 | import numpy as np
13 | import tensorflow as tf
14 | from tensorflow.nn.rnn_cell import MultiRNNCell, LSTMCell
15 | from utils import Data_Hanlder
16 | 
17 | 
18 | def lrelu(x, leak=0.2, name='lrelu'):
19 |     return tf.maximum(x, leak*x)
20 | 
21 | 
22 | def _LSTMCells(unit_list,act_fn_list):
23 |     return MultiRNNCell([LSTMCell(unit,
24 |                                   activation=act_fn)
25 |                          for unit,act_fn in zip(unit_list,act_fn_list)])
26 | 
27 | class LSTM_VAE(object):
28 |     def __init__(self,dataset_name,columns,z_dim,time_steps,outlier_fraction):
29 |         self.outlier_fraction = outlier_fraction
30 |         self.data_source = Data_Hanlder(dataset_name,columns,time_steps)
31 |         self.n_hidden = 16
32 |         self.batch_size = 128
33 |         self.learning_rate = 0.0005
34 |         self.train_iters = 4000
35 | 
36 |         self.input_dim = len(columns)
37 |         self.z_dim = z_dim
38 |         self.time_steps = time_steps
39 | 
40 |         self.pointer = 0
41 |         self.anomaly_score = 0
42 |         self.sess = tf.Session()
43 |         self._build_network()
44 |         self.sess.run(tf.global_variables_initializer())
45 | 
46 |     def _build_network(self):
47 |         with tf.variable_scope('ph'):
48 |             self.X = tf.placeholder(tf.float32,shape=[None,self.time_steps,self.input_dim],name='input_X')
49 | 
50 |         with tf.variable_scope('encoder'):
51 |             with tf.variable_scope('lat_mu'):
52 |                 mu_fw_lstm_cells = _LSTMCells([self.z_dim],[lrelu])
53 |                 mu_bw_lstm_cells = _LSTMCells([self.z_dim],[lrelu])
54 | 
55 |                 (mu_fw_outputs,mu_bw_outputs),_ = tf.nn.bidirectional_dynamic_rnn(  # unpack fw and bw outputs separately
56 |                     mu_fw_lstm_cells,
57 |                     mu_bw_lstm_cells,
58 |                     self.X, dtype=tf.float32)
59 |                 mu_outputs = tf.add(mu_fw_outputs,mu_bw_outputs)
60 | 
61 |             with tf.variable_scope('lat_sigma'):
62 |                 sigma_fw_lstm_cells = _LSTMCells([self.z_dim],[tf.nn.softplus])
63 |                 sigma_bw_lstm_cells = _LSTMCells([self.z_dim],[tf.nn.softplus])
64 |                 (sigma_fw_outputs,sigma_bw_outputs),_ = tf.nn.bidirectional_dynamic_rnn(
65 |                     sigma_fw_lstm_cells,
66 |                     sigma_bw_lstm_cells,
67 |                     self.X, dtype=tf.float32)
68 |                 sigma_outputs = tf.add(sigma_fw_outputs,sigma_bw_outputs)
69 |                 sample_Z = mu_outputs + sigma_outputs * tf.random_normal(
70 |                     tf.shape(mu_outputs),
71 |                     0,1,dtype=tf.float32)
72 | 
73 |         with tf.variable_scope('decoder'):
74 |             recons_lstm_cells = _LSTMCells([self.n_hidden,self.input_dim],[lrelu,lrelu])
75 |             self.recons_X,_ = tf.nn.dynamic_rnn(recons_lstm_cells, sample_Z, dtype=tf.float32)
76 | 
77 |         with tf.variable_scope('loss'):
78 |             reduce_dims = np.arange(1,tf.keras.backend.ndim(self.X))
79 |             recons_loss = tf.losses.mean_squared_error(self.X, self.recons_X)
80 |             kl_loss = - 0.5 * tf.reduce_mean(1 + tf.log(1e-8 + tf.square(sigma_outputs)) - tf.square(mu_outputs) - tf.square(sigma_outputs))  # sigma_outputs is a std (softplus), so the closed-form KL uses log(sigma^2) and sigma^2, as in MLP_VAE.py
81 |             self.opt_loss = recons_loss + kl_loss
82 |             self.all_losses = tf.reduce_sum(tf.square(self.X - self.recons_X),axis=reduce_dims)
83 | 
84 |         with tf.variable_scope('train'):
85 |             self.uion_train_op = tf.train.AdamOptimizer(self.learning_rate).minimize(self.opt_loss)
86 | 
87 | 
88 |     def train(self):
89 |         for i in range(self.train_iters):
90 |             this_X = self.data_source.fetch_data(self.batch_size)
91 |             self.sess.run([self.uion_train_op],feed_dict={
92 |                 self.X: this_X
93 |             })
94 |             if i % 200 ==0:
95 |                 mse_loss = self.sess.run([self.opt_loss],feed_dict={
96 |                     self.X: self.data_source.train
97 |                 })
98 |                 print('round {}: with loss: {}'.format(i,mse_loss))
99 |         self._arange_score(self.data_source.train)
100 | 
101 | 
102 |     def _arange_score(self,input_data):
103 |         input_all_losses = self.sess.run(self.all_losses,feed_dict={
104 |             self.X: input_data
105 |         })
106 |         self.anomaly_score = np.percentile(input_all_losses,(1-self.outlier_fraction)*100)
107 | 
108 |     def judge(self,test):
109 |         all_test_loss = self.sess.run(self.all_losses,feed_dict={
110 |             self.X: test
111 |         })
112 |         result = map(lambda x: 1 if x < self.anomaly_score else -1,all_test_loss)
113 | 
114 |         return list(result)
115 | 
116 | 
117 |     def plot_confusion_matrix(self):
118 |         predict_label = self.judge(self.data_source.test)
119 |         self.data_source.plot_confusion_matrix(self.data_source.test_label,predict_label,['Abnormal','Normal'],'LSTM_VAE Confusion-Matrix')
120 | 
121 | 
122 | def main():
123 | 
124 |     lstm_vae = LSTM_VAE('dataset/data0.csv',['v0','v1'],z_dim=8,time_steps=16,outlier_fraction=0.01)
125 |     lstm_vae.train()
126 |     lstm_vae.plot_confusion_matrix()
127 | 
128 | if __name__ == '__main__':
129 |     main()
130 | 
131 | 
--------------------------------------------------------------------------------
/LSTM_VAE/dataset/lstm_test.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SchindlerLiang/VAE-for-Anomaly-Detection/061b6a68d8e4918c23ac154dcd1948d9941e5802/LSTM_VAE/dataset/lstm_test.npy
--------------------------------------------------------------------------------
/LSTM_VAE/dataset/lstm_test_label.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SchindlerLiang/VAE-for-Anomaly-Detection/061b6a68d8e4918c23ac154dcd1948d9941e5802/LSTM_VAE/dataset/lstm_test_label.npy
--------------------------------------------------------------------------------
/LSTM_VAE/dataset/lstm_train.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SchindlerLiang/VAE-for-Anomaly-Detection/061b6a68d8e4918c23ac154dcd1948d9941e5802/LSTM_VAE/dataset/lstm_train.npy
--------------------------------------------------------------------------------
/LSTM_VAE/readme.md:
--------------------------------------------------------------------------------
1 | ## Reference
2 | LSTM_VAE used for Multivariate Time Series Anomaly Detection;
3 | [reference1](https://www.researchgate.net/publication/304758073_LSTM-based_Encoder-Decoder_for_Multi-sensor_Anomaly_Detection);
4 | [reference2](https://github.com/twairball/keras_lstm_vae);
5 | [reference3](https://arxiv.org/pdf/1711.00614.pdf);
6 | 
7 | ## Prerequisites
8 | * Python 3.3+
9 | * Tensorflow 1.12.0
10 | * Sklearn 0.20.1
11 | * Numpy 1.15.4
12 | * Pandas 0.23.4
13 | * Matplotlib 3.0.2
14 | 
15 | ## Dataset and Preprocessing
16 | The dataset used is the [MTSAD](https://github.com/jsonbruce/MTSAnomalyDetection), which has 2 feature dimensions.
17 | We use StandardScaler and MinMaxScaler to preprocess the initial data. We then reshape the dataset into 3-dimensional samples with time_steps of 10.
18 | For each sample, if ANY ONE of the 10 timesteps is labeled as abnormal, the corresponding 3-dimensional sample is labeled ABNORMAL;
19 | 
20 | In total, there are 55 abnormal samples and 8661 normal samples. We randomly select 8000 normal samples as the train dataset, and 661 normal samples plus the 55 abnormal samples as the test dataset. As a result, the abnormal samples constitute only 7.7% of the test dataset.
21 | 
22 | `LSTM_VAE should be trained on a NORMAL dataset. However, a dataset with only a few ABNORMAL samples is also acceptable, since we can adjust the hyper-parameter outliers_fraction, which may slightly influence the detection score.`
23 | 
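The window-labeling rule above can be stated compactly as follows (an illustrative sketch, matching `_data_arrage()` in `utils.py` further down):

```python
import numpy as np

def window_label(step_labels):
    # a window is ABNORMAL (-1) if ANY of its timesteps is abnormal
    return -1 if np.any(step_labels == -1) else 1

steps = np.array([1, 1, 1, -1, 1, 1, 1, 1, 1, 1])  # one abnormal timestep out of 10
print(window_label(steps))  # -1
```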
24 | ## Result
25 | The confusion_matrix of the test dataset is presented below:
26 | 
27 | ![Confusion_Matrix for LSTM_VAE](https://github.com/SchindlerLiang/VAE-for-Anomaly-Detection/blob/master/LSTM_VAE/LSTM_VAE.png)
28 | 
29 | It can be concluded from the above that LSTM_VAE is capable of capturing most of the outliers (anomalies) in the test dataset.
30 | 
31 | 
--------------------------------------------------------------------------------
/LSTM_VAE/utils.py:
--------------------------------------------------------------------------------
1 | import pandas as pd
2 | import numpy as np
3 | from sklearn.preprocessing import StandardScaler,MinMaxScaler
4 | import os
5 | from sklearn.metrics import confusion_matrix
6 | import matplotlib.pyplot as plt
7 | 
8 | '''
9 | time_steps = 10
10 | '''
11 | class Data_Hanlder(object):
12 | 
13 |     def __init__(self,dataset_name,columns,time_steps):
14 |         self.time_steps = time_steps
15 |         self.data = pd.read_csv(dataset_name,index_col=0)
16 |         self.columns = columns
17 | 
18 |         self.data['Class'] = 0
19 |         self.data['Class'] = self.data['result'].apply(lambda x: 1 if x=='normal' else -1)
20 |         self.data[self.columns] = self.data[self.columns].shift(-1) - self.data[self.columns]
21 |         self.data = self.data.dropna(how='any')
22 |         self.pointer = 0
23 |         self.train = np.array([])
24 |         self.test = np.array([])
25 |         self.test_label = np.array([])
26 | 
27 | 
28 |         self.split_fraction = 0.2
29 | 
30 | 
31 |     def _process_source_data(self):
32 | 
33 |         self._data_scale()
34 |         self._data_arrage()
35 |         self._split_save_data()
36 | 
37 |     def _data_scale(self):
38 | 
39 |         standscaler = StandardScaler()
40 |         mscaler = MinMaxScaler(feature_range=(0,1))
41 |         self.data[self.columns] = standscaler.fit_transform(self.data[self.columns])
42 |         self.data[self.columns] = mscaler.fit_transform(self.data[self.columns])
43 | 
44 | 
45 |     def _data_arrage(self):
46 | 
47 |         self.all_data = np.array([])
48 |         self.labels = np.array([])
49 |         d_array = self.data[self.columns].values
50 |         class_array = self.data['Class'].values
51 |         for index in range(self.data.shape[0]-self.time_steps+1):
52 |             this_array = d_array[index:index+self.time_steps].reshape((-1,self.time_steps,len(self.columns)))
53 |             time_steps_label = class_array[index:index+self.time_steps]
54 |             if np.any(time_steps_label==-1):
55 |                 this_label = -1
56 |             else:
57 |                 this_label = 1
58 |             if self.all_data.shape[0] == 0:
59 |                 self.all_data = this_array
60 |                 self.labels = this_label
61 |             else:
62 |                 self.all_data = np.concatenate([self.all_data,this_array],axis=0)
63 |                 self.labels = np.append(self.labels,this_label)
64 | 
65 |     def _split_save_data(self):
66 |         normal = self.all_data[self.labels==1]
67 |         abnormal = self.all_data[self.labels==-1]
68 | 
69 |         split_no = normal.shape[0] - abnormal.shape[0]
70 | 
71 |         self.train = normal[:split_no,:]
72 |         self.test = np.concatenate([normal[split_no:,:],abnormal],axis=0)
73 |         self.test_label = np.concatenate([np.ones(normal[split_no:,:].shape[0]),-np.ones(abnormal.shape[0])])
74 |         np.save('dataset/lstm_train.npy',self.train)  # file names match the arrays shipped in dataset/
75 |         np.save('dataset/lstm_test.npy',self.test)
76 |         np.save('dataset/lstm_test_label.npy',self.test_label)
77 | 
78 |     def _get_data(self):
79 |         if os.path.exists('dataset/lstm_train.npy'):
80 |             self.train = np.load('dataset/lstm_train.npy')
81 |             self.test = np.load('dataset/lstm_test.npy')
82 |             self.test_label = np.load('dataset/lstm_test_label.npy')
83 |             if self.train.ndim ==3:
84 |                 if self.train.shape[1] == self.time_steps and self.train.shape[2] == len(self.columns):  # cached arrays match the current config
85 |                     return 0
86 |         self._process_source_data()
87 | 
88 | 
89 |     def fetch_data(self,batch_size):
90 |         if self.train.shape[0] == 0:
91 |             self._get_data()
92 | 
93 |         if self.train.shape[0] < batch_size:
94 |             return_train = self.train
95 |         else:
96 |             if (self.pointer + 1) * batch_size >= self.train.shape[0]-1:
97 |                 return_train = self.train[self.pointer * batch_size:,]  # return the tail batch before resetting the pointer
98 |                 self.pointer = 0
99 |             else:
100 |                 return_train = self.train[self.pointer * batch_size:(self.pointer + 1) * batch_size,]  # slice the current batch, then advance
101 |                 self.pointer = self.pointer + 1
102 |             if return_train.ndim < self.train.ndim:
103 |                 return_train = np.expand_dims(return_train,0)
104 |         return return_train
105 | 
106 |     def plot_confusion_matrix(self,y_true, y_pred, labels,title):
107 |         cmap = plt.cm.binary
108 |         cm = confusion_matrix(y_true, y_pred)
109 |         tick_marks = np.array(range(len(labels))) + 0.5
110 |         np.set_printoptions(precision=2)
111 |         cm_normalized = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
112 |         plt.figure(figsize=(8, 4), dpi=120)
113 |         ind_array = np.arange(len(labels))
114 |         x, y = np.meshgrid(ind_array, ind_array)
115 |         intFlag = 0
116 |         for x_val, y_val in zip(x.flatten(), y.flatten()):
117 | 
118 |             if (intFlag):
119 |                 c = cm[y_val][x_val]
120 |                 plt.text(x_val, y_val, "%d" % (c,), color='red', fontsize=10, va='center', ha='center')
121 | 
122 |             else:
123 |                 c = cm_normalized[y_val][x_val]
124 |                 if (c > 0.01):
125 |                     plt.text(x_val, y_val, "%0.2f" % (c,), color='red', fontsize=10, va='center', ha='center')
126 |                 else:
127 |                     plt.text(x_val, y_val, "%d" % (0,), color='red', fontsize=10, va='center', ha='center')
128 |         if(intFlag):
129 |             plt.imshow(cm, interpolation='nearest', cmap=cmap)
130 |         else:
131 |             plt.imshow(cm_normalized, interpolation='nearest', cmap=cmap)
132 |         plt.gca().set_xticks(tick_marks, minor=True)
133 |         plt.gca().set_yticks(tick_marks, minor=True)
134 |         plt.gca().xaxis.set_ticks_position('none')
135 |         plt.gca().yaxis.set_ticks_position('none')
136 |         plt.grid(True, which='minor', linestyle='-')
137 |         plt.gcf().subplots_adjust(bottom=0.15)
138 |         plt.title(title)
139 |         plt.colorbar()
140 |         xlocations = np.array(range(len(labels)))
141 |         plt.xticks(xlocations, labels)
142 |         plt.yticks(xlocations, labels)
143 |         plt.ylabel('Index of True Classes')
144 |         plt.xlabel('Index of Predict Classes')
145 |         plt.show()
--------------------------------------------------------------------------------
/MLP_VAE/MLP_VAE.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | """
3 | Schindler Liang
4 | 
5 | MLP Variational AutoEncoder for Anomaly Detection
6 | reference: https://pdfs.semanticscholar.org/0611/46b1d7938d7a8dae70e3531a00fceb3c78e8.pdf
7 | """
8 | import random
9 | import tensorflow as tf
10 | import numpy as np
11 | import pandas as pd
12 | import matplotlib.pyplot as plt
13 | from sklearn.preprocessing import StandardScaler,MinMaxScaler
14 | from sklearn.metrics import confusion_matrix
15 | 
16 | 
17 | def lrelu(x, leak=0.2, name='lrelu'):
18 |     return tf.maximum(x, leak*x)
19 | 
20 | 
21 | def build_dense(input_vector,unit_no,activation):
22 |     return tf.layers.dense(input_vector,unit_no,activation=activation,
23 |                            kernel_initializer=tf.contrib.layers.xavier_initializer(),
24 |                            bias_initializer=tf.zeros_initializer())
25 | 
26 | class MLP_VAE:
27 |     def __init__(self,input_dim,lat_dim, outliers_fraction):
28 |         # input_paras:
29 |         #   input_dim: input dimension for X
30 |         #   lat_dim: latent dimension for Z
31 |         #   outliers_fraction: pre-estimated fraction of outliers in the training dataset
32 | 
33 |         self.outliers_fraction = outliers_fraction # for computing the threshold of the anomaly score
34 |         self.input_dim = input_dim
35 |         self.lat_dim = lat_dim # the lat_dim can exceed input_dim
36 | 
37 |         self.input_X = tf.placeholder(tf.float32,shape=[None,self.input_dim],name='source_x')
38 | 
39 |         self.learning_rate = 0.0005
40 |         self.batch_size = 32
41 |         # batch_size is kept smaller than the usual setting to obtain
42 |         # a relatively lower anomaly-score threshold
43 |         self.train_iter = 3000
44 |         self.hidden_units = 128
45 |         self._build_VAE()
46 |         self.sess = tf.Session()
47 |         self.sess.run(tf.global_variables_initializer())
48 |         self.pointer = 0
49 | 
50 |     def _encoder(self):
51 |         with tf.variable_scope('encoder',reuse=tf.AUTO_REUSE):
52 |             l1 = build_dense(self.input_X,self.hidden_units,activation=lrelu)
53 |             # l1 = tf.nn.dropout(l1,0.8)
54 |             l2 = build_dense(l1,self.hidden_units,activation=lrelu)
55 |             # l2 = tf.nn.dropout(l2,0.8)
56 |             mu = tf.layers.dense(l2,self.lat_dim)
57 |             sigma = tf.layers.dense(l2,self.lat_dim,activation=tf.nn.softplus)
58 |             sole_z = mu + sigma * tf.random_normal(tf.shape(mu),0,1,dtype=tf.float32)
59 |         return mu,sigma,sole_z
60 | 
61 |     def _decoder(self,z):
62 |         with tf.variable_scope('decoder',reuse=tf.AUTO_REUSE):
63 |             l1 = build_dense(z,self.hidden_units,activation=lrelu)
64 |             # l1 = tf.nn.dropout(l1,0.8)
65 |             l2 = build_dense(l1,self.hidden_units,activation=lrelu)
66 |             # l2 = tf.nn.dropout(l2,0.8)
67 |             recons_X = tf.layers.dense(l2,self.input_dim)
68 |         return recons_X
69 | 
70 | 
71 |     def _build_VAE(self):
72 |         self.mu_z,self.sigma_z,sole_z = self._encoder()
73 |         self.recons_X = self._decoder(sole_z)
74 | 
75 |         with tf.variable_scope('loss'):
76 |             KL_divergence = 0.5 * tf.reduce_sum(tf.square(self.mu_z) + tf.square(self.sigma_z) - tf.log(1e-8 + tf.square(self.sigma_z)) - 1, 1)
77 |             mse_loss = tf.reduce_sum(tf.square(self.input_X-self.recons_X), 1)
78 |             self.all_loss = mse_loss
79 |             self.loss = tf.reduce_mean(mse_loss + KL_divergence)
80 | 
81 |         with tf.variable_scope('train'):
82 |             self.train_op = tf.train.AdamOptimizer(self.learning_rate).minimize(self.loss)
83 | 
84 | 
85 |     def _fecth_data(self,input_data):
86 |         if (self.pointer+1) * self.batch_size >= input_data.shape[0]:
87 |             return_data = input_data[self.pointer*self.batch_size:,:]
88 |             self.pointer = 0
89 |         else:
90 |             return_data = input_data[ self.pointer*self.batch_size:(self.pointer+1)*self.batch_size,:]
91 |             self.pointer = self.pointer + 1
92 |         return return_data
93 | 
94 | 
95 | 
96 |     def train(self,train_X):
97 |         for index in range(self.train_iter):
98 |             this_X = self._fecth_data(train_X)
99 |             self.sess.run([self.train_op],feed_dict={
100 |                 self.input_X: this_X
101 |             })
102 |         self.arrage_recons_loss(train_X)
103 | 
104 | 
105 |     def arrage_recons_loss(self,input_data):
106 |         all_losses = self.sess.run(self.all_loss,feed_dict={
107 |             self.input_X: input_data
108 |         })
109 |         self.judge_loss = np.percentile(all_losses,(1-self.outliers_fraction)*100)  # e.g. outliers_fraction=0.07 -> 93rd percentile of training errors
110 | 
111 | 
112 |     def judge(self,input_data):
113 |         return_label = []
114 |         for index in range(input_data.shape[0]):
115 |             single_X = input_data[index].reshape(1,-1)
116 |             this_loss = self.sess.run(self.all_loss,feed_dict={  # score with the same per-sample reconstruction error used for the threshold
117 |                 self.input_X: single_X
118 |             })[0]
119 | 
120 |             if this_loss < self.judge_loss:
121 |                 return_label.append(1)
122 |             else:
123 |                 return_label.append(-1)
124 |         return return_label
125 | 
126 | def plot_confusion_matrix(y_true, y_pred, labels,title):
127 |     cmap = plt.cm.binary
128 |     cm = confusion_matrix(y_true, y_pred)
129 |     tick_marks = np.array(range(len(labels))) + 0.5
130 |     np.set_printoptions(precision=2)
131 |     cm_normalized = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
132 |     plt.figure(figsize=(4, 2), dpi=120)
133 |     ind_array = np.arange(len(labels))
134 |     x, y = np.meshgrid(ind_array, ind_array)
135 |     intFlag = 0
136 |     for x_val, y_val in zip(x.flatten(), y.flatten()):
137 |         #
138 | 
139 |         if (intFlag):
140 |             c = cm[y_val][x_val]
141 |             plt.text(x_val, y_val, "%d" % (c,), color='red', fontsize=8, va='center', ha='center')
142 | 
143 |         else:
144 |             c = cm_normalized[y_val][x_val]
145 |             if (c > 0.01):
146 |                 plt.text(x_val, y_val, "%0.2f" % (c,), color='red', fontsize=7, va='center', ha='center')
147 |             else:
148 |                 plt.text(x_val, y_val, "%d" % (0,), color='red', fontsize=7, va='center', ha='center')
149 |     if(intFlag):
150 |         plt.imshow(cm, interpolation='nearest', cmap=cmap)
151 |     else:
152 |         plt.imshow(cm_normalized, interpolation='nearest', cmap=cmap)
153 |     plt.gca().set_xticks(tick_marks, minor=True)
154 |     plt.gca().set_yticks(tick_marks, minor=True)
155 |     plt.gca().xaxis.set_ticks_position('none')
156 |     plt.gca().yaxis.set_ticks_position('none')
157 |     plt.grid(True, which='minor', linestyle='-')
158 |     plt.gcf().subplots_adjust(bottom=0.15)
159 |     plt.title(title)
160 |     plt.colorbar()
161 |     xlocations = np.array(range(len(labels)))
162 |     plt.xticks(xlocations, labels)
163 |     plt.yticks(xlocations, labels)
164 |     plt.ylabel('Index of True Classes')
165 |     plt.xlabel('Index of Predict Classes')
166 |     plt.show()
167 | 
168 | def mlp_vae_predict(train,test,test_label):
169 |     mlp_vae = MLP_VAE(8,20,0.07)
170 |     mlp_vae.train(train)
171 |     mlp_vae_predict_label = mlp_vae.judge(test)
172 |     plot_confusion_matrix(test_label, mlp_vae_predict_label, ['anomaly','normal'],'MLP_VAE Confusion-Matrix')
173 | 
174 | def iforest_predict(train,test,test_label):
175 |     from sklearn.ensemble import IsolationForest
176 |     iforest = IsolationForest(max_samples = 'auto',
177 |                               behaviour="new",contamination=0.01)
178 | 
179 |     iforest.fit(train)
180 |     iforest_predict_label = iforest.predict(test)
181 |     plot_confusion_matrix(test_label, iforest_predict_label, ['anomaly','normal'],'iforest Confusion-Matrix')
182 | 
183 | def lof_predict(train,test,test_label):
184 |     from sklearn.neighbors import LocalOutlierFactor
185 |     lof = LocalOutlierFactor(novelty=True,contamination=0.01)
186 |     lof.fit(train)
187 |     lof_predict_label = lof.predict(test)
188 |     plot_confusion_matrix(test_label, lof_predict_label, ['anomaly','normal'],'LOF Confusion-Matrix')
189 | 
190 | if __name__ == '__main__':
191 |     train = np.load('data/train.npy')
192 |     test = np.load('data/test.npy')
193 |     test_label = np.load('data/test_label.npy')
194 |     mlp_vae_predict(train,test,test_label)
195 |     iforest_predict(train,test,test_label)
196 |     lof_predict(train,test,test_label)
197 | 
198 | 
199 | 
200 | 
201 | 
202 | 
203 | 
--------------------------------------------------------------------------------
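For reference, the `KL_divergence` term in `_build_VAE()` above is the closed form of KL(N(mu, sigma^2) || N(0, 1)) = 0.5 * (mu^2 + sigma^2 - log(sigma^2) - 1), summed over latent dimensions. A tiny self-contained sanity check (illustrative, not part of the repository):

```python
import numpy as np

mu, sigma = 0.5, 1.2
kl_closed = 0.5 * (mu**2 + sigma**2 - np.log(sigma**2) - 1)  # same expression as the TF code above

# Monte-Carlo estimate of E_q[log q(z) - log p(z)] for comparison
z = np.random.normal(mu, sigma, size=1000000)
log_q = -0.5 * np.log(2 * np.pi * sigma**2) - (z - mu)**2 / (2 * sigma**2)
log_p = -0.5 * np.log(2 * np.pi) - z**2 / 2
print(kl_closed, np.mean(log_q - log_p))  # both ~0.163
```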
/MLP_VAE/data/a.py:
--------------------------------------------------------------------------------
1 | 
2 | 
--------------------------------------------------------------------------------
/MLP_VAE/data/test.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SchindlerLiang/VAE-for-Anomaly-Detection/061b6a68d8e4918c23ac154dcd1948d9941e5802/MLP_VAE/data/test.npy
--------------------------------------------------------------------------------
/MLP_VAE/data/test_label.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SchindlerLiang/VAE-for-Anomaly-Detection/061b6a68d8e4918c23ac154dcd1948d9941e5802/MLP_VAE/data/test_label.npy
--------------------------------------------------------------------------------
/MLP_VAE/data/train.npy:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SchindlerLiang/VAE-for-Anomaly-Detection/061b6a68d8e4918c23ac154dcd1948d9941e5802/MLP_VAE/data/train.npy
--------------------------------------------------------------------------------
/MLP_VAE/img/MLP_VAE.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SchindlerLiang/VAE-for-Anomaly-Detection/061b6a68d8e4918c23ac154dcd1948d9941e5802/MLP_VAE/img/MLP_VAE.png
--------------------------------------------------------------------------------
/MLP_VAE/img/a.txt:
--------------------------------------------------------------------------------
1 | 
2 | 
--------------------------------------------------------------------------------
/MLP_VAE/img/iforest.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SchindlerLiang/VAE-for-Anomaly-Detection/061b6a68d8e4918c23ac154dcd1948d9941e5802/MLP_VAE/img/iforest.png
--------------------------------------------------------------------------------
/MLP_VAE/img/lof.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/SchindlerLiang/VAE-for-Anomaly-Detection/061b6a68d8e4918c23ac154dcd1948d9941e5802/MLP_VAE/img/lof.png
--------------------------------------------------------------------------------
/MLP_VAE/readme.md:
--------------------------------------------------------------------------------
1 | 
2 | MLP_VAE used for anomaly detection;
3 | [reference](https://pdfs.semanticscholar.org/0611/46b1d7938d7a8dae70e3531a00fceb3c78e8.pdf);
4 | 
5 | The dataset used is the [HTRU2 Data Set](http://archive.ics.uci.edu/ml/datasets/HTRU2). This is an unbalanced dataset: samples with Class 1, treated as the anomaly class, constitute less than 10% of the entire dataset;
6 | 
7 | All dimensions are preprocessed with sklearn's StandardScaler and MinMaxScaler to better fit MLP_VAE;
8 | 
9 | The test results of MLP_VAE, IForest and LOF are presented as follows:
10 | 
11 | ![Confusion_Matrix for MLP_VAE](https://github.com/SchindlerLiang/VAE-for-Anomaly-Detection/blob/master/MLP_VAE/img/MLP_VAE.png)
12 | 
13 | ![Confusion_Matrix for Iforest](https://github.com/SchindlerLiang/VAE-for-Anomaly-Detection/blob/master/MLP_VAE/img/iforest.png)
14 | 
15 | ![Confusion_Matrix for LOF](https://github.com/SchindlerLiang/VAE-for-Anomaly-Detection/blob/master/MLP_VAE/img/lof.png)
16 | 
17 | The outliers_fraction for MLP_VAE is deliberately set to a different value to better compute the anomaly-score threshold. It can be seen from the above that MLP_VAE obtains results on par with IForest and LOF;
18 | 
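A minimal usage sketch (mirroring the `__main__` block of `MLP_VAE.py`; the preprocessed HTRU2 arrays shipped in `data/` have 8 features per sample):

```python
import numpy as np
from MLP_VAE import MLP_VAE, plot_confusion_matrix

train = np.load('data/train.npy')
test = np.load('data/test.npy')
test_label = np.load('data/test_label.npy')

model = MLP_VAE(input_dim=8, lat_dim=20, outliers_fraction=0.07)
model.train(train)
pred = model.judge(test)  # 1 = normal, -1 = anomaly
plot_confusion_matrix(test_label, pred, ['anomaly', 'normal'], 'MLP_VAE Confusion-Matrix')
```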
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # VAE-for-Anomaly-Detection
2 | MLP_VAE, Anomaly Detection, LSTM_VAE, Multivariate Time-Series Anomaly Detection, IndRNN_VAE, High-Frequency Sensor Anomaly Detection, Tensorflow
--------------------------------------------------------------------------------