├── README.md ├── get_autoenc_adasyn_synthetic.py ├── iwgan-autoenc-novelty.py ├── iwgan-discrim-novelty.py ├── iwgan-seq2seq-synthetic-mult-min.py ├── iwgan-seq2seq.py ├── iwgan-synthetic-mult-min.py ├── iwgan.py ├── requirements.txt ├── run_autoenc.py ├── run_seq2one.py ├── run_seq2one_output.py ├── run_seq2seq.py ├── run_seq2seq_output.py └── utils ├── AttentionLSTM.py ├── AttentionWithContext.py ├── config.py ├── keras_models.py ├── model_outputs.py ├── tf_models.py ├── tf_models_seq2seq.py └── train_models.py /README.md: -------------------------------------------------------------------------------- 1 | This repository contains implementations of the models discussed in the paper 2 | ["Autoencoders and Generative Adversarial Networks for Anomaly Detection for Sequences"](https://arxiv.org/abs/1901.02514) 3 | by Stephanie Ger and Diego Klabjan. 4 | 5 | ## Dependencies 6 | Tensorflow 1.12.0 (and all dependencies) Keras 2.1.5 (and all dependencies) 7 | 8 | ## Table of Contents 9 | * Data 10 | * Baseline Models 11 | * GAN Based Models 12 | * ADASYN with Autoencoder Models 13 | 14 | ## Data 15 | Models were evaluated on two public datasets and these datasets are available [here](https://northwestern.box.com/s/lt1mkyjhbl0ksq21y1m0o9qpkd6g5ib5). The file norm-sentiment-0.01.tar.gz refers to the sentiment dataset with 1% imbalance and the norm-sentiment-0.05.tar.gz is the sentiment dataset with 5% imbalance. The files with power in the filename contain the power datasets. We provide ensembled power datasets with 5 different seeds. Each .zip file contains the ensembled training data, validation and test data. Minority and majority data is also included to train GAN and autoencoder models for the oversampling methods described in the paper. All data files are stored as numpy arrays. 16 | 17 | ## Baseline Models 18 | The baseline model is run using the run_seq2one.py or run_seq2seq.py scripts depending on if the label vector is a 19 | sequence or not. 
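As a minimal sketch of how the baseline scripts are invoked, the helper below assembles the positional arguments that run_seq2one.py and run_seq2seq.py read from `sys.argv` (train_folder, data_set, data, seed, in that order). The helper name `baseline_command` and the example folder `../data/power_seed0` are hypothetical and not part of the repository.

```python
import shlex
import sys


def baseline_command(script, train_folder, data_set, data, seed):
    """Build the argv list for run_seq2one.py or run_seq2seq.py (hypothetical helper)."""
    if data not in ("Power", "Sentiment"):
        raise ValueError("data must be 'Power' or 'Sentiment'")
    # The scripts build paths as train_folder + 'x_val.npy', so a trailing slash is required.
    if not train_folder.endswith("/"):
        train_folder += "/"
    return [sys.executable, script, train_folder, data_set, data, str(seed)]


# Example folder name is an assumption; substitute the extracted dataset folder.
cmd = baseline_command("run_seq2one.py", "../data/power_seed0", "power", "Power", 0)
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root, or the printed line can be pasted into a shell.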
The F1-scores for the validation and test sets can be computed using the run_seq2one_output.py or 20 | run_seq2seq_output.py script, respectively. 21 | 22 | ## GAN Based Models 23 | For novelty detection with either the GAN autoencoder or the GAN discriminator as the novelty detection method, first a GAN 24 | is trained on majority data using the iwgan.py script. Then, the two novelty detection methods can be run with the 25 | iwgan-autoenc-novelty.py and iwgan-discrim-novelty.py scripts respectively. 26 | 27 | For GAN based synthetic data generation, a GAN is trained on minority data with the iwgan.py script or iwgan-seq2seq.py 28 | script depending on whether the label vector is a sequence. Then, synthetic data can be generated with 29 | iwgan-synthetic-mult-min.py or iwgan-seq2seq-synthetic-mult-min.py respectively and the seq2one or seq2seq model can then be 30 | run. 31 | 32 | ## ADASYN with Autoencoder Models 33 | For ADASYN with Autoencoder, the run_autoenc.py script can be used to train the autoencoder model on the minority data. 34 | Then get_autoenc_adasyn_synthetic.py can be used to generate the synthetic data. The training set with the synthetic 35 | data can be used to train a seq2one model with the run_seq2one.py script. 36 | -------------------------------------------------------------------------------- /get_autoenc_adasyn_synthetic.py: -------------------------------------------------------------------------------- 1 | import sys 2 | from utils.keras_models import Autoencoder 3 | import glob 4 | 5 | ''' 6 | This script runs the ADASYN algorithm on the trained autoencoder from run_autoenc.py to generate synthetic minority 7 | data. 
8 | 9 | Inputs: 10 | ensem_folder: folder where training data is located 11 | model_folder: folder where the trained model is located 12 | data_set: name to give folder with ADASYN synthetic data 13 | data: data type to invoke the correct Config file, either 'Power' for power dataset or 'Sentiment' for sentiment dataset 14 | ''' 15 | 16 | # parse command-line arguments 17 | ensem_folder = sys.argv[1] 18 | model_folder = sys.argv[2] 19 | model_h5 = max(glob.glob(model_folder+'model??.h5')) 20 | data_set = sys.argv[3] 21 | data = sys.argv[4] 22 | save_folder = '../'+data_set+'_autoenc_syn_adasyn_ensem/' 23 | 24 | # generate synthetic data with ADASYN 25 | autoencoder = Autoencoder(data) 26 | autoencoder.runAdasyn(ensem_folder, model_h5, save_folder) 27 | -------------------------------------------------------------------------------- /iwgan-autoenc-novelty.py: -------------------------------------------------------------------------------- 1 | import sys 2 | from utils.tf_models import * 3 | import numpy as np 4 | import os 5 | 6 | ''' 7 | Use this script to do novelty detection with GAN autoencoder once a GAN is trained on majority data. 
8 | 9 | Inputs: 10 | data_dir: directory where training/test/validation data is saved 11 | data: data type to invoke the correct Config file, either 'Power' for power dataset or 'Sentiment' for sentiment dataset 12 | data_set: name to give training output folder 13 | checkpoint: saved model weights in trained GAN model folder to use for novelty detection 14 | epoch: epoch of GAN training that the weights are taken from (used to name the output folder) 15 | ''' 16 | 17 | data_dir = sys.argv[1] 18 | data = sys.argv[2] 19 | data_set = sys.argv[3] 20 | checkpoint = sys.argv[4] 21 | epoch = sys.argv[5] 22 | 23 | # load relevant data from the data directory 24 | x_test = np.load(data_dir+'x_test.npy') 25 | y_test = np.load(data_dir+'y_test.npy') 26 | x_val = np.load(data_dir+'x_val.npy') 27 | y_val = np.load(data_dir+'y_val.npy') 28 | 29 | # build save directory path 30 | save_dir = '../'+data_set+'_iwgan_novelty/' + epoch + '/' 31 | 32 | # build IWGAN model and run novelty detection 33 | iwgan = IWGAN(data) 34 | iwgan.iwganNovelty(save_dir, x_val, y_val, x_test, y_test, checkpoint) 35 | -------------------------------------------------------------------------------- /iwgan-discrim-novelty.py: -------------------------------------------------------------------------------- 1 | import sys 2 | from utils.tf_models import * 3 | import numpy as np 4 | 5 | ''' 6 | Use this script to do novelty detection with GAN discriminator once a GAN is trained on majority data. 
7 | 8 | Inputs: 9 | data_dir: directory where training/test/validation data is saved 10 | data: data type to invoke the correct Config file, either 'Power' for power dataset or 'Sentiment' for sentiment dataset 11 | data_set: name to give training output folder 12 | checkpoint: saved model weights in trained GAN model folder to use for novelty detection 13 | epoch: epoch of GAN training that the weights are taken from (used to name the output folder) 14 | ''' 15 | 16 | data_dir = sys.argv[1] 17 | data = sys.argv[2] 18 | data_set = sys.argv[3] 19 | checkpoint = sys.argv[4] 20 | epoch = sys.argv[5] 21 | 22 | # load relevant data from the data directory 23 | x_test = np.load(data_dir+'x_test.npy') 24 | y_test = np.load(data_dir+'y_test.npy') 25 | x_val = np.load(data_dir+'x_val.npy') 26 | y_val = np.load(data_dir+'y_val.npy') 27 | 28 | # build save directory path 29 | save_dir = '../'+data_set+'_iwgan_discrim/' + epoch + '/' 30 | 31 | # build IWGAN model and run discriminator-based novelty detection 32 | iwgan = IWGAN(data) 33 | iwgan.iwganDisc(save_dir, x_val, y_val, x_test, y_test, checkpoint) 34 | 35 | -------------------------------------------------------------------------------- /iwgan-seq2seq-synthetic-mult-min.py: -------------------------------------------------------------------------------- 1 | import sys 2 | from utils.tf_models_seq2seq import * 3 | import numpy as np 4 | 5 | ''' 6 | Use this script to generate synthetic data on minority given a GAN model trained on minority for sequence label 7 | vectors. 
8 | 9 | Inputs: 10 | data_dir: directory where training data is located 11 | data: data type to invoke the correct Config file, either 'Power' for power dataset or 'Sentiment' for sentiment dataset 12 | data_set: name to give folder where training data with GAN synthetic data will be located 13 | checkpoint: saved model weights in trained GAN model folder to use for data generation 14 | epoch: epoch of GAN training where weights for model generation are taken from 15 | fake_real_flag: either 'FAKE' to add noise to generator during data generation or 'REAL' to generate data without added 16 | noise 17 | ''' 18 | 19 | data_dir = sys.argv[1] 20 | data = sys.argv[2] 21 | data_set = sys.argv[3] 22 | checkpoint = sys.argv[4] 23 | epoch = sys.argv[5] 24 | fake_real_flag = sys.argv[6] 25 | 26 | y_min = np.load(data_dir+'y_min.npy') 27 | gen_ensem_save_dir = '../'+fake_real_flag+'_'+data_set+'_iwgan_syn/' 28 | ensem_dir = data_dir+'ensem_' 29 | save_dir = '../'+fake_real_flag+'_'+data_set+'_'+epoch+'_iwgan_syn_ensem/' 30 | syn_dir = gen_ensem_save_dir 31 | 32 | iwgan = seq2seqIWGAN(data) 33 | iwgan.iwganGenEnsemFolder(data_dir, gen_ensem_save_dir, checkpoint, fake_real_flag) 34 | iwgan.integrateSynthetic(ensem_dir, syn_dir, save_dir) -------------------------------------------------------------------------------- /iwgan-seq2seq.py: -------------------------------------------------------------------------------- 1 | import sys 2 | from utils.tf_models_seq2seq import * 3 | import numpy as np 4 | 5 | ''' 6 | Use this script to train GAN with Autoencoder on data (either majority or minority) for sequence labels. 
7 | 8 | Inputs: 9 | x_train: training data file for GAN model 10 | y_train: training label file for GAN model 11 | data_set: name to give training output folder 12 | data: data type to invoke the correct Config file, either 'Power' for power dataset or 'Sentiment' for sentiment dataset 13 | ''' 14 | 15 | x_train = np.load(sys.argv[1]) 16 | y_train = np.load(sys.argv[2]) 17 | data_set = sys.argv[3] 18 | data = sys.argv[4] 19 | 20 | save_folder = '../'+data_set+'_iwgan_out/' 21 | 22 | iwgan = seq2seqIWGAN(data) 23 | model_train = iwgan.iwganAeTrain(x_train, y_train, save_folder) 24 | -------------------------------------------------------------------------------- /iwgan-synthetic-mult-min.py: -------------------------------------------------------------------------------- 1 | import sys 2 | from utils.tf_models import * 3 | import numpy as np 4 | import os 5 | 6 | ''' 7 | Use this script to generate synthetic data on minority given a GAN model trained on minority for non-sequence label 8 | vectors. 
9 | 10 | Inputs: 11 | data_dir: directory where training data is located 12 | data: data type to invoke the correct Config file, either 'Power' for power dataset or 'Sentiment' for sentiment dataset 13 | data_set: name to give folder where training data with GAN synthetic data will be located 14 | checkpoint: saved model weights in trained GAN model folder to use for data generation 15 | epoch: epoch of GAN training where weights for model generation are taken from 16 | fake_real_flag: either 'FAKE' to add noise to generator during data generation or 'REAL' to generate data without added 17 | noise 18 | ''' 19 | 20 | data_dir = sys.argv[1] 21 | data = sys.argv[2] 22 | data_set = sys.argv[3] 23 | checkpoint = sys.argv[4] 24 | epoch = sys.argv[5] 25 | fake_real_flag = sys.argv[6] 26 | 27 | y_min = np.load(data_dir+'y_min.npy') 28 | gen_ensem_save_dir = '../'+fake_real_flag+'_'+data_set + '_' + epoch + '_iwgan_syn/' 29 | ensem_dir = data_dir+'ensem_' 30 | save_dir = '../'+fake_real_flag+'_'+data_set + '_' + epoch + '_iwgan_syn_ensem/' 31 | syn_dir = gen_ensem_save_dir + 'synthetic_data_' 32 | 33 | iwgan = IWGAN(data) 34 | iwgan.iwganGenEnsemFolder(data_dir, gen_ensem_save_dir, checkpoint, fake_real_flag) 35 | iwgan.integrateSynthetic(ensem_dir, syn_dir, save_dir, y_min) -------------------------------------------------------------------------------- /iwgan.py: -------------------------------------------------------------------------------- 1 | import sys 2 | from utils.tf_models import * 3 | import numpy as np 4 | 5 | ''' 6 | Use this script to train GAN with Autoencoder on data (either majority or minority) where labels are not sequences. 
7 | 8 | Inputs: 9 | x_train: training data file for GAN model 10 | data_set: name to give training output folder 11 | data: data type to invoke the correct Config file, either 'Power' for power dataset or 'Sentiment' for sentiment dataset 12 | ''' 13 | 14 | x_train = np.load(sys.argv[1]) 15 | data_set = sys.argv[2] 16 | data = sys.argv[3] 17 | 18 | save_folder = '../'+data_set+'_iwgan_out/' 19 | 20 | iwgan = IWGAN(data) 21 | model_train = iwgan.iwganAeTrain(x_train, save_folder) 22 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | absl-py==0.7.0 2 | astor==0.7.1 3 | boto==2.49.0 4 | boto3==1.9.134 5 | botocore==1.12.134 6 | bz2file==0.98 7 | certifi==2019.3.9 8 | chardet==3.0.4 9 | cycler==0.10.0 10 | docutils==0.14 11 | gast==0.2.2 12 | gensim==3.7.2 13 | grpcio==1.19.0 14 | h5py==2.9.0 15 | idna==2.8 16 | imbalanced-learn==0.4.3 17 | jmespath==0.9.4 18 | Keras==2.2.4 19 | Keras-Applications==1.0.7 20 | Keras-Preprocessing==1.0.9 21 | kiwisolver==1.0.1 22 | Markdown==3.0.1 23 | matplotlib==3.0.3 24 | mock==2.0.0 25 | nltk==3.4.1 26 | numpy==1.16.2 27 | pandas==0.24.2 28 | pbr==5.1.2 29 | protobuf==3.6.1 30 | pycurl==7.43.0 31 | pygobject==3.20.0 32 | pyparsing==2.4.0 33 | python-apt==1.1.0b1+ubuntu0.16.4.2 34 | python-dateutil==2.8.0 35 | pytz==2019.1 36 | PyYAML==5.1 37 | requests==2.21.0 38 | s3transfer==0.2.0 39 | scikit-learn==0.20.3 40 | scipy==1.2.1 41 | seaborn==0.9.0 42 | six==1.12.0 43 | smart-open==1.8.2 44 | tensorboard==1.13.0 45 | tensorflow-estimator==1.13.0 46 | tensorflow-gpu==1.13.1 47 | termcolor==1.1.0 48 | urllib3==1.24.2 49 | Werkzeug==0.14.1 50 | -------------------------------------------------------------------------------- /run_autoenc.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import numpy as np 3 | from utils.train_models import 
AutoencoderTraining 4 | from utils.keras_models import Autoencoder 5 | 6 | ''' 7 | This script trains an autoencoder for the autoencoder with adasyn oversampling method. 8 | 9 | Inputs: 10 | x_train: training data, typically a set of minority data 11 | data_set: name to give training output folder 12 | data: data type to invoke the correct Config file, either 'Power' for power dataset or 'Sentiment' for sentiment dataset 13 | ''' 14 | 15 | # get data 16 | x_train = np.load(sys.argv[1]) 17 | data_set = sys.argv[2] 18 | data = sys.argv[3] 19 | save_folder = '../'+data_set+'_autoenc-out/' 20 | 21 | model= Autoencoder(data) 22 | autoencoder, hidden = model.Autoencoder() 23 | print(autoencoder.summary()) 24 | 25 | train = AutoencoderTraining(data) 26 | train.trainAutoenc(autoencoder, x_train, save_folder) 27 | 28 | 29 | 30 | -------------------------------------------------------------------------------- /run_seq2one.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import numpy as np 3 | seed_ = int(sys.argv[4]) 4 | from numpy.random import seed 5 | seed(seed_) 6 | from tensorflow import set_random_seed 7 | set_random_seed(seed_) 8 | from utils.config import * 9 | from utils.train_models import Seq2oneTraining 10 | from utils.keras_models import Seq2oneModel 11 | 12 | ''' 13 | This script trains a seq2one model on ensembled training data. 
14 | 15 | Inputs: 16 | train_folder: folder where training data is located 17 | data_set: name to give training output folder 18 | data: data type to invoke the correct Config file, either 'Power' for power dataset or 'Sentiment' for sentiment dataset 19 | seed: set seed for tensorflow and numpy 20 | ''' 21 | 22 | # load files 23 | train_folder = sys.argv[1] 24 | x_val = np.load(train_folder+'x_val.npy') 25 | y_val = np.load(train_folder+'y_val.npy') 26 | train_data = train_folder + 'ensem_' 27 | data_set = sys.argv[2] 28 | data = sys.argv[3] 29 | 30 | if data == 'Sentiment': 31 | Config = SentimentConfig() 32 | elif data == 'Power': 33 | Config = PowerConfig() 34 | else: 35 | raise ValueError('Invalid value for data option') 36 | 37 | save_folder = '../'+data_set+'_seq2one-out-'+str(Config.NUM_LAYERS)+'/' 38 | 39 | # build model 40 | seq2one = Seq2oneModel(data) 41 | model = seq2one.seq2oneModel() 42 | 43 | # model trainer 44 | train = Seq2oneTraining(data) 45 | train.runSeq2oneEnsemble(seq2one.seq2oneModel, train_data, x_val, y_val, save_folder) 46 | -------------------------------------------------------------------------------- /run_seq2one_output.py: -------------------------------------------------------------------------------- 1 | import sys 2 | from utils.config import * 3 | from utils.model_outputs import Seq2oneModelOutput 4 | from utils.keras_models import Seq2oneModel 5 | import numpy as np 6 | 7 | ''' 8 | This script is used to get the validation and test F1 score on a trained seq2one model. Validation and test F1-scores 9 | are saved as numpy arrays in the model_folder. 
10 | 11 | Inputs: 12 | model_folder: folder where the trained model is located 13 | data_folder: folder where training data is located 14 | data: data type to invoke the correct Config file, either 'Power' for power dataset or 'Sentiment' for sentiment dataset 15 | ''' 16 | 17 | # load files 18 | model_folder = sys.argv[1] 19 | data_folder = sys.argv[2] 20 | x_val = np.load(data_folder+'x_val.npy') 21 | y_val = np.load(data_folder+'y_val.npy') 22 | x_test = np.load(data_folder+'x_test.npy') 23 | y_test = np.load(data_folder+'y_test.npy') 24 | data = sys.argv[3] 25 | 26 | if data == 'Sentiment': 27 | Config = SentimentConfig() 28 | elif data == 'Power': 29 | Config = PowerConfig() 30 | else: 31 | raise ValueError('Invalid value for data option') 32 | 33 | 34 | # build model 35 | seq2one = Seq2oneModel(data) 36 | model = seq2one.seq2oneModel() 37 | 38 | # model trainer 39 | getoutput = Seq2oneModelOutput(data) 40 | getoutput.ensembleoutput(x_val,y_val, x_test, y_test, model_folder, seq2one.seq2oneModel) -------------------------------------------------------------------------------- /run_seq2seq.py: -------------------------------------------------------------------------------- 1 | import sys 2 | import numpy as np 3 | seed_ = int(sys.argv[4]) 4 | from numpy.random import seed 5 | seed(seed_) 6 | from tensorflow import set_random_seed 7 | set_random_seed(seed_) 8 | from utils.config import * 9 | from utils.train_models import Seq2seqTraining 10 | from utils.keras_models import Seq2seqModel 11 | 12 | ''' 13 | This script trains a seq2seq model on ensembled training data. 
14 | 15 | Inputs: 16 | train_folder: folder where training data is located 17 | data_set: name to give training output folder 18 | data: data type to invoke the correct Config file, either 'Power' for power dataset or 'Sentiment' for sentiment dataset 19 | seed: set seed for tensorflow and numpy 20 | ''' 21 | 22 | # load files 23 | train_folder = sys.argv[1] 24 | x_val = np.load(train_folder+'x_val.npy') 25 | y_val = np.load(train_folder+'y_val.npy') 26 | train_data = train_folder + 'ensem_' 27 | data_set = sys.argv[2] 28 | data = sys.argv[3] 29 | 30 | if data == 'Sentiment': 31 | Config = SentimentConfig() 32 | elif data == 'Power': 33 | Config = PowerConfig() 34 | else: 35 | raise ValueError('Invalid value for data option') 36 | 37 | save_folder = '../'+data_set+'_seq2seq-out-'+str(Config.NUM_LAYERS)+'/' 38 | 39 | # build model 40 | seq2seq = Seq2seqModel(data) 41 | model = seq2seq.seq2seqModel() 42 | 43 | # model trainer 44 | train = Seq2seqTraining(data) 45 | train.runSeq2seqEnsemble(seq2seq.seq2seqModel, train_data, x_val, y_val, save_folder) 46 | -------------------------------------------------------------------------------- /run_seq2seq_output.py: -------------------------------------------------------------------------------- 1 | import sys 2 | from utils.config import * 3 | from utils.model_outputs import Seq2seqModelOutput 4 | from utils.keras_models import Seq2seqModel 5 | import numpy as np 6 | 7 | ''' 8 | This script is used to get the validation and test F1 score on a trained seq2seq model. Validation and test F1-scores 9 | are saved as numpy arrays in the model_folder. 
10 | 11 | Inputs: 12 | model_folder: folder where the trained model is located 13 | data_folder: folder where training data is located 14 | data: data type to invoke the correct Config file, either 'Power' for power dataset or 'Sentiment' for sentiment dataset 15 | ''' 16 | 17 | # load files 18 | model_folder = sys.argv[1] 19 | data_folder = sys.argv[2] 20 | x_val = np.load(data_folder+'x_val.npy') 21 | y_val = np.load(data_folder+'y_val.npy') 22 | x_test = np.load(data_folder+'x_test.npy') 23 | y_test = np.load(data_folder+'y_test.npy') 24 | idx = np.load(data_folder + 'idx.npy') 25 | data = sys.argv[3] 26 | 27 | if data == 'Sentiment': 28 | Config = SentimentConfig() 29 | elif data == 'Power': 30 | Config = PowerConfig() 31 | else: 32 | raise ValueError('Invalid value for data option') 33 | 34 | # build model 35 | seq2seq = Seq2seqModel(data) 36 | model = seq2seq.seq2seqModel() 37 | 38 | # model trainer 39 | getoutput = Seq2seqModelOutput(data) 40 | getoutput.ensembleoutput(x_val,y_val, x_test, y_test, model_folder, seq2seq.seq2seqModel, idx) -------------------------------------------------------------------------------- /utils/AttentionWithContext.py: -------------------------------------------------------------------------------- 1 | from keras.layers.core import Layer 2 | from keras import initializers, regularizers, constraints 3 | from keras import backend as K 4 | 5 | 6 | class AttentionWithContext(Layer): 7 | """ 8 | Attention operation, with a context/query vector, for temporal data. 9 | Supports Masking. 10 | Follows the work of Yang et al. [https://www.cs.cmu.edu/~diyiy/docs/naacl16.pdf] 11 | "Hierarchical Attention Networks for Document Classification" 12 | by using a context vector to assist the attention 13 | # Input shape 14 | 3D tensor with shape: `(samples, steps, features)`. 15 | # Output shape 16 | 2D tensor with shape: `(samples, features)`. 
17 | :param kwargs: 18 | Just put it on top of an RNN Layer (GRU/LSTM/SimpleRNN) with return_sequences=True. 19 | The dimensions are inferred based on the output shape of the RNN. 20 | Example: 21 | model.add(LSTM(64, return_sequences=True)) 22 | model.add(AttentionWithContext()) 23 | """ 24 | 25 | def __init__(self, 26 | W_regularizer=None, u_regularizer=None, b_regularizer=None, 27 | W_constraint=None, u_constraint=None, b_constraint=None, 28 | bias=True, **kwargs): 29 | 30 | self.supports_masking = True 31 | self.init = initializers.get('glorot_uniform') 32 | 33 | self.W_regularizer = regularizers.get(W_regularizer) 34 | self.u_regularizer = regularizers.get(u_regularizer) 35 | self.b_regularizer = regularizers.get(b_regularizer) 36 | 37 | self.W_constraint = constraints.get(W_constraint) 38 | self.u_constraint = constraints.get(u_constraint) 39 | self.b_constraint = constraints.get(b_constraint) 40 | 41 | self.bias = bias 42 | super(AttentionWithContext, self).__init__(**kwargs) 43 | 44 | def build(self, input_shape): 45 | assert len(input_shape) == 3 46 | 47 | self.W = self.add_weight((input_shape[-1], input_shape[-1],), 48 | initializer=self.init, 49 | name='{}_W'.format(self.name), 50 | regularizer=self.W_regularizer, 51 | constraint=self.W_constraint) 52 | if self.bias: 53 | self.b = self.add_weight((input_shape[-1],), 54 | initializer='zero', 55 | name='{}_b'.format(self.name), 56 | regularizer=self.b_regularizer, 57 | constraint=self.b_constraint) 58 | 59 | self.u = self.add_weight((input_shape[-1],), 60 | initializer=self.init, 61 | name='{}_u'.format(self.name), 62 | regularizer=self.u_regularizer, 63 | constraint=self.u_constraint) 64 | 65 | super(AttentionWithContext, self).build(input_shape) 66 | 67 | def compute_mask(self, input, input_mask=None): 68 | # do not pass the mask to the next layers 69 | return None 70 | 71 | def call(self, x, mask=None): 72 | uit = dot_product(x, self.W) 73 | 74 | if self.bias: 75 | uit += self.b 76 | 77 | uit = 
K.tanh(uit) 78 | ait = dot_product(uit, self.u) 79 | 80 | a = K.exp(ait) 81 | 82 | # apply mask after the exp. will be re-normalized next 83 | if mask is not None: 84 | # Cast the mask to floatX to avoid float64 upcasting in theano 85 | a *= K.cast(mask, K.floatx()) 86 | 87 | # in some cases especially in the early stages of training the sum may be almost zero 88 | # and this results in NaN's. A workaround is to add a very small positive number ε to the sum. 89 | # a /= K.cast(K.sum(a, axis=1, keepdims=True), K.floatx()) 90 | a /= K.cast(K.sum(a, axis=1, keepdims=True) + K.epsilon(), K.floatx()) 91 | 92 | a = K.expand_dims(a) 93 | weighted_input = x * a 94 | return K.sum(weighted_input, axis=1) 95 | 96 | def get_output_shape_for(self, input_shape): 97 | return input_shape[0], input_shape[-1] 98 | 99 | def compute_output_shape(self, input_shape): 100 | """Shape transformation logic so Keras can infer output shape 101 | """ 102 | return (input_shape[0], input_shape[-1]) 103 | 104 | 105 | def dot_product(x, kernel): 106 | """ 107 | Wrapper for dot product operation, in order to be compatible with both 108 | Theano and Tensorflow 109 | Args: 110 | x (): input 111 | kernel (): weights 112 | Returns: 113 | """ 114 | if K.backend() == 'tensorflow': 115 | # todo: check that this is correct 116 | return K.squeeze(K.dot(x, K.expand_dims(kernel)), axis=-1) 117 | else: 118 | return K.dot(x, kernel) -------------------------------------------------------------------------------- /utils/config.py: -------------------------------------------------------------------------------- 1 | ## Configuration file for Power dataset 2 | class PowerConfig: 3 | 4 | # parameters for preprocessing data 5 | PAD = False 6 | 7 | # parameters for data 8 | TIMESTEPS = 20 9 | DECODESTEPS = 4 10 | DATA_DIM = 6 11 | NUM_CLASSES = 2 12 | 13 | # model building parameters 14 | HIDDEN_NEURONS = 8 15 | DENSE_HIDDEN_NEURONS = 8 16 | NUM_LAYERS = 5 17 | 18 | # model training parameters 19 | BATCH_SIZE = 256 20 
| GAN_BATCH_SIZE = 1024 21 | EPOCHS = 100 22 | NUM_ENSEMBLES = 10 23 | 24 | # smote model parameters 25 | N = 1 # number of times to resample each point 26 | K = 5 # 27 | BETA = 0.001 28 | 29 | # autoencoder parameters 30 | # model building parameters 31 | AE_EPOCHS = 200 32 | GAN_AE_EPOCHS = 3000 33 | G_HIDDEN_NEURONS = 64 34 | G_NUM_LAYERS = 2 35 | D_HIDDEN_NEURONS = 64 36 | D_NUM_LAYERS = 2 37 | A_HIDDEN_NEURONS = 64 38 | A_NUM_LAYERS=2 39 | AE_DENSE_NEURONS = 32 40 | DROPOUT_IN = 0.8 41 | DROPOUT_OUT = 0.8 42 | DROPOUT_STATE = 1 43 | INSTANCE_NOISE = True 44 | 45 | # hyperparameters 46 | LAMBDA = 10 47 | MU = 1 48 | 49 | # training parameters 50 | GEN_CRITIC_ITERS = 2 51 | DISC_CRITIC_ITERS = 1 52 | LEARNING_RATE = 1e-4 53 | DISC_LEARNING_RATE = 1e-5 54 | EP = 1e-10 55 | CHECKPOINT_STEP = 100 56 | 57 | # parameters for generating data 58 | NUM_SYN_ITER = 3 59 | LAST_EPOCH = GAN_AE_EPOCHS-1 60 | AE_NUM_ENSEMBLES = 10 61 | 62 | 63 | class SentimentConfig: 64 | 65 | # parameters for preprocessing data 66 | PAD = True 67 | 68 | # parameters for data 69 | TIMESTEPS = 600 70 | DATA_DIM = 300 71 | NUM_CLASSES = 2 72 | 73 | # model building parameters 74 | HIDDEN_NEURONS = 64 75 | DENSE_HIDDEN_NEURONS = 64 76 | NUM_LAYERS = 3 77 | 78 | # model training parameters 79 | BATCH_SIZE = 128 80 | GAN_BATCH_SIZE = 64 81 | EPOCHS = 100 82 | NUM_ENSEMBLES = 10 83 | 84 | # smote model parameters 85 | N = 1 # number of times to resample each point 86 | K = 5 # 87 | BETA = 0.001 88 | 89 | # autoencoder parameters 90 | # model building parameters 91 | AE_EPOCHS = 200 92 | GAN_AE_EPOCHS = 2500 93 | G_HIDDEN_NEURONS = 64 94 | G_NUM_LAYERS = 2 95 | D_HIDDEN_NEURONS = 64 96 | D_NUM_LAYERS = 2 97 | A_HIDDEN_NEURONS = 64 98 | A_NUM_LAYERS=2 99 | AE_DENSE_NEURONS = 32 100 | DROPOUT_IN = 0.8 101 | DROPOUT_OUT = 0.8 102 | DROPOUT_STATE = 1 103 | INSTANCE_NOISE = True 104 | 105 | # hyperparameters 106 | LAMBDA = 10 107 | MU = 0.1 108 | 109 | # training parameters 110 | GEN_CRITIC_ITERS = 1 
111 | DISC_CRITIC_ITERS = 2 112 | LEARNING_RATE = 1e-3 113 | DISC_LEARNING_RATE = 1e-3 114 | EP = 1e-10 115 | CHECKPOINT_STEP = 100 116 | 117 | # parameters for generating data 118 | NUM_SYN_ITER = 3 119 | LAST_EPOCH = GAN_AE_EPOCHS-1 120 | AE_NUM_ENSEMBLES = 10 121 | -------------------------------------------------------------------------------- /utils/keras_models.py: -------------------------------------------------------------------------------- 1 | from keras.layers import Dense, Bidirectional, LSTM, BatchNormalization, Dropout, RepeatVector, Lambda, GRU 2 | from keras.models import * 3 | from numpy.random import randint, rand 4 | from sklearn.neighbors import NearestNeighbors 5 | from utils.config import * 6 | from utils.AttentionWithContext import AttentionWithContext 7 | from imblearn.over_sampling import SMOTE, ADASYN 8 | from utils.AttentionLSTM import AttentionDecoder 9 | 10 | 11 | class Seq2seqModel: 12 | 13 | def __init__(self, data): 14 | 15 | if data == 'Sentiment': 16 | self.Config = SentimentConfig() 17 | elif data == 'Power': 18 | self.Config = PowerConfig() 19 | else: 20 | raise ValueError('Invalid value for data option') 21 | 22 | def seq2seqModel(self): 23 | 24 | inputs = Input(shape=(self.Config.TIMESTEPS, self.Config.DATA_DIM,)) 25 | 26 | # define the first layer of the encoder separately 27 | enc, state_h, state_c = LSTM(self.Config.HIDDEN_NEURONS, return_sequences=True, stateful=False, dropout=0.2, 28 | recurrent_dropout=0.2, activation='relu', return_state=True, 29 | kernel_initializer='orthogonal')(inputs) 30 | 31 | 32 | # iteratively add total number of layers to encoder 33 | for i in range(self.Config.NUM_LAYERS - 1): 34 | enc, state_h, state_c = LSTM(self.Config.HIDDEN_NEURONS, return_sequences=True, stateful=False, 35 | dropout=0.2, recurrent_dropout=0.2, activation='relu', 36 | return_state=True, kernel_initializer='orthogonal')(enc) 37 | 38 | # for attention decoder 39 | decoder_input = Input(shape=(1, self.Config.NUM_CLASSES,)) 40 
| 41 | decoder_lstm = AttentionDecoder(self.Config.HIDDEN_NEURONS, return_sequences=True, 42 | return_state=True) # , dropout=0.2, recurrent_dropout=0.2) 43 | decoder_dropout = Dropout(0.2) 44 | decoder_dense = Dense(self.Config.NUM_CLASSES, activation='softmax') 45 | all_outputs = [] 46 | 47 | # initialize decoder 48 | states = [state_h, state_c] 49 | inputs_ = decoder_input 50 | 51 | for _ in range(self.Config.DECODESTEPS): 52 | atten, state_h, state_c = decoder_lstm(inputs_, initial_state=states, constants=enc) 53 | outputs = Lambda(lambda x: K.expand_dims(x, axis=1))(state_h) 54 | outputs = decoder_dropout(outputs) 55 | outputs = decoder_dense(outputs) 56 | 57 | # append each output to all_outputs 58 | all_outputs.append(outputs) 59 | 60 | # reset states 61 | states = [state_h, state_c] 62 | inputs_ = outputs 63 | 64 | decoder_outputs = Lambda(lambda x: K.concatenate(x, axis=1))(all_outputs) 65 | 66 | # build model 67 | model = Model(inputs=[inputs, decoder_input], outputs=[decoder_outputs]) 68 | 69 | # define optimizer 70 | adadelta = optimizers.Adadelta(lr=1.0, rho=0.95, epsilon=1e-08, decay=0.0, clipvalue=100) 71 | 72 | # compile model 73 | model.compile(loss='categorical_crossentropy', optimizer=adadelta, metrics=['accuracy']) 74 | 75 | return model 76 | 77 | 78 | class Seq2oneModel: 79 | 80 | def __init__(self, data): 81 | 82 | if data == 'Sentiment': 83 | self.Config = SentimentConfig() 84 | elif data == 'Power': 85 | self.Config = PowerConfig() 86 | else: 87 | raise ValueError('Invalid value for data option') 88 | 89 | # use this to build a model that uses the seq2one model setup 90 | def seq2oneModel(self): 91 | # need to define the inputs 92 | inputs = Input(shape=(self.Config.TIMESTEPS, self.Config.DATA_DIM,)) 93 | 94 | # define the first layer of the encoder separately 95 | enc, state_h, state_c = LSTM(self.Config.HIDDEN_NEURONS, return_sequences=True, stateful=False, dropout=0.2, 96 | recurrent_dropout=0.2, activation='relu', return_state=True, 97 | 
kernel_initializer='orthogonal')(inputs) 98 | 99 | # iteratively add total number of layers to encoder 100 | for i in range(self.Config.NUM_LAYERS - 1): 101 | enc, state_h, state_c = LSTM(self.Config.HIDDEN_NEURONS, return_sequences=True, stateful=False, 102 | dropout=0.2, recurrent_dropout=0.2, activation='relu', 103 | return_state=True, kernel_initializer='orthogonal')(enc) 104 | 105 | # deal with attention now 106 | atten = AttentionWithContext()(enc) 107 | 108 | encoder_states = [atten, state_c] 109 | 110 | decoder_input = Input(shape=(1, self.Config.NUM_CLASSES)) 111 | decoder_lstm = LSTM(self.Config.HIDDEN_NEURONS, activation='relu', dropout=0.2, recurrent_dropout=0.2, 112 | return_sequences=True, kernel_initializer='orthogonal') 113 | 114 | # decode with LSTM and dense layers 115 | outputs = decoder_lstm(decoder_input, initial_state=encoder_states) 116 | for i in range(self.Config.NUM_LAYERS-1): 117 | outputs = Dense(self.Config.DENSE_HIDDEN_NEURONS)(outputs) 118 | outputs = Dropout(0.2)(outputs) 119 | 120 | outputs = Dense(self.Config.NUM_CLASSES, activation='softmax')(outputs) 121 | 122 | # build model 123 | model = Model(inputs=[inputs, decoder_input], outputs=[outputs]) 124 | 125 | # define optimizer 126 | adadelta = optimizers.Adadelta(lr=1.0, rho=0.95, epsilon=1e-08, decay=0.0, clipvalue=100) 127 | 128 | # compile model 129 | model.compile(loss='categorical_crossentropy', optimizer=adadelta, metrics=['accuracy']) 130 | 131 | return model 132 | 133 | 134 | class Autoencoder: 135 | 136 | def __init__(self, data): 137 | 138 | if data == 'Sentiment': 139 | self.Config = SentimentConfig() 140 | elif data == 'Power': 141 | self.Config = PowerConfig() 142 | else: 143 | raise ValueError('Invalid value for data option') 144 | 145 | def Autoencoder(self): 146 | 147 | # need to define the inputs 148 | inputs = Input(shape=(self.Config.TIMESTEPS, self.Config.DATA_DIM,)) 149 | 150 | # define the first layer of the encoder separately 151 | enc, state_h, state_c = 
LSTM(self.Config.HIDDEN_NEURONS, return_sequences=True, stateful=False, 152 | activation='relu', dropout=0.2, recurrent_dropout=0.2, 153 | return_state=True, kernel_initializer='orthogonal')(inputs) 154 | 155 | # iteratively add total number of layers to encoder 156 | for i in range(self.Config.NUM_LAYERS - 1): 157 | enc, state_h, state_c = LSTM(self.Config.HIDDEN_NEURONS, return_sequences=True, stateful=False, 158 | activation='relu', dropout=0.2, recurrent_dropout=0.2, 159 | return_state=True, kernel_initializer='orthogonal')(enc) 160 | 161 | # deal with attention now 162 | atten = AttentionWithContext()(enc) 163 | atten_repeat = RepeatVector(self.Config.TIMESTEPS)(atten) 164 | 165 | # deal with decoder now 166 | decoder_lstm = Bidirectional(LSTM(self.Config.HIDDEN_NEURONS, activation='relu', return_sequences=True, dropout=0.2, 167 | recurrent_dropout=0.2, kernel_initializer='orthogonal')) 168 | 169 | # make decoder 170 | outputs = decoder_lstm(atten_repeat) 171 | outputs = Dense(self.Config.DATA_DIM, activation='tanh')(outputs) 172 | 173 | # build autoencoder 174 | model = Model(inputs=[inputs], outputs=outputs) 175 | hidden_states = Model(inputs=model.inputs, outputs=atten) 176 | 177 | adadelta = optimizers.Adadelta(lr=1.0, rho=0.95, epsilon=1e-08, decay=0.0, clipvalue=100) 178 | 179 | model.compile(loss='mse', 180 | optimizer=adadelta, 181 | metrics=['accuracy']) 182 | 183 | return model, hidden_states 184 | 185 | 186 | # load models and build encoder, decoder and autoencoder 187 | def loadAutoencoder(self, model_h5): 188 | 189 | # need to define the inputs 190 | inputs = Input(shape=(self.Config.TIMESTEPS, self.Config.DATA_DIM,)) 191 | 192 | # define the first layer of the encoder separately 193 | enc, state_h, state_c = LSTM(self.Config.HIDDEN_NEURONS, return_sequences=True, stateful=False, 194 | activation='relu', dropout=0.2, recurrent_dropout=0.2, 195 | return_state=True, kernel_initializer='orthogonal')(inputs) 196 | 197 | # iteratively add total number 
of layers to encoder 198 | for i in range(self.Config.NUM_LAYERS - 1): 199 | enc, state_h, state_c = LSTM(self.Config.HIDDEN_NEURONS, return_sequences=True, stateful=False, 200 | activation='relu', dropout=0.2, recurrent_dropout=0.2, 201 | return_state=True, kernel_initializer='orthogonal')(enc) 202 | 203 | # deal with attention now 204 | atten = AttentionWithContext()(enc) 205 | atten_repeat = RepeatVector(self.Config.TIMESTEPS)(atten) 206 | 207 | # deal with decoder now 208 | decoder_lstm = Bidirectional(LSTM(self.Config.HIDDEN_NEURONS, activation='relu', return_sequences=True, dropout=0.2, 209 | recurrent_dropout=0.2, kernel_initializer='orthogonal')) 210 | 211 | # make decoder 212 | outputs = decoder_lstm(atten_repeat) 213 | outputs = Dense(self.Config.DATA_DIM, activation='tanh')(outputs) 214 | 215 | # build autoencoder 216 | model = Model(inputs=[inputs], outputs=outputs) 217 | 218 | adadelta = optimizers.Adadelta(lr=1.0, rho=0.95, epsilon=1e-08, decay=0.0) 219 | 220 | model.compile(loss='mse', 221 | optimizer=adadelta, 222 | metrics=['accuracy']) 223 | 224 | model.load_weights(model_h5) 225 | 226 | encoder = Model(inputs=inputs, outputs=atten) 227 | 228 | encoder.compile(loss='mse', 229 | optimizer=adadelta, 230 | metrics=['accuracy']) 231 | 232 | decoder_layers = model.layers[-3:] 233 | inputs = Input(shape=(self.Config.HIDDEN_NEURONS, )) 234 | decode = decoder_layers[0](inputs) 235 | for i in range(1, len(decoder_layers)): 236 | decode = decoder_layers[i](decode) 237 | 238 | decoder = Model(inputs=[inputs], outputs=decode) 239 | 240 | return model, encoder, decoder 241 | 242 | def integrateSynDat(self, syn_dir, save_dir, data_dir): 243 | 244 | if not os.path.exists(save_dir): 245 | os.makedirs(save_dir) 246 | 247 | for i in range(self.Config.NUM_ENSEMBLES): 248 | print(i) 249 | e_dat = np.load(data_dir + 'dat' + str(i) + '.npy') 250 | e_lab = np.load(data_dir + 'lab' + str(i) + '.npy') 251 | s_lab = np.load(syn_dir + 'synthetic_lab' + str(i) + '.npy') 252
| s_dat = np.load(syn_dir + 'synthetic_dat' + str(i) + '.npy') 253 | 254 | c_dat = np.concatenate((e_dat, s_dat), axis=0) 255 | c_lab = np.concatenate((e_lab, s_lab), axis=0) 256 | 257 | shuffle = np.random.choice(len(c_lab), len(c_lab), replace=False) 258 | 259 | np.save(save_dir + 'lab_' + str(i) + '.npy', c_lab[shuffle]) 260 | np.save(save_dir + 'dat' + str(i) + '.npy', c_dat[shuffle]) 261 | 262 | return 263 | 264 | def runAdasyn(self, ensem_folder, model_h5, save_dir): 265 | 266 | if not os.path.exists(save_dir): 267 | os.makedirs(save_dir) 268 | 269 | # build and load models 270 | autoencoder, encoder, decoder = self.loadAutoencoder(model_h5) 271 | 272 | for ensem in range(self.Config.NUM_ENSEMBLES): 273 | 274 | dat = np.load(ensem_folder+'ensem_dat'+str(ensem)+'.npy') 275 | lab = np.load(ensem_folder+'ensem_lab'+str(ensem)+'.npy') 276 | dat_ = encoder.predict(dat) 277 | 278 | # resize data 279 | if len(lab.shape) == 3: 280 | lab = lab[:, -1, :] 281 | lab = np.argmax(lab, axis=1) 282 | else: 283 | lab = np.argmax(lab, axis=1) 284 | 285 | # run adasyn 286 | print(ensem) 287 | print('run ADASYN') 288 | 289 | ada = ADASYN(ratio='minority', random_state=42) 290 | 291 | # fit smote object 292 | print('fit smote object for ensem ' + str(ensem)) 293 | x_res, y_res = ada.fit_sample(dat_, lab) 294 | 295 | x_syn = decoder.predict(x_res) 296 | 297 | y_res_ = [] 298 | for i in range(len(y_res)): 299 | if y_res[i] == 0: 300 | y_res_ += [np.array([1, 0])] 301 | else: 302 | y_res_ += [np.array([0, 1])] 303 | 304 | y_res_ = np.array(y_res_) 305 | 306 | # save data 307 | print('save ensem ' + str(ensem)) 308 | np.save(save_dir + 'ensem_dat' + str(ensem) + '.npy', x_syn) 309 | np.save(save_dir + 'ensem_lab' + str(ensem) + '.npy', y_res_) 310 | 311 | 312 | return 313 | 314 | -------------------------------------------------------------------------------- /utils/model_outputs.py: -------------------------------------------------------------------------------- 1 | from 
sklearn.metrics import f1_score 2 | from utils.config import * 3 | from utils.keras_models import * 4 | import numpy as np 5 | 6 | class Seq2seqModelOutput: 7 | 8 | def __init__(self, data): 9 | 10 | if data == 'Sentiment': 11 | self.Config = SentimentConfig() 12 | elif data == 'Power': 13 | self.Config = PowerConfig() 14 | else: 15 | raise ValueError('Invalid value for data option') 16 | 17 | def ensembleoutput(self, x_val, y_val, x_test, y_test, model_folder, model_func, idx): 18 | 19 | # create dictionary for getting label 20 | dict_ = dict((tuple(x.tolist()), i) for (i, x) in enumerate(idx)) 21 | 22 | print('transform labels') 23 | y_val.resize((y_val.shape[0], y_val.shape[1]*y_val.shape[2])) 24 | y_test.resize((y_test.shape[0], y_test.shape[1] * y_test.shape[2])) 25 | y_val = [dict_[tuple(x.tolist())] for x in y_val] 26 | y_test = [dict_[tuple(x.tolist())] for x in y_test] 27 | 28 | # set decoder inputs 29 | decoder_val = np.zeros((len(y_val), 1, self.Config.NUM_CLASSES)) 30 | decoder_test = np.zeros((len(y_test), 1, self.Config.NUM_CLASSES)) 31 | 32 | model = model_func() 33 | 34 | # predict on model for validation data 35 | print('predict on validation data') 36 | pred_sum = np.zeros((len(y_val), self.Config.DECODESTEPS, self.Config.NUM_CLASSES)) 37 | for i in range(self.Config.NUM_ENSEMBLES): 38 | model.load_weights(model_folder + 'seq_ensem' + str(i) + '.h5') 39 | pred = model.predict([x_val, decoder_val], batch_size=self.Config.BATCH_SIZE) 40 | pred_sum += pred 41 | 42 | pred_avg = pred_sum/float(self.Config.NUM_ENSEMBLES) 43 | 44 | # format the prediction and label 45 | pred = np.round(pred_avg) 46 | pred.resize((pred.shape[0], pred.shape[1]*pred.shape[2])) 47 | y_pred = [dict_[tuple(x.tolist())] for x in pred] 48 | 49 | val_f1 = [np.mean(f1_score(y_val, y_pred, average=None)[1:])] 50 | 51 | np.save(model_folder + 'val_lab.npy', y_val) 52 | np.save(model_folder + 'val_pred.npy', pred) 53 | 54 | # predict on model for test data 55 | print('predict on test 
data') 56 | pred_sum = np.zeros((len(y_test), self.Config.DECODESTEPS, self.Config.NUM_CLASSES)) 57 | for i in range(self.Config.NUM_ENSEMBLES): 58 | model.load_weights(model_folder + 'seq_ensem' + str(i) + '.h5') 59 | pred = model.predict([x_test, decoder_test], batch_size=self.Config.BATCH_SIZE) 60 | pred_sum += pred 61 | 62 | pred_avg = pred_sum/float(self.Config.NUM_ENSEMBLES) 63 | pred = np.round(pred_avg) 64 | pred.resize((pred.shape[0], pred.shape[1] * pred.shape[2])) 65 | y_pred = [dict_[tuple(x.tolist())] for x in pred] 66 | 67 | test_f1 = [np.mean(f1_score(y_test, y_pred, average=None)[1:])] 68 | 69 | np.save(model_folder + 'test_lab.npy', y_test) 70 | np.save(model_folder + 'test_pred.npy', pred) 71 | 72 | # save outputs 73 | np.save(model_folder+'ensem_test_f1.npy', test_f1) 74 | np.save(model_folder+'ensem_val_f1.npy', val_f1) 75 | 76 | return 77 | 78 | 79 | class Seq2oneModelOutput: 80 | 81 | def __init__(self, data): 82 | 83 | if data == 'Sentiment': 84 | self.Config = SentimentConfig() 85 | elif data == 'Power': 86 | self.Config = PowerConfig() 87 | else: 88 | raise ValueError('Invalid value for data option') 89 | 90 | # go to ensembling output method 91 | def ensembleoutput(self, x_val, y_val, x_test, y_test, model_folder, model_func): 92 | 93 | print('load data') 94 | # load validation data and labels 95 | if len(y_val.shape) == 3: 96 | y_val = y_val[:, -1, :] 97 | y_val = y_val.argmax(axis=1) 98 | 99 | # load test data and labels 100 | if len(y_test.shape) == 3: 101 | y_test = y_test[:, -1, :] 102 | y_test = y_test.argmax(axis=1) 103 | 104 | # set decoder inputs 105 | decoder_val = np.zeros((len(y_val), 1, self.Config.NUM_CLASSES)) 106 | decoder_test = np.zeros((len(y_test), 1, self.Config.NUM_CLASSES)) 107 | 108 | 109 | model = model_func() 110 | 111 | # predict on model for validation data 112 | print('predict on validation data') 113 | pred_sum = np.zeros((len(y_val), self.Config.NUM_CLASSES)) 114 | for i in range(self.Config.NUM_ENSEMBLES): 
115 | model.load_weights(model_folder + 'seq_ensem' + str(i) + '.h5') 116 | pred = model.predict([x_val, decoder_val], batch_size=self.Config.BATCH_SIZE) 117 | pred.resize((len(pred), self.Config.NUM_CLASSES)) 118 | pred_sum += pred 119 | 120 | pred_avg = pred_sum/float(self.Config.NUM_ENSEMBLES) 121 | pred = np.argmax(pred_avg, axis=1) 122 | 123 | if self.Config.NUM_CLASSES == 2: 124 | val_f1 = [f1_score(y_val, pred)] 125 | else: 126 | val_f1 = [np.mean(f1_score(y_val, pred, average=None)[1:])] 127 | 128 | np.save(model_folder + 'val_lab.npy', y_val) 129 | np.save(model_folder + 'val_pred.npy', pred) 130 | 131 | # predict on model for test data 132 | print('predict on test data') 133 | pred_sum = np.zeros((len(y_test), self.Config.NUM_CLASSES)) 134 | for i in range(self.Config.NUM_ENSEMBLES): 135 | model.load_weights(model_folder + 'seq_ensem' + str(i) + '.h5') 136 | pred = model.predict([x_test, decoder_test], batch_size=self.Config.BATCH_SIZE) 137 | pred.resize((len(pred), self.Config.NUM_CLASSES)) 138 | pred_sum += pred 139 | 140 | pred_avg = pred_sum/float(self.Config.NUM_ENSEMBLES) 141 | pred = np.argmax(pred_avg, axis=1) 142 | 143 | if self.Config.NUM_CLASSES == 2: 144 | test_f1 = [f1_score(y_test, pred)] 145 | else: 146 | test_f1 = [np.mean(f1_score(y_test, pred, average=None)[1:])] 147 | 148 | np.save(model_folder + 'test_lab.npy', y_test) 149 | np.save(model_folder + 'test_pred.npy', pred) 150 | 151 | # save outputs 152 | np.save(model_folder+'ensem_test_f1.npy', test_f1) 153 | np.save(model_folder+'ensem_val_f1.npy', val_f1) 154 | 155 | return -------------------------------------------------------------------------------- /utils/tf_models.py: -------------------------------------------------------------------------------- 1 | import os 2 | import keras.backend as K 3 | import numpy as np 4 | import tensorflow as tf 5 | import tensorflow.contrib.rnn as rnn 6 | from sklearn.metrics import f1_score 7 | from sklearn.metrics import mean_squared_error as mse 
8 | from utils.config import * 9 | from utils.AttentionWithContext import AttentionWithContext 10 | 11 | 12 | # implementation of the improved Wasserstein GAN (IWGAN) in TensorFlow 13 | 14 | class IWGAN: 15 | 16 | def __init__(self, data): 17 | 18 | if data == 'Sentiment': 19 | self.Config = SentimentConfig() 20 | elif data == 'Power': 21 | self.Config = PowerConfig() 22 | else: 23 | raise ValueError('Invalid value for data option') 24 | 25 | self.batchsize = self.Config.GAN_BATCH_SIZE 26 | self.timesteps = self.Config.TIMESTEPS 27 | self.data_dim = self.Config.DATA_DIM 28 | self.data = data 29 | 30 | # define leakyrelu activation 31 | def leakyrelu(self, x, alpha=0.3, name='lrelu'): 32 | return tf.maximum(x, alpha*x) 33 | 34 | # define generator model 35 | def generator(self, noisy, real_data, noise_level, reuse=False): 36 | 37 | with tf.variable_scope('generator', reuse=reuse) as scope: 38 | 39 | # define LSTM with dropout 40 | def lstm_(): 41 | return rnn.DropoutWrapper(rnn.LSTMCell(self.Config.G_HIDDEN_NEURONS), 42 | input_keep_prob=self.Config.DROPOUT_IN, 43 | output_keep_prob=self.Config.DROPOUT_OUT) 44 | 45 | # define noisy hidden state for fake data 46 | h_state = tf.random_normal([self.batchsize, self.Config.G_HIDDEN_NEURONS]) 47 | c_state = tf.zeros([self.batchsize, self.Config.G_HIDDEN_NEURONS]) 48 | 49 | lstm_state = rnn.LSTMStateTuple(c_state, h_state) 50 | 51 | # stack layers of LSTM 52 | stacked_lstm = rnn.MultiRNNCell( 53 | [lstm_() for _ in range(self.Config.G_NUM_LAYERS)]) 54 | 55 | # set initial state: noisy hidden state for fake data, zero state for real data 56 | if noisy: 57 | init_state = [lstm_state for _ in range(self.Config.G_NUM_LAYERS)] 58 | init_state = tuple(init_state) 59 | 60 | else: 61 | init_state = stacked_lstm.zero_state(self.batchsize, tf.float32) 62 | 63 | 64 | outputs, states = tf.nn.dynamic_rnn(stacked_lstm, real_data, initial_state=init_state, time_major=False, 65 | scope=scope) 66 | 67 | outputs = self.leakyrelu(outputs) 68 | 69 | 70 | if
self.Config.INSTANCE_NOISE: 71 | output_noise = tf.random_normal([self.batchsize, self.timesteps, self.Config.G_HIDDEN_NEURONS], 72 | stddev=noise_level) 73 | outputs += output_noise 74 | 75 | 76 | return outputs, states 77 | 78 | # define discriminator model 79 | def discriminator(self, inputs, reuse=False): 80 | 81 | with tf.variable_scope('discriminator', reuse=reuse) as scope: 82 | 83 | # build RNN here 84 | lstm_ = rnn.LSTMCell(self.Config.D_HIDDEN_NEURONS) 85 | lstm_ = rnn.DropoutWrapper(lstm_, input_keep_prob=self.Config.DROPOUT_IN, 86 | output_keep_prob=self.Config.DROPOUT_OUT) 87 | 88 | stacked_lstm = rnn.MultiRNNCell( 89 | [lstm_ for _ in range(self.Config.D_NUM_LAYERS)]) 90 | 91 | # initialize initial state to 0 92 | init_state = stacked_lstm.zero_state(self.batchsize, tf.float32) 93 | 94 | # bidirectional RNN 95 | outputs, _, _ = rnn.static_bidirectional_rnn(stacked_lstm, stacked_lstm, 96 | tf.unstack(tf.transpose(inputs, perm=[1, 0, 2])), 97 | initial_state_fw=init_state, initial_state_bw=init_state, 98 | scope=scope) 99 | 100 | outputs = tf.contrib.layers.fully_connected(outputs, self.Config.AE_DENSE_NEURONS, reuse=reuse, scope=scope, 101 | activation_fn=None) 102 | # specify activation function here 103 | outputs = self.leakyrelu(outputs) 104 | # dropout 105 | outputs = tf.nn.dropout(outputs, keep_prob = self.Config.DROPOUT_OUT) 106 | outputs = tf.transpose(outputs, perm=[1, 0, 2]) 107 | 108 | # use only the last hidden state as input to dense layer 109 | outputs = tf.slice(outputs, [0, self.Config.TIMESTEPS - 1, 0],[self.batchsize, 1, 110 | self.Config.AE_DENSE_NEURONS]) 111 | # use last hidden state as input to a dense layer 112 | outputs = tf.layers.dense(outputs, int(self.Config.AE_DENSE_NEURONS/2), name='discriminator/pre_output') 113 | # specify activation function here 114 | outputs = self.leakyrelu(outputs) 115 | # dropout 116 | outputs = tf.nn.dropout(outputs, keep_prob=self.Config.DROPOUT_OUT) 117 | # final output of discriminator 118 | 
outputs = tf.layers.dense(outputs, 1, name='discriminator/output') 119 | 120 | return outputs 121 | 122 | # build autoencoder seq2seq model with predictions 123 | def autoencoder_seq2seq_pred(self, inputs, states, data_in, reuse=False): 124 | 125 | with tf.variable_scope('autoencoder', reuse=reuse) as scope: 126 | # get attention from hidden states of generator 127 | atten = AttentionWithContext()(inputs) 128 | 129 | states = tf.unstack(states, axis=0) 130 | # set initial cell state to cell state from generator 131 | # set initial hidden state to attention of hidden states from generator 132 | rnn_tuple_state = tuple( 133 | [tf.nn.rnn_cell.LSTMStateTuple(states[idx][0], atten) for idx in range(self.Config.A_NUM_LAYERS)]) 134 | 135 | def lstm_(): 136 | return rnn.DropoutWrapper(rnn.LSTMCell(self.Config.A_HIDDEN_NEURONS), 137 | input_keep_prob=self.Config.DROPOUT_IN, output_keep_prob=self.Config.DROPOUT_OUT) 138 | 139 | stacked_lstm = rnn.MultiRNNCell( 140 | [lstm_() for _ in range(self.Config.G_NUM_LAYERS)]) 141 | 142 | ae_in = tf.zeros([self.batchsize, 1, self.Config.DATA_DIM]) 143 | 144 | all_outputs = [] 145 | 146 | # feed in the ground-truth value of the previous timestep as the current input (teacher forcing) 147 | for j in range(self.Config.TIMESTEPS): 148 | outputs, states = tf.nn.dynamic_rnn(stacked_lstm, ae_in, initial_state=rnn_tuple_state, 149 | time_major=False, 150 | scope=scope) 151 | outputs = self.leakyrelu(outputs) 152 | outputs = tf.layers.dense(outputs, self.Config.DATA_DIM, name='autoencoder/output') 153 | 154 | all_outputs += [outputs] 155 | ae_in = tf.expand_dims(data_in[:, j, :], axis=1) 156 | 157 | scope.reuse_variables() 158 | 159 | outputs = tf.concat(all_outputs, 1) 160 | 161 | return outputs 162 | 163 | def iwganAeTrain(self, x_train, save_dir): 164 | 165 | d_loss = [] 166 | d_loss_diff = [] 167 | g_loss = [] 168 | a_loss = [] 169 | g_loss_diff = [] 170 | 171 | if not os.path.exists(save_dir): 172 | os.makedirs(save_dir) 173 | 174 | data_in = tf.placeholder(tf.float32,
shape=[None, self.timesteps, self.data_dim]) 175 | noise = tf.placeholder(tf.float32, shape=()) 176 | 177 | fake_data, fake_states = self.generator(True, data_in, noise, reuse=False) 178 | real_data, real_states = self.generator(False, data_in, noise, reuse=True) 179 | disc_fake = self.discriminator(fake_data, reuse=False) 180 | disc_real = self.discriminator(real_data, reuse=True) 181 | 182 | autoenc_real = self.autoencoder_seq2seq_pred(real_data, real_states, data_in, reuse=False) 183 | 184 | # calculate loss 185 | ae_cost = tf.reduce_mean(tf.norm(tf.reshape(data_in, [self.batchsize, self.timesteps*self.data_dim])- 186 | tf.reshape(autoenc_real,[self.batchsize, self.timesteps*self.data_dim]), 187 | axis=1)) 188 | 189 | disc_cost_ = tf.reduce_mean(disc_fake) - tf.reduce_mean(disc_real) 190 | gen_cost_ = -tf.reduce_mean(disc_fake) 191 | gen_cost = gen_cost_ + self.Config.MU*ae_cost 192 | 193 | # gradient loss 194 | alpha = tf.random_uniform( 195 | shape=[self.batchsize, 1, 1], 196 | minval=0., 197 | maxval=1. 
198 | ) 199 | interpolates = alpha * real_data + ((1 - alpha) * fake_data) 200 | disc_interpolates = self.discriminator(interpolates, reuse=True) 201 | gradients = tf.gradients(disc_interpolates, [interpolates])[0] 202 | slopes = tf.sqrt(tf.reduce_sum(tf.square(gradients), reduction_indices=[1]) + self.Config.EP) 203 | gradient_penalty = tf.reduce_mean((slopes - 1) ** 2) 204 | disc_cost = disc_cost_ + self.Config.LAMBDA * gradient_penalty 205 | 206 | ################################################ SET UP MODEL TRAINING ######################################### 207 | print('set up model training') 208 | t_vars = tf.trainable_variables() 209 | g_vars = [var for var in t_vars if 'generator' in var.name] 210 | d_vars = [var for var in t_vars if 'discriminator' in var.name] 211 | a_vars = [var for var in t_vars if 'autoencoder' in var.name] 212 | 213 | disc_train_op = tf.train.AdamOptimizer( 214 | learning_rate=self.Config.DISC_LEARNING_RATE, 215 | beta1=0.5, 216 | beta2=0.9).minimize(disc_cost, var_list=d_vars) 217 | 218 | gen_train_op = tf.train.AdamOptimizer( 219 | learning_rate=self.Config.LEARNING_RATE, 220 | beta1=0.5, 221 | beta2=0.9).minimize(gen_cost, var_list=g_vars) 222 | 223 | autoenc_train_op = tf.train.AdamOptimizer( 224 | learning_rate=self.Config.LEARNING_RATE, 225 | beta1=0.5, 226 | beta2=0.9).minimize(ae_cost, var_list=a_vars) 227 | 228 | ################################################## TRAINING LOOP ############################################### 229 | print('training loop') 230 | saver = tf.train.Saver(max_to_keep=None) 231 | 232 | init_op = tf.global_variables_initializer() 233 | 234 | session = tf.Session() 235 | K.set_session(session) 236 | session.run(init_op) 237 | 238 | # set initial noise 239 | _noise = 1 240 | 241 | with session.as_default(): 242 | 243 | for epoch in range(self.Config.GAN_AE_EPOCHS): 244 | print('epoch:', epoch) 245 | np.random.shuffle(x_train) 246 | minibatch_size = self.batchsize * 
(self.Config.DISC_CRITIC_ITERS+self.Config.GEN_CRITIC_ITERS+1) 247 | 248 | if (epoch + 1) % 100 == 0: 249 | 250 | if self.data == 'Sentiment': 251 | _noise += -0.2 252 | 253 | if _noise < 0: 254 | _noise = 0. 255 | 256 | elif self.data == 'Power': 257 | _noise += -0.2 258 | 259 | if _noise < 0: 260 | _noise = 0. 261 | 262 | print(_noise) 263 | 264 | for i in range(int(len(x_train) // (self.batchsize * (self.Config.DISC_CRITIC_ITERS + 265 | self.Config.GEN_CRITIC_ITERS + 1)))): 266 | data_minibatch = x_train[i * minibatch_size: (i + 1) * minibatch_size] 267 | print('minibatch:', i) 268 | for j in range(self.Config.GEN_CRITIC_ITERS): 269 | _data = data_minibatch[j * self.batchsize: (j + 1) * self.batchsize] 270 | _gen_cost, _gen_cost_diff, _ = session.run([gen_cost, gen_cost_, gen_train_op], 271 | feed_dict={data_in: _data, noise: _noise}) 272 | g_loss += [_gen_cost] 273 | g_loss_diff += [_gen_cost_diff] 274 | for j in range(self.Config.DISC_CRITIC_ITERS): 275 | _data = data_minibatch[(self.Config.GEN_CRITIC_ITERS+j)*self.batchsize: 276 | (self.Config.GEN_CRITIC_ITERS + j+1) * self.batchsize] 277 | _disc_cost, _disc_cost_diff, _ = session.run([disc_cost, disc_cost_, disc_train_op], 278 | feed_dict={data_in: _data, noise: _noise}) 279 | d_loss += [_disc_cost] 280 | d_loss_diff += [_disc_cost_diff] 281 | 282 | _data = data_minibatch[(self.Config.DISC_CRITIC_ITERS+self.Config.GEN_CRITIC_ITERS) * 283 | self.batchsize: (self.Config.GEN_CRITIC_ITERS + 284 | self.Config.DISC_CRITIC_ITERS+1) * self.batchsize] 285 | _ae_cost, _ = session.run([ae_cost, autoenc_train_op], feed_dict={data_in: _data, noise: _noise}) 286 | a_loss += [_ae_cost] 287 | 288 | if epoch >= 0 and epoch % self.Config.CHECKPOINT_STEP == 0: 289 | saver.save(session, save_dir + 'model', global_step=epoch) 290 | np.save(save_dir + 'd_loss.npy', d_loss) 291 | np.save(save_dir + 'g_loss.npy', g_loss) 292 | np.save(save_dir + 'g_diff_loss.npy', g_loss_diff) 293 | np.save(save_dir + 'd_diff_loss.npy', 
d_loss_diff) 294 | np.save(save_dir + 'a_loss.npy', a_loss) 295 | 296 | saver.save(session, save_dir + 'model', global_step=self.Config.GAN_AE_EPOCHS-1) 297 | np.save(save_dir + 'd_loss.npy', d_loss) 298 | np.save(save_dir + 'g_loss.npy', g_loss) 299 | np.save(save_dir + 'g_diff_loss.npy', g_loss_diff) 300 | np.save(save_dir + 'd_diff_loss.npy', d_loss_diff) 301 | np.save(save_dir + 'a_loss.npy', a_loss) 302 | 303 | return 304 | 305 | # build a model to generate ensembles of data 306 | def iwganGenEnsemFolder(self, data_dir, save_dir, checkpoint, flag): 307 | 308 | if not os.path.exists(save_dir): 309 | os.makedirs(save_dir) 310 | 311 | session = tf.Session() 312 | K.set_session(session) 313 | 314 | print('calculate loss function') 315 | # get model outputs 316 | data_in = tf.placeholder(tf.float32, shape=[None, self.timesteps, self.data_dim]) 317 | noise = 0.0 318 | 319 | fake_data, fake_states = self.generator(True, data_in, noise, reuse=False) 320 | real_data, real_states = self.generator(False, data_in, noise, reuse=True) 321 | 322 | if flag == 'FAKE': 323 | autoenc_fake = self.autoencoder_seq2seq_pred(fake_data, fake_states, data_in, reuse=False) 324 | elif flag == 'REAL': 325 | autoenc_fake = self.autoencoder_seq2seq_pred(real_data, real_states, data_in, reuse=False) 326 | ###################################### LOAD SAVED VARIABLES AND GENERATE DATA ################################## 327 | 328 | print('generate data') 329 | 330 | init_op = tf.global_variables_initializer() 331 | session.run(init_op) 332 | 333 | saver = tf.train.Saver() 334 | 335 | with session.as_default(): 336 | 337 | saver.restore(session, checkpoint) 338 | 339 | print('weights restored') 340 | 341 | for ensem in range(self.Config.AE_NUM_ENSEMBLES): 342 | 343 | dat = np.load(data_dir+'ensem_dat'+str(ensem)+'.npy') 344 | lab = np.load(data_dir+'ensem_lab'+str(ensem)+'.npy') 345 | 346 | if len(lab.shape) == 3: 347 | idx = np.where(lab[:, -1, 1] == 1)[0] 348 | else: 349 | idx = np.where(lab[:, 
1] == 1)[0] 350 | x_train = dat[idx] 351 | print('ensemble:', ensem) 352 | syn_dat = [] 353 | 354 | for gen_epochs in range(self.Config.NUM_SYN_ITER): 355 | print('iteration through data:', gen_epochs) 356 | 357 | np.random.shuffle(x_train) 358 | 359 | for i in range(int(len(x_train)) // self.batchsize): 360 | data_ = x_train[i * self.batchsize: (i + 1) * self.batchsize] 361 | syn_dat += [autoenc_fake.eval(feed_dict={data_in: data_})] 362 | 363 | syn_dat = np.array(syn_dat) 364 | syn_dat.resize(syn_dat.shape[0] * syn_dat.shape[1], syn_dat.shape[2], syn_dat.shape[3]) 365 | np.save(save_dir + 'synthetic_data_' + str(ensem) + '.npy', syn_dat) 366 | return 367 | 368 | # do iwgan accuracy 369 | def iwganAcc(self, save_dir, x_train, checkpoint): 370 | 371 | if not os.path.exists(save_dir): 372 | os.makedirs(save_dir) 373 | 374 | session = tf.Session() 375 | K.set_session(session) 376 | 377 | # get sizes of batches 378 | num_batch = int(len(x_train)//self.batchsize) 379 | test_size = int(num_batch/2) 380 | 381 | # shuffle dataset 382 | np.random.shuffle(x_train) 383 | 384 | print('calculate loss function') 385 | # get model outputs 386 | data_in = tf.placeholder(tf.float32, shape=[None, self.timesteps, self.data_dim]) 387 | noise = 0.0 388 | 389 | fake_data, fake_states = self.generator(True, data_in, noise, reuse=False) 390 | real_data, real_states = self.generator(False, data_in, noise, reuse=True) 391 | disc_fake = self.discriminator(fake_data, reuse=False) 392 | disc_real = self.discriminator(real_data, reuse=True) 393 | 394 | ###################################### LOAD SAVED VARIABLES AND GENERATE DATA ###################################### 395 | 396 | print('generate data') 397 | 398 | init_op = tf.global_variables_initializer() 399 | session.run(init_op) 400 | 401 | saver = tf.train.Saver() 402 | 403 | with session.as_default(): 404 | saver.restore(session, checkpoint) 405 | 406 | real_disc_lab = [] 407 | fake_disc_lab = [] 408 | for i in range(test_size): 409 | 
data_ = x_train[i * self.batchsize: (i + 1) * self.batchsize] 410 | real_disc_lab += [disc_real.eval(feed_dict={data_in: data_})] 411 | fake_disc_lab += [disc_fake.eval(feed_dict={data_in: data_})] 412 | 413 | real_disc_lab = np.array(real_disc_lab) 414 | fake_disc_lab = np.array(fake_disc_lab) 415 | real_disc_lab.resize(real_disc_lab.shape[0] * real_disc_lab.shape[1], real_disc_lab.shape[2]) 416 | fake_disc_lab.resize(fake_disc_lab.shape[0] * fake_disc_lab.shape[1], fake_disc_lab.shape[2]) 417 | 418 | print('try tau values') 419 | # now try tau 420 | tau_ = np.array([x / 10 for x in range(-30, 10)]) # np.array([x/20 for x in range(1,11)]) 421 | tau_f1 = [] 422 | real_lab = np.zeros(real_disc_lab.shape) 423 | fake_lab = np.ones(fake_disc_lab.shape) 424 | 425 | lab = np.concatenate((real_lab, fake_lab)) 426 | 427 | for tau in tau_: 428 | val_real_lab = np.where(real_disc_lab > tau, 1, 0) 429 | val_fake_lab = np.where(fake_disc_lab > tau, 1, 0) 430 | val_lab = np.concatenate((val_real_lab, val_fake_lab)) 431 | tau_f1 += [f1_score(lab, val_lab)] 432 | tau = tau_[np.argmax(tau_f1)] 433 | val_f1 = max(tau_f1) 434 | 435 | # test on data now 436 | real_disc_lab = [] 437 | fake_disc_lab = [] 438 | for i in range(test_size, 2 * test_size): 439 | data_ = x_train[i * self.batchsize: (i + 1) * self.batchsize] 440 | real_disc_lab += [disc_real.eval(feed_dict={data_in: data_})] 441 | fake_disc_lab += [disc_fake.eval(feed_dict={data_in: data_})] 442 | 443 | real_disc_lab = np.array(real_disc_lab) 444 | fake_disc_lab = np.array(fake_disc_lab) 445 | real_disc_lab.resize(real_disc_lab.shape[0] * real_disc_lab.shape[1], real_disc_lab.shape[2]) 446 | fake_disc_lab.resize(fake_disc_lab.shape[0] * fake_disc_lab.shape[1], fake_disc_lab.shape[2]) 447 | 448 | print('test on test data') 449 | 450 | real_lab = np.zeros(real_disc_lab.shape) 451 | fake_lab = np.ones(fake_disc_lab.shape) 452 | lab = np.concatenate((real_lab, fake_lab)) 453 | 454 | test_real_lab = np.where(real_disc_lab > tau, 
1, 0) 455 | test_fake_lab = np.where(fake_disc_lab > tau, 1, 0) 456 | test_lab = np.concatenate((test_real_lab, test_fake_lab)) 457 | 458 | print('calculate test f1') 459 | test_f1 = [f1_score(lab, test_lab)] 460 | 461 | np.save(save_dir + 'test_f1.npy', test_f1) 462 | np.save(save_dir + 'val_f1.npy', val_f1) 463 | np.save(save_dir + 'tau.npy', tau) 464 | 465 | return 466 | 467 | # use this to get model accuracy based on the discriminator model 468 | def iwganDisc(self, save_dir, x_val, y_val, x_test, y_test, checkpoint): 469 | 470 | session = tf.Session() 471 | K.set_session(session) 472 | 473 | if not os.path.exists(save_dir): 474 | os.makedirs(save_dir) 475 | 476 | if len(y_val.shape) == 3: 477 | y_val = y_val[:, -1, :] 478 | y_val = y_val.argmax(axis=1) 479 | 480 | if len(y_test.shape) == 3: 481 | y_test = y_test[:, -1, :] 482 | y_test = y_test.argmax(axis=1) 483 | 484 | print('calculate loss function') 485 | # get model outputs 486 | data_in = tf.placeholder(tf.float32, shape=[None, self.timesteps, self.data_dim]) 487 | noise = 0.0 488 | 489 | real_data, real_states = self.generator(False, data_in, noise, reuse=True) 490 | disc_real = self.discriminator(real_data, reuse=True) 491 | 492 | ###################################### LOAD SAVED VARIABLES AND GENERATE DATA ################################## 493 | print('generate data') 494 | 495 | init_op = tf.global_variables_initializer() 496 | session.run(init_op) 497 | 498 | saver = tf.train.Saver() 499 | 500 | with session.as_default(): 501 | saver.restore(session, checkpoint) 502 | 503 | idx = np.array(range(len(x_val))) 504 | idx_ = np.array(range(len(x_test))) 505 | 506 | np.random.shuffle(idx) 507 | np.random.shuffle(idx_) 508 | 509 | x_val = x_val[idx] 510 | y_val = y_val[idx] 511 | 512 | x_test = x_test[idx_] 513 | y_test = y_test[idx_] 514 | 515 | disc_lab = [] 516 | for i in range(int(len(y_val) // self.batchsize)): 517 | data_ = x_val[i * self.batchsize: (i + 1) * self.batchsize] 518 | disc_lab += 
[disc_real.eval(feed_dict={data_in: data_})] 519 | 520 | disc_lab = np.array(disc_lab) 521 | disc_lab.resize(disc_lab.shape[0] * disc_lab.shape[1], disc_lab.shape[2]) 522 | 523 | print(disc_lab.shape) 524 | print(y_val[0:int(len(y_val) // self.batchsize) * self.batchsize].shape) 525 | 526 | print('try tau values') 527 | # now try tau 528 | tau_ = np.array([x / 10 for x in range(-30, 10)]) # np.array([x/20 for x in range(1,11)]) 529 | tau_f1 = [] 530 | for tau in tau_: 531 | val_test_lab = np.where(disc_lab > tau, 1, 0) 532 | tau_f1 += [f1_score(y_val[0:int(len(y_val) // self.batchsize) * self.batchsize], val_test_lab)] 533 | 534 | tau = tau_[np.argmax(tau_f1)] 535 | val_f1 = max(tau_f1) 536 | 537 | disc_lab = [] 538 | print('predict on test data') 539 | # evaluate the model on the validation data 540 | for i in range(int(len(x_test) // self.batchsize)): 541 | data_ = x_test[i * self.batchsize: (i + 1) * self.batchsize] 542 | disc_lab += [disc_real.eval(feed_dict={data_in: data_})] 543 | 544 | disc_lab = np.array(disc_lab) 545 | disc_lab.resize(disc_lab.shape[0] * disc_lab.shape[1], disc_lab.shape[2]) 546 | 547 | test_pred_lab = np.where(disc_lab > tau, 1, 0) 548 | print(test_pred_lab.shape) 549 | print(y_test[0:int(len(y_test) // self.batchsize) * self.batchsize].shape) 550 | 551 | print('calculate test f1') 552 | test_f1 = f1_score(y_test[0:int(len(x_test) // self.batchsize) * self.batchsize], test_pred_lab) 553 | 554 | np.save(save_dir + 'test_f1.npy', test_f1) 555 | np.save(save_dir + 'val_f1.npy', val_f1) 556 | np.save(save_dir + 'tau.npy', tau) 557 | return 558 | 559 | # use this to get novelty detection 560 | def iwganNovelty(self, save_dir, x_val, y_val, x_test, y_test, checkpoint): 561 | session = tf.Session() 562 | K.set_session(session) 563 | 564 | if not os.path.exists(save_dir): 565 | os.makedirs(save_dir) 566 | 567 | if len(y_val.shape) == 3: 568 | y_val = y_val[:, -1, :] 569 | y_val = y_val.argmax(axis=1) 570 | 571 | if len(y_test.shape) == 3: 572 | 
y_test = y_test[:, -1, :] 573 | y_test = y_test.argmax(axis=1) 574 | 575 | # get model outputs 576 | data_in = tf.placeholder(tf.float32, shape=[None, self.timesteps, self.data_dim]) 577 | noise = 0.0 578 | 579 | real_data, real_states = self.generator(False, data_in, noise, reuse=True) 580 | 581 | autoenc_real = self.autoencoder_seq2seq_pred(real_data, real_states, data_in, reuse=False) 582 | ###################################### LOAD SAVED VARIABLES AND GENERATE DATA ################################## 583 | 584 | print('generate data') 585 | 586 | init_op = tf.global_variables_initializer() 587 | session.run(init_op) 588 | 589 | saver = tf.train.Saver() 590 | syn_dat = [] 591 | 592 | with session.as_default(): 593 | saver.restore(session, checkpoint) 594 | 595 | idx = np.array(range(len(x_val))) 596 | idx_ = np.array(range(len(x_test))) 597 | 598 | np.random.shuffle(idx) 599 | np.random.shuffle(idx_) 600 | 601 | x_test = x_test[idx_] 602 | y_test = y_test[idx_] 603 | 604 | x_val = x_val[idx] 605 | y_val = y_val[idx] 606 | 607 | for i in range(int(len(y_val) // self.batchsize)): 608 | data_ = x_val[i * self.batchsize: (i + 1) * self.batchsize] 609 | syn_dat += [autoenc_real.eval(feed_dict={data_in: data_})] 610 | 611 | syn_dat = np.array(syn_dat) 612 | syn_dat.resize(syn_dat.shape[0] * syn_dat.shape[1], syn_dat.shape[2] * syn_dat.shape[3]) 613 | x_val.resize(x_val.shape[0], x_val.shape[1] * x_val.shape[2]) 614 | # now compare the val_dat and autoencoder loss 615 | 616 | loss = [] 617 | 618 | print('calculate loss') 619 | for i in range(int(len(x_val) // self.batchsize) * self.batchsize): 620 | loss += [mse(x_val[i], syn_dat[i])] 621 | 622 | print('try tau values') 623 | # now try tau 624 | tau_ = np.array([x / 20 for x in range(1, 41)]) # np.array([x/20 for x in range(1,11)]) 625 | tau_f1 = [] 626 | for tau in tau_: 627 | val_test_lab = np.where(loss > tau, 1, 0) 628 | tau_f1 += [f1_score(y_val[0:int(len(y_val) // self.batchsize) * self.batchsize], val_test_lab)] 
629 | 630 | tau = tau_[np.argmax(tau_f1)] 631 | val_f1 = max(tau_f1) 632 | 633 | syn_dat = [] 634 | print('predict on test data') 635 | # evaluate the model on the validation data 636 | for i in range(int(len(x_test) // self.batchsize)): 637 | data_ = x_test[i * self.batchsize: (i + 1) * self.batchsize] 638 | syn_dat += [autoenc_real.eval(feed_dict={data_in: data_})] 639 | 640 | syn_dat = np.array(syn_dat) 641 | syn_dat.resize(syn_dat.shape[0] * syn_dat.shape[1], syn_dat.shape[2] * syn_dat.shape[3]) 642 | 643 | # calculate loss for each of the model 644 | x_test.resize(x_test.shape[0], x_test.shape[1] * x_test.shape[2]) 645 | 646 | print('calculate loss') 647 | loss = [] 648 | for i in range(int(len(x_test) // self.batchsize) * self.batchsize): 649 | loss += [mse(x_test[i], syn_dat[i])] 650 | 651 | test_pred_lab = np.where(loss > tau, 1, 0) 652 | 653 | print('calculate test f1') 654 | test_f1 = f1_score(y_test[0:int(len(y_test) // self.batchsize) * self.batchsize], test_pred_lab) 655 | 656 | np.save(save_dir + 'test_f1.npy', test_f1) 657 | np.save(save_dir + 'val_f1.npy', val_f1) 658 | np.save(save_dir + 'tau.npy', tau) 659 | 660 | return 661 | 662 | # use this function to integrate synthetic data with real data 663 | def integrateSynthetic(self, ensem_dir, syn_dir, save_dir, y_min): 664 | # ensem_dir should only need to append lab or dat plus number 665 | # syn_dir should only need to append number 666 | 667 | if not os.path.exists(save_dir): 668 | os.makedirs(save_dir) 669 | 670 | minority_label = y_min[0] 671 | 672 | for i in range(self.Config.AE_NUM_ENSEMBLES): 673 | 674 | print(i) 675 | e_lab = np.load(ensem_dir + 'lab' + str(i) + '.npy') 676 | e_dat = np.load(ensem_dir + 'dat' + str(i) + '.npy') 677 | s_dat = np.load(syn_dir + str(i) + '.npy') 678 | 679 | # build synthetic label 680 | s_lab = [] 681 | for j in range(len(s_dat)): 682 | s_lab += [minority_label] 683 | s_lab = np.array(s_lab) 684 | 685 | c_dat = np.concatenate((e_dat, s_dat), axis=0) 686 | c_lab 
= np.concatenate((e_lab, s_lab), axis=0) 687 | 688 | shuffle = np.random.choice(len(c_lab), len(c_lab), replace=False) 689 | 690 | np.save(save_dir + 'ensem_lab' + str(i) + '.npy', c_lab[shuffle]) 691 | np.save(save_dir + 'ensem_dat' + str(i) + '.npy', c_dat[shuffle]) 692 | 693 | return 694 | -------------------------------------------------------------------------------- /utils/tf_models_seq2seq.py: -------------------------------------------------------------------------------- 1 | import os 2 | import keras.backend as K 3 | import numpy as np 4 | import tensorflow as tf 5 | import tensorflow.contrib.rnn as rnn 6 | from sklearn.metrics import f1_score 7 | from sklearn.metrics import mean_squared_error as mse 8 | from utils.config import * 9 | from utils.AttentionWithContext import AttentionWithContext 10 | import keras 11 | from utils.AttentionLSTM import AttentionDecoder 12 | 13 | # implementation of the improved Wasserstein GAN (IWGAN) in TensorFlow 14 | 15 | class seq2seqIWGAN: 16 | def __init__(self, data): 17 | 18 | if data == 'Sentiment': 19 | self.Config = SentimentConfig() 20 | elif data == 'Power': 21 | self.Config = PowerConfig() 22 | else: 23 | raise ValueError('Invalid value for data option') 24 | 25 | self.batchsize = self.Config.GAN_BATCH_SIZE 26 | self.timesteps = self.Config.TIMESTEPS 27 | self.decodesteps = self.Config.DECODESTEPS 28 | self.data_dim = self.Config.DATA_DIM 29 | self.data = data 30 | self.num_classes = self.Config.NUM_CLASSES 31 | 32 | # define leakyrelu activation 33 | def leakyrelu(self, x, alpha=0.3, name='lrelu'): 34 | return tf.maximum(x, alpha * x) 35 | 36 | # define generator model 37 | def generator(self, noisy, real_data, real_labels, noise_level, reuse=False): 38 | 39 | with tf.variable_scope('generator', reuse=reuse) as scope: 40 | 41 | # define LSTM with dropout 42 | def lstm_(): 43 | return rnn.DropoutWrapper(rnn.LSTMCell(self.Config.G_HIDDEN_NEURONS), 44 | input_keep_prob=self.Config.DROPOUT_IN, 45 |
output_keep_prob=self.Config.DROPOUT_OUT) 46 | 47 | # define noisy hidden state for fake data 48 | h_state = tf.random_normal([self.batchsize, self.Config.G_HIDDEN_NEURONS]) 49 | c_state = tf.zeros([self.batchsize, self.Config.G_HIDDEN_NEURONS]) 50 | 51 | lstm_state = rnn.LSTMStateTuple(c_state, h_state) 52 | 53 | 54 | # define encoder 55 | # stack layers of LSTM 56 | enc_stacked_lstm = rnn.MultiRNNCell( 57 | [lstm_() for _ in range(self.Config.G_NUM_LAYERS)]) 58 | 59 | # set initial state based on if it's real data or not 60 | if noisy == True: 61 | enc_init_state = [lstm_state for _ in range(self.Config.G_NUM_LAYERS)] 62 | enc_init_state = tuple(enc_init_state) 63 | 64 | else: 65 | enc_init_state = enc_stacked_lstm.zero_state(self.batchsize, tf.float32) 66 | 67 | enc_outputs, enc_states = tf.nn.dynamic_rnn(enc_stacked_lstm, real_data, initial_state=enc_init_state, time_major=False, 68 | scope='generator/encoder') 69 | 70 | enc_outputs = self.leakyrelu(enc_outputs) 71 | 72 | 73 | # define decoder 74 | dec_stacked_lstm = rnn.MultiRNNCell( 75 | [lstm_() for _ in range(self.Config.G_NUM_LAYERS)]) 76 | 77 | enc_states = tf.unstack(enc_states, axis=0) 78 | dec_init_state = tuple([tf.nn.rnn_cell.LSTMStateTuple(enc_states[idx][0], enc_states[idx][1]) 79 | for idx in range(self.Config.G_NUM_LAYERS)]) 80 | 81 | dec_outputs, dec_states = tf.nn.dynamic_rnn(dec_stacked_lstm, real_labels, initial_state=dec_init_state, 82 | time_major=False, 83 | scope='generator/decoder') 84 | 85 | 86 | if self.Config.INSTANCE_NOISE: 87 | enc_output_noise = tf.random_normal([self.batchsize, self.timesteps, self.Config.G_HIDDEN_NEURONS], 88 | stddev=noise_level) 89 | dec_output_noise = tf.random_normal([self.batchsize, self.decodesteps, self.Config.G_HIDDEN_NEURONS], 90 | stddev=noise_level) 91 | enc_outputs += enc_output_noise 92 | dec_outputs += dec_output_noise 93 | 94 | return enc_outputs, enc_states, dec_outputs, dec_states 95 | 96 | # define discriminator model 97 | def 
discriminator(self, enc_inputs, dec_inputs, reuse=False): 98 | 99 | with tf.variable_scope('discriminator', reuse=reuse) as scope: 100 | # build RNN here 101 | lstm_ = rnn.LSTMCell(self.Config.D_HIDDEN_NEURONS) 102 | lstm_ = rnn.DropoutWrapper(lstm_, input_keep_prob=self.Config.DROPOUT_IN, 103 | output_keep_prob=self.Config.DROPOUT_OUT) 104 | 105 | stacked_lstm = rnn.MultiRNNCell( 106 | [lstm_ for _ in range(self.Config.D_NUM_LAYERS)]) 107 | 108 | # initialize initial state to 0 109 | init_state = stacked_lstm.zero_state(self.batchsize, tf.float32) 110 | 111 | # bidirectional RNN 112 | enc_outputs, enc_fw_state, enc_bw_state = rnn.static_bidirectional_rnn(stacked_lstm, stacked_lstm, 113 | tf.unstack(tf.transpose(enc_inputs, perm=[1, 0, 2])), 114 | initial_state_fw=init_state, initial_state_bw=init_state, 115 | scope='discriminator/encoder/') 116 | 117 | dec_outputs, _, _ = rnn.static_bidirectional_rnn(stacked_lstm, stacked_lstm, 118 | tf.unstack(tf.transpose(dec_inputs,perm=[1, 0, 2])), 119 | initial_state_fw=enc_fw_state, 120 | initial_state_bw=enc_bw_state, 121 | scope='discriminator/decoder/') 122 | 123 | outputs = tf.contrib.layers.fully_connected(dec_outputs, self.Config.AE_DENSE_NEURONS, reuse=reuse, 124 | scope=scope, activation_fn=None) 125 | 126 | # specify activation function here 127 | outputs = self.leakyrelu(outputs) 128 | # dropout 129 | outputs = tf.nn.dropout(outputs, keep_prob=self.Config.DROPOUT_OUT) 130 | outputs = tf.transpose(outputs, perm=[1, 0, 2]) 131 | 132 | # use only the last hidden state as input to dense layer 133 | outputs = tf.slice(outputs, [0, self.Config.DECODESTEPS - 1, 0], [self.batchsize, 1, 134 | self.Config.AE_DENSE_NEURONS]) 135 | # use last hidden state as input to a dense layer 136 | outputs = tf.layers.dense(outputs, int(self.Config.AE_DENSE_NEURONS / 2), name='discriminator/pre_output') 137 | # specify activation function here 138 | outputs = self.leakyrelu(outputs) 139 | # dropout 140 | outputs = tf.nn.dropout(outputs, 
keep_prob=self.Config.DROPOUT_OUT) 141 | # final output of discriminator 142 | outputs = tf.layers.dense(outputs, 1, name='discriminator/output') 143 | 144 | return outputs 145 | 146 | # build autoencoder seq2seq model with predictions 147 | def autoencoder_seq2seq_pred(self, enc_inputs, enc_states, dec_inputs, dec_states, data_in, label_in, reuse=False): 148 | 149 | with tf.variable_scope('autoencoder', reuse=reuse) as scope: 150 | # get attention from hidden states of generator 151 | atten = AttentionWithContext()(enc_inputs) 152 | 153 | enc_states = tf.unstack(enc_states, axis=0) 154 | dec_states = tf.unstack(dec_states, axis=0) 155 | # set initial cell state to the cell state from the generator 156 | # set initial hidden state to the attention over the generator's hidden states 157 | enc_tuple_state = tuple( 158 | [tf.nn.rnn_cell.LSTMStateTuple(enc_states[idx][0], atten) for idx in range(self.Config.A_NUM_LAYERS)]) 159 | dec_tuple_state = tuple( 160 | [tf.nn.rnn_cell.LSTMStateTuple(dec_states[idx][0], dec_states[idx][1]) 161 | for idx in range(self.Config.A_NUM_LAYERS)]) 162 | 163 | def lstm_(): 164 | return rnn.DropoutWrapper(rnn.LSTMCell(self.Config.A_HIDDEN_NEURONS), 165 | input_keep_prob=self.Config.DROPOUT_IN, 166 | output_keep_prob=self.Config.DROPOUT_OUT) 167 | 168 | stacked_lstm = rnn.MultiRNNCell( 169 | [lstm_() for _ in range(self.Config.A_NUM_LAYERS)]) 170 | 171 | enc_in = tf.zeros([self.batchsize, 1, self.Config.DATA_DIM]) 172 | dec_in = tf.zeros([self.batchsize, 1, self.Config.NUM_CLASSES]) 173 | 174 | all_enc_outputs = [] 175 | all_dec_outputs = [] 176 | 177 | # feed in the ground-truth previous time step as the current input (teacher forcing) 178 | for j in range(self.Config.TIMESTEPS): 179 | outputs, states = tf.nn.dynamic_rnn(stacked_lstm, enc_in, initial_state=enc_tuple_state, 180 | time_major=False, 181 | scope=scope) 182 | outputs = self.leakyrelu(outputs) 183 | outputs = tf.layers.dense(outputs, self.Config.DATA_DIM, name='autoencoder/enc_output') 184 | # outputs =
tf.sigmoid(outputs, name='autoencoder/sigmoid') 185 | 186 | all_enc_outputs += [outputs] 187 | # ae_in = outputs 188 | enc_in = tf.expand_dims(data_in[:, j, :], axis=1) 189 | 190 | scope.reuse_variables() 191 | 192 | enc_outputs = tf.concat(all_enc_outputs, 1) 193 | 194 | decoder_lstm = keras.layers.LSTM(self.Config.A_HIDDEN_NEURONS, return_sequences=True, 195 | return_state=True) # , dropout=0.2, recurrent_dropout=0.2) 196 | decoder_dense = keras.layers.Dense(self.Config.NUM_CLASSES, activation='softmax') 197 | dec_states = [dec_states[0][0], dec_states[0][1]] 198 | 199 | for j in range(self.Config.DECODESTEPS): 200 | 201 | outputs, state_h, state_c = decoder_lstm(dec_in, initial_state=dec_states) 202 | outputs = decoder_dense(outputs) 203 | 204 | all_dec_outputs += [outputs] 205 | dec_in = tf.expand_dims(label_in[:, j, :], axis=1) 206 | dec_states = [state_h, state_c] 207 | 208 | scope.reuse_variables() 209 | 210 | dec_outputs = tf.concat(all_dec_outputs, 1) 211 | 212 | return enc_outputs, dec_outputs 213 | 214 | def iwganAeTrain(self, x_train, y_train, save_dir): 215 | 216 | d_loss = [] 217 | d_loss_diff = [] 218 | g_loss = [] 219 | a_loss = [] 220 | g_loss_diff = [] 221 | 222 | if not os.path.exists(save_dir): 223 | os.makedirs(save_dir) 224 | 225 | data_in = tf.placeholder(tf.float32, shape=[None, self.timesteps, self.data_dim]) 226 | label_in = tf.placeholder(tf.float32, shape=[None, self.decodesteps, self.num_classes]) 227 | noise = tf.placeholder(tf.float32, shape=()) 228 | 229 | fake_enc_data, fake_enc_states, fake_dec_data, fake_dec_states = self.generator(True, data_in, 230 | label_in, noise, reuse=False) 231 | real_enc_data, real_enc_states, real_dec_data, real_dec_states = self.generator(False, data_in, 232 | label_in, noise, reuse=True) 233 | disc_fake = self.discriminator(fake_enc_data, fake_dec_data, reuse=False) 234 | disc_real = self.discriminator(real_enc_data, real_dec_data, reuse=True) 235 | 236 | autoenc_data, autoenc_label = 
self.autoencoder_seq2seq_pred(real_enc_data, real_enc_states, real_dec_data, 237 | real_dec_states, data_in, label_in, reuse=False) 238 | 239 | # calculate loss 240 | ae_cost = 0 241 | # data AE loss 242 | ae_cost += tf.reduce_mean(tf.norm(tf.reshape(data_in, [self.batchsize, self.timesteps * self.data_dim]) - 243 | tf.reshape(autoenc_data, [self.batchsize, self.timesteps * self.data_dim]), 244 | axis=1)) 245 | # label AE loss 246 | ae_cost += tf.reduce_mean(tf.norm(tf.reshape(label_in, [self.batchsize, self.decodesteps * self.num_classes]) - 247 | tf.reshape(autoenc_label, [self.batchsize, self.decodesteps * 248 | self.num_classes]), 249 | axis=1)) 250 | 251 | disc_cost_ = tf.reduce_mean(disc_fake) - tf.reduce_mean(disc_real) 252 | gen_cost_ = -tf.reduce_mean(disc_fake) 253 | gen_cost = gen_cost_ + self.Config.MU * ae_cost 254 | 255 | # gradient loss 256 | alpha = tf.random_uniform( 257 | shape=[self.batchsize, 1, 1], 258 | minval=0., 259 | maxval=1. 260 | ) 261 | dec_interpolates = alpha * real_dec_data + ((1 - alpha) * fake_dec_data) 262 | enc_interpolates = alpha * real_enc_data + ((1 - alpha) * fake_enc_data) 263 | disc_interpolates = self.discriminator(enc_interpolates, dec_interpolates, reuse=True) 264 | gradients = tf.gradients(disc_interpolates, [enc_interpolates, dec_interpolates])[0] 265 | slopes = tf.sqrt(tf.reduce_sum(tf.square(gradients), reduction_indices=[1]) + self.Config.EP) 266 | gradient_penalty = tf.reduce_mean((slopes - 1) ** 2) 267 | disc_cost = disc_cost_ + self.Config.LAMBDA * gradient_penalty 268 | 269 | ################################################ SET UP MODEL TRAINING ######################################### 270 | print('set up model training') 271 | t_vars = tf.trainable_variables() 272 | g_vars = [var for var in t_vars if 'generator' in var.name] 273 | d_vars = [var for var in t_vars if 'discriminator' in var.name] 274 | a_vars = [var for var in t_vars if 'autoencoder' in var.name] 275 | 276 | disc_train_op = 
tf.train.AdamOptimizer( 277 | learning_rate=self.Config.DISC_LEARNING_RATE, 278 | beta1=0.5, 279 | beta2=0.9).minimize(disc_cost, var_list=d_vars) 280 | 281 | gen_train_op = tf.train.AdamOptimizer( 282 | learning_rate=self.Config.LEARNING_RATE, 283 | beta1=0.5, 284 | beta2=0.9).minimize(gen_cost, var_list=g_vars) 285 | 286 | autoenc_train_op = tf.train.AdamOptimizer( 287 | learning_rate=self.Config.LEARNING_RATE, 288 | beta1=0.5, 289 | beta2=0.9).minimize(ae_cost, var_list=a_vars) 290 | 291 | ################################################## TRAINING LOOP ############################################### 292 | print('training loop') 293 | saver = tf.train.Saver(max_to_keep=None) 294 | 295 | init_op = tf.global_variables_initializer() 296 | 297 | session = tf.Session() 298 | K.set_session(session) 299 | session.run(init_op) 300 | 301 | # set initial noise 302 | _noise = 1 303 | 304 | with session.as_default(): 305 | 306 | for epoch in range(self.Config.GAN_AE_EPOCHS): 307 | print('epoch:', epoch) 308 | perm = np.random.permutation(len(x_train))  # one permutation for data and labels keeps each pair aligned 309 | x_train, y_train = x_train[perm], y_train[perm] 310 | minibatch_size = self.batchsize * (self.Config.DISC_CRITIC_ITERS + self.Config.GEN_CRITIC_ITERS + 1) 311 | if (epoch + 1) % 100 == 0: 312 | if self.data == 'Sentiment': 313 | _noise += -0.2 314 | 315 | if _noise < 0: 316 | _noise = 0. 317 | 318 | elif self.data == 'Power': 319 | _noise += -0.2 320 | 321 | if _noise < 0: 322 | _noise = 0.
323 | 324 | print(_noise) 325 | 326 | for i in range(int(len(x_train) // (self.batchsize * (self.Config.DISC_CRITIC_ITERS + 327 | self.Config.GEN_CRITIC_ITERS + 1)))): 328 | data_minibatch = x_train[i * minibatch_size: (i + 1) * minibatch_size] 329 | label_minibatch = y_train[i * minibatch_size: (i + 1) * minibatch_size] 330 | print('minibatch:', i) 331 | for j in range(self.Config.GEN_CRITIC_ITERS): 332 | _data = data_minibatch[j * self.batchsize: (j + 1) * self.batchsize] 333 | _label = label_minibatch[j * self.batchsize: (j + 1) * self.batchsize] 334 | _gen_cost, _gen_cost_diff, _ = session.run([gen_cost, gen_cost_, gen_train_op], 335 | feed_dict={data_in: _data, label_in: _label, 336 | noise: _noise}) 337 | g_loss += [_gen_cost] 338 | g_loss_diff += [_gen_cost_diff] 339 | for j in range(self.Config.DISC_CRITIC_ITERS): 340 | _data = data_minibatch[(self.Config.GEN_CRITIC_ITERS + j) * self.batchsize: 341 | (self.Config.GEN_CRITIC_ITERS + j + 1) * self.batchsize] 342 | _label = label_minibatch[(self.Config.GEN_CRITIC_ITERS + j) * self.batchsize: 343 | (self.Config.GEN_CRITIC_ITERS + j + 1) * self.batchsize] 344 | _disc_cost, _disc_cost_diff, _ = session.run([disc_cost, disc_cost_, disc_train_op], 345 | feed_dict={data_in: _data, label_in: _label, 346 | noise: _noise}) 347 | d_loss += [_disc_cost] 348 | d_loss_diff += [_disc_cost_diff] 349 | 350 | _data = data_minibatch[(self.Config.DISC_CRITIC_ITERS + self.Config.GEN_CRITIC_ITERS) * 351 | self.batchsize: (self.Config.GEN_CRITIC_ITERS + 352 | self.Config.DISC_CRITIC_ITERS + 1) * self.batchsize] 353 | _label = label_minibatch[(self.Config.DISC_CRITIC_ITERS + self.Config.GEN_CRITIC_ITERS) * 354 | self.batchsize: (self.Config.GEN_CRITIC_ITERS + 355 | self.Config.DISC_CRITIC_ITERS + 1) * self.batchsize] 356 | _ae_cost, _ = session.run([ae_cost, autoenc_train_op], feed_dict={data_in: _data, label_in: _label, 357 | noise: _noise}) 358 | a_loss += [_ae_cost] 359 | 360 | if epoch >= 0 and epoch % 
self.Config.CHECKPOINT_STEP == 0: 361 | saver.save(session, save_dir + 'model', global_step=epoch) 362 | np.save(save_dir + 'd_loss.npy', d_loss) 363 | np.save(save_dir + 'g_loss.npy', g_loss) 364 | np.save(save_dir + 'g_diff_loss.npy', g_loss_diff) 365 | np.save(save_dir + 'd_diff_loss.npy', d_loss_diff) 366 | np.save(save_dir + 'a_loss.npy', a_loss) 367 | 368 | saver.save(session, save_dir + 'model', global_step=self.Config.GAN_AE_EPOCHS - 1) 369 | np.save(save_dir + 'd_loss.npy', d_loss) 370 | np.save(save_dir + 'g_loss.npy', g_loss) 371 | np.save(save_dir + 'g_diff_loss.npy', g_loss_diff) 372 | np.save(save_dir + 'd_diff_loss.npy', d_loss_diff) 373 | np.save(save_dir + 'a_loss.npy', a_loss) 374 | 375 | return 376 | 377 | 378 | # build a model to generate ensembles of data 379 | def iwganGenEnsemFolder(self, data_dir, save_dir, checkpoint, flag): 380 | 381 | if not os.path.exists(save_dir): 382 | os.makedirs(save_dir) 383 | 384 | session = tf.Session() 385 | K.set_session(session) 386 | 387 | print('calculate loss function') 388 | # get model outputs 389 | data_in = tf.placeholder(tf.float32, shape=[None, self.timesteps, self.data_dim]) 390 | label_in = tf.placeholder(tf.float32, shape=[None, self.decodesteps, self.num_classes]) 391 | noise = 0.0 392 | 393 | fake_enc_data, fake_enc_states, fake_dec_data, fake_dec_states = self.generator(True, data_in, 394 | label_in, noise, reuse=False) 395 | real_enc_data, real_enc_states, real_dec_data, real_dec_states = self.generator(False, data_in, 396 | label_in, noise, reuse=True) 397 | 398 | if flag == 'FAKE': 399 | autoenc_data, autoenc_label = self.autoencoder_seq2seq_pred(fake_enc_data, fake_enc_states, fake_enc_data, 400 | fake_enc_states, data_in, label_in, reuse=False) 401 | if flag == 'REAL': 402 | autoenc_data, autoenc_label = self.autoencoder_seq2seq_pred(real_enc_data, real_enc_states, real_enc_data, 403 | real_enc_states, data_in, label_in, reuse=False) 404 | 405 | ###################################### LOAD 
SAVED VARIABLES AND GENERATE DATA ################################## 406 | # create label dictionary 407 | dict_idx = np.load(data_dir + 'idx.npy') 408 | dict_ = dict((tuple(x.tolist()), i) for (i, x) in enumerate(dict_idx)) 409 | 410 | print('generate data') 411 | 412 | init_op = tf.global_variables_initializer() 413 | session.run(init_op) 414 | 415 | saver = tf.train.Saver() 416 | 417 | with session.as_default(): 418 | 419 | saver.restore(session, checkpoint) 420 | 421 | print('weights restored') 422 | 423 | for ensem in range(self.Config.AE_NUM_ENSEMBLES): 424 | 425 | dat = np.load(data_dir + 'ensem_dat' + str(ensem) + '.npy') 426 | lab = np.load(data_dir + 'ensem_lab' + str(ensem) + '.npy') 427 | 428 | lab_ = np.reshape(lab, (lab.shape[0], lab.shape[1]*lab.shape[2])) 429 | lab_ = np.array([dict_[tuple(x.tolist())] for x in lab_])  # array, so the comparison below is elementwise 430 | idx = np.where(lab_ != 0)[0] 431 | 432 | x_train = dat[idx] 433 | print(x_train.shape) 434 | y_train = lab[idx] 435 | print('ensemble:', ensem) 436 | syn_dat = [] 437 | syn_lab = [] 438 | 439 | for gen_epochs in range(self.Config.NUM_SYN_ITER): 440 | print('iteration through data:', gen_epochs) 441 | perm = np.random.permutation(len(x_train))  # shuffle data and labels together so pairs stay aligned 442 | x_train, y_train = x_train[perm], y_train[perm] 443 | if int(len(x_train)) < self.batchsize: 444 | x_train = np.repeat(x_train, self.batchsize, axis=0) 445 | y_train = np.repeat(y_train, self.batchsize, axis=0) 446 | for i in range(int(len(x_train)) // self.batchsize): 447 | data_ = x_train[i * self.batchsize: (i + 1) * self.batchsize] 448 | label_ = y_train[i * self.batchsize: (i + 1) * self.batchsize] 449 | syn_dat += [autoenc_data.eval(feed_dict={data_in: data_, label_in: label_})] 450 | syn_lab += [autoenc_label.eval(feed_dict={data_in: data_, label_in: label_})] 451 | 452 | syn_dat = np.array(syn_dat) 453 | syn_lab = np.array(syn_lab) 454 | syn_lab = np.round(syn_lab) 455 | 456 | syn_lab = syn_lab[0::self.batchsize] 457 | syn_dat = syn_dat[0::self.batchsize] 458 | else: 459 | for i in range(int(len(x_train)) // self.batchsize): 460 | data_ = x_train[i *
self.batchsize: (i + 1) * self.batchsize] 461 | label_ = y_train[i * self.batchsize: (i + 1) * self.batchsize] 462 | syn_dat += [autoenc_data.eval(feed_dict={data_in: data_, label_in: label_})] 463 | syn_lab += [autoenc_label.eval(feed_dict={data_in: data_, label_in: label_})] 464 | 465 | syn_dat = np.array(syn_dat) 466 | syn_lab = np.array(syn_lab) 467 | syn_lab = np.round(syn_lab) 468 | 469 | 470 | print(syn_lab) 471 | syn_dat.resize(syn_dat.shape[0] * syn_dat.shape[1], syn_dat.shape[2], syn_dat.shape[3]) 472 | syn_lab.resize(syn_lab.shape[0] * syn_lab.shape[1], syn_lab.shape[2], syn_lab.shape[3]) 473 | np.save(save_dir + 'synthetic_data_' + str(ensem) + '.npy', syn_dat) 474 | np.save(save_dir + 'synthetic_label_' + str(ensem) + '.npy', syn_lab) 475 | return 476 | 477 | def iwganGenEnsemListFolder(self, data_dir, ensem_list, save_dir, checkpoint, flag): 478 | 479 | if not os.path.exists(save_dir): 480 | os.makedirs(save_dir) 481 | 482 | session = tf.Session() 483 | K.set_session(session) 484 | 485 | print('calculate loss function') 486 | # get model outputs 487 | data_in = tf.placeholder(tf.float32, shape=[None, self.timesteps, self.data_dim]) 488 | label_in = tf.placeholder(tf.float32, shape=[None, self.decodesteps, self.num_classes]) 489 | noise = 0.0 490 | 491 | fake_enc_data, fake_enc_states, fake_dec_data, fake_dec_states = self.generator(True, data_in, 492 | label_in, noise, reuse=False) 493 | real_enc_data, real_enc_states, real_dec_data, real_dec_states = self.generator(False, data_in, 494 | label_in, noise, reuse=True) 495 | 496 | if flag == 'FAKE': 497 | autoenc_data, autoenc_label = self.autoencoder_seq2seq_pred(fake_enc_data, fake_enc_states, fake_enc_data, 498 | fake_enc_states, data_in, label_in, reuse=False) 499 | if flag == 'REAL': 500 | autoenc_data, autoenc_label = self.autoencoder_seq2seq_pred(real_enc_data, real_enc_states, real_enc_data, 501 | real_enc_states, data_in, label_in, reuse=False) 502 | 503 | ###################################### 
LOAD SAVED VARIABLES AND GENERATE DATA ################################## 504 | # create label dictionary 505 | dict_idx = np.load(data_dir + 'idx.npy') 506 | dict_ = dict((tuple(x.tolist()), i) for (i, x) in enumerate(dict_idx)) 507 | 508 | print('generate data') 509 | 510 | init_op = tf.global_variables_initializer() 511 | session.run(init_op) 512 | 513 | saver = tf.train.Saver() 514 | 515 | with session.as_default(): 516 | 517 | saver.restore(session, checkpoint) 518 | 519 | print('weights restored') 520 | 521 | for ensem in ensem_list: 522 | 523 | dat = np.load(data_dir + 'ensem_dat' + str(ensem) + '.npy') 524 | lab = np.load(data_dir + 'ensem_lab' + str(ensem) + '.npy') 525 | # flatten each label before the lookup (as in iwganGenEnsemFolder); build an array so == is elementwise 526 | lab_ = np.array([dict_[tuple(x.flatten().tolist())] for x in lab]) 527 | idx = np.where(lab_ == 0)[0] 528 | 529 | x_train = dat[idx] 530 | y_train = lab[idx] 531 | print('ensemble:', ensem) 532 | syn_dat = [] 533 | syn_lab = [] 534 | 535 | for gen_epochs in range(self.Config.NUM_SYN_ITER): 536 | print('iteration through data:', gen_epochs) 537 | 538 | np.random.shuffle(x_train) 539 | 540 | for i in range(int(len(x_train)) // self.batchsize): 541 | data_ = x_train[i * self.batchsize: (i + 1) * self.batchsize] 542 | label_ = y_train[i * self.batchsize: (i + 1) * self.batchsize] 543 | syn_dat += [autoenc_data.eval(feed_dict={data_in: data_, label_in: label_})] 544 | syn_lab += [autoenc_label.eval(feed_dict={data_in: data_, label_in: label_})] 545 | 546 | syn_dat = np.array(syn_dat) 547 | syn_lab = np.array(syn_lab) 548 | syn_lab = np.round(syn_lab) 549 | syn_dat.resize(syn_dat.shape[0] * syn_dat.shape[1], syn_dat.shape[2], syn_dat.shape[3]) 550 | syn_lab.resize(syn_lab.shape[0] * syn_lab.shape[1], syn_lab.shape[2], syn_lab.shape[3]) 551 | np.save(save_dir + 'synthetic_data_' + str(ensem) + '.npy', syn_dat) 552 | np.save(save_dir + 'synthetic_label_' + str(ensem) + '.npy', syn_lab) 553 | return 554 | 555 | # use this function to integrate synthetic data with real data 556 | def integrateSynthetic(self,
ensem_dir, syn_dir, save_dir): 557 | # ensem_dir should only need to append lab or dat plus number 558 | # syn_dir should only need to append number 559 | 560 | if not os.path.exists(save_dir): 561 | os.makedirs(save_dir) 562 | 563 | 564 | for i in range(self.Config.AE_NUM_ENSEMBLES): 565 | 566 | print(i) 567 | e_lab = np.load(ensem_dir + 'lab' + str(i) + '.npy') 568 | e_dat = np.load(ensem_dir + 'dat' + str(i) + '.npy') 569 | s_dat = np.load(syn_dir + 'synthetic_data_' + str(i) + '.npy') 570 | s_lab = np.load(syn_dir + 'synthetic_label_' + str(i) + '.npy') 571 | 572 | c_dat = np.concatenate((e_dat, s_dat), axis=0) 573 | c_lab = np.concatenate((e_lab, s_lab), axis=0) 574 | 575 | shuffle = np.random.choice(len(c_lab), len(c_lab), replace=False) 576 | 577 | np.save(save_dir + 'ensem_lab' + str(i) + '.npy', c_lab[shuffle]) 578 | np.save(save_dir + 'ensem_dat' + str(i) + '.npy', c_dat[shuffle]) 579 | 580 | return 581 | -------------------------------------------------------------------------------- /utils/train_models.py: -------------------------------------------------------------------------------- 1 | import os 2 | import numpy as np 3 | from sklearn.metrics import f1_score 4 | from sklearn.metrics import log_loss 5 | from sklearn.preprocessing import MinMaxScaler 6 | from sklearn.utils import class_weight 7 | from utils.config import * 8 | from sklearn.model_selection import train_test_split 9 | 10 | # train seq2seq model 11 | class Seq2seqTraining: 12 | 13 | def __init__(self, data): 14 | 15 | if data == 'Sentiment': 16 | self.Config = SentimentConfig() 17 | elif data == 'Power': 18 | self.Config = PowerConfig() 19 | else: 20 | raise ValueError('Invalid value for data option') 21 | 22 | def runSeq2seqEnsemble(self, model_func, train_folder, x_val, y_val, save_folder): 23 | 24 | if not os.path.exists(save_folder): 25 | os.makedirs(save_folder) 26 | 27 | y_val = y_val.argmax(axis=2) 28 | 29 | decoder_val = np.zeros((len(y_val), 1, self.Config.NUM_CLASSES)) 30 | 31 
| # get weight labels for data 32 | weights = class_weight.compute_class_weight('balanced', np.unique(np.max(y_val, axis=1)), 33 | np.max(y_val, axis=1)) 34 | 35 | models = [] 36 | model = model_func() 37 | for i in range(self.Config.NUM_ENSEMBLES): 38 | models += [model] 39 | 40 | # train model 41 | for j in range(self.Config.NUM_ENSEMBLES): 42 | print('ensemble', j) 43 | train_dat = np.load(train_folder + 'dat' + str(j) + '.npy') 44 | train_lab = np.load(train_folder + 'lab' + str(j) + '.npy') 45 | 46 | decoder_train = np.zeros((len(train_lab), 1, self.Config.NUM_CLASSES)) 47 | 48 | # store accuracy and loss 49 | accuracy = -1 50 | val_f1 = [] 51 | val_loss = [] 52 | train_f1 = [] 53 | 54 | for i in range(self.Config.EPOCHS): 55 | print('epoch', i) 56 | # train model 57 | models[j].fit([train_dat, decoder_train], train_lab, batch_size=self.Config.BATCH_SIZE, 58 | epochs=1, verbose=1, class_weight=weights) 59 | 60 | # predict on model to get val and train f1 61 | pred = models[j].predict([train_dat, decoder_train], batch_size=self.Config.BATCH_SIZE) 62 | train_pred = pred.argmax(axis=2) 63 | np.save(save_folder + 'train_pred.npy', pred) 64 | np.save(save_folder + 'train_lab.npy', train_lab) 65 | 66 | pred = models[j].predict([x_val, decoder_val], batch_size=self.Config.BATCH_SIZE) 67 | np.save('val_pred.npy', pred) 68 | val_pred = pred.argmax(axis=2) 69 | 70 | # get f1-scores and loss 71 | if self.Config.NUM_CLASSES == 2: 72 | train_f1 += [np.mean(f1_score(train_lab.argmax(axis=2), train_pred, average=None)[1:])] 73 | val_f1 += [np.mean(f1_score(y_val, val_pred, average=None)[1:])] 74 | else: 75 | train_f1 += [np.mean(f1_score(train_lab.argmax(axis=2), train_pred, 76 | labels=[0, 1, 2], average=None)[1:])] 77 | val_f1 += [np.mean(f1_score(y_val, val_pred, labels=[0, 1, 2], average=None)[1:])] 78 | val_loss += [log_loss(y_val, val_pred)] 79 | # np.save(save_folder + 'val_pred.npy', val_pred) 80 | np.save(save_folder + 'val_lab.npy', y_val) 81 | 82 | # save model 
                # weights, loss and f1-score if validation accuracy improves
                if val_f1[-1] > accuracy:
                    accuracy = val_f1[-1]

                    print('Saved model to disk')
                    model_json = models[j].to_json()
                    with open(save_folder + "seq_ensem" + str(j) + ".json", "w") as json_file:
                        json_file.write(model_json)
                    # serialize weights to HDF5
                    models[j].save_weights(save_folder + "seq_ensem" + str(j) + ".h5")

                    # save loss and f1-score history, gives us a sense of where the model left off
                    np.save(save_folder + 'ensem_' + str(j) + '_val_loss.npy', val_loss)
                    np.save(save_folder + 'ensem_' + str(j) + '_val_fscore.npy', val_f1)

            np.save(save_folder + 'ensem_' + str(j) + '_val_loss.npy', val_loss)
            np.save(save_folder + 'ensem_' + str(j) + '_val_fscore.npy', val_f1)
            np.save(save_folder + 'ensem_' + str(j) + '_train_fscore.npy', train_f1)

        return


# train sequence to one models
class Seq2oneTraining:

    def __init__(self, data):

        if data == 'Sentiment':
            self.Config = SentimentConfig()
        elif data == 'Power':
            self.Config = PowerConfig()
        else:
            raise ValueError('Invalid value for data option')

    # train model with ensembles

    def runSeq2oneEnsemble(self, model_func, train_folder, x_val, y_val, save_folder):

        if not os.path.exists(save_folder):
            os.makedirs(save_folder)

        # load data
        if len(y_val.shape) == 3:
            y_val = y_val[:, -1, :]
        y_val = y_val.argmax(axis=1)

        decoder_val = np.zeros((len(y_val), 1, self.Config.NUM_CLASSES))

        # get weight labels for data
        weights = class_weight.compute_class_weight('balanced', np.unique(y_val), y_val)

        # build one independent model per ensemble member; appending the same
        # instance repeatedly would make every member share a single set of weights
        models = [model_func() for _ in range(self.Config.NUM_ENSEMBLES)]

        # train model
        for j in range(self.Config.NUM_ENSEMBLES):
            print('ensemble', j)
            train_dat = \
                np.load(train_folder + 'dat' + str(j) + '.npy')
            train_lab = np.load(train_folder + 'lab' + str(j) + '.npy')
            if len(train_lab.shape) == 3:
                train_lab = train_lab[:, -1, :]
            train_lab = np.expand_dims(train_lab, axis=1)

            decoder_train = np.zeros((len(train_lab), 1, self.Config.NUM_CLASSES))

            # store accuracy and loss
            accuracy = -1
            val_f1 = []
            val_loss = []
            train_f1 = []

            for i in range(self.Config.EPOCHS):
                print('epoch', i)
                # train model
                models[j].fit([train_dat, decoder_train], train_lab, batch_size=self.Config.BATCH_SIZE,
                              epochs=1, verbose=1, class_weight=weights)

                # predict on model to get val and train f1
                pred = models[j].predict([train_dat, decoder_train], batch_size=self.Config.BATCH_SIZE)
                train_pred = pred.argmax(axis=2)
                np.save(save_folder + 'train_pred.npy', pred)
                np.save(save_folder + 'train_lab.npy', train_lab)

                pred = models[j].predict([x_val, decoder_val], batch_size=self.Config.BATCH_SIZE)
                np.save('val_pred.npy', pred)
                val_pred = pred.argmax(axis=2)

                # get f1-scores and loss
                if self.Config.NUM_CLASSES == 2:
                    train_f1 += [f1_score(train_lab.argmax(axis=2), train_pred)]
                    val_f1 += [f1_score(y_val, val_pred)]
                else:
                    train_f1 += [np.mean(f1_score(train_lab.argmax(axis=2), train_pred,
                                                  labels=[0, 1, 2], average=None)[1:])]
                    val_f1 += [np.mean(f1_score(y_val, val_pred, labels=[0, 1, 2], average=None)[1:])]
                val_loss += [log_loss(y_val, np.squeeze(pred, 1))]

                # save model weights, loss and f1-score if validation accuracy improves
                if val_f1[-1] > accuracy:
                    accuracy = val_f1[-1]

                    print('Saved model to disk')
                    model_json = models[j].to_json()
                    with open(save_folder + "seq_ensem" + str(j) + ".json", "w") as json_file:
                        json_file.write(model_json)
                    # serialize weights to HDF5
                    models[j].save_weights(save_folder + "seq_ensem" + str(j) + ".h5")

                    # save loss and f1-score history, gives us a sense of where the model left off
                    np.save(save_folder + 'ensem_' + str(j) + '_val_loss.npy', val_loss)
                    np.save(save_folder + 'ensem_' + str(j) + '_val_fscore.npy', val_f1)

            np.save(save_folder + 'ensem_' + str(j) + '_val_loss.npy', val_loss)
            np.save(save_folder + 'ensem_' + str(j) + '_val_fscore.npy', val_f1)
            np.save(save_folder + 'ensem_' + str(j) + '_train_fscore.npy', train_f1)

        return


# train autoencoder models
class AutoencoderTraining:

    def __init__(self, data):

        if data == 'Sentiment':
            self.Config = SentimentConfig()
        elif data == 'Power':
            self.Config = PowerConfig()
        else:
            raise ValueError('Invalid value for data option')

    def trainAutoenc(self, model, x_train, save_folder):

        if not os.path.exists(save_folder):
            os.makedirs(save_folder)

        # use this to keep track of loss
        loss = 100000
        val_loss = []
        train_loss = []

        # transform data
        min_max_scaler = MinMaxScaler()
        train_scale = np.reshape(x_train, (x_train.shape[0], x_train.shape[1] * x_train.shape[2]))
        y_train = min_max_scaler.fit_transform(train_scale)
        y_train.resize(x_train.shape)

        # get validation set
        x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.1, random_state=42)
        print(x_train.shape)
        print(x_val.shape)

        # train model
        for i in range(self.Config.EPOCHS):

            hist = model.fit(x_train, y_train, batch_size=self.Config.BATCH_SIZE,
                             validation_data=(x_val, y_val),
                             epochs=1, verbose=1)
            val_loss += [hist.history['val_loss'][0]]
            train_loss += [hist.history['loss'][0]]

            if loss > val_loss[-1]:
                loss = val_loss[-1]

                print("Saved model to disk")

                # serialize
                # model to json
                model_json = model.to_json()
                with open(save_folder + "model" + str(i) + ".json", "w") as json_file:
                    json_file.write(model_json)
                # serialize weights to HDF5
                model.save_weights(save_folder + "model" + str(i) + ".h5")

                # save loss history, gives us a sense of where the model left off
                np.save(save_folder + 'val_loss.npy', val_loss)
                np.save(save_folder + 'train_loss.npy', train_loss)

        np.save(save_folder + 'val_loss.npy', val_loss)
        np.save(save_folder + 'train_loss.npy', train_loss)

        return

--------------------------------------------------------------------------------
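The scaling step in `trainAutoenc` flattens each sequence into one row, min-max scales every (timestep, feature) column, and restores the 3-D shape to form the reconstruction target. The sketch below reproduces that transformation with NumPy only. The function name `minmax_scale_sequences` is hypothetical and not part of the repository; it assumes the default `MinMaxScaler` range of [0, 1] and maps constant columns to 0.

```python
import numpy as np


def minmax_scale_sequences(x):
    """Scale a (samples, timesteps, features) array to [0, 1] per flattened column.

    Hypothetical helper for illustration; mirrors the reshape -> scale -> reshape
    pattern used before autoencoder training.
    """
    n_samples = x.shape[0]
    # flatten each sequence into one row: one column per (timestep, feature) pair
    flat = x.reshape(n_samples, -1).astype(float)
    col_min = flat.min(axis=0)
    col_range = flat.max(axis=0) - col_min
    # constant columns have zero range; divide by 1 instead so they map to 0
    col_range[col_range == 0] = 1.0
    scaled = (flat - col_min) / col_range
    # restore the original 3-D layout
    return scaled.reshape(x.shape)
```

Because the minimum and range are computed per flattened column, the same feature is scaled separately at each timestep rather than once across the whole sequence.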