├── LICENSE ├── Media ├── Graph_102.png ├── pred_tar.png └── train_val.png ├── README.md ├── _config.yml ├── autoencoder.py ├── network.py └── test_bench.py /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 Jonathan Zia 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /Media/Graph_102.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jonzia/Recurrent_Autoencoder/cff12ad78028a58ab7eb3054200657e96956775d/Media/Graph_102.png -------------------------------------------------------------------------------- /Media/pred_tar.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jonzia/Recurrent_Autoencoder/cff12ad78028a58ab7eb3054200657e96956775d/Media/pred_tar.png -------------------------------------------------------------------------------- /Media/train_val.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jonzia/Recurrent_Autoencoder/cff12ad78028a58ab7eb3054200657e96956775d/Media/train_val.png -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Recurrent Autoencoder v1.0.3 2 | 3 | ## Overview 4 | Recurrent autoencoder for unsupervised feature extraction from multidimensional time-series. 5 | 6 | ## Description 7 | This program implements a recurrent autoencoder for time-series analysis. The input to the program is a .csv file with feature columns. The time-series input is encoded with a single LSTM layer and decoded with a second LSTM layer to recreate the input. The output of the encoder layer feeds in to a single latent layer, which contains a compressed representation of the feature vector. The architecture for the network is the same as illustrated in [this paper](https://arxiv.org/pdf/1406.1078.pdf). 8 | 9 | ![Tensorboard Graph](https://raw.githubusercontent.com/jonzia/Recurrent_Autoencoder/master/Media/Graph_102.png?token=AQD91RksLIO02mj5jW6CVhJiP9AC5ibyks5aytVQwA%3D%3D) 10 | 11 | ## To Run 12 | 1. The program is designed to accept .csv files with the following format. 13 | 14 | Timestamps | Features | Labels 15 | --- | --- | --- 16 | t = 1 | Feature 1 ... 
Feature N | Label 1 ... Label M 17 | ... | ... | ... 18 | t = T | Feature 1 ... Feature N | Label 1 ... Label M 19 | 20 | 2. Set LSTM parameters/architecture for encoder and decoder in *network.py*. 21 | 3. Set training parameters in *autoencoder.py*: 22 | ```python 23 | # Training 24 | NUM_TRAINING = 100 # Number of training batches (balanced minibatches) 25 | NUM_VALIDATION = 100 # Number of validation batches (balanced minibatches) 26 | # Learning rate decay 27 | # Decay type can be 'none', 'exp', 'inv_time', or 'nat_exp' 28 | DECAY_TYPE = 'exp' # Set decay type for learning rate 29 | LEARNING_RATE_INIT = 0.001 # Set initial learning rate for optimizer (default 0.001) (fixed LR for 'none') 30 | LEARNING_RATE_END = 0.00001 # Set ending learning rate for optimizer 31 | # Load File 32 | LOAD_FILE = False # Load initial LSTM model from saved checkpoint? 33 | ``` 34 | 4. Set file paths in *autoencoder.py*: 35 | ```python 36 | # Specify filenames 37 | # Root directory: 38 | dir_name = "ROOT_DIRECTORY" 39 | with tf.name_scope("Training_Data"): # Training dataset 40 | tDataset = os.path.join(dir_name, "trainingdata.csv") 41 | with tf.name_scope("Validation_Data"): # Validation dataset 42 | vDataset = os.path.join(dir_name, "validationdata.csv") 43 | with tf.name_scope("Model_Data"): # Model save/load paths 44 | load_path = os.path.join(dir_name, "checkpoints/model") # Load previous model 45 | save_path = os.path.join(dir_name, "checkpoints/model") # Save model at each step 46 | save_path_op = os.path.join(dir_name, "checkpoints/model_op") # Save optimal model 47 | with tf.name_scope("Filewriter_Data"): # Filewriter save path 48 | filewriter_path = os.path.join(dir_name, "output") 49 | with tf.name_scope("Output_Data"): # Output data filenames (.txt) 50 | # These .txt files will contain loss data for Matlab analysis 51 | training_loss = os.path.join(dir_name, "training_loss.txt") 52 | validation_loss = os.path.join(dir_name, "validation_loss.txt") 53 | ``` 54 | 5. Run *autoencoder.py*. **(1)** Outputs of the program include training and validation loss .txt files as well as the Tensorboard graph and summaries. The model is saved at each timestep, and the optimal model as per the validation loss is saved separately. 55 | 6. *(Optional)* Run *test_bench.py* to obtain prediction, target, and latent representation values on a test dataset for the trained model. Ensure that proper filepaths are set before running. 56 | 57 | The following is an example of a rapid implementation which encodes the features of a simple sine wave. 58 | 59 | ![Training and Validation Loss](https://raw.githubusercontent.com/jonzia/Recurrent_Autoencoder/master/Media/train_val.png) 60 | ![Predictions and Targets](https://raw.githubusercontent.com/jonzia/Recurrent_Autoencoder/master/Media/pred_tar.png) 61 | 62 | ## Change Log 63 | _v1.0.3_: Updated to a more generalizable architecture proposed by [Cho et. al (2014)](https://arxiv.org/pdf/1406.1078.pdf). 64 | 65 | _v1.0.2_: Updated architecture from that proposed by [D. Hsu (2017)](https://arxiv.org/pdf/1707.07961.pdf) to that proposed in [this paper](https://openreview.net/pdf/74b996ba787a74199dc0b5ba1df77e436f6ad5a5.pdf), namely one latent layer compressing all time steps and output-feedback into the decoder layer. 66 | 67 | ## Notes 68 | (1) Ensure that you have [Tensorflow](https://www.tensorflow.org/) and [Pandas](https://pandas.pydata.org) installed. This program was built on Python 3.6 and Tensorflow 1.5. 
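## Example Dataset

For quick experimentation, a dataset in the expected layout can be generated with a short script. The snippet below is a minimal sketch and is not part of the repository: the output filename matches the training path used in the example above, and the nine feature columns match the default `input_features` in *network.py*.

```python
import numpy as np
import pandas as pd

# Toy dataset: one timestamp column, 9 feature columns (phase-shifted sine waves),
# and one placeholder label column. No header row, since extract_data() reads the
# .csv with header=None and takes features from columns 1 .. input_features.
T, NUM_FEATURES = 1000, 9
t = np.arange(T)
features = np.stack([np.sin(2 * np.pi * t / 100 + k) for k in range(NUM_FEATURES)], axis=1)
labels = np.zeros((T, 1))  # labels are ignored by the autoencoder itself
data = np.column_stack([t, features, labels])
pd.DataFrame(data).to_csv("trainingdata.csv", header=False, index=False)
```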
69 | -------------------------------------------------------------------------------- /_config.yml: -------------------------------------------------------------------------------- 1 | theme: jekyll-theme-minimal 2 | title: Recurrent Autoencoder 3 | -------------------------------------------------------------------------------- /autoencoder.py: -------------------------------------------------------------------------------- 1 | # ---------------------------------------------------- 2 | # Time-Series Autoencoder using Tensorflow 1.0.3 3 | # Created by: Jonathan Zia 4 | # Last Modified: Friday, March 9, 2018 5 | # Georgia Institute of Technology 6 | # ---------------------------------------------------- 7 | import tensorflow as tf 8 | import network as net 9 | import pandas as pd 10 | import random as rd 11 | import numpy as np 12 | import time 13 | import math 14 | import csv 15 | import os 16 | 17 | # ---------------------------------------------------- 18 | # User-Defined Constants 19 | # ---------------------------------------------------- 20 | # Training 21 | NUM_TRAINING = 500 # Number of training batches (balanced minibatches) 22 | NUM_VALIDATION = 500 # Number of validation batches (balanced minibatches) 23 | 24 | # Learning rate decay 25 | # Decay type can be 'none', 'exp', 'inv_time', or 'nat_exp' 26 | DECAY_TYPE = 'none' # Set decay type for learning rate 27 | LEARNING_RATE_INIT = 0.001 # Set initial learning rate for optimizer (default 0.001) (fixed LR for 'none') 28 | LEARNING_RATE_END = 0.00001 # Set ending learning rate for optimizer 29 | 30 | # Load File 31 | LOAD_FILE = False # Load initial LSTM model from saved checkpoint? 32 | 33 | 34 | # ---------------------------------------------------- 35 | # Instantiate Network Classes 36 | # ---------------------------------------------------- 37 | lstm_encoder = net.EncoderNetwork() 38 | lstm_decoder = net.DecoderNetwork(batch_size = lstm_encoder.batch_size, num_steps = lstm_encoder.num_steps, 39 | input_features = lstm_encoder.latent+lstm_encoder.input_features) 40 | 41 | 42 | # ---------------------------------------------------- 43 | # Input data files 44 | # ---------------------------------------------------- 45 | # Specify filenames 46 | # Root directory: 47 | dir_name = "/Users/username" 48 | with tf.name_scope("Training_Data"): # Training dataset 49 | tDataset = os.path.join(dir_name, "data/dataset.csv") 50 | with tf.name_scope("Validation_Data"): # Validation dataset 51 | vDataset = os.path.join(dir_name, "data/dataset.csv") 52 | with tf.name_scope("Model_Data"): # Model save/load paths 53 | load_path = os.path.join(dir_name, "checkpoints/model") # Load previous model 54 | save_path = os.path.join(dir_name, "checkpoints/model") # Save model at each step 55 | save_path_op = os.path.join(dir_name, "checkpoints/model_op") # Save optimal model 56 | with tf.name_scope("Filewriter_Data"): # Filewriter save path 57 | filewriter_path = os.path.join(dir_name, "output") 58 | with tf.name_scope("Output_Data"): # Output data filenames (.txt) 59 | # These .txt files will contain loss data for Matlab analysis 60 | training_loss = os.path.join(dir_name, "training_loss.txt") 61 | validation_loss = os.path.join(dir_name, "validation_loss.txt") 62 | 63 | # Obtain length of testing and validation datasets 64 | file_length = len(pd.read_csv(tDataset)) 65 | v_file_length = len(pd.read_csv(vDataset)) 66 | 67 | 68 | # ---------------------------------------------------- 69 | # User-Defined Methods 70 | # 
---------------------------------------------------- 71 | 72 | def init_values(shape): 73 | """ 74 | Initialize Weight and Bias Matrices 75 | Returns: Tensor of shape "shape" w/ normally-distributed values 76 | """ 77 | temp = tf.truncated_normal(shape, stddev=0.1) 78 | return tf.Variable(temp) 79 | 80 | 81 | def extract_data(filename, batch_size, num_steps, input_features, f_length): 82 | """ 83 | Extract features and labels from filename.csv in random batches 84 | Returns: 85 | feature_batch ~ [batch_size, num_steps, input_features] 86 | label_batch := feature_batch 87 | """ 88 | 89 | # Initialize numpy arrays for return value placeholders 90 | feature_batch = np.zeros((batch_size,num_steps,input_features)) 91 | label_batch = np.zeros((batch_size,num_steps,input_features)) 92 | 93 | # Import data from CSV as a random minibatch: 94 | for i in range(batch_size): 95 | 96 | # Generate random index for number of rows to skip 97 | temp_index = rd.randint(0, f_length-num_steps-1) 98 | # Read data from CSV and write as matrix 99 | temp = pd.read_csv(filename, skiprows=temp_index, nrows=num_steps, header=None) 100 | temp = temp.as_matrix() 101 | 102 | # Return features in specified columns 103 | feature_batch[i,:,:] = temp[:,1:input_features+1] 104 | # Setting features as labels for autoencoding 105 | label_batch[i,:,:] = feature_batch[i,:,:] 106 | 107 | # Return feature and label batches 108 | return feature_batch, label_batch 109 | 110 | 111 | def set_decay_rate(decay_type, learning_rate_init, learning_rate_end, num_training): 112 | """ 113 | Calcualte decay rate for specified decay type 114 | Returns: Scalar decay rate 115 | """ 116 | 117 | if decay_type == 'none': 118 | return 0 119 | elif decay_type == 'exp': 120 | return math.pow((learning_rate_end/learning_rate_init),(1/num_training)) 121 | elif decay_type == 'inv_time': 122 | return ((learning_rate_init/learning_rate_end)-1)/num_training 123 | elif decay_type == 'nat_exp': 124 | return (-1/num_training)*math.log(learning_rate_end/learning_rate_init) 125 | else: 126 | return 0 127 | 128 | 129 | def decayed_rate(decay_type, decay_rate, learning_rate_init, step): 130 | """ 131 | Calculate decayed learning rate for specified parameters 132 | Returns: Scalar decayed learning rate 133 | """ 134 | 135 | if decay_type == 'none': 136 | return learning_rate_init 137 | elif decay_type == 'exp': 138 | return learning_rate_init*math.pow(decay_rate,step) 139 | elif decay_type == 'inv_time': 140 | return learning_rate_init/(1+decay_rate*step) 141 | elif decay_type == 'nat_exp': 142 | return learning_rate_init*math.exp(-decay_rate*step) 143 | 144 | 145 | # ---------------------------------------------------- 146 | # Importing Session Parameters 147 | # ---------------------------------------------------- 148 | 149 | # Create placeholders for inputs and target values 150 | # Input dimensions: BATCH_SIZE x NUM_STEPS x INPUT_FEATURES 151 | # Target dimensions: BATCH_SIZE x NUM_STEPS x INPUT_FEATURES 152 | inputs = tf.placeholder(tf.float32, [lstm_encoder.batch_size, lstm_encoder.num_steps, lstm_encoder.input_features], name="Input_Placeholder") 153 | targets = tf.placeholder(tf.float32, [lstm_encoder.batch_size, lstm_encoder.num_steps, lstm_encoder.input_features], name="Target_Placeholder") 154 | 155 | # Create placeholder for learning rate 156 | learning_rate = tf.placeholder(tf.float32, name="Learning_Rate_Placeholder") 157 | 158 | 159 | # ---------------------------------------------------- 160 | # Building an LSTM Encoder 161 | # 
---------------------------------------------------- 162 | # Build LSTM cell 163 | # Creating basic LSTM cell 164 | encoder_cell = tf.contrib.rnn.BasicLSTMCell(lstm_encoder.num_lstm_hidden,name='Encoder_Cell') 165 | # Adding dropout wrapper to cell 166 | encoder_cell = tf.nn.rnn_cell.DropoutWrapper(encoder_cell, input_keep_prob=lstm_encoder.i_keep_prob, output_keep_prob=lstm_encoder.o_keep_prob) 167 | 168 | # Initialize weights and biases for latent layer. 169 | with tf.name_scope("Encoder_Variables"): 170 | W_latent = init_values([lstm_encoder.num_lstm_hidden, lstm_encoder.latent]) 171 | tf.summary.histogram('Weights',W_latent) 172 | b_latent = init_values([lstm_encoder.latent]) 173 | tf.summary.histogram('Biases',b_latent) 174 | 175 | # Add LSTM cells to dynamic_rnn and implement truncated BPTT 176 | initial_state_encoder = state_encoder = encoder_cell.zero_state(lstm_encoder.batch_size, tf.float32) 177 | with tf.variable_scope("Encoder_RNN"): 178 | for i in range(lstm_encoder.num_steps): 179 | # Obtain output at each step 180 | output, state_encoder = tf.nn.dynamic_rnn(encoder_cell, inputs[:,i:i+1,:], initial_state=state_encoder) 181 | # Obtain final output and convert to logit 182 | # Reshape output to remove extra dimension 183 | output = tf.reshape(output,[lstm_encoder.batch_size,lstm_encoder.num_lstm_hidden]) 184 | with tf.name_scope("Encoder_Output"): 185 | # Obtain logits by performing (weights)*(output)+(biases) 186 | logit = tf.matmul(output, W_latent) + b_latent 187 | # Convert logits to tensor 188 | latent_layer = tf.convert_to_tensor(logit) 189 | # Converting to dimensions [batch_size, 1 (num_steps), latent] 190 | latent_layer = tf.expand_dims(latent_layer,1,name='latent_layer') 191 | 192 | 193 | # ---------------------------------------------------- 194 | # Building an LSTM Decoder 195 | # ---------------------------------------------------- 196 | # Build LSTM cell 197 | # Creating basic LSTM cells 198 | # decoder_cell_1 is the first cell in the decoder layer, accepting latent_layer as an input 199 | # decoder_cell_2 is each subsequent cell in the decoder layer, accepting the output at (t-1) 200 | # as the input and the hidden state at (t-1) as the initial state. 201 | decoder_cell_1 = tf.contrib.rnn.BasicLSTMCell(lstm_decoder.num_lstm_hidden,name='Decoder_Cell_1') 202 | decoder_cell_2 = tf.contrib.rnn.BasicLSTMCell(lstm_decoder.num_lstm_hidden,name='Decoder_Cell_2') 203 | # Adding dropout wrapper to each cell 204 | decoder_cell_1 = tf.nn.rnn_cell.DropoutWrapper(decoder_cell_1, input_keep_prob=lstm_decoder.i_keep_prob, output_keep_prob=lstm_decoder.o_keep_prob) 205 | decoder_cell_2 = tf.nn.rnn_cell.DropoutWrapper(decoder_cell_2, input_keep_prob=lstm_decoder.i_keep_prob, output_keep_prob=lstm_decoder.o_keep_prob) 206 | 207 | # Initialize weights and biases for output layer. 
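# (W_output has shape [num_lstm_hidden, input_features] and b_output has shape
# [input_features]: at each decoder step the reshaped LSTM output of shape
# [batch_size, num_lstm_hidden] is mapped through them to a reconstructed feature
# vector of shape [batch_size, input_features].)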
208 | with tf.name_scope("Decoder_Variables"): 209 | W_output = init_values([lstm_decoder.num_lstm_hidden, lstm_encoder.input_features]) 210 | tf.summary.histogram('Weights',W_output) 211 | b_output = init_values([lstm_encoder.input_features]) 212 | tf.summary.histogram('Biases',b_output) 213 | 214 | # Initialize the initial state for the first LSTM cell 215 | initial_state_decoder = state_decoder = decoder_cell_1.zero_state(lstm_decoder.batch_size, tf.float32) 216 | # Initialize placeholder for outputs of the decoder layer at each timestep 217 | logits = [] 218 | with tf.variable_scope("Decoder_RNN"): 219 | for i in range(lstm_decoder.num_steps): 220 | # Obtain output at each step 221 | # For the first timestep... 222 | if i == 0: 223 | # Input the latent layer and obtain the output and hidden state 224 | output, state_decoder = tf.nn.dynamic_rnn(decoder_cell_1, latent_layer, initial_state=state_decoder) 225 | else: # For all subsequent timesteps... 226 | # Combine the output layer at (t-1) with the latent layer; then input to obtain output at (t) and the hidden state 227 | input_vector = tf.concat([tf.expand_dims(output,1),latent_layer],axis=2,name='Decoder_Input') 228 | output, state_decoder = tf.nn.dynamic_rnn(decoder_cell_2, input_vector, initial_state=state_decoder) 229 | # Obtain output and convert to logit 230 | # Reshape output to remove extra dimension 231 | output = tf.reshape(output,[lstm_decoder.batch_size,lstm_decoder.num_lstm_hidden]) 232 | with tf.name_scope("Decoder_Output"): 233 | # Obtain logits by applying operation (weights)*(outputs)+(biases) 234 | logit = tf.matmul(output, W_output) + b_output 235 | # Append output at each timestep 236 | logits.append(logit) 237 | # Convert logits to tensor entitled "predictions" 238 | predictions = tf.convert_to_tensor(logits) 239 | # Converting to dimensions [batch_size, num_steps, input_features] 240 | predictions = tf.transpose(logits, perm=[1, 0, 2], name='Predictions') 241 | 242 | 243 | # ---------------------------------------------------- 244 | # Calculate Loss and Define Optimizer 245 | # ---------------------------------------------------- 246 | # Calculating mean squared error of labels and logits 247 | loss = tf.losses.mean_squared_error(labels=targets, predictions=predictions) 248 | loss = tf.reduce_mean(loss) 249 | optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss) 250 | 251 | 252 | # ---------------------------------------------------- 253 | # Run Session 254 | # ---------------------------------------------------- 255 | init = tf.global_variables_initializer() 256 | saver = tf.train.Saver() # Instantiate Saver class 257 | t_loss = [] # Placeholder for training loss values 258 | v_loss = [] # Placeholder for validation loss values 259 | with tf.Session() as sess: 260 | # Create Tensorboard graph 261 | writer = tf.summary.FileWriter(filewriter_path, sess.graph) 262 | merged = tf.summary.merge_all() 263 | 264 | # If there is a model checkpoint saved, load the checkpoint. Else, initialize variables. 
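# (LOAD_FILE is set in the User-Defined Constants section above; when it is True,
# load_path must point to a checkpoint written by a previous run, otherwise
# saver.restore() will raise an error.)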
265 | if LOAD_FILE: 266 | # Restore saved session 267 | saver.restore(sess, load_path) 268 | else: 269 | # Initialize the variables 270 | sess.run(init) 271 | 272 | # Training the network 273 | 274 | # Setting step ranges 275 | step_range = NUM_TRAINING # Set step range for training 276 | v_step_range = NUM_VALIDATION # Set step range for validation 277 | 278 | # Obtain start time 279 | start_time = time.time() 280 | # Initialize optimal loss 281 | loss_op = 0 282 | # Determine learning rate decay 283 | decay_rate = set_decay_rate(DECAY_TYPE, LEARNING_RATE_INIT, LEARNING_RATE_END, NUM_TRAINING) 284 | 285 | if DECAY_TYPE != 'none': 286 | print('\nLearning Decay Rate = ', decay_rate) 287 | 288 | # Set number of trials to NUM_TRAINING 289 | for step in range(0,step_range): 290 | 291 | # Initialize optimal model saver to False 292 | save_op = False 293 | 294 | try: # While there is no out-of-bounds exception... 295 | 296 | # Obtaining batch of features and labels from TRAINING dataset(s) 297 | features, labels = extract_data(tDataset, lstm_encoder.batch_size, lstm_encoder.num_steps, lstm_encoder.input_features, file_length) 298 | 299 | except: 300 | break 301 | 302 | # Set optional conditional for network training 303 | if True: 304 | # Print step 305 | print("\nOptimizing at step", step) 306 | 307 | # Calculate time-decay learning rate: 308 | decayed_learning_rate = decayed_rate(DECAY_TYPE, decay_rate, LEARNING_RATE_INIT, step) 309 | 310 | # Input data and learning rate 311 | feed_dict = {inputs: features, targets:labels, learning_rate:decayed_learning_rate} 312 | # Run optimizer, loss, and predicted error ops in graph 313 | predictions_, targets_, _, loss_ = sess.run([predictions, targets, optimizer, loss], feed_dict=feed_dict) 314 | 315 | # Record loss 316 | t_loss.append(loss_) 317 | 318 | # Evaluate network and print data in terminal periodically 319 | with tf.name_scope("Validation"): 320 | # Conditional statement for validation and printing 321 | if step % 50 == 0: 322 | print("\nMinibatch train loss at step", step, ":", loss_) 323 | 324 | # Evaluate network 325 | test_loss = [] 326 | for step_num in range(0,v_step_range): 327 | 328 | try: # While there is no out-of-bounds exception... 
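# (As in training, extract_data() reads num_steps consecutive rows starting at a
# random offset; the except clause simply ends validation early if the read fails,
# e.g. when the dataset contains fewer than num_steps + 1 rows.)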
329 | 330 | # Obtaining batch of features and labels from VALIDATION dataset(s) 331 | v_features, v_labels = extract_data(vDataset, lstm_encoder.batch_size, lstm_encoder.num_steps, lstm_encoder.input_features, v_file_length) 332 | 333 | except: 334 | break 335 | 336 | # Input data and run session to find loss 337 | data_test = {inputs: v_features, targets: v_labels} 338 | loss_test = sess.run(loss, feed_dict=data_test) 339 | test_loss.append(loss_test) 340 | 341 | # Record loss 342 | v_loss.append(np.mean(test_loss)) 343 | # Print test loss 344 | print("Test loss: %.3f" % np.mean(test_loss)) 345 | # For the first step, set optimal loss to test loss 346 | if step == 0: 347 | loss_op = np.mean(test_loss) 348 | # If test_loss < optimal loss, overwrite optimal loss 349 | if np.mean(test_loss) < loss_op: 350 | loss_op = np.mean(test_loss) 351 | save_op = True # Save model as new optimal model 352 | 353 | # Print predictions and targets for reference 354 | print("Predictions:") 355 | print(predictions_) 356 | print("Targets:") 357 | print(targets_) 358 | 359 | # Save and overwrite the session at each training step 360 | saver.save(sess, save_path) 361 | # Save the model if loss over the test set is optimal 362 | if save_op: 363 | saver.save(sess,save_path_op) 364 | 365 | # Writing summaries to Tensorboard at each training step 366 | summ = sess.run(merged) 367 | writer.add_summary(summ,step) 368 | 369 | # Conditional statement for calculating time remaining and percent completion 370 | if step % 10 == 0: 371 | 372 | # Report percent completion 373 | p_completion = 100*step/NUM_TRAINING 374 | print("\nPercent completion: %.3f%%" % p_completion) 375 | 376 | # Print time remaining 377 | avg_elapsed_time = (time.time() - start_time)/(step+1) 378 | sec_remaining = avg_elapsed_time*(NUM_TRAINING-step) 379 | min_remaining = round(sec_remaining/60) 380 | print("\nTime Remaining: %d minutes" % min_remaining) 381 | 382 | # Print learning rate if learning rate decay is used 383 | if DECAY_TYPE != 'none': 384 | print("\nLearning Rate = ", decayed_learning_rate) 385 | 386 | # Write training and validation loss to file 387 | t_loss = np.array(t_loss) 388 | v_loss = np.array(v_loss) 389 | with open(training_loss, 'a') as file_object: 390 | np.savetxt(file_object, t_loss) 391 | with open(validation_loss, 'a') as file_object: 392 | np.savetxt(file_object, v_loss) 393 | 394 | # Close the writer 395 | writer.close() -------------------------------------------------------------------------------- /network.py: -------------------------------------------------------------------------------- 1 | # ---------------------------------------------------- 2 | # Network Class for Autoencoder v1.0.3 3 | # Created by: Jonathan Zia 4 | # Last Modified: Friday, March 9, 2018 5 | # Georgia Institute of Technology 6 | # ---------------------------------------------------- 7 | 8 | class EncoderNetwork(): 9 | """Class containing all network parameters for use 10 | across LSTM_main and test_bench programs for ENCODER.""" 11 | 12 | def __init__(self): 13 | """Initialize network attributes""" 14 | 15 | # Architecture 16 | self.batch_size = 5 # Batch size 17 | self.num_steps = 100 # Max steps for BPTT 18 | self.num_lstm_hidden = 15 # Number of LSTM hidden units 19 | self.input_features = 9 # Number of input features 20 | self.i_keep_prob = 1.0 # Input keep probability / LSTM cell 21 | self.o_keep_prob = 1.0 # Output keep probability / LSTM cell 22 | 23 | # Special autoencoder parameters 24 | self.latent = 10 # Number of elements in 
latent layer 25 | 26 | class DecoderNetwork(): 27 | """Class containing all network parameters for use 28 | across LSTM_main and test_bench programs for DECODER.""" 29 | 30 | def __init__(self, batch_size, num_steps, input_features): 31 | """Initialize network attributes""" 32 | 33 | # Architecture 34 | self.batch_size = batch_size # Batch size 35 | self.num_steps = num_steps # Max steps for BPTT 36 | self.num_lstm_hidden = 15 # Number of LSTM hidden units 37 | self.input_features = input_features # Number of input features 38 | self.i_keep_prob = 1.0 # Input keep probability / LSTM cell 39 | self.o_keep_prob = 1.0 # Output keep probability / LSTM cell -------------------------------------------------------------------------------- /test_bench.py: -------------------------------------------------------------------------------- 1 | # ---------------------------------------------------- 2 | # LSTM Network Test Bench for Autoencoder v1.0.3 3 | # Created by: Jonathan Zia 4 | # Last Modified: Friday, March 9, 2018 5 | # Georgia Institute of Technology 6 | # ---------------------------------------------------- 7 | import tensorflow as tf 8 | import network as net 9 | import pandas as pd 10 | import random as rd 11 | import numpy as np 12 | import time 13 | import math 14 | import csv 15 | import os 16 | 17 | 18 | # ---------------------------------------------------- 19 | # User-Defined Constants 20 | # ---------------------------------------------------- 21 | # Training 22 | I_KEEP_PROB = 1.0 # Input keep probability / LSTM cell 23 | O_KEEP_PROB = 1.0 # Output keep probability / LSTM cell 24 | BATCH_SIZE = 1 # Batch size 25 | 26 | 27 | # ---------------------------------------------------- 28 | # Instantiate Network Classes 29 | # ---------------------------------------------------- 30 | lstm_encoder = net.EncoderNetwork() 31 | lstm_decoder = net.DecoderNetwork(batch_size = BATCH_SIZE, num_steps = lstm_encoder.num_steps, 32 | input_features = lstm_encoder.latent+lstm_encoder.input_features) 33 | WINDOW_INT = lstm_encoder.num_steps # Rolling window step interval 34 | 35 | 36 | # ---------------------------------------------------- 37 | # Input data files 38 | # ---------------------------------------------------- 39 | # Specify filenames 40 | # Root directory: 41 | dir_name = "Users/username" 42 | with tf.name_scope("Training_Data"): # Testing dataset 43 | Dataset = os.path.join(dir_name, "data/dataset.csv") 44 | with tf.name_scope("Model_Data"): # Model load path 45 | load_path = os.path.join(dir_name, "checkpoints/model") 46 | with tf.name_scope("Filewriter_Data"): # Filewriter save path 47 | filewriter_path = os.path.join(dir_name, "test_bench") 48 | with tf.name_scope("Output_Data"): # Output data filenames (.txt) 49 | # These .txt files will contain prediction and target data respectively for Matlab analysis 50 | prediction_file = os.path.join(dir_name, "predictions.txt") 51 | target_file = os.path.join(dir_name, "targets.txt") 52 | # This .txt file will contain latent representation for each time step 53 | latent_file = os.path.join(dir_name, "latent.txt") 54 | 55 | # Obtain length of testing and validation datasets 56 | file_length = len(pd.read_csv(Dataset)) 57 | 58 | # ---------------------------------------------------- 59 | # User-Defined Methods 60 | # ---------------------------------------------------- 61 | def init_values(shape): 62 | """ 63 | Initialize Weight and Bias Matrices 64 | Returns: Tensor of shape "shape" w/ normally-distributed values 65 | """ 66 | temp = 
tf.truncated_normal(shape, stddev=0.1) 67 | return tf.Variable(temp) 68 | 69 | def extract_data(filename, batch_size, num_steps, input_features, step): 70 | """ 71 | Extract features and labels from filename.csv as a rolling-window 72 | Returns: 73 | feature_batch ~ [1, num_steps, input_features] 74 | label_batch ~ [1, num_steps, input features] 75 | """ 76 | # NOTE: The empty dimension is required in order to feed inputs to LSTM cell. 77 | 78 | # Initialize numpy arrays for return value placeholders 79 | feature_batch = np.zeros((batch_size,num_steps, input_features)) 80 | label_batch = np.zeros((batch_size, num_steps, input_features)) 81 | 82 | # Import data from CSV as a sliding window: 83 | # First, import data starting from t = step to t = step + num_steps 84 | # ... add feature data to feature_batch[0, :, :] 85 | # ... assign label_batch the same value as feature_batch 86 | # Repeat for all batches. 87 | temp = pd.read_csv(filename, skiprows=step, nrows=num_steps, header=None) 88 | temp = temp.as_matrix() 89 | # Return features in specified columns 90 | feature_batch[0,:,:] = temp[:,1:input_features+1] 91 | # Return label batch, which has the same values as feature batch 92 | label_batch = feature_batch 93 | 94 | # Return feature and label batches 95 | return feature_batch, label_batch 96 | 97 | 98 | # ---------------------------------------------------- 99 | # Importing Session Parameters 100 | # ---------------------------------------------------- 101 | 102 | # Create placeholders for inputs and target values 103 | # Input dimensions: BATCH_SIZE x NUM_STEPS x INPUT_FEATURES 104 | # Target dimensions: BATCH_SIZE x NUM_STEPS x INPUT_FEATURES 105 | inputs = tf.placeholder(tf.float32, [BATCH_SIZE, lstm_encoder.num_steps, lstm_encoder.input_features], name="Input_Placeholder") 106 | targets = tf.placeholder(tf.float32, [BATCH_SIZE, lstm_encoder.num_steps, lstm_encoder.input_features], name="Target_Placeholder") 107 | 108 | 109 | # ---------------------------------------------------- 110 | # Building an LSTM Encoder 111 | # ---------------------------------------------------- 112 | # Build LSTM cell 113 | # Creating basic LSTM cell 114 | encoder_cell = tf.contrib.rnn.BasicLSTMCell(lstm_encoder.num_lstm_hidden,name='Encoder_Cell') 115 | # Adding dropout wrapper to cell 116 | encoder_cell = tf.nn.rnn_cell.DropoutWrapper(encoder_cell, input_keep_prob=lstm_encoder.i_keep_prob, output_keep_prob=lstm_encoder.o_keep_prob) 117 | 118 | # Initialize weights and biases for latent layer. 
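# (W_latent has shape [num_lstm_hidden, latent] and b_latent has shape [latent];
# they project the final encoder LSTM output down to the compressed latent
# representation, whose activations are the values written to latent.txt below.)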
119 | with tf.name_scope("Encoder_Variables"): 120 | W_latent = init_values([lstm_encoder.num_lstm_hidden, lstm_encoder.latent]) 121 | tf.summary.histogram('Weights',W_latent) 122 | b_latent = init_values([lstm_encoder.latent]) 123 | tf.summary.histogram('Biases',b_latent) 124 | 125 | # Add LSTM cells to dynamic_rnn and implement truncated BPTT 126 | initial_state_encoder = state_encoder = encoder_cell.zero_state(BATCH_SIZE, tf.float32) 127 | with tf.variable_scope("Encoder_RNN"): 128 | for i in range(lstm_encoder.num_steps): 129 | # Obtain output at each step 130 | output, state_encoder = tf.nn.dynamic_rnn(encoder_cell, inputs[:,i:i+1,:], initial_state=state_encoder) 131 | # Obtain final output and convert to logit 132 | # Reshape output to remove extra dimension 133 | output = tf.reshape(output,[BATCH_SIZE,lstm_encoder.num_lstm_hidden]) 134 | with tf.name_scope("Encoder_Output"): 135 | # Obtain logits by performing (weights)*(output)+(biases) 136 | logit = tf.matmul(output, W_latent) + b_latent 137 | # Convert logits to tensor 138 | logit_tensor = tf.convert_to_tensor(logit) 139 | # Converting to dimensions [batch_size, 1 (num_steps), latent] 140 | latent_layer = tf.expand_dims(logit_tensor,1,name='latent_layer') 141 | 142 | 143 | # ---------------------------------------------------- 144 | # Building an LSTM Decoder 145 | # ---------------------------------------------------- 146 | # Build LSTM cell 147 | # Creating basic LSTM cells 148 | # decoder_cell_1 is the first cell in the decoder layer, accepting latent_layer as an input 149 | # decoder_cell_2 is each subsequent cell in the decoder layer, accepting the output at (t-1) 150 | # as the input and the hidden state at (t-1) as the initial state. 151 | decoder_cell_1 = tf.contrib.rnn.BasicLSTMCell(lstm_decoder.num_lstm_hidden,name='Decoder_Cell_1') 152 | decoder_cell_2 = tf.contrib.rnn.BasicLSTMCell(lstm_decoder.num_lstm_hidden,name='Decoder_Cell_2') 153 | # Adding dropout wrapper to each cell 154 | decoder_cell_1 = tf.nn.rnn_cell.DropoutWrapper(decoder_cell_1, input_keep_prob=lstm_decoder.i_keep_prob, output_keep_prob=lstm_decoder.o_keep_prob) 155 | decoder_cell_2 = tf.nn.rnn_cell.DropoutWrapper(decoder_cell_2, input_keep_prob=lstm_decoder.i_keep_prob, output_keep_prob=lstm_decoder.o_keep_prob) 156 | 157 | # Initialize weights and biases for output layer. 158 | with tf.name_scope("Decoder_Variables"): 159 | W_output = init_values([lstm_decoder.num_lstm_hidden, lstm_encoder.input_features]) 160 | tf.summary.histogram('Weights',W_output) 161 | b_output = init_values([lstm_encoder.input_features]) 162 | tf.summary.histogram('Biases',b_output) 163 | 164 | # Initialize the initial state for the first LSTM cell 165 | initial_state_decoder = state_decoder = decoder_cell_1.zero_state(BATCH_SIZE, tf.float32) 166 | # Initialize placeholder for outputs of the decoder layer at each timestep 167 | logits = [] 168 | with tf.variable_scope("Decoder_RNN"): 169 | for i in range(lstm_decoder.num_steps): 170 | # Obtain output at each step 171 | # For the first timestep... 172 | if i == 0: 173 | # Input the latent layer and obtain the output and hidden state 174 | output, state_decoder = tf.nn.dynamic_rnn(decoder_cell_1, latent_layer, initial_state=state_decoder) 175 | else: # For all subsequent timesteps... 
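# (At this point `output` is the reshaped decoder LSTM output from step t-1, of
# shape [BATCH_SIZE, num_lstm_hidden]; expanding it and concatenating it with
# latent_layer along axis 2 gives an input of width num_lstm_hidden + latent.)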
176 | # Combine the output layer at (t-1) with the latent layer; then input to obtain output at (t) and the hidden state 177 | input_vector = tf.concat([tf.expand_dims(output,1),latent_layer],axis=2,name='Decoder_Input') 178 | output, state_decoder = tf.nn.dynamic_rnn(decoder_cell_2, input_vector, initial_state=state_decoder) 179 | # Obtain output and convert to logit 180 | # Reshape output to remove extra dimension 181 | output = tf.reshape(output,[BATCH_SIZE,lstm_decoder.num_lstm_hidden]) 182 | with tf.name_scope("Decoder_Output"): 183 | # Obtain logits by applying operation (weights)*(outputs)+(biases) 184 | logit = tf.matmul(output, W_output) + b_output 185 | # Append output at each timestep 186 | logits.append(logit) 187 | # Convert logits to tensor entitled "predictions" 188 | predictions = tf.convert_to_tensor(logits) 189 | # Converting to dimensions [batch_size, num_steps, input_features] 190 | predictions = tf.transpose(logits, perm=[1, 0, 2], name='Predictions') 191 | 192 | 193 | # ---------------------------------------------------- 194 | # Calculate Loss 195 | # ---------------------------------------------------- 196 | # Calculating softmax cross entropy of labels and logits 197 | loss = tf.losses.mean_squared_error(labels=targets, predictions=predictions) 198 | loss = tf.reduce_mean(loss) 199 | 200 | 201 | # ---------------------------------------------------- 202 | # Run Session 203 | # ---------------------------------------------------- 204 | saver = tf.train.Saver() # Instantiate Saver class 205 | with tf.Session() as sess: 206 | # Create Tensorboard graph 207 | writer = tf.summary.FileWriter(filewriter_path, sess.graph) 208 | merged = tf.summary.merge_all() 209 | 210 | # Restore saved session 211 | saver.restore(sess, load_path) 212 | 213 | # Running the network 214 | # Set range (prevent index out-of-range exception for rolling window) 215 | for step in range(0,file_length,WINDOW_INT): 216 | 217 | try: # While there is no out-of-bounds exception 218 | # Obtaining batch of features and labels from TRAINING dataset(s) 219 | features, labels = extract_data(Dataset, BATCH_SIZE, lstm_encoder.num_steps, lstm_encoder.input_features, step) 220 | except: 221 | break 222 | 223 | # Input data 224 | data = {inputs: features, targets:labels} 225 | # Run and evaluate for summary variables, loss, predictions, targets, and latent layer 226 | summary, loss_, pred, tar, latent = sess.run([merged, loss, predictions, targets, logit_tensor], feed_dict=data) 227 | 228 | # Report parameters 229 | if True: # Conditional statement for filtering outputs 230 | p_completion = math.floor(100*step*BATCH_SIZE/file_length) 231 | print("\nLoss: %.3f, Percent Completion: " % loss_, p_completion) 232 | print("\nPredictions:") 233 | print(pred) 234 | print("\nTargets:") 235 | print(tar) 236 | 237 | # Write results to file for Matlab analysis 238 | # Write predictions 239 | with open(prediction_file, 'a') as file_object: 240 | np.savetxt(file_object, pred[0,:,:]) 241 | # Write targets 242 | with open(target_file, 'a') as file_object: 243 | np.savetxt(file_object, tar[0,:,:]) 244 | # Write latent representation 245 | with open(latent_file, 'a') as file_object: 246 | np.savetxt(file_object, latent) 247 | 248 | # Writing summaries to Tensorboard 249 | writer.add_summary(summary,step) 250 | 251 | # Close the writer 252 | writer.close() --------------------------------------------------------------------------------
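# ----------------------------------------------------
# Note (minimal sketch, not part of the repository): the .txt files written by
# test_bench.py are plain np.savetxt dumps, so they can be reloaded directly in
# Python as well as Matlab. File names assume the defaults set above.
#
#   import numpy as np
#   pred   = np.loadtxt("predictions.txt")  # num_steps rows per window x input_features
#   tar    = np.loadtxt("targets.txt")
#   latent = np.loadtxt("latent.txt")       # one row per window x latent
# ----------------------------------------------------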