├── .gitignore
├── README.md
├── __init__.py
├── data_formatters
│   ├── __init__.py
│   ├── base.py
│   ├── electricity.py
│   ├── favorita.py
│   ├── traffic.py
│   └── volatility.py
├── dataset_analysis_images
│   ├── Electricity_Dataset_Additive.png
│   ├── Grocery_Dataset_Additive.png
│   ├── Traffic_Dataset_Additive.png
│   └── Volatility_Dataset_Additive.png
├── electricity_dataset_experiments
│   ├── Huber_Experiment.py
│   ├── LogCosh_Experiment.py
│   ├── MAE_Experiment.py
│   ├── MAPE_Experiment.py
│   ├── MBE_Experiment.py
│   ├── MSE_Experiment.py
│   ├── MSLE_Experiment.py
│   ├── NRMSE_Experiment.py
│   ├── Quantile_Experiment.py
│   ├── RAE_Experiment.py
│   ├── RMSE_Experiment.py
│   ├── RMSLE_Experiment.py
│   ├── RRMSE_Experiment.py
│   ├── RSE_Experiment.py
│   ├── __init__.py
│   ├── electricity_dataset_experiments.json
│   └── running_experiments.py
├── expt_settings
│   ├── __init__.py
│   └── configs.py
├── favorita_dataset_experiments
│   ├── Huber_Experiment.py
│   ├── LogCosh_Experiment.py
│   ├── MAE_Experiment.py
│   ├── MAPE_Experiment.py
│   ├── MBE_Experiment.py
│   ├── MSE_Experiment.py
│   ├── MSLE_Experiment.py
│   ├── NRMSE_Experiment.py
│   ├── Quantile_Experiment.py
│   ├── RAE_Experiment.py
│   ├── RMSE_Experiment.py
│   ├── RMSLE_Experiment.py
│   ├── RRMSE_Experiment.py
│   ├── RSE_Experiment.py
│   ├── __init__.py
│   ├── favorita_dataset_experiments.json
│   └── running_experiments.py
├── libs
│   ├── __init__.py
│   ├── hyperparam_opt.py
│   ├── tft_model.py
│   ├── tft_model_huber_loss.py
│   ├── tft_model_log_cosh.py
│   ├── tft_model_mae_loss.py
│   ├── tft_model_mape_loss.py
│   ├── tft_model_mbe_loss.py
│   ├── tft_model_mse_loss.py
│   ├── tft_model_msle_loss.py
│   ├── tft_model_nrmse_loss.py
│   ├── tft_model_quantile_loss.py
│   ├── tft_model_rae_loss.py
│   ├── tft_model_rmse_loss.py
│   ├── tft_model_rmsle_loss.py
│   ├── tft_model_rrmse_loss.py
│   ├── tft_model_rse_loss.py
│   └── utils.py
├── loss_functions_plots
│   ├── Huber-Loss.png
│   ├── LogCosh-Loss.png
│   ├── Loss-Functions-Summary.png
│   ├── MAE-Loss.png
│   ├── MAPE-Loss.png
│   ├── MBE-Loss.png
│   ├── MSE-Loss.png
│   ├── NRMSE-Loss.png
│   ├── Quantile-Loss.png
│   ├── RAE-Loss.png
│   ├── RMSE-Loss.png
│   ├── RMSLE-Loss.png
│   ├── RRMSE-Loss.png
│   └── RSE-Loss.png
├── requirements.txt
├── script_download_data.py
├── script_hyperparam_opt.py
├── script_train_fixed_params.py
├── traffic_dataset_experiments
│   ├── Huber_Experiment.py
│   ├── LogCosh_Experiment.py
│   ├── MAE_Experiment.py
│   ├── MAPE_Experiment.py
│   ├── MBE_Experiment.py
│   ├── MSE_Experiment.py
│   ├── MSLE_Experiment.py
│   ├── NRMSE_Experiment.py
│   ├── Quantile_Experiment.py
│   ├── RAE_Experiment.py
│   ├── RMSE_Experiment.py
│   ├── RMSLE_Experiment.py
│   ├── RRMSE_Experiment.py
│   ├── RSE_Experiment.py
│   ├── __init__.py
│   ├── running_experiments.py
│   └── traffic_dataset_experiments.json
└── volatility_dataset_experiments
    ├── Huber_Experiment.py
    ├── LogCosh_Experiment.py
    ├── MAE_Experiment.py
    ├── MAPE_Experiment.py
    ├── MBE_Experiment.py
    ├── MSE_Experiment.py
    ├── MSLE_Experiment.py
    ├── NRMSE_Experiment.py
    ├── Quantile_Experiment.py
    ├── RAE_Experiment.py
    ├── RMSE_Experiment.py
    ├── RMSLE_Experiment.py
    ├── RRMSE_Experiment.py
    ├── RSE_Experiment.py
    ├── __init__.py
    ├── running_experiments.py
    └── volatility_dataset_experiments.json

/.gitignore:
--------------------------------------------------------------------------------
/.ipynb_checkpoints/
/notebooks/
/volatility_dataset_experiments/volatility_dataset/
/traffic_dataset_experiments/traffic_dataset/
/experiments/
/electricity_dataset_experiments/electricity_dataset/
/favorita_dataset_experiments/favorita_dataset/
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Regression Loss Functions Performance Evaluation in Time Series Forecasting using Temporal Fusion Transformers

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7542550.svg)](https://doi.org/10.5281/zenodo.7542550)

```
This repository contains an implementation of the paper "Temporal Fusion Transformers for Interpretable
Multi-horizon Time Series Forecasting" with different loss functions, in TensorFlow.
We compare the performance of 14 regression loss functions on 4 different datasets.
A summary of the experiments, with instructions on how to replicate them, can be found below.
```

## About Temporal Fusion Transformers

Paper Link: https://arxiv.org/pdf/1912.09363.pdf
Authors: Bryan Lim, Sercan Arik, Nicolas Loeff and Tomas Pfister

> Abstract - Multi-horizon forecasting problems often contain a complex mix of inputs -- including static (i.e. time-invariant)
> covariates, known future inputs, and other exogenous time series that are only observed historically -- without any
> prior information on how they interact with the target. While several deep learning models have been proposed for
> multi-step prediction, they typically comprise black-box models which do not account for the full range of inputs
> present in common scenarios. In this paper, we introduce the Temporal Fusion Transformer (TFT) -- a novel
> attention-based architecture which combines high-performance multi-horizon forecasting with interpretable insights
> into temporal dynamics. To learn temporal relationships at different scales, the TFT utilizes recurrent layers for
> local processing and interpretable self-attention layers for learning long-term dependencies.
> The TFT also uses specialized components for the judicious selection of relevant features and a series of gating layers
> to suppress unnecessary components, enabling high performance in a wide range of regimes. On a variety of real-world datasets,
> we demonstrate significant performance improvements over existing benchmarks, and showcase three practical
> interpretability use-cases of TFT.

The majority of this repository's code is taken from https://github.com/google-research/google-research/tree/master/tft.

## Experiments Summary and Our Paper

### Cite Our Paper
```
@inproceedings{jadon2024comprehensive,
  title={A comprehensive survey of regression-based loss functions for time series forecasting},
  author={Jadon, Aryan and Patil, Avinash and Jadon, Shruti},
  booktitle={International Conference on Data Management, Analytics \& Innovation},
  pages={117--147},
  year={2024},
  organization={Springer}
}
```

### Paper Links
1. https://link.springer.com/chapter/10.1007/978-981-97-3245-6_9
2. https://arxiv.org/abs/2211.02989


![Summary of Loss Functions](https://github.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/blob/main/loss_functions_plots/Loss-Functions-Summary.png)
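
For reference, here are a few of the compared losses as minimal NumPy sketches (standard textbook definitions written for this README, not code lifted from ``libs``; ``delta=0.5`` matches the Huber setting reported in the experiment JSON files):

```python
import numpy as np

def mae(y, y_hat):
    """Mean Absolute Error."""
    return np.mean(np.abs(y_hat - y))

def mse(y, y_hat):
    """Mean Squared Error."""
    return np.mean((y_hat - y) ** 2)

def huber(y, y_hat, delta=0.5):
    """Huber loss: quadratic for small errors, linear for large ones."""
    e = np.abs(y_hat - y)
    return np.mean(np.where(e <= delta, 0.5 * e ** 2, delta * (e - 0.5 * delta)))

def log_cosh(y, y_hat):
    """Log-Cosh loss: behaves like MSE near zero and like MAE in the tails."""
    return np.mean(np.log(np.cosh(y_hat - y)))

def pinball(y, y_hat, q):
    """Quantile (pinball) loss at quantile q, e.g. q = 0.1, 0.5, 0.9."""
    e = y - y_hat
    return np.mean(np.maximum(q * e, (q - 1) * e))
```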
## Replicating this Repository and Experiments

### Downloading Data and Running Default Experiments

The key modules for the experiments are organised as follows:

* **data\_formatters**: Stores the main dataset-specific column definitions, along with functions for data transformation and normalization. For compatibility with the TFT, new experiments should implement a unique ``GenericDataFormatter`` (see **base.py**), with examples for the default experiments shown in the other python files.
* **expt\_settings**: Holds the folder paths and configurations for the default experiments.
* **libs**: Contains the main libraries, including classes to manage hyperparameter optimisation (**hyperparam\_opt.py**), the main TFT network class and its per-loss-function variants (**tft\_model.py**, **tft\_model\_\*\_loss.py**), and general helper functions (**utils.py**).

Scripts are all saved in the main folder, with descriptions below:

* **script\_download\_data.py**: Downloads data for the main experiments and processes it into csv files ready for training/evaluation.
* **script\_train\_fixed\_params.py**: Calibrates the TFT using a predefined set of hyperparameters, and evaluates it for a given experiment.
* **script\_hyperparam\_opt.py**: Runs full hyperparameter optimization using the default random search ranges defined for the TFT.

Our four default experiments are ``volatility``, ``electricity``, ``traffic``, and ``favorita``.
To run these experiments, first download the data, and then run the relevant training routine.

#### Step 1: Download data for default experiments
To download the experiment data, run the following script:

```bash
python3 -m script_download_data $EXPT $OUTPUT_FOLDER
```

where ``$EXPT`` can be any of {``volatility``, ``electricity``, ``traffic``, ``favorita``}, and ``$OUTPUT_FOLDER`` denotes the root folder in which experiment outputs are saved.

#### Step 2: Train and evaluate network
To train the network with the optimal default parameters, run:

```bash
python3 -m script_train_fixed_params $EXPT $OUTPUT_FOLDER $USE_GPU
```

where ``$EXPT`` and ``$OUTPUT_FOLDER`` are as above, and ``$USE_GPU`` denotes whether to run with GPU support (options are {``'yes'``, ``'no'``}).
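
For example, to replicate the ``electricity`` experiment end to end (``./outputs`` is just an illustrative folder name, not one fixed by the scripts):

```bash
python3 -m script_download_data electricity ./outputs
python3 -m script_train_fixed_params electricity ./outputs yes
```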

For full hyperparameter optimization, run:

```bash
python3 -m script_hyperparam_opt $EXPT $OUTPUT_FOLDER $USE_GPU yes
```

where options are as above.

### Running Experiments with Loss Functions

#### Move the Downloaded Datasets to their Respective Experiment Folders

Move each downloaded dataset into its experiment folder (e.g. ``electricity_dataset_experiments/electricity_dataset``, as listed in **.gitignore**), then run the experiment script from inside that folder:

```bash
python3 running_experiments.py
```
--------------------------------------------------------------------------------
/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/__init__.py
--------------------------------------------------------------------------------
/data_formatters/__init__.py:
--------------------------------------------------------------------------------
# coding=utf-8
# Copyright 2021 The Google Research Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

--------------------------------------------------------------------------------
/data_formatters/traffic.py:
--------------------------------------------------------------------------------
# coding=utf-8
# Copyright 2021 The Google Research Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Lint as: python3
"""Custom formatting functions for Traffic dataset.

Defines dataset specific column definitions and data transformations. This also
performs z-score normalization across the entire dataset, hence re-uses most of
the same functions as volatility.
"""

import data_formatters.base
import data_formatters.volatility

VolatilityFormatter = data_formatters.volatility.VolatilityFormatter
DataTypes = data_formatters.base.DataTypes
InputTypes = data_formatters.base.InputTypes


class TrafficFormatter(VolatilityFormatter):
  """Defines and formats data for the traffic dataset.

  This also performs z-score normalization across the entire dataset, hence
  re-uses most of the same functions as volatility.

  Attributes:
    column_definition: Defines input and data type of column used in the
      experiment.
    identifiers: Entity identifiers used in experiments.
  """

  _column_definition = [
      ('id', DataTypes.REAL_VALUED, InputTypes.ID),
      ('hours_from_start', DataTypes.REAL_VALUED, InputTypes.TIME),
      ('values', DataTypes.REAL_VALUED, InputTypes.TARGET),
      ('time_on_day', DataTypes.REAL_VALUED, InputTypes.KNOWN_INPUT),
      ('day_of_week', DataTypes.REAL_VALUED, InputTypes.KNOWN_INPUT),
      ('hours_from_start', DataTypes.REAL_VALUED, InputTypes.KNOWN_INPUT),
      ('categorical_id', DataTypes.CATEGORICAL, InputTypes.STATIC_INPUT),
  ]

  def split_data(self, df, valid_boundary=151, test_boundary=166):
    """Splits data frame into training-validation-test data frames.

    This also calibrates the scaling object, and transforms data for each
    split.

    Args:
      df: Source data frame to split.
      valid_boundary: Starting day for validation data
      test_boundary: Starting day for test data

    Returns:
      Tuple of transformed (train, valid, test) data.
    """

    print('Formatting train-valid-test splits.')

    index = df['sensor_day']
    train = df.loc[index < valid_boundary]
    valid = df.loc[(index >= valid_boundary - 7) & (index < test_boundary)]
    test = df.loc[index >= test_boundary - 7]

    self.set_scalers(train)

    return (self.transform_inputs(data) for data in [train, valid, test])
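
  # Note: the validation and test slices above reach back 7 days before their
  # boundaries so that each split keeps a full week of history available to
  # the encoder (num_encoder_steps = 7 * 24 hourly steps in get_fixed_params
  # below).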
66 | """ 67 | 68 | print('Formatting train-valid-test splits.') 69 | 70 | index = df['sensor_day'] 71 | train = df.loc[index < valid_boundary] 72 | valid = df.loc[(index >= valid_boundary - 7) & (index < test_boundary)] 73 | test = df.loc[index >= test_boundary - 7] 74 | 75 | self.set_scalers(train) 76 | 77 | return (self.transform_inputs(data) for data in [train, valid, test]) 78 | 79 | # Default params 80 | def get_fixed_params(self): 81 | """Returns fixed model parameters for experiments.""" 82 | 83 | fixed_params = { 84 | 'total_time_steps': 8 * 24, 85 | 'num_encoder_steps': 7 * 24, 86 | 'num_epochs': 100, 87 | 'early_stopping_patience': 5, 88 | 'multiprocessing_workers': 5 89 | } 90 | 91 | return fixed_params 92 | 93 | def get_default_model_params(self): 94 | """Returns default optimised model parameters.""" 95 | 96 | model_params = { 97 | 'dropout_rate': 0.3, 98 | 'hidden_layer_size': 320, 99 | 'learning_rate': 0.03, 100 | 'minibatch_size': 128, 101 | 'max_gradient_norm': 100., 102 | 'num_heads': 4, 103 | 'stack_size': 1 104 | } 105 | 106 | return model_params 107 | 108 | def get_num_samples_for_calibration(self): 109 | """Gets the default number of training and validation samples. 110 | 111 | Use to sub-sample the data for network calibration and a value of -1 uses 112 | all available samples. 113 | 114 | Returns: 115 | Tuple of (training samples, validation samples) 116 | """ 117 | return 450000, 50000 118 | -------------------------------------------------------------------------------- /dataset_analysis_images/Electricity_Dataset_Additive.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/dataset_analysis_images/Electricity_Dataset_Additive.png -------------------------------------------------------------------------------- /dataset_analysis_images/Grocery_Dataset_Additive.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/dataset_analysis_images/Grocery_Dataset_Additive.png -------------------------------------------------------------------------------- /dataset_analysis_images/Traffic_Dataset_Additive.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/dataset_analysis_images/Traffic_Dataset_Additive.png -------------------------------------------------------------------------------- /dataset_analysis_images/Volatility_Dataset_Additive.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/dataset_analysis_images/Volatility_Dataset_Additive.png -------------------------------------------------------------------------------- /electricity_dataset_experiments/Huber_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_huber_loss 8 | import 
libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_huber_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('electricity_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "electricity" 26 | dataset_folder_path = "electricity_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 
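# With use_testing_mode set to True below, this run is a quick smoke test: 15
# epochs, hidden layer size 16, and only 1000/100 train/validation samples,
# so expect weaker numbers than a full training run. The same override block
# appears in each *_Experiment.py shown in this folder.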
72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Huber Delta 0.5 p10 Loss": str(p10_loss.mean()), 169 | "Huber Delta 0.5 p50 Loss": str(p50_loss.mean()), 170 | "Huber Delta 0.5 p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with 
open("electricity_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /electricity_dataset_experiments/LogCosh_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_log_cosh 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_log_cosh.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('electricity_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "electricity" 26 | dataset_folder_path = "electricity_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 
72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "LogCosh p10 Loss": str(p10_loss.mean()), 169 | "LogCosh p50 Loss": str(p50_loss.mean()), 170 | "LogCosh p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with 
open("electricity_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /electricity_dataset_experiments/MAE_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_mae_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_mae_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('electricity_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "electricity" 26 | dataset_folder_path = "electricity_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 
72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Mean Absolute Error p10 Loss": str(p10_loss.mean()), 169 | "Mean Absolute Error p50 Loss": str(p50_loss.mean()), 170 | "Mean Absolute Error p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with 
open("electricity_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /electricity_dataset_experiments/MBE_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_mbe_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_mbe_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('electricity_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "electricity" 26 | dataset_folder_path = "electricity_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 
72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Mean Bias Error p10 Loss": str(p10_loss.mean()), 169 | "Mean Bias Error p50 Loss": str(p50_loss.mean()), 170 | "Mean Bias Error p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with 
open("electricity_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /electricity_dataset_experiments/MSE_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_mse_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_mse_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('electricity_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "electricity" 26 | dataset_folder_path = "electricity_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 
72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Mean Squared Error p10 Loss": str(p10_loss.mean()), 169 | "Mean Squared Error p50 Loss": str(p50_loss.mean()), 170 | "Mean Squared Error p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with 
open("electricity_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /electricity_dataset_experiments/Quantile_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_quantile_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_quantile_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('electricity_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "electricity" 26 | dataset_folder_path = "electricity_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 
72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Quantile p10 Loss": str(p10_loss.mean()), 169 | "Quantile p50 Loss": str(p50_loss.mean()), 170 | "Quantile p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with 
open("electricity_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /electricity_dataset_experiments/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/electricity_dataset_experiments/__init__.py -------------------------------------------------------------------------------- /electricity_dataset_experiments/electricity_dataset_experiments.json: -------------------------------------------------------------------------------- 1 | { 2 | "Mean Absolute Error p10 Loss": "0.09792971822353345", 3 | "Mean Absolute Error p50 Loss": "0.21813680930463045", 4 | "Mean Absolute Error p90 Loss": "0.33699885694865034", 5 | "Mean Absolute Percentage Error p10 Loss": "0.2415723898488404", 6 | "Mean Absolute Percentage Error p50 Loss": "0.41594873341829713", 7 | "Mean Absolute Percentage Error p90 Loss": "0.5899982767535746", 8 | "Mean Squared Error p10 Loss": "0.08204780062185207", 9 | "Mean Squared Error p50 Loss": "0.14977680825581505", 10 | "Mean Squared Error p90 Loss": "0.2119298696294117", 11 | "Mean Squared Logarithmic Error p10 Loss": "0.19256604832945942", 12 | "Mean Squared Logarithmic Error p50 Loss": "0.8686124928634792", 13 | "Mean Squared Logarithmic Error p90 Loss": "1.2493013170543357", 14 | "Mean Bias Error p10 Loss": "1068.7364351758758", 15 | "Mean Bias Error p50 Loss": "579.2512818415947", 16 | "Mean Bias Error p90 Loss": "119.86387821945264", 17 | "Relative Absolute Error p10 Loss": "0.1430538724722938", 18 | "Relative Absolute Error p50 Loss": "0.19802510341364568", 19 | "Relative Absolute Error p90 Loss": "0.25521186705865545", 20 | "Relative Squared Error p10 Loss": "0.428843102209387", 21 | "Relative Squared Error p50 Loss": "0.34051647814606617", 22 | "Relative Squared Error p90 Loss": "1.1033177430982228", 23 | "Relative Root Mean Squared Error p10 Loss": "0.1889267903692731", 24 | "Relative Root Mean Squared Error p50 Loss": "0.19295448751762734", 25 | "Relative Root Mean Squared Error p90 Loss": "0.19464861776868847", 26 | "Huber Delta 0.5 p10 Loss": "0.1267006887517332", 27 | "Huber Delta 0.5 p50 Loss": "0.20304555916431546", 28 | "Huber Delta 0.5 p90 Loss": "0.2687886851289519", 29 | "Quantile p10 Loss": "0.07357385342551045", 30 | "Quantile p50 Loss": "0.17355706311129568", 31 | "Quantile p90 Loss": "0.0749553108768246", 32 | "LogCosh p10 Loss": "0.12587851546229897", 33 | "LogCosh p50 Loss": "0.22744704187528822", 34 | "LogCosh p90 Loss": "0.3182045943679069", 35 | "Root Mean Squared Logarithmic Error p10 Loss": "0.23925887102654542", 36 | "Root Mean Squared Logarithmic Error p50 Loss": "1.5357226110304023", 37 | "Root Mean Squared Logarithmic Error p90 Loss": "2.5967067517445765", 38 | "Root Mean Squared Error p10 Loss": "0.1357113044349002", 39 | "Root Mean Squared Error p50 Loss": "0.2342938312131286", 40 | "Root Mean Squared Error p90 Loss": "0.3256507445031028", 41 | "Normalized Root Mean Squared Error p10 Loss": "0.12048648434221977", 42 | "Normalized Root Mean Squared Error p50 Loss": "0.42408581517854654", 43 | "Normalized Root Mean Squared Error p90 Loss": "0.4514950272266751" 44 | } -------------------------------------------------------------------------------- /electricity_dataset_experiments/running_experiments.py: 
--------------------------------------------------------------------------------
import subprocess

subprocess.call("python MAE_Experiment.py", shell=True)
subprocess.call("python MAPE_Experiment.py", shell=True)
subprocess.call("python MSE_Experiment.py", shell=True)

subprocess.call("python MSLE_Experiment.py", shell=True)
subprocess.call("python MBE_Experiment.py", shell=True)
subprocess.call("python RAE_Experiment.py", shell=True)

subprocess.call("python RSE_Experiment.py", shell=True)
subprocess.call("python NRMSE_Experiment.py", shell=True)
subprocess.call("python RRMSE_Experiment.py", shell=True)

subprocess.call("python RMSLE_Experiment.py", shell=True)
subprocess.call("python RMSE_Experiment.py", shell=True)
subprocess.call("python Huber_Experiment.py", shell=True)

subprocess.call("python Quantile_Experiment.py", shell=True)
subprocess.call("python LogCosh_Experiment.py", shell=True)
--------------------------------------------------------------------------------
/expt_settings/__init__.py:
--------------------------------------------------------------------------------
# coding=utf-8
# Copyright 2021 The Google Research Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

--------------------------------------------------------------------------------
/expt_settings/configs.py:
--------------------------------------------------------------------------------
# coding=utf-8
# Copyright 2021 The Google Research Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Lint as: python3
"""Default configs for TFT experiments.

Contains the default output paths for data, serialised models and predictions
for the main experiments used in the publication.
"""

import os

import data_formatters.electricity
import data_formatters.favorita
import data_formatters.traffic
import data_formatters.volatility


class ExperimentConfig(object):
  """Defines experiment configs and paths to outputs.

  Attributes:
    root_folder: Root folder to contain all experimental outputs.
    experiment: Name of experiment to run.
    data_folder: Folder to store data for experiment.
    model_folder: Folder to store serialised models.
    results_folder: Folder to store results.
    data_csv_path: Path to primary data csv file used in experiment.
    hyperparam_iterations: Default number of random search iterations for
      experiment.
  """

  default_experiments = ['volatility', 'electricity', 'traffic', 'favorita']

  def __init__(self, experiment='volatility', root_folder=None):
    """Creates configs based on default experiment chosen.

    Args:
      experiment: Name of experiment.
      root_folder: Root folder to save all outputs of training.
    """

    if experiment not in self.default_experiments:
      raise ValueError('Unrecognised experiment={}'.format(experiment))

    # Defines all relevant paths
    if root_folder is None:
      root_folder = os.path.join(
          os.path.dirname(os.path.realpath(__file__)), '..', 'outputs')
      print('Using root folder {}'.format(root_folder))

    self.root_folder = root_folder
    self.experiment = experiment
    self.data_folder = os.path.join(root_folder, 'data', experiment)
    self.model_folder = os.path.join(root_folder, 'saved_models', experiment)
    self.results_folder = os.path.join(root_folder, 'results', experiment)

    # Creates folders if they don't exist
    for relevant_directory in [
        self.root_folder, self.data_folder, self.model_folder,
        self.results_folder
    ]:
      if not os.path.exists(relevant_directory):
        os.makedirs(relevant_directory)

  @property
  def data_csv_path(self):
    csv_map = {
        'volatility': 'formatted_omi_vol.csv',
        'electricity': 'hourly_electricity.csv',
        'traffic': 'hourly_data.csv',
        'favorita': 'favorita_consolidated.csv'
    }

    return os.path.join(self.data_folder, csv_map[self.experiment])

  @property
  def hyperparam_iterations(self):

    return 240 if self.experiment == 'volatility' else 60

  def make_data_formatter(self):
    """Gets a data formatter object for experiment.

    Returns:
      Default DataFormatter per experiment.
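
    Example (illustrative usage; assumes the chosen dataset has already been
    downloaded into the config's data folder):
      config = ExperimentConfig('electricity', './outputs')
      formatter = config.make_data_formatter()  # -> ElectricityFormatter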
    """

    data_formatter_class = {
        'volatility': data_formatters.volatility.VolatilityFormatter,
        'electricity': data_formatters.electricity.ElectricityFormatter,
        'traffic': data_formatters.traffic.TrafficFormatter,
        'favorita': data_formatters.favorita.FavoritaFormatter
    }

    return data_formatter_class[self.experiment]()
--------------------------------------------------------------------------------
/favorita_dataset_experiments/Huber_Experiment.py:
--------------------------------------------------------------------------------
import datetime as dte
import os
import json
import data_formatters.base
import expt_settings.configs
import libs.hyperparam_opt
import libs.tft_model_huber_loss
import libs.utils as utils
import numpy as np
import pandas as pd
import tensorflow.compat.v1 as tf
import warnings

warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

ExperimentConfig = expt_settings.configs.ExperimentConfig
HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager
ModelClass = libs.tft_model_huber_loss.TemporalFusionTransformer
tf.experimental.output_all_intermediates(True)

with open('favorita_dataset_experiments.json', 'r') as f:
  loss_experiment_tracker = json.load(f)

dataset_name = "favorita"
dataset_folder_path = "favorita_dataset"

name = dataset_name
output_folder = dataset_folder_path

use_tensorflow_with_gpu = True
print("Using output folder {}".format(output_folder))

config = ExperimentConfig(name, output_folder)
formatter = config.make_data_formatter()

expt_name = name
use_gpu = use_tensorflow_with_gpu
model_folder = os.path.join(config.model_folder, "fixed")
data_csv_path = config.data_csv_path
data_formatter = formatter
use_testing_mode = True

num_repeats = 1

if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter):
  raise ValueError(
      "Data formatters should inherit from "
      "GenericDataFormatter! Type={}".format(type(data_formatter)))

# Tensorflow setup
default_keras_session = tf.keras.backend.get_session()

if use_gpu:
  tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0)
else:
  tf_config = utils.get_default_tensorflow_config(tf_device="cpu")

print("*** Training from defined parameters for {} ***".format(expt_name))

print("Loading & splitting data...")
raw_data = pd.read_csv(data_csv_path, index_col=0)
train, valid, test = data_formatter.split_data(raw_data)
train_samples, valid_samples = data_formatter.get_num_samples_for_calibration()

# Sets up default params
fixed_params = data_formatter.get_experiment_params()
params = data_formatter.get_default_model_params()
params["model_folder"] = model_folder

# Parameter overrides for testing only! Small sizes used to speed up script.
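# With use_testing_mode=True the run below is a reduced smoke test (15 epochs,
# hidden_layer_size=16, 1000/100 calibration samples); set it to False to
# train with the dataset defaults returned by get_experiment_params() and
# get_default_model_params().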
if use_testing_mode:
  fixed_params["num_epochs"] = 15
  params["hidden_layer_size"] = 16
  train_samples, valid_samples = 1000, 100

# Sets up hyper-param manager
print("*** Loading hyperparam manager ***")
opt_manager = HyperparamOptManager({k: [params[k]] for k in params},
                                   fixed_params, model_folder)

# Training -- one iteration only
print("*** Running calibration ***")
print("Params Selected:")

for k in params:
  print("{}: {}".format(k, params[k]))

best_loss = np.inf

for _ in range(num_repeats):
  tf.reset_default_graph()
  with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
    tf.keras.backend.set_session(sess)
    params = opt_manager.get_next_parameters()
    model = ModelClass(params, use_cudnn=use_gpu)

    if not model.training_data_cached():
      model.cache_batched_data(train, "train", num_samples=train_samples)
      model.cache_batched_data(valid, "valid", num_samples=valid_samples)

    sess.run(tf.global_variables_initializer())
    model.fit()

    val_loss = model.evaluate()

    if val_loss < best_loss:
      opt_manager.update_score(params, val_loss, model)
      best_loss = val_loss

    tf.keras.backend.set_session(default_keras_session)

print("*** Running tests ***")
tf.reset_default_graph()

with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
  tf.keras.backend.set_session(sess)
  best_params = opt_manager.get_best_params()
  model = ModelClass(best_params, use_cudnn=use_gpu)

  model.load(opt_manager.hyperparam_folder)

  print("Computing best validation loss")
  val_loss = model.evaluate(valid)

  print("Computing test loss")
  output_map = model.predict(test, return_targets=True)

  targets = data_formatter.format_predictions(output_map["targets"])
  p10_forecast = data_formatter.format_predictions(output_map["p10"])
  p50_forecast = data_formatter.format_predictions(output_map["p50"])
  p90_forecast = data_formatter.format_predictions(output_map["p90"])


def extract_numerical_data(data):
  """Strips out forecast time and identifier columns."""
  return data[[
      col for col in data.columns
      if col not in {"forecast_time", "identifier"}
  ]]


p10_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p10_forecast),
    0.1)

p50_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p50_forecast),
    0.5)

p90_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p90_forecast),
    0.9)

tf.keras.backend.set_session(default_keras_session)

print("Training completed @ {}".format(dte.datetime.now()))
print("Best validation loss = {}".format(val_loss))
print("Params:")

for k in best_params:
  print(k, " = ", best_params[k])

print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format(
    p10_loss.mean(), p50_loss.mean(), p90_loss.mean()))

loss_experiment_tracker.update({
    "Huber Delta 0.5 p10 Loss": str(p10_loss.mean()),
    "Huber Delta 0.5 p50 Loss": str(p50_loss.mean()),
    "Huber Delta 0.5 p90 Loss": str(p90_loss.mean()),
})
with open("favorita_dataset_experiments.json", "w") as outfile:
  json.dump(loss_experiment_tracker, outfile)
--------------------------------------------------------------------------------
/favorita_dataset_experiments/LogCosh_Experiment.py:
--------------------------------------------------------------------------------
import datetime as dte
import os
import json
import data_formatters.base
import expt_settings.configs
import libs.hyperparam_opt
import libs.tft_model_log_cosh
import libs.utils as utils
import numpy as np
import pandas as pd
import tensorflow.compat.v1 as tf
import warnings

warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

ExperimentConfig = expt_settings.configs.ExperimentConfig
HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager
ModelClass = libs.tft_model_log_cosh.TemporalFusionTransformer
tf.experimental.output_all_intermediates(True)

with open('favorita_dataset_experiments.json', 'r') as f:
  loss_experiment_tracker = json.load(f)

dataset_name = "favorita"
dataset_folder_path = "favorita_dataset"

name = dataset_name
output_folder = dataset_folder_path

use_tensorflow_with_gpu = True
print("Using output folder {}".format(output_folder))

config = ExperimentConfig(name, output_folder)
formatter = config.make_data_formatter()

expt_name = name
use_gpu = use_tensorflow_with_gpu
model_folder = os.path.join(config.model_folder, "fixed")
data_csv_path = config.data_csv_path
data_formatter = formatter
use_testing_mode = True

num_repeats = 1

if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter):
  raise ValueError(
      "Data formatters should inherit from "
      "GenericDataFormatter! Type={}".format(type(data_formatter)))

# Tensorflow setup
default_keras_session = tf.keras.backend.get_session()

if use_gpu:
  tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0)
else:
  tf_config = utils.get_default_tensorflow_config(tf_device="cpu")

print("*** Training from defined parameters for {} ***".format(expt_name))

print("Loading & splitting data...")
raw_data = pd.read_csv(data_csv_path, index_col=0)
train, valid, test = data_formatter.split_data(raw_data)
train_samples, valid_samples = data_formatter.get_num_samples_for_calibration()

# Sets up default params
fixed_params = data_formatter.get_experiment_params()
params = data_formatter.get_default_model_params()
params["model_folder"] = model_folder

# Parameter overrides for testing only! Small sizes used to speed up script.
if use_testing_mode:
  fixed_params["num_epochs"] = 15
  params["hidden_layer_size"] = 16
  train_samples, valid_samples = 1000, 100

# Sets up hyper-param manager
print("*** Loading hyperparam manager ***")
opt_manager = HyperparamOptManager({k: [params[k]] for k in params},
                                   fixed_params, model_folder)

# Training -- one iteration only
print("*** Running calibration ***")
print("Params Selected:")

for k in params:
  print("{}: {}".format(k, params[k]))

best_loss = np.inf

for _ in range(num_repeats):
  tf.reset_default_graph()
  with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
    tf.keras.backend.set_session(sess)
    params = opt_manager.get_next_parameters()
    model = ModelClass(params, use_cudnn=use_gpu)

    if not model.training_data_cached():
      model.cache_batched_data(train, "train", num_samples=train_samples)
      model.cache_batched_data(valid, "valid", num_samples=valid_samples)

    sess.run(tf.global_variables_initializer())
    model.fit()

    val_loss = model.evaluate()

    if val_loss < best_loss:
      opt_manager.update_score(params, val_loss, model)
      best_loss = val_loss

    tf.keras.backend.set_session(default_keras_session)

print("*** Running tests ***")
tf.reset_default_graph()

with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
  tf.keras.backend.set_session(sess)
  best_params = opt_manager.get_best_params()
  model = ModelClass(best_params, use_cudnn=use_gpu)

  model.load(opt_manager.hyperparam_folder)

  print("Computing best validation loss")
  val_loss = model.evaluate(valid)

  print("Computing test loss")
  output_map = model.predict(test, return_targets=True)

  targets = data_formatter.format_predictions(output_map["targets"])
  p10_forecast = data_formatter.format_predictions(output_map["p10"])
  p50_forecast = data_formatter.format_predictions(output_map["p50"])
  p90_forecast = data_formatter.format_predictions(output_map["p90"])


def extract_numerical_data(data):
  """Strips out forecast time and identifier columns."""
  return data[[
      col for col in data.columns
      if col not in {"forecast_time", "identifier"}
  ]]


p10_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p10_forecast),
    0.1)

p50_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p50_forecast),
    0.5)

p90_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p90_forecast),
    0.9)

tf.keras.backend.set_session(default_keras_session)

print("Training completed @ {}".format(dte.datetime.now()))
print("Best validation loss = {}".format(val_loss))
print("Params:")

for k in best_params:
  print(k, " = ", best_params[k])

print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format(
    p10_loss.mean(), p50_loss.mean(), p90_loss.mean()))

loss_experiment_tracker.update({
    "LogCosh p10 Loss": str(p10_loss.mean()),
    "LogCosh p50 Loss": str(p50_loss.mean()),
    "LogCosh p90 Loss": str(p90_loss.mean()),
})

with open("favorita_dataset_experiments.json", "w") as outfile:
  json.dump(loss_experiment_tracker, outfile)
--------------------------------------------------------------------------------
/favorita_dataset_experiments/MAE_Experiment.py:
--------------------------------------------------------------------------------
import datetime as dte
import os
import json
import data_formatters.base
import expt_settings.configs
import libs.hyperparam_opt
import libs.tft_model_mae_loss
import libs.utils as utils
import numpy as np
import pandas as pd
import tensorflow.compat.v1 as tf
import warnings

warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

ExperimentConfig = expt_settings.configs.ExperimentConfig
HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager
ModelClass = libs.tft_model_mae_loss.TemporalFusionTransformer
tf.experimental.output_all_intermediates(True)

with open('favorita_dataset_experiments.json', 'r') as f:
  loss_experiment_tracker = json.load(f)

dataset_name = "favorita"
dataset_folder_path = "favorita_dataset"

name = dataset_name
output_folder = dataset_folder_path

use_tensorflow_with_gpu = True
print("Using output folder {}".format(output_folder))

config = ExperimentConfig(name, output_folder)
formatter = config.make_data_formatter()

expt_name = name
use_gpu = use_tensorflow_with_gpu
model_folder = os.path.join(config.model_folder, "fixed")
data_csv_path = config.data_csv_path
data_formatter = formatter
use_testing_mode = True

num_repeats = 1

if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter):
  raise ValueError(
      "Data formatters should inherit from "
      "GenericDataFormatter! Type={}".format(type(data_formatter)))

# Tensorflow setup
default_keras_session = tf.keras.backend.get_session()

if use_gpu:
  tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0)
else:
  tf_config = utils.get_default_tensorflow_config(tf_device="cpu")

print("*** Training from defined parameters for {} ***".format(expt_name))

print("Loading & splitting data...")
raw_data = pd.read_csv(data_csv_path, index_col=0)
train, valid, test = data_formatter.split_data(raw_data)
train_samples, valid_samples = data_formatter.get_num_samples_for_calibration()

# Sets up default params
fixed_params = data_formatter.get_experiment_params()
params = data_formatter.get_default_model_params()
params["model_folder"] = model_folder

# Parameter overrides for testing only! Small sizes used to speed up script.
if use_testing_mode:
  fixed_params["num_epochs"] = 15
  params["hidden_layer_size"] = 16
  train_samples, valid_samples = 1000, 100

# Sets up hyper-param manager
print("*** Loading hyperparam manager ***")
opt_manager = HyperparamOptManager({k: [params[k]] for k in params},
                                   fixed_params, model_folder)

# Training -- one iteration only
print("*** Running calibration ***")
print("Params Selected:")

for k in params:
  print("{}: {}".format(k, params[k]))

best_loss = np.inf

for _ in range(num_repeats):
  tf.reset_default_graph()
  with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
    tf.keras.backend.set_session(sess)
    params = opt_manager.get_next_parameters()
    model = ModelClass(params, use_cudnn=use_gpu)

    if not model.training_data_cached():
      model.cache_batched_data(train, "train", num_samples=train_samples)
      model.cache_batched_data(valid, "valid", num_samples=valid_samples)

    sess.run(tf.global_variables_initializer())
    model.fit()

    val_loss = model.evaluate()

    if val_loss < best_loss:
      opt_manager.update_score(params, val_loss, model)
      best_loss = val_loss

    tf.keras.backend.set_session(default_keras_session)

print("*** Running tests ***")
tf.reset_default_graph()

with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
  tf.keras.backend.set_session(sess)
  best_params = opt_manager.get_best_params()
  model = ModelClass(best_params, use_cudnn=use_gpu)

  model.load(opt_manager.hyperparam_folder)

  print("Computing best validation loss")
  val_loss = model.evaluate(valid)

  print("Computing test loss")
  output_map = model.predict(test, return_targets=True)

  targets = data_formatter.format_predictions(output_map["targets"])
  p10_forecast = data_formatter.format_predictions(output_map["p10"])
  p50_forecast = data_formatter.format_predictions(output_map["p50"])
  p90_forecast = data_formatter.format_predictions(output_map["p90"])


def extract_numerical_data(data):
  """Strips out forecast time and identifier columns."""
  return data[[
      col for col in data.columns
      if col not in {"forecast_time", "identifier"}
  ]]


p10_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p10_forecast),
    0.1)

p50_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p50_forecast),
    0.5)

p90_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p90_forecast),
    0.9)

tf.keras.backend.set_session(default_keras_session)

print("Training completed @ {}".format(dte.datetime.now()))
print("Best validation loss = {}".format(val_loss))
print("Params:")

for k in best_params:
  print(k, " = ", best_params[k])

print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format(
    p10_loss.mean(), p50_loss.mean(), p90_loss.mean()))

loss_experiment_tracker.update({
    "Mean Absolute Error p10 Loss": str(p10_loss.mean()),
    "Mean Absolute Error p50 Loss": str(p50_loss.mean()),
    "Mean Absolute Error p90 Loss": str(p90_loss.mean()),
})
with open("favorita_dataset_experiments.json", "w") as outfile:
  json.dump(loss_experiment_tracker, outfile)
--------------------------------------------------------------------------------
/favorita_dataset_experiments/MBE_Experiment.py:
--------------------------------------------------------------------------------
import datetime as dte
import os
import json
import data_formatters.base
import expt_settings.configs
import libs.hyperparam_opt
import libs.tft_model_mbe_loss
import libs.utils as utils
import numpy as np
import pandas as pd
import tensorflow.compat.v1 as tf
import warnings

warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

ExperimentConfig = expt_settings.configs.ExperimentConfig
HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager
ModelClass = libs.tft_model_mbe_loss.TemporalFusionTransformer
tf.experimental.output_all_intermediates(True)

with open('favorita_dataset_experiments.json', 'r') as f:
  loss_experiment_tracker = json.load(f)

dataset_name = "favorita"
dataset_folder_path = "favorita_dataset"

name = dataset_name
output_folder = dataset_folder_path

use_tensorflow_with_gpu = True
print("Using output folder {}".format(output_folder))

config = ExperimentConfig(name, output_folder)
formatter = config.make_data_formatter()

expt_name = name
use_gpu = use_tensorflow_with_gpu
model_folder = os.path.join(config.model_folder, "fixed")
data_csv_path = config.data_csv_path
data_formatter = formatter
use_testing_mode = True

num_repeats = 1

if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter):
  raise ValueError(
      "Data formatters should inherit from "
      "GenericDataFormatter! Type={}".format(type(data_formatter)))

# Tensorflow setup
default_keras_session = tf.keras.backend.get_session()

if use_gpu:
  tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0)
else:
  tf_config = utils.get_default_tensorflow_config(tf_device="cpu")

print("*** Training from defined parameters for {} ***".format(expt_name))

print("Loading & splitting data...")
raw_data = pd.read_csv(data_csv_path, index_col=0)
train, valid, test = data_formatter.split_data(raw_data)
train_samples, valid_samples = data_formatter.get_num_samples_for_calibration()

# Sets up default params
fixed_params = data_formatter.get_experiment_params()
params = data_formatter.get_default_model_params()
params["model_folder"] = model_folder

# Parameter overrides for testing only! Small sizes used to speed up script.
if use_testing_mode:
  fixed_params["num_epochs"] = 15
  params["hidden_layer_size"] = 16
  train_samples, valid_samples = 1000, 100

# Sets up hyper-param manager
print("*** Loading hyperparam manager ***")
opt_manager = HyperparamOptManager({k: [params[k]] for k in params},
                                   fixed_params, model_folder)

# Training -- one iteration only
print("*** Running calibration ***")
print("Params Selected:")

for k in params:
  print("{}: {}".format(k, params[k]))

best_loss = np.inf

for _ in range(num_repeats):
  tf.reset_default_graph()
  with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
    tf.keras.backend.set_session(sess)
    params = opt_manager.get_next_parameters()
    model = ModelClass(params, use_cudnn=use_gpu)

    if not model.training_data_cached():
      model.cache_batched_data(train, "train", num_samples=train_samples)
      model.cache_batched_data(valid, "valid", num_samples=valid_samples)

    sess.run(tf.global_variables_initializer())
    model.fit()

    val_loss = model.evaluate()

    if val_loss < best_loss:
      opt_manager.update_score(params, val_loss, model)
      best_loss = val_loss

    tf.keras.backend.set_session(default_keras_session)

print("*** Running tests ***")
tf.reset_default_graph()

with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
  tf.keras.backend.set_session(sess)
  best_params = opt_manager.get_best_params()
  model = ModelClass(best_params, use_cudnn=use_gpu)

  model.load(opt_manager.hyperparam_folder)

  print("Computing best validation loss")
  val_loss = model.evaluate(valid)

  print("Computing test loss")
  output_map = model.predict(test, return_targets=True)

  targets = data_formatter.format_predictions(output_map["targets"])
  p10_forecast = data_formatter.format_predictions(output_map["p10"])
  p50_forecast = data_formatter.format_predictions(output_map["p50"])
  p90_forecast = data_formatter.format_predictions(output_map["p90"])


def extract_numerical_data(data):
  """Strips out forecast time and identifier columns."""
  return data[[
      col for col in data.columns
      if col not in {"forecast_time", "identifier"}
  ]]


p10_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p10_forecast),
    0.1)

p50_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p50_forecast),
    0.5)

p90_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p90_forecast),
    0.9)

tf.keras.backend.set_session(default_keras_session)

print("Training completed @ {}".format(dte.datetime.now()))
print("Best validation loss = {}".format(val_loss))
print("Params:")

for k in best_params:
  print(k, " = ", best_params[k])

print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format(
    p10_loss.mean(), p50_loss.mean(), p90_loss.mean()))

loss_experiment_tracker.update({
    "Mean Bias Error p10 Loss": str(p10_loss.mean()),
    "Mean Bias Error p50 Loss": str(p50_loss.mean()),
    "Mean Bias Error p90 Loss": str(p90_loss.mean()),
})
with open("favorita_dataset_experiments.json", "w") as outfile:
  json.dump(loss_experiment_tracker, outfile)
--------------------------------------------------------------------------------
/favorita_dataset_experiments/MSE_Experiment.py:
--------------------------------------------------------------------------------
import datetime as dte
import os
import json
import data_formatters.base
import expt_settings.configs
import libs.hyperparam_opt
import libs.tft_model_mse_loss
import libs.utils as utils
import numpy as np
import pandas as pd
import tensorflow.compat.v1 as tf
import warnings

warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

ExperimentConfig = expt_settings.configs.ExperimentConfig
HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager
ModelClass = libs.tft_model_mse_loss.TemporalFusionTransformer
tf.experimental.output_all_intermediates(True)

with open('favorita_dataset_experiments.json', 'r') as f:
  loss_experiment_tracker = json.load(f)

dataset_name = "favorita"
dataset_folder_path = "favorita_dataset"

name = dataset_name
output_folder = dataset_folder_path

use_tensorflow_with_gpu = True
print("Using output folder {}".format(output_folder))

config = ExperimentConfig(name, output_folder)
formatter = config.make_data_formatter()

expt_name = name
use_gpu = use_tensorflow_with_gpu
model_folder = os.path.join(config.model_folder, "fixed")
data_csv_path = config.data_csv_path
data_formatter = formatter
use_testing_mode = True

num_repeats = 1

if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter):
  raise ValueError(
      "Data formatters should inherit from "
      "GenericDataFormatter! Type={}".format(type(data_formatter)))

# Tensorflow setup
default_keras_session = tf.keras.backend.get_session()

if use_gpu:
  tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0)
else:
  tf_config = utils.get_default_tensorflow_config(tf_device="cpu")

print("*** Training from defined parameters for {} ***".format(expt_name))

print("Loading & splitting data...")
raw_data = pd.read_csv(data_csv_path, index_col=0)
train, valid, test = data_formatter.split_data(raw_data)
train_samples, valid_samples = data_formatter.get_num_samples_for_calibration()

# Sets up default params
fixed_params = data_formatter.get_experiment_params()
params = data_formatter.get_default_model_params()
params["model_folder"] = model_folder

# Parameter overrides for testing only! Small sizes used to speed up script.
if use_testing_mode:
  fixed_params["num_epochs"] = 15
  params["hidden_layer_size"] = 16
  train_samples, valid_samples = 1000, 100

# Sets up hyper-param manager
print("*** Loading hyperparam manager ***")
opt_manager = HyperparamOptManager({k: [params[k]] for k in params},
                                   fixed_params, model_folder)

# Training -- one iteration only
print("*** Running calibration ***")
print("Params Selected:")

for k in params:
  print("{}: {}".format(k, params[k]))

best_loss = np.inf

for _ in range(num_repeats):
  tf.reset_default_graph()
  with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
    tf.keras.backend.set_session(sess)
    params = opt_manager.get_next_parameters()
    model = ModelClass(params, use_cudnn=use_gpu)

    if not model.training_data_cached():
      model.cache_batched_data(train, "train", num_samples=train_samples)
      model.cache_batched_data(valid, "valid", num_samples=valid_samples)

    sess.run(tf.global_variables_initializer())
    model.fit()

    val_loss = model.evaluate()

    if val_loss < best_loss:
      opt_manager.update_score(params, val_loss, model)
      best_loss = val_loss

    tf.keras.backend.set_session(default_keras_session)

print("*** Running tests ***")
tf.reset_default_graph()

with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
  tf.keras.backend.set_session(sess)
  best_params = opt_manager.get_best_params()
  model = ModelClass(best_params, use_cudnn=use_gpu)

  model.load(opt_manager.hyperparam_folder)

  print("Computing best validation loss")
  val_loss = model.evaluate(valid)

  print("Computing test loss")
  output_map = model.predict(test, return_targets=True)

  targets = data_formatter.format_predictions(output_map["targets"])
  p10_forecast = data_formatter.format_predictions(output_map["p10"])
  p50_forecast = data_formatter.format_predictions(output_map["p50"])
  p90_forecast = data_formatter.format_predictions(output_map["p90"])


def extract_numerical_data(data):
  """Strips out forecast time and identifier columns."""
  return data[[
      col for col in data.columns
      if col not in {"forecast_time", "identifier"}
  ]]


p10_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p10_forecast),
    0.1)

p50_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p50_forecast),
    0.5)

p90_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p90_forecast),
    0.9)

tf.keras.backend.set_session(default_keras_session)

print("Training completed @ {}".format(dte.datetime.now()))
print("Best validation loss = {}".format(val_loss))
print("Params:")

for k in best_params:
  print(k, " = ", best_params[k])

print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format(
    p10_loss.mean(), p50_loss.mean(), p90_loss.mean()))

loss_experiment_tracker.update({
    "Mean Squared Error p10 Loss": str(p10_loss.mean()),
    "Mean Squared Error p50 Loss": str(p50_loss.mean()),
    "Mean Squared Error p90 Loss": str(p90_loss.mean()),
})
with open("favorita_dataset_experiments.json", "w") as outfile:
  json.dump(loss_experiment_tracker, outfile)
--------------------------------------------------------------------------------
/favorita_dataset_experiments/Quantile_Experiment.py:
--------------------------------------------------------------------------------
import datetime as dte
import os
import json
import data_formatters.base
import expt_settings.configs
import libs.hyperparam_opt
import libs.tft_model_quantile_loss
import libs.utils as utils
import numpy as np
import pandas as pd
import tensorflow.compat.v1 as tf
import warnings

warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

ExperimentConfig = expt_settings.configs.ExperimentConfig
HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager
ModelClass = libs.tft_model_quantile_loss.TemporalFusionTransformer
tf.experimental.output_all_intermediates(True)

with open('favorita_dataset_experiments.json', 'r') as f:
  loss_experiment_tracker = json.load(f)

dataset_name = "favorita"
dataset_folder_path = "favorita_dataset"

name = dataset_name
output_folder = dataset_folder_path

use_tensorflow_with_gpu = True
print("Using output folder {}".format(output_folder))

config = ExperimentConfig(name, output_folder)
formatter = config.make_data_formatter()

expt_name = name
use_gpu = use_tensorflow_with_gpu
model_folder = os.path.join(config.model_folder, "fixed")
data_csv_path = config.data_csv_path
data_formatter = formatter
use_testing_mode = True

num_repeats = 1

if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter):
  raise ValueError(
      "Data formatters should inherit from "
      "GenericDataFormatter! Type={}".format(type(data_formatter)))

# Tensorflow setup
default_keras_session = tf.keras.backend.get_session()

if use_gpu:
  tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0)
else:
  tf_config = utils.get_default_tensorflow_config(tf_device="cpu")

print("*** Training from defined parameters for {} ***".format(expt_name))

print("Loading & splitting data...")
raw_data = pd.read_csv(data_csv_path, index_col=0)
train, valid, test = data_formatter.split_data(raw_data)
train_samples, valid_samples = data_formatter.get_num_samples_for_calibration()

# Sets up default params
fixed_params = data_formatter.get_experiment_params()
params = data_formatter.get_default_model_params()
params["model_folder"] = model_folder

# Parameter overrides for testing only! Small sizes used to speed up script.
if use_testing_mode:
  fixed_params["num_epochs"] = 15
  params["hidden_layer_size"] = 16
  train_samples, valid_samples = 1000, 100

# Sets up hyper-param manager
print("*** Loading hyperparam manager ***")
opt_manager = HyperparamOptManager({k: [params[k]] for k in params},
                                   fixed_params, model_folder)

# Training -- one iteration only
print("*** Running calibration ***")
print("Params Selected:")

for k in params:
  print("{}: {}".format(k, params[k]))

best_loss = np.inf

for _ in range(num_repeats):
  tf.reset_default_graph()
  with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
    tf.keras.backend.set_session(sess)
    params = opt_manager.get_next_parameters()
    model = ModelClass(params, use_cudnn=use_gpu)

    if not model.training_data_cached():
      model.cache_batched_data(train, "train", num_samples=train_samples)
      model.cache_batched_data(valid, "valid", num_samples=valid_samples)

    sess.run(tf.global_variables_initializer())
    model.fit()

    val_loss = model.evaluate()

    if val_loss < best_loss:
      opt_manager.update_score(params, val_loss, model)
      best_loss = val_loss

    tf.keras.backend.set_session(default_keras_session)

print("*** Running tests ***")
tf.reset_default_graph()

with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
  tf.keras.backend.set_session(sess)
  best_params = opt_manager.get_best_params()
  model = ModelClass(best_params, use_cudnn=use_gpu)

  model.load(opt_manager.hyperparam_folder)

  print("Computing best validation loss")
  val_loss = model.evaluate(valid)

  print("Computing test loss")
  output_map = model.predict(test, return_targets=True)

  targets = data_formatter.format_predictions(output_map["targets"])
  p10_forecast = data_formatter.format_predictions(output_map["p10"])
  p50_forecast = data_formatter.format_predictions(output_map["p50"])
  p90_forecast = data_formatter.format_predictions(output_map["p90"])


def extract_numerical_data(data):
  """Strips out forecast time and identifier columns."""
  return data[[
      col for col in data.columns
      if col not in {"forecast_time", "identifier"}
  ]]


p10_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p10_forecast),
    0.1)

p50_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p50_forecast),
    0.5)

p90_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p90_forecast),
    0.9)

tf.keras.backend.set_session(default_keras_session)

print("Training completed @ {}".format(dte.datetime.now()))
print("Best validation loss = {}".format(val_loss))
print("Params:")

for k in best_params:
  print(k, " = ", best_params[k])

print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format(
    p10_loss.mean(), p50_loss.mean(), p90_loss.mean()))

loss_experiment_tracker.update({
    "Quantile p10 Loss": str(p10_loss.mean()),
    "Quantile p50 Loss": str(p50_loss.mean()),
    "Quantile p90 Loss": str(p90_loss.mean()),
})
with open("favorita_dataset_experiments.json", "w") as outfile:
  json.dump(loss_experiment_tracker, outfile)
--------------------------------------------------------------------------------
/favorita_dataset_experiments/RAE_Experiment.py:
--------------------------------------------------------------------------------
import datetime as dte
import os
import json
import data_formatters.base
import expt_settings.configs
import libs.hyperparam_opt
import libs.tft_model_rae_loss
import libs.utils as utils
import numpy as np
import pandas as pd
import tensorflow.compat.v1 as tf
import warnings

warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

ExperimentConfig = expt_settings.configs.ExperimentConfig
HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager
ModelClass = libs.tft_model_rae_loss.TemporalFusionTransformer
tf.experimental.output_all_intermediates(True)

with open('favorita_dataset_experiments.json', 'r') as f:
  loss_experiment_tracker = json.load(f)

dataset_name = "favorita"
dataset_folder_path = "favorita_dataset"

name = dataset_name
output_folder = dataset_folder_path

use_tensorflow_with_gpu = True
print("Using output folder {}".format(output_folder))

config = ExperimentConfig(name, output_folder)
formatter = config.make_data_formatter()

expt_name = name
use_gpu = use_tensorflow_with_gpu
model_folder = os.path.join(config.model_folder, "fixed")
data_csv_path = config.data_csv_path
data_formatter = formatter
use_testing_mode = True

num_repeats = 1

if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter):
  raise ValueError(
      "Data formatters should inherit from "
      "GenericDataFormatter! Type={}".format(type(data_formatter)))

# Tensorflow setup
default_keras_session = tf.keras.backend.get_session()

if use_gpu:
  tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0)
else:
  tf_config = utils.get_default_tensorflow_config(tf_device="cpu")

print("*** Training from defined parameters for {} ***".format(expt_name))

print("Loading & splitting data...")
raw_data = pd.read_csv(data_csv_path, index_col=0)
train, valid, test = data_formatter.split_data(raw_data)
train_samples, valid_samples = data_formatter.get_num_samples_for_calibration()

# Sets up default params
fixed_params = data_formatter.get_experiment_params()
params = data_formatter.get_default_model_params()
params["model_folder"] = model_folder

# Parameter overrides for testing only! Small sizes used to speed up script.
if use_testing_mode:
  fixed_params["num_epochs"] = 15
  params["hidden_layer_size"] = 16
  train_samples, valid_samples = 1000, 100

# Sets up hyper-param manager
print("*** Loading hyperparam manager ***")
opt_manager = HyperparamOptManager({k: [params[k]] for k in params},
                                   fixed_params, model_folder)

# Training -- one iteration only
print("*** Running calibration ***")
print("Params Selected:")

for k in params:
  print("{}: {}".format(k, params[k]))

best_loss = np.inf

for _ in range(num_repeats):
  tf.reset_default_graph()
  with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
    tf.keras.backend.set_session(sess)
    params = opt_manager.get_next_parameters()
    model = ModelClass(params, use_cudnn=use_gpu)

    if not model.training_data_cached():
      model.cache_batched_data(train, "train", num_samples=train_samples)
      model.cache_batched_data(valid, "valid", num_samples=valid_samples)

    sess.run(tf.global_variables_initializer())
    model.fit()

    val_loss = model.evaluate()

    if val_loss < best_loss:
      opt_manager.update_score(params, val_loss, model)
      best_loss = val_loss

    tf.keras.backend.set_session(default_keras_session)

print("*** Running tests ***")
tf.reset_default_graph()

with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
  tf.keras.backend.set_session(sess)
  best_params = opt_manager.get_best_params()
  model = ModelClass(best_params, use_cudnn=use_gpu)

  model.load(opt_manager.hyperparam_folder)

  print("Computing best validation loss")
  val_loss = model.evaluate(valid)

  print("Computing test loss")
  output_map = model.predict(test, return_targets=True)

  targets = data_formatter.format_predictions(output_map["targets"])
  p10_forecast = data_formatter.format_predictions(output_map["p10"])
  p50_forecast = data_formatter.format_predictions(output_map["p50"])
  p90_forecast = data_formatter.format_predictions(output_map["p90"])


def extract_numerical_data(data):
  """Strips out forecast time and identifier columns."""
  return data[[
      col for col in data.columns
      if col not in {"forecast_time", "identifier"}
  ]]


p10_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p10_forecast),
    0.1)

p50_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p50_forecast),
    0.5)

p90_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p90_forecast),
    0.9)

tf.keras.backend.set_session(default_keras_session)

print("Training completed @ {}".format(dte.datetime.now()))
print("Best validation loss = {}".format(val_loss))
print("Params:")

for k in best_params:
  print(k, " = ", best_params[k])

print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format(
    p10_loss.mean(), p50_loss.mean(), p90_loss.mean()))

loss_experiment_tracker.update({
    "Relative Absolute Error p10 Loss": str(p10_loss.mean()),
    "Relative Absolute Error p50 Loss": str(p50_loss.mean()),
    "Relative Absolute Error p90 Loss": str(p90_loss.mean()),
})
with open("favorita_dataset_experiments.json", "w") as outfile:
  json.dump(loss_experiment_tracker, outfile)
--------------------------------------------------------------------------------
/favorita_dataset_experiments/RSE_Experiment.py:
--------------------------------------------------------------------------------
import datetime as dte
import os
import json
import data_formatters.base
import expt_settings.configs
import libs.hyperparam_opt
import libs.tft_model_rse_loss
import libs.utils as utils
import numpy as np
import pandas as pd
import tensorflow.compat.v1 as tf
import warnings

warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

ExperimentConfig = expt_settings.configs.ExperimentConfig
HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager
ModelClass = libs.tft_model_rse_loss.TemporalFusionTransformer
tf.experimental.output_all_intermediates(True)

with open('favorita_dataset_experiments.json', 'r') as f:
  loss_experiment_tracker = json.load(f)

dataset_name = "favorita"
dataset_folder_path = "favorita_dataset"

name = dataset_name
output_folder = dataset_folder_path

use_tensorflow_with_gpu = True
print("Using output folder {}".format(output_folder))

config = ExperimentConfig(name, output_folder)
formatter = config.make_data_formatter()

expt_name = name
use_gpu = use_tensorflow_with_gpu
model_folder = os.path.join(config.model_folder, "fixed")
data_csv_path = config.data_csv_path
data_formatter = formatter
use_testing_mode = True

num_repeats = 1

if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter):
  raise ValueError(
      "Data formatters should inherit from "
      "GenericDataFormatter! Type={}".format(type(data_formatter)))

# Tensorflow setup
default_keras_session = tf.keras.backend.get_session()

if use_gpu:
  tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0)
else:
  tf_config = utils.get_default_tensorflow_config(tf_device="cpu")

print("*** Training from defined parameters for {} ***".format(expt_name))

print("Loading & splitting data...")
raw_data = pd.read_csv(data_csv_path, index_col=0)
train, valid, test = data_formatter.split_data(raw_data)
train_samples, valid_samples = data_formatter.get_num_samples_for_calibration()

# Sets up default params
fixed_params = data_formatter.get_experiment_params()
params = data_formatter.get_default_model_params()
params["model_folder"] = model_folder

# Parameter overrides for testing only! Small sizes used to speed up script.
if use_testing_mode:
  fixed_params["num_epochs"] = 15
  params["hidden_layer_size"] = 16
  train_samples, valid_samples = 1000, 100

# Sets up hyper-param manager
print("*** Loading hyperparam manager ***")
opt_manager = HyperparamOptManager({k: [params[k]] for k in params},
                                   fixed_params, model_folder)

# Training -- one iteration only
print("*** Running calibration ***")
print("Params Selected:")

for k in params:
  print("{}: {}".format(k, params[k]))

best_loss = np.inf

for _ in range(num_repeats):
  tf.reset_default_graph()
  with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
    tf.keras.backend.set_session(sess)
    params = opt_manager.get_next_parameters()
    model = ModelClass(params, use_cudnn=use_gpu)

    if not model.training_data_cached():
      model.cache_batched_data(train, "train", num_samples=train_samples)
      model.cache_batched_data(valid, "valid", num_samples=valid_samples)

    sess.run(tf.global_variables_initializer())
    model.fit()

    val_loss = model.evaluate()

    if val_loss < best_loss:
      opt_manager.update_score(params, val_loss, model)
      best_loss = val_loss

    tf.keras.backend.set_session(default_keras_session)

print("*** Running tests ***")
tf.reset_default_graph()

with tf.Graph().as_default(), tf.Session(config=tf_config) as sess:
  tf.keras.backend.set_session(sess)
  best_params = opt_manager.get_best_params()
  model = ModelClass(best_params, use_cudnn=use_gpu)

  model.load(opt_manager.hyperparam_folder)

  print("Computing best validation loss")
  val_loss = model.evaluate(valid)

  print("Computing test loss")
  output_map = model.predict(test, return_targets=True)

  targets = data_formatter.format_predictions(output_map["targets"])
  p10_forecast = data_formatter.format_predictions(output_map["p10"])
  p50_forecast = data_formatter.format_predictions(output_map["p50"])
  p90_forecast = data_formatter.format_predictions(output_map["p90"])


def extract_numerical_data(data):
  """Strips out forecast time and identifier columns."""
  return data[[
      col for col in data.columns
      if col not in {"forecast_time", "identifier"}
  ]]


p10_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p10_forecast),
    0.1)

p50_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p50_forecast),
    0.5)

p90_loss = utils.numpy_normalised_quantile_loss(
    extract_numerical_data(targets), extract_numerical_data(p90_forecast),
    0.9)

tf.keras.backend.set_session(default_keras_session)

print("Training completed @ {}".format(dte.datetime.now()))
print("Best validation loss = {}".format(val_loss))
print("Params:")

for k in best_params:
  print(k, " = ", best_params[k])

print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format(
    p10_loss.mean(), p50_loss.mean(), p90_loss.mean()))

loss_experiment_tracker.update({
    "Relative Squared Error p10 Loss": str(p10_loss.mean()),
    "Relative Squared Error p50 Loss": str(p50_loss.mean()),
    "Relative Squared Error p90 Loss": str(p90_loss.mean()),
})
with open("favorita_dataset_experiments.json", "w") as outfile:
  json.dump(loss_experiment_tracker, outfile)
--------------------------------------------------------------------------------
/favorita_dataset_experiments/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/favorita_dataset_experiments/__init__.py
--------------------------------------------------------------------------------
/favorita_dataset_experiments/favorita_dataset_experiments.json:
--------------------------------------------------------------------------------
{
  "Mean Absolute Error p10 Loss": "0.6426598846305366",
  "Mean Absolute Error p50 Loss": "0.5906839172788886",
  "Mean Absolute Error p90 Loss": "0.6915747395903965",
  "Mean Absolute Percentage Error p10 Loss": "0.30985850128438464",
  "Mean Absolute Percentage Error p50 Loss": "0.607960616354056",
  "Mean Absolute Percentage Error p90 Loss": "0.4172795597411947",
  "Mean Squared Logarithmic Error p10 Loss": "0.3034171933734274",
  "Mean Squared Logarithmic Error p50 Loss": "0.6643826207546163",
  "Mean Squared Logarithmic Error p90 Loss": "5.120504603440817",
  "Relative Squared Error p10 Loss": "2.7439491407304626",
  "Relative Squared Error p50 Loss": "0.6252856878103299",
  "Relative Squared Error p90 Loss": "0.3448551281036559",
  "Relative Root Mean Squared Error p10 Loss": "0.6620256772743425",
  "Relative Root Mean Squared Error p50 Loss": "0.5525196665046799",
  "Relative Root Mean Squared Error p90 Loss": "0.5118596539943806",
  "Root Mean Squared Error p10 Loss": "0.5193378711427281",
  "Root Mean Squared Error p50 Loss": "0.4622634587882006",
  "Root Mean Squared Error p90 Loss": "0.40497620686487296",
  "Huber Delta 0.5 p10 Loss": "0.5746583659175247",
  "Huber Delta 0.5 p50 Loss": "0.5842756876558967",
  "Huber Delta 0.5 p90 Loss": "0.5224606349575559",
  "Quantile p10 Loss": "0.20236085255062736",
  "Quantile p50 Loss": "0.5816861365783124",
  "Quantile p90 Loss": "0.2743975374726861",
  "LogCosh p10 Loss": "0.3363576896573901",
  "LogCosh p50 Loss": "0.37472587799952917",
  "LogCosh p90 Loss": "0.39649554794970426",
  "Root Mean Squared Logarithmic Error p10 Loss": "0.3632761488257274",
  "Root Mean Squared Logarithmic Error p50 Loss": "0.7082698002169495",
  "Root Mean Squared Logarithmic Error p90 Loss": "3.1433517239590825",
  "Normalized Root Mean Squared Error p10 Loss": "0.5344668964009097",
  "Normalized Root Mean Squared Error p50 Loss": "0.5242454867994474",
  "Normalized Root Mean Squared Error p90 Loss": "0.5035633388490524",
  "Mean Squared Error p10 Loss": "0.4726808648055693",
  "Mean Squared Error p50 Loss": "0.4472480613074809",
  "Mean Squared Error p90 Loss": "0.4204969982152634",
  "Relative Absolute Error p10 Loss": "0.4635142060676026",
  "Relative Absolute Error p50 Loss": "0.37607876919179134",
  "Relative Absolute Error p90 Loss": "0.28875546285185444",
  "Mean Bias Error p10 Loss": "1103.6319916315724",
  "Mean Bias Error p50 Loss": "616.9170945785401",
  "Mean Bias Error p90 Loss": "126.38706163589534"
}
--------------------------------------------------------------------------------
/favorita_dataset_experiments/running_experiments.py:
-------------------------------------------------------------------------------- 1 | import subprocess 2 | 3 | subprocess.call("python MAE_Experiment.py", shell=True) 4 | subprocess.call("python MAPE_Experiment.py", shell=True) 5 | subprocess.call("python MSE_Experiment.py", shell=True) 6 | 7 | subprocess.call("python MSLE_Experiment.py", shell=True) 8 | subprocess.call("python MBE_Experiment.py", shell=True) 9 | subprocess.call("python RAE_Experiment.py", shell=True) 10 | 11 | subprocess.call("python RSE_Experiment.py", shell=True) 12 | subprocess.call("python NRMSE_Experiment.py", shell=True) 13 | subprocess.call("python RRMSE_Experiment.py", shell=True) 14 | 15 | subprocess.call("python RMSLE_Experiment.py", shell=True) 16 | subprocess.call("python RMSE_Experiment.py", shell=True) 17 | subprocess.call("python Huber_Experiment.py", shell=True) 18 | 19 | subprocess.call("python Quantile_Experiment.py", shell=True) 20 | subprocess.call("python LogCosh_Experiment.py", shell=True) 21 | -------------------------------------------------------------------------------- /libs/__init__.py: -------------------------------------------------------------------------------- 1 | # coding=utf-8 2 | # Copyright 2021 The Google Research Authors. 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 13 | # See the License for the specific language governing permissions and 14 | # limitations under the License. 
15 | 16 | -------------------------------------------------------------------------------- /loss_functions_plots/Huber-Loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/loss_functions_plots/Huber-Loss.png -------------------------------------------------------------------------------- /loss_functions_plots/LogCosh-Loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/loss_functions_plots/LogCosh-Loss.png -------------------------------------------------------------------------------- /loss_functions_plots/Loss-Functions-Summary.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/loss_functions_plots/Loss-Functions-Summary.png -------------------------------------------------------------------------------- /loss_functions_plots/MAE-Loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/loss_functions_plots/MAE-Loss.png -------------------------------------------------------------------------------- /loss_functions_plots/MAPE-Loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/loss_functions_plots/MAPE-Loss.png -------------------------------------------------------------------------------- /loss_functions_plots/MBE-Loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/loss_functions_plots/MBE-Loss.png -------------------------------------------------------------------------------- /loss_functions_plots/MSE-Loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/loss_functions_plots/MSE-Loss.png -------------------------------------------------------------------------------- /loss_functions_plots/NRMSE-Loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/loss_functions_plots/NRMSE-Loss.png -------------------------------------------------------------------------------- /loss_functions_plots/Quantile-Loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/loss_functions_plots/Quantile-Loss.png 
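The images in this folder plot each loss as a function of the prediction error. The plotting script itself is not part of the repository, so the sketch below is only an illustration of how such curves can be generated: the loss formulas are the standard definitions, the Huber delta of 0.5 matches the value used in the experiments, and every other name in the snippet (including the output file) is an assumption rather than the repository's actual code.

```
# Illustrative sketch only -- not the script that produced these images.
import numpy as np
import matplotlib.pyplot as plt

e = np.linspace(-3.0, 3.0, 601)  # prediction error (y_true - y_pred)
delta = 0.5                      # same delta as the Huber experiments

mse = e ** 2
mae = np.abs(e)
log_cosh = np.log(np.cosh(e))
huber = np.where(np.abs(e) <= delta,
                 0.5 * e ** 2,
                 delta * (np.abs(e) - 0.5 * delta))

for curve, label in [(mse, "MSE"), (mae, "MAE"),
                     (huber, "Huber (delta=0.5)"), (log_cosh, "LogCosh")]:
    plt.plot(e, curve, label=label)

plt.xlabel("prediction error")
plt.ylabel("loss")
plt.legend()
plt.savefig("loss_curves.png")   # hypothetical output name
```

Quantile loss curves could be added the same way by plotting q * max(e, 0) + (1 - q) * max(-e, 0) for each quantile q.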
-------------------------------------------------------------------------------- /loss_functions_plots/RAE-Loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/loss_functions_plots/RAE-Loss.png -------------------------------------------------------------------------------- /loss_functions_plots/RMSE-Loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/loss_functions_plots/RMSE-Loss.png -------------------------------------------------------------------------------- /loss_functions_plots/RMSLE-Loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/loss_functions_plots/RMSLE-Loss.png -------------------------------------------------------------------------------- /loss_functions_plots/RRMSE-Loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/loss_functions_plots/RRMSE-Loss.png -------------------------------------------------------------------------------- /loss_functions_plots/RSE-Loss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/loss_functions_plots/RSE-Loss.png -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | numpy==1.17.4 2 | pandas==0.25.3 3 | scikit-learn==0.22 4 | tensorflow-probability==0.8.0 5 | wget==3.2 6 | pyunpack==0.1.2 7 | patool==1.12 -------------------------------------------------------------------------------- /traffic_dataset_experiments/Huber_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_huber_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_huber_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('traffic_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "traffic" 26 | dataset_folder_path = "traffic_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, 
output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from " + 49 | "AbstractDataFormatter! Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparam manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast =
data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Huber Delta 0.5 p10 Loss": str(p10_loss.mean()), 169 | "Huber Delta 0.5 p50 Loss": str(p50_loss.mean()), 170 | "Huber Delta 0.5 p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with open("traffic_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /traffic_dataset_experiments/LogCosh_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_log_cosh 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_log_cosh.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('traffic_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "traffic" 26 | dataset_folder_path = "traffic_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! 
Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | 
extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "LogCosh p10 Loss": str(p10_loss.mean()), 169 | "LogCosh p50 Loss": str(p50_loss.mean()), 170 | "LogCosh p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with open("traffic_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /traffic_dataset_experiments/MAE_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_mae_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_mae_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('traffic_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "traffic" 26 | dataset_folder_path = "traffic_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! 
Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | 
extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Mean Absolute Error p10 Loss": str(p10_loss.mean()), 169 | "Mean Absolute Error p50 Loss": str(p50_loss.mean()), 170 | "Mean Absolute Error p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with open("traffic_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /traffic_dataset_experiments/MBE_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_mbe_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_mbe_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('traffic_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "traffic" 26 | dataset_folder_path = "traffic_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! 
Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | 
extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Mean Bias Error p10 Loss": str(p10_loss.mean()), 169 | "Mean Bias Error p50 Loss": str(p50_loss.mean()), 170 | "Mean Bias Error p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with open("traffic_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /traffic_dataset_experiments/MSE_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_mse_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_mse_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('traffic_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "traffic" 26 | dataset_folder_path = "traffic_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! 
Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | 
extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Mean Squared Error p10 Loss": str(p10_loss.mean()), 169 | "Mean Squared Error p50 Loss": str(p50_loss.mean()), 170 | "Mean Squared Error p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with open("traffic_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /traffic_dataset_experiments/Quantile_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_quantile_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_quantile_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('traffic_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "traffic" 26 | dataset_folder_path = "traffic_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! 
Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | 
extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Quantile p10 Loss": str(p10_loss.mean()), 169 | "Quantile p50 Loss": str(p50_loss.mean()), 170 | "Quantile p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with open("traffic_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /traffic_dataset_experiments/RAE_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_rae_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_rae_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('traffic_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "traffic" 26 | dataset_folder_path = "traffic_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! 
Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | 
extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Relative Absolute Error p10 Loss": str(p10_loss.mean()), 169 | "Relative Absolute Error p50 Loss": str(p50_loss.mean()), 170 | "Relative Absolute Error p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with open("traffic_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /traffic_dataset_experiments/RMSE_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_rmse_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_rmse_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('traffic_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "traffic" 26 | dataset_folder_path = "traffic_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! 
Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | 
extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Root Mean Squared Error p10 Loss": str(p10_loss.mean()), 169 | "Root Mean Squared Error p50 Loss": str(p50_loss.mean()), 170 | "Root Mean Squared Error p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with open("traffic_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /traffic_dataset_experiments/RSE_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_rse_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_rse_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('traffic_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "traffic" 26 | dataset_folder_path = "traffic_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! 
Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | 
extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Relative Squared Error p10 Loss": str(p10_loss.mean()), 169 | "Relative Squared Error p50 Loss": str(p50_loss.mean()), 170 | "Relative Squared Error p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with open("traffic_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /traffic_dataset_experiments/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/traffic_dataset_experiments/__init__.py -------------------------------------------------------------------------------- /traffic_dataset_experiments/running_experiments.py: -------------------------------------------------------------------------------- 1 | import subprocess 2 | 3 | # subprocess.call("python MAE_Experiment.py", shell=True) 4 | # subprocess.call("python MAPE_Experiment.py", shell=True) 5 | # subprocess.call("python MSE_Experiment.py", shell=True) 6 | 7 | # subprocess.call("python MSLE_Experiment.py", shell=True) 8 | # subprocess.call("python MBE_Experiment.py", shell=True) 9 | # subprocess.call("python RAE_Experiment.py", shell=True) 10 | 11 | # subprocess.call("python RSE_Experiment.py", shell=True) 12 | # subprocess.call("python NRMSE_Experiment.py", shell=True) 13 | # subprocess.call("python RRMSE_Experiment.py", shell=True) 14 | subprocess.call("python RMSLE_Experiment.py", shell=True) 15 | subprocess.call("python RMSE_Experiment.py", shell=True) 16 | 17 | # subprocess.call("python Huber_Experiment.py", shell=True) 18 | # subprocess.call("python Quantile_Experiment.py", shell=True) 19 | # subprocess.call("python LogCosh_Experiment.py", shell=True) 20 | -------------------------------------------------------------------------------- /traffic_dataset_experiments/traffic_dataset_experiments.json: -------------------------------------------------------------------------------- 1 | { 2 | "Mean Absolute Error p10 Loss": "0.2149700098515992", 3 | "Mean Absolute Error p50 Loss": "0.2599976134044644", 4 | "Mean Absolute Error p90 Loss": "0.3026338907797226", 5 | "Mean Absolute Percentage Error p10 Loss": "0.31587777409938284", 6 | "Mean Absolute Percentage Error p50 Loss": "0.43101565344338716", 7 | "Mean Absolute Percentage Error p90 Loss": "0.5165055192279181", 8 | "Mean Squared Error p10 Loss": "0.33964135767331216", 9 | "Mean Squared Error p50 Loss": "0.31213823622988685", 10 | "Mean Squared Error p90 Loss": "0.334369158251204", 11 | "Mean Squared Logarithmic Error p10 Loss": "0.5709465025309931", 12 | "Mean Squared Logarithmic Error p50 Loss": "0.485518275348245", 13 | "Mean Squared Logarithmic Error 
p90 Loss": "4.200314928717656", 14 | "Mean Bias Error p10 Loss": "1374.865649230446", 15 | "Mean Bias Error p50 Loss": "792.3097505791287", 16 | "Mean Bias Error p90 Loss": "159.8725489956689", 17 | "Relative Absolute Error p10 Loss": "0.27977943293948515", 18 | "Relative Absolute Error p50 Loss": "0.3283051696422157", 19 | "Relative Absolute Error p90 Loss": "0.3942593968609253", 20 | "Relative Squared Error p10 Loss": "0.8120313832508891", 21 | "Relative Squared Error p50 Loss": "0.6196871439664883", 22 | "Relative Squared Error p90 Loss": "0.5100966962703984", 23 | "Huber Delta 0.5 p10 Loss": "0.28792663198516383", 24 | "Huber Delta 0.5 p50 Loss": "0.35444103229790414", 25 | "Huber Delta 0.5 p90 Loss": "0.43353158314037604", 26 | "Quantile p10 Loss": "0.10870677074639072", 27 | "Quantile p50 Loss": "0.31512755338192133", 28 | "Quantile p90 Loss": "0.2324860052743446", 29 | "LogCosh p10 Loss": "0.21281489314654034", 30 | "LogCosh p50 Loss": "0.24873953761120907", 31 | "LogCosh p90 Loss": "0.2895278358251913", 32 | "Relative Root Mean Squared Error p10 Loss": "0.8622717586666065", 33 | "Relative Root Mean Squared Error p50 Loss": "1.193876846712492", 34 | "Relative Root Mean Squared Error p90 Loss": "1.3045465492312036", 35 | "Root Mean Squared Error p10 Loss": "0.3533757601703649", 36 | "Root Mean Squared Error p50 Loss": "0.339826989511792", 37 | "Root Mean Squared Error p90 Loss": "0.3374425805883618", 38 | "Normalized Root Mean Squared Error p10 Loss": "0.6621396996598361", 39 | "Normalized Root Mean Squared Error p50 Loss": "0.6854751675486987", 40 | "Normalized Root Mean Squared Error p90 Loss": "0.5022536859602558", 41 | "Root Mean Squared Logarithmic Error p10 Loss": "0.5078594784184328", 42 | "Root Mean Squared Logarithmic Error p50 Loss": "2.0775628085274107", 43 | "Root Mean Squared Logarithmic Error p90 Loss": "3.581882167915962" 44 | } -------------------------------------------------------------------------------- /volatility_dataset_experiments/Huber_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_huber_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_huber_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('volatility_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "volatility" 26 | dataset_folder_path = "volatility_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, 
data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | 
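The tracker keys this script writes at the end ("Huber Delta 0.5 ...") indicate a Huber transition point of delta = 0.5. A minimal sketch of that loss, assuming the standard formulation; the internals of `libs.tft_model_huber_loss` may differ.

```python
import tensorflow.compat.v1 as tf

def huber_loss(y_true, y_pred, delta=0.5):
    error = tf.abs(y_true - y_pred)
    quadratic = tf.minimum(error, delta)  # MSE-like region near zero
    linear = error - quadratic            # MAE-like region for large residuals
    return tf.reduce_mean(0.5 * tf.square(quadratic) + delta * linear)
```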
extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Huber Delta 0.5 p10 Loss": str(p10_loss.mean()), 169 | "Huber Delta 0.5 p50 Loss": str(p50_loss.mean()), 170 | "Huber Delta 0.5 p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with open("volatility_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /volatility_dataset_experiments/LogCosh_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_log_cosh 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_log_cosh.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('volatility_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "volatility" 26 | dataset_folder_path = "volatility_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! 
Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | 
extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "LogCosh p10 Loss": str(p10_loss.mean()), 169 | "LogCosh p50 Loss": str(p50_loss.mean()), 170 | "LogCosh p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with open("volatility_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /volatility_dataset_experiments/MAE_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_mae_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_mae_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('volatility_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "volatility" 26 | dataset_folder_path = "volatility_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! 
Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | 
extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Mean Absolute Error p10 Loss": str(p10_loss.mean()), 169 | "Mean Absolute Error p50 Loss": str(p50_loss.mean()), 170 | "Mean Absolute Error p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with open("volatility_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /volatility_dataset_experiments/MBE_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_mbe_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_mbe_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('volatility_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "volatility" 26 | dataset_folder_path = "volatility_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! 
Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | 
extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Mean Bias Error p10 Loss": str(p10_loss.mean()), 169 | "Mean Bias Error p50 Loss": str(p50_loss.mean()), 170 | "Mean Bias Error p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with open("volatility_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /volatility_dataset_experiments/MSE_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_mse_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_mse_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('volatility_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "volatility" 26 | dataset_folder_path = "volatility_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! 
Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | 
extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Mean Squared Error p10 Loss": str(p10_loss.mean()), 169 | "Mean Squared Error p50 Loss": str(p50_loss.mean()), 170 | "Mean Squared Error p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with open("volatility_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /volatility_dataset_experiments/Quantile_Experiment.py: -------------------------------------------------------------------------------- 1 | import datetime as dte 2 | import os 3 | import json 4 | import data_formatters.base 5 | import expt_settings.configs 6 | import libs.hyperparam_opt 7 | import libs.tft_model_quantile_loss 8 | import libs.utils as utils 9 | import numpy as np 10 | import pandas as pd 11 | import tensorflow.compat.v1 as tf 12 | import warnings 13 | 14 | warnings.filterwarnings('ignore') 15 | os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' 16 | 17 | ExperimentConfig = expt_settings.configs.ExperimentConfig 18 | HyperparamOptManager = libs.hyperparam_opt.HyperparamOptManager 19 | ModelClass = libs.tft_model_quantile_loss.TemporalFusionTransformer 20 | tf.experimental.output_all_intermediates(True) 21 | 22 | with open('volatility_dataset_experiments.json', 'r') as f: 23 | loss_experiment_tracker = json.load(f) 24 | 25 | dataset_name = "volatility" 26 | dataset_folder_path = "volatility_dataset" 27 | 28 | name = dataset_name 29 | output_folder = dataset_folder_path 30 | 31 | use_tensorflow_with_gpu = True 32 | print("Using output folder {}".format(output_folder)) 33 | 34 | config = ExperimentConfig(name, output_folder) 35 | formatter = config.make_data_formatter() 36 | 37 | expt_name = name 38 | use_gpu = use_tensorflow_with_gpu 39 | model_folder = os.path.join(config.model_folder, "fixed") 40 | data_csv_path = config.data_csv_path 41 | data_formatter = formatter 42 | use_testing_mode = True 43 | 44 | num_repeats = 1 45 | 46 | if not isinstance(data_formatter, data_formatters.base.GenericDataFormatter): 47 | raise ValueError( 48 | "Data formatters should inherit from" + 49 | "AbstractDataFormatter! 
Type={}".format(type(data_formatter))) 50 | 51 | # Tensorflow setup 52 | default_keras_session = tf.keras.backend.get_session() 53 | 54 | if use_gpu: 55 | tf_config = utils.get_default_tensorflow_config(tf_device="gpu", gpu_id=0) 56 | else: 57 | tf_config = utils.get_default_tensorflow_config(tf_device="cpu") 58 | 59 | print("*** Training from defined parameters for {} ***".format(expt_name)) 60 | 61 | print("Loading & splitting data...") 62 | raw_data = pd.read_csv(data_csv_path, index_col=0) 63 | train, valid, test = data_formatter.split_data(raw_data) 64 | train_samples, valid_samples = data_formatter.get_num_samples_for_calibration() 65 | 66 | # Sets up default params 67 | fixed_params = data_formatter.get_experiment_params() 68 | params = data_formatter.get_default_model_params() 69 | params["model_folder"] = model_folder 70 | 71 | # Parameter overrides for testing only! Small sizes used to speed up script. 72 | if use_testing_mode: 73 | fixed_params["num_epochs"] = 15 74 | params["hidden_layer_size"] = 16 75 | train_samples, valid_samples = 1000, 100 76 | 77 | # Sets up hyper-param manager 78 | print("*** Loading hyperparm manager ***") 79 | opt_manager = HyperparamOptManager({k: [params[k]] for k in params}, 80 | fixed_params, model_folder) 81 | 82 | # Training -- one iteration only 83 | print("*** Running calibration ***") 84 | print("Params Selected:") 85 | 86 | for k in params: 87 | print("{}: {}".format(k, params[k])) 88 | 89 | best_loss = np.Inf 90 | 91 | for _ in range(num_repeats): 92 | tf.reset_default_graph() 93 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 94 | tf.keras.backend.set_session(sess) 95 | params = opt_manager.get_next_parameters() 96 | model = ModelClass(params, use_cudnn=use_gpu) 97 | 98 | if not model.training_data_cached(): 99 | model.cache_batched_data(train, "train", num_samples=train_samples) 100 | model.cache_batched_data(valid, "valid", num_samples=valid_samples) 101 | 102 | sess.run(tf.global_variables_initializer()) 103 | model.fit() 104 | 105 | val_loss = model.evaluate() 106 | 107 | if val_loss < best_loss: 108 | opt_manager.update_score(params, val_loss, model) 109 | best_loss = val_loss 110 | 111 | tf.keras.backend.set_session(default_keras_session) 112 | 113 | print("*** Running tests ***") 114 | tf.reset_default_graph() 115 | 116 | with tf.Graph().as_default(), tf.Session(config=tf_config) as sess: 117 | tf.keras.backend.set_session(sess) 118 | best_params = opt_manager.get_best_params() 119 | model = ModelClass(best_params, use_cudnn=use_gpu) 120 | 121 | model.load(opt_manager.hyperparam_folder) 122 | 123 | print("Computing best validation loss") 124 | val_loss = model.evaluate(valid) 125 | 126 | print("Computing test loss") 127 | output_map = model.predict(test, return_targets=True) 128 | 129 | targets = data_formatter.format_predictions(output_map["targets"]) 130 | p10_forecast = data_formatter.format_predictions(output_map["p10"]) 131 | p50_forecast = data_formatter.format_predictions(output_map["p50"]) 132 | p90_forecast = data_formatter.format_predictions(output_map["p90"]) 133 | 134 | 135 | def extract_numerical_data(data): 136 | """Strips out forecast time and identifier columns.""" 137 | return data[[ 138 | col for col in data.columns 139 | if col not in {"forecast_time", "identifier"} 140 | ]] 141 | 142 | 143 | p10_loss = utils.numpy_normalised_quantile_loss( 144 | extract_numerical_data(targets), extract_numerical_data(p10_forecast), 145 | 0.1) 146 | 147 | p50_loss = utils.numpy_normalised_quantile_loss( 148 | 
extract_numerical_data(targets), extract_numerical_data(p50_forecast), 149 | 0.5) 150 | 151 | p90_loss = utils.numpy_normalised_quantile_loss( 152 | extract_numerical_data(targets), extract_numerical_data(p90_forecast), 153 | 0.9) 154 | 155 | tf.keras.backend.set_session(default_keras_session) 156 | 157 | print("Training completed @ {}".format(dte.datetime.now())) 158 | print("Best validation loss = {}".format(val_loss)) 159 | print("Params:") 160 | 161 | for k in best_params: 162 | print(k, " = ", best_params[k]) 163 | 164 | print("Normalised Quantile Loss for Test Data: P10={}, P50={}, P90={}".format( 165 | p10_loss.mean(), p50_loss.mean(), p90_loss.mean())) 166 | 167 | loss_experiment_tracker.update({ 168 | "Quantile p10 Loss": str(p10_loss.mean()), 169 | "Quantile p50 Loss": str(p50_loss.mean()), 170 | "Quantile p90 Loss": str(p90_loss.mean()), 171 | }) 172 | 173 | with open("volatility_dataset_experiments.json", "w") as outfile: 174 | json.dump(loss_experiment_tracker, outfile) 175 | -------------------------------------------------------------------------------- /volatility_dataset_experiments/__init__.py: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/aryan-jadon/Regression-Loss-Functions-in-Time-Series-Forecasting-Tensorflow/509857e1d43a57e2afc62c046bc67247e8ef6b74/volatility_dataset_experiments/__init__.py -------------------------------------------------------------------------------- /volatility_dataset_experiments/running_experiments.py: -------------------------------------------------------------------------------- 1 | import subprocess 2 | 3 | subprocess.call("python MAE_Experiment.py", shell=True) 4 | subprocess.call("python MAPE_Experiment.py", shell=True) 5 | subprocess.call("python MSE_Experiment.py", shell=True) 6 | 7 | subprocess.call("python MSLE_Experiment.py", shell=True) 8 | subprocess.call("python MBE_Experiment.py", shell=True) 9 | subprocess.call("python RAE_Experiment.py", shell=True) 10 | 11 | subprocess.call("python RSE_Experiment.py", shell=True) 12 | subprocess.call("python NRMSE_Experiment.py", shell=True) 13 | subprocess.call("python RRMSE_Experiment.py", shell=True) 14 | 15 | subprocess.call("python RMSLE_Experiment.py", shell=True) 16 | subprocess.call("python RMSE_Experiment.py", shell=True) 17 | subprocess.call("python Huber_Experiment.py", shell=True) 18 | 19 | subprocess.call("python Quantile_Experiment.py", shell=True) 20 | subprocess.call("python LogCosh_Experiment.py", shell=True) 21 | -------------------------------------------------------------------------------- /volatility_dataset_experiments/volatility_dataset_experiments.json: -------------------------------------------------------------------------------- 1 | { 2 | "Mean Absolute Error p10 Loss": "0.042824335947627426", 3 | "Mean Absolute Error p50 Loss": "0.04168029241496022", 4 | "Mean Absolute Error p90 Loss": "0.03884058353596124", 5 | "Mean Absolute Percentage Error p10 Loss": "0.12355033245265527", 6 | "Mean Absolute Percentage Error p50 Loss": "0.07438012624099026", 7 | "Mean Absolute Percentage Error p90 Loss": "0.023561194940066848", 8 | "Mean Squared Error p10 Loss": "0.04010624478444199", 9 | "Mean Squared Error p50 Loss": "0.04386896257603089", 10 | "Mean Squared Error p90 Loss": "0.0540467312190543", 11 | "Mean Squared Logarithmic Error p10 Loss": "0.02077089525455014", 12 | "Mean Squared Logarithmic Error p50 Loss": "0.12188398084429979", 13 | "Mean Squared Logarithmic Error p90 Loss": "0.2765753210679013", 
14 | "Mean Bias Error p10 Loss": "340.36138917779664", 15 | "Mean Bias Error p50 Loss": "183.4781183729714", 16 | "Mean Bias Error p90 Loss": "36.62861322122592", 17 | "Relative Absolute Error p10 Loss": "0.03719687908521616", 18 | "Relative Absolute Error p50 Loss": "0.04236396893799749", 19 | "Relative Absolute Error p90 Loss": "0.05440406932961669", 20 | "Relative Squared Error p10 Loss": "0.14905548838620944", 21 | "Relative Squared Error p50 Loss": "0.14042761570620096", 22 | "Relative Squared Error p90 Loss": "0.04396719253099129", 23 | "Normalized Root Mean Squared Error p10 Loss": "7.952003329280757", 24 | "Normalized Root Mean Squared Error p50 Loss": "2.4256375886133474", 25 | "Normalized Root Mean Squared Error p90 Loss": "3.772548861352754", 26 | "Relative Root Mean Squared Error p10 Loss": "0.046893215258235116", 27 | "Relative Root Mean Squared Error p50 Loss": "0.05121035461243939", 28 | "Relative Root Mean Squared Error p90 Loss": "0.06704528321049934", 29 | "Root Mean Squared Error p10 Loss": "0.050491769855794524", 30 | "Root Mean Squared Error p50 Loss": "0.044495598202912726", 31 | "Root Mean Squared Error p90 Loss": "0.039098884894069964", 32 | "Quantile p10 Loss": "0.017485148358211435", 33 | "Quantile p50 Loss": "0.0419765934049694", 34 | "Quantile p90 Loss": "0.020519211905835673", 35 | "LogCosh p10 Loss": "0.03635449294866819", 36 | "LogCosh p50 Loss": "0.043583865646608655", 37 | "LogCosh p90 Loss": "0.04626212874915689", 38 | "Huber Delta 0.5 p10 Loss": "0.03754627051463181", 39 | "Huber Delta 0.5 p50 Loss": "0.04053489932253964", 40 | "Huber Delta 0.5 p90 Loss": "0.04121816078822414", 41 | "Root Mean Squared Logarithmic Error p10 Loss": "0.040913331749046065", 42 | "Root Mean Squared Logarithmic Error p50 Loss": "0.3326591104746543", 43 | "Root Mean Squared Logarithmic Error p90 Loss": "0.5300992075211683" 44 | } --------------------------------------------------------------------------------