├── ICTSP-MultiTask
├── .gitignore
├── README.md
├── configs
│   ├── pretrain_configs_finetune.json
│   └── pretrain_configs_sequential.json
├── data_provider
│   ├── data_factory.py
│   ├── data_loader.py
│   ├── ictsp_dataloader.py
│   ├── ictsp_tokenizer.py
│   └── ts_feature.py
├── dataset_finetune
│   └── ETTh2
│       └── ETTh2.csv
├── dataset_test
│   └── ETTh2.csv
├── exp
│   ├── exp_basic.py
│   └── exp_icpretrain.py
├── icfinetune.sh
├── icpretrain.sh
├── inference.ipynb
├── layers
│   └── NanoGPTBlock.py
├── models
│   └── ICPretrain.py
├── requirements.txt
├── run_longExp.py
├── ts_generation
│   ├── Ode_generator.py
│   └── TS_DS_Generator.py
└── utils
    ├── metrics.py
    ├── scientific_report.py
    ├── timefeatures.py
    └── tools.py
├── ICTSP
├── .gitattributes
├── README.md
├── data_provider
│   ├── data_factory.py
│   └── data_loader.py
├── exp
│   ├── exp_basic.py
│   ├── exp_main.py
│   └── exp_stat.py
├── layers
│   ├── AutoCorrelation.py
│   ├── Autoformer_EncDec.py
│   ├── Conv_Blocks.py
│   ├── Embed.py
│   ├── PatchTST_backbone.py
│   ├── PatchTST_backbone_aip.py
│   ├── PatchTST_layers.py
│   ├── RevIN.py
│   ├── SelfAttention_Family.py
│   └── Transformer_EncDec.py
├── models
│   ├── Autoformer.py
│   ├── DLinear.py
│   ├── FITS.py
│   ├── ICTSP.py
│   ├── Informer.py
│   ├── Linear.py
│   ├── NBeats.py
│   ├── NLinear.py
│   ├── PatchTST.py
│   ├── Stat_models.py
│   ├── TiDE.py
│   ├── TimesNet.py
│   ├── Transformer.py
│   └── iTransformer.py
├── requirements.txt
├── run_longExp.py
├── scripts
│   └── ICTSP
│       ├── ictsp_fewshot005.sh
│       ├── ictsp_fewshot010.sh
│       ├── ictsp_full.sh
│       └── ictsp_zeroshot.sh
└── utils
    ├── masking.py
    ├── metrics.py
    ├── scientific_report.py
    ├── timefeatures.py
    └── tools.py
├── LICENSE
├── README.md
└── figs
    └── ICTSP.png

/ICTSP-MultiTask/.gitignore:
--------------------------------------------------------------------------------
/runs
/results
/test_results
/logs
/dataset
/checkpoints
/.ipynb_checkpoints
nohup.out
result.txt
nohup.txt

--------------------------------------------------------------------------------
/ICTSP-MultiTask/README.md:
--------------------------------------------------------------------------------
## ICTSP - Multi-task Time Series Foundation Model Pretraining

### 1. Install Required Packages

To get started, install the necessary dependencies:

```bash
pip3 install torch
pip3 install -r requirements.txt
```

### 2. Manage Your Pretraining Data

The `ts_generation` folder contains two time series generators:

- `TS_DS_Generator.py`
- `Ode_generator.py`

These can generate multivariate time series data with inter-series dependencies. Alternatively, you can use your own custom data. To do so, place your data files (in `.csv` format by default) in a new folder under `dataset/`.
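For reference, each first-level folder under `dataset/` becomes one data source (for example, `Ode_generator.py` writes its output to `dataset/ODE_pretrain_1/`). A sketch of the expected layout, where `my_source` and `my_series.csv` are hypothetical placeholders for your own data:

```text
dataset/
├── ICL_pretrain_1/            # generated pretraining data
│   └── data_<timestamp>.csv
├── ODE_pretrain_1/            # written by Ode_generator.py
│   └── data_<timestamp>.csv
└── my_source/                 # your own .csv files (hypothetical name)
    └── my_series.csv
```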
Then update the pretraining configuration file to include your new data source, as described in the next section.

### 3. Edit the Training Configuration

Customize the training settings by editing the file `configs/pretrain_configs_sequential.json`.

#### Adjust Data Source Weights

To balance training across different data sources, assign weights to the datasets. Use the first-level folder name under `dataset/` as the data source name. For example, the following configuration sets equal-weight sampling for the data in `dataset/ICL_pretrain_1/` and `dataset/ODE_pretrain_1/`:

```json
"source_weight_reg": {
    "ICL_pretrain_1": 1,
    "ODE_pretrain_1": 1
}
```

#### Configure Task Weights

Enable multi-task time series pretraining by setting weights for each task. For instance:

```json
"token_type_weight_reg": {
    "forecasting": 1,
    "classification": 0,
    "imitation": 0,
    "imputation": 0.2,
    "cropping": 0.2,
    "reflection": 0.2,
    "shifting": 0.2,
    "hyperres": 0.2,
    "statistics": 0,
    "differencing": 0.2,
    "movingavg": 0.2,
    "expsmoothing": 0,
    "decomposition": 0
}
```

### 4. Run the Scripts

After configuring `configs/pretrain_configs_sequential.json`, start the pretraining process by running the script:

```bash
./icpretrain.sh
```

This script uses a single time series data file as the validation and test reference, specified as `data_name="ETTh2.csv"`.

To fine-tune the model on a specific dataset with a fixed (lookback, future) pair after pretraining, run:

```bash
./icfinetune.sh
```

Modify `data_name="ETTh2.csv"` to point to your own dataset file.

For fine-tuning on multiple datasets with flexible (lookback, future) settings, place your datasets in a new folder under `dataset_finetune/`. Then change the following settings in `configs/pretrain_configs_finetune.json`:

```json
"source_weight_reg": {
    "ETTh2": 1
}
...
"use_legacy_dataloader": false,
```

This enables the ICTSP tokenizer and dataloader for multi-task training. A quick way to sanity-check an edited configuration file is shown below.
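The training and inference code load these configs with `json.load` into a `SimpleNamespace` (see `inference.ipynb`), so a minimal check run from the repository root will catch most editing mistakes. This is a sketch for verification only, not part of the provided scripts:

```python
import json
from types import SimpleNamespace

# Parse the edited config the same way the repository code does.
# A syntax slip (e.g., a trailing comma) raises json.JSONDecodeError here;
# note that a duplicated key is NOT reported - the last occurrence silently wins.
with open('configs/pretrain_configs_finetune.json', 'r') as file:
    configs = SimpleNamespace(**json.load(file))

print(configs.use_legacy_dataloader)  # False enables the ICTSP tokenizer/dataloader
print(configs.source_weight_reg)      # should list your folders under dataset_finetune/
```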
### 5. Track Your Training

To monitor the training process, use TensorBoard:

```bash
nohup tensorboard --logdir runs --port 6006 --bind_all > tensorb.log 2>&1 &
```

--------------------------------------------------------------------------------
/ICTSP-MultiTask/configs/pretrain_configs_finetune.json:
--------------------------------------------------------------------------------
{
    "stage": "dev",
    "scale": true,
    "shorter_lookback_for_finetuning": 104,
    "shorter_future_for_finetuning": 4,
    "force_legacy_lookback_for_inference": 104,
    "force_training_set_split_rate": 0.729,
    "number_of_targets": 53,
    "max_L_I": 4096,
    "lookback": 768,
    "future": 768,
    "randomized_training_flag": true,
    "random_series_shuffle": false,
    "force_resize_time_series_to_size_limit": false,
    "zero_padding_to_hard_token_limit": true,
    "soft_token_limit": 2048,
    "hard_token_limit": 2048,
    "sampling_step": 2,
    "max_channel_vocab_size": 4096,
    "max_position_vocab_size": 4096,
    "max_source_vocab_size": 4096,
    "max_tag_vocab_size": 4096,
    "n_series_registrar": 0,
    "cls_max_random_number_of_targets": 8,
    "root_path": "dataset_finetune",
    "training_soft_series_limit": 1024,
    "db_path": "data_sources_finetune.db",
    "current_ds_type": "both",
    "source_weight_cls": {
        "ETTh2": 0
    },
    "token_type_weight_cls": {
        "forecasting": 0,
        "classification": 1,
        "imitation": 0,
        "imputation": 0,
        "cropping": 0,
        "reflection": 0,
        "shifting": 0,
        "hyperres": 0,
        "statistics": 0,
        "differencing": 0,
        "movingavg": 0,
        "expsmoothing": 0,
        "decomposition": 0
    },
    "source_weight_reg": {
        "ETTh2": 1
    },
    "token_type_weight_reg": {
        "forecasting": 1,
        "classification": 0,
        "imitation": 0,
        "imputation": 0.0,
        "cropping": 0.0,
        "reflection": 0.0,
        "shifting": 0.0,
        "hyperres": 0.0,
        "statistics": 0,
        "differencing": 0.0,
        "movingavg": 0.0,
        "expsmoothing": 0,
        "decomposition": 0
    },
    "ds_len": 4000,
    "batch_size": 8,
    "num_workers": 12,
    "rebuild_every": 96000,
    "d_model": 768,
    "n_layers": 12,
    "n_heads": 12,
    "mlp_ratio": 4,
    "dropout": 0.0,
    "enable_task_embedding": true,
    "use_legacy_dataloader": true,
    "tokenization_mode": "sequential",
    "compile": true
}
--------------------------------------------------------------------------------
/ICTSP-MultiTask/configs/pretrain_configs_sequential.json:
--------------------------------------------------------------------------------
{
    "stage": "dev",
    "max_L_I": 4096,
    "lookback": 768,
    "future": 768,
    "randomized_training_flag": true,
    "random_series_shuffle": true,
    "force_resize_time_series_to_size_limit": false,
    "zero_padding_to_hard_token_limit": true,
    "force_max_number_of_series": 1024,
    "soft_token_limit": 2048,
    "hard_token_limit": 2048,
    "sampling_step": 2,
    "max_channel_vocab_size": 4096,
    "max_position_vocab_size": 4096,
    "max_source_vocab_size": 4096,
    "max_tag_vocab_size": 4096,
    "number_of_targets": 0,
    "n_series_registrar": 0,
    "cls_max_random_number_of_targets": 8,
    "root_path": "dataset",
    "training_soft_series_limit": 1024,
    "db_path": "data_sources.db",
    "current_ds_type": "both",
    "source_weight_cls": {
        "ICL_pretrain_1":
0, 27 | "ODE_pretrain_1": 0 28 | }, 29 | "token_type_weight_cls": { 30 | "forecasting": 0, 31 | "classification": 1, 32 | "imitation": 0, 33 | "imputation": 0, 34 | "cropping": 0, 35 | "reflection": 0, 36 | "shifting": 0, 37 | "hyperres": 0, 38 | "statistics": 0, 39 | "differencing": 0, 40 | "movingavg": 0, 41 | "expsmoothing": 0, 42 | "decomposition": 0 43 | }, 44 | "source_weight_reg": { 45 | "ICL_pretrain_1": 1, 46 | "ODE_pretrain_1": 1 47 | }, 48 | "token_type_weight_reg": { 49 | "forecasting": 1, 50 | "classification": 0, 51 | "imitation": 0, 52 | "imputation": 0.2, 53 | "cropping": 0.2, 54 | "reflection": 0.2, 55 | "shifting": 0.2, 56 | "hyperres": 0.2, 57 | "statistics": 0, 58 | "differencing": 0.2, 59 | "movingavg": 0.2, 60 | "expsmoothing": 0, 61 | "decomposition": 0 62 | }, 63 | "ds_len": 4000, 64 | "batch_size": 4, 65 | "num_workers": 6, 66 | "rebuild_every": 96000, 67 | "d_model": 768, 68 | "n_layers": 12, 69 | "n_heads": 12, 70 | "mlp_ratio": 4, 71 | "dropout": 0.0, 72 | "enable_task_embedding": true, 73 | "use_legacy_dataloader": false, 74 | "tokenization_mode": "sequential", 75 | "compile": true 76 | } -------------------------------------------------------------------------------- /ICTSP-MultiTask/data_provider/data_factory.py: -------------------------------------------------------------------------------- 1 | from data_provider.data_loader import Dataset_ETT_hour, Dataset_ETT_minute, Dataset_Custom, Dataset_Pred 2 | from torch.utils.data import DataLoader 3 | from itertools import chain 4 | import copy 5 | 6 | import random 7 | 8 | import re 9 | 10 | def read_file_as_string(file_path): 11 | try: 12 | with open(file_path, 'r', encoding='utf-8') as file: 13 | content = file.read() 14 | return content 15 | except FileNotFoundError: 16 | return f"Error: The file {file_path} does not exist." 
17 | except Exception as e: 18 | return f"An error occurred: {e}" 19 | 20 | 21 | def random_sample_dataset(lst, num_elements=8): 22 | if len(lst) < num_elements: 23 | return lst 24 | else: 25 | return random.sample(lst, num_elements) 26 | 27 | def extract_data(s): 28 | pattern = re.compile(r'(?:\[(\d+\.\d+)(?:,(\d+))?(?:,(\d+))?\])?([^,]+)') 29 | ratios = [] 30 | pred_lens = [] 31 | batchsize = [] 32 | filenames = [] 33 | 34 | for match in pattern.findall(s): 35 | ratio = match[0] 36 | pred_len = match[1] 37 | batch_size = match[2] 38 | filename = match[3].strip() 39 | 40 | ratios.append(float(ratio) if ratio else 1.0) 41 | pred_lens.append(int(pred_len) if pred_len else 0) 42 | batchsize.append(int(batch_size) if batch_size else 0) 43 | filenames.append(filename) 44 | 45 | return ratios, pred_lens, batchsize, filenames 46 | 47 | class RandomizedDataLoaderIter: 48 | def __init__(self, dataloaders, sample_len=4096): 49 | self.dataloaders = [iter(dl) for dl in dataloaders] 50 | self.active_iters = list(range(len(self.dataloaders))) 51 | self.sample_len = sample_len 52 | self.sample_counter = 0 53 | 54 | def __iter__(self): 55 | return self 56 | 57 | def __next__(self): 58 | if self.sample_counter >= self.sample_len: 59 | raise StopIteration 60 | 61 | while self.active_iters: 62 | choice = random.choice(self.active_iters) 63 | try: 64 | data = next(self.dataloaders[choice]) 65 | self.sample_counter += 1 66 | return data 67 | except StopIteration: 68 | self.active_iters.remove(choice) 69 | 70 | raise StopIteration 71 | 72 | def __len__(self): 73 | return self.sample_len 74 | 75 | data_dict = { 76 | 'ETTh1': Dataset_ETT_hour, 77 | 'ETTh2': Dataset_ETT_hour, 78 | 'ETTm1': Dataset_ETT_minute, 79 | 'ETTm2': Dataset_ETT_minute, 80 | 'custom': Dataset_Custom, 81 | } 82 | 83 | def data_provider(args, flag, element_wise_shuffle=True): 84 | if args.data_path in ['pretrain_config', 'baseline_config']: 85 | args.data_path = read_file_as_string(f"{args.data_path}.txt") 86 | if ',' not in args.data_path: 87 | return data_provider_subset(args, flag) 88 | else: 89 | #data_names = args.data_path.split(',') 90 | data_few_shot_ratios, pred_lens, batchsizes, data_names = extract_data(args.data_path) 91 | pred_lens = [i if i != 0 else args.pred_len for i in pred_lens] 92 | batchsizes = [i if i != 0 else args.batch_size for i in batchsizes] 93 | 94 | mapping = [['custom' if 'ETT' not in dn else dn.split('.')[0], dn, r, bs, pdl] for dn, r, bs, pdl in zip(data_names, data_few_shot_ratios, batchsizes, pred_lens)] 95 | data_sets, data_loaders = [], [] 96 | temp_args = copy.deepcopy(args) 97 | if flag in ['val', 'test']: 98 | for d in [mapping[-1]]: 99 | temp_args.data, temp_args.data_path, temp_args.few_shot_ratio, temp_args.batch_size, temp_args.batch_size_test, temp_args.pred_len = d[0], d[1], d[2], d[3], d[3], d[4] 100 | ds, dl = data_provider_subset(temp_args, flag) 101 | data_sets.append(ds) 102 | data_loaders.append(dl) 103 | return data_sets[-1], data_loaders[-1] 104 | 105 | current_datasets = random_sample_dataset(mapping[0:-1]) if args.transfer_learning else random_sample_dataset(mapping) 106 | current_datasets = current_datasets + [mapping[-2]] 107 | for d in current_datasets: 108 | temp_args.data, temp_args.data_path, temp_args.few_shot_ratio, temp_args.batch_size, temp_args.batch_size_test, temp_args.pred_len = d[0], d[1], d[2], d[3], d[3], d[4] 109 | ds, dl = data_provider_subset(temp_args, flag) 110 | data_sets.append(ds) 111 | data_loaders.append(dl) 112 | # For validation set and test set 113 | # For 
training set 114 | # if args.transfer_learning: 115 | # return chain.from_iterable(data_sets[0:-1]), chain.from_iterable(data_loaders[0:-1]) if not element_wise_shuffle else RandomizedDataLoaderIter(data_loaders[0:-1]) 116 | # else: 117 | return chain.from_iterable(data_sets), chain.from_iterable(data_loaders) if not element_wise_shuffle else RandomizedDataLoaderIter(data_loaders) 118 | 119 | def data_provider_subset(args, flag): 120 | Data = data_dict[args.data] 121 | timeenc = 0 if args.embed != 'timeF' else 1 122 | 123 | if flag == 'test': 124 | shuffle_flag = False 125 | drop_last = False 126 | batch_size = args.batch_size if args.batch_size_test == 0 else args.batch_size_test 127 | freq = args.freq 128 | elif flag == 'pred': 129 | shuffle_flag = False 130 | drop_last = False 131 | batch_size = 1 132 | freq = args.freq 133 | Data = Dataset_Pred 134 | else: 135 | shuffle_flag = True 136 | drop_last = False 137 | batch_size = args.batch_size 138 | freq = args.freq 139 | 140 | data_set = Data( 141 | root_path=args.root_path, 142 | data_path=args.data_path, 143 | flag=flag, 144 | size=[args.seq_len, args.label_len, args.pred_len], 145 | features=args.features, 146 | target=args.target, 147 | timeenc=timeenc, 148 | scale=args.scale, 149 | freq=freq, 150 | train_ratio=args.train_ratio, 151 | test_ratio=args.test_ratio, 152 | percent=int(getattr(args, 'few_shot_ratio', 1)*100), 153 | #force_fair_comparison_for_extendable_and_extended_input_length=args.model == 'ICFormer' and flag == 'test' 154 | do_forecasting=getattr(args, 'do_forecasting', False), 155 | min_max_scaling=getattr(args, 'min_max_scaling', False) 156 | ) 157 | print(flag, len(data_set)) 158 | data_loader = DataLoader( 159 | data_set, 160 | batch_size=batch_size, 161 | shuffle=shuffle_flag, 162 | num_workers=args.num_workers, 163 | drop_last=drop_last, 164 | pin_memory=True) 165 | return data_set, data_loader 166 | -------------------------------------------------------------------------------- /ICTSP-MultiTask/data_provider/ts_feature.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import pandas as pd 3 | from scipy.stats import kurtosis, skew 4 | from statsmodels.tsa.stattools import adfuller 5 | from scipy.signal import find_peaks 6 | from scipy.stats import entropy 7 | from statsmodels.tsa.stattools import acf, pacf 8 | from tsfresh.feature_extraction import feature_calculators as fc 9 | import pywt 10 | import random 11 | from statsmodels.tsa.seasonal import STL 12 | 13 | from scipy.signal import lfilter 14 | 15 | def clip_output(): 16 | min_value = np.random.randint(-25, -15) 17 | max_value = np.random.randint(15, 25) 18 | def decorator(func): 19 | def wrapper(*args, **kwargs): 20 | output = func(*args, **kwargs) 21 | return np.clip(output, min_value, max_value) 22 | return wrapper 23 | return decorator 24 | 25 | # 1. Calculate mean 26 | def mean(series): 27 | return np.mean(series, axis=1) 28 | 29 | # 2. Calculate median 30 | def median(series): 31 | return np.median(series, axis=1) 32 | 33 | # 3. Calculate maximum value 34 | def max_value(series): 35 | return np.max(series, axis=1) 36 | 37 | # 4. Calculate minimum value 38 | def min_value(series): 39 | return np.min(series, axis=1) 40 | 41 | # 5. Calculate range 42 | def range_value(series): 43 | return np.ptp(series, axis=1) 44 | 45 | # 6. Calculate interquartile range 46 | def iqr(series): 47 | return np.percentile(series, 75, axis=1) - np.percentile(series, 25, axis=1) 48 | 49 | # 7. 
Calculate standard deviation 50 | def std(series): 51 | return np.std(series, axis=1) 52 | 53 | # 8. Calculate variance 54 | def var(series): 55 | return np.var(series, axis=1) 56 | 57 | # 9. Calculate skewness 58 | def skewness(series): 59 | return skew(series, axis=1) 60 | 61 | # 10. Calculate kurtosis 62 | def kurt(series): 63 | return kurtosis(series, axis=1) 64 | 65 | # 11. Calculate mean absolute deviation 66 | def mad(series): 67 | return np.mean(np.abs(series - np.mean(series, axis=1, keepdims=True)), axis=1) 68 | 69 | # 12. Calculate mean absolute percentage error 70 | def mape(series): 71 | diffs = np.abs(np.diff(series, axis=1)) 72 | return np.mean(diffs / (series[:, :-1] + 1e-5), axis=1) * 100 73 | 74 | # 13. Calculate symmetric mean absolute percentage error 75 | def smape(series): 76 | diffs = np.abs(np.diff(series, axis=1)) 77 | return 2.0 * np.mean(diffs / (series[:, :-1] + series[:, 1:] + 1e-5), axis=1) * 100 78 | 79 | # 14. Calculate maximum of rolling mean 80 | def rolling_mean_max(series, window): 81 | return np.max(pd.DataFrame(series).T.rolling(window=window).mean().dropna().T.values, axis=1) 82 | 83 | # 15. Calculate minimum of rolling mean 84 | def rolling_mean_min(series, window): 85 | return np.min(pd.DataFrame(series).T.rolling(window=window).mean().dropna().T.values, axis=1) 86 | 87 | # 16. Calculate maximum of rolling standard deviation 88 | def rolling_std_max(series, window): 89 | return np.max(pd.DataFrame(series).T.rolling(window=window).std().dropna().T.values, axis=1) 90 | 91 | # 17. Calculate minimum of rolling standard deviation 92 | def rolling_std_min(series, window): 93 | return np.min(pd.DataFrame(series).T.rolling(window=window).std().dropna().T.values, axis=1) 94 | 95 | # 18. Calculate p-value of ADF (Augmented Dickey-Fuller) unit root test 96 | def adf_test_p_value(series): 97 | return np.array([adfuller(s)[1] for s in series]) 98 | 99 | # 19. Calculate maximum of autocorrelation function (ACF) 100 | def max_acf(series, nlags): 101 | return np.array([np.max(acf(s, nlags=nlags)) for s in series]) 102 | 103 | # 20. Calculate maximum of partial autocorrelation function (PACF) 104 | def max_pacf(series, nlags): 105 | return np.array([np.max(pacf(s, nlags=nlags)) for s in series]) 106 | 107 | # 21. Calculate number of periodic peaks 108 | def num_peaks(series, distance, prominence): 109 | return np.array([len(find_peaks(s, distance=distance, prominence=prominence)[0]) for s in series]) 110 | 111 | # 22. Calculate signal energy 112 | def signal_energy(series): 113 | return np.sum(series ** 2, axis=1) / (series.shape[1] ** 2) 114 | 115 | # 23. Calculate Shannon entropy of the signal 116 | def shannon_entropy(series): 117 | return np.array([entropy(pd.Series(s).value_counts(normalize=True)) for s in series]) / (series.shape[1]) 118 | 119 | # 24. Calculate signal percentiles 120 | def percentile(series, q): 121 | return np.percentile(series, q, axis=1) 122 | 123 | # 25. Calculate sum of absolute values of autocorrelation function 124 | def autocorrelation_sum(series, lag): 125 | return np.array([np.sum(np.abs(np.correlate(s, s, mode='full')[len(s) - 1:-lag])) for s in series]) 126 | 127 | # 26. Calculate mean absolute change of the signal 128 | def mean_abs_change(series): 129 | return np.mean(np.abs(np.diff(series, axis=1)), axis=1) 130 | 131 | # 27. Calculate mean squared change of the signal 132 | def mean_squared_change(series): 133 | return np.mean(np.diff(series, axis=1) ** 2, axis=1) ** (1/2) 134 | 135 | # 28. 
Calculate the energy (sum of squares) of the signal, scaled by squared length
def abs_energy(series):
    return np.array([fc.abs_energy(s) for s in series]) / (series.shape[1] ** 2)

# 29. Calculate energy of the signal after wavelet transform
def wavelet_energy(series, wavelet):
    return np.array([np.sum(np.array(pywt.wavedec(s, wavelet)[0]) ** 2) for s in series])

# 30. Calculate energy of the signal after dropping the finest wavelet detail coefficients (rough denoising)
def wavelet_denoised_energy(series, wavelet):
    return np.array([np.sum(np.array(pywt.waverec(pywt.wavedec(s, wavelet)[:-1] + [None] * len(pywt.wavedec(s, wavelet)[-1]), wavelet)) ** 2) for s in series])

# 31. Calculate time reversibility of the signal
def time_reversibility(series):
    return np.array([fc.time_reversal_asymmetry_statistic(s, lag=1) for s in series])

# 32. Calculate autoregression coefficients of the signal (unused above; note that tsfresh names this calculator `ar_coefficient`)
def ar_coeffs(series, n_coeffs):
    return np.array([fc.autoregression_coefficients(s, n_coeffs) for s in series])

# 33. Calculate the longest strikes below and above the mean, normalized by series length (`memory` is unused)
def long_short_term_memory(series, memory):
    below_mean = np.array([fc.longest_strike_below_mean(s) for s in series])
    above_mean = np.array([fc.longest_strike_above_mean(s) for s in series])
    return below_mean / (series.shape[1]), above_mean / (series.shape[1])

# 34. Calculate mean absolute change scaled by series length (note: not, as the name suggests, time above the mean)
def time_above_mean(series):
    return np.array([fc.mean_abs_change(s) / len(s) for s in series])

@clip_output()
def extract_time_series_features(series, indices=None, randomized=True, get_indices=False):
    nlags = 10
    wavelet = 'db4'
    distance = 10
    prominence = 1
    n_coeffs = 5
    memory = 1

    lstm_0, lstm_1 = long_short_term_memory(series, memory)
    features = {
        'mean': mean(series),
        'median': median(series),
        'max': max_value(series),
        'min': min_value(series),
        'range': range_value(series),
        'iqr': iqr(series),
        'std': std(series),
        'var': var(series),
        'skewness': skewness(series),
        'kurtosis': kurt(series),
        'mad': mad(series),
        'mape': mape(series),
        'smape': smape(series),
        'rolling_mean_max_3': rolling_mean_max(series, 3),
        'rolling_mean_min_3': rolling_mean_min(series, 3),
        'rolling_std_max_3': rolling_std_max(series, 3),
        'rolling_std_min_3': rolling_std_min(series, 3),
        #'adf_test_p_value': adf_test_p_value(series),
        #'max_acf': max_acf(series, nlags),
        #'max_pacf': max_pacf(series, nlags),
        'num_peaks': num_peaks(series, distance, prominence),
        'signal_energy': signal_energy(series),
        'shannon_entropy': shannon_entropy(series),
        'percentile_25': percentile(series, 25),
        'percentile_75': percentile(series, 75),
        #'autocorrelation_sum': autocorrelation_sum(series, nlags),
        'mean_abs_change': mean_abs_change(series),
        'mean_squared_change': mean_squared_change(series),
        'abs_energy': abs_energy(series),
        # 'wavelet_energy': wavelet_energy(series, wavelet),
        # 'wavelet_denoised_energy': wavelet_denoised_energy(series, wavelet),
        # 'time_reversibility': time_reversibility(series),
        # **ar_coeffs(series, n_coeffs),
        'longest_strike_below_mean': lstm_0,
        'longest_strike_above_mean': lstm_1,
        'time_above_mean': time_above_mean(series)
    }
    features = list(features.values())
if randomized: 215 | if indices is None: 216 | indices = range(len(features)) 217 | indices = random.sample(indices, int(np.ceil(len(features)*np.random.rand()))) 218 | if get_indices: 219 | return indices 220 | features = [features[i] for i in indices] 221 | return np.stack(features).T 222 | 223 | def differencing(series, order): 224 | diff_series = np.diff(series, n=order, axis=1) 225 | return diff_series 226 | 227 | def moving_average(series, window): 228 | ma_series = np.zeros((series.shape[0], series.shape[1] - window + 1)) 229 | for i in range(series.shape[0]): 230 | ma_series[i, :] = np.convolve(series[i, :], np.ones(window)/window, mode='valid') 231 | return ma_series 232 | 233 | def exponential_smoothing(series, alpha): 234 | b = [alpha] 235 | a = [1, alpha - 1] 236 | smoothed_series = np.zeros(series.shape) 237 | for i in range(series.shape[0]): 238 | smoothed_series[i, :] = lfilter(b, a, series[i, :]) 239 | return smoothed_series 240 | 241 | def stl_decomposition(series, period, seasonal=7): 242 | decomposed = [] 243 | for i in range(series.shape[0]): 244 | ts_series = pd.Series(series[i, :]) 245 | stl = STL(ts_series, period=period, seasonal=seasonal) 246 | result = stl.fit() 247 | decomposed.append(np.stack([result.trend.values, result.seasonal.values, result.resid.values])) 248 | return np.concatenate(decomposed, axis=0) 249 | 250 | @clip_output() 251 | def time_series_transformation(data, method="differencing"): 252 | C, L = data.shape 253 | if method == "differencing": 254 | order = random.randint(1, min(3, L)) # Random order for differencing 255 | return differencing(data, order) 256 | elif method == "movingavg": 257 | window = random.randint(2, min(40, L)) # Random window size for moving average 258 | return moving_average(data, window) 259 | elif method == "expsmoothing": 260 | alpha = random.uniform(0.02, 0.98) # Random alpha for exponential smoothing 261 | return exponential_smoothing(data, alpha) 262 | elif method == "decomposition": 263 | period = random.randint(2, min(12, L)) # Random period for STL decomposition 264 | seasonal = random.choice([3, 5, 7, 9, 11]) # Random seasonal parameter for STL decomposition 265 | return stl_decomposition(data, period, seasonal) -------------------------------------------------------------------------------- /ICTSP-MultiTask/exp/exp_basic.py: -------------------------------------------------------------------------------- 1 | import os 2 | import torch 3 | import numpy as np 4 | 5 | 6 | class Exp_Basic(object): 7 | def __init__(self, args): 8 | self.args = args 9 | self.device = self._acquire_device() 10 | self.model = self._build_model().to(self.device) 11 | 12 | def _build_model(self): 13 | raise NotImplementedError 14 | return None 15 | 16 | def _acquire_device(self): 17 | if self.args.use_gpu: 18 | os.environ["CUDA_VISIBLE_DEVICES"] = str( 19 | self.args.gpu) if not self.args.use_multi_gpu else self.args.devices 20 | device = torch.device('cuda:{}'.format(self.args.gpu)) 21 | print('Use GPU: cuda:{}'.format(self.args.gpu)) 22 | else: 23 | device = torch.device('cpu') 24 | print('Use CPU') 25 | return device 26 | 27 | def _get_data(self): 28 | pass 29 | 30 | def vali(self): 31 | pass 32 | 33 | def train(self): 34 | pass 35 | 36 | def test(self): 37 | pass 38 | -------------------------------------------------------------------------------- /ICTSP-MultiTask/icfinetune.sh: -------------------------------------------------------------------------------- 1 | if [ ! 
-d "./logs" ]; then 2 | mkdir ./logs 3 | fi 4 | 5 | log_folder=Finetune 6 | 7 | if [ ! -d "./logs/"$log_folder ]; then 8 | mkdir ./logs/$log_folder 9 | fi 10 | 11 | resume="resume.pth" 12 | icpretrain_config_path="./configs/pretrain_configs_finetune.json" 13 | 14 | root_path_name=./dataset_test/ 15 | features=M 16 | data_name="ETTh2.csv" # only for validation and testing 17 | data_type=ETTh2 18 | 19 | patience=200 20 | seq_len=2048 21 | pred_len=96 22 | random_seed=2024 23 | test_every=20 24 | plot_every=1 25 | plot_full_details=0 26 | scale=1 27 | iterative_prediction=0 28 | 29 | model_name=ICTSPretrain 30 | batch_size=8 31 | gradient_accumulation=16 32 | batch_size_test=8 33 | learning_rate=0.00001 34 | max_grad_norm=1 35 | 36 | train_ratio=0.7 37 | test_ratio=0.2 38 | 39 | number_of_targets=0 40 | 41 | python -u run_longExp.py \ 42 | --is_training 1 \ 43 | --model_type 'IC' \ 44 | --root_path $root_path_name \ 45 | --data_path $data_name \ 46 | --model_id $model_name'_'$random_seed'_'$data_name'_'$data_type'_'$seq_len'_'$pred_len \ 47 | --model $model_name \ 48 | --data $data_type \ 49 | --features $features \ 50 | --seq_len $seq_len \ 51 | --label_len 0 \ 52 | --pred_len $pred_len \ 53 | --scale $scale \ 54 | --des 'Exp' \ 55 | --patience $patience \ 56 | --test_every $test_every \ 57 | --random_seed $random_seed \ 58 | --max_grad_norm $max_grad_norm \ 59 | --train_epochs 2000 \ 60 | --itr 1 \ 61 | --batch_size $batch_size \ 62 | --batch_size_test $batch_size_test \ 63 | --plot_every $plot_every \ 64 | --learning_rate $learning_rate \ 65 | --plot_full_details $plot_full_details \ 66 | --devices "0,1" \ 67 | --num_workers 6 \ 68 | --resume $resume \ 69 | --use_amp \ 70 | --train_ratio $train_ratio \ 71 | --test_ratio $test_ratio \ 72 | --icpretrain_config_path $icpretrain_config_path \ 73 | --number_of_targets $number_of_targets \ 74 | --iterative_prediction $iterative_prediction >logs/$log_folder'/'$model_name'_'$random_seed'_'$data_name'_'$data_type'_'$seq_len'_'$pred_len -------------------------------------------------------------------------------- /ICTSP-MultiTask/icpretrain.sh: -------------------------------------------------------------------------------- 1 | if [ ! -d "./logs" ]; then 2 | mkdir ./logs 3 | fi 4 | 5 | log_folder=Pretrain 6 | 7 | if [ ! 
-d "./logs/"$log_folder ]; then 8 | mkdir ./logs/$log_folder 9 | fi 10 | 11 | resume="none" #"resume.pth" 12 | icpretrain_config_path="./configs/pretrain_configs_sequential.json" 13 | 14 | root_path_name=./dataset_test/ 15 | features=M 16 | data_name="ETTh2.csv" # only for validation and testing 17 | data_type=ETTh2 18 | 19 | patience=1800 20 | seq_len=2048 21 | pred_len=96 22 | random_seed=2024 23 | test_every=1500 24 | plot_every=1 25 | plot_full_details=0 26 | scale=1 27 | iterative_prediction=0 28 | 29 | model_name=ICTSPretrain 30 | batch_size=8 31 | gradient_accumulation=16 32 | batch_size_test=8 33 | learning_rate=0.0006 34 | max_grad_norm=1 35 | 36 | 37 | python -u run_longExp.py \ 38 | --is_training 1 \ 39 | --model_type 'IC' \ 40 | --root_path $root_path_name \ 41 | --data_path $data_name \ 42 | --model_id $model_name'_'$random_seed'_'$data_name'_'$data_type'_'$seq_len'_'$pred_len \ 43 | --model $model_name \ 44 | --data $data_type \ 45 | --features $features \ 46 | --seq_len $seq_len \ 47 | --label_len 0 \ 48 | --pred_len $pred_len \ 49 | --scale $scale \ 50 | --des 'Exp' \ 51 | --patience $patience \ 52 | --test_every $test_every \ 53 | --random_seed $random_seed \ 54 | --max_grad_norm $max_grad_norm \ 55 | --train_epochs 2000 \ 56 | --itr 1 \ 57 | --batch_size $batch_size \ 58 | --batch_size_test $batch_size_test \ 59 | --plot_every $plot_every \ 60 | --learning_rate $learning_rate \ 61 | --plot_full_details $plot_full_details \ 62 | --devices "0,1" \ 63 | --num_workers 6 \ 64 | --resume $resume \ 65 | --use_amp \ 66 | --icpretrain_config_path $icpretrain_config_path \ 67 | --iterative_prediction $iterative_prediction >logs/$log_folder'/'$model_name'_'$random_seed'_'$data_name'_'$data_type'_'$seq_len'_'$pred_len -------------------------------------------------------------------------------- /ICTSP-MultiTask/inference.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "id": "83d9bf36-213f-4741-a6c2-ce606a616dba", 7 | "metadata": {}, 8 | "outputs": [], 9 | "source": [ 10 | "# pip install plotly\n", 11 | "\n", 12 | "import torch\n", 13 | "import numpy as np\n", 14 | "from data_provider.data_factory import data_provider_subset as data_provider\n", 15 | "from data_provider.ictsp_dataloader import ForecastingDatasetWrapper\n", 16 | "from types import SimpleNamespace\n", 17 | "from models.ICPretrain import ICPretrain\n", 18 | "import matplotlib.pyplot as plt\n", 19 | "from torch.utils.data import DataLoader\n", 20 | "from tqdm import tqdm\n", 21 | "import os\n", 22 | "\n", 23 | "import json\n", 24 | "with open('configs/pretrain_configs_sequential.json', 'r') as file:\n", 25 | " config_data = json.load(file)\n", 26 | "icpretrain_configs = SimpleNamespace(**config_data)\n", 27 | "icpretrain_configs.stage = \"inference\"\n", 28 | "weight_path = './pt_model_2048_96_current.pth'\n", 29 | "model_name = 'ICTSP_FT'\n", 30 | "\n", 31 | "def nested_collate_fn(batch):\n", 32 | " elem = batch[0]\n", 33 | " elem_type = type(elem)\n", 34 | " \n", 35 | " if isinstance(elem, np.ndarray):\n", 36 | " tensor_batch = list(map(torch.Tensor, batch))\n", 37 | " return torch.nested.nested_tensor(tensor_batch)\n", 38 | " elif isinstance(elem, float):\n", 39 | " return torch.tensor(batch, dtype=torch.float)\n", 40 | " elif isinstance(elem, int):\n", 41 | " return torch.LongTensor(batch)\n", 42 | " elif isinstance(elem, str):\n", 43 | " return batch\n", 44 | " elif isinstance(elem, 
tuple):\n", 45 | " transposed = zip(*batch)\n", 46 | " return [nested_collate_fn(samples) for samples in transposed]\n", 47 | " elif isinstance(elem, list):\n", 48 | " transposed = zip(*batch)\n", 49 | " return [nested_collate_fn(samples) for samples in transposed]\n", 50 | " else:\n", 51 | " print(batch)\n", 52 | " raise TypeError(f\"batch must contain tensors, numpy arrays or numbers; found {elem_type}\")\n", 53 | "\n", 54 | "def get_dataset(seq_len=2048, pred_len=720, data_type='ETTh2', root_path='./dataset/', data_path='ETTh2.csv', train_ratio=0.7, test_ratio=0.2, \n", 55 | " flag='test', do_forecasting=False, batch_size=8, force_lookback=52):\n", 56 | " data_args = SimpleNamespace(embed='timeF', \n", 57 | " batch_size=batch_size,\n", 58 | " batch_size_test=batch_size,\n", 59 | " freq='h',\n", 60 | " data=data_type,\n", 61 | " root_path=root_path,\n", 62 | " data_path=data_path,\n", 63 | " seq_len=seq_len,\n", 64 | " label_len=0,\n", 65 | " pred_len=pred_len,\n", 66 | " features='M',\n", 67 | " target='OT',\n", 68 | " scale=1,\n", 69 | " train_ratio=train_ratio,\n", 70 | " test_ratio=test_ratio,\n", 71 | " num_workers=0,\n", 72 | " do_forecasting=do_forecasting\n", 73 | " )\n", 74 | " dataset, dataloader = data_provider(data_args, flag=flag)\n", 75 | " ds = ForecastingDatasetWrapper(dataset, icpretrain_configs)\n", 76 | " ds.force_legacy_lookback_for_inference = force_lookback\n", 77 | " dl = DataLoader(\n", 78 | " ds,\n", 79 | " batch_size=batch_size,\n", 80 | " shuffle=False,\n", 81 | " drop_last=False,\n", 82 | " pin_memory=False,\n", 83 | " collate_fn=nested_collate_fn)\n", 84 | " return ds, dl\n", 85 | "\n", 86 | "def preprocess_data(data, device):\n", 87 | " # flipped both in the tokenizer and the preprocessing here, just to ensure the \"float to the right\" alignment format properly applied on the channel dimension\n", 88 | " task_id = data[8].int().to(device, non_blocking=True)\n", 89 | " token_x_part = torch.nested.to_padded_tensor(data[0].float().to(device, non_blocking=True), 0).flip(1)\n", 90 | " y_true = torch.nested.to_padded_tensor(data[1].float().to(device, non_blocking=True), 0).flip(1) if task_id[0] != 1 else torch.nested.to_padded_tensor(data[1].int().to(device, non_blocking=True), 0).flip(1) # C L or C\n", 91 | " token_y_part = torch.nested.to_padded_tensor(data[2].float().to(device, non_blocking=True), 0).flip(1) if task_id[0] != 1 else torch.nested.to_padded_tensor(data[2].int().to(device, non_blocking=True), 0).flip(1)\n", 92 | " channel_label = torch.nested.to_padded_tensor(data[3].int().to(device, non_blocking=True), 0).flip(1)\n", 93 | " position_label = torch.nested.to_padded_tensor(data[4].int().to(device, non_blocking=True), 0).flip(1)\n", 94 | " source_label = torch.nested.to_padded_tensor(data[5].int().to(device, non_blocking=True), 0).flip(1)\n", 95 | " tag_multihot = torch.nested.to_padded_tensor(data[6].float().to(device, non_blocking=True), 0).flip(1)\n", 96 | " y_true_shape = torch.nested.to_padded_tensor(data[7].int().to(device, non_blocking=True), 0)\n", 97 | " return task_id, token_x_part, y_true, token_y_part, channel_label, position_label, source_label, tag_multihot, y_true_shape\n", 98 | "\n", 99 | "from collections import OrderedDict\n", 100 | "\n", 101 | "model = ICPretrain(icpretrain_configs)#.float()\n", 102 | "\n", 103 | "#weight_path = './pt_model_2048_96_current.pth'\n", 104 | "\n", 105 | "state_dict = torch.load(weight_path, map_location='cpu')\n", 106 | "\n", 107 | "new_state_dict = OrderedDict()\n", 108 | "for k, v in 
state_dict.items():\n", 109 | " name = k[7:] if k.startswith('module.') else k # remove `module.` prefix\n", 110 | " new_state_dict[name] = v\n", 111 | "\n", 112 | "model.load_state_dict(new_state_dict) #, strict=False\n", 113 | "\n", 114 | "for attribute in dir(model):\n", 115 | " if isinstance(getattr(model, attribute), torch._dynamo.eval_frame.OptimizedModule):\n", 116 | " setattr(model, attribute, getattr(model, attribute)._orig_mod)\n", 117 | "\n", 118 | "device = torch.device('cuda')\n", 119 | "model = model.to(device)\n", 120 | "model.process_output = True\n", 121 | "model.eval()" 122 | ] 123 | }, 124 | { 125 | "cell_type": "code", 126 | "execution_count": null, 127 | "id": "6c33a788-14f7-40d4-9348-32b6a6f14230", 128 | "metadata": {}, 129 | "outputs": [], 130 | "source": [ 131 | "# path to your inference data\n", 132 | "data_path = 'main.csv'\n", 133 | "\n", 134 | "# seq_len: L_I, force_lookback: L_b, pred_len: L_P\n", 135 | "vali_data, vali_loader = get_dataset(seq_len=416, pred_len=4, data_type='custom', root_path='./dataset/', data_path=data_path, train_ratio=0.87, test_ratio=0.1, \n", 136 | " flag='val', do_forecasting=True, batch_size=8, force_lookback=104)\n", 137 | "test_data, test_loader = get_dataset(seq_len=416, pred_len=4, data_type='custom', root_path='./dataset/', data_path=data_path, train_ratio=0.87, test_ratio=0.1, \n", 138 | " flag='test', do_forecasting=True, batch_size=8, force_lookback=104)\n", 139 | "number_of_targets = 0 # All\n", 140 | "\n", 141 | "def calculation(dataloader):\n", 142 | " preds = []\n", 143 | " trues = []\n", 144 | " number_of_targets = 53\n", 145 | " index = 0\n", 146 | " # x: (L_I, C), y: (L_P, C)\n", 147 | " for index, data in tqdm(enumerate(dataloader)):\n", 148 | " task_id, token_x_part, y_true, token_y_part, channel_label, position_label, source_label, tag_multihot, y_true_shape = preprocess_data(data, device=device)\n", 149 | " with torch.no_grad():\n", 150 | " res = model(token_x_part, token_y_part, channel_label, position_label, source_label, tag_multihot, y_true_shape, task_id)\n", 151 | " res = res.detach().cpu().numpy().transpose((0, 2, 1))[:, :, -number_of_targets:]\n", 152 | " y = y_true.detach().cpu().numpy().transpose((0, 2, 1))[:, :, -number_of_targets:]\n", 153 | " preds.append(res)\n", 154 | " trues.append(y)\n", 155 | " index += 1\n", 156 | "\n", 157 | " preds = np.concatenate(preds, axis=0)\n", 158 | " trues = np.concatenate(trues, axis=0)\n", 159 | " return preds, trues\n", 160 | "\n", 161 | "preds_vali, trues_vali = calculation(vali_loader)\n", 162 | "preds, trues = calculation(test_loader)\n", 163 | "preds.shape" 164 | ] 165 | } 166 | ], 167 | "metadata": { 168 | "kernelspec": { 169 | "display_name": "Python 3 (ipykernel)", 170 | "language": "python", 171 | "name": "python3" 172 | }, 173 | "language_info": { 174 | "codemirror_mode": { 175 | "name": "ipython", 176 | "version": 3 177 | }, 178 | "file_extension": ".py", 179 | "mimetype": "text/x-python", 180 | "name": "python", 181 | "nbconvert_exporter": "python", 182 | "pygments_lexer": "ipython3", 183 | "version": "3.12.3" 184 | } 185 | }, 186 | "nbformat": 4, 187 | "nbformat_minor": 5 188 | } 189 | -------------------------------------------------------------------------------- /ICTSP-MultiTask/layers/NanoGPTBlock.py: -------------------------------------------------------------------------------- 1 | ### From: https://github.com/karpathy/nanoGPT/blob/master/model.py 2 | 3 | import torch 4 | import torch.nn as nn 5 | from torch.nn import functional as F 6 | 7 | class 
LayerNorm(nn.Module): 8 | """ LayerNorm but with an optional bias. PyTorch doesn't support simply bias=False """ 9 | 10 | def __init__(self, ndim, bias): 11 | super().__init__() 12 | self.weight = nn.Parameter(torch.ones(ndim)) 13 | self.bias = nn.Parameter(torch.zeros(ndim)) if bias else None 14 | 15 | def forward(self, input): 16 | return F.layer_norm(input, self.weight.shape, self.weight, self.bias, 1e-5) 17 | 18 | class RMSNorm(torch.nn.Module): 19 | def __init__(self, dim: int, eps: float = 1e-6): 20 | """ 21 | From: https://github.com/meta-llama/llama/blob/main/llama/model.py 22 | Initialize the RMSNorm normalization layer. 23 | 24 | Args: 25 | dim (int): The dimension of the input tensor. 26 | eps (float, optional): A small value added to the denominator for numerical stability. Default is 1e-6. 27 | 28 | Attributes: 29 | eps (float): A small value added to the denominator for numerical stability. 30 | weight (nn.Parameter): Learnable scaling parameter. 31 | 32 | """ 33 | super().__init__() 34 | self.eps = eps 35 | self.weight = nn.Parameter(torch.ones(dim)) 36 | 37 | def _norm(self, x): 38 | """ 39 | Apply the RMSNorm normalization to the input tensor. 40 | 41 | Args: 42 | x (torch.Tensor): The input tensor. 43 | 44 | Returns: 45 | torch.Tensor: The normalized tensor. 46 | 47 | """ 48 | return x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps) 49 | 50 | def forward(self, x): 51 | """ 52 | Forward pass through the RMSNorm layer. 53 | 54 | Args: 55 | x (torch.Tensor): The input tensor. 56 | 57 | Returns: 58 | torch.Tensor: The output tensor after applying RMSNorm. 59 | 60 | """ 61 | output = self._norm(x.float()).type_as(x) 62 | return output * self.weight 63 | 64 | class CausalSelfAttention(nn.Module): 65 | 66 | def __init__(self, config): 67 | super().__init__() 68 | assert config.n_embd % config.n_head == 0 69 | # key, query, value projections for all heads, but in a batch 70 | self.c_attn = nn.Linear(config.n_embd, 3 * config.n_embd, bias=config.bias) 71 | # output projection 72 | self.c_proj = nn.Linear(config.n_embd, config.n_embd, bias=config.bias) 73 | # regularization 74 | self.attn_dropout = nn.Dropout(config.dropout) 75 | self.resid_dropout = nn.Dropout(config.dropout) 76 | self.n_head = config.n_head 77 | self.n_embd = config.n_embd 78 | self.dropout = config.dropout 79 | # flash attention make GPU go brrrrr but support is only in PyTorch >= 2.0 80 | self.flash = hasattr(torch.nn.functional, 'scaled_dot_product_attention') 81 | self.RoPE = nn.Parameter(rotary_embedding(8192, config.n_embd), requires_grad=False) 82 | if not self.flash: 83 | print("WARNING: using slow attention. 
Flash Attention requires PyTorch >= 2.0")
            # causal mask to ensure that attention is only applied to the left in the input sequence
            self.register_buffer("bias", torch.tril(torch.ones(config.block_size, config.block_size))
                                        .view(1, 1, config.block_size, config.block_size))

    def forward(self, x, attn_mask=None):
        B, T, C = x.size() # batch size, sequence length, embedding dimensionality (n_embd)
        x = apply_rotary_pos_emb(x, self.RoPE, reverse=True)

        # calculate query, key, values for all heads in batch and move head forward to be the batch dim
        q, k, v = self.c_attn(x).split(self.n_embd, dim=2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2) # (B, nh, T, hs)
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2) # (B, nh, T, hs)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2) # (B, nh, T, hs)

        # causal self-attention; Self-attend: (B, nh, T, hs) x (B, nh, hs, T) -> (B, nh, T, T)
        if self.flash:
            # efficient attention using Flash Attention CUDA kernels
            y = torch.nn.functional.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask, dropout_p=self.dropout if self.training else 0, is_causal=False)
        else:
            # manual implementation of attention
            # NOTE: unlike the flash path above (is_causal=False), this fallback applies the causal mask and ignores attn_mask
            att = (q @ k.transpose(-2, -1)) * (1.0 / k.size(-1) ** 0.5)  # 'math' is never imported in this file, so use ** 0.5 instead of math.sqrt
            att = att.masked_fill(self.bias[:,:,:T,:T] == 0, float('-inf'))
            att = F.softmax(att, dim=-1)
            att = self.attn_dropout(att)
            y = att @ v # (B, nh, T, T) x (B, nh, T, hs) -> (B, nh, T, hs)
        y = y.transpose(1, 2).contiguous().view(B, T, C) # re-assemble all head outputs side by side

        # output projection
        y = self.resid_dropout(self.c_proj(y))
        return y

class MLP(nn.Module):

    def __init__(self, config):
        super().__init__()
        self.c_fc = nn.Linear(config.n_embd, 4 * config.n_embd, bias=config.bias)
        self.gelu = nn.GELU()
        self.c_proj = nn.Linear(4 * config.n_embd, config.n_embd, bias=config.bias)
        self.dropout = nn.Dropout(config.dropout)

    def forward(self, x):
        x = self.c_fc(x)
        x = self.gelu(x)
        x = self.c_proj(x)
        x = self.dropout(x)
        return x

def rotary_embedding(max_len, dim):
    inv_freq = 1.0 / (10000 ** (torch.arange(0, dim, 2).float() / dim))
    position = torch.arange(0, max_len).unsqueeze(1)
    sinusoid_inp = torch.ger(position.squeeze(), inv_freq).float()
    return torch.cat((sinusoid_inp.sin(), sinusoid_inp.cos()), dim=-1)

def apply_rotary_pos_emb(x, sincos, reverse=False):
    B, seq_len, dim = x.shape
    if reverse:
        sincos = sincos[0:seq_len].flip(0)
    else:
        sincos = sincos[0:seq_len]
    sin, cos = sincos[:, :dim//2].repeat_interleave(2, dim=-1), sincos[:, dim//2:].repeat_interleave(2, dim=-1)
    return (x * cos) + (torch.roll(x, shifts=1, dims=-1) * sin)

# class AttentionPlacer(nn.Module):
#     def __init__(self, D, H):
#         super(AttentionPlacer, self).__init__()
#         self.D = D
#         self.H = H
#         self.d = D // H
#         #self.evaluator = nn.Linear(self.d, 1)
#         self.evaluator = nn.Sequential(
#             nn.Linear(self.d, 4 * self.d, bias=False),
#             nn.Dropout(0.1),
#             nn.GELU(),
#             nn.Linear(4 * self.d, 1, bias=False),
#         )
#         self.activation = nn.Softmax(dim=-1)
#         self.eps = 1e-6
#         self.RoPE = nn.Parameter(rotary_embedding(8192, D), requires_grad=False)

#     def forward(self,
X): 164 | # B, L, D = X.shape 165 | 166 | # # Reshape to (B, L, H, d) 167 | # X = X.reshape(B, L, self.H, self.d) 168 | 169 | # # Permute to (B, H, L, d) 170 | # X = X.permute(0, 2, 1, 3) 171 | 172 | # # Apply linear layer to (B, H, L, d) and get (B, H, L, 1) 173 | # X_score = self.evaluator(X) # (B, H, L, 1) 174 | # X_score = self.activation(X_score.squeeze(-1)) # (B, H, L, 1) 175 | 176 | # X_prob_max, X_indices = torch.sort(X_score, dim=-1) 177 | # X_prob_max, X_indices = X_prob_max.unsqueeze(-1).expand(-1, -1, -1, self.d), X_indices.unsqueeze(-1).expand(-1, -1, -1, self.d) 178 | 179 | # X_sorted = torch.gather(X, 2, X_indices) # (B, H, L, d) 180 | 181 | # X_prob_max = X_prob_max + self.eps 182 | # X_sorted = X_sorted * (X_prob_max / X_prob_max.detach()) # (B, H, L, d) 183 | 184 | # X_sorted = X_sorted.permute(0, 2, 1, 3).reshape(B, L, self.H*self.d) # (B, L, D) 185 | # X_sorted = apply_rotary_pos_emb(X_sorted, self.RoPE, reverse=True) 186 | 187 | # return X_sorted 188 | 189 | # class Block(nn.Module): 190 | 191 | # def __init__(self, config): 192 | # super().__init__() 193 | # self.ln_1 = torch.compile(RMSNorm(config.n_embd)) 194 | # self.attn1 = AttentionPlacer(config.n_embd, config.n_head) 195 | # #self.attn2 = torch.compile(CausalSelfAttention(config)) 196 | # self.ln_2 = torch.compile(RMSNorm(config.n_embd)) 197 | # self.ln_3 = torch.compile(RMSNorm(config.n_embd)) 198 | # self.mlp = torch.compile(MLP(config)) 199 | 200 | # def forward(self, x, src_mask=None): 201 | # x = self.attn1(self.ln_1(x)) 202 | # #x = x + self.attn2(self.ln_2(x), attn_mask=src_mask) 203 | # x = x + self.mlp(self.ln_3(x)) 204 | # return x 205 | 206 | class AttentionPlacer(nn.Module): 207 | def __init__(self, D, H): 208 | super(AttentionPlacer, self).__init__() 209 | self.D = D 210 | self.H = H 211 | self.d = D // H 212 | #self.evaluator = nn.Linear(self.d, 1) 213 | self.evaluator = nn.Sequential( 214 | nn.Linear(self.d, 4 * self.d, bias=False), 215 | nn.Dropout(0.1), 216 | nn.GELU(), 217 | nn.Linear(4 * self.d, self.d+1, bias=False), 218 | ) 219 | self.activation = nn.Softmax(dim=-1) 220 | self.eps = 1e-6 221 | self.RoPE = nn.Parameter(rotary_embedding(8192, D), requires_grad=False) 222 | 223 | def forward(self, X): 224 | B, L, D = X.shape 225 | 226 | X = apply_rotary_pos_emb(X, self.RoPE, reverse=True) 227 | 228 | # Reshape to (B, L, H, d) 229 | X = X.reshape(B, L, self.H, self.d) 230 | 231 | # Permute to (B, H, L, d) 232 | X = X.permute(0, 2, 1, 3) 233 | 234 | # Apply linear layer to (B, H, L, d) and get (B, H, L, 1) 235 | X_forward = self.evaluator(X) # (B, H, L, 1) 236 | X = X_forward[:, :, :, 0:-1] 237 | X_score = X_forward[:, :, :, [-1]] 238 | X_score = self.activation(X_score.squeeze(-1)) # (B, H, L, 1) 239 | 240 | X_prob_max, X_indices = torch.sort(X_score, dim=-1) 241 | X_prob_max, X_indices = X_prob_max.unsqueeze(-1).expand(-1, -1, -1, self.d), X_indices.unsqueeze(-1).expand(-1, -1, -1, self.d) 242 | 243 | X_sorted = torch.gather(X, 2, X_indices) # (B, H, L, d) 244 | 245 | X_prob_max = X_prob_max + self.eps 246 | X_sorted = X_sorted * (X_prob_max / X_prob_max.detach()) # (B, H, L, d) 247 | 248 | X_sorted = X_sorted.permute(0, 2, 1, 3).reshape(B, L, self.H*self.d) # (B, L, D) 249 | X_sorted = apply_rotary_pos_emb(X_sorted, self.RoPE, reverse=False) 250 | 251 | return X_sorted 252 | 253 | class Block(nn.Module): 254 | 255 | def __init__(self, config): 256 | super().__init__() 257 | self.ln_1 = RMSNorm(config.n_embd) 258 | #self.attn1 = AttentionPlacer(config.n_embd, config.n_head) 259 | self.attn = 
CausalSelfAttention(config) 260 | self.ln_2 = RMSNorm(config.n_embd) 261 | self.ln_3 = RMSNorm(config.n_embd) 262 | self.mlp = MLP(config) 263 | 264 | def forward(self, x, src_mask=None): 265 | # x = x + self.attn1(self.ln_1(x)) 266 | x = x + self.attn(self.ln_2(x), attn_mask=src_mask) 267 | x = x + self.mlp(self.ln_3(x)) 268 | return x -------------------------------------------------------------------------------- /ICTSP-MultiTask/requirements.txt: -------------------------------------------------------------------------------- 1 | einops==0.8.0 2 | fbm==0.3.0 3 | h5py==3.12.1 4 | hdf5plugin==5.0.0 5 | lightgbm==4.5.0 6 | matplotlib==3.9.2 7 | matplotlib-inline==0.1.7 8 | numpy==2.0.2 9 | objgraph==3.6.2 10 | pandas==2.2.3 11 | pmdarima==2.0.4 12 | ptflops==0.7.4 13 | pyarrow==18.0.0 14 | pynvml==11.5.3 15 | PyWavelets==1.7.0 16 | scipy==1.14.1 17 | seaborn==0.13.2 18 | statsforecast==1.7.8 19 | statsmodels==0.14.4 20 | sympy==1.13.1 21 | tables==3.10.1 22 | tensorboard==2.18.0 23 | tensorboard-data-server==0.7.2 24 | tqdm==4.67.0 25 | tsfresh==0.20.3 26 | tvm==1.0.0 27 | -------------------------------------------------------------------------------- /ICTSP-MultiTask/ts_generation/Ode_generator.py: -------------------------------------------------------------------------------- 1 | import sympy as sp 2 | import random 3 | import numpy as np 4 | import matplotlib.pyplot as plt 5 | from scipy.integrate import odeint 6 | import pandas as pd 7 | import time 8 | from tqdm import tqdm 9 | import os 10 | import string 11 | from concurrent.futures import ProcessPoolExecutor, as_completed 12 | from multiprocessing import Pool, cpu_count 13 | import signal 14 | from datetime import datetime 15 | 16 | def generate_random_string(length): 17 | characters = string.ascii_letters + string.digits 18 | random_string = ''.join(random.choice(characters) for i in range(length)) 19 | return random_string 20 | 21 | def random_function(x, t): 22 | f1, f2 = random.uniform(1, 10), random.uniform(1, 10) 23 | functions = [ 24 | lambda x, t: x**2, 25 | lambda x, t: sp.sin(f1 * x), 26 | lambda x, t: sp.cos(f1 * x), 27 | lambda x, t: sp.sin(f1 * t), 28 | lambda x, t: sp.cos(f1 * t), 29 | lambda x, t: sp.exp(sp.cos(f1 * t)), 30 | lambda x, t: sp.sin(f1 * t) * sp.cos(f2 * x), 31 | lambda x, t: sp.cos(f1 * t) * sp.sin(f2 * x), 32 | lambda x, t: sp.sin(f1 * x) * sp.cos(f2 * t), 33 | lambda x, t: sp.sin(f1 * x) * sp.sin(f2 * t), 34 | lambda x, t: sp.sin(f1 * x + f2 * t), 35 | lambda x, t: sp.cos(f1 * x + f2 * t), 36 | lambda x, t: sp.exp(sp.cos(f1 * t + f2 * x)), 37 | ] 38 | func = random.choice(functions) 39 | return func(x, t) 40 | 41 | def generate_random_odes(num_variables): 42 | """ Generate random ODE systems based on symbolic logic to ensure equation validity """ 43 | t = sp.symbols('t') 44 | variables = sp.symbols(f'x0:{num_variables}') 45 | functions = [sp.Function(f'x{i}')(t) for i in range(num_variables)] 46 | equations = [] 47 | 48 | for i in range(num_variables): 49 | equation = 0 50 | num_terms = random.randint(1, 4) # Increase the number of terms 51 | for _ in range(num_terms): 52 | coeff = random.uniform(-0.2, 0.2) # Strictly limit the range of coefficients 53 | random_var = random.choice(variables) 54 | math_func = random_function(random_var, t) 55 | operation = random.choice([lambda a, b: a + b, lambda a, b: a * b, lambda a, b: a - b]) 56 | equation = operation(equation, coeff * math_func) 57 | 58 | # Add cross-coupling terms, periodic terms, and external forcing terms to ensure the system has complex 
dynamic behavior 59 | cross_coupling = random.uniform(-0.2, 0.2) * sum(random.uniform(-0.1, 0.1) * v for v in variables if v != variables[i]) 60 | frequency_a = random.uniform(1, 150) 61 | frequency_b = random.uniform(1, 15) 62 | external_forcing = random.uniform(-1, 1) * (sp.sin(frequency_a * t) + sp.cos(frequency_b * t)) 63 | damping = random.uniform(0.01, 0.1) * variables[i] 64 | equation += cross_coupling + external_forcing - damping 65 | 66 | derivative = sp.Derivative(functions[i], t) 67 | ode = sp.Eq(derivative, equation) 68 | equations.append(ode) 69 | return equations 70 | 71 | def solve_odes(odes, num_variables, t_max=100, steps=5000): 72 | t = sp.symbols('t') 73 | variables = sp.symbols(f'x0:{num_variables}') 74 | rhs = [sp.lambdify((t,) + tuple(variables), ode.rhs, 'numpy') for ode in odes] 75 | 76 | def system(y, t): 77 | return [rhs_i(t, *y) for rhs_i in rhs] 78 | 79 | t_values = np.linspace(0, t_max, steps) 80 | y0 = np.random.uniform(-1, 1, size=num_variables) # Random initial conditions 81 | solution = odeint(system, y0, t_values, rtol=1e-6, atol=1e-8) 82 | 83 | output = np.nan_to_num(np.clip(solution, -np.random.randint(7, 15), np.random.randint(7, 15))) 84 | 85 | return output 86 | 87 | def visualize_odes_plot(ys, t_max=None, steps=None): 88 | t_values = np.linspace(0, t_max, steps) if t_max is not None and steps is not None else np.arange(ys.shape[0]) 89 | # Plot the results 90 | plt.figure(figsize=(12, 8)) 91 | for i in range(ys.shape[-1]): 92 | plt.plot(t_values, ys[:, i], label=f'x{i}') # Limit the output range 93 | plt.title('ODE System Solution') 94 | plt.xlabel('Time t') 95 | plt.ylabel('Variables') 96 | plt.legend() 97 | plt.show() 98 | 99 | def random_partition_with_limits(X, L, U): 100 | if L > U: 101 | raise ValueError("L should not be greater than U") 102 | if X < L: 103 | return [X] 104 | parts = [] 105 | while X > 0: 106 | if X <= U: 107 | part = X 108 | else: 109 | part = random.randint(L, min(U, X - L)) 110 | parts.append(part) 111 | X -= part 112 | return parts 113 | 114 | def random_ode_generator(): 115 | # Ensure the number of equations and the number of variables are consistent 116 | partition_min = 5 117 | partition_max = 30 118 | t_max = np.random.randint(15, 400) 119 | n_steps = np.random.randint(1024, 8192*2) 120 | num_vars = np.random.randint(3, 100) 121 | 122 | n_ode_groups = random_partition_with_limits(num_vars, partition_min, partition_max) 123 | print('t={}, step={}, n_vars={}, group={}'.format(t_max, n_steps, num_vars, n_ode_groups), flush=True) 124 | 125 | generated_ode_groups = [] 126 | 127 | start_time = time.time() 128 | for n in n_ode_groups: 129 | random_odes = generate_random_odes(n) 130 | # for ode in random_odes: 131 | # print(ode) 132 | res = solve_odes(random_odes, n, t_max=t_max, steps=n_steps) 133 | generated_ode_groups.append(res) 134 | 135 | generated_ode_groups = np.concatenate(generated_ode_groups, axis=1) 136 | 137 | end_time = time.time() 138 | execution_time = end_time - start_time 139 | # analysis.append([t_max, n_steps, num_vars, n_ode_groups, execution_time]) 140 | return generated_ode_groups 141 | 142 | def transform_to_ts_df(x): 143 | length = x.shape[0] 144 | dates = pd.date_range(start="2001-01-01", periods=length, freq='min') 145 | data = {'date': dates} 146 | df = pd.DataFrame(x) 147 | cols = df.columns.tolist() 148 | df['date'] = dates 149 | df = df[['date'] + cols] 150 | return df.fillna(0).round(4) 151 | 152 | def generate_and_write_ode_csv(to_path): 153 | timestamp = datetime.now().strftime('%Y%m%d%H%M%S') + 
'_' + generate_random_string(6) 154 | csv_name = f"data_{timestamp}.csv" 155 | generated_ode_data_df = transform_to_ts_df(random_ode_generator()) 156 | generated_ode_data_df.to_csv(f"{to_path}/{csv_name}", index=False) 157 | 158 | def run_with_timeout(func, args=(), kwargs={}, timeout_duration=300): 159 | def handler(signum, frame): 160 | raise TimeoutError() 161 | 162 | signal.signal(signal.SIGALRM, handler) 163 | signal.alarm(timeout_duration) 164 | try: 165 | result = func(*args, **kwargs) 166 | except TimeoutError: 167 | result = None 168 | finally: 169 | signal.alarm(0) 170 | 171 | return result 172 | 173 | dataset_path = "../dataset" 174 | to_path = f"{dataset_path}/ODE_pretrain_1" 175 | os.makedirs(to_path, exist_ok=True) 176 | 177 | # Use ProcessPoolExecutor to parallelize the generation process 178 | num_tasks = 10000 179 | n_cpus = cpu_count() 180 | with ProcessPoolExecutor(max(1, n_cpus - 4)) as executor: 181 | futures = [executor.submit(run_with_timeout, generate_and_write_ode_csv, (to_path,), {}, 300) for _ in range(num_tasks)] 182 | for future in tqdm(as_completed(futures), total=num_tasks): 183 | try: 184 | future.result() 185 | except Exception as e: 186 | print(f"Task failed with exception: {e}") 187 | -------------------------------------------------------------------------------- /ICTSP-MultiTask/utils/metrics.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | 4 | def RSE(pred, true): 5 | return np.sqrt(np.sum((true - pred) ** 2)) / np.sqrt(np.sum((true - true.mean()) ** 2)) 6 | 7 | 8 | def CORR(pred, true): 9 | u = ((true - true.mean(0)) * (pred - pred.mean(0))).sum(0) 10 | d = np.sqrt(((true - true.mean(0)) ** 2 * (pred - pred.mean(0)) ** 2).sum(0)) 11 | d += 1e-12 12 | return 0.01*(u / d).mean(-1) 13 | 14 | 15 | def MAE(pred, true): 16 | return np.mean(np.abs(pred - true)) 17 | 18 | 19 | def MSE(pred, true): 20 | return np.mean((pred - true) ** 2) 21 | 22 | 23 | def RMSE(pred, true): 24 | return np.sqrt(MSE(pred, true)) 25 | 26 | 27 | def MAPE(pred, true): 28 | return np.mean(np.abs((pred - true) / true)) 29 | 30 | 31 | def MSPE(pred, true): 32 | return np.mean(np.square((pred - true) / true)) 33 | 34 | 35 | def metric(pred, true): 36 | mae = MAE(pred, true) 37 | mse = MSE(pred, true) 38 | rmse = RMSE(pred, true) 39 | mape = MAPE(pred, true) 40 | mspe = MSPE(pred, true) 41 | rse = RSE(pred, true) 42 | corr = CORR(pred, true) 43 | 44 | return mae, mse, rmse, mape, mspe, rse, corr 45 | -------------------------------------------------------------------------------- /ICTSP-MultiTask/utils/timefeatures.py: -------------------------------------------------------------------------------- 1 | from typing import List 2 | 3 | import numpy as np 4 | import pandas as pd 5 | from pandas.tseries import offsets 6 | from pandas.tseries.frequencies import to_offset 7 | 8 | 9 | class TimeFeature: 10 | def __init__(self): 11 | pass 12 | 13 | def __call__(self, index: pd.DatetimeIndex) -> np.ndarray: 14 | pass 15 | 16 | def __repr__(self): 17 | return self.__class__.__name__ + "()" 18 | 19 | 20 | class SecondOfMinute(TimeFeature): 21 | """Second of minute encoded as value between [-0.5, 0.5]""" 22 | 23 | def __call__(self, index: pd.DatetimeIndex) -> np.ndarray: 24 | return index.second / 59.0 - 0.5 25 | 26 | 27 | class MinuteOfHour(TimeFeature): 28 | """Minute of hour encoded as value between [-0.5, 0.5]""" 29 | 30 | def __call__(self, index: pd.DatetimeIndex) -> np.ndarray: 31 | return index.minute / 59.0 - 0.5 32 | 33 |
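# Note: every TimeFeature below maps a pd.DatetimeIndex to floats in [-0.5, 0.5]; for example, MinuteOfHour()(pd.date_range('2001-01-01', periods=2, freq='min')) returns array([-0.5, -0.48305085]).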
34 | class HourOfDay(TimeFeature): 35 | """Hour of day encoded as value between [-0.5, 0.5]""" 36 | 37 | def __call__(self, index: pd.DatetimeIndex) -> np.ndarray: 38 | return index.hour / 23.0 - 0.5 39 | 40 | 41 | class DayOfWeek(TimeFeature): 42 | """Day of week encoded as value between [-0.5, 0.5]""" 43 | 44 | def __call__(self, index: pd.DatetimeIndex) -> np.ndarray: 45 | return index.dayofweek / 6.0 - 0.5 46 | 47 | 48 | class DayOfMonth(TimeFeature): 49 | """Day of month encoded as value between [-0.5, 0.5]""" 50 | 51 | def __call__(self, index: pd.DatetimeIndex) -> np.ndarray: 52 | return (index.day - 1) / 30.0 - 0.5 53 | 54 | 55 | class DayOfYear(TimeFeature): 56 | """Day of year encoded as value between [-0.5, 0.5]""" 57 | 58 | def __call__(self, index: pd.DatetimeIndex) -> np.ndarray: 59 | return (index.dayofyear - 1) / 365.0 - 0.5 60 | 61 | 62 | class MonthOfYear(TimeFeature): 63 | """Month of year encoded as value between [-0.5, 0.5]""" 64 | 65 | def __call__(self, index: pd.DatetimeIndex) -> np.ndarray: 66 | return (index.month - 1) / 11.0 - 0.5 67 | 68 | 69 | class WeekOfYear(TimeFeature): 70 | """Week of year encoded as value between [-0.5, 0.5]""" 71 | 72 | def __call__(self, index: pd.DatetimeIndex) -> np.ndarray: 73 | return (index.isocalendar().week - 1) / 52.0 - 0.5 74 | 75 | 76 | def time_features_from_frequency_str(freq_str: str) -> List[TimeFeature]: 77 | """ 78 | Returns a list of time features that will be appropriate for the given frequency string. 79 | Parameters 80 | ---------- 81 | freq_str 82 | Frequency string of the form [multiple][granularity] such as "12H", "5min", "1D" etc. 83 | """ 84 | 85 | features_by_offsets = { 86 | offsets.YearEnd: [], 87 | offsets.QuarterEnd: [MonthOfYear], 88 | offsets.MonthEnd: [MonthOfYear], 89 | offsets.Week: [DayOfMonth, WeekOfYear], 90 | offsets.Day: [DayOfWeek, DayOfMonth, DayOfYear], 91 | offsets.BusinessDay: [DayOfWeek, DayOfMonth, DayOfYear], 92 | offsets.Hour: [HourOfDay, DayOfWeek, DayOfMonth, DayOfYear], 93 | offsets.Minute: [ 94 | MinuteOfHour, 95 | HourOfDay, 96 | DayOfWeek, 97 | DayOfMonth, 98 | DayOfYear, 99 | ], 100 | offsets.Second: [ 101 | SecondOfMinute, 102 | MinuteOfHour, 103 | HourOfDay, 104 | DayOfWeek, 105 | DayOfMonth, 106 | DayOfYear, 107 | ], 108 | } 109 | 110 | offset = to_offset(freq_str) 111 | 112 | for offset_type, feature_classes in features_by_offsets.items(): 113 | if isinstance(offset, offset_type): 114 | return [cls() for cls in feature_classes] 115 | 116 | supported_freq_msg = f""" 117 | Unsupported frequency {freq_str} 118 | The following frequencies are supported: 119 | Y - yearly 120 | alias: A 121 | M - monthly 122 | W - weekly 123 | D - daily 124 | B - business days 125 | H - hourly 126 | T - minutely 127 | alias: min 128 | S - secondly 129 | """ 130 | raise RuntimeError(supported_freq_msg) 131 | 132 | 133 | def time_features(dates, freq='h'): 134 | return np.vstack([feat(dates) for feat in time_features_from_frequency_str(freq)]) 135 | -------------------------------------------------------------------------------- /ICTSP-MultiTask/utils/tools.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | import matplotlib.pyplot as plt 4 | import time 5 | 6 | plt.switch_backend('agg') 7 | 8 | 9 | def adjust_learning_rate(optimizer, scheduler, epoch, args, printout=True): 10 | # lr = args.learning_rate * (0.2 ** (epoch // 2)) 11 | if args.lradj == 'type1': 12 | lr_adjust = {epoch: args.learning_rate * (0.5 ** ((epoch - 1) 
// 1))} 13 | elif args.lradj == 'type2': 14 | lr_adjust = { 15 | 2: 5e-5, 4: 1e-5, 6: 5e-6, 8: 1e-6, 16 | 10: 5e-7, 15: 1e-7, 20: 5e-8 17 | } 18 | elif args.lradj == 'type3': 19 | lr_adjust = {epoch: args.learning_rate if epoch < 3 else args.learning_rate * (0.9985 ** ((epoch - 3) // 1))} 20 | elif args.lradj == 'constant': 21 | lr_adjust = {epoch: args.learning_rate} 22 | elif args.lradj == '3': 23 | lr_adjust = {epoch: args.learning_rate if epoch < 10 else args.learning_rate*0.1} 24 | elif args.lradj == '4': 25 | lr_adjust = {epoch: args.learning_rate if epoch < 15 else args.learning_rate*0.1} 26 | elif args.lradj == '5': 27 | lr_adjust = {epoch: args.learning_rate if epoch < 25 else args.learning_rate*0.1} 28 | elif args.lradj == '6': 29 | lr_adjust = {epoch: args.learning_rate if epoch < 5 else args.learning_rate*0.1} 30 | elif args.lradj == 'TST': 31 | lr_adjust = {epoch: scheduler.get_last_lr()[0]} 32 | 33 | if epoch in lr_adjust.keys(): 34 | lr = lr_adjust[epoch] 35 | for index, param_group in enumerate(optimizer.param_groups): 36 | if index == 0: 37 | param_group['lr'] = lr 38 | else: 39 | param_group['lr'] = lr*1 40 | if printout: print('Updating learning rate to {}'.format(lr)) 41 | 42 | 43 | class EarlyStopping: 44 | def __init__(self, patience=7, verbose=False, delta=0, configs=None): 45 | self.patience = patience 46 | self.verbose = verbose 47 | self.counter = 0 48 | self.best_score = None 49 | self.early_stop = False 50 | self.val_loss_min = np.inf 51 | self.delta = delta 52 | self.configs = configs 53 | 54 | def __call__(self, val_loss, model, path): 55 | score = -val_loss 56 | if self.best_score is None: 57 | self.best_score = score 58 | self.save_checkpoint(val_loss, model, path) 59 | elif score < self.best_score + self.delta: 60 | self.counter += 1 61 | print(f'EarlyStopping counter: {self.counter} out of {self.patience}') 62 | if self.counter >= self.patience: 63 | self.early_stop = True 64 | else: 65 | self.best_score = score 66 | self.save_checkpoint(val_loss, model, path) 67 | self.counter = 0 68 | 69 | def save_checkpoint(self, val_loss, model, path): 70 | if self.verbose: 71 | print(f'Validation loss decreased ({self.val_loss_min:.6f} --> {val_loss:.6f}). 
Saving model ...') 72 | torch.save(getattr(model, '_orig_mod', model).state_dict(), path + '/' + 'checkpoint.pth') 73 | torch.save(getattr(model, '_orig_mod', model).state_dict(), f'pt_model_{self.configs.seq_len}_{self.configs.pred_len}.pth') 74 | self.val_loss_min = val_loss 75 | 76 | 77 | class dotdict(dict): 78 | """dot.notation access to dictionary attributes""" 79 | __getattr__ = dict.get 80 | __setattr__ = dict.__setitem__ 81 | __delattr__ = dict.__delitem__ 82 | 83 | 84 | class StandardScaler(): 85 | def __init__(self, mean, std): 86 | self.mean = mean 87 | self.std = std 88 | 89 | def transform(self, data): 90 | return (data - self.mean) / self.std 91 | 92 | def inverse_transform(self, data): 93 | return (data * self.std) + self.mean 94 | 95 | 96 | def visual(true, preds=None, name='./pic/test.pdf'): 97 | """ 98 | Results visualization 99 | """ 100 | plt.figure() 101 | plt.plot(true, label='GroundTruth', linewidth=2) 102 | if preds is not None: 103 | plt.plot(preds, label='Prediction', linewidth=2) 104 | plt.legend() 105 | plt.savefig(name, bbox_inches='tight') 106 | 107 | def test_params_flop(model,x_shape): 108 | """ 109 | If you want to test the FLOPs of a (Trans)former model, you need to give default values to the inputs of model.forward(); the following code can only pass one argument to forward() 110 | """ 111 | model_params = 0 112 | for parameter in model.parameters(): 113 | model_params += parameter.numel() 114 | print('INFO: Trainable parameter count: {:.2f}M'.format(model_params / 1000000.0)) 115 | from ptflops import get_model_complexity_info 116 | with torch.cuda.device(0): 117 | macs, params = get_model_complexity_info(model.cuda(), x_shape, as_strings=True, print_per_layer_stat=True) 118 | # print('Flops:' + flops) 119 | # print('Params:' + params) 120 | print('{:<30} {:<8}'.format('Computational complexity: ', macs)) 121 | print('{:<30} {:<8}'.format('Number of parameters: ', params)) -------------------------------------------------------------------------------- /ICTSP/.gitattributes: -------------------------------------------------------------------------------- 1 | *.csv filter=lfs diff=lfs merge=lfs -text 2 | -------------------------------------------------------------------------------- /ICTSP/README.md: -------------------------------------------------------------------------------- 1 | # In-context Time Series Predictor 2 | 3 | --- 4 | 5 | #### 1. Install Required Packages 6 | 7 | ```bash 8 | pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124 9 | pip3 install -r requirements.txt 10 | ``` 11 | 12 | #### 2. Download Other Datasets (Optional) 13 | 14 | You can use the link provided by [Autoformer](https://drive.google.com/drive/folders/1ZOYpTUa82_jCcxIdTmyr0LXQfvaM9vIy) to download the datasets. 15 | 16 | #### 3. Track the Training 17 | 18 | ```bash 19 | nohup tensorboard --logdir runs --port 6006 --bind_all > tensorb.log 2>&1 & 20 | ``` 21 | 22 | #### 4. Run the Scripts 23 | 24 | Run the training scripts under the `./scripts` folder, for example:
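The script names below are taken from this repository's `scripts/ICTSP/` folder; by their names they cover the full-data, few-shot, and zero-shot settings:

```bash
bash ./scripts/ICTSP/ictsp_full.sh        # full-data training
bash ./scripts/ICTSP/ictsp_fewshot010.sh  # 10% few-shot training
bash ./scripts/ICTSP/ictsp_fewshot005.sh  # 5% few-shot training
bash ./scripts/ICTSP/ictsp_zeroshot.sh    # zero-shot transfer
```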
25 | -------------------------------------------------------------------------------- /ICTSP/data_provider/data_factory.py: -------------------------------------------------------------------------------- 1 | from data_provider.data_loader import Dataset_ETT_hour, Dataset_ETT_minute, Dataset_Custom, Dataset_Pred 2 | from torch.utils.data import DataLoader 3 | from itertools import chain 4 | import copy 5 | 6 | import random 7 | 8 | import re 9 | 10 | 11 | def random_sample_dataset(lst, num_elements=16): 12 | if len(lst) < num_elements: 13 | return lst 14 | else: 15 | return random.sample(lst, num_elements) 16 | 17 | def extract_data(s): 18 | pattern = re.compile(r'(?:\[(\d+\.\d+)(?:,(\d+))?(?:,(\d+))?\])?([^,]+)') 19 | ratios = [] 20 | pred_lens = [] 21 | batchsize = [] 22 | filenames = [] 23 | 24 | for match in pattern.findall(s): 25 | ratio = match[0] 26 | pred_len = match[1] 27 | batch_size = match[2] 28 | filename = match[3].strip() 29 | 30 | ratios.append(float(ratio) if ratio else 1.0) 31 | pred_lens.append(int(pred_len) if pred_len else 0) 32 | batchsize.append(int(batch_size) if batch_size else 0) 33 | filenames.append(filename) 34 | 35 | return ratios, pred_lens, batchsize, filenames 36 | 37 | class RandomizedDataLoaderIter: 38 | def __init__(self, dataloaders, sample_len=2000): 39 | self.dataloaders = [iter(dl) for dl in dataloaders] 40 | self.active_iters = list(range(len(self.dataloaders))) 41 | self.sample_len = sample_len 42 | self.sample_counter = 0 43 | 44 | def __iter__(self): 45 | return self 46 | 47 | def __next__(self): 48 | if self.sample_counter >= self.sample_len: 49 | raise StopIteration 50 | 51 | while self.active_iters: 52 | choice = random.choice(self.active_iters) 53 | try: 54 | data = next(self.dataloaders[choice]) 55 | self.sample_counter += 1 56 | return data 57 | except StopIteration: 58 | self.active_iters.remove(choice) 59 | 60 | raise StopIteration 61 | 62 | def __len__(self): 63 | return self.sample_len 64 | 65 | data_dict = { 66 | 'ETTh1': Dataset_ETT_hour, 67 | 'ETTh2': Dataset_ETT_hour, 68 | 'ETTm1': Dataset_ETT_minute, 69 | 'ETTm2': Dataset_ETT_minute, 70 | 'custom': Dataset_Custom, 71 | } 72 | 73 | def data_provider(args, flag, element_wise_shuffle=True): 74 | if ',' not in args.data_path: 75 | return data_provider_subset(args, flag) 76 | else: 77 | #data_names = args.data_path.split(',') 78 | data_few_shot_ratios, pred_lens, batchsizes, data_names = extract_data(args.data_path) 79 | pred_lens = [i if i != 0 else args.pred_len for i in pred_lens] 80 | batchsizes = [i if i != 0 else args.batch_size for i in batchsizes] 81 | 82 | mapping = [['custom' if 'ETT' not in dn else dn.split('.')[0], dn, r, bs, pdl] for dn, r, bs, pdl in zip(data_names, data_few_shot_ratios, batchsizes, pred_lens)] 83 | data_sets, data_loaders = [], [] 84 | temp_args = copy.deepcopy(args) 85 | if flag in ['val', 'test']: 86 | for d in [mapping[-1]]: 87 | temp_args.data, temp_args.data_path, temp_args.few_shot_ratio, temp_args.batch_size, temp_args.batch_size_test, temp_args.pred_len = d[0], d[1], d[2], d[3], d[3], d[4] 88 | ds, dl = data_provider_subset(temp_args, flag) 89 | data_sets.append(ds) 90 | data_loaders.append(dl) 91 | return data_sets[-1], data_loaders[-1] 92 | 93 | current_datasets = random_sample_dataset(mapping[0:-1]) if args.transfer_learning else random_sample_dataset(mapping) 94 | current_datasets = current_datasets + [mapping[-2]] 95 | for d in current_datasets: 96 | temp_args.data, temp_args.data_path, temp_args.few_shot_ratio, temp_args.batch_size, 
temp_args.batch_size_test, temp_args.pred_len = d[0], d[1], d[2], d[3], d[3], d[4] 97 | ds, dl = data_provider_subset(temp_args, flag) 98 | data_sets.append(ds) 99 | data_loaders.append(dl) 100 | # For validation set and test set 101 | # For training set 102 | # if args.transfer_learning: 103 | # return chain.from_iterable(data_sets[0:-1]), chain.from_iterable(data_loaders[0:-1]) if not element_wise_shuffle else RandomizedDataLoaderIter(data_loaders[0:-1]) 104 | # else: 105 | return chain.from_iterable(data_sets), chain.from_iterable(data_loaders) if not element_wise_shuffle else RandomizedDataLoaderIter(data_loaders) 106 | 107 | def data_provider_subset(args, flag): 108 | Data = data_dict[args.data] 109 | timeenc = 0 if args.embed != 'timeF' else 1 110 | 111 | if flag == 'test': 112 | shuffle_flag = False 113 | drop_last = False 114 | batch_size = args.batch_size if args.batch_size_test == 0 else args.batch_size_test 115 | freq = args.freq 116 | elif flag == 'pred': 117 | shuffle_flag = False 118 | drop_last = False 119 | batch_size = 1 120 | freq = args.freq 121 | Data = Dataset_Pred 122 | else: 123 | shuffle_flag = True 124 | drop_last = False 125 | batch_size = args.batch_size 126 | freq = args.freq 127 | 128 | data_set = Data( 129 | root_path=args.root_path, 130 | data_path=args.data_path, 131 | flag=flag, 132 | size=[args.seq_len, args.label_len, args.pred_len], 133 | features=args.features, 134 | target=args.target, 135 | timeenc=timeenc, 136 | scale=args.scale, 137 | freq=freq, 138 | train_ratio=args.train_ratio, 139 | test_ratio=args.test_ratio, 140 | percent=int(getattr(args, 'few_shot_ratio', 1)*100), 141 | force_fair_comparison_for_extendable_and_extended_input_length=args.model == 'ICFormer' and flag == 'test' 142 | ) 143 | print(flag, len(data_set)) 144 | data_loader = DataLoader( 145 | data_set, 146 | batch_size=batch_size, 147 | shuffle=shuffle_flag, 148 | num_workers=args.num_workers, 149 | drop_last=drop_last, 150 | pin_memory=True) 151 | return data_set, data_loader 152 | -------------------------------------------------------------------------------- /ICTSP/exp/exp_basic.py: -------------------------------------------------------------------------------- 1 | import os 2 | import torch 3 | import numpy as np 4 | 5 | 6 | class Exp_Basic(object): 7 | def __init__(self, args): 8 | self.args = args 9 | self.device = self._acquire_device() 10 | self.model = self._build_model().to(self.device) 11 | 12 | def _build_model(self): 13 | raise NotImplementedError 14 | return None 15 | 16 | def _acquire_device(self): 17 | if self.args.use_gpu: 18 | os.environ["CUDA_VISIBLE_DEVICES"] = str( 19 | self.args.gpu) if not self.args.use_multi_gpu else self.args.devices 20 | device = torch.device('cuda:{}'.format(self.args.gpu)) 21 | print('Use GPU: cuda:{}'.format(self.args.gpu)) 22 | else: 23 | device = torch.device('cpu') 24 | print('Use CPU') 25 | return device 26 | 27 | def _get_data(self): 28 | pass 29 | 30 | def vali(self): 31 | pass 32 | 33 | def train(self): 34 | pass 35 | 36 | def test(self): 37 | pass 38 | -------------------------------------------------------------------------------- /ICTSP/layers/AutoCorrelation.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import matplotlib.pyplot as plt 5 | import numpy as np 6 | import math 7 | from math import sqrt 8 | import os 9 | 10 | 11 | class AutoCorrelation(nn.Module): 12 | """ 13 | AutoCorrelation Mechanism with 
the following two phases: 14 | (1) period-based dependencies discovery 15 | (2) time delay aggregation 16 | This block can replace the self-attention family mechanism seamlessly. 17 | """ 18 | def __init__(self, mask_flag=True, factor=1, scale=None, attention_dropout=0.1, output_attention=False): 19 | super(AutoCorrelation, self).__init__() 20 | self.factor = factor 21 | self.scale = scale 22 | self.mask_flag = mask_flag 23 | self.output_attention = output_attention 24 | self.dropout = nn.Dropout(attention_dropout) 25 | 26 | def time_delay_agg_training(self, values, corr): 27 | """ 28 | SpeedUp version of Autocorrelation (a batch-normalization style design) 29 | This is for the training phase. 30 | """ 31 | head = values.shape[1] 32 | channel = values.shape[2] 33 | length = values.shape[3] 34 | # find top k 35 | top_k = int(self.factor * math.log(length)) 36 | mean_value = torch.mean(torch.mean(corr, dim=1), dim=1) 37 | index = torch.topk(torch.mean(mean_value, dim=0), top_k, dim=-1)[1] 38 | weights = torch.stack([mean_value[:, index[i]] for i in range(top_k)], dim=-1) 39 | # update corr 40 | tmp_corr = torch.softmax(weights, dim=-1) 41 | # aggregation 42 | tmp_values = values 43 | delays_agg = torch.zeros_like(values).float() 44 | for i in range(top_k): 45 | pattern = torch.roll(tmp_values, -int(index[i]), -1) 46 | delays_agg = delays_agg + pattern * \ 47 | (tmp_corr[:, i].unsqueeze(1).unsqueeze(1).unsqueeze(1).repeat(1, head, channel, length)) 48 | return delays_agg 49 | 50 | def time_delay_agg_inference(self, values, corr): 51 | """ 52 | SpeedUp version of Autocorrelation (a batch-normalization style design) 53 | This is for the inference phase. 54 | """ 55 | batch = values.shape[0] 56 | head = values.shape[1] 57 | channel = values.shape[2] 58 | length = values.shape[3] 59 | # index init 60 | init_index = torch.arange(length).unsqueeze(0).unsqueeze(0).unsqueeze(0).repeat(batch, head, channel, 1).cuda() 61 | # find top k 62 | top_k = int(self.factor * math.log(length)) 63 | mean_value = torch.mean(torch.mean(corr, dim=1), dim=1) 64 | weights = torch.topk(mean_value, top_k, dim=-1)[0] 65 | delay = torch.topk(mean_value, top_k, dim=-1)[1] 66 | # update corr 67 | tmp_corr = torch.softmax(weights, dim=-1) 68 | # aggregation 69 | tmp_values = values.repeat(1, 1, 1, 2) 70 | delays_agg = torch.zeros_like(values).float() 71 | for i in range(top_k): 72 | tmp_delay = init_index + delay[:, i].unsqueeze(1).unsqueeze(1).unsqueeze(1).repeat(1, head, channel, length) 73 | pattern = torch.gather(tmp_values, dim=-1, index=tmp_delay) 74 | delays_agg = delays_agg + pattern * \ 75 | (tmp_corr[:, i].unsqueeze(1).unsqueeze(1).unsqueeze(1).repeat(1, head, channel, length)) 76 | return delays_agg 77 | 78 | def time_delay_agg_full(self, values, corr): 79 | """ 80 | Standard version of Autocorrelation 81 | """ 82 | batch = values.shape[0] 83 | head = values.shape[1] 84 | channel = values.shape[2] 85 | length = values.shape[3] 86 | # index init 87 | init_index = torch.arange(length).unsqueeze(0).unsqueeze(0).unsqueeze(0).repeat(batch, head, channel, 1).cuda() 88 | # find top k 89 | top_k = int(self.factor * math.log(length)) 90 | weights = torch.topk(corr, top_k, dim=-1)[0] 91 | delay = torch.topk(corr, top_k, dim=-1)[1] 92 | # update corr 93 | tmp_corr = torch.softmax(weights, dim=-1) 94 | # aggregation 95 | tmp_values = values.repeat(1, 1, 1, 2) 96 | delays_agg = torch.zeros_like(values).float() 97 | for i in range(top_k): 98 | tmp_delay = init_index + delay[..., i].unsqueeze(-1) 99 | pattern = 
torch.gather(tmp_values, dim=-1, index=tmp_delay) 100 | delays_agg = delays_agg + pattern * (tmp_corr[..., i].unsqueeze(-1)) 101 | return delays_agg 102 | 103 | def forward(self, queries, keys, values, attn_mask): 104 | B, L, H, E = queries.shape 105 | _, S, _, D = values.shape 106 | if L > S: 107 | zeros = torch.zeros_like(queries[:, :(L - S), :]).float() 108 | values = torch.cat([values, zeros], dim=1) 109 | keys = torch.cat([keys, zeros], dim=1) 110 | else: 111 | values = values[:, :L, :, :] 112 | keys = keys[:, :L, :, :] 113 | 114 | # period-based dependencies 115 | q_fft = torch.fft.rfft(queries.permute(0, 2, 3, 1).contiguous(), dim=-1) 116 | k_fft = torch.fft.rfft(keys.permute(0, 2, 3, 1).contiguous(), dim=-1) 117 | res = q_fft * torch.conj(k_fft) 118 | corr = torch.fft.irfft(res, dim=-1) 119 | 120 | # time delay agg 121 | if self.training: 122 | V = self.time_delay_agg_training(values.permute(0, 2, 3, 1).contiguous(), corr).permute(0, 3, 1, 2) 123 | else: 124 | V = self.time_delay_agg_inference(values.permute(0, 2, 3, 1).contiguous(), corr).permute(0, 3, 1, 2) 125 | 126 | if self.output_attention: 127 | return (V.contiguous(), corr.permute(0, 3, 1, 2)) 128 | else: 129 | return (V.contiguous(), None) 130 | 131 | 132 | class AutoCorrelationLayer(nn.Module): 133 | def __init__(self, correlation, d_model, n_heads, d_keys=None, 134 | d_values=None): 135 | super(AutoCorrelationLayer, self).__init__() 136 | 137 | d_keys = d_keys or (d_model // n_heads) 138 | d_values = d_values or (d_model // n_heads) 139 | 140 | self.inner_correlation = correlation 141 | self.query_projection = nn.Linear(d_model, d_keys * n_heads) 142 | self.key_projection = nn.Linear(d_model, d_keys * n_heads) 143 | self.value_projection = nn.Linear(d_model, d_values * n_heads) 144 | self.out_projection = nn.Linear(d_values * n_heads, d_model) 145 | self.n_heads = n_heads 146 | 147 | def forward(self, queries, keys, values, attn_mask): 148 | B, L, _ = queries.shape 149 | _, S, _ = keys.shape 150 | H = self.n_heads 151 | 152 | queries = self.query_projection(queries).view(B, L, H, -1) 153 | keys = self.key_projection(keys).view(B, S, H, -1) 154 | values = self.value_projection(values).view(B, S, H, -1) 155 | 156 | out, attn = self.inner_correlation( 157 | queries, 158 | keys, 159 | values, 160 | attn_mask 161 | ) 162 | out = out.view(B, L, -1) 163 | 164 | return self.out_projection(out), attn 165 | -------------------------------------------------------------------------------- /ICTSP/layers/Autoformer_EncDec.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | 5 | 6 | class my_Layernorm(nn.Module): 7 | """ 8 | Special designed layernorm for the seasonal part 9 | """ 10 | def __init__(self, channels): 11 | super(my_Layernorm, self).__init__() 12 | self.layernorm = nn.LayerNorm(channels) 13 | 14 | def forward(self, x): 15 | x_hat = self.layernorm(x) 16 | bias = torch.mean(x_hat, dim=1).unsqueeze(1).repeat(1, x.shape[1], 1) 17 | return x_hat - bias 18 | 19 | 20 | class moving_avg(nn.Module): 21 | """ 22 | Moving average block to highlight the trend of time series 23 | """ 24 | def __init__(self, kernel_size, stride): 25 | super(moving_avg, self).__init__() 26 | self.kernel_size = kernel_size 27 | self.avg = nn.AvgPool1d(kernel_size=kernel_size, stride=stride, padding=0) 28 | 29 | def forward(self, x): 30 | # padding on the both ends of time series 31 | front = x[:, 0:1, :].repeat(1, (self.kernel_size - 1) // 2, 
1) 32 | end = x[:, -1:, :].repeat(1, (self.kernel_size - 1) // 2, 1) 33 | x = torch.cat([front, x, end], dim=1) 34 | x = self.avg(x.permute(0, 2, 1)) 35 | x = x.permute(0, 2, 1) 36 | return x 37 | 38 | 39 | class series_decomp(nn.Module): 40 | """ 41 | Series decomposition block 42 | """ 43 | def __init__(self, kernel_size): 44 | super(series_decomp, self).__init__() 45 | self.moving_avg = moving_avg(kernel_size, stride=1) 46 | 47 | def forward(self, x): 48 | moving_mean = self.moving_avg(x) 49 | res = x - moving_mean 50 | return res, moving_mean 51 | 52 | 53 | class EncoderLayer(nn.Module): 54 | """ 55 | Autoformer encoder layer with the progressive decomposition architecture 56 | """ 57 | def __init__(self, attention, d_model, d_ff=None, moving_avg=25, dropout=0.1, activation="relu"): 58 | super(EncoderLayer, self).__init__() 59 | d_ff = d_ff or 4 * d_model 60 | self.attention = attention 61 | self.conv1 = nn.Conv1d(in_channels=d_model, out_channels=d_ff, kernel_size=1, bias=False) 62 | self.conv2 = nn.Conv1d(in_channels=d_ff, out_channels=d_model, kernel_size=1, bias=False) 63 | self.decomp1 = series_decomp(moving_avg) 64 | self.decomp2 = series_decomp(moving_avg) 65 | self.dropout = nn.Dropout(dropout) 66 | self.activation = F.relu if activation == "relu" else F.gelu 67 | 68 | def forward(self, x, attn_mask=None): 69 | new_x, attn = self.attention( 70 | x, x, x, 71 | attn_mask=attn_mask 72 | ) 73 | x = x + self.dropout(new_x) 74 | x, _ = self.decomp1(x) 75 | y = x 76 | y = self.dropout(self.activation(self.conv1(y.transpose(-1, 1)))) 77 | y = self.dropout(self.conv2(y).transpose(-1, 1)) 78 | res, _ = self.decomp2(x + y) 79 | return res, attn 80 | 81 | 82 | class Encoder(nn.Module): 83 | """ 84 | Autoformer encoder 85 | """ 86 | def __init__(self, attn_layers, conv_layers=None, norm_layer=None): 87 | super(Encoder, self).__init__() 88 | self.attn_layers = nn.ModuleList(attn_layers) 89 | self.conv_layers = nn.ModuleList(conv_layers) if conv_layers is not None else None 90 | self.norm = norm_layer 91 | 92 | def forward(self, x, attn_mask=None): 93 | attns = [] 94 | if self.conv_layers is not None: 95 | for attn_layer, conv_layer in zip(self.attn_layers, self.conv_layers): 96 | x, attn = attn_layer(x, attn_mask=attn_mask) 97 | x = conv_layer(x) 98 | attns.append(attn) 99 | x, attn = self.attn_layers[-1](x) 100 | attns.append(attn) 101 | else: 102 | for attn_layer in self.attn_layers: 103 | x, attn = attn_layer(x, attn_mask=attn_mask) 104 | attns.append(attn) 105 | 106 | if self.norm is not None: 107 | x = self.norm(x) 108 | 109 | return x, attns 110 | 111 | 112 | class DecoderLayer(nn.Module): 113 | """ 114 | Autoformer decoder layer with the progressive decomposition architecture 115 | """ 116 | def __init__(self, self_attention, cross_attention, d_model, c_out, d_ff=None, 117 | moving_avg=25, dropout=0.1, activation="relu"): 118 | super(DecoderLayer, self).__init__() 119 | d_ff = d_ff or 4 * d_model 120 | self.self_attention = self_attention 121 | self.cross_attention = cross_attention 122 | self.conv1 = nn.Conv1d(in_channels=d_model, out_channels=d_ff, kernel_size=1, bias=False) 123 | self.conv2 = nn.Conv1d(in_channels=d_ff, out_channels=d_model, kernel_size=1, bias=False) 124 | self.decomp1 = series_decomp(moving_avg) 125 | self.decomp2 = series_decomp(moving_avg) 126 | self.decomp3 = series_decomp(moving_avg) 127 | self.dropout = nn.Dropout(dropout) 128 | self.projection = nn.Conv1d(in_channels=d_model, out_channels=c_out, kernel_size=3, stride=1, padding=1, 129 | 
padding_mode='circular', bias=False) 130 | self.activation = F.relu if activation == "relu" else F.gelu 131 | 132 | def forward(self, x, cross, x_mask=None, cross_mask=None): 133 | x = x + self.dropout(self.self_attention( 134 | x, x, x, 135 | attn_mask=x_mask 136 | )[0]) 137 | x, trend1 = self.decomp1(x) 138 | x = x + self.dropout(self.cross_attention( 139 | x, cross, cross, 140 | attn_mask=cross_mask 141 | )[0]) 142 | x, trend2 = self.decomp2(x) 143 | y = x 144 | y = self.dropout(self.activation(self.conv1(y.transpose(-1, 1)))) 145 | y = self.dropout(self.conv2(y).transpose(-1, 1)) 146 | x, trend3 = self.decomp3(x + y) 147 | 148 | residual_trend = trend1 + trend2 + trend3 149 | residual_trend = self.projection(residual_trend.permute(0, 2, 1)).transpose(1, 2) 150 | return x, residual_trend 151 | 152 | 153 | class Decoder(nn.Module): 154 | """ 155 | Autoformer decoder 156 | """ 157 | def __init__(self, layers, norm_layer=None, projection=None): 158 | super(Decoder, self).__init__() 159 | self.layers = nn.ModuleList(layers) 160 | self.norm = norm_layer 161 | self.projection = projection 162 | 163 | def forward(self, x, cross, x_mask=None, cross_mask=None, trend=None): 164 | for layer in self.layers: 165 | x, residual_trend = layer(x, cross, x_mask=x_mask, cross_mask=cross_mask) 166 | trend = trend + residual_trend 167 | 168 | if self.norm is not None: 169 | x = self.norm(x) 170 | 171 | if self.projection is not None: 172 | x = self.projection(x) 173 | return x, trend 174 | -------------------------------------------------------------------------------- /ICTSP/layers/Conv_Blocks.py: -------------------------------------------------------------------------------- 1 | ### From: https://github.com/thuml/Time-Series-Library/blob/main/layers/Conv_Blocks.py 2 | 3 | import torch 4 | import torch.nn as nn 5 | 6 | 7 | class Inception_Block_V1(nn.Module): 8 | def __init__(self, in_channels, out_channels, num_kernels=6, init_weight=True): 9 | super(Inception_Block_V1, self).__init__() 10 | self.in_channels = in_channels 11 | self.out_channels = out_channels 12 | self.num_kernels = num_kernels 13 | kernels = [] 14 | for i in range(self.num_kernels): 15 | kernels.append(nn.Conv2d(in_channels, out_channels, kernel_size=2 * i + 1, padding=i)) 16 | self.kernels = nn.ModuleList(kernels) 17 | if init_weight: 18 | self._initialize_weights() 19 | 20 | def _initialize_weights(self): 21 | for m in self.modules(): 22 | if isinstance(m, nn.Conv2d): 23 | nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu') 24 | if m.bias is not None: 25 | nn.init.constant_(m.bias, 0) 26 | 27 | def forward(self, x): 28 | res_list = [] 29 | for i in range(self.num_kernels): 30 | res_list.append(self.kernels[i](x)) 31 | res = torch.stack(res_list, dim=-1).mean(-1) 32 | return res 33 | 34 | 35 | class Inception_Block_V2(nn.Module): 36 | def __init__(self, in_channels, out_channels, num_kernels=6, init_weight=True): 37 | super(Inception_Block_V2, self).__init__() 38 | self.in_channels = in_channels 39 | self.out_channels = out_channels 40 | self.num_kernels = num_kernels 41 | kernels = [] 42 | for i in range(self.num_kernels // 2): 43 | kernels.append(nn.Conv2d(in_channels, out_channels, kernel_size=[1, 2 * i + 3], padding=[0, i + 1])) 44 | kernels.append(nn.Conv2d(in_channels, out_channels, kernel_size=[2 * i + 3, 1], padding=[i + 1, 0])) 45 | kernels.append(nn.Conv2d(in_channels, out_channels, kernel_size=1)) 46 | self.kernels = nn.ModuleList(kernels) 47 | if init_weight: 48 | self._initialize_weights() 49 | 50 | def 
_initialize_weights(self): 51 | for m in self.modules(): 52 | if isinstance(m, nn.Conv2d): 53 | nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu') 54 | if m.bias is not None: 55 | nn.init.constant_(m.bias, 0) 56 | 57 | def forward(self, x): 58 | res_list = [] 59 | for i in range(self.num_kernels + 1): 60 | res_list.append(self.kernels[i](x)) 61 | res = torch.stack(res_list, dim=-1).mean(-1) 62 | return res -------------------------------------------------------------------------------- /ICTSP/layers/Embed.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | from torch.nn.utils import weight_norm 5 | import math 6 | 7 | 8 | class PositionalEmbedding(nn.Module): 9 | def __init__(self, d_model, max_len=5000): 10 | super(PositionalEmbedding, self).__init__() 11 | # Compute the positional encodings once in log space. 12 | pe = torch.zeros(max_len, d_model).float() 13 | pe.require_grad = False 14 | 15 | position = torch.arange(0, max_len).float().unsqueeze(1) 16 | div_term = (torch.arange(0, d_model, 2).float() * -(math.log(10000.0) / d_model)).exp() 17 | 18 | pe[:, 0::2] = torch.sin(position * div_term) 19 | pe[:, 1::2] = torch.cos(position * div_term) 20 | 21 | pe = pe.unsqueeze(0) 22 | self.register_buffer('pe', pe) 23 | 24 | def forward(self, x): 25 | return self.pe[:, :x.size(1)] 26 | 27 | 28 | class TokenEmbedding(nn.Module): 29 | def __init__(self, c_in, d_model): 30 | super(TokenEmbedding, self).__init__() 31 | padding = 1 if torch.__version__ >= '1.5.0' else 2 32 | self.tokenConv = nn.Conv1d(in_channels=c_in, out_channels=d_model, 33 | kernel_size=3, padding=padding, padding_mode='circular', bias=False) 34 | for m in self.modules(): 35 | if isinstance(m, nn.Conv1d): 36 | nn.init.kaiming_normal_(m.weight, mode='fan_in', nonlinearity='leaky_relu') 37 | 38 | def forward(self, x): 39 | x = self.tokenConv(x.permute(0, 2, 1)).transpose(1, 2) 40 | return x 41 | 42 | 43 | class FixedEmbedding(nn.Module): 44 | def __init__(self, c_in, d_model): 45 | super(FixedEmbedding, self).__init__() 46 | 47 | w = torch.zeros(c_in, d_model).float() 48 | w.require_grad = False 49 | 50 | position = torch.arange(0, c_in).float().unsqueeze(1) 51 | div_term = (torch.arange(0, d_model, 2).float() * -(math.log(10000.0) / d_model)).exp() 52 | 53 | w[:, 0::2] = torch.sin(position * div_term) 54 | w[:, 1::2] = torch.cos(position * div_term) 55 | 56 | self.emb = nn.Embedding(c_in, d_model) 57 | self.emb.weight = nn.Parameter(w, requires_grad=False) 58 | 59 | def forward(self, x): 60 | return self.emb(x).detach() 61 | 62 | 63 | class TemporalEmbedding(nn.Module): 64 | def __init__(self, d_model, embed_type='fixed', freq='h'): 65 | super(TemporalEmbedding, self).__init__() 66 | 67 | minute_size = 4 68 | hour_size = 24 69 | weekday_size = 7 70 | day_size = 32 71 | month_size = 13 72 | 73 | Embed = FixedEmbedding if embed_type == 'fixed' else nn.Embedding 74 | if freq == 't': 75 | self.minute_embed = Embed(minute_size, d_model) 76 | self.hour_embed = Embed(hour_size, d_model) 77 | self.weekday_embed = Embed(weekday_size, d_model) 78 | self.day_embed = Embed(day_size, d_model) 79 | self.month_embed = Embed(month_size, d_model) 80 | 81 | def forward(self, x): 82 | x = x.long() 83 | 84 | minute_x = self.minute_embed(x[:, :, 4]) if hasattr(self, 'minute_embed') else 0. 
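# Note: the time-mark columns of x are ordered [month, day, weekday, hour, minute]; self.minute_embed is only created when freq == 't', so minute_x falls back to 0. otherwise.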
85 | hour_x = self.hour_embed(x[:, :, 3]) 86 | weekday_x = self.weekday_embed(x[:, :, 2]) 87 | day_x = self.day_embed(x[:, :, 1]) 88 | month_x = self.month_embed(x[:, :, 0]) 89 | 90 | return hour_x + weekday_x + day_x + month_x + minute_x 91 | 92 | 93 | class TimeFeatureEmbedding(nn.Module): 94 | def __init__(self, d_model, embed_type='timeF', freq='h'): 95 | super(TimeFeatureEmbedding, self).__init__() 96 | 97 | freq_map = {'h': 4, 't': 5, 's': 6, 'm': 1, 'a': 1, 'w': 2, 'd': 3, 'b': 3} 98 | d_inp = freq_map[freq] 99 | self.embed = nn.Linear(d_inp, d_model, bias=False) 100 | 101 | def forward(self, x): 102 | return self.embed(x) 103 | 104 | 105 | class DataEmbedding(nn.Module): 106 | def __init__(self, c_in, d_model, embed_type='fixed', freq='h', dropout=0.1): 107 | super(DataEmbedding, self).__init__() 108 | 109 | self.value_embedding = TokenEmbedding(c_in=c_in, d_model=d_model) 110 | self.position_embedding = PositionalEmbedding(d_model=d_model) 111 | self.temporal_embedding = TemporalEmbedding(d_model=d_model, embed_type=embed_type, 112 | freq=freq) if embed_type != 'timeF' else TimeFeatureEmbedding( 113 | d_model=d_model, embed_type=embed_type, freq=freq) 114 | self.dropout = nn.Dropout(p=dropout) 115 | 116 | def forward(self, x, x_mark): 117 | x = self.value_embedding(x) + self.temporal_embedding(x_mark) + self.position_embedding(x) 118 | return self.dropout(x) 119 | 120 | 121 | class DataEmbedding_wo_pos(nn.Module): 122 | def __init__(self, c_in, d_model, embed_type='fixed', freq='h', dropout=0.1): 123 | super(DataEmbedding_wo_pos, self).__init__() 124 | 125 | self.value_embedding = TokenEmbedding(c_in=c_in, d_model=d_model) 126 | self.position_embedding = PositionalEmbedding(d_model=d_model) 127 | self.temporal_embedding = TemporalEmbedding(d_model=d_model, embed_type=embed_type, 128 | freq=freq) if embed_type != 'timeF' else TimeFeatureEmbedding( 129 | d_model=d_model, embed_type=embed_type, freq=freq) 130 | self.dropout = nn.Dropout(p=dropout) 131 | 132 | def forward(self, x, x_mark): 133 | x = self.value_embedding(x) + self.temporal_embedding(x_mark) 134 | return self.dropout(x) 135 | 136 | class DataEmbedding_wo_pos_temp(nn.Module): 137 | def __init__(self, c_in, d_model, embed_type='fixed', freq='h', dropout=0.1): 138 | super(DataEmbedding_wo_pos_temp, self).__init__() 139 | 140 | self.value_embedding = TokenEmbedding(c_in=c_in, d_model=d_model) 141 | self.position_embedding = PositionalEmbedding(d_model=d_model) 142 | self.temporal_embedding = TemporalEmbedding(d_model=d_model, embed_type=embed_type, 143 | freq=freq) if embed_type != 'timeF' else TimeFeatureEmbedding( 144 | d_model=d_model, embed_type=embed_type, freq=freq) 145 | self.dropout = nn.Dropout(p=dropout) 146 | 147 | def forward(self, x, x_mark): 148 | x = self.value_embedding(x) 149 | return self.dropout(x) 150 | 151 | class DataEmbedding_wo_temp(nn.Module): 152 | def __init__(self, c_in, d_model, embed_type='fixed', freq='h', dropout=0.1): 153 | super(DataEmbedding_wo_temp, self).__init__() 154 | 155 | self.value_embedding = TokenEmbedding(c_in=c_in, d_model=d_model) 156 | self.position_embedding = PositionalEmbedding(d_model=d_model) 157 | self.temporal_embedding = TemporalEmbedding(d_model=d_model, embed_type=embed_type, 158 | freq=freq) if embed_type != 'timeF' else TimeFeatureEmbedding( 159 | d_model=d_model, embed_type=embed_type, freq=freq) 160 | self.dropout = nn.Dropout(p=dropout) 161 | 162 | def forward(self, x, x_mark): 163 | x = self.value_embedding(x) + self.position_embedding(x) 164 | return 
self.dropout(x) -------------------------------------------------------------------------------- /ICTSP/layers/PatchTST_layers.py: -------------------------------------------------------------------------------- 1 | __all__ = ['Transpose', 'get_activation_fn', 'moving_avg', 'series_decomp', 'PositionalEncoding', 'SinCosPosEncoding', 'Coord2dPosEncoding', 'Coord1dPosEncoding', 'positional_encoding'] 2 | 3 | import torch 4 | from torch import nn 5 | import math 6 | 7 | class Transpose(nn.Module): 8 | def __init__(self, *dims, contiguous=False): 9 | super().__init__() 10 | self.dims, self.contiguous = dims, contiguous 11 | def forward(self, x): 12 | if self.contiguous: return x.transpose(*self.dims).contiguous() 13 | else: return x.transpose(*self.dims) 14 | 15 | 16 | def get_activation_fn(activation): 17 | if callable(activation): return activation() 18 | elif activation.lower() == "relu": return nn.ReLU() 19 | elif activation.lower() == "gelu": return nn.GELU() 20 | raise ValueError(f'{activation} is not available. You can use "relu", "gelu", or a callable') 21 | 22 | 23 | # decomposition 24 | 25 | class moving_avg(nn.Module): 26 | """ 27 | Moving average block to highlight the trend of time series 28 | """ 29 | def __init__(self, kernel_size, stride): 30 | super(moving_avg, self).__init__() 31 | self.kernel_size = kernel_size 32 | self.avg = nn.AvgPool1d(kernel_size=kernel_size, stride=stride, padding=0) 33 | 34 | def forward(self, x): 35 | # padding on the both ends of time series 36 | front = x[:, 0:1, :].repeat(1, (self.kernel_size - 1) // 2, 1) 37 | end = x[:, -1:, :].repeat(1, (self.kernel_size - 1) // 2, 1) 38 | x = torch.cat([front, x, end], dim=1) 39 | x = self.avg(x.permute(0, 2, 1)) 40 | x = x.permute(0, 2, 1) 41 | return x 42 | 43 | 44 | class series_decomp(nn.Module): 45 | """ 46 | Series decomposition block 47 | """ 48 | def __init__(self, kernel_size): 49 | super(series_decomp, self).__init__() 50 | self.moving_avg = moving_avg(kernel_size, stride=1) 51 | 52 | def forward(self, x): 53 | moving_mean = self.moving_avg(x) 54 | res = x - moving_mean 55 | return res, moving_mean 56 | 57 | 58 | 59 | # pos_encoding 60 | 61 | def PositionalEncoding(q_len, d_model, normalize=True): 62 | pe = torch.zeros(q_len, d_model) 63 | position = torch.arange(0, q_len).unsqueeze(1) 64 | div_term = torch.exp(torch.arange(0, d_model, 2) * -(math.log(10000.0) / d_model)) 65 | pe[:, 0::2] = torch.sin(position * div_term) 66 | pe[:, 1::2] = torch.cos(position * div_term) 67 | if normalize: 68 | pe = pe - pe.mean() 69 | pe = pe / (pe.std() * 10) 70 | return pe 71 | 72 | SinCosPosEncoding = PositionalEncoding 73 | 74 | def Coord2dPosEncoding(q_len, d_model, exponential=False, normalize=True, eps=1e-3, verbose=False): 75 | x = .5 if exponential else 1 76 | i = 0 77 | for i in range(100): 78 | cpe = 2 * (torch.linspace(0, 1, q_len).reshape(-1, 1) ** x) * (torch.linspace(0, 1, d_model).reshape(1, -1) ** x) - 1 79 | if verbose: print(f'{i:4.0f} {x:5.3f} {cpe.mean():+6.3f}') 80 | if abs(cpe.mean()) <= eps: break 81 | elif cpe.mean() > eps: x += .001 82 | else: x -= .001 83 | i += 1 84 | if normalize: 85 | cpe = cpe - cpe.mean() 86 | cpe = cpe / (cpe.std() * 10) 87 | return cpe 88 | 89 | def Coord1dPosEncoding(q_len, exponential=False, normalize=True): 90 | cpe = (2 * (torch.linspace(0, 1, q_len).reshape(-1, 1)**(.5 if exponential else 1)) - 1) 91 | if normalize: 92 | cpe = cpe - cpe.mean() 93 | cpe = cpe / (cpe.std() * 10) 94 | return cpe 95 | 96 | def positional_encoding(pe, learn_pe, q_len, d_model): 97 | # 
Positional encoding 98 | if pe == None: 99 | W_pos = torch.empty((q_len, d_model)) # pe = None and learn_pe = False can be used to measure impact of pe 100 | nn.init.uniform_(W_pos, -0.02, 0.02) 101 | learn_pe = False 102 | elif pe == 'zero': 103 | W_pos = torch.empty((q_len, 1)) 104 | nn.init.uniform_(W_pos, -0.02, 0.02) 105 | elif pe == 'zeros': 106 | W_pos = torch.empty((q_len, d_model)) 107 | nn.init.uniform_(W_pos, -0.02, 0.02) 108 | elif pe == 'normal' or pe == 'gauss': 109 | W_pos = torch.zeros((q_len, 1)) 110 | torch.nn.init.normal_(W_pos, mean=0.0, std=0.1) 111 | elif pe == 'uniform': 112 | W_pos = torch.zeros((q_len, 1)) 113 | nn.init.uniform_(W_pos, a=0.0, b=0.1) 114 | elif pe == 'lin1d': W_pos = Coord1dPosEncoding(q_len, exponential=False, normalize=True) 115 | elif pe == 'exp1d': W_pos = Coord1dPosEncoding(q_len, exponential=True, normalize=True) 116 | elif pe == 'lin2d': W_pos = Coord2dPosEncoding(q_len, d_model, exponential=False, normalize=True) 117 | elif pe == 'exp2d': W_pos = Coord2dPosEncoding(q_len, d_model, exponential=True, normalize=True) 118 | elif pe == 'sincos': W_pos = PositionalEncoding(q_len, d_model, normalize=True) 119 | else: raise ValueError(f"{pe} is not a valid pe (positional encoder. Available types: 'gauss'=='normal', \ 120 | 'zeros', 'zero', uniform', 'lin1d', 'exp1d', 'lin2d', 'exp2d', 'sincos', None.)") 121 | return nn.Parameter(W_pos, requires_grad=learn_pe) -------------------------------------------------------------------------------- /ICTSP/layers/RevIN.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class RevIN(nn.Module): 5 | def __init__(self, num_features: int, eps=1e-5, affine=True, subtract_last=False): 6 | """ 7 | :param num_features: the number of features or channels 8 | :param eps: a value added for numerical stability 9 | :param affine: if True, RevIN has learnable affine parameters 10 | """ 11 | super(RevIN, self).__init__() 12 | self.num_features = num_features 13 | self.eps = eps 14 | self.affine = affine 15 | self.subtract_last = subtract_last 16 | if self.affine: 17 | self._init_params() 18 | 19 | def forward(self, x, mode:str): 20 | if mode == 'norm': 21 | self._get_statistics(x) 22 | x = self._normalize(x) 23 | elif mode == 'denorm': 24 | x = self._denormalize(x) 25 | else: raise NotImplementedError 26 | return x 27 | 28 | def _init_params(self): 29 | # initialize RevIN params: (C,) 30 | self.affine_weight = nn.Parameter(torch.ones(self.num_features)) 31 | self.affine_bias = nn.Parameter(torch.zeros(self.num_features)) 32 | 33 | def _get_statistics(self, x): 34 | dim2reduce = tuple(range(1, x.ndim-1)) 35 | if self.subtract_last: 36 | self.last = x[:,-1,:].unsqueeze(1) 37 | else: 38 | self.mean = torch.mean(x, dim=dim2reduce, keepdim=True).detach() 39 | self.stdev = torch.sqrt(torch.var(x, dim=dim2reduce, keepdim=True, unbiased=False) + self.eps).detach() 40 | 41 | def _normalize(self, x): 42 | if self.subtract_last: 43 | x = x - self.last 44 | else: 45 | x = x - self.mean 46 | x = x / self.stdev 47 | if self.affine: 48 | x = x * self.affine_weight 49 | x = x + self.affine_bias 50 | return x 51 | 52 | def _denormalize(self, x): 53 | if self.affine: 54 | x = x - self.affine_bias 55 | x = x / (self.affine_weight + self.eps*self.eps) 56 | x = x * self.stdev 57 | if self.subtract_last: 58 | x = x + self.last 59 | else: 60 | x = x + self.mean 61 | return x -------------------------------------------------------------------------------- 
/ICTSP/layers/SelfAttention_Family.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | 5 | import matplotlib.pyplot as plt 6 | 7 | import numpy as np 8 | import math 9 | from math import sqrt 10 | from utils.masking import TriangularCausalMask, ProbMask 11 | import os 12 | 13 | 14 | class FullAttention(nn.Module): 15 | def __init__(self, mask_flag=True, factor=5, scale=None, attention_dropout=0.1, output_attention=False): 16 | super(FullAttention, self).__init__() 17 | self.scale = scale 18 | self.mask_flag = mask_flag 19 | self.output_attention = output_attention 20 | self.dropout = nn.Dropout(attention_dropout) 21 | 22 | def forward(self, queries, keys, values, attn_mask): 23 | B, L, H, E = queries.shape 24 | _, S, _, D = values.shape 25 | scale = self.scale or 1. / sqrt(E) 26 | 27 | scores = torch.einsum("blhe,bshe->bhls", queries, keys) 28 | 29 | if self.mask_flag: 30 | if attn_mask is None: 31 | attn_mask = TriangularCausalMask(B, L, device=queries.device) 32 | 33 | scores.masked_fill_(attn_mask.mask, -np.inf) 34 | 35 | A = self.dropout(torch.softmax(scale * scores, dim=-1)) 36 | V = torch.einsum("bhls,bshd->blhd", A, values) 37 | 38 | if self.output_attention: 39 | return (V.contiguous(), A) 40 | else: 41 | return (V.contiguous(), None) 42 | 43 | 44 | class ProbAttention(nn.Module): 45 | def __init__(self, mask_flag=True, factor=5, scale=None, attention_dropout=0.1, output_attention=False): 46 | super(ProbAttention, self).__init__() 47 | self.factor = factor 48 | self.scale = scale 49 | self.mask_flag = mask_flag 50 | self.output_attention = output_attention 51 | self.dropout = nn.Dropout(attention_dropout) 52 | 53 | def _prob_QK(self, Q, K, sample_k, n_top): # n_top: c*ln(L_q) 54 | # Q [B, H, L, D] 55 | B, H, L_K, E = K.shape 56 | _, _, L_Q, _ = Q.shape 57 | 58 | # calculate the sampled Q_K 59 | K_expand = K.unsqueeze(-3).expand(B, H, L_Q, L_K, E) 60 | index_sample = torch.randint(L_K, (L_Q, sample_k)) # real U = U_part(factor*ln(L_k))*L_q 61 | K_sample = K_expand[:, :, torch.arange(L_Q).unsqueeze(1), index_sample, :] 62 | Q_K_sample = torch.matmul(Q.unsqueeze(-2), K_sample.transpose(-2, -1)).squeeze() 63 | 64 | # find the Top_k query with sparisty measurement 65 | M = Q_K_sample.max(-1)[0] - torch.div(Q_K_sample.sum(-1), L_K) 66 | M_top = M.topk(n_top, sorted=False)[1] 67 | 68 | # use the reduced Q to calculate Q_K 69 | Q_reduce = Q[torch.arange(B)[:, None, None], 70 | torch.arange(H)[None, :, None], 71 | M_top, :] # factor*ln(L_q) 72 | Q_K = torch.matmul(Q_reduce, K.transpose(-2, -1)) # factor*ln(L_q)*L_k 73 | 74 | return Q_K, M_top 75 | 76 | def _get_initial_context(self, V, L_Q): 77 | B, H, L_V, D = V.shape 78 | if not self.mask_flag: 79 | # V_sum = V.sum(dim=-2) 80 | V_sum = V.mean(dim=-2) 81 | contex = V_sum.unsqueeze(-2).expand(B, H, L_Q, V_sum.shape[-1]).clone() 82 | else: # use mask 83 | assert (L_Q == L_V) # requires that L_Q == L_V, i.e. 
for self-attention only 84 | contex = V.cumsum(dim=-2) 85 | return contex 86 | 87 | def _update_context(self, context_in, V, scores, index, L_Q, attn_mask): 88 | B, H, L_V, D = V.shape 89 | 90 | if self.mask_flag: 91 | attn_mask = ProbMask(B, H, L_Q, index, scores, device=V.device) 92 | scores.masked_fill_(attn_mask.mask, -np.inf) 93 | 94 | attn = torch.softmax(scores, dim=-1) # nn.Softmax(dim=-1)(scores) 95 | 96 | context_in[torch.arange(B)[:, None, None], 97 | torch.arange(H)[None, :, None], 98 | index, :] = torch.matmul(attn, V).type_as(context_in) 99 | if self.output_attention: 100 | attns = (torch.ones([B, H, L_V, L_V]) / L_V).type_as(attn).to(attn.device) 101 | attns[torch.arange(B)[:, None, None], torch.arange(H)[None, :, None], index, :] = attn 102 | return (context_in, attns) 103 | else: 104 | return (context_in, None) 105 | 106 | def forward(self, queries, keys, values, attn_mask): 107 | B, L_Q, H, D = queries.shape 108 | _, L_K, _, _ = keys.shape 109 | 110 | queries = queries.transpose(2, 1) 111 | keys = keys.transpose(2, 1) 112 | values = values.transpose(2, 1) 113 | 114 | U_part = self.factor * np.ceil(np.log(L_K)).astype('int').item() # c*ln(L_k) 115 | u = self.factor * np.ceil(np.log(L_Q)).astype('int').item() # c*ln(L_q) 116 | 117 | U_part = U_part if U_part < L_K else L_K 118 | u = u if u < L_Q else L_Q 119 | 120 | scores_top, index = self._prob_QK(queries, keys, sample_k=U_part, n_top=u) 121 | 122 | # add scale factor 123 | scale = self.scale or 1. / sqrt(D) 124 | if scale is not None: 125 | scores_top = scores_top * scale 126 | # get the context 127 | context = self._get_initial_context(values, L_Q) 128 | # update the context with selected top_k queries 129 | context, attn = self._update_context(context, values, scores_top, index, L_Q, attn_mask) 130 | 131 | return context.contiguous(), attn 132 | 133 | 134 | class AttentionLayer(nn.Module): 135 | def __init__(self, attention, d_model, n_heads, d_keys=None, 136 | d_values=None): 137 | super(AttentionLayer, self).__init__() 138 | 139 | d_keys = d_keys or (d_model // n_heads) 140 | d_values = d_values or (d_model // n_heads) 141 | 142 | self.inner_attention = attention 143 | self.query_projection = nn.Linear(d_model, d_keys * n_heads) 144 | self.key_projection = nn.Linear(d_model, d_keys * n_heads) 145 | self.value_projection = nn.Linear(d_model, d_values * n_heads) 146 | self.out_projection = nn.Linear(d_values * n_heads, d_model) 147 | self.n_heads = n_heads 148 | 149 | def forward(self, queries, keys, values, attn_mask): 150 | B, L, _ = queries.shape 151 | _, S, _ = keys.shape 152 | H = self.n_heads 153 | 154 | queries = self.query_projection(queries).view(B, L, H, -1) 155 | keys = self.key_projection(keys).view(B, S, H, -1) 156 | values = self.value_projection(values).view(B, S, H, -1) 157 | 158 | out, attn = self.inner_attention( 159 | queries, 160 | keys, 161 | values, 162 | attn_mask 163 | ) 164 | out = out.view(B, L, -1) 165 | 166 | return self.out_projection(out), attn 167 | -------------------------------------------------------------------------------- /ICTSP/layers/Transformer_EncDec.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | 5 | 6 | class ConvLayer(nn.Module): 7 | def __init__(self, c_in): 8 | super(ConvLayer, self).__init__() 9 | self.downConv = nn.Conv1d(in_channels=c_in, 10 | out_channels=c_in, 11 | kernel_size=3, 12 | padding=2, 13 | padding_mode='circular') 14 | self.norm = 
nn.BatchNorm1d(c_in) 15 | self.activation = nn.ELU() 16 | self.maxPool = nn.MaxPool1d(kernel_size=3, stride=2, padding=1) 17 | 18 | def forward(self, x): 19 | x = self.downConv(x.permute(0, 2, 1)) 20 | x = self.norm(x) 21 | x = self.activation(x) 22 | x = self.maxPool(x) 23 | x = x.transpose(1, 2) 24 | return x 25 | 26 | 27 | class EncoderLayer(nn.Module): 28 | def __init__(self, attention, d_model, d_ff=None, dropout=0.1, activation="relu"): 29 | super(EncoderLayer, self).__init__() 30 | d_ff = d_ff or 4 * d_model 31 | self.attention = attention 32 | self.conv1 = nn.Conv1d(in_channels=d_model, out_channels=d_ff, kernel_size=1) 33 | self.conv2 = nn.Conv1d(in_channels=d_ff, out_channels=d_model, kernel_size=1) 34 | self.norm1 = nn.LayerNorm(d_model) 35 | self.norm2 = nn.LayerNorm(d_model) 36 | self.dropout = nn.Dropout(dropout) 37 | self.activation = F.relu if activation == "relu" else F.gelu 38 | 39 | def forward(self, x, attn_mask=None): 40 | new_x, attn = self.attention( 41 | x, x, x, 42 | attn_mask=attn_mask 43 | ) 44 | x = x + self.dropout(new_x) 45 | 46 | y = x = self.norm1(x) 47 | y = self.dropout(self.activation(self.conv1(y.transpose(-1, 1)))) 48 | y = self.dropout(self.conv2(y).transpose(-1, 1)) 49 | 50 | return self.norm2(x + y), attn 51 | 52 | 53 | class Encoder(nn.Module): 54 | def __init__(self, attn_layers, conv_layers=None, norm_layer=None): 55 | super(Encoder, self).__init__() 56 | self.attn_layers = nn.ModuleList(attn_layers) 57 | self.conv_layers = nn.ModuleList(conv_layers) if conv_layers is not None else None 58 | self.norm = norm_layer 59 | 60 | def forward(self, x, attn_mask=None): 61 | # x [B, L, D] 62 | attns = [] 63 | if self.conv_layers is not None: 64 | for attn_layer, conv_layer in zip(self.attn_layers, self.conv_layers): 65 | x, attn = attn_layer(x, attn_mask=attn_mask) 66 | x = conv_layer(x) 67 | attns.append(attn) 68 | x, attn = self.attn_layers[-1](x) 69 | attns.append(attn) 70 | else: 71 | for attn_layer in self.attn_layers: 72 | x, attn = attn_layer(x, attn_mask=attn_mask) 73 | attns.append(attn) 74 | 75 | if self.norm is not None: 76 | x = self.norm(x) 77 | 78 | return x, attns 79 | 80 | 81 | class DecoderLayer(nn.Module): 82 | def __init__(self, self_attention, cross_attention, d_model, d_ff=None, 83 | dropout=0.1, activation="relu"): 84 | super(DecoderLayer, self).__init__() 85 | d_ff = d_ff or 4 * d_model 86 | self.self_attention = self_attention 87 | self.cross_attention = cross_attention 88 | self.conv1 = nn.Conv1d(in_channels=d_model, out_channels=d_ff, kernel_size=1) 89 | self.conv2 = nn.Conv1d(in_channels=d_ff, out_channels=d_model, kernel_size=1) 90 | self.norm1 = nn.LayerNorm(d_model) 91 | self.norm2 = nn.LayerNorm(d_model) 92 | self.norm3 = nn.LayerNorm(d_model) 93 | self.dropout = nn.Dropout(dropout) 94 | self.activation = F.relu if activation == "relu" else F.gelu 95 | 96 | def forward(self, x, cross, x_mask=None, cross_mask=None): 97 | x = x + self.dropout(self.self_attention( 98 | x, x, x, 99 | attn_mask=x_mask 100 | )[0]) 101 | x = self.norm1(x) 102 | 103 | x = x + self.dropout(self.cross_attention( 104 | x, cross, cross, 105 | attn_mask=cross_mask 106 | )[0]) 107 | 108 | y = x = self.norm2(x) 109 | y = self.dropout(self.activation(self.conv1(y.transpose(-1, 1)))) 110 | y = self.dropout(self.conv2(y).transpose(-1, 1)) 111 | 112 | return self.norm3(x + y) 113 | 114 | 115 | class Decoder(nn.Module): 116 | def __init__(self, layers, norm_layer=None, projection=None): 117 | super(Decoder, self).__init__() 118 | self.layers = 
nn.ModuleList(layers) 119 | self.norm = norm_layer 120 | self.projection = projection 121 | 122 | def forward(self, x, cross, x_mask=None, cross_mask=None): 123 | for layer in self.layers: 124 | x = layer(x, cross, x_mask=x_mask, cross_mask=cross_mask) 125 | 126 | if self.norm is not None: 127 | x = self.norm(x) 128 | 129 | if self.projection is not None: 130 | x = self.projection(x) 131 | return x 132 | -------------------------------------------------------------------------------- /ICTSP/models/Autoformer.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | from layers.Embed import DataEmbedding, DataEmbedding_wo_pos,DataEmbedding_wo_pos_temp,DataEmbedding_wo_temp 5 | from layers.AutoCorrelation import AutoCorrelation, AutoCorrelationLayer 6 | from layers.Autoformer_EncDec import Encoder, Decoder, EncoderLayer, DecoderLayer, my_Layernorm, series_decomp 7 | import math 8 | import numpy as np 9 | 10 | 11 | class Model(nn.Module): 12 | """ 13 | Autoformer is the first method to achieve the series-wise connection, 14 | with inherent O(LlogL) complexity 15 | """ 16 | def __init__(self, configs): 17 | super(Model, self).__init__() 18 | self.seq_len = configs.seq_len 19 | self.label_len = configs.label_len 20 | self.pred_len = configs.pred_len 21 | self.output_attention = configs.output_attention 22 | 23 | # Decomp 24 | kernel_size = configs.moving_avg 25 | self.decomp = series_decomp(kernel_size) 26 | 27 | # Embedding 28 | # The series-wise connection inherently contains the sequential information. 29 | # Thus, we can discard the position embedding of transformers. 30 | if configs.embed_type == 0: 31 | self.enc_embedding = DataEmbedding_wo_pos(configs.enc_in, configs.d_model, configs.embed, configs.freq, 32 | configs.dropout) 33 | self.dec_embedding = DataEmbedding_wo_pos(configs.dec_in, configs.d_model, configs.embed, configs.freq, 34 | configs.dropout) 35 | elif configs.embed_type == 1: 36 | self.enc_embedding = DataEmbedding(configs.enc_in, configs.d_model, configs.embed, configs.freq, 37 | configs.dropout) 38 | self.dec_embedding = DataEmbedding(configs.dec_in, configs.d_model, configs.embed, configs.freq, 39 | configs.dropout) 40 | elif configs.embed_type == 2: 41 | self.enc_embedding = DataEmbedding_wo_pos(configs.enc_in, configs.d_model, configs.embed, configs.freq, 42 | configs.dropout) 43 | self.dec_embedding = DataEmbedding_wo_pos(configs.dec_in, configs.d_model, configs.embed, configs.freq, 44 | configs.dropout) 45 | 46 | elif configs.embed_type == 3: 47 | self.enc_embedding = DataEmbedding_wo_temp(configs.enc_in, configs.d_model, configs.embed, configs.freq, 48 | configs.dropout) 49 | self.dec_embedding = DataEmbedding_wo_temp(configs.dec_in, configs.d_model, configs.embed, configs.freq, 50 | configs.dropout) 51 | elif configs.embed_type == 4: 52 | self.enc_embedding = DataEmbedding_wo_pos_temp(configs.enc_in, configs.d_model, configs.embed, configs.freq, 53 | configs.dropout) 54 | self.dec_embedding = DataEmbedding_wo_pos_temp(configs.dec_in, configs.d_model, configs.embed, configs.freq, 55 | configs.dropout) 56 | 57 | # Encoder 58 | self.encoder = Encoder( 59 | [ 60 | EncoderLayer( 61 | AutoCorrelationLayer( 62 | AutoCorrelation(False, configs.factor, attention_dropout=configs.dropout, 63 | output_attention=configs.output_attention), 64 | configs.d_model, configs.n_heads), 65 | configs.d_model, 66 | configs.d_ff, 67 | moving_avg=configs.moving_avg, 68 | 
dropout=configs.dropout, 69 | activation=configs.activation 70 | ) for l in range(configs.e_layers) 71 | ], 72 | norm_layer=my_Layernorm(configs.d_model) 73 | ) 74 | # Decoder 75 | self.decoder = Decoder( 76 | [ 77 | DecoderLayer( 78 | AutoCorrelationLayer( 79 | AutoCorrelation(True, configs.factor, attention_dropout=configs.dropout, 80 | output_attention=False), 81 | configs.d_model, configs.n_heads), 82 | AutoCorrelationLayer( 83 | AutoCorrelation(False, configs.factor, attention_dropout=configs.dropout, 84 | output_attention=False), 85 | configs.d_model, configs.n_heads), 86 | configs.d_model, 87 | configs.c_out, 88 | configs.d_ff, 89 | moving_avg=configs.moving_avg, 90 | dropout=configs.dropout, 91 | activation=configs.activation, 92 | ) 93 | for l in range(configs.d_layers) 94 | ], 95 | norm_layer=my_Layernorm(configs.d_model), 96 | projection=nn.Linear(configs.d_model, configs.c_out, bias=True) 97 | ) 98 | 99 | def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, 100 | enc_self_mask=None, dec_self_mask=None, dec_enc_mask=None): 101 | # decomp init 102 | mean = torch.mean(x_enc, dim=1).unsqueeze(1).repeat(1, self.pred_len, 1) 103 | zeros = torch.zeros([x_dec.shape[0], self.pred_len, x_dec.shape[2]], device=x_enc.device) 104 | seasonal_init, trend_init = self.decomp(x_enc) 105 | # decoder input 106 | trend_init = torch.cat([trend_init[:, -self.label_len:, :], mean], dim=1) 107 | seasonal_init = torch.cat([seasonal_init[:, -self.label_len:, :], zeros], dim=1) 108 | # enc 109 | enc_out = self.enc_embedding(x_enc, x_mark_enc) 110 | enc_out, attns = self.encoder(enc_out, attn_mask=enc_self_mask) 111 | # dec 112 | dec_out = self.dec_embedding(seasonal_init, x_mark_dec) 113 | seasonal_part, trend_part = self.decoder(dec_out, enc_out, x_mask=dec_self_mask, cross_mask=dec_enc_mask, 114 | trend=trend_init) 115 | # final 116 | dec_out = trend_part + seasonal_part 117 | 118 | if self.output_attention: 119 | return dec_out[:, -self.pred_len:, :], attns 120 | else: 121 | return dec_out[:, -self.pred_len:, :] # [B, L, D] -------------------------------------------------------------------------------- /ICTSP/models/DLinear.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import numpy as np 5 | 6 | class moving_avg(nn.Module): 7 | """ 8 | Moving average block to highlight the trend of time series 9 | """ 10 | def __init__(self, kernel_size, stride): 11 | super(moving_avg, self).__init__() 12 | self.kernel_size = kernel_size 13 | self.avg = nn.AvgPool1d(kernel_size=kernel_size, stride=stride, padding=0) 14 | 15 | def forward(self, x): 16 | # padding on the both ends of time series 17 | front = x[:, 0:1, :].repeat(1, (self.kernel_size - 1) // 2, 1) 18 | end = x[:, -1:, :].repeat(1, (self.kernel_size - 1) // 2, 1) 19 | x = torch.cat([front, x, end], dim=1) 20 | x = self.avg(x.permute(0, 2, 1)) 21 | x = x.permute(0, 2, 1) 22 | return x 23 | 24 | 25 | class series_decomp(nn.Module): 26 | """ 27 | Series decomposition block 28 | """ 29 | def __init__(self, kernel_size): 30 | super(series_decomp, self).__init__() 31 | self.moving_avg = moving_avg(kernel_size, stride=1) 32 | 33 | def forward(self, x): 34 | moving_mean = self.moving_avg(x) 35 | res = x - moving_mean 36 | return res, moving_mean 37 | 38 | class Model(nn.Module): 39 | """ 40 | Decomposition-Linear 41 | """ 42 | def __init__(self, configs): 43 | super(Model, self).__init__() 44 | self.seq_len = configs.seq_len 45 | 
self.pred_len = configs.pred_len 46 | 47 | # Decomposition Kernel Size 48 | kernel_size = 25 49 | self.decomposition = series_decomp(kernel_size) 50 | self.individual = configs.individual 51 | self.channels = configs.enc_in 52 | 53 | if self.individual: 54 | self.Linear_Seasonal = nn.ModuleList() 55 | self.Linear_Trend = nn.ModuleList() 56 | 57 | for i in range(self.channels): 58 | self.Linear_Seasonal.append(nn.Linear(self.seq_len,self.pred_len)) 59 | self.Linear_Trend.append(nn.Linear(self.seq_len,self.pred_len)) 60 | 61 | # Use these two lines if you want to visualize the weights 62 | # self.Linear_Seasonal[i].weight = nn.Parameter((1/self.seq_len)*torch.ones([self.pred_len,self.seq_len])) 63 | # self.Linear_Trend[i].weight = nn.Parameter((1/self.seq_len)*torch.ones([self.pred_len,self.seq_len])) 64 | else: 65 | self.Linear_Seasonal = nn.Linear(self.seq_len,self.pred_len) 66 | self.Linear_Trend = nn.Linear(self.seq_len,self.pred_len) 67 | 68 | # Use these two lines if you want to visualize the weights 69 | # self.Linear_Seasonal.weight = nn.Parameter((1/self.seq_len)*torch.ones([self.pred_len,self.seq_len])) 70 | # self.Linear_Trend.weight = nn.Parameter((1/self.seq_len)*torch.ones([self.pred_len,self.seq_len])) 71 | 72 | def forward(self, x): 73 | # x: [Batch, Input length, Channel] 74 | seasonal_init, trend_init = self.decomposition(x) 75 | seasonal_init, trend_init = seasonal_init.permute(0,2,1), trend_init.permute(0,2,1) 76 | if self.individual: 77 | seasonal_output = torch.zeros([seasonal_init.size(0),seasonal_init.size(1),self.pred_len],dtype=seasonal_init.dtype).to(seasonal_init.device) 78 | trend_output = torch.zeros([trend_init.size(0),trend_init.size(1),self.pred_len],dtype=trend_init.dtype).to(trend_init.device) 79 | for i in range(self.channels): 80 | seasonal_output[:,i,:] = self.Linear_Seasonal[i](seasonal_init[:,i,:]) 81 | trend_output[:,i,:] = self.Linear_Trend[i](trend_init[:,i,:]) 82 | else: 83 | seasonal_output = self.Linear_Seasonal(seasonal_init) 84 | trend_output = self.Linear_Trend(trend_init) 85 | 86 | x = seasonal_output + trend_output 87 | return x.permute(0,2,1) # to [Batch, Output length, Channel] -------------------------------------------------------------------------------- /ICTSP/models/FITS.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import numpy as np 5 | #import models.NLinear as DLinear 6 | 7 | class Model(nn.Module): 8 | 9 | # FITS: Frequency Interpolation Time Series Forecasting 10 | 11 | def __init__(self, configs): 12 | super(Model, self).__init__() 13 | self.seq_len = configs.seq_len 14 | self.pred_len = configs.pred_len 15 | self.individual = configs.individual 16 | self.channels = configs.enc_in 17 | base_T = 24 18 | H_order = 6 19 | cut_freq = int(self.seq_len // base_T + 1) * H_order + 10 20 | self.dominance_freq = cut_freq # 720/24 21 | 22 | self.length_ratio = (self.seq_len + self.pred_len)/self.seq_len 23 | 24 | if self.individual: 25 | self.freq_upsampler = nn.ModuleList() 26 | for i in range(self.channels): 27 | self.freq_upsampler.append(nn.Linear(self.dominance_freq, int(self.dominance_freq*self.length_ratio)).to(torch.cfloat)) 28 | 29 | else: 30 | self.freq_upsampler = nn.Linear(self.dominance_freq, int(self.dominance_freq*self.length_ratio)).to(torch.cfloat) # complex layer for frequency upsampling 31 | # configs.pred_len=configs.seq_len+configs.pred_len 32 | # #self.Dlinear=DLinear.Model(configs) 33 | # configs.pred_len=self.pred_len
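# --- Worked example (added for clarity; not part of the original FITS source; values assume seq_len=720, pred_len=96) ---
# cut_freq keeps roughly H_order harmonics of the assumed base period base_T, plus a margin of 10 bins:
#   cut_freq     = int(720 // 24 + 1) * 6 + 10 = 31 * 6 + 10 = 196
#   length_ratio = (720 + 96) / 720 ≈ 1.1333
# so the forward pass below low-passes the input to its first 196 rFFT bins, and the complex
# linear layer upsamples them to int(196 * 1.1333) = 222 output bins covering the extended horizon.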
34 | 35 | 36 | def forward(self, x, x_mark_enc=None, x_dec=None, x_mark_dec=None): 37 | # RIN 38 | x_mean = torch.mean(x, dim=1, keepdim=True) 39 | x = x - x_mean 40 | x_var=torch.var(x, dim=1, keepdim=True)+ 1e-5 41 | # print(x_var) 42 | x = x / torch.sqrt(x_var) 43 | 44 | low_specx = torch.fft.rfft(x, dim=1) 45 | low_specx[:,self.dominance_freq:]=0 # LPF 46 | low_specx = low_specx[:,0:self.dominance_freq,:] # LPF 47 | # print(low_specx.permute(0,2,1)) 48 | if self.individual: 49 | low_specxy_ = torch.zeros([low_specx.size(0),int(self.dominance_freq*self.length_ratio),low_specx.size(2)],dtype=low_specx.dtype).to(low_specx.device) 50 | for i in range(self.channels): 51 | low_specxy_[:,:,i]=self.freq_upsampler[i](low_specx[:,:,i].permute(0,1)).permute(0,1) 52 | else: 53 | low_specxy_ = self.freq_upsampler(low_specx.permute(0,2,1)).permute(0,2,1) 54 | # print(low_specxy_) 55 | low_specxy = torch.zeros([low_specxy_.size(0),int((self.seq_len+self.pred_len)/2+1),low_specxy_.size(2)],dtype=low_specxy_.dtype).to(low_specxy_.device) 56 | low_specxy[:,0:low_specxy_.size(1),:]=low_specxy_ # zero padding 57 | low_xy=torch.fft.irfft(low_specxy, dim=1) 58 | low_xy=low_xy * self.length_ratio # energy compensation for the length change 59 | # dom_x=x-low_x 60 | 61 | # dom_xy=self.Dlinear(dom_x) 62 | # xy=(low_xy+dom_xy) * torch.sqrt(x_var) +x_mean # REVERSE RIN 63 | xy=(low_xy) * torch.sqrt(x_var) +x_mean 64 | xy = xy[:, -self.pred_len:, :] 65 | return xy#, low_xy* torch.sqrt(x_var) 66 | -------------------------------------------------------------------------------- /ICTSP/models/Informer.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | from utils.masking import TriangularCausalMask, ProbMask 5 | from layers.Transformer_EncDec import Decoder, DecoderLayer, Encoder, EncoderLayer, ConvLayer 6 | from layers.SelfAttention_Family import FullAttention, ProbAttention, AttentionLayer 7 | from layers.Embed import DataEmbedding,DataEmbedding_wo_pos,DataEmbedding_wo_temp,DataEmbedding_wo_pos_temp 8 | import numpy as np 9 | 10 | 11 | class Model(nn.Module): 12 | """ 13 | Informer with ProbSparse attention in O(L log L) complexity 14 | """ 15 | def __init__(self, configs): 16 | super(Model, self).__init__() 17 | self.pred_len = configs.pred_len 18 | self.output_attention = configs.output_attention 19 | 20 | # Embedding 21 | if configs.embed_type == 0: 22 | self.enc_embedding = DataEmbedding(configs.enc_in, configs.d_model, configs.embed, configs.freq, 23 | configs.dropout) 24 | self.dec_embedding = DataEmbedding(configs.dec_in, configs.d_model, configs.embed, configs.freq, 25 | configs.dropout) 26 | elif configs.embed_type == 1: 27 | self.enc_embedding = DataEmbedding(configs.enc_in, configs.d_model, configs.embed, configs.freq, 28 | configs.dropout) 29 | self.dec_embedding = DataEmbedding(configs.dec_in, configs.d_model, configs.embed, configs.freq, 30 | configs.dropout) 31 | elif configs.embed_type == 2: 32 | self.enc_embedding = DataEmbedding_wo_pos(configs.enc_in, configs.d_model, configs.embed, configs.freq, 33 | configs.dropout) 34 | self.dec_embedding = DataEmbedding_wo_pos(configs.dec_in, configs.d_model, configs.embed, configs.freq, 35 | configs.dropout) 36 | 37 | elif configs.embed_type == 3: 38 | self.enc_embedding = DataEmbedding_wo_temp(configs.enc_in, configs.d_model, configs.embed, configs.freq, 39 | configs.dropout) 40 | self.dec_embedding = 
DataEmbedding_wo_temp(configs.dec_in, configs.d_model, configs.embed, configs.freq, 41 | configs.dropout) 42 | elif configs.embed_type == 4: 43 | self.enc_embedding = DataEmbedding_wo_pos_temp(configs.enc_in, configs.d_model, configs.embed, configs.freq, 44 | configs.dropout) 45 | self.dec_embedding = DataEmbedding_wo_pos_temp(configs.dec_in, configs.d_model, configs.embed, configs.freq, 46 | configs.dropout) 47 | # Encoder 48 | self.encoder = Encoder( 49 | [ 50 | EncoderLayer( 51 | AttentionLayer( 52 | ProbAttention(False, configs.factor, attention_dropout=configs.dropout, 53 | output_attention=configs.output_attention), 54 | configs.d_model, configs.n_heads), 55 | configs.d_model, 56 | configs.d_ff, 57 | dropout=configs.dropout, 58 | activation=configs.activation 59 | ) for l in range(configs.e_layers) 60 | ], 61 | [ 62 | ConvLayer( 63 | configs.d_model 64 | ) for l in range(configs.e_layers - 1) 65 | ] if configs.distil else None, 66 | norm_layer=torch.nn.LayerNorm(configs.d_model) 67 | ) 68 | # Decoder 69 | self.decoder = Decoder( 70 | [ 71 | DecoderLayer( 72 | AttentionLayer( 73 | ProbAttention(True, configs.factor, attention_dropout=configs.dropout, output_attention=False), 74 | configs.d_model, configs.n_heads), 75 | AttentionLayer( 76 | ProbAttention(False, configs.factor, attention_dropout=configs.dropout, output_attention=False), 77 | configs.d_model, configs.n_heads), 78 | configs.d_model, 79 | configs.d_ff, 80 | dropout=configs.dropout, 81 | activation=configs.activation, 82 | ) 83 | for l in range(configs.d_layers) 84 | ], 85 | norm_layer=torch.nn.LayerNorm(configs.d_model), 86 | projection=nn.Linear(configs.d_model, configs.c_out, bias=True) 87 | ) 88 | 89 | def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, 90 | enc_self_mask=None, dec_self_mask=None, dec_enc_mask=None): 91 | 92 | enc_out = self.enc_embedding(x_enc, x_mark_enc) 93 | enc_out, attns = self.encoder(enc_out, attn_mask=enc_self_mask) 94 | 95 | dec_out = self.dec_embedding(x_dec, x_mark_dec) 96 | dec_out = self.decoder(dec_out, enc_out, x_mask=dec_self_mask, cross_mask=dec_enc_mask) 97 | 98 | if self.output_attention: 99 | return dec_out[:, -self.pred_len:, :], attns 100 | else: 101 | return dec_out[:, -self.pred_len:, :] # [B, L, D] -------------------------------------------------------------------------------- /ICTSP/models/Linear.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import numpy as np 5 | 6 | class Model(nn.Module): 7 | """ 8 | Just one Linear layer 9 | """ 10 | def __init__(self, configs): 11 | super(Model, self).__init__() 12 | self.seq_len = configs.seq_len 13 | self.pred_len = configs.pred_len 14 | self.Linear = nn.Linear(self.seq_len, self.pred_len) 15 | # Use this line if you want to visualize the weights 16 | # self.Linear.weight = nn.Parameter((1/self.seq_len)*torch.ones([self.pred_len,self.seq_len])) 17 | 18 | def forward(self, x): 19 | # x: [Batch, Input length, Channel] 20 | x = self.Linear(x.permute(0,2,1)).permute(0,2,1) 21 | return x # [Batch, Output length, Channel] -------------------------------------------------------------------------------- /ICTSP/models/NLinear.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import numpy as np 5 | 6 | class Model(nn.Module): 7 | """ 8 | Normalization-Linear 9 | """ 10 | def __init__(self, configs): 11 | 
super(Model, self).__init__() 12 | self.seq_len = configs.seq_len 13 | self.pred_len = configs.pred_len 14 | self.Linear = nn.Linear(self.seq_len, self.pred_len) 15 | # Use this line if you want to visualize the weights 16 | # self.Linear.weight = nn.Parameter((1/self.seq_len)*torch.ones([self.pred_len,self.seq_len])) 17 | 18 | def forward(self, x): 19 | # x: [Batch, Input length, Channel] 20 | seq_last = x[:,-1:,:].detach() 21 | x = x - seq_last 22 | x = self.Linear(x.permute(0,2,1)).permute(0,2,1) 23 | x = x + seq_last 24 | return x # [Batch, Output length, Channel] -------------------------------------------------------------------------------- /ICTSP/models/PatchTST.py: -------------------------------------------------------------------------------- 1 | __all__ = ['PatchTST'] 2 | 3 | # Cell 4 | from typing import Callable, Optional 5 | import torch 6 | from torch import nn 7 | from torch import Tensor 8 | import torch.nn.functional as F 9 | import numpy as np 10 | 11 | from layers.PatchTST_backbone import PatchTST_backbone 12 | from layers.PatchTST_layers import series_decomp 13 | 14 | 15 | class Model(nn.Module): 16 | def __init__(self, configs, max_seq_len:Optional[int]=1024, d_k:Optional[int]=None, d_v:Optional[int]=None, norm:str='BatchNorm', attn_dropout:float=0., 17 | act:str="gelu", key_padding_mask:bool='auto',padding_var:Optional[int]=None, attn_mask:Optional[Tensor]=None, res_attention:bool=True, 18 | pre_norm:bool=False, store_attn:bool=False, pe:str='zeros', learn_pe:bool=True, pretrain_head:bool=False, head_type = 'flatten', verbose:bool=False, **kwargs): 19 | 20 | super().__init__() 21 | 22 | # load parameters 23 | c_in = configs.enc_in 24 | context_window = configs.seq_len 25 | target_window = configs.pred_len 26 | 27 | n_layers = configs.e_layers 28 | n_heads = configs.n_heads 29 | d_model = configs.d_model 30 | d_ff = configs.d_ff 31 | dropout = configs.dropout 32 | fc_dropout = configs.fc_dropout 33 | head_dropout = configs.head_dropout 34 | 35 | individual = configs.individual 36 | 37 | patch_len = configs.patch_len 38 | stride = configs.stride 39 | padding_patch = configs.padding_patch 40 | 41 | revin = configs.revin 42 | affine = configs.affine 43 | subtract_last = configs.subtract_last 44 | 45 | decomposition = configs.decomposition 46 | kernel_size = configs.kernel_size 47 | 48 | 49 | # model 50 | self.decomposition = decomposition 51 | if self.decomposition: 52 | self.decomp_module = series_decomp(kernel_size) 53 | self.model_trend = PatchTST_backbone(c_in=c_in, context_window = context_window, target_window=target_window, patch_len=patch_len, stride=stride, 54 | max_seq_len=max_seq_len, n_layers=n_layers, d_model=d_model, 55 | n_heads=n_heads, d_k=d_k, d_v=d_v, d_ff=d_ff, norm=norm, attn_dropout=attn_dropout, 56 | dropout=dropout, act=act, key_padding_mask=key_padding_mask, padding_var=padding_var, 57 | attn_mask=attn_mask, res_attention=res_attention, pre_norm=pre_norm, store_attn=store_attn, 58 | pe=pe, learn_pe=learn_pe, fc_dropout=fc_dropout, head_dropout=head_dropout, padding_patch = padding_patch, 59 | pretrain_head=pretrain_head, head_type=head_type, individual=individual, revin=revin, affine=affine, 60 | subtract_last=subtract_last, verbose=verbose, **kwargs) 61 | self.model_res = PatchTST_backbone(c_in=c_in, context_window = context_window, target_window=target_window, patch_len=patch_len, stride=stride, 62 | max_seq_len=max_seq_len, n_layers=n_layers, d_model=d_model, 63 | n_heads=n_heads, d_k=d_k, d_v=d_v, d_ff=d_ff, norm=norm, 
attn_dropout=attn_dropout, 64 | dropout=dropout, act=act, key_padding_mask=key_padding_mask, padding_var=padding_var, 65 | attn_mask=attn_mask, res_attention=res_attention, pre_norm=pre_norm, store_attn=store_attn, 66 | pe=pe, learn_pe=learn_pe, fc_dropout=fc_dropout, head_dropout=head_dropout, padding_patch = padding_patch, 67 | pretrain_head=pretrain_head, head_type=head_type, individual=individual, revin=revin, affine=affine, 68 | subtract_last=subtract_last, verbose=verbose, **kwargs) 69 | else: 70 | self.model = PatchTST_backbone(c_in=c_in, context_window = context_window, target_window=target_window, patch_len=patch_len, stride=stride, 71 | max_seq_len=max_seq_len, n_layers=n_layers, d_model=d_model, 72 | n_heads=n_heads, d_k=d_k, d_v=d_v, d_ff=d_ff, norm=norm, attn_dropout=attn_dropout, 73 | dropout=dropout, act=act, key_padding_mask=key_padding_mask, padding_var=padding_var, 74 | attn_mask=attn_mask, res_attention=res_attention, pre_norm=pre_norm, store_attn=store_attn, 75 | pe=pe, learn_pe=learn_pe, fc_dropout=fc_dropout, head_dropout=head_dropout, padding_patch = padding_patch, 76 | pretrain_head=pretrain_head, head_type=head_type, individual=individual, revin=revin, affine=affine, 77 | subtract_last=subtract_last, verbose=verbose, **kwargs) 78 | 79 | 80 | def forward(self, x): # x: [Batch, Input length, Channel] 81 | if self.decomposition: 82 | res_init, trend_init = self.decomp_module(x) 83 | res_init, trend_init = res_init.permute(0,2,1), trend_init.permute(0,2,1) # x: [Batch, Channel, Input length] 84 | res = self.model_res(res_init) 85 | trend = self.model_trend(trend_init) 86 | x = res + trend 87 | x = x.permute(0,2,1) # x: [Batch, Input length, Channel] 88 | else: 89 | x = x.permute(0,2,1) # x: [Batch, Channel, Input length] 90 | x = self.model(x) 91 | x = x.permute(0,2,1) # x: [Batch, Input length, Channel] 92 | return x -------------------------------------------------------------------------------- /ICTSP/models/Stat_models.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import numpy as np 5 | from tqdm import tqdm 6 | import pmdarima as pm 7 | import threading 8 | from sklearn.ensemble import GradientBoostingRegressor 9 | 10 | class Naive_repeat(nn.Module): 11 | def __init__(self, configs): 12 | super(Naive_repeat, self).__init__() 13 | self.pred_len = configs.pred_len 14 | 15 | def forward(self, x): 16 | B,L,D = x.shape 17 | x = x[:,-1,:].reshape(B,1,D).repeat(self.pred_len,axis=1) 18 | return x # [B, L, D] 19 | 20 | class Naive_thread(threading.Thread): 21 | def __init__(self,func,args=()): 22 | super(Naive_thread,self).__init__() 23 | self.func = func 24 | self.args = args 25 | 26 | def run(self): 27 | self.results = self.func(*self.args) 28 | 29 | def return_result(self): 30 | threading.Thread.join(self) 31 | return self.results 32 | 33 | def _arima(seq,pred_len,bt,i): 34 | model = pm.auto_arima(seq) 35 | forecasts = model.predict(pred_len) 36 | return forecasts,bt,i 37 | 38 | class Arima(nn.Module): 39 | """ 40 | Extremely slow, please sample < 0.1 41 | """ 42 | def __init__(self, configs): 43 | super(Arima, self).__init__() 44 | self.pred_len = configs.pred_len 45 | 46 | def forward(self, x): 47 | result = np.zeros([x.shape[0],self.pred_len,x.shape[2]]) 48 | threads = [] 49 | for bt,seqs in tqdm(enumerate(x)): 50 | for i in range(seqs.shape[-1]): 51 | seq = seqs[:,i] 52 | one_seq = Naive_thread(func=_arima,args=(seq,self.pred_len,bt,i)) 53 | 
threads.append(one_seq) 54 | threads[-1].start() 55 | for every_thread in tqdm(threads): 56 | forecast,bt,i = every_thread.return_result() 57 | result[bt,:,i] = forecast 58 | 59 | return result # [B, L, D] 60 | 61 | def _sarima(season,seq,pred_len,bt,i): 62 | model = pm.auto_arima(seq, seasonal=True, m=season) 63 | forecasts = model.predict(pred_len) 64 | return forecasts,bt,i 65 | 66 | class SArima(nn.Module): 67 | """ 68 | Extremely slow; please use a sampling rate < 0.01 69 | """ 70 | def __init__(self, configs): 71 | super(SArima, self).__init__() 72 | self.pred_len = configs.pred_len 73 | self.seq_len = configs.seq_len 74 | self.season = 24 75 | if 'ETTm' in configs.data_path: 76 | self.season = 12 77 | elif 'ILI' in configs.data_path: 78 | self.season = 1 79 | if self.season >= self.seq_len: 80 | self.season = 1 81 | 82 | def forward(self, x): 83 | result = np.zeros([x.shape[0],self.pred_len,x.shape[2]]) 84 | threads = [] 85 | for bt,seqs in tqdm(enumerate(x)): 86 | for i in range(seqs.shape[-1]): 87 | seq = seqs[:,i] 88 | one_seq = Naive_thread(func=_sarima,args=(self.season,seq,self.pred_len,bt,i)) 89 | threads.append(one_seq) 90 | threads[-1].start() 91 | for every_thread in tqdm(threads): 92 | forecast,bt,i = every_thread.return_result() 93 | result[bt,:,i] = forecast 94 | return result # [B, L, D] 95 | 96 | def _gbrt(seq,seq_len,pred_len,bt,i): 97 | model = GradientBoostingRegressor() 98 | model.fit(np.arange(seq_len).reshape(-1,1),seq.reshape(-1,1)) 99 | forecasts = model.predict(np.arange(seq_len,seq_len+pred_len).reshape(-1,1)) 100 | return forecasts,bt,i 101 | 102 | class GBRT(nn.Module): 103 | def __init__(self, configs): 104 | super(GBRT, self).__init__() 105 | self.seq_len = configs.seq_len 106 | self.pred_len = configs.pred_len 107 | 108 | def forward(self, x): 109 | result = np.zeros([x.shape[0],self.pred_len,x.shape[2]]) 110 | threads = [] 111 | for bt,seqs in tqdm(enumerate(x)): 112 | for i in range(seqs.shape[-1]): 113 | seq = seqs[:,i] 114 | one_seq = Naive_thread(func=_gbrt,args=(seq,self.seq_len,self.pred_len,bt,i)) 115 | threads.append(one_seq) 116 | threads[-1].start() 117 | for every_thread in tqdm(threads): 118 | forecast,bt,i = every_thread.return_result() 119 | result[bt,:,i] = forecast 120 | return result # [B, L, D] 121 | -------------------------------------------------------------------------------- /ICTSP/models/TiDE.py: -------------------------------------------------------------------------------- 1 | ### From: https://github.com/thuml/Time-Series-Library/blob/main/models/TiDE.py 2 | 3 | import torch 4 | import torch.nn as nn 5 | import torch.nn.functional as F 6 | 7 | 8 | class LayerNorm(nn.Module): 9 | """ LayerNorm but with an optional bias. 
PyTorch doesn't support simply bias=False """ 10 | 11 | def __init__(self, ndim, bias): 12 | super().__init__() 13 | self.weight = nn.Parameter(torch.ones(ndim)) 14 | self.bias = nn.Parameter(torch.zeros(ndim)) if bias else None 15 | 16 | def forward(self, input): 17 | return F.layer_norm(input, self.weight.shape, self.weight, self.bias, 1e-5) 18 | 19 | 20 | 21 | class ResBlock(nn.Module): 22 | def __init__(self, input_dim, hidden_dim, output_dim, dropout=0.1, bias=True): 23 | super().__init__() 24 | 25 | self.fc1 = nn.Linear(input_dim, hidden_dim, bias=bias) 26 | self.fc2 = nn.Linear(hidden_dim, output_dim, bias=bias) 27 | self.fc3 = nn.Linear(input_dim, output_dim, bias=bias) 28 | self.dropout = nn.Dropout(dropout) 29 | self.relu = nn.ReLU() 30 | self.ln = LayerNorm(output_dim, bias=bias) 31 | 32 | def forward(self, x): 33 | 34 | out = self.fc1(x) 35 | out = self.relu(out) 36 | out = self.fc2(out) 37 | out = self.dropout(out) 38 | out = out + self.fc3(x) 39 | out = self.ln(out) 40 | return out 41 | 42 | 43 | #TiDE 44 | class Model(nn.Module): 45 | """ 46 | paper: https://arxiv.org/pdf/2304.08424.pdf 47 | """ 48 | def __init__(self, configs, bias=True, feature_encode_dim=2): 49 | super(Model, self).__init__() 50 | self.configs = configs 51 | self.task_name = 'long_term_forecast'#configs.task_name 52 | self.seq_len = configs.seq_len #L 53 | self.label_len = configs.label_len 54 | self.pred_len = configs.pred_len #H 55 | self.hidden_dim=configs.d_model 56 | self.res_hidden=configs.d_model 57 | self.encoder_num=configs.e_layers 58 | self.decoder_num=configs.d_layers 59 | self.freq=configs.freq 60 | self.feature_encode_dim=feature_encode_dim 61 | self.decode_dim = configs.enc_in + 1#configs.c_out 62 | self.temporalDecoderHidden=configs.d_ff 63 | dropout=configs.dropout 64 | 65 | 66 | freq_map = {'h': 4, 't': 5, 's': 6, 67 | 'm': 1, 'a': 1, 'w': 2, 'd': 3, 'b': 3} 68 | 69 | self.feature_dim=freq_map[self.freq] 70 | 71 | 72 | flatten_dim = self.seq_len + (self.seq_len + self.pred_len) * self.feature_encode_dim 73 | 74 | self.feature_encoder = ResBlock(self.feature_dim, self.res_hidden, self.feature_encode_dim, dropout, bias) 75 | self.encoders = nn.Sequential(ResBlock(flatten_dim, self.res_hidden, self.hidden_dim, dropout, bias),*([ ResBlock(self.hidden_dim, self.res_hidden, self.hidden_dim, dropout, bias)]*(self.encoder_num-1))) 76 | if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast': 77 | self.decoders = nn.Sequential(*([ ResBlock(self.hidden_dim, self.res_hidden, self.hidden_dim, dropout, bias)]*(self.decoder_num-1)),ResBlock(self.hidden_dim, self.res_hidden, self.decode_dim * self.pred_len, dropout, bias)) 78 | self.temporalDecoder = ResBlock(self.decode_dim + self.feature_encode_dim, self.temporalDecoderHidden, 1, dropout, bias) 79 | self.residual_proj = nn.Linear(self.seq_len, self.pred_len, bias=bias) 80 | if self.task_name == 'imputation': 81 | self.decoders = nn.Sequential(*([ ResBlock(self.hidden_dim, self.res_hidden, self.hidden_dim, dropout, bias)]*(self.decoder_num-1)),ResBlock(self.hidden_dim, self.res_hidden, self.decode_dim * self.seq_len, dropout, bias)) 82 | self.temporalDecoder = ResBlock(self.decode_dim + self.feature_encode_dim, self.temporalDecoderHidden, 1, dropout, bias) 83 | self.residual_proj = nn.Linear(self.seq_len, self.seq_len, bias=bias) 84 | if self.task_name == 'anomaly_detection': 85 | self.decoders = nn.Sequential(*([ ResBlock(self.hidden_dim, self.res_hidden, self.hidden_dim, dropout, 
bias)]*(self.decoder_num-1)),ResBlock(self.hidden_dim, self.res_hidden, self.decode_dim * self.seq_len, dropout, bias)) 86 | self.temporalDecoder = ResBlock(self.decode_dim + self.feature_encode_dim, self.temporalDecoderHidden, 1, dropout, bias) 87 | self.residual_proj = nn.Linear(self.seq_len, self.seq_len, bias=bias) 88 | 89 | 90 | def forecast(self, x_enc, x_mark_enc, x_dec, batch_y_mark): 91 | # Normalization 92 | means = x_enc.mean(1, keepdim=True).detach() 93 | x_enc = x_enc - means 94 | stdev = torch.sqrt(torch.var(x_enc.clone(), dim=1, keepdim=True, unbiased=False) + 1e-5) 95 | x_enc = x_enc / stdev 96 | 97 | feature = self.feature_encoder(batch_y_mark) 98 | hidden = self.encoders(torch.cat([x_enc, feature.reshape(feature.shape[0], -1)], dim=-1)) 99 | decoded = self.decoders(hidden).reshape(hidden.shape[0], self.pred_len, self.decode_dim) 100 | dec_out = self.temporalDecoder(torch.cat([feature[:,self.seq_len:], decoded], dim=-1)).squeeze(-1) + self.residual_proj(x_enc) 101 | 102 | 103 | # De-Normalization 104 | dec_out = dec_out * (stdev[:, 0].unsqueeze(1).repeat(1, self.pred_len)) 105 | dec_out = dec_out + (means[:, 0].unsqueeze(1).repeat(1, self.pred_len)) 106 | return dec_out 107 | 108 | def imputation(self, x_enc, x_mark_enc, x_dec, batch_y_mark, mask): 109 | # Normalization 110 | means = x_enc.mean(1, keepdim=True).detach() 111 | x_enc = x_enc - means 112 | stdev = torch.sqrt(torch.var(x_enc, dim=1, keepdim=True, unbiased=False) + 1e-5) 113 | x_enc /= stdev 114 | 115 | feature = self.feature_encoder(x_mark_enc) 116 | hidden = self.encoders(torch.cat([x_enc, feature.reshape(feature.shape[0], -1)], dim=-1)) 117 | decoded = self.decoders(hidden).reshape(hidden.shape[0], self.seq_len, self.decode_dim) 118 | dec_out = self.temporalDecoder(torch.cat([feature[:,:self.seq_len], decoded], dim=-1)).squeeze(-1) + self.residual_proj(x_enc) 119 | 120 | # De-Normalization 121 | dec_out = dec_out * (stdev[:, 0].unsqueeze(1).repeat(1, self.seq_len)) 122 | dec_out = dec_out + (means[:, 0].unsqueeze(1).repeat(1, self.seq_len)) 123 | return dec_out 124 | 125 | 126 | def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None): 127 | '''x_mark_enc is the exogenous dynamic feature described in the original paper''' 128 | if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast': 129 | x_mark_dec=torch.concat([x_mark_enc, x_mark_dec[:, -self.pred_len:, :]],dim=1) 130 | dec_out = torch.stack([self.forecast(x_enc[:, :, feature], x_mark_enc, x_dec, x_mark_dec) for feature in range(x_enc.shape[-1])],dim=-1) 131 | return dec_out # [B, L, D] 132 | if self.task_name == 'imputation': 133 | dec_out = torch.stack([self.imputation(x_enc[:, :, feature], x_mark_enc, x_dec, x_mark_dec, mask) for feature in range(x_enc.shape[-1])],dim=-1) 134 | return dec_out # [B, L, D] 135 | if self.task_name == 'anomaly_detection': 136 | raise NotImplementedError("Task anomaly_detection for Tide is temporarily not supported") 137 | if self.task_name == 'classification': 138 | raise NotImplementedError("Task classification for Tide is temporarily not supported") 139 | return None 140 | 141 | 142 | 143 | 144 | -------------------------------------------------------------------------------- /ICTSP/models/TimesNet.py: -------------------------------------------------------------------------------- 1 | ### From: https://github.com/thuml/Time-Series-Library/blob/main/scripts/long_term_forecast/ETT_script/TimesNet_ETTh2.sh 2 | 3 | import torch 4 | import torch.nn as nn 5 | import torch.nn.functional 
as F 6 | import torch.fft 7 | from layers.Embed import DataEmbedding 8 | from layers.Conv_Blocks import Inception_Block_V1 9 | 10 | 11 | def FFT_for_Period(x, k=2): 12 | # [B, T, C] 13 | xf = torch.fft.rfft(x, dim=1) 14 | # find period by amplitudes 15 | frequency_list = abs(xf).mean(0).mean(-1) 16 | frequency_list[0] = 0 17 | _, top_list = torch.topk(frequency_list, k) 18 | top_list = top_list.detach().cpu().numpy() 19 | period = x.shape[1] // top_list 20 | return period, abs(xf).mean(-1)[:, top_list] 21 | 22 | 23 | class TimesBlock(nn.Module): 24 | def __init__(self, configs): 25 | super(TimesBlock, self).__init__() 26 | self.seq_len = configs.seq_len 27 | self.pred_len = configs.pred_len 28 | self.k = configs.top_k 29 | # parameter-efficient design 30 | self.conv = nn.Sequential( 31 | Inception_Block_V1(configs.d_model, configs.d_ff, 32 | num_kernels=configs.num_kernels), 33 | nn.GELU(), 34 | Inception_Block_V1(configs.d_ff, configs.d_model, 35 | num_kernels=configs.num_kernels) 36 | ) 37 | 38 | def forward(self, x): 39 | B, T, N = x.size() 40 | period_list, period_weight = FFT_for_Period(x, self.k) 41 | 42 | res = [] 43 | for i in range(self.k): 44 | period = period_list[i] 45 | # padding 46 | if (self.seq_len + self.pred_len) % period != 0: 47 | length = ( 48 | ((self.seq_len + self.pred_len) // period) + 1) * period 49 | padding = torch.zeros([x.shape[0], (length - (self.seq_len + self.pred_len)), x.shape[2]]).to(x.device) 50 | out = torch.cat([x, padding], dim=1) 51 | else: 52 | length = (self.seq_len + self.pred_len) 53 | out = x 54 | # reshape 55 | out = out.reshape(B, length // period, period, 56 | N).permute(0, 3, 1, 2).contiguous() 57 | # 2D conv: from 1d Variation to 2d Variation 58 | out = self.conv(out) 59 | # reshape back 60 | out = out.permute(0, 2, 3, 1).reshape(B, -1, N) 61 | res.append(out[:, :(self.seq_len + self.pred_len), :]) 62 | res = torch.stack(res, dim=-1) 63 | # adaptive aggregation 64 | period_weight = F.softmax(period_weight, dim=1) 65 | period_weight = period_weight.unsqueeze( 66 | 1).unsqueeze(1).repeat(1, T, N, 1) 67 | res = torch.sum(res * period_weight, -1) 68 | # residual connection 69 | res = res + x 70 | return res 71 | 72 | 73 | class Model(nn.Module): 74 | """ 75 | Paper link: https://openreview.net/pdf?id=ju_Uqw384Oq 76 | """ 77 | 78 | def __init__(self, configs): 79 | super(Model, self).__init__() 80 | configs.task_name = 'long_term_forecast' 81 | 82 | self.configs = configs 83 | self.task_name = configs.task_name 84 | self.seq_len = configs.seq_len 85 | self.label_len = configs.label_len 86 | self.pred_len = configs.pred_len 87 | 88 | #configs.label_len = 48 89 | configs.d_model = 32 90 | configs.d_ff = 32 91 | configs.num_kernels = 6 92 | configs.top_k = 5 93 | configs.e_layers = 2 94 | configs.d_layers = 1 95 | configs.dec_in = configs.enc_in 96 | configs.c_out = configs.enc_in 97 | 98 | 99 | self.model = nn.ModuleList([TimesBlock(configs) 100 | for _ in range(configs.e_layers)]) 101 | self.enc_embedding = DataEmbedding(configs.enc_in, configs.d_model, configs.embed, configs.freq, 102 | configs.dropout) 103 | self.layer = configs.e_layers 104 | self.layer_norm = nn.LayerNorm(configs.d_model) 105 | if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast': 106 | self.predict_linear = nn.Linear( 107 | self.seq_len, self.pred_len + self.seq_len) 108 | self.projection = nn.Linear( 109 | configs.d_model, configs.c_out, bias=True) 110 | if self.task_name == 'imputation' or self.task_name == 'anomaly_detection': 111 | 
self.projection = nn.Linear( 112 | configs.d_model, configs.c_out, bias=True) 113 | if self.task_name == 'classification': 114 | self.act = F.gelu 115 | self.dropout = nn.Dropout(configs.dropout) 116 | self.projection = nn.Linear( 117 | configs.d_model * configs.seq_len, configs.num_class) 118 | 119 | def forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec): 120 | # Normalization from Non-stationary Transformer 121 | means = x_enc.mean(1, keepdim=True).detach() 122 | x_enc = x_enc - means 123 | stdev = torch.sqrt( 124 | torch.var(x_enc.clone(), dim=1, keepdim=True, unbiased=False) + 1e-5) 125 | x_enc = x_enc / stdev 126 | 127 | # embedding 128 | enc_out = self.enc_embedding(x_enc, x_mark_enc) # [B,T,C] 129 | enc_out = self.predict_linear(enc_out.permute(0, 2, 1)).permute( 130 | 0, 2, 1) # align temporal dimension 131 | # TimesNet 132 | for i in range(self.layer): 133 | enc_out = self.layer_norm(self.model[i](enc_out)) 134 | # project back 135 | dec_out = self.projection(enc_out) 136 | 137 | # De-Normalization from Non-stationary Transformer 138 | dec_out = dec_out * \ 139 | (stdev[:, 0, :].unsqueeze(1).repeat( 140 | 1, self.pred_len + self.seq_len, 1)) 141 | dec_out = dec_out + \ 142 | (means[:, 0, :].unsqueeze(1).repeat( 143 | 1, self.pred_len + self.seq_len, 1)) 144 | return dec_out 145 | 146 | def imputation(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask): 147 | # Normalization from Non-stationary Transformer 148 | means = torch.sum(x_enc, dim=1) / torch.sum(mask == 1, dim=1) 149 | means = means.unsqueeze(1).detach() 150 | x_enc = x_enc - means 151 | x_enc = x_enc.masked_fill(mask == 0, 0) 152 | stdev = torch.sqrt(torch.sum(x_enc * x_enc, dim=1) / 153 | torch.sum(mask == 1, dim=1) + 1e-5) 154 | stdev = stdev.unsqueeze(1).detach() 155 | x_enc /= stdev 156 | 157 | # embedding 158 | enc_out = self.enc_embedding(x_enc, x_mark_enc) # [B,T,C] 159 | # TimesNet 160 | for i in range(self.layer): 161 | enc_out = self.layer_norm(self.model[i](enc_out)) 162 | # project back 163 | dec_out = self.projection(enc_out) 164 | 165 | # De-Normalization from Non-stationary Transformer 166 | dec_out = dec_out * \ 167 | (stdev[:, 0, :].unsqueeze(1).repeat( 168 | 1, self.pred_len + self.seq_len, 1)) 169 | dec_out = dec_out + \ 170 | (means[:, 0, :].unsqueeze(1).repeat( 171 | 1, self.pred_len + self.seq_len, 1)) 172 | return dec_out 173 | 174 | def anomaly_detection(self, x_enc): 175 | # Normalization from Non-stationary Transformer 176 | means = x_enc.mean(1, keepdim=True).detach() 177 | x_enc = x_enc - means 178 | stdev = torch.sqrt( 179 | torch.var(x_enc, dim=1, keepdim=True, unbiased=False) + 1e-5) 180 | x_enc /= stdev 181 | 182 | # embedding 183 | enc_out = self.enc_embedding(x_enc, None) # [B,T,C] 184 | # TimesNet 185 | for i in range(self.layer): 186 | enc_out = self.layer_norm(self.model[i](enc_out)) 187 | # project back 188 | dec_out = self.projection(enc_out) 189 | 190 | # De-Normalization from Non-stationary Transformer 191 | dec_out = dec_out * \ 192 | (stdev[:, 0, :].unsqueeze(1).repeat( 193 | 1, self.pred_len + self.seq_len, 1)) 194 | dec_out = dec_out + \ 195 | (means[:, 0, :].unsqueeze(1).repeat( 196 | 1, self.pred_len + self.seq_len, 1)) 197 | return dec_out 198 | 199 | def classification(self, x_enc, x_mark_enc): 200 | # embedding 201 | enc_out = self.enc_embedding(x_enc, None) # [B,T,C] 202 | # TimesNet 203 | for i in range(self.layer): 204 | enc_out = self.layer_norm(self.model[i](enc_out)) 205 | 206 | # Output 207 | # the output transformer encoder/decoder embeddings don't include 
non-linearity 208 | output = self.act(enc_out) 209 | output = self.dropout(output) 210 | # zero-out padding embeddings 211 | output = output * x_mark_enc.unsqueeze(-1) 212 | # (batch_size, seq_length * d_model) 213 | output = output.reshape(output.shape[0], -1) 214 | output = self.projection(output) # (batch_size, num_classes) 215 | return output 216 | 217 | def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None): 218 | if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast': 219 | dec_out = self.forecast(x_enc, x_mark_enc, x_dec, x_mark_dec) 220 | return dec_out[:, -self.pred_len:, :] # [B, L, D] 221 | if self.task_name == 'imputation': 222 | dec_out = self.imputation( 223 | x_enc, x_mark_enc, x_dec, x_mark_dec, mask) 224 | return dec_out # [B, L, D] 225 | if self.task_name == 'anomaly_detection': 226 | dec_out = self.anomaly_detection(x_enc) 227 | return dec_out # [B, L, D] 228 | if self.task_name == 'classification': 229 | dec_out = self.classification(x_enc, x_mark_enc) 230 | return dec_out # [B, N] 231 | return None -------------------------------------------------------------------------------- /ICTSP/models/Transformer.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | from layers.Transformer_EncDec import Decoder, DecoderLayer, Encoder, EncoderLayer, ConvLayer 5 | from layers.SelfAttention_Family import FullAttention, AttentionLayer 6 | from layers.Embed import DataEmbedding,DataEmbedding_wo_pos,DataEmbedding_wo_temp,DataEmbedding_wo_pos_temp 7 | import numpy as np 8 | 9 | 10 | class Model(nn.Module): 11 | """ 12 | Vanilla Transformer with O(L^2) complexity 13 | """ 14 | def __init__(self, configs): 15 | super(Model, self).__init__() 16 | self.pred_len = configs.pred_len 17 | self.output_attention = configs.output_attention 18 | 19 | # Embedding 20 | if configs.embed_type == 0: 21 | self.enc_embedding = DataEmbedding(configs.enc_in, configs.d_model, configs.embed, configs.freq, 22 | configs.dropout) 23 | self.dec_embedding = DataEmbedding(configs.dec_in, configs.d_model, configs.embed, configs.freq, 24 | configs.dropout) 25 | elif configs.embed_type == 1: 26 | self.enc_embedding = DataEmbedding(configs.enc_in, configs.d_model, configs.embed, configs.freq, 27 | configs.dropout) 28 | self.dec_embedding = DataEmbedding(configs.dec_in, configs.d_model, configs.embed, configs.freq, 29 | configs.dropout) 30 | elif configs.embed_type == 2: 31 | self.enc_embedding = DataEmbedding_wo_pos(configs.enc_in, configs.d_model, configs.embed, configs.freq, 32 | configs.dropout) 33 | self.dec_embedding = DataEmbedding_wo_pos(configs.dec_in, configs.d_model, configs.embed, configs.freq, 34 | configs.dropout) 35 | 36 | elif configs.embed_type == 3: 37 | self.enc_embedding = DataEmbedding_wo_temp(configs.enc_in, configs.d_model, configs.embed, configs.freq, 38 | configs.dropout) 39 | self.dec_embedding = DataEmbedding_wo_temp(configs.dec_in, configs.d_model, configs.embed, configs.freq, 40 | configs.dropout) 41 | elif configs.embed_type == 4: 42 | self.enc_embedding = DataEmbedding_wo_pos_temp(configs.enc_in, configs.d_model, configs.embed, configs.freq, 43 | configs.dropout) 44 | self.dec_embedding = DataEmbedding_wo_pos_temp(configs.dec_in, configs.d_model, configs.embed, configs.freq, 45 | configs.dropout) 46 | # Encoder 47 | self.encoder = Encoder( 48 | [ 49 | EncoderLayer( 50 | AttentionLayer( 51 | FullAttention(False, configs.factor, 
attention_dropout=configs.dropout, 52 | output_attention=configs.output_attention), configs.d_model, configs.n_heads), 53 | configs.d_model, 54 | configs.d_ff, 55 | dropout=configs.dropout, 56 | activation=configs.activation 57 | ) for l in range(configs.e_layers) 58 | ], 59 | norm_layer=torch.nn.LayerNorm(configs.d_model) 60 | ) 61 | # Decoder 62 | self.decoder = Decoder( 63 | [ 64 | DecoderLayer( 65 | AttentionLayer( 66 | FullAttention(True, configs.factor, attention_dropout=configs.dropout, output_attention=False), 67 | configs.d_model, configs.n_heads), 68 | AttentionLayer( 69 | FullAttention(False, configs.factor, attention_dropout=configs.dropout, output_attention=False), 70 | configs.d_model, configs.n_heads), 71 | configs.d_model, 72 | configs.d_ff, 73 | dropout=configs.dropout, 74 | activation=configs.activation, 75 | ) 76 | for l in range(configs.d_layers) 77 | ], 78 | norm_layer=torch.nn.LayerNorm(configs.d_model), 79 | projection=nn.Linear(configs.d_model, configs.c_out, bias=True) 80 | ) 81 | 82 | def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, 83 | enc_self_mask=None, dec_self_mask=None, dec_enc_mask=None): 84 | 85 | enc_out = self.enc_embedding(x_enc, x_mark_enc) 86 | enc_out, attns = self.encoder(enc_out, attn_mask=enc_self_mask) 87 | 88 | dec_out = self.dec_embedding(x_dec, x_mark_dec) 89 | dec_out = self.decoder(dec_out, enc_out, x_mask=dec_self_mask, cross_mask=dec_enc_mask) 90 | 91 | if self.output_attention: 92 | return dec_out[:, -self.pred_len:, :], attns 93 | else: 94 | return dec_out[:, -self.pred_len:, :] # [B, L, D] -------------------------------------------------------------------------------- /ICTSP/models/iTransformer.py: -------------------------------------------------------------------------------- 1 | # From: https://github.com/thuml/iTransformer/blob/main/model/iTransformer.py 2 | 3 | 4 | import torch 5 | import torch.nn as nn 6 | import torch.nn.functional as F 7 | from layers.Transformer_EncDec import Encoder, EncoderLayer 8 | from layers.SelfAttention_Family import FullAttention, AttentionLayer 9 | from layers.Embed import DataEmbedding_inverted 10 | import numpy as np 11 | 12 | 13 | 14 | class DataEmbedding_inverted(nn.Module): 15 | def __init__(self, c_in, d_model, embed_type='fixed', freq='h', dropout=0.1): 16 | super(DataEmbedding_inverted, self).__init__() 17 | self.value_embedding = nn.Linear(c_in, d_model) 18 | self.dropout = nn.Dropout(p=dropout) 19 | 20 | def forward(self, x, x_mark): 21 | x = x.permute(0, 2, 1) 22 | # x: [Batch Variate Time] 23 | if x_mark is None: 24 | x = self.value_embedding(x) 25 | else: 26 | # the potential to take covariates (e.g. 
timestamps) as tokens 27 | x = self.value_embedding(torch.cat([x, x_mark.permute(0, 2, 1)], 1)) 28 | # x: [Batch Variate d_model] 29 | return self.dropout(x) 30 | 31 | class Model(nn.Module): 32 | """ 33 | Paper link: https://arxiv.org/abs/2310.06625 34 | """ 35 | 36 | def __init__(self, configs): 37 | super(Model, self).__init__() 38 | self.seq_len = configs.seq_len 39 | self.pred_len = configs.pred_len 40 | self.output_attention = configs.output_attention 41 | self.use_norm = configs.use_norm 42 | # Embedding 43 | self.enc_embedding = DataEmbedding_inverted(configs.seq_len, configs.d_model, configs.embed, configs.freq, 44 | configs.dropout) 45 | self.class_strategy = configs.class_strategy 46 | # Encoder-only architecture 47 | self.encoder = Encoder( 48 | [ 49 | EncoderLayer( 50 | AttentionLayer( 51 | FullAttention(False, configs.factor, attention_dropout=configs.dropout, 52 | output_attention=configs.output_attention), configs.d_model, configs.n_heads), 53 | configs.d_model, 54 | configs.d_ff, 55 | dropout=configs.dropout, 56 | activation=configs.activation 57 | ) for l in range(configs.e_layers) 58 | ], 59 | norm_layer=torch.nn.LayerNorm(configs.d_model) 60 | ) 61 | self.projector = nn.Linear(configs.d_model, configs.pred_len, bias=True) 62 | 63 | def forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec): 64 | if self.use_norm: 65 | # Normalization from Non-stationary Transformer 66 | means = x_enc.mean(1, keepdim=True).detach() 67 | x_enc = x_enc - means 68 | stdev = torch.sqrt(torch.var(x_enc, dim=1, keepdim=True, unbiased=False) + 1e-5) 69 | x_enc /= stdev 70 | 71 | _, _, N = x_enc.shape # B L N 72 | # B: batch_size; E: d_model; 73 | # L: seq_len; S: pred_len; 74 | # N: number of variate (tokens), can also includes covariates 75 | 76 | # Embedding 77 | # B L N -> B N E (B L N -> B L E in the vanilla Transformer) 78 | enc_out = self.enc_embedding(x_enc, x_mark_enc) # covariates (e.g timestamp) can be also embedded as tokens 79 | 80 | # B N E -> B N E (B L E -> B L E in the vanilla Transformer) 81 | # the dimensions of embedded time series has been inverted, and then processed by native attn, layernorm and ffn modules 82 | enc_out, attns = self.encoder(enc_out, attn_mask=None) 83 | 84 | # B N E -> B N S -> B S N 85 | dec_out = self.projector(enc_out).permute(0, 2, 1)[:, :, :N] # filter the covariates 86 | 87 | if self.use_norm: 88 | # De-Normalization from Non-stationary Transformer 89 | dec_out = dec_out * (stdev[:, 0, :].unsqueeze(1).repeat(1, self.pred_len, 1)) 90 | dec_out = dec_out + (means[:, 0, :].unsqueeze(1).repeat(1, self.pred_len, 1)) 91 | 92 | return dec_out 93 | 94 | 95 | def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None): 96 | dec_out = self.forecast(x_enc, x_mark_enc, x_dec, x_mark_dec) 97 | return dec_out[:, -self.pred_len:, :] # [B, L, D] -------------------------------------------------------------------------------- /ICTSP/requirements.txt: -------------------------------------------------------------------------------- 1 | einops 2 | fbm 3 | matplotlib 4 | numpy==1.26.4 5 | pandas 6 | ptflops 7 | pynvml 8 | scipy 9 | seaborn 10 | sympy 11 | torch 12 | tqdm 13 | tvm 14 | IPython 15 | tensorboard 16 | statsmodels 17 | statsforecast 18 | scikit_learn 19 | pmdarima 20 | lightgbm 21 | 22 | # pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124 -------------------------------------------------------------------------------- /ICTSP/scripts/ICTSP/ictsp_fewshot005.sh: 
-------------------------------------------------------------------------------- 1 | if [ ! -d "./logs" ]; then 2 | mkdir ./logs 3 | fi 4 | 5 | if [ ! -d "./logs/LongForecasting" ]; then 6 | mkdir ./logs/LongForecasting 7 | fi 8 | 9 | root_path_name=./dataset/ 10 | features=M 11 | 12 | resume="none" 13 | data_type=custom 14 | transfer_learning=1 15 | fix_embedding=0 16 | enc_in=7 # total number of columns 17 | number_of_targets=0 # number of target columns 18 | 19 | patience=50 20 | seq_len=1440 21 | lookback=512 22 | random_seed=2024 23 | test_every=200 24 | plot_every=10 25 | scale=1 26 | 27 | model_name=ICTSP 28 | batch_size=32 29 | batch_size_test=32 30 | learning_rate=0.0005 31 | # Training 32 | time_emb_dim=0 33 | 34 | e_layers=3 35 | d_model=128 36 | n_heads=8 37 | mlp_ratio=4 38 | dropout=0.5 39 | sampling_step=8 40 | token_retriever_flag=1 41 | linear_warmup_steps=5000 42 | token_limit=2048 43 | # Training 44 | max_grad_norm=0 45 | random_drop_training=0 # assumed default; referenced in the RD() tag of the run names below 46 | # fewshot rate 47 | fewshot_rate=0.05 48 | 49 | to_ds="ETTm2.csv" 50 | 51 | for pred_len in 96 # 192 336 720 52 | do 53 | 54 | data_alias="Fewshot-$fewshot_rate-to-$to_ds-$pred_len" 55 | data_name="[$fewshot_rate,$pred_len]$to_ds,$to_ds" 56 | 57 | python -u run_longExp.py \ 58 | --is_training 1 \ 59 | --root_path $root_path_name \ 60 | --data_path $data_name \ 61 | --model_id $model_name'_'$data_alias'_'$random_seed'_RD('$random_drop_training')_'$seq_len'_'$lookback'_'$pred_len'_K('$e_layers')_d('$d_model')_Dropout('$dropout')_m('$sampling_step')_W('$linear_warmup_steps')_Limit('$token_limit')_wRet('$token_retriever_flag \ 62 | --model $model_name \ 63 | --data $data_type \ 64 | --features $features \ 65 | --seq_len $seq_len \ 66 | --lookback $lookback \ 67 | --pred_len $pred_len \ 68 | --enc_in $enc_in \ 69 | --number_of_targets $number_of_targets \ 70 | --des 'Exp' \ 71 | --patience $patience \ 72 | --test_every $test_every \ 73 | --time_emb_dim $time_emb_dim \ 74 | --e_layers $e_layers \ 75 | --d_model $d_model \ 76 | --n_heads $n_heads \ 77 | --mlp_ratio $mlp_ratio \ 78 | --dropout $dropout \ 79 | --sampling_step $sampling_step \ 80 | --token_retriever_flag $token_retriever_flag \ 81 | --linear_warmup_steps $linear_warmup_steps \ 82 | --token_limit $token_limit \ 83 | --label_len $seq_len \ 84 | --random_seed $random_seed \ 85 | --max_grad_norm $max_grad_norm \ 86 | --scale $scale \ 87 | --train_epochs 200 \ 88 | --itr 1 \ 89 | --batch_size $batch_size \ 90 | --batch_size_test $batch_size_test \ 91 | --plot_every $plot_every \ 92 | --learning_rate $learning_rate \ 93 | --transfer_learning $transfer_learning \ 94 | --fix_embedding $fix_embedding \ 95 | --resume $resume >logs/LongForecasting/$model_name'_'$data_alias'_'$random_seed'_RD('$random_drop_training')_'$seq_len'_'$lookback'_'$pred_len'_K('$e_layers')_d('$d_model')_Dropout('$dropout')_m('$sampling_step')_W('$linear_warmup_steps')_Limit('$token_limit')_wRet('$token_retriever_flag')' 96 | done
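# [Added note, not part of the original script; the data_path format is inferred
# from how these scripts construct it, and may differ from the loader's actual parsing.]
# The "[rate,pred_len]source,target" prefix appears to set the training fraction drawn
# from `source` and the dataset used for validation/testing (`target`). For example:
#   data_name="[0.05,96]ETTm2.csv,ETTm2.csv"   # train on a 5% few-shot split of ETTm2, horizon 96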
-d "./logs/LongForecasting" ]; then 6 | mkdir ./logs/LongForecasting 7 | fi 8 | 9 | root_path_name=./dataset/ 10 | features=M 11 | 12 | resume="none" 13 | data_type=custom 14 | transfer_learning=1 15 | fix_embedding=0 16 | enc_in=7 17 | number_of_targets=0 18 | 19 | patience=50 20 | seq_len=1440 21 | lookback=512 22 | random_seed=2024 23 | test_every=200 24 | plot_every=10 25 | scale=1 26 | 27 | model_name=ICTSP 28 | batch_size=32 29 | batch_size_test=32 30 | learning_rate=0.0005 31 | # Training 32 | time_emb_dim=0 33 | 34 | e_layers=3 35 | d_model=128 36 | n_heads=8 37 | mlp_ratio=4 38 | dropout=0.5 39 | sampling_step=8 40 | token_retriever_flag=1 41 | linear_warmup_steps=5000 42 | token_limit=2048 43 | # Training 44 | max_grad_norm=0 45 | 46 | # fewshot rate 47 | fewshot_rate=0.10 48 | 49 | to_ds="ETTm2.csv" 50 | 51 | for pred_len in 96 # 192 336 720 52 | do 53 | 54 | data_alias="Fewshot-$fewshot_rate-to-$to_ds-$pred_len" 55 | data_name="[$fewshot_rate,$pred_len]$to_ds,$to_ds" 56 | 57 | python -u run_longExp.py \ 58 | --is_training 1 \ 59 | --root_path $root_path_name \ 60 | --data_path $data_name \ 61 | --model_id $model_name'_'$data_alias'_'$random_seed'_RD('$random_drop_training')_'$seq_len'_'$lookback'_'$pred_len'_K('$e_layers')_d('$d_model')_Dropout('$dropout')_m('$sampling_step')_W('$linear_warmup_steps')_Limit('$token_limit')_wRet('$token_retriever_flag \ 62 | --model $model_name \ 63 | --data $data_type \ 64 | --features $features \ 65 | --seq_len $seq_len \ 66 | --lookback $lookback \ 67 | --pred_len $pred_len \ 68 | --enc_in $enc_in \ 69 | --number_of_targets $number_of_targets \ 70 | --des 'Exp' \ 71 | --patience $patience \ 72 | --test_every $test_every \ 73 | --time_emb_dim $time_emb_dim \ 74 | --e_layers $e_layers \ 75 | --d_model $d_model \ 76 | --n_heads $n_heads \ 77 | --mlp_ratio $mlp_ratio \ 78 | --dropout $dropout \ 79 | --sampling_step $sampling_step \ 80 | --token_retriever_flag $token_retriever_flag \ 81 | --linear_warmup_steps $linear_warmup_steps \ 82 | --token_limit $token_limit \ 83 | --label_len $seq_len \ 84 | --random_seed $random_seed \ 85 | --max_grad_norm $max_grad_norm \ 86 | --scale $scale \ 87 | --train_epochs 200 \ 88 | --itr 1 \ 89 | --batch_size $batch_size \ 90 | --batch_size_test $batch_size_test \ 91 | --plot_every $plot_every \ 92 | --learning_rate $learning_rate \ 93 | --transfer_learning $transfer_learning \ 94 | --fix_embedding $fix_embedding \ 95 | --resume $resume >logs/LongForecasting/$model_name'_'$data_alias'_'$random_seed'_RD('$random_drop_training')_'$seq_len'_'$lookback'_'$pred_len'_K('$e_layers')_d('$d_model')_Dropout('$dropout')_m('$sampling_step')_W('$linear_warmup_steps')_Limit('$token_limit')_wRet('$token_retriever_flag')' 96 | done -------------------------------------------------------------------------------- /ICTSP/scripts/ICTSP/ictsp_full.sh: -------------------------------------------------------------------------------- 1 | if [ ! -d "./logs" ]; then 2 | mkdir ./logs 3 | fi 4 | 5 | if [ ! 
-d "./logs/LongForecasting" ]; then 6 | mkdir ./logs/LongForecasting 7 | fi 8 | 9 | root_path_name=./dataset/ 10 | features=M 11 | 12 | resume="none" 13 | fix_embedding=1 14 | number_of_targets=0 15 | 16 | patience=30 17 | seq_len=1440 18 | lookback=512 19 | random_seed=2024 20 | test_every=200 21 | plot_every=10 22 | scale=1 23 | 24 | model_name=ICTSP 25 | batch_size=32 26 | batch_size_test=32 27 | gradient_accumulation=1 28 | learning_rate=0.0005 29 | # Training 30 | time_emb_dim=0 # no timestamp embedding used 31 | e_layers=3 32 | d_model=128 33 | n_heads=8 34 | mlp_ratio=4 35 | dropout=0.5 36 | sampling_step=8 37 | token_retriever_flag=1 38 | linear_warmup_steps=5000 39 | token_limit=2048 40 | # Training 41 | max_grad_norm=0 42 | 43 | # m1 44 | 45 | for pred_len in 96 192 336 720 46 | do 47 | 48 | from_ds="ETTm2.csv" 49 | to_ds="ETTm2.csv" 50 | enc_in=7 51 | 52 | data_alias="Full-$from_ds-to-$to_ds-$pred_len" 53 | transfer_learning=1 54 | data_name="$from_ds,$to_ds" # same source and target datasets, not transfer learning 55 | data_type=custom 56 | 57 | python -u run_longExp.py \ 58 | --is_training 1 \ 59 | --root_path $root_path_name \ 60 | --data_path $data_name \ 61 | --model_id $model_name'_'$data_alias'_'$random_seed'_RD('$random_drop_training')_'$seq_len'_'$lookback'_'$pred_len'_K('$e_layers')_d('$d_model')_Dropout('$dropout')_m('$sampling_step')_W('$linear_warmup_steps')_Limit('$token_limit')_wRet('$token_retriever_flag \ 62 | --model $model_name \ 63 | --data $data_type \ 64 | --features $features \ 65 | --seq_len $seq_len \ 66 | --lookback $lookback \ 67 | --pred_len $pred_len \ 68 | --enc_in $enc_in \ 69 | --number_of_targets $number_of_targets \ 70 | --des 'Exp' \ 71 | --patience $patience \ 72 | --test_every $test_every \ 73 | --time_emb_dim $time_emb_dim \ 74 | --e_layers $e_layers \ 75 | --d_model $d_model \ 76 | --n_heads $n_heads \ 77 | --mlp_ratio $mlp_ratio \ 78 | --dropout $dropout \ 79 | --sampling_step $sampling_step \ 80 | --token_retriever_flag $token_retriever_flag \ 81 | --linear_warmup_steps $linear_warmup_steps \ 82 | --token_limit $token_limit \ 83 | --label_len $seq_len \ 84 | --random_seed $random_seed \ 85 | --max_grad_norm $max_grad_norm \ 86 | --scale $scale \ 87 | --train_epochs 1000 \ 88 | --itr 1 \ 89 | --batch_size $batch_size \ 90 | --batch_size_test $batch_size_test \ 91 | --plot_every $plot_every \ 92 | --learning_rate $learning_rate \ 93 | --transfer_learning $transfer_learning \ 94 | --fix_embedding $fix_embedding \ 95 | --gradient_accumulation $gradient_accumulation\ 96 | --resume $resume >logs/LongForecasting/$model_name'_'$data_alias'_'$random_seed'_RD('$random_drop_training')_'$seq_len'_'$lookback'_'$pred_len'_K('$e_layers')_d('$d_model')_Dropout('$dropout')_m('$sampling_step')_W('$linear_warmup_steps')_Limit('$token_limit')_wRet('$token_retriever_flag')' 97 | 98 | done -------------------------------------------------------------------------------- /ICTSP/scripts/ICTSP/ictsp_zeroshot.sh: -------------------------------------------------------------------------------- 1 | if [ ! -d "./logs" ]; then 2 | mkdir ./logs 3 | fi 4 | 5 | if [ ! 
-d "./logs/LongForecasting" ]; then 6 | mkdir ./logs/LongForecasting 7 | fi 8 | 9 | root_path_name=./dataset/ 10 | features=M 11 | 12 | resume="none" 13 | data_type=custom 14 | transfer_learning=1 15 | fix_embedding=0 16 | enc_in=7 # total number of columns 17 | number_of_targets=0 # number of columns 18 | 19 | patience=50 20 | seq_len=1440 21 | lookback=512 22 | random_seed=2024 23 | test_every=200 24 | plot_every=10 25 | scale=1 26 | 27 | model_name=ICTSP 28 | batch_size=32 29 | batch_size_test=32 30 | learning_rate=0.0005 31 | # Training 32 | time_emb_dim=0 33 | 34 | e_layers=3 35 | d_model=128 36 | n_heads=8 37 | mlp_ratio=4 38 | dropout=0.5 39 | sampling_step=8 40 | token_retriever_flag=1 41 | linear_warmup_steps=5000 42 | token_limit=2048 43 | # Training 44 | max_grad_norm=0 45 | 46 | # m1 -> m2 47 | 48 | from_ds="ETTm1.csv" 49 | to_ds="ETTm2.csv" 50 | 51 | for pred_len in 96 192 336 720 52 | do 53 | 54 | data_alias="Zeroshot-$from_ds-to-$to_ds-$pred_len" 55 | # to activate the pretraining mode 56 | data_name="[0.999999,$pred_len]$from_ds,$to_ds" 57 | 58 | python -u run_longExp.py \ 59 | --is_training 1 \ 60 | --root_path $root_path_name \ 61 | --data_path $data_name \ 62 | --model_id $model_name'_'$data_alias'_'$random_seed'_RD('$random_drop_training')_'$seq_len'_'$lookback'_'$pred_len'_K('$e_layers')_d('$d_model')_Dropout('$dropout')_m('$sampling_step')_W('$linear_warmup_steps')_Limit('$token_limit')_wRet('$token_retriever_flag \ 63 | --model $model_name \ 64 | --data $data_type \ 65 | --features $features \ 66 | --seq_len $seq_len \ 67 | --lookback $lookback \ 68 | --pred_len $pred_len \ 69 | --enc_in $enc_in \ 70 | --number_of_targets $number_of_targets \ 71 | --des 'Exp' \ 72 | --patience $patience \ 73 | --test_every $test_every \ 74 | --time_emb_dim $time_emb_dim \ 75 | --e_layers $e_layers \ 76 | --d_model $d_model \ 77 | --n_heads $n_heads \ 78 | --mlp_ratio $mlp_ratio \ 79 | --dropout $dropout \ 80 | --sampling_step $sampling_step \ 81 | --token_retriever_flag $token_retriever_flag \ 82 | --linear_warmup_steps $linear_warmup_steps \ 83 | --token_limit $token_limit \ 84 | --label_len $seq_len \ 85 | --random_seed $random_seed \ 86 | --max_grad_norm $max_grad_norm \ 87 | --scale $scale \ 88 | --train_epochs 200 \ 89 | --itr 1 \ 90 | --batch_size $batch_size \ 91 | --batch_size_test $batch_size_test \ 92 | --plot_every $plot_every \ 93 | --learning_rate $learning_rate \ 94 | --transfer_learning $transfer_learning \ 95 | --fix_embedding $fix_embedding \ 96 | --resume $resume >logs/LongForecasting/$model_name'_'$data_alias'_'$random_seed'_RD('$random_drop_training')_'$seq_len'_'$lookback'_'$pred_len'_K('$e_layers')_d('$d_model')_Dropout('$dropout')_m('$sampling_step')_W('$linear_warmup_steps')_Limit('$token_limit')_wRet('$token_retriever_flag')' 97 | 98 | done 99 | -------------------------------------------------------------------------------- /ICTSP/utils/masking.py: -------------------------------------------------------------------------------- 1 | import torch 2 | 3 | 4 | class TriangularCausalMask(): 5 | def __init__(self, B, L, device="cpu"): 6 | mask_shape = [B, 1, L, L] 7 | with torch.no_grad(): 8 | self._mask = torch.triu(torch.ones(mask_shape, dtype=torch.bool), diagonal=1).to(device) 9 | 10 | @property 11 | def mask(self): 12 | return self._mask 13 | 14 | 15 | class ProbMask(): 16 | def __init__(self, B, H, L, index, scores, device="cpu"): 17 | _mask = torch.ones(L, scores.shape[-1], dtype=torch.bool).to(device).triu(1) 18 | _mask_ex = _mask[None, None, :].expand(B, 
/ICTSP/utils/masking.py:
--------------------------------------------------------------------------------
1 | import torch
2 |
3 |
4 | class TriangularCausalMask():
5 |     def __init__(self, B, L, device="cpu"):
6 |         mask_shape = [B, 1, L, L]
7 |         with torch.no_grad():
8 |             self._mask = torch.triu(torch.ones(mask_shape, dtype=torch.bool), diagonal=1).to(device)
9 |
10 |     @property
11 |     def mask(self):
12 |         return self._mask
13 |
14 |
15 | class ProbMask():
16 |     def __init__(self, B, H, L, index, scores, device="cpu"):
17 |         _mask = torch.ones(L, scores.shape[-1], dtype=torch.bool).to(device).triu(1)
18 |         _mask_ex = _mask[None, None, :].expand(B, H, L, scores.shape[-1])
19 |         indicator = _mask_ex[torch.arange(B)[:, None, None],
20 |                              torch.arange(H)[None, :, None],
21 |                              index, :].to(device)
22 |         self._mask = indicator.view(scores.shape).to(device)
23 |
24 |     @property
25 |     def mask(self):
26 |         return self._mask
27 |
--------------------------------------------------------------------------------
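For orientation, masks like `TriangularCausalMask` are typically consumed by the Informer-style attention in `layers/SelfAttention_Family.py`: the boolean mask marks future positions, which are filled with `-inf` before the softmax so each step can only attend to the past. A minimal sketch of that pattern (the shapes and `masked_fill_` usage here are illustrative, not a verbatim excerpt from the attention code):

```python
import torch
from utils.masking import TriangularCausalMask

B, H, L = 2, 8, 16                             # batch size, heads, sequence length
scores = torch.randn(B, H, L, L)               # raw attention scores

mask = TriangularCausalMask(B, L)              # True above the diagonal = future positions
scores.masked_fill_(mask.mask, float("-inf"))  # [B, 1, L, L] mask broadcasts over heads
attn = torch.softmax(scores, dim=-1)           # each row now attends only to current/past steps
```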
/ICTSP/utils/metrics.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 |
3 |
4 | def RSE(pred, true):
5 |     return np.sqrt(np.sum((true - pred) ** 2)) / np.sqrt(np.sum((true - true.mean()) ** 2))
6 |
7 |
8 | def CORR(pred, true):
9 |     u = ((true - true.mean(0)) * (pred - pred.mean(0))).sum(0)
10 |     d = np.sqrt(((true - true.mean(0)) ** 2 * (pred - pred.mean(0)) ** 2).sum(0))
11 |     d += 1e-12  # avoid division by zero
12 |     return 0.01*(u / d).mean(-1)
13 |
14 |
15 | def MAE(pred, true):
16 |     return np.mean(np.abs(pred - true))
17 |
18 |
19 | def MSE(pred, true):
20 |     return np.mean((pred - true) ** 2)
21 |
22 |
23 | def RMSE(pred, true):
24 |     return np.sqrt(MSE(pred, true))
25 |
26 |
27 | def MAPE(pred, true):
28 |     return np.mean(np.abs((pred - true) / true))  # note: undefined where true == 0
29 |
30 |
31 | def MSPE(pred, true):
32 |     return np.mean(np.square((pred - true) / true))  # note: undefined where true == 0
33 |
34 |
35 | def metric(pred, true):
36 |     mae = MAE(pred, true)
37 |     mse = MSE(pred, true)
38 |     rmse = RMSE(pred, true)
39 |     mape = MAPE(pred, true)
40 |     mspe = MSPE(pred, true)
41 |     rse = RSE(pred, true)
42 |     corr = CORR(pred, true)
43 |
44 |     return mae, mse, rmse, mape, mspe, rse, corr
45 |
--------------------------------------------------------------------------------
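A quick sanity check of `metric` on dummy arrays; forecasts and ground truth are assumed to share a `[time, channels]` layout (a sketch, not repo code):

```python
import numpy as np
from utils.metrics import metric

true = np.random.randn(96, 7)                # e.g. pred_len=96 steps, 7 channels
pred = true + 0.1 * np.random.randn(96, 7)   # a slightly noisy "forecast"

mae, mse, rmse, mape, mspe, rse, corr = metric(pred, true)
print(f"MAE={mae:.4f}  MSE={mse:.4f}  RMSE={rmse:.4f}")
```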
83 | """ 84 | 85 | features_by_offsets = { 86 | offsets.YearEnd: [], 87 | offsets.QuarterEnd: [MonthOfYear], 88 | offsets.MonthEnd: [MonthOfYear], 89 | offsets.Week: [DayOfMonth, WeekOfYear], 90 | offsets.Day: [DayOfWeek, DayOfMonth, DayOfYear], 91 | offsets.BusinessDay: [DayOfWeek, DayOfMonth, DayOfYear], 92 | offsets.Hour: [HourOfDay, DayOfWeek, DayOfMonth, DayOfYear], 93 | offsets.Minute: [ 94 | MinuteOfHour, 95 | HourOfDay, 96 | DayOfWeek, 97 | DayOfMonth, 98 | DayOfYear, 99 | ], 100 | offsets.Second: [ 101 | SecondOfMinute, 102 | MinuteOfHour, 103 | HourOfDay, 104 | DayOfWeek, 105 | DayOfMonth, 106 | DayOfYear, 107 | ], 108 | } 109 | 110 | offset = to_offset(freq_str) 111 | 112 | for offset_type, feature_classes in features_by_offsets.items(): 113 | if isinstance(offset, offset_type): 114 | return [cls() for cls in feature_classes] 115 | 116 | supported_freq_msg = f""" 117 | Unsupported frequency {freq_str} 118 | The following frequencies are supported: 119 | Y - yearly 120 | alias: A 121 | M - monthly 122 | W - weekly 123 | D - daily 124 | B - business days 125 | H - hourly 126 | T - minutely 127 | alias: min 128 | S - secondly 129 | """ 130 | raise RuntimeError(supported_freq_msg) 131 | 132 | 133 | def time_features(dates, freq='h'): 134 | return np.vstack([feat(dates) for feat in time_features_from_frequency_str(freq)]) 135 | -------------------------------------------------------------------------------- /ICTSP/utils/tools.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | import matplotlib.pyplot as plt 4 | import time 5 | 6 | plt.switch_backend('agg') 7 | 8 | 9 | def adjust_learning_rate(optimizer, scheduler, epoch, args, printout=True): 10 | # lr = args.learning_rate * (0.2 ** (epoch // 2)) 11 | if args.lradj == 'type1': 12 | lr_adjust = {epoch: args.learning_rate * (0.5 ** ((epoch - 1) // 1))} 13 | elif args.lradj == 'type2': 14 | lr_adjust = { 15 | 2: 5e-5, 4: 1e-5, 6: 5e-6, 8: 1e-6, 16 | 10: 5e-7, 15: 1e-7, 20: 5e-8 17 | } 18 | elif args.lradj == 'type3': 19 | lr_adjust = {epoch: args.learning_rate if epoch < 3 else args.learning_rate * (0.9985 ** ((epoch - 3) // 1))} 20 | elif args.lradj == 'constant': 21 | lr_adjust = {epoch: args.learning_rate} 22 | elif args.lradj == '3': 23 | lr_adjust = {epoch: args.learning_rate if epoch < 10 else args.learning_rate*0.1} 24 | elif args.lradj == '4': 25 | lr_adjust = {epoch: args.learning_rate if epoch < 15 else args.learning_rate*0.1} 26 | elif args.lradj == '5': 27 | lr_adjust = {epoch: args.learning_rate if epoch < 25 else args.learning_rate*0.1} 28 | elif args.lradj == '6': 29 | lr_adjust = {epoch: args.learning_rate if epoch < 5 else args.learning_rate*0.1} 30 | elif args.lradj == 'TST': 31 | lr_adjust = {epoch: scheduler.get_last_lr()[0]} 32 | 33 | if epoch in lr_adjust.keys(): 34 | lr = lr_adjust[epoch] 35 | for index, param_group in enumerate(optimizer.param_groups): 36 | if index == 0: 37 | param_group['lr'] = lr 38 | else: 39 | param_group['lr'] = lr*1 40 | if printout: print('Updating learning rate to {}'.format(lr)) 41 | 42 | 43 | class EarlyStopping: 44 | def __init__(self, patience=7, verbose=False, delta=0, configs=None): 45 | self.patience = patience 46 | self.verbose = verbose 47 | self.counter = 0 48 | self.best_score = None 49 | self.early_stop = False 50 | self.val_loss_min = np.inf 51 | self.delta = delta 52 | self.configs = configs 53 | 54 | def __call__(self, val_loss, model, path): 55 | score = -val_loss 56 | if self.best_score is 
/ICTSP/utils/tools.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import torch
3 | import matplotlib.pyplot as plt
4 | import time
5 |
6 | plt.switch_backend('agg')
7 |
8 |
9 | def adjust_learning_rate(optimizer, scheduler, epoch, args, printout=True):
10 |     # lr = args.learning_rate * (0.2 ** (epoch // 2))
11 |     if args.lradj == 'type1':
12 |         lr_adjust = {epoch: args.learning_rate * (0.5 ** ((epoch - 1) // 1))}
13 |     elif args.lradj == 'type2':
14 |         lr_adjust = {
15 |             2: 5e-5, 4: 1e-5, 6: 5e-6, 8: 1e-6,
16 |             10: 5e-7, 15: 1e-7, 20: 5e-8
17 |         }
18 |     elif args.lradj == 'type3':
19 |         lr_adjust = {epoch: args.learning_rate if epoch < 3 else args.learning_rate * (0.9985 ** ((epoch - 3) // 1))}
20 |     elif args.lradj == 'constant':
21 |         lr_adjust = {epoch: args.learning_rate}
22 |     elif args.lradj == '3':
23 |         lr_adjust = {epoch: args.learning_rate if epoch < 10 else args.learning_rate*0.1}
24 |     elif args.lradj == '4':
25 |         lr_adjust = {epoch: args.learning_rate if epoch < 15 else args.learning_rate*0.1}
26 |     elif args.lradj == '5':
27 |         lr_adjust = {epoch: args.learning_rate if epoch < 25 else args.learning_rate*0.1}
28 |     elif args.lradj == '6':
29 |         lr_adjust = {epoch: args.learning_rate if epoch < 5 else args.learning_rate*0.1}
30 |     elif args.lradj == 'TST':
31 |         lr_adjust = {epoch: scheduler.get_last_lr()[0]}
32 |     else: lr_adjust = {}  # unrecognized lradj value: leave the learning rate unchanged
33 |     if epoch in lr_adjust.keys():
34 |         lr = lr_adjust[epoch]
35 |         for index, param_group in enumerate(optimizer.param_groups):
36 |             if index == 0:
37 |                 param_group['lr'] = lr
38 |             else:
39 |                 param_group['lr'] = lr*1  # same value; separate branch kept for optional per-group scaling
40 |         if printout: print('Updating learning rate to {}'.format(lr))
41 |
42 |
43 | class EarlyStopping:
44 |     def __init__(self, patience=7, verbose=False, delta=0, configs=None):
45 |         self.patience = patience
46 |         self.verbose = verbose
47 |         self.counter = 0
48 |         self.best_score = None
49 |         self.early_stop = False
50 |         self.val_loss_min = np.inf
51 |         self.delta = delta
52 |         self.configs = configs
53 |
54 |     def __call__(self, val_loss, model, path):
55 |         score = -val_loss
56 |         if self.best_score is None:
57 |             self.best_score = score
58 |             self.save_checkpoint(val_loss, model, path)
59 |         elif score < self.best_score + self.delta:
60 |             self.counter += 1
61 |             print(f'EarlyStopping counter: {self.counter} out of {self.patience}')
62 |             if self.counter >= self.patience:
63 |                 self.early_stop = True
64 |         else:
65 |             self.best_score = score
66 |             self.save_checkpoint(val_loss, model, path)
67 |             self.counter = 0
68 |
69 |     def save_checkpoint(self, val_loss, model, path):
70 |         if self.verbose:
71 |             print(f'Validation loss decreased ({self.val_loss_min:.6f} --> {val_loss:.6f}). Saving model ...')
72 |         torch.save(model.state_dict(), path + '/' + 'checkpoint.pth')
73 |         torch.save(model.state_dict(), f'pt_model_{self.configs.seq_len}_{self.configs.pred_len}.pth')
74 |         self.val_loss_min = val_loss
75 |
76 |
77 | class dotdict(dict):
78 |     """dot.notation access to dictionary attributes"""
79 |     __getattr__ = dict.get
80 |     __setattr__ = dict.__setitem__
81 |     __delattr__ = dict.__delitem__
82 |
83 |
84 | class StandardScaler():
85 |     def __init__(self, mean, std):
86 |         self.mean = mean
87 |         self.std = std
88 |
89 |     def transform(self, data):
90 |         return (data - self.mean) / self.std
91 |
92 |     def inverse_transform(self, data):
93 |         return (data * self.std) + self.mean
94 |
95 |
96 | def visual(true, preds=None, name='./pic/test.pdf'):
97 |     """
98 |     Results visualization
99 |     """
100 |     plt.figure()
101 |     plt.plot(true, label='GroundTruth', linewidth=2)
102 |     if preds is not None:
103 |         plt.plot(preds, label='Prediction', linewidth=2)
104 |     plt.legend()
105 |     plt.savefig(name, bbox_inches='tight')
106 |
107 | def test_params_flop(model, x_shape):
108 |     """
109 |     If you want to test a Transformer-style model's FLOPs, you need to give default values to the inputs in model.forward(); the following code can only pass one argument to forward()
110 |     """
111 |     model_params = 0
112 |     for parameter in model.parameters():
113 |         model_params += parameter.numel()
114 |     print('INFO: Trainable parameter count: {:.2f}M'.format(model_params / 1000000.0))
115 |     from ptflops import get_model_complexity_info
116 |     with torch.cuda.device(0):
117 |         macs, params = get_model_complexity_info(model.cuda(), x_shape, as_strings=True, print_per_layer_stat=True)
118 |         # print('Flops:' + flops)
119 |         # print('Params:' + params)
120 |         print('{:<30} {:<8}'.format('Computational complexity: ', macs))
121 |         print('{:<30} {:<8}'.format('Number of parameters: ', params))
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Apache License
2 | Version 2.0, January 2004
3 | http://www.apache.org/licenses/
4 |
5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 |
7 | 1. Definitions.
8 |
9 | "License" shall mean the terms and conditions for use, reproduction,
10 | and distribution as defined by Sections 1 through 9 of this document.
11 |
12 | "Licensor" shall mean the copyright owner or entity authorized by
13 | the copyright owner that is granting the License.
14 |
15 | "Legal Entity" shall mean the union of the acting entity and all
16 | other entities that control, are controlled by, or are under common
17 | control with that entity.
For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. 
Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 
134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 
193 | You may obtain a copy of the License at
194 |
195 |     http://www.apache.org/licenses/LICENSE-2.0
196 |
197 | Unless required by applicable law or agreed to in writing, software
198 | distributed under the License is distributed on an "AS IS" BASIS,
199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 | See the License for the specific language governing permissions and
201 | limitations under the License.
202 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # In-context Time Series Predictor
2 |
3 | This repository hosts the code for our paper: [In-context Time Series Predictor](https://arxiv.org/abs/2405.14982).
4 |
5 | ## Code Release
6 |
7 | This repository contains the in-development project code for the ICTSP model. We have currently released two codebases: one for the time series forecasting task (as described in the paper) and one for multi-task time series analysis. These can be found in the `ICTSP/` and `ICTSP-MultiTask/` folders, respectively. Please refer to the README files in both folders for implementation details.
8 |
9 | ## Overall Architecture
10 |
11 | ![Overall Architecture](figs/ICTSP.png)
12 |
--------------------------------------------------------------------------------
/figs/ICTSP.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/LJC-FVNR/In-context-Time-Series-Predictor/91da21052adaf8c993745d751250b94dc340ed61/figs/ICTSP.png
--------------------------------------------------------------------------------