├── README.md
├── data_provider
    ├── __init__.py
    ├── data_factory.py
    ├── data_loader.py
    ├── m4.py
    └── uea.py
├── dataset
    └── EPF
    │   ├── BE.csv
    │   ├── DE.csv
    │   ├── FR.csv
    │   ├── NP.csv
    │   └── PJM.csv
├── exp
    ├── __init__.py
    ├── exp_anomaly_detection.py
    ├── exp_basic.py
    ├── exp_classification.py
    ├── exp_imputation.py
    ├── exp_long_term_forecasting.py
    └── exp_short_term_forecasting.py
├── figures
    ├── ERA5.png
    ├── Introduction.png
    ├── Result_EPF.png
    ├── Result_Multivariate.png
    └── TimeXer.png
├── layers
    ├── AutoCorrelation.py
    ├── Autoformer_EncDec.py
    ├── Conv_Blocks.py
    ├── Crossformer_EncDec.py
    ├── ETSformer_EncDec.py
    ├── Embed.py
    ├── FourierCorrelation.py
    ├── MultiWaveletCorrelation.py
    ├── Pyraformer_EncDec.py
    ├── SelfAttention_Family.py
    ├── StandardNorm.py
    ├── Transformer_EncDec.py
    └── __init__.py
├── models
    ├── Autoformer.py
    ├── Crossformer.py
    ├── DLinear.py
    ├── ETSformer.py
    ├── FEDformer.py
    ├── FiLM.py
    ├── FreTS.py
    ├── Informer.py
    ├── Koopa.py
    ├── LightTS.py
    ├── MICN.py
    ├── Mamba.py
    ├── MambaSimple.py
    ├── Nonstationary_Transformer.py
    ├── PatchTST.py
    ├── Pyraformer.py
    ├── Reformer.py
    ├── SCINet.py
    ├── SegRNN.py
    ├── TSMixer.py
    ├── TemporalFusionTransformer.py
    ├── TiDE.py
    ├── TimeMixer.py
    ├── TimeXer.py
    ├── TimesNet.py
    ├── Transformer.py
    ├── __init__.py
    └── iTransformer.py
├── requirements.txt
├── run.py
├── scripts
    ├── forecast_exogenous
    │   ├── ECL
    │   │   └── TimeXer.sh
    │   ├── EPF
    │   │   └── TimeXer.sh
    │   ├── ETTh1
    │   │   └── TimeXer.sh
    │   ├── ETTh2
    │   │   └── TimeXer.sh
    │   ├── ETTm1
    │   │   └── TimeXer.sh
    │   ├── ETTm2
    │   │   └── TimeXer.sh
    │   ├── Traffic
    │   │   └── TimeXer.sh
    │   ├── Weather
    │   │   └── TimeXer.sh
    │   └── meteorology
    │   │   ├── temp.sh
    │   │   └── wind.sh
    └── multivariate
    │   ├── ECL
    │       └── TimeXer.sh
    │   ├── ETT
    │       ├── TimeXer_ETTh1.sh
    │       ├── TimeXer_ETTh2.sh
    │       ├── TimeXer_ETTm1.sh
    │       └── TimeXer_ETTm2.sh
    │   ├── Traffic
    │       └── TimeXer.sh
    │   └── Weather
    │       └── TimeXer.sh
└── utils
    ├── __init__.py
    ├── augmentation.py
    ├── losses.py
    ├── m4_summary.py
    ├── masking.py
    ├── metrics.py
    ├── print_args.py
    ├── timefeatures.py
    └── tools.py


/README.md:
--------------------------------------------------------------------------------
 1 | # TimeXer
 2 | 
 3 | This repo is the official implementation for the paper: [TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables](https://arxiv.org/abs/2402.19072).
 4 | 
 5 | ## Introduction
 6 | This paper focuses on forecasting with exogenous variables, a practical forecasting paradigm applied extensively in real scenarios. TimeXer empowers the canonical Transformer with the ability to reconcile endogenous and exogenous information without any architectural modifications and achieves consistent state-of-the-art performance on twelve real-world forecasting benchmarks.
 7 | 
 8 | <p align="center">
 9 | <img src=".\figures\Introduction.png" width = "800" height = "" alt="" align=center />
10 | </p>
11 | 
12 | ## Overall Architecture
13 | TimeXer employs patch-level and variate-level representations respectively for endogenous and exogenous variables, with an endogenous global token as a bridge in-between. With this design, TimeXer can jointly capture intra-endogenous temporal dependencies and exogenous-to-endogenous correlations.
14 | 
15 | <p align="center">
16 | <img src=".\figures\TimeXer.png" width = "800" height = "" alt="" align=center />
17 | </p>
18 | 
19 | ## Usage 
20 | 
21 | 1. The short-term Electricity Price Forecasting (EPF) datasets are already included in "./dataset/EPF". Multivariate datasets can be obtained from [[Google Drive]](https://drive.google.com/drive/folders/13Cg1KYOlzM5C7K8gK8NfC-F3EYxkM3D2?usp=sharing) or [[Baidu Drive]](https://pan.baidu.com/s/1r3KhGd0Q9PJIUZdfEYoymg?pwd=i9iy).
22 | 
23 | 2. Install PyTorch and other necessary dependencies.
24 | ```
25 | pip install -r requirements.txt
26 | ```
27 | 3. Train and evaluate the model. We provide the experiment scripts under the folder ./scripts/. You can reproduce the experiment results as in the following example:
28 | 
29 | ```
30 | bash ./scripts/forecast_exogenous/EPF/TimeXer.sh
31 | ```
32 | 
33 | ## Main Results
34 | We evaluate TimeXer on short-term forecasting with exogenous variables and long-term multivariate forecasting benchmarks. Comprehensive forecasting results demonstrate that TimeXer effectively ingests exogenous information to facilitate the prediction of endogenous series.
35 | 
36 | ### Forecasting with Exogenous Variables
37 | 
38 | <p align="center">
39 | <img src=".\figures\Result_EPF.png" width = "800" height = "" alt="" align=center />
40 | </p>
41 | 
42 | ### Multivariate Forecasting
43 | 
44 | <p align="center">
45 | <img src=".\figures\Result_Multivariate.png" width = "800" height = "" alt="" align=center />
46 | </p>
47 | 
48 | ## Experiments on Large-scale Meteorology Dataset
49 | In this paper, we build a large-scale weather dataset for forecasting with exogenous variables, where the endogenous series is the hourly temperature of 3,850 stations worldwide obtained from the National Centers for Environmental Information (NCEI), and the exogenous variables are meteorological indicators of its adjacent area from the ERA5 dataset. You can obtain this meteorology dataset from [[Google Drive]](https://drive.google.com/file/d/1EuEedepUV2A_cia1plAHwA6fJXNio47i/view?usp=drive_link).
50 | 
51 | <p align="center">
52 | <img src=".\figures\ERA5.png" width = "800" alt="" align=center />
53 | </p>
54 | 
55 | ## Citation
56 | If you find this repo helpful, please cite our paper.
57 | 
58 | ```
59 | @article{wang2024timexer,
60 |   title={Timexer: Empowering transformers for time series forecasting with exogenous variables},
61 |   author={Wang, Yuxuan and Wu, Haixu and Dong, Jiaxiang and Liu, Yong and Qiu, Yunzhong and Zhang, Haoran and Wang, Jianmin and Long, Mingsheng},
62 |   journal={Advances in Neural Information Processing Systems},
63 |   year={2024}
64 | }
65 | ```
66 | 
67 | ## Acknowledgement
68 | We appreciate the following GitHub repos a lot for their valuable code and efforts.
69 | 
70 | Reformer (https://github.com/lucidrains/reformer-pytorch)
71 | 
72 | Informer (https://github.com/zhouhaoyi/Informer2020)
73 | 
74 | Autoformer (https://github.com/thuml/Autoformer)
75 | 
76 | Stationary (https://github.com/thuml/Nonstationary_Transformers)
77 | 
78 | Time-Series-Library (https://github.com/thuml/Time-Series-Library)
79 | 
80 | ## Contact
81 | 
82 | If you have any questions or want to use the code, please contact wangyuxu22@mails.tsinghua.edu.cn
83 | 


--------------------------------------------------------------------------------
/data_provider/__init__.py:
--------------------------------------------------------------------------------
1 | 
2 | 


--------------------------------------------------------------------------------
/data_provider/data_factory.py:
--------------------------------------------------------------------------------
 1 | from data_provider.data_loader import Dataset_ETT_hour, Dataset_ETT_minute, Dataset_Custom, Dataset_M4, PSMSegLoader, \
 2 |     MSLSegLoader, SMAPSegLoader, SMDSegLoader, SWATSegLoader, UEAloader, Dataset_Meteorology
 3 | from data_provider.uea import collate_fn
 4 | from torch.utils.data import DataLoader
 5 | 
 6 | data_dict = {
 7 |     'ETTh1': Dataset_ETT_hour,
 8 |     'ETTh2': Dataset_ETT_hour,
 9 |     'ETTm1': Dataset_ETT_minute,
10 |     'ETTm2': Dataset_ETT_minute,
11 |     'custom': Dataset_Custom,
12 |     'm4': Dataset_M4,
13 |     'PSM': PSMSegLoader,
14 |     'MSL': MSLSegLoader,
15 |     'SMAP': SMAPSegLoader,
16 |     'SMD': SMDSegLoader,
17 |     'SWAT': SWATSegLoader,
18 |     'UEA': UEAloader,
19 |     'Meteorology' : Dataset_Meteorology
20 | }
21 | 
22 | 
23 | def data_provider(args, flag):
24 |     Data = data_dict[args.data]
25 |     timeenc = 0 if args.embed != 'timeF' else 1
26 | 
27 |     shuffle_flag = False if (flag == 'test' or flag == 'TEST') else True
28 |     drop_last = False
29 |     batch_size = args.batch_size
30 |     freq = args.freq
31 | 
32 |     if args.task_name == 'anomaly_detection':
33 |         drop_last = False
34 |         data_set = Data(
35 |             args = args,
36 |             root_path=args.root_path,
37 |             win_size=args.seq_len,
38 |             flag=flag,
39 |         )
40 |         print(flag, len(data_set))
41 |         data_loader = DataLoader(
42 |             data_set,
43 |             batch_size=batch_size,
44 |             shuffle=shuffle_flag,
45 |             num_workers=args.num_workers,
46 |             drop_last=drop_last)
47 |         return data_set, data_loader
48 |     elif args.task_name == 'classification':
49 |         drop_last = False
50 |         data_set = Data(
51 |             args = args,
52 |             root_path=args.root_path,
53 |             flag=flag,
54 |         )
55 | 
56 |         data_loader = DataLoader(
57 |             data_set,
58 |             batch_size=batch_size,
59 |             shuffle=shuffle_flag,
60 |             num_workers=args.num_workers,
61 |             drop_last=drop_last,
62 |             collate_fn=lambda x: collate_fn(x, max_len=args.seq_len)
63 |         )
64 |         return data_set, data_loader
65 |     else:
66 |         if args.data == 'm4':
67 |             drop_last = False
68 |         data_set = Data(
69 |             args = args,
70 |             root_path=args.root_path,
71 |             data_path=args.data_path,
72 |             flag=flag,
73 |             size=[args.seq_len, args.label_len, args.pred_len],
74 |             features=args.features,
75 |             target=args.target,
76 |             timeenc=timeenc,
77 |             freq=freq,
78 |             seasonal_patterns=args.seasonal_patterns
79 |         )
80 |         print(flag, len(data_set))
81 |         data_loader = DataLoader(
82 |             data_set,
83 |             batch_size=batch_size,
84 |             shuffle=shuffle_flag,
85 |             num_workers=args.num_workers,
86 |             drop_last=drop_last)
87 |         return data_set, data_loader
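
The dispatch logic in `data_provider` can be followed without PyTorch. The sketch below mirrors only the flag handling shown above (dataset lookup, `timeenc`, `shuffle_flag`, `drop_last`). `FakeDataset` and `resolve_loader_options` are hypothetical names introduced for illustration and are not part of the repository.

```python
from types import SimpleNamespace


class FakeDataset:
    """Hypothetical stand-in for a class such as Dataset_ETT_hour."""


# Same idea as data_dict: map the --data string to a dataset class.
data_dict = {"ETTh1": FakeDataset, "custom": FakeDataset}


def resolve_loader_options(args, flag):
    """Return the dataset class and the loader flags that data_provider derives."""
    dataset_cls = data_dict[args.data]            # raises KeyError for unknown names
    timeenc = 1 if args.embed == "timeF" else 0   # time-feature encoding switch
    shuffle = flag not in ("test", "TEST")        # only the test split keeps its order
    return dataset_cls, {"timeenc": timeenc, "shuffle": shuffle, "drop_last": False}


args = SimpleNamespace(data="ETTh1", embed="timeF")
cls, opts = resolve_loader_options(args, "test")
print(cls.__name__, opts)
```

Keeping the name-to-class mapping in one dictionary means a new dataset only needs one extra entry, and an unknown `--data` value fails early with a `KeyError`.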


--------------------------------------------------------------------------------
/data_provider/m4.py:
--------------------------------------------------------------------------------
  1 | # This source code is provided for the purposes of scientific reproducibility
  2 | # under the following limited license from Element AI Inc. The code is an
  3 | # implementation of the N-BEATS model (Oreshkin et al., N-BEATS: Neural basis
  4 | # expansion analysis for interpretable time series forecasting,
  5 | # https://arxiv.org/abs/1905.10437). The copyright to the source code is
  6 | # licensed under the Creative Commons - Attribution-NonCommercial 4.0
  7 | # International license (CC BY-NC 4.0):
  8 | # https://creativecommons.org/licenses/by-nc/4.0/.  Any commercial use (whether
  9 | # for the benefit of third parties or internally in production) requires an
 10 | # explicit license. The subject-matter of the N-BEATS model and associated
 11 | # materials are the property of Element AI Inc. and may be subject to patent
 12 | # protection. No license to patents is granted hereunder (whether express or
 13 | # implied). Copyright © 2020 Element AI Inc. All rights reserved.
 14 | 
 15 | """
 16 | M4 Dataset
 17 | """
 18 | import logging
 19 | import os
 20 | from collections import OrderedDict
 21 | from dataclasses import dataclass
 22 | from glob import glob
 23 | 
 24 | import numpy as np
 25 | import pandas as pd
 26 | import patoolib
 27 | from tqdm import tqdm
 30 | import pathlib
 31 | import sys
 32 | from urllib import request
 33 | 
 34 | 
 35 | def url_file_name(url: str) -> str:
 36 |     """
 37 |     Extract file name from url.
 38 | 
 39 |     :param url: URL to extract file name from.
 40 |     :return: File name.
 41 |     """
 42 |     return url.split('/')[-1] if len(url) > 0 else ''
 43 | 
 44 | 
 45 | def download(url: str, file_path: str) -> None:
 46 |     """
 47 |     Download a file to the given path.
 48 | 
 49 |     :param url: URL to download
 50 |     :param file_path: Where to download the content.
 51 |     """
 52 | 
 53 |     def progress(count, block_size, total_size):
 54 |         progress_pct = float(count * block_size) / float(total_size) * 100.0
 55 |         sys.stdout.write('\rDownloading {} to {} {:.1f}%'.format(url, file_path, progress_pct))
 56 |         sys.stdout.flush()
 57 | 
 58 |     if not os.path.isfile(file_path):
 59 |         opener = request.build_opener()
 60 |         opener.addheaders = [('User-agent', 'Mozilla/5.0')]
 61 |         request.install_opener(opener)
 62 |         pathlib.Path(os.path.dirname(file_path)).mkdir(parents=True, exist_ok=True)
 63 |         f, _ = request.urlretrieve(url, file_path, progress)
 64 |         sys.stdout.write('\n')
 65 |         sys.stdout.flush()
 66 |         file_info = os.stat(f)
 67 |         logging.info(f'Successfully downloaded {os.path.basename(file_path)} {file_info.st_size} bytes.')
 68 |     else:
 69 |         file_info = os.stat(file_path)
 70 |         logging.info(f'File already exists: {file_path} {file_info.st_size} bytes.')
 71 | 
 72 | 
 73 | @dataclass()
 74 | class M4Dataset:
 75 |     ids: np.ndarray
 76 |     groups: np.ndarray
 77 |     frequencies: np.ndarray
 78 |     horizons: np.ndarray
 79 |     values: np.ndarray
 80 | 
 81 |     @staticmethod
 82 |     def load(training: bool = True, dataset_file: str = '../dataset/m4') -> 'M4Dataset':
 83 |         """
 84 |         Load cached dataset.
 85 | 
 86 |         :param training: Load training part if training is True, test part otherwise.
 87 |         """
 88 |         info_file = os.path.join(dataset_file, 'M4-info.csv')
 89 |         train_cache_file = os.path.join(dataset_file, 'training.npz')
 90 |         test_cache_file = os.path.join(dataset_file, 'test.npz')
 91 |         m4_info = pd.read_csv(info_file)
 92 |         return M4Dataset(ids=m4_info.M4id.values,
 93 |                          groups=m4_info.SP.values,
 94 |                          frequencies=m4_info.Frequency.values,
 95 |                          horizons=m4_info.Horizon.values,
 96 |                          values=np.load(
 97 |                              train_cache_file if training else test_cache_file,
 98 |                              allow_pickle=True))
 99 | 
100 | 
101 | @dataclass()
102 | class M4Meta:
103 |     seasonal_patterns = ['Yearly', 'Quarterly', 'Monthly', 'Weekly', 'Daily', 'Hourly']
104 |     horizons = [6, 8, 18, 13, 14, 48]
105 |     frequencies = [1, 4, 12, 1, 1, 24]
106 |     horizons_map = {
107 |         'Yearly': 6,
108 |         'Quarterly': 8,
109 |         'Monthly': 18,
110 |         'Weekly': 13,
111 |         'Daily': 14,
112 |         'Hourly': 48
113 |     }  # different predict length
114 |     frequency_map = {
115 |         'Yearly': 1,
116 |         'Quarterly': 4,
117 |         'Monthly': 12,
118 |         'Weekly': 1,
119 |         'Daily': 1,
120 |         'Hourly': 24
121 |     }
122 |     history_size = {
123 |         'Yearly': 1.5,
124 |         'Quarterly': 1.5,
125 |         'Monthly': 1.5,
126 |         'Weekly': 10,
127 |         'Daily': 10,
128 |         'Hourly': 10
129 |     }  # from interpretable.gin
130 | 
131 | 
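132 | def load_m4_info(info_file: str = '../dataset/m4/M4-info.csv') -> pd.DataFrame:
133 |     """
134 |     Load M4Info file.
135 | 
136 |     :param info_file: Path to M4-info.csv; the default matches the location used by M4Dataset.load.
137 |     :return: Pandas DataFrame of M4Info.
138 |     """
139 |     return pd.read_csv(info_file)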


--------------------------------------------------------------------------------
/data_provider/uea.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import numpy as np
  3 | import pandas as pd
  4 | import torch
  5 | 
  6 | 
  7 | def collate_fn(data, max_len=None):
  8 |     """Build mini-batch tensors from a list of (X, y) tuples, zero-padding or clipping each X to a common length.
  9 |     Args:
 10 |         data: len(batch_size) list of tuples (X, y).
 11 |             - X: torch tensor of shape (seq_length, feat_dim); variable seq_length.
 12 |             - y: torch tensor of shape (num_labels,) : class indices or numerical targets
 13 |                 (for classification or regression, respectively). num_labels > 1 for multi-task models
 14 |         max_len: global fixed sequence length. Used for architectures requiring fixed length input,
 15 |             where the batch length cannot vary dynamically. Longer sequences are clipped, shorter are padded with 0s
 16 |     Returns:
 17 |         X: (batch_size, padded_length, feat_dim) torch tensor of zero-padded features (input);
 18 |             sequences longer than padded_length are clipped
 19 |         targets: (batch_size, num_labels) torch tensor of stacked labels
 20 |             (class indices or numerical targets)
 21 |         padding_masks: (batch_size, padded_length) boolean tensor, 1 means keep vector at this position, 0 means padding
 22 |     """
 23 | 
 24 |     batch_size = len(data)
 25 |     features, labels = zip(*data)
 26 | 
 27 |     # Stack and pad features and masks (convert 2D to 3D tensors, i.e. add batch dimension)
 28 |     lengths = [X.shape[0] for X in features]  # original sequence length for each time series
 29 |     if max_len is None:
 30 |         max_len = max(lengths)
 31 | 
 32 |     X = torch.zeros(batch_size, max_len, features[0].shape[-1])  # (batch_size, padded_length, feat_dim)
 33 |     for i in range(batch_size):
 34 |         end = min(lengths[i], max_len)
 35 |         X[i, :end, :] = features[i][:end, :]
 36 | 
 37 |     targets = torch.stack(labels, dim=0)  # (batch_size, num_labels)
 38 | 
 39 |     padding_masks = padding_mask(torch.tensor(lengths, dtype=torch.int16),
 40 |                                  max_len=max_len)  # (batch_size, padded_length) boolean tensor, "1" means keep
 41 | 
 42 |     return X, targets, padding_masks
 43 | 
 44 | 
 45 | def padding_mask(lengths, max_len=None):
 46 |     """
 47 |     Used to mask padded positions: creates a (batch_size, max_len) boolean mask from a tensor of sequence lengths,
 48 |     where 1 means keep element at this position (time step)
 49 |     """
 50 |     batch_size = lengths.numel()
 51 |     max_len = max_len or lengths.max()  # trick works because of overloading of 'or' operator for non-boolean types
 52 |     return (torch.arange(0, max_len, device=lengths.device)
 53 |             .type_as(lengths)
 54 |             .repeat(batch_size, 1)
 55 |             .lt(lengths.unsqueeze(1)))
 56 | 
 57 | 
 58 | class Normalizer(object):
 59 |     """
 60 |     Normalizes dataframe across ALL contained rows (time steps). Different from per-sample normalization.
 61 |     """
 62 | 
 63 |     def __init__(self, norm_type='standardization', mean=None, std=None, min_val=None, max_val=None):
 64 |         """
 65 |         Args:
 66 |             norm_type: choose from:
 67 |                 "standardization", "minmax": normalizes dataframe across ALL contained rows (time steps)
 68 |                 "per_sample_std", "per_sample_minmax": normalizes each sample separately (i.e. across only its own rows)
 69 |             mean, std, min_val, max_val: optional (num_feat,) Series of pre-computed values
 70 |         """
 71 | 
 72 |         self.norm_type = norm_type
 73 |         self.mean = mean
 74 |         self.std = std
 75 |         self.min_val = min_val
 76 |         self.max_val = max_val
 77 | 
 78 |     def normalize(self, df):
 79 |         """
 80 |         Args:
 81 |             df: input dataframe
 82 |         Returns:
 83 |             df: normalized dataframe
 84 |         """
 85 |         if self.norm_type == "standardization":
 86 |             if self.mean is None:
 87 |                 self.mean = df.mean()
 88 |                 self.std = df.std()
 89 |             return (df - self.mean) / (self.std + np.finfo(float).eps)
 90 | 
 91 |         elif self.norm_type == "minmax":
 92 |             if self.max_val is None:
 93 |                 self.max_val = df.max()
 94 |                 self.min_val = df.min()
 95 |             return (df - self.min_val) / (self.max_val - self.min_val + np.finfo(float).eps)
 96 | 
 97 |         elif self.norm_type == "per_sample_std":
 98 |             grouped = df.groupby(by=df.index)
 99 |             return (df - grouped.transform('mean')) / grouped.transform('std')
100 | 
101 |         elif self.norm_type == "per_sample_minmax":
102 |             grouped = df.groupby(by=df.index)
103 |             min_vals = grouped.transform('min')
104 |             return (df - min_vals) / (grouped.transform('max') - min_vals + np.finfo(float).eps)
105 | 
106 |         else:
107 |             raise (NameError(f'Normalize method "{self.norm_type}" not implemented'))
108 | 
109 | 
110 | def interpolate_missing(y):
111 |     """
112 |     Replaces NaN values in pd.Series `y` using linear interpolation
113 |     """
114 |     if y.isna().any():
115 |         y = y.interpolate(method='linear', limit_direction='both')
116 |     return y
117 | 
118 | 
119 | def subsample(y, limit=256, factor=2):
120 |     """
121 |     If a given Series is longer than `limit`, returns subsampled sequence by the specified integer factor
122 |     """
123 |     if len(y) > limit:
124 |         return y[::factor].reset_index(drop=True)
125 |     return y
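
The padding and masking steps in `collate_fn` and `padding_mask` can be reproduced with NumPy alone. This is a minimal sketch under that substitution, not the repository's implementation. `pad_and_mask` is a hypothetical helper, and labels are left out for brevity.

```python
import numpy as np


def pad_and_mask(sequences, max_len=None):
    """Zero-pad or clip (seq_len, feat_dim) arrays to one length and build a keep-mask."""
    lengths = np.array([seq.shape[0] for seq in sequences])
    if max_len is None:
        max_len = int(lengths.max())
    feat_dim = sequences[0].shape[-1]
    batch = np.zeros((len(sequences), max_len, feat_dim), dtype=np.float32)
    for i, seq in enumerate(sequences):
        end = min(seq.shape[0], max_len)   # clip sequences longer than max_len
        batch[i, :end] = seq[:end]
    # Position t is kept when t < original length, as in padding_mask.
    mask = np.arange(max_len)[None, :] < lengths[:, None]
    return batch, mask


seqs = [np.ones((2, 3)), np.ones((5, 3))]
batch, mask = pad_and_mask(seqs, max_len=4)
print(batch.shape)
print(mask.astype(int))
```

A sequence longer than `max_len` yields a mask row of all ones, because every retained position holds real data.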
126 | 


--------------------------------------------------------------------------------
/exp/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thuml/TimeXer/76011909357972bd55a27adba2e1be994d81b327/exp/__init__.py


--------------------------------------------------------------------------------
/exp/exp_basic.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | import torch
 3 | from models import Autoformer, Transformer, TimesNet, Nonstationary_Transformer, DLinear, FEDformer, \
 4 |     Informer, LightTS, Reformer, ETSformer, Pyraformer, PatchTST, MICN, Crossformer, FiLM, iTransformer, \
 5 |     Koopa, TiDE, FreTS, TimeMixer, TSMixer, SegRNN, MambaSimple, TemporalFusionTransformer, SCINet, TimeXer
 6 | 
 7 | 
 8 | class Exp_Basic(object):
 9 |     def __init__(self, args):
10 |         self.args = args
11 |         self.model_dict = {
12 |             'TimesNet': TimesNet,
13 |             'Autoformer': Autoformer,
14 |             'Transformer': Transformer,
15 |             'Nonstationary_Transformer': Nonstationary_Transformer,
16 |             'DLinear': DLinear,
17 |             'FEDformer': FEDformer,
18 |             'Informer': Informer,
19 |             'LightTS': LightTS,
20 |             'Reformer': Reformer,
21 |             'ETSformer': ETSformer,
22 |             'PatchTST': PatchTST,
23 |             'Pyraformer': Pyraformer,
24 |             'MICN': MICN,
25 |             'Crossformer': Crossformer,
26 |             'FiLM': FiLM,
27 |             'iTransformer': iTransformer,
28 |             'Koopa': Koopa,
29 |             'TiDE': TiDE,
30 |             'FreTS': FreTS,
31 |             'MambaSimple': MambaSimple,
32 |             'TimeMixer': TimeMixer,
33 |             'TSMixer': TSMixer,
34 |             'SegRNN': SegRNN,
35 |             'TemporalFusionTransformer': TemporalFusionTransformer,
36 |             "SCINet": SCINet,
37 |             'TimeXer': TimeXer
38 |         }
39 |         if args.model == 'Mamba':
40 |             print('Please make sure you have successfully installed mamba_ssm')
41 |             from models import Mamba
42 |             self.model_dict['Mamba'] = Mamba
43 | 
44 |         self.device = self._acquire_device()
45 |         self.model = self._build_model().to(self.device)
46 | 
47 |     def _build_model(self):
48 |         raise NotImplementedError
49 |         return None
50 | 
51 |     def _acquire_device(self):
52 |         if self.args.use_gpu:
53 |             os.environ["CUDA_VISIBLE_DEVICES"] = str(
54 |                 self.args.gpu) if not self.args.use_multi_gpu else self.args.devices
55 |             device = torch.device('cuda:{}'.format(self.args.gpu))
56 |             print('Use GPU: cuda:{}'.format(self.args.gpu))
57 |         else:
58 |             device = torch.device('cpu')
59 |             print('Use CPU')
60 |         return device
61 | 
62 |     def _get_data(self):
63 |         pass
64 | 
65 |     def vali(self):
66 |         pass
67 | 
68 |     def train(self):
69 |         pass
70 | 
71 |     def test(self):
72 |         pass
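
`Exp_Basic` is a template: the constructor fills a name-to-model registry and then calls `_build_model`, which subclasses must override. The torch-free sketch below illustrates that pattern. `ExpBase`, `ExpForecast`, and the string "model" are hypothetical, and the lookup inside `ExpForecast._build_model` is an assumption, because this file does not show how subclasses use `model_dict`.

```python
class ExpBase:
    """Hypothetical, torch-free analogue of Exp_Basic."""

    def __init__(self, args):
        self.args = args
        # Name -> factory registry, playing the role of model_dict.
        self.model_dict = {"Linear": lambda a: f"Linear(pred_len={a['pred_len']})"}
        self.model = self._build_model()

    def _build_model(self):
        raise NotImplementedError


class ExpForecast(ExpBase):
    def _build_model(self):
        # Assumed usage: pick the registry entry named by args and instantiate it.
        return self.model_dict[self.args["model"]](self.args)


exp = ExpForecast({"model": "Linear", "pred_len": 24})
print(exp.model)
```

Instantiating the base class directly fails with `NotImplementedError`, which is why each task-specific experiment class is expected to supply its own `_build_model`.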
73 | 


--------------------------------------------------------------------------------
/figures/ERA5.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thuml/TimeXer/76011909357972bd55a27adba2e1be994d81b327/figures/ERA5.png


--------------------------------------------------------------------------------
/figures/Introduction.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thuml/TimeXer/76011909357972bd55a27adba2e1be994d81b327/figures/Introduction.png


--------------------------------------------------------------------------------
/figures/Result_EPF.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thuml/TimeXer/76011909357972bd55a27adba2e1be994d81b327/figures/Result_EPF.png


--------------------------------------------------------------------------------
/figures/Result_Multivariate.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thuml/TimeXer/76011909357972bd55a27adba2e1be994d81b327/figures/Result_Multivariate.png


--------------------------------------------------------------------------------
/figures/TimeXer.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thuml/TimeXer/76011909357972bd55a27adba2e1be994d81b327/figures/TimeXer.png


--------------------------------------------------------------------------------
/layers/AutoCorrelation.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | import matplotlib.pyplot as plt
  5 | import numpy as np
  6 | import math
  7 | from math import sqrt
  8 | import os
  9 | 
 10 | 
 11 | class AutoCorrelation(nn.Module):
 12 |     """
 13 |     AutoCorrelation Mechanism with the following two phases:
 14 |     (1) period-based dependencies discovery
 15 |     (2) time delay aggregation
 16 |     This block can replace the self-attention family mechanism seamlessly.
 17 |     """
 18 | 
 19 |     def __init__(self, mask_flag=True, factor=1, scale=None, attention_dropout=0.1, output_attention=False):
 20 |         super(AutoCorrelation, self).__init__()
 21 |         self.factor = factor
 22 |         self.scale = scale
 23 |         self.mask_flag = mask_flag
 24 |         self.output_attention = output_attention
 25 |         self.dropout = nn.Dropout(attention_dropout)
 26 | 
 27 |     def time_delay_agg_training(self, values, corr):
 28 |         """
 29 |         SpeedUp version of Autocorrelation (a batch-normalization style design)
 30 |         This is for the training phase.
 31 |         """
 32 |         head = values.shape[1]
 33 |         channel = values.shape[2]
 34 |         length = values.shape[3]
 35 |         # find top k
 36 |         top_k = int(self.factor * math.log(length))
 37 |         mean_value = torch.mean(torch.mean(corr, dim=1), dim=1)
 38 |         index = torch.topk(torch.mean(mean_value, dim=0), top_k, dim=-1)[1]
 39 |         weights = torch.stack([mean_value[:, index[i]] for i in range(top_k)], dim=-1)
 40 |         # update corr
 41 |         tmp_corr = torch.softmax(weights, dim=-1)
 42 |         # aggregation
 43 |         tmp_values = values
 44 |         delays_agg = torch.zeros_like(values).float()
 45 |         for i in range(top_k):
 46 |             pattern = torch.roll(tmp_values, -int(index[i]), -1)
 47 |             delays_agg = delays_agg + pattern * \
 48 |                          (tmp_corr[:, i].unsqueeze(1).unsqueeze(1).unsqueeze(1).repeat(1, head, channel, length))
 49 |         return delays_agg
 50 | 
 51 |     def time_delay_agg_inference(self, values, corr):
 52 |         """
 53 |         SpeedUp version of Autocorrelation (a batch-normalization style design)
 54 |         This is for the inference phase.
 55 |         """
 56 |         batch = values.shape[0]
 57 |         head = values.shape[1]
 58 |         channel = values.shape[2]
 59 |         length = values.shape[3]
 60 |         # index init
 61 |         init_index = torch.arange(length).unsqueeze(0).unsqueeze(0).unsqueeze(0).repeat(batch, head, channel, 1).to(values.device)
 62 |         # find top k
 63 |         top_k = int(self.factor * math.log(length))
 64 |         mean_value = torch.mean(torch.mean(corr, dim=1), dim=1)
 65 |         weights, delay = torch.topk(mean_value, top_k, dim=-1)
 66 |         # update corr
 67 |         tmp_corr = torch.softmax(weights, dim=-1)
 68 |         # aggregation
 69 |         tmp_values = values.repeat(1, 1, 1, 2)
 70 |         delays_agg = torch.zeros_like(values).float()
 71 |         for i in range(top_k):
 72 |             tmp_delay = init_index + delay[:, i].unsqueeze(1).unsqueeze(1).unsqueeze(1).repeat(1, head, channel, length)
 73 |             pattern = torch.gather(tmp_values, dim=-1, index=tmp_delay)
 74 |             delays_agg = delays_agg + pattern * \
 75 |                          (tmp_corr[:, i].unsqueeze(1).unsqueeze(1).unsqueeze(1).repeat(1, head, channel, length))
 76 |         return delays_agg
 77 | 
 78 |     def time_delay_agg_full(self, values, corr):
 79 |         """
 80 |         Standard version of Autocorrelation
 81 |         """
 82 |         batch = values.shape[0]
 83 |         head = values.shape[1]
 84 |         channel = values.shape[2]
 85 |         length = values.shape[3]
 86 |         # index init
 87 |         init_index = torch.arange(length).unsqueeze(0).unsqueeze(0).unsqueeze(0).repeat(batch, head, channel, 1).to(values.device)
 88 |         # find top k
 89 |         top_k = int(self.factor * math.log(length))
 90 |         weights, delay = torch.topk(corr, top_k, dim=-1)
 91 |         # update corr
 92 |         tmp_corr = torch.softmax(weights, dim=-1)
 93 |         # aggregation
 94 |         tmp_values = values.repeat(1, 1, 1, 2)
 95 |         delays_agg = torch.zeros_like(values).float()
 96 |         for i in range(top_k):
 97 |             tmp_delay = init_index + delay[..., i].unsqueeze(-1)
 98 |             pattern = torch.gather(tmp_values, dim=-1, index=tmp_delay)
 99 |             delays_agg = delays_agg + pattern * (tmp_corr[..., i].unsqueeze(-1))
100 |         return delays_agg
101 | 
102 |     def forward(self, queries, keys, values, attn_mask):
103 |         B, L, H, E = queries.shape
104 |         _, S, _, D = values.shape
105 |         if L > S:
106 |             zeros = torch.zeros_like(queries[:, :(L - S), :]).float()
107 |             values = torch.cat([values, zeros], dim=1)
108 |             keys = torch.cat([keys, zeros], dim=1)
109 |         else:
110 |             values = values[:, :L, :, :]
111 |             keys = keys[:, :L, :, :]
112 | 
113 |         # period-based dependencies
114 |         q_fft = torch.fft.rfft(queries.permute(0, 2, 3, 1).contiguous(), dim=-1)
115 |         k_fft = torch.fft.rfft(keys.permute(0, 2, 3, 1).contiguous(), dim=-1)
116 |         res = q_fft * torch.conj(k_fft)
117 |         corr = torch.fft.irfft(res, dim=-1)
118 | 
119 |         # time delay agg
120 |         if self.training:
121 |             V = self.time_delay_agg_training(values.permute(0, 2, 3, 1).contiguous(), corr).permute(0, 3, 1, 2)
122 |         else:
123 |             V = self.time_delay_agg_inference(values.permute(0, 2, 3, 1).contiguous(), corr).permute(0, 3, 1, 2)
124 | 
125 |         if self.output_attention:
126 |             return (V.contiguous(), corr.permute(0, 3, 1, 2))
127 |         else:
128 |             return (V.contiguous(), None)
129 | 
130 | 
131 | class AutoCorrelationLayer(nn.Module):
132 |     def __init__(self, correlation, d_model, n_heads, d_keys=None,
133 |                  d_values=None):
134 |         super(AutoCorrelationLayer, self).__init__()
135 | 
136 |         d_keys = d_keys or (d_model // n_heads)
137 |         d_values = d_values or (d_model // n_heads)
138 | 
139 |         self.inner_correlation = correlation
140 |         self.query_projection = nn.Linear(d_model, d_keys * n_heads)
141 |         self.key_projection = nn.Linear(d_model, d_keys * n_heads)
142 |         self.value_projection = nn.Linear(d_model, d_values * n_heads)
143 |         self.out_projection = nn.Linear(d_values * n_heads, d_model)
144 |         self.n_heads = n_heads
145 | 
146 |     def forward(self, queries, keys, values, attn_mask):
147 |         B, L, _ = queries.shape
148 |         _, S, _ = keys.shape
149 |         H = self.n_heads
150 | 
151 |         queries = self.query_projection(queries).view(B, L, H, -1)
152 |         keys = self.key_projection(keys).view(B, S, H, -1)
153 |         values = self.value_projection(values).view(B, S, H, -1)
154 | 
155 |         out, attn = self.inner_correlation(
156 |             queries,
157 |             keys,
158 |             values,
159 |             attn_mask
160 |         )
161 |         out = out.view(B, L, -1)
162 | 
163 |         return self.out_projection(out), attn
164 | 
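
A minimal sketch of what the `rfft` / `conj` / `irfft` sequence in the `forward` method above computes, written in pure Python so it runs without PyTorch. Multiplying one spectrum by the conjugate of the other and inverting gives the circular cross-correlation at every lag. The helper names (`dft`, `idft`, `circular_corr_fft`, `circular_corr_direct`) are illustrative and not part of the repository, and the naive O(n^2) DFT is only a stand-in for `torch.fft`.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[f] * cmath.exp(2j * cmath.pi * f * t / n) for f in range(n)) / n
            for t in range(n)]


def circular_corr_fft(q, k):
    """corr[tau] = sum_t q[(t + tau) % n] * k[t], computed in the frequency domain."""
    q_ft, k_ft = dft(q), dft(k)
    product = [a * b.conjugate() for a, b in zip(q_ft, k_ft)]
    return [value.real for value in idft(product)]


def circular_corr_direct(q, k):
    """Same quantity computed straight from the definition, for comparison."""
    n = len(q)
    return [sum(q[(t + tau) % n] * k[t] for t in range(n)) for tau in range(n)]
```

For a period-2 signal such as `[1, -1, 1, -1]` correlated with itself, the result peaks at lags 0 and 2; peaks of this kind are what the `torch.topk` call selects as candidate delays.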


--------------------------------------------------------------------------------
/layers/Autoformer_EncDec.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | 
  5 | 
  6 | class my_Layernorm(nn.Module):
  7 |     """
  8 |     Specially designed LayerNorm for the seasonal part
  9 |     """
 10 | 
 11 |     def __init__(self, channels):
 12 |         super(my_Layernorm, self).__init__()
 13 |         self.layernorm = nn.LayerNorm(channels)
 14 | 
 15 |     def forward(self, x):
 16 |         x_hat = self.layernorm(x)
 17 |         bias = torch.mean(x_hat, dim=1).unsqueeze(1).repeat(1, x.shape[1], 1)
 18 |         return x_hat - bias
 19 | 
 20 | 
 21 | class moving_avg(nn.Module):
 22 |     """
 23 |     Moving average block to highlight the trend of time series
 24 |     """
 25 | 
 26 |     def __init__(self, kernel_size, stride):
 27 |         super(moving_avg, self).__init__()
 28 |         self.kernel_size = kernel_size
 29 |         self.avg = nn.AvgPool1d(kernel_size=kernel_size, stride=stride, padding=0)
 30 | 
 31 |     def forward(self, x):
 32 |         # pad both ends of the time series by repeating the edge values
 33 |         front = x[:, 0:1, :].repeat(1, (self.kernel_size - 1) // 2, 1)
 34 |         end = x[:, -1:, :].repeat(1, (self.kernel_size - 1) // 2, 1)
 35 |         x = torch.cat([front, x, end], dim=1)
 36 |         x = self.avg(x.permute(0, 2, 1))
 37 |         x = x.permute(0, 2, 1)
 38 |         return x
 39 | 
 40 | 
 41 | class series_decomp(nn.Module):
 42 |     """
 43 |     Series decomposition block
 44 |     """
 45 | 
 46 |     def __init__(self, kernel_size):
 47 |         super(series_decomp, self).__init__()
 48 |         self.moving_avg = moving_avg(kernel_size, stride=1)
 49 | 
 50 |     def forward(self, x):
 51 |         moving_mean = self.moving_avg(x)
 52 |         res = x - moving_mean
 53 |         return res, moving_mean
 54 | 
 55 | 
 56 | class series_decomp_multi(nn.Module):
 57 |     """
 58 |     Multiple Series decomposition block from FEDformer
 59 |     """
 60 | 
 61 |     def __init__(self, kernel_size):
 62 |         super(series_decomp_multi, self).__init__()
 63 |         self.kernel_size = kernel_size
 64 |         self.series_decomp = nn.ModuleList([series_decomp(kernel) for kernel in kernel_size])
 65 | 
 66 |     def forward(self, x):
 67 |         moving_mean = []
 68 |         res = []
 69 |         for func in self.series_decomp:
 70 |             sea, moving_avg = func(x)
 71 |             moving_mean.append(moving_avg)
 72 |             res.append(sea)
 73 | 
 74 |         sea = sum(res) / len(res)
 75 |         moving_mean = sum(moving_mean) / len(moving_mean)
 76 |         return sea, moving_mean
 77 | 
 78 | 
 79 | class EncoderLayer(nn.Module):
 80 |     """
 81 |     Autoformer encoder layer with the progressive decomposition architecture
 82 |     """
 83 | 
 84 |     def __init__(self, attention, d_model, d_ff=None, moving_avg=25, dropout=0.1, activation="relu"):
 85 |         super(EncoderLayer, self).__init__()
 86 |         d_ff = d_ff or 4 * d_model
 87 |         self.attention = attention
 88 |         self.conv1 = nn.Conv1d(in_channels=d_model, out_channels=d_ff, kernel_size=1, bias=False)
 89 |         self.conv2 = nn.Conv1d(in_channels=d_ff, out_channels=d_model, kernel_size=1, bias=False)
 90 |         self.decomp1 = series_decomp(moving_avg)
 91 |         self.decomp2 = series_decomp(moving_avg)
 92 |         self.dropout = nn.Dropout(dropout)
 93 |         self.activation = F.relu if activation == "relu" else F.gelu
 94 | 
 95 |     def forward(self, x, attn_mask=None):
 96 |         new_x, attn = self.attention(
 97 |             x, x, x,
 98 |             attn_mask=attn_mask
 99 |         )
100 |         x = x + self.dropout(new_x)
101 |         x, _ = self.decomp1(x)
102 |         y = x
103 |         y = self.dropout(self.activation(self.conv1(y.transpose(-1, 1))))
104 |         y = self.dropout(self.conv2(y).transpose(-1, 1))
105 |         res, _ = self.decomp2(x + y)
106 |         return res, attn
107 | 
108 | 
109 | class Encoder(nn.Module):
110 |     """
111 |     Autoformer encoder
112 |     """
113 | 
114 |     def __init__(self, attn_layers, conv_layers=None, norm_layer=None):
115 |         super(Encoder, self).__init__()
116 |         self.attn_layers = nn.ModuleList(attn_layers)
117 |         self.conv_layers = nn.ModuleList(conv_layers) if conv_layers is not None else None
118 |         self.norm = norm_layer
119 | 
120 |     def forward(self, x, attn_mask=None):
121 |         attns = []
122 |         if self.conv_layers is not None:
123 |             for attn_layer, conv_layer in zip(self.attn_layers, self.conv_layers):
124 |                 x, attn = attn_layer(x, attn_mask=attn_mask)
125 |                 x = conv_layer(x)
126 |                 attns.append(attn)
127 |             x, attn = self.attn_layers[-1](x)
128 |             attns.append(attn)
129 |         else:
130 |             for attn_layer in self.attn_layers:
131 |                 x, attn = attn_layer(x, attn_mask=attn_mask)
132 |                 attns.append(attn)
133 | 
134 |         if self.norm is not None:
135 |             x = self.norm(x)
136 | 
137 |         return x, attns
138 | 
139 | 
140 | class DecoderLayer(nn.Module):
141 |     """
142 |     Autoformer decoder layer with the progressive decomposition architecture
143 |     """
144 | 
145 |     def __init__(self, self_attention, cross_attention, d_model, c_out, d_ff=None,
146 |                  moving_avg=25, dropout=0.1, activation="relu"):
147 |         super(DecoderLayer, self).__init__()
148 |         d_ff = d_ff or 4 * d_model
149 |         self.self_attention = self_attention
150 |         self.cross_attention = cross_attention
151 |         self.conv1 = nn.Conv1d(in_channels=d_model, out_channels=d_ff, kernel_size=1, bias=False)
152 |         self.conv2 = nn.Conv1d(in_channels=d_ff, out_channels=d_model, kernel_size=1, bias=False)
153 |         self.decomp1 = series_decomp(moving_avg)
154 |         self.decomp2 = series_decomp(moving_avg)
155 |         self.decomp3 = series_decomp(moving_avg)
156 |         self.dropout = nn.Dropout(dropout)
157 |         self.projection = nn.Conv1d(in_channels=d_model, out_channels=c_out, kernel_size=3, stride=1, padding=1,
158 |                                     padding_mode='circular', bias=False)
159 |         self.activation = F.relu if activation == "relu" else F.gelu
160 | 
161 |     def forward(self, x, cross, x_mask=None, cross_mask=None):
162 |         x = x + self.dropout(self.self_attention(
163 |             x, x, x,
164 |             attn_mask=x_mask
165 |         )[0])
166 |         x, trend1 = self.decomp1(x)
167 |         x = x + self.dropout(self.cross_attention(
168 |             x, cross, cross,
169 |             attn_mask=cross_mask
170 |         )[0])
171 |         x, trend2 = self.decomp2(x)
172 |         y = x
173 |         y = self.dropout(self.activation(self.conv1(y.transpose(-1, 1))))
174 |         y = self.dropout(self.conv2(y).transpose(-1, 1))
175 |         x, trend3 = self.decomp3(x + y)
176 | 
177 |         residual_trend = trend1 + trend2 + trend3
178 |         residual_trend = self.projection(residual_trend.permute(0, 2, 1)).transpose(1, 2)
179 |         return x, residual_trend
180 | 
181 | 
182 | class Decoder(nn.Module):
183 |     """
184 |     Autoformer decoder
185 |     """
186 | 
187 |     def __init__(self, layers, norm_layer=None, projection=None):
188 |         super(Decoder, self).__init__()
189 |         self.layers = nn.ModuleList(layers)
190 |         self.norm = norm_layer
191 |         self.projection = projection
192 | 
193 |     def forward(self, x, cross, x_mask=None, cross_mask=None, trend=None):
194 |         for layer in self.layers:
195 |             x, residual_trend = layer(x, cross, x_mask=x_mask, cross_mask=cross_mask)
196 |             trend = trend + residual_trend
197 | 
198 |         if self.norm is not None:
199 |             x = self.norm(x)
200 | 
201 |         if self.projection is not None:
202 |             x = self.projection(x)
203 |         return x, trend
204 | 
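
A minimal pure-Python sketch of the decomposition that `moving_avg` and `series_decomp` perform on a single univariate series: pad both ends by repeating the edge values, take a stride-1 moving average as the trend, and keep the remainder as the seasonal part. The function names `moving_average` and `decompose` are illustrative only, and the batch and channel dimensions of the real modules are left out.

```python
def moving_average(series, kernel_size):
    """Stride-1 moving average with edge-replicated padding."""
    pad = (kernel_size - 1) // 2
    padded = [series[0]] * pad + list(series) + [series[-1]] * pad
    return [sum(padded[i:i + kernel_size]) / kernel_size
            for i in range(len(padded) - kernel_size + 1)]


def decompose(series, kernel_size):
    """Split a series into (seasonal, trend), mirroring series_decomp.forward."""
    trend = moving_average(series, kernel_size)
    seasonal = [x - t for x, t in zip(series, trend)]
    return seasonal, trend
```

With an odd `kernel_size` (such as the default `moving_avg=25` used by the layers above) the trend has the same length as the input. With an even kernel this padding scheme yields one element fewer, so the subtraction in `series_decomp` would no longer line up.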


--------------------------------------------------------------------------------
/layers/Conv_Blocks.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | import torch.nn as nn
 3 | 
 4 | 
 5 | class Inception_Block_V1(nn.Module):
 6 |     def __init__(self, in_channels, out_channels, num_kernels=6, init_weight=True):
 7 |         super(Inception_Block_V1, self).__init__()
 8 |         self.in_channels = in_channels
 9 |         self.out_channels = out_channels
10 |         self.num_kernels = num_kernels
11 |         kernels = []
12 |         for i in range(self.num_kernels):
13 |             kernels.append(nn.Conv2d(in_channels, out_channels, kernel_size=2 * i + 1, padding=i))
14 |         self.kernels = nn.ModuleList(kernels)
15 |         if init_weight:
16 |             self._initialize_weights()
17 | 
18 |     def _initialize_weights(self):
19 |         for m in self.modules():
20 |             if isinstance(m, nn.Conv2d):
21 |                 nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
22 |                 if m.bias is not None:
23 |                     nn.init.constant_(m.bias, 0)
24 | 
25 |     def forward(self, x):
26 |         res_list = []
27 |         for i in range(self.num_kernels):
28 |             res_list.append(self.kernels[i](x))
29 |         res = torch.stack(res_list, dim=-1).mean(-1)
30 |         return res
31 | 
32 | 
33 | class Inception_Block_V2(nn.Module):
34 |     def __init__(self, in_channels, out_channels, num_kernels=6, init_weight=True):
35 |         super(Inception_Block_V2, self).__init__()
36 |         self.in_channels = in_channels
37 |         self.out_channels = out_channels
38 |         self.num_kernels = num_kernels
39 |         kernels = []
40 |         for i in range(self.num_kernels // 2):
41 |             kernels.append(nn.Conv2d(in_channels, out_channels, kernel_size=[1, 2 * i + 3], padding=[0, i + 1]))
42 |             kernels.append(nn.Conv2d(in_channels, out_channels, kernel_size=[2 * i + 3, 1], padding=[i + 1, 0]))
43 |         kernels.append(nn.Conv2d(in_channels, out_channels, kernel_size=1))
44 |         self.kernels = nn.ModuleList(kernels)
45 |         if init_weight:
46 |             self._initialize_weights()
47 | 
48 |     def _initialize_weights(self):
49 |         for m in self.modules():
50 |             if isinstance(m, nn.Conv2d):
51 |                 nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
52 |                 if m.bias is not None:
53 |                     nn.init.constant_(m.bias, 0)
54 | 
55 |     def forward(self, x):
56 |         res_list = []
57 |         for i in range(self.num_kernels // 2 * 2 + 1):
58 |             res_list.append(self.kernels[i](x))
59 |         res = torch.stack(res_list, dim=-1).mean(-1)
60 |         return res
61 | 
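
A short sketch of why the branches in `Inception_Block_V1` and `Inception_Block_V2` can be stacked and averaged: each kernel size is paired with a padding that leaves the spatial size unchanged. The helper `conv_out_size` is hypothetical; it applies the standard output-size formula for a convolution with dilation 1.

```python
def conv_out_size(size, kernel, padding, stride=1):
    """Output length along one axis for a convolution with dilation 1."""
    return (size + 2 * padding - kernel) // stride + 1


def v1_output_sizes(size, num_kernels=6):
    """Sizes produced by the V1 branches: kernel 2*i + 1 with padding i."""
    return [conv_out_size(size, 2 * i + 1, i) for i in range(num_kernels)]


def v2_output_sizes(size, num_kernels=6):
    """Sizes along the long axis of the V2 branches: kernel 2*i + 3 with padding i + 1."""
    return [conv_out_size(size, 2 * i + 3, i + 1) for i in range(num_kernels // 2)]
```

Every branch returns the input size, so `torch.stack(res_list, dim=-1)` in the blocks above receives tensors of identical shape.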


--------------------------------------------------------------------------------
/layers/Crossformer_EncDec.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | from einops import rearrange, repeat
  4 | from layers.SelfAttention_Family import TwoStageAttentionLayer
  5 | 
  6 | 
  7 | class SegMerging(nn.Module):
  8 |     def __init__(self, d_model, win_size, norm_layer=nn.LayerNorm):
  9 |         super().__init__()
 10 |         self.d_model = d_model
 11 |         self.win_size = win_size
 12 |         self.linear_trans = nn.Linear(win_size * d_model, d_model)
 13 |         self.norm = norm_layer(win_size * d_model)
 14 | 
 15 |     def forward(self, x):
 16 |         batch_size, ts_d, seg_num, d_model = x.shape
 17 |         pad_num = seg_num % self.win_size
 18 |         if pad_num != 0:
 19 |             pad_num = self.win_size - pad_num
 20 |             x = torch.cat((x, x[:, :, -pad_num:, :]), dim=-2)
 21 | 
 22 |         seg_to_merge = []
 23 |         for i in range(self.win_size):
 24 |             seg_to_merge.append(x[:, :, i::self.win_size, :])
 25 |         x = torch.cat(seg_to_merge, -1)
 26 | 
 27 |         x = self.norm(x)
 28 |         x = self.linear_trans(x)
 29 | 
 30 |         return x
 31 | 
 32 | 
 33 | class scale_block(nn.Module):
 34 |     def __init__(self, configs, win_size, d_model, n_heads, d_ff, depth, dropout, \
 35 |                  seg_num=10, factor=10):
 36 |         super(scale_block, self).__init__()
 37 | 
 38 |         if win_size > 1:
 39 |             self.merge_layer = SegMerging(d_model, win_size, nn.LayerNorm)
 40 |         else:
 41 |             self.merge_layer = None
 42 | 
 43 |         self.encode_layers = nn.ModuleList()
 44 | 
 45 |         for i in range(depth):
 46 |             self.encode_layers.append(TwoStageAttentionLayer(configs, seg_num, factor, d_model, n_heads, \
 47 |                                                              d_ff, dropout))
 48 | 
 49 |     def forward(self, x, attn_mask=None, tau=None, delta=None):
 50 |         _, ts_dim, _, _ = x.shape
 51 | 
 52 |         if self.merge_layer is not None:
 53 |             x = self.merge_layer(x)
 54 | 
 55 |         for layer in self.encode_layers:
 56 |             x = layer(x)
 57 | 
 58 |         return x, None
 59 | 
 60 | 
 61 | class Encoder(nn.Module):
 62 |     def __init__(self, attn_layers):
 63 |         super(Encoder, self).__init__()
 64 |         self.encode_blocks = nn.ModuleList(attn_layers)
 65 | 
 66 |     def forward(self, x):
 67 |         encode_x = []
 68 |         encode_x.append(x)
 69 | 
 70 |         for block in self.encode_blocks:
 71 |             x, attns = block(x)
 72 |             encode_x.append(x)
 73 | 
 74 |         return encode_x, None
 75 | 
 76 | 
 77 | class DecoderLayer(nn.Module):
 78 |     def __init__(self, self_attention, cross_attention, seg_len, d_model, d_ff=None, dropout=0.1):
 79 |         super(DecoderLayer, self).__init__()
 80 |         self.self_attention = self_attention
 81 |         self.cross_attention = cross_attention
 82 |         self.norm1 = nn.LayerNorm(d_model)
 83 |         self.norm2 = nn.LayerNorm(d_model)
 84 |         self.dropout = nn.Dropout(dropout)
 85 |         self.MLP1 = nn.Sequential(nn.Linear(d_model, d_model),
 86 |                                   nn.GELU(),
 87 |                                   nn.Linear(d_model, d_model))
 88 |         self.linear_pred = nn.Linear(d_model, seg_len)
 89 | 
 90 |     def forward(self, x, cross):
 91 |         batch = x.shape[0]
 92 |         x = self.self_attention(x)
 93 |         x = rearrange(x, 'b ts_d out_seg_num d_model -> (b ts_d) out_seg_num d_model')
 94 | 
 95 |         cross = rearrange(cross, 'b ts_d in_seg_num d_model -> (b ts_d) in_seg_num d_model')
 96 |         tmp, attn = self.cross_attention(x, cross, cross, None, None, None,)
 97 |         x = x + self.dropout(tmp)
 98 |         y = x = self.norm1(x)
 99 |         y = self.MLP1(y)
100 |         dec_output = self.norm2(x + y)
101 | 
102 |         dec_output = rearrange(dec_output, '(b ts_d) seg_dec_num d_model -> b ts_d seg_dec_num d_model', b=batch)
103 |         layer_predict = self.linear_pred(dec_output)
104 |         layer_predict = rearrange(layer_predict, 'b out_d seg_num seg_len -> b (out_d seg_num) seg_len')
105 | 
106 |         return dec_output, layer_predict
107 | 
108 | 
109 | class Decoder(nn.Module):
110 |     def __init__(self, layers):
111 |         super(Decoder, self).__init__()
112 |         self.decode_layers = nn.ModuleList(layers)
113 | 
114 | 
115 |     def forward(self, x, cross):
116 |         final_predict = None
117 |         i = 0
118 | 
119 |         ts_d = x.shape[1]
120 |         for layer in self.decode_layers:
121 |             cross_enc = cross[i]
122 |             x, layer_predict = layer(x, cross_enc)
123 |             if final_predict is None:
124 |                 final_predict = layer_predict
125 |             else:
126 |                 final_predict = final_predict + layer_predict
127 |             i += 1
128 | 
129 |         final_predict = rearrange(final_predict, 'b (out_d seg_num) seg_len -> b (seg_num seg_len) out_d', out_d=ts_d)
130 | 
131 |         return final_predict
132 | 
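
A pure-Python sketch of the index bookkeeping in `SegMerging.forward`, with plain lists standing in for the segment dimension of the tensor. It pads the segment list to a multiple of `win_size` by repeating the trailing segments, then concatenates each group of `win_size` neighbouring segments. The function name `merge_segments` is illustrative, and the `LayerNorm` and linear projection that follow in the real module are omitted.

```python
def merge_segments(segments, win_size):
    """Merge every win_size adjacent segments, padding with the trailing ones."""
    segs = list(segments)
    remainder = len(segs) % win_size
    if remainder != 0:
        pad_num = win_size - remainder
        segs += segs[-pad_num:]  # same slice as x[:, :, -pad_num:, :]
    # segs[i::win_size] mirrors x[:, :, i::self.win_size, :]
    strided = [segs[i::win_size] for i in range(win_size)]
    # concatenating along the feature axis joins neighbouring segments
    return [sum(group, []) for group in zip(*strided)]
```

For five one-element segments and `win_size=2`, the last segment is repeated once and the result contains three merged segments.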


--------------------------------------------------------------------------------
/layers/Embed.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | from torch.nn.utils import weight_norm
  5 | import math
  6 | 
  7 | 
  8 | class PositionalEmbedding(nn.Module):
  9 |     def __init__(self, d_model, max_len=5000):
 10 |         super(PositionalEmbedding, self).__init__()
 11 |         # Compute the positional encodings once in log space.
 12 |         pe = torch.zeros(max_len, d_model).float()
 13 |         pe.requires_grad = False
 14 | 
 15 |         position = torch.arange(0, max_len).float().unsqueeze(1)
 16 |         div_term = (torch.arange(0, d_model, 2).float()
 17 |                     * -(math.log(10000.0) / d_model)).exp()
 18 | 
 19 |         pe[:, 0::2] = torch.sin(position * div_term)
 20 |         pe[:, 1::2] = torch.cos(position * div_term)
 21 | 
 22 |         pe = pe.unsqueeze(0)
 23 |         self.register_buffer('pe', pe)
 24 | 
 25 |     def forward(self, x):
 26 |         return self.pe[:, :x.size(1)]
 27 | 
 28 | 
 29 | class TokenEmbedding(nn.Module):
 30 |     def __init__(self, c_in, d_model):
 31 |         super(TokenEmbedding, self).__init__()
 32 |         padding = 1 if tuple(int(v) for v in torch.__version__.split('+')[0].split('.')[:2]) >= (1, 5) else 2  # numeric, not string, version compare
 33 |         self.tokenConv = nn.Conv1d(in_channels=c_in, out_channels=d_model,
 34 |                                    kernel_size=3, padding=padding, padding_mode='circular', bias=False)
 35 |         for m in self.modules():
 36 |             if isinstance(m, nn.Conv1d):
 37 |                 nn.init.kaiming_normal_(
 38 |                     m.weight, mode='fan_in', nonlinearity='leaky_relu')
 39 | 
 40 |     def forward(self, x):
 41 |         x = self.tokenConv(x.permute(0, 2, 1)).transpose(1, 2)
 42 |         return x
 43 | 
 44 | 
 45 | class FixedEmbedding(nn.Module):
 46 |     def __init__(self, c_in, d_model):
 47 |         super(FixedEmbedding, self).__init__()
 48 | 
 49 |         w = torch.zeros(c_in, d_model).float()
 50 |         w.requires_grad = False
 51 | 
 52 |         position = torch.arange(0, c_in).float().unsqueeze(1)
 53 |         div_term = (torch.arange(0, d_model, 2).float()
 54 |                     * -(math.log(10000.0) / d_model)).exp()
 55 | 
 56 |         w[:, 0::2] = torch.sin(position * div_term)
 57 |         w[:, 1::2] = torch.cos(position * div_term)
 58 | 
 59 |         self.emb = nn.Embedding(c_in, d_model)
 60 |         self.emb.weight = nn.Parameter(w, requires_grad=False)
 61 | 
 62 |     def forward(self, x):
 63 |         return self.emb(x).detach()
 64 | 
 65 | 
 66 | class TemporalEmbedding(nn.Module):
 67 |     def __init__(self, d_model, embed_type='fixed', freq='h'):
 68 |         super(TemporalEmbedding, self).__init__()
 69 | 
 70 |         minute_size = 4
 71 |         hour_size = 24
 72 |         weekday_size = 7
 73 |         day_size = 32
 74 |         month_size = 13
 75 | 
 76 |         Embed = FixedEmbedding if embed_type == 'fixed' else nn.Embedding
 77 |         if freq == 't':
 78 |             self.minute_embed = Embed(minute_size, d_model)
 79 |         self.hour_embed = Embed(hour_size, d_model)
 80 |         self.weekday_embed = Embed(weekday_size, d_model)
 81 |         self.day_embed = Embed(day_size, d_model)
 82 |         self.month_embed = Embed(month_size, d_model)
 83 | 
 84 |     def forward(self, x):
 85 |         x = x.long()
 86 |         minute_x = self.minute_embed(x[:, :, 4]) if hasattr(
 87 |             self, 'minute_embed') else 0.
 88 |         hour_x = self.hour_embed(x[:, :, 3])
 89 |         weekday_x = self.weekday_embed(x[:, :, 2])
 90 |         day_x = self.day_embed(x[:, :, 1])
 91 |         month_x = self.month_embed(x[:, :, 0])
 92 | 
 93 |         return hour_x + weekday_x + day_x + month_x + minute_x
 94 | 
 95 | 
 96 | class TimeFeatureEmbedding(nn.Module):
 97 |     def __init__(self, d_model, embed_type='timeF', freq='h'):
 98 |         super(TimeFeatureEmbedding, self).__init__()
 99 | 
100 |         freq_map = {'h': 4, 't': 5, 's': 6,
101 |                     'm': 1, 'a': 1, 'w': 2, 'd': 3, 'b': 3}
102 |         d_inp = freq_map[freq]
103 |         self.embed = nn.Linear(d_inp, d_model, bias=False)
104 | 
105 |     def forward(self, x):
106 |         return self.embed(x)
107 | 
108 | 
109 | class DataEmbedding(nn.Module):
110 |     def __init__(self, c_in, d_model, embed_type='fixed', freq='h', dropout=0.1):
111 |         super(DataEmbedding, self).__init__()
112 | 
113 |         self.value_embedding = TokenEmbedding(c_in=c_in, d_model=d_model)
114 |         self.position_embedding = PositionalEmbedding(d_model=d_model)
115 |         self.temporal_embedding = TemporalEmbedding(d_model=d_model, embed_type=embed_type,
116 |                                                     freq=freq) if embed_type != 'timeF' else TimeFeatureEmbedding(
117 |             d_model=d_model, embed_type=embed_type, freq=freq)
118 |         self.dropout = nn.Dropout(p=dropout)
119 | 
120 |     def forward(self, x, x_mark):
121 |         if x_mark is None:
122 |             x = self.value_embedding(x) + self.position_embedding(x)
123 |         else:
124 |             x = self.value_embedding(
125 |                 x) + self.temporal_embedding(x_mark) + self.position_embedding(x)
126 |         return self.dropout(x)
127 | 
128 | 
129 | class DataEmbedding_inverted(nn.Module):
130 |     def __init__(self, c_in, d_model, embed_type='fixed', freq='h', dropout=0.1):
131 |         super(DataEmbedding_inverted, self).__init__()
132 |         self.value_embedding = nn.Linear(c_in, d_model)
133 |         self.dropout = nn.Dropout(p=dropout)
134 | 
135 |     def forward(self, x, x_mark):
136 |         x = x.permute(0, 2, 1)
137 |         # x: [Batch Variate Time]
138 |         if x_mark is None:
139 |             x = self.value_embedding(x)
140 |         else:
141 |             x = self.value_embedding(torch.cat([x, x_mark.permute(0, 2, 1)], 1))
142 |         # x: [Batch Variate d_model]
143 |         return self.dropout(x)
144 | 
145 | 
146 | class DataEmbedding_wo_pos(nn.Module):
147 |     def __init__(self, c_in, d_model, embed_type='fixed', freq='h', dropout=0.1):
148 |         super(DataEmbedding_wo_pos, self).__init__()
149 | 
150 |         self.value_embedding = TokenEmbedding(c_in=c_in, d_model=d_model)
151 |         self.position_embedding = PositionalEmbedding(d_model=d_model)
152 |         self.temporal_embedding = TemporalEmbedding(d_model=d_model, embed_type=embed_type,
153 |                                                     freq=freq) if embed_type != 'timeF' else TimeFeatureEmbedding(
154 |             d_model=d_model, embed_type=embed_type, freq=freq)
155 |         self.dropout = nn.Dropout(p=dropout)
156 | 
157 |     def forward(self, x, x_mark):
158 |         if x_mark is None:
159 |             x = self.value_embedding(x)
160 |         else:
161 |             x = self.value_embedding(x) + self.temporal_embedding(x_mark)
162 |         return self.dropout(x)
163 | 
164 | 
165 | class PatchEmbedding(nn.Module):
166 |     def __init__(self, d_model, patch_len, stride, padding, dropout):
167 |         super(PatchEmbedding, self).__init__()
168 |         # Patching
169 |         self.patch_len = patch_len
170 |         self.stride = stride
171 |         self.padding_patch_layer = nn.ReplicationPad1d((0, padding))
172 | 
173 |         # Backbone, Input encoding: projection of feature vectors onto a d-dim vector space
174 |         self.value_embedding = nn.Linear(patch_len, d_model, bias=False)
175 | 
176 |         # Positional embedding
177 |         self.position_embedding = PositionalEmbedding(d_model)
178 | 
179 |         # Residual dropout
180 |         self.dropout = nn.Dropout(dropout)
181 | 
182 |     def forward(self, x):
183 |         # do patching
184 |         n_vars = x.shape[1]
185 |         x = self.padding_patch_layer(x)
186 |         x = x.unfold(dimension=-1, size=self.patch_len, step=self.stride)
187 |         x = torch.reshape(x, (x.shape[0] * x.shape[1], x.shape[2], x.shape[3]))
188 |         # Input encoding
189 |         x = self.value_embedding(x) + self.position_embedding(x)
190 |         return self.dropout(x), n_vars
191 | 
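
A pure-Python sketch of the table that `PositionalEmbedding` and `FixedEmbedding` precompute: even columns hold sines and odd columns hold cosines of the position scaled by a geometrically decreasing frequency. The function name `positional_encoding` is illustrative, and the sketch assumes `d_model` is even.

```python
import math


def positional_encoding(max_len, d_model):
    """Return a max_len x d_model table of sinusoidal encodings (d_model even)."""
    # Same values as (torch.arange(0, d_model, 2) * -(log(10000) / d_model)).exp()
    div_term = [math.exp(-i * math.log(10000.0) / d_model)
                for i in range(0, d_model, 2)]
    table = []
    for pos in range(max_len):
        row = [0.0] * d_model
        for j, scale in enumerate(div_term):
            row[2 * j] = math.sin(pos * scale)      # pe[:, 0::2]
            row[2 * j + 1] = math.cos(pos * scale)  # pe[:, 1::2]
        table.append(row)
    return table
```

Position 0 always encodes to alternating zeros and ones, since `sin(0) = 0` and `cos(0) = 1`; `PositionalEmbedding.forward` then slices the first `x.size(1)` rows of this table.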


--------------------------------------------------------------------------------
/layers/FourierCorrelation.py:
--------------------------------------------------------------------------------
  1 | # coding=utf-8
  2 | # author=maziqing
  3 | # email=maziqing.mzq@alibaba-inc.com
  4 | 
  5 | import numpy as np
  6 | import torch
  7 | import torch.nn as nn
  8 | 
  9 | 
 10 | def get_frequency_modes(seq_len, modes=64, mode_select_method='random'):
 11 |     """
 12 |     Select modes in the frequency domain:
 13 |     'random' samples modes at random;
 14 |     any other value takes the lowest modes.
 15 |     """
 16 |     modes = min(modes, seq_len // 2)
 17 |     if mode_select_method == 'random':
 18 |         index = list(range(0, seq_len // 2))
 19 |         np.random.shuffle(index)
 20 |         index = index[:modes]
 21 |     else:
 22 |         index = list(range(0, modes))
 23 |     index.sort()
 24 |     return index
 25 | 
 26 | 
 27 | # ########## fourier layer #############
 28 | class FourierBlock(nn.Module):
 29 |     def __init__(self, in_channels, out_channels, seq_len, modes=0, mode_select_method='random'):
 30 |         """
 31 |         1D Fourier block. It performs representation learning in the frequency domain:
 32 |         FFT, linear transform, then inverse FFT.
 33 |         """
 34 |         super(FourierBlock, self).__init__()
 35 |         print('fourier enhanced block used!')
 36 |         # get modes on frequency domain
 37 |         self.index = get_frequency_modes(seq_len, modes=modes, mode_select_method=mode_select_method)
 38 |         print('modes={}, index={}'.format(modes, self.index))
 39 | 
 40 |         self.scale = (1 / (in_channels * out_channels))
 41 |         self.weights1 = nn.Parameter(
 42 |             self.scale * torch.rand(8, in_channels // 8, out_channels // 8, len(self.index), dtype=torch.float))
 43 |         self.weights2 = nn.Parameter(
 44 |             self.scale * torch.rand(8, in_channels // 8, out_channels // 8, len(self.index), dtype=torch.float))
 45 | 
 46 |     # Complex multiplication
 47 |     def compl_mul1d(self, order, x, weights):
 48 |         x_flag = True
 49 |         w_flag = True
 50 |         if not torch.is_complex(x):
 51 |             x_flag = False
 52 |             x = torch.complex(x, torch.zeros_like(x).to(x.device))
 53 |         if not torch.is_complex(weights):
 54 |             w_flag = False
 55 |             weights = torch.complex(weights, torch.zeros_like(weights).to(weights.device))
 56 |         if x_flag or w_flag:
 57 |             return torch.complex(torch.einsum(order, x.real, weights.real) - torch.einsum(order, x.imag, weights.imag),
 58 |                                  torch.einsum(order, x.real, weights.imag) + torch.einsum(order, x.imag, weights.real))
 59 |         else:
 60 |             return torch.einsum(order, x.real, weights.real)
 61 | 
 62 |     def forward(self, q, k, v, mask):
 63 |         # size = [B, L, H, E]
 64 |         B, L, H, E = q.shape
 65 |         x = q.permute(0, 2, 3, 1)
 66 |         # Compute Fourier coefficients
 67 |         x_ft = torch.fft.rfft(x, dim=-1)
 68 |         # Perform Fourier neural operations
 69 |         out_ft = torch.zeros(B, H, E, L // 2 + 1, device=x.device, dtype=torch.cfloat)
 70 |         for wi, i in enumerate(self.index):
 71 |             if i >= x_ft.shape[3] or wi >= out_ft.shape[3]:
 72 |                 continue
 73 |             out_ft[:, :, :, wi] = self.compl_mul1d("bhi,hio->bho", x_ft[:, :, :, i],
 74 |                                                    torch.complex(self.weights1, self.weights2)[:, :, :, wi])
 75 |         # Return to time domain
 76 |         x = torch.fft.irfft(out_ft, n=x.size(-1))
 77 |         return (x, None)
 78 | 
 79 | 
 80 | # ########## Fourier Cross Former ####################
 81 | class FourierCrossAttention(nn.Module):
 82 |     def __init__(self, in_channels, out_channels, seq_len_q, seq_len_kv, modes=64, mode_select_method='random',
 83 |                  activation='tanh', policy=0, num_heads=8):
 84 |         """
 85 |         1D Fourier cross-attention layer: FFT, linear transform, attention mechanism, then inverse FFT.
 86 |         """
 87 |         super(FourierCrossAttention, self).__init__()
 88 |         print(' fourier enhanced cross attention used!')
 89 |         self.activation = activation
 90 |         self.in_channels = in_channels
 91 |         self.out_channels = out_channels
 92 |         # get modes for queries and keys (& values) on frequency domain
 93 |         self.index_q = get_frequency_modes(seq_len_q, modes=modes, mode_select_method=mode_select_method)
 94 |         self.index_kv = get_frequency_modes(seq_len_kv, modes=modes, mode_select_method=mode_select_method)
 95 | 
 96 |         print('modes_q={}, index_q={}'.format(len(self.index_q), self.index_q))
 97 |         print('modes_kv={}, index_kv={}'.format(len(self.index_kv), self.index_kv))
 98 | 
 99 |         self.scale = (1 / (in_channels * out_channels))
100 |         self.weights1 = nn.Parameter(
101 |             self.scale * torch.rand(num_heads, in_channels // num_heads, out_channels // num_heads, len(self.index_q), dtype=torch.float))
102 |         self.weights2 = nn.Parameter(
103 |             self.scale * torch.rand(num_heads, in_channels // num_heads, out_channels // num_heads, len(self.index_q), dtype=torch.float))
104 | 
105 |     # Complex multiplication
106 |     def compl_mul1d(self, order, x, weights):
107 |         x_flag = True
108 |         w_flag = True
109 |         if not torch.is_complex(x):
110 |             x_flag = False
111 |             x = torch.complex(x, torch.zeros_like(x).to(x.device))
112 |         if not torch.is_complex(weights):
113 |             w_flag = False
114 |             weights = torch.complex(weights, torch.zeros_like(weights).to(weights.device))
115 |         if x_flag or w_flag:
116 |             return torch.complex(torch.einsum(order, x.real, weights.real) - torch.einsum(order, x.imag, weights.imag),
117 |                                  torch.einsum(order, x.real, weights.imag) + torch.einsum(order, x.imag, weights.real))
118 |         else:
119 |             return torch.einsum(order, x.real, weights.real)
120 | 
121 |     def forward(self, q, k, v, mask):
122 |         # size = [B, L, H, E]
123 |         B, L, H, E = q.shape
124 |         xq = q.permute(0, 2, 3, 1)  # size = [B, H, E, L]
125 |         xk = k.permute(0, 2, 3, 1)
126 |         xv = v.permute(0, 2, 3, 1)
127 | 
128 |         # Compute Fourier coefficients
129 |         xq_ft_ = torch.zeros(B, H, E, len(self.index_q), device=xq.device, dtype=torch.cfloat)
130 |         xq_ft = torch.fft.rfft(xq, dim=-1)
131 |         for i, j in enumerate(self.index_q):
132 |             if j >= xq_ft.shape[3]:
133 |                 continue
134 |             xq_ft_[:, :, :, i] = xq_ft[:, :, :, j]
135 |         xk_ft_ = torch.zeros(B, H, E, len(self.index_kv), device=xq.device, dtype=torch.cfloat)
136 |         xk_ft = torch.fft.rfft(xk, dim=-1)
137 |         for i, j in enumerate(self.index_kv):
138 |             if j >= xk_ft.shape[3]:
139 |                 continue
140 |             xk_ft_[:, :, :, i] = xk_ft[:, :, :, j]
141 | 
142 |         # perform attention mechanism on frequency domain
143 |         xqk_ft = (self.compl_mul1d("bhex,bhey->bhxy", xq_ft_, xk_ft_))
144 |         if self.activation == 'tanh':
145 |             xqk_ft = torch.complex(xqk_ft.real.tanh(), xqk_ft.imag.tanh())
146 |         elif self.activation == 'softmax':
147 |             xqk_ft = torch.softmax(abs(xqk_ft), dim=-1)
148 |             xqk_ft = torch.complex(xqk_ft, torch.zeros_like(xqk_ft))
149 |         else:
150 |             raise Exception('{} activation function is not implemented'.format(self.activation))
151 |         xqkv_ft = self.compl_mul1d("bhxy,bhey->bhex", xqk_ft, xk_ft_)
152 |         xqkvw = self.compl_mul1d("bhex,heox->bhox", xqkv_ft, torch.complex(self.weights1, self.weights2))
153 |         out_ft = torch.zeros(B, H, E, L // 2 + 1, device=xq.device, dtype=torch.cfloat)
154 |         for i, j in enumerate(self.index_q):
155 |             if i >= xqkvw.shape[3] or j >= out_ft.shape[3]:
156 |                 continue
157 |             out_ft[:, :, :, j] = xqkvw[:, :, :, i]
158 |         # Return to time domain
159 |         out = torch.fft.irfft(out_ft / self.in_channels / self.out_channels, n=xq.size(-1))
160 |         return (out, None)
161 | 
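
The `compl_mul1d` method above assembles a complex product from four real-valued einsum calls. A minimal sketch of the same identity on scalars, written in plain Python so it runs without PyTorch; the helper name `complex_mul_from_parts` is hypothetical and not part of the repository:

```python
def complex_mul_from_parts(x: complex, w: complex) -> complex:
    """Multiply two complex numbers using only real arithmetic.

    (a + bi)(c + di) = (ac - bd) + (ad + bc)i, the same pattern
    compl_mul1d applies tensor-wise with einsum.
    """
    real = x.real * w.real - x.imag * w.imag
    imag = x.real * w.imag + x.imag * w.real
    return complex(real, imag)


x = complex(1.5, -2.0)
w = complex(0.25, 4.0)
result = complex_mul_from_parts(x, w)

# The hand-built product agrees with Python's native complex multiplication.
assert abs(result - x * w) < 1e-12
```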


--------------------------------------------------------------------------------
/layers/StandardNorm.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | import torch.nn as nn
 3 | 
 4 | 
 5 | class Normalize(nn.Module):
 6 |     def __init__(self, num_features: int, eps=1e-5, affine=False, subtract_last=False, non_norm=False):
 7 |         """
 8 |         :param num_features: the number of features or channels
 9 |         :param eps: a value added for numerical stability
10 |         :param affine: if True, RevIN has learnable affine parameters
11 |         """
12 |         super(Normalize, self).__init__()
13 |         self.num_features = num_features
14 |         self.eps = eps
15 |         self.affine = affine
16 |         self.subtract_last = subtract_last
17 |         self.non_norm = non_norm
18 |         if self.affine:
19 |             self._init_params()
20 | 
21 |     def forward(self, x, mode: str):
22 |         if mode == 'norm':
23 |             self._get_statistics(x)
24 |             x = self._normalize(x)
25 |         elif mode == 'denorm':
26 |             x = self._denormalize(x)
27 |         else:
28 |             raise NotImplementedError
29 |         return x
30 | 
31 |     def _init_params(self):
32 |         # initialize RevIN params: (C,)
33 |         self.affine_weight = nn.Parameter(torch.ones(self.num_features))
34 |         self.affine_bias = nn.Parameter(torch.zeros(self.num_features))
35 | 
36 |     def _get_statistics(self, x):
37 |         dim2reduce = tuple(range(1, x.ndim - 1))
38 |         if self.subtract_last:
39 |             self.last = x[:, -1, :].unsqueeze(1)
40 |         else:
41 |             self.mean = torch.mean(x, dim=dim2reduce, keepdim=True).detach()
42 |         self.stdev = torch.sqrt(torch.var(x, dim=dim2reduce, keepdim=True, unbiased=False) + self.eps).detach()
43 | 
44 |     def _normalize(self, x):
45 |         if self.non_norm:
46 |             return x
47 |         if self.subtract_last:
48 |             x = x - self.last
49 |         else:
50 |             x = x - self.mean
51 |         x = x / self.stdev
52 |         if self.affine:
53 |             x = x * self.affine_weight
54 |             x = x + self.affine_bias
55 |         return x
56 | 
57 |     def _denormalize(self, x):
58 |         if self.non_norm:
59 |             return x
60 |         if self.affine:
61 |             x = x - self.affine_bias
62 |             x = x / (self.affine_weight + self.eps * self.eps)
63 |         x = x * self.stdev
64 |         if self.subtract_last:
65 |             x = x + self.last
66 |         else:
67 |             x = x + self.mean
68 |         return x
69 | 
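
The `Normalize` module above stores per-instance statistics in `'norm'` mode and reuses them in `'denorm'` mode. A minimal sketch of that round trip for a single univariate window, in plain Python without PyTorch, covering only the default path (`affine=False`, `subtract_last=False`); the function names are hypothetical:

```python
import math


def instance_stats(series, eps=1e-5):
    """Return (mean, stdev) of one window, using the biased variance plus eps."""
    mean = sum(series) / len(series)
    var = sum((v - mean) ** 2 for v in series) / len(series)
    return mean, math.sqrt(var + eps)


def normalize(series, mean, stdev):
    """Centre and scale the window with its own statistics."""
    return [(v - mean) / stdev for v in series]


def denormalize(series, mean, stdev):
    """Invert normalize() with the statistics saved earlier."""
    return [v * stdev + mean for v in series]


window = [10.0, 12.0, 11.0, 15.0, 14.0]
mean, stdev = instance_stats(window)
normed = normalize(window, mean, stdev)
restored = denormalize(normed, mean, stdev)

# Denormalizing with the saved statistics recovers the original window.
assert all(abs(a - b) < 1e-9 for a, b in zip(window, restored))
```

In a forecasting model the denormalize step would typically be applied to the model output rather than to the normalized input, so the prediction is returned on the original scale of the window.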


--------------------------------------------------------------------------------
/layers/Transformer_EncDec.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | 
  5 | 
  6 | class ConvLayer(nn.Module):
  7 |     def __init__(self, c_in):
  8 |         super(ConvLayer, self).__init__()
  9 |         self.downConv = nn.Conv1d(in_channels=c_in,
 10 |                                   out_channels=c_in,
 11 |                                   kernel_size=3,
 12 |                                   padding=2,
 13 |                                   padding_mode='circular')
 14 |         self.norm = nn.BatchNorm1d(c_in)
 15 |         self.activation = nn.ELU()
 16 |         self.maxPool = nn.MaxPool1d(kernel_size=3, stride=2, padding=1)
 17 | 
 18 |     def forward(self, x):
 19 |         x = self.downConv(x.permute(0, 2, 1))
 20 |         x = self.norm(x)
 21 |         x = self.activation(x)
 22 |         x = self.maxPool(x)
 23 |         x = x.transpose(1, 2)
 24 |         return x
 25 | 
 26 | 
 27 | class EncoderLayer(nn.Module):
 28 |     def __init__(self, attention, d_model, d_ff=None, dropout=0.1, activation="relu"):
 29 |         super(EncoderLayer, self).__init__()
 30 |         d_ff = d_ff or 4 * d_model
 31 |         self.attention = attention
 32 |         self.conv1 = nn.Conv1d(in_channels=d_model, out_channels=d_ff, kernel_size=1)
 33 |         self.conv2 = nn.Conv1d(in_channels=d_ff, out_channels=d_model, kernel_size=1)
 34 |         self.norm1 = nn.LayerNorm(d_model)
 35 |         self.norm2 = nn.LayerNorm(d_model)
 36 |         self.dropout = nn.Dropout(dropout)
 37 |         self.activation = F.relu if activation == "relu" else F.gelu
 38 | 
 39 |     def forward(self, x, attn_mask=None, tau=None, delta=None):
 40 |         new_x, attn = self.attention(
 41 |             x, x, x,
 42 |             attn_mask=attn_mask,
 43 |             tau=tau, delta=delta
 44 |         )
 45 |         x = x + self.dropout(new_x)
 46 | 
 47 |         y = x = self.norm1(x)
 48 |         y = self.dropout(self.activation(self.conv1(y.transpose(-1, 1))))
 49 |         y = self.dropout(self.conv2(y).transpose(-1, 1))
 50 | 
 51 |         return self.norm2(x + y), attn
 52 | 
 53 | 
 54 | class Encoder(nn.Module):
 55 |     def __init__(self, attn_layers, conv_layers=None, norm_layer=None):
 56 |         super(Encoder, self).__init__()
 57 |         self.attn_layers = nn.ModuleList(attn_layers)
 58 |         self.conv_layers = nn.ModuleList(conv_layers) if conv_layers is not None else None
 59 |         self.norm = norm_layer
 60 | 
 61 |     def forward(self, x, attn_mask=None, tau=None, delta=None):
 62 |         # x [B, L, D]
 63 |         attns = []
 64 |         if self.conv_layers is not None:
 65 |             for i, (attn_layer, conv_layer) in enumerate(zip(self.attn_layers, self.conv_layers)):
 66 |                 delta = delta if i == 0 else None
 67 |                 x, attn = attn_layer(x, attn_mask=attn_mask, tau=tau, delta=delta)
 68 |                 x = conv_layer(x)
 69 |                 attns.append(attn)
 70 |             x, attn = self.attn_layers[-1](x, tau=tau, delta=None)
 71 |             attns.append(attn)
 72 |         else:
 73 |             for attn_layer in self.attn_layers:
 74 |                 x, attn = attn_layer(x, attn_mask=attn_mask, tau=tau, delta=delta)
 75 |                 attns.append(attn)
 76 | 
 77 |         if self.norm is not None:
 78 |             x = self.norm(x)
 79 | 
 80 |         return x, attns
 81 | 
 82 | 
 83 | class DecoderLayer(nn.Module):
 84 |     def __init__(self, self_attention, cross_attention, d_model, d_ff=None,
 85 |                  dropout=0.1, activation="relu"):
 86 |         super(DecoderLayer, self).__init__()
 87 |         d_ff = d_ff or 4 * d_model
 88 |         self.self_attention = self_attention
 89 |         self.cross_attention = cross_attention
 90 |         self.conv1 = nn.Conv1d(in_channels=d_model, out_channels=d_ff, kernel_size=1)
 91 |         self.conv2 = nn.Conv1d(in_channels=d_ff, out_channels=d_model, kernel_size=1)
 92 |         self.norm1 = nn.LayerNorm(d_model)
 93 |         self.norm2 = nn.LayerNorm(d_model)
 94 |         self.norm3 = nn.LayerNorm(d_model)
 95 |         self.dropout = nn.Dropout(dropout)
 96 |         self.activation = F.relu if activation == "relu" else F.gelu
 97 | 
 98 |     def forward(self, x, cross, x_mask=None, cross_mask=None, tau=None, delta=None):
 99 |         x = x + self.dropout(self.self_attention(
100 |             x, x, x,
101 |             attn_mask=x_mask,
102 |             tau=tau, delta=None
103 |         )[0])
104 |         x = self.norm1(x)
105 | 
106 |         x = x + self.dropout(self.cross_attention(
107 |             x, cross, cross,
108 |             attn_mask=cross_mask,
109 |             tau=tau, delta=delta
110 |         )[0])
111 | 
112 |         y = x = self.norm2(x)
113 |         y = self.dropout(self.activation(self.conv1(y.transpose(-1, 1))))
114 |         y = self.dropout(self.conv2(y).transpose(-1, 1))
115 | 
116 |         return self.norm3(x + y)
117 | 
118 | 
119 | class Decoder(nn.Module):
120 |     def __init__(self, layers, norm_layer=None, projection=None):
121 |         super(Decoder, self).__init__()
122 |         self.layers = nn.ModuleList(layers)
123 |         self.norm = norm_layer
124 |         self.projection = projection
125 | 
126 |     def forward(self, x, cross, x_mask=None, cross_mask=None, tau=None, delta=None):
127 |         for layer in self.layers:
128 |             x = layer(x, cross, x_mask=x_mask, cross_mask=cross_mask, tau=tau, delta=delta)
129 | 
130 |         if self.norm is not None:
131 |             x = self.norm(x)
132 | 
133 |         if self.projection is not None:
134 |             x = self.projection(x)
135 |         return x
136 | 
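
The `ConvLayer` above changes the sequence length twice: the circular convolution (`kernel_size=3`, `padding=2`) lengthens it, and the max pooling (`kernel_size=3`, `stride=2`, `padding=1`) roughly halves it. A small sketch of that arithmetic, assuming the standard output-length formula given in the PyTorch `Conv1d` and `MaxPool1d` documentation; the helper names are hypothetical:

```python
def out_len(length, kernel_size, padding, stride=1, dilation=1):
    """Output length of a 1D convolution or pooling window (floor mode)."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def conv_layer_out_len(length):
    """Sequence length after one ConvLayer: circular conv, then max pool."""
    after_conv = out_len(length, kernel_size=3, padding=2)            # L + 2
    return out_len(after_conv, kernel_size=3, padding=1, stride=2)    # about L / 2


# Track the length through two stacked ConvLayers, starting from 96 steps.
lengths = [96]
for _ in range(2):
    lengths.append(conv_layer_out_len(lengths[-1]))

print(lengths)
```

Because the convolution adds two steps before pooling, the result is slightly more than half the input length, which matters when a downstream head assumes an exact halving.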


--------------------------------------------------------------------------------
/layers/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thuml/TimeXer/76011909357972bd55a27adba2e1be994d81b327/layers/__init__.py


--------------------------------------------------------------------------------
/models/Autoformer.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | from layers.Embed import DataEmbedding, DataEmbedding_wo_pos
  5 | from layers.AutoCorrelation import AutoCorrelation, AutoCorrelationLayer
  6 | from layers.Autoformer_EncDec import Encoder, Decoder, EncoderLayer, DecoderLayer, my_Layernorm, series_decomp
  7 | import math
  8 | import numpy as np
  9 | 
 10 | 
 11 | class Model(nn.Module):
 12 |     """
 13 |     Autoformer is the first method to achieve the series-wise connection,
 14 |     with inherent O(LlogL) complexity
 15 |     Paper link: https://openreview.net/pdf?id=I55UqU-M11y
 16 |     """
 17 | 
 18 |     def __init__(self, configs):
 19 |         super(Model, self).__init__()
 20 |         self.task_name = configs.task_name
 21 |         self.seq_len = configs.seq_len
 22 |         self.label_len = configs.label_len
 23 |         self.pred_len = configs.pred_len
 24 | 
 25 |         # Decomp
 26 |         kernel_size = configs.moving_avg
 27 |         self.decomp = series_decomp(kernel_size)
 28 | 
 29 |         # Embedding
 30 |         self.enc_embedding = DataEmbedding_wo_pos(configs.enc_in, configs.d_model, configs.embed, configs.freq,
 31 |                                                   configs.dropout)
 32 |         # Encoder
 33 |         self.encoder = Encoder(
 34 |             [
 35 |                 EncoderLayer(
 36 |                     AutoCorrelationLayer(
 37 |                         AutoCorrelation(False, configs.factor, attention_dropout=configs.dropout,
 38 |                                         output_attention=False),
 39 |                         configs.d_model, configs.n_heads),
 40 |                     configs.d_model,
 41 |                     configs.d_ff,
 42 |                     moving_avg=configs.moving_avg,
 43 |                     dropout=configs.dropout,
 44 |                     activation=configs.activation
 45 |                 ) for l in range(configs.e_layers)
 46 |             ],
 47 |             norm_layer=my_Layernorm(configs.d_model)
 48 |         )
 49 |         # Decoder
 50 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
 51 |             self.dec_embedding = DataEmbedding_wo_pos(configs.dec_in, configs.d_model, configs.embed, configs.freq,
 52 |                                                       configs.dropout)
 53 |             self.decoder = Decoder(
 54 |                 [
 55 |                     DecoderLayer(
 56 |                         AutoCorrelationLayer(
 57 |                             AutoCorrelation(True, configs.factor, attention_dropout=configs.dropout,
 58 |                                             output_attention=False),
 59 |                             configs.d_model, configs.n_heads),
 60 |                         AutoCorrelationLayer(
 61 |                             AutoCorrelation(False, configs.factor, attention_dropout=configs.dropout,
 62 |                                             output_attention=False),
 63 |                             configs.d_model, configs.n_heads),
 64 |                         configs.d_model,
 65 |                         configs.c_out,
 66 |                         configs.d_ff,
 67 |                         moving_avg=configs.moving_avg,
 68 |                         dropout=configs.dropout,
 69 |                         activation=configs.activation,
 70 |                     )
 71 |                     for l in range(configs.d_layers)
 72 |                 ],
 73 |                 norm_layer=my_Layernorm(configs.d_model),
 74 |                 projection=nn.Linear(configs.d_model, configs.c_out, bias=True)
 75 |             )
 76 |         if self.task_name == 'imputation':
 77 |             self.projection = nn.Linear(
 78 |                 configs.d_model, configs.c_out, bias=True)
 79 |         if self.task_name == 'anomaly_detection':
 80 |             self.projection = nn.Linear(
 81 |                 configs.d_model, configs.c_out, bias=True)
 82 |         if self.task_name == 'classification':
 83 |             self.act = F.gelu
 84 |             self.dropout = nn.Dropout(configs.dropout)
 85 |             self.projection = nn.Linear(
 86 |                 configs.d_model * configs.seq_len, configs.num_class)
 87 | 
 88 |     def forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
 89 |         # decomp init
 90 |         mean = torch.mean(x_enc, dim=1).unsqueeze(
 91 |             1).repeat(1, self.pred_len, 1)
 92 |         zeros = torch.zeros([x_dec.shape[0], self.pred_len,
 93 |                              x_dec.shape[2]], device=x_enc.device)
 94 |         seasonal_init, trend_init = self.decomp(x_enc)
 95 |         # decoder input
 96 |         trend_init = torch.cat(
 97 |             [trend_init[:, -self.label_len:, :], mean], dim=1)
 98 |         seasonal_init = torch.cat(
 99 |             [seasonal_init[:, -self.label_len:, :], zeros], dim=1)
100 |         # enc
101 |         enc_out = self.enc_embedding(x_enc, x_mark_enc)
102 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
103 |         # dec
104 |         dec_out = self.dec_embedding(seasonal_init, x_mark_dec)
105 |         seasonal_part, trend_part = self.decoder(dec_out, enc_out, x_mask=None, cross_mask=None,
106 |                                                  trend=trend_init)
107 |         # final
108 |         dec_out = trend_part + seasonal_part
109 |         return dec_out
110 | 
111 |     def imputation(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask):
112 |         # enc
113 |         enc_out = self.enc_embedding(x_enc, x_mark_enc)
114 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
115 |         # final
116 |         dec_out = self.projection(enc_out)
117 |         return dec_out
118 | 
119 |     def anomaly_detection(self, x_enc):
120 |         # enc
121 |         enc_out = self.enc_embedding(x_enc, None)
122 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
123 |         # final
124 |         dec_out = self.projection(enc_out)
125 |         return dec_out
126 | 
127 |     def classification(self, x_enc, x_mark_enc):
128 |         # enc
129 |         enc_out = self.enc_embedding(x_enc, None)
130 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
131 | 
132 |         # Output
133 |         # the output transformer encoder/decoder embeddings don't include non-linearity
134 |         output = self.act(enc_out)
135 |         output = self.dropout(output)
136 |         # zero-out padding embeddings
137 |         output = output * x_mark_enc.unsqueeze(-1)
138 |         # (batch_size, seq_length * d_model)
139 |         output = output.reshape(output.shape[0], -1)
140 |         output = self.projection(output)  # (batch_size, num_classes)
141 |         return output
142 | 
143 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
144 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
145 |             dec_out = self.forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
146 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
147 |         if self.task_name == 'imputation':
148 |             dec_out = self.imputation(
149 |                 x_enc, x_mark_enc, x_dec, x_mark_dec, mask)
150 |             return dec_out  # [B, L, D]
151 |         if self.task_name == 'anomaly_detection':
152 |             dec_out = self.anomaly_detection(x_enc)
153 |             return dec_out  # [B, L, D]
154 |         if self.task_name == 'classification':
155 |             dec_out = self.classification(x_enc, x_mark_enc)
156 |             return dec_out  # [B, N]
157 |         return None
158 | 
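
The `forecast` method above builds the decoder inputs by keeping the last `label_len` steps of each decomposed component and padding the horizon with the input mean (trend) or zeros (seasonal). A minimal single-variate sketch with plain lists; the `trend` values are placeholder numbers rather than the output of `series_decomp`, and the function name is hypothetical:

```python
def build_decoder_inputs(x, seasonal, trend, label_len, pred_len):
    """Return (seasonal_init, trend_init), each of length label_len + pred_len."""
    mean = sum(x) / len(x)
    # Trend placeholder: last label_len trend values, then the input mean.
    trend_init = trend[-label_len:] + [mean] * pred_len
    # Seasonal placeholder: last label_len seasonal values, then zeros.
    seasonal_init = seasonal[-label_len:] + [0.0] * pred_len
    return seasonal_init, trend_init


x = [1.0, 2.0, 3.0, 4.0]
trend = [1.5, 2.0, 3.0, 3.5]                      # placeholder trend component
seasonal = [a - b for a, b in zip(x, trend)]      # residual, so seasonal + trend = x

seasonal_init, trend_init = build_decoder_inputs(
    x, seasonal, trend, label_len=2, pred_len=3
)
assert len(seasonal_init) == len(trend_init) == 2 + 3
```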


--------------------------------------------------------------------------------
/models/Crossformer.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | from einops import rearrange, repeat
  5 | from layers.Crossformer_EncDec import scale_block, Encoder, Decoder, DecoderLayer
  6 | from layers.Embed import PatchEmbedding
  7 | from layers.SelfAttention_Family import AttentionLayer, FullAttention, TwoStageAttentionLayer
  8 | from models.PatchTST import FlattenHead
  9 | 
 10 | 
 11 | from math import ceil
 12 | 
 13 | 
 14 | class Model(nn.Module):
 15 |     """
 16 |     Paper link: https://openreview.net/pdf?id=vSVLM2j9eie
 17 |     """
 18 |     def __init__(self, configs):
 19 |         super(Model, self).__init__()
 20 |         self.enc_in = configs.enc_in
 21 |         self.seq_len = configs.seq_len
 22 |         self.pred_len = configs.pred_len
 23 |         self.seg_len = 12
 24 |         self.win_size = 2
 25 |         self.task_name = configs.task_name
 26 | 
 27 |         # Pad so that lengths not divisible by the segment length are handled
 28 |         self.pad_in_len = ceil(1.0 * configs.seq_len / self.seg_len) * self.seg_len
 29 |         self.pad_out_len = ceil(1.0 * configs.pred_len / self.seg_len) * self.seg_len
 30 |         self.in_seg_num = self.pad_in_len // self.seg_len
 31 |         self.out_seg_num = ceil(self.in_seg_num / (self.win_size ** (configs.e_layers - 1)))
 32 |         self.head_nf = configs.d_model * self.out_seg_num
 33 | 
 34 |         # Embedding
 35 |         self.enc_value_embedding = PatchEmbedding(configs.d_model, self.seg_len, self.seg_len, self.pad_in_len - configs.seq_len, 0)
 36 |         self.enc_pos_embedding = nn.Parameter(
 37 |             torch.randn(1, configs.enc_in, self.in_seg_num, configs.d_model))
 38 |         self.pre_norm = nn.LayerNorm(configs.d_model)
 39 | 
 40 |         # Encoder
 41 |         self.encoder = Encoder(
 42 |             [
 43 |                 scale_block(configs, 1 if l == 0 else self.win_size, configs.d_model, configs.n_heads, configs.d_ff,
 44 |                             1, configs.dropout,
 45 |                             self.in_seg_num if l == 0 else ceil(self.in_seg_num / self.win_size ** l), configs.factor
 46 |                             ) for l in range(configs.e_layers)
 47 |             ]
 48 |         )
 49 |         # Decoder
 50 |         self.dec_pos_embedding = nn.Parameter(
 51 |             torch.randn(1, configs.enc_in, (self.pad_out_len // self.seg_len), configs.d_model))
 52 | 
 53 |         self.decoder = Decoder(
 54 |             [
 55 |                 DecoderLayer(
 56 |                     TwoStageAttentionLayer(configs, (self.pad_out_len // self.seg_len), configs.factor, configs.d_model, configs.n_heads,
 57 |                                            configs.d_ff, configs.dropout),
 58 |                     AttentionLayer(
 59 |                         FullAttention(False, configs.factor, attention_dropout=configs.dropout,
 60 |                                       output_attention=False),
 61 |                         configs.d_model, configs.n_heads),
 62 |                     self.seg_len,
 63 |                     configs.d_model,
 64 |                     configs.d_ff,
 65 |                     dropout=configs.dropout,
 66 |                     # activation=configs.activation,
 67 |                 )
 68 |                 for l in range(configs.e_layers + 1)
 69 |             ],
 70 |         )
 71 |         if self.task_name == 'imputation' or self.task_name == 'anomaly_detection':
 72 |             self.head = FlattenHead(configs.enc_in, self.head_nf, configs.seq_len,
 73 |                                     head_dropout=configs.dropout)
 74 |         elif self.task_name == 'classification':
 75 |             self.flatten = nn.Flatten(start_dim=-2)
 76 |             self.dropout = nn.Dropout(configs.dropout)
 77 |             self.projection = nn.Linear(
 78 |                 self.head_nf * configs.enc_in, configs.num_class)
 79 | 
 80 | 
 81 | 
 82 |     def forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
 83 |         # embedding
 84 |         x_enc, n_vars = self.enc_value_embedding(x_enc.permute(0, 2, 1))
 85 |         x_enc = rearrange(x_enc, '(b d) seg_num d_model -> b d seg_num d_model', d = n_vars)
 86 |         x_enc += self.enc_pos_embedding
 87 |         x_enc = self.pre_norm(x_enc)
 88 |         enc_out, attns = self.encoder(x_enc)
 89 | 
 90 |         dec_in = repeat(self.dec_pos_embedding, 'b ts_d l d -> (repeat b) ts_d l d', repeat=x_enc.shape[0])
 91 |         dec_out = self.decoder(dec_in, enc_out)
 92 |         return dec_out
 93 | 
 94 |     def imputation(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask):
 95 |         # embedding
 96 |         x_enc, n_vars = self.enc_value_embedding(x_enc.permute(0, 2, 1))
 97 |         x_enc = rearrange(x_enc, '(b d) seg_num d_model -> b d seg_num d_model', d=n_vars)
 98 |         x_enc += self.enc_pos_embedding
 99 |         x_enc = self.pre_norm(x_enc)
100 |         enc_out, attns = self.encoder(x_enc)
101 | 
102 |         dec_out = self.head(enc_out[-1].permute(0, 1, 3, 2)).permute(0, 2, 1)
103 | 
104 |         return dec_out
105 | 
106 |     def anomaly_detection(self, x_enc):
107 |         # embedding
108 |         x_enc, n_vars = self.enc_value_embedding(x_enc.permute(0, 2, 1))
109 |         x_enc = rearrange(x_enc, '(b d) seg_num d_model -> b d seg_num d_model', d=n_vars)
110 |         x_enc += self.enc_pos_embedding
111 |         x_enc = self.pre_norm(x_enc)
112 |         enc_out, attns = self.encoder(x_enc)
113 | 
114 |         dec_out = self.head(enc_out[-1].permute(0, 1, 3, 2)).permute(0, 2, 1)
115 |         return dec_out
116 | 
117 |     def classification(self, x_enc, x_mark_enc):
118 |         # embedding
119 |         x_enc, n_vars = self.enc_value_embedding(x_enc.permute(0, 2, 1))
120 | 
121 |         x_enc = rearrange(x_enc, '(b d) seg_num d_model -> b d seg_num d_model', d=n_vars)
122 |         x_enc += self.enc_pos_embedding
123 |         x_enc = self.pre_norm(x_enc)
124 |         enc_out, attns = self.encoder(x_enc)
125 |         # Output: flatten the last encoder scale and project to class logits
126 |         output = self.flatten(enc_out[-1].permute(0, 1, 3, 2))
127 |         output = self.dropout(output)
128 |         output = output.reshape(output.shape[0], -1)
129 |         output = self.projection(output)
130 |         return output
131 | 
132 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
133 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
134 |             dec_out = self.forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
135 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
136 |         if self.task_name == 'imputation':
137 |             dec_out = self.imputation(x_enc, x_mark_enc, x_dec, x_mark_dec, mask)
138 |             return dec_out  # [B, L, D]
139 |         if self.task_name == 'anomaly_detection':
140 |             dec_out = self.anomaly_detection(x_enc)
141 |             return dec_out  # [B, L, D]
142 |         if self.task_name == 'classification':
143 |             dec_out = self.classification(x_enc, x_mark_enc)
144 |             return dec_out  # [B, N]
145 |         return None
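
The constructor above derives several segment counts from `seq_len`, `pred_len`, and `e_layers`. A small sketch that reproduces that arithmetic in isolation, using the hard-coded `seg_len=12` and `win_size=2` as defaults; the function name and the example sizes are assumptions chosen for illustration:

```python
from math import ceil


def segment_layout(seq_len, pred_len, e_layers, d_model, seg_len=12, win_size=2):
    """Mirror the padding and segment-count arithmetic of the constructor."""
    pad_in_len = ceil(seq_len / seg_len) * seg_len
    pad_out_len = ceil(pred_len / seg_len) * seg_len
    in_seg_num = pad_in_len // seg_len
    # Each encoder scale after the first merges win_size neighbouring segments.
    out_seg_num = ceil(in_seg_num / (win_size ** (e_layers - 1)))
    return {
        "pad_in_len": pad_in_len,
        "pad_out_len": pad_out_len,
        "padding": pad_in_len - seq_len,   # amount passed to PatchEmbedding
        "in_seg_num": in_seg_num,
        "out_seg_num": out_seg_num,
        "head_nf": d_model * out_seg_num,
    }


# A 100-step input is not divisible by 12, so it is padded up to 108 steps.
layout = segment_layout(seq_len=100, pred_len=30, e_layers=3, d_model=64)
print(layout)
```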


--------------------------------------------------------------------------------
/models/DLinear.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | from layers.Autoformer_EncDec import series_decomp
  5 | 
  6 | 
  7 | class Model(nn.Module):
  8 |     """
  9 |     Paper link: https://arxiv.org/pdf/2205.13504.pdf
 10 |     """
 11 | 
 12 |     def __init__(self, configs, individual=False):
 13 |         """
 14 |         individual: bool, whether to use a separate linear layer per variate instead of one shared across all variates.
 15 |         """
 16 |         super(Model, self).__init__()
 17 |         self.task_name = configs.task_name
 18 |         self.seq_len = configs.seq_len
 19 |         if self.task_name == 'classification' or self.task_name == 'anomaly_detection' or self.task_name == 'imputation':
 20 |             self.pred_len = configs.seq_len
 21 |         else:
 22 |             self.pred_len = configs.pred_len
 23 |         # Series decomposition block from Autoformer
 24 |         self.decompsition = series_decomp(configs.moving_avg)
 25 |         self.individual = individual
 26 |         self.channels = configs.enc_in
 27 | 
 28 |         if self.individual:
 29 |             self.Linear_Seasonal = nn.ModuleList()
 30 |             self.Linear_Trend = nn.ModuleList()
 31 | 
 32 |             for i in range(self.channels):
 33 |                 self.Linear_Seasonal.append(
 34 |                     nn.Linear(self.seq_len, self.pred_len))
 35 |                 self.Linear_Trend.append(
 36 |                     nn.Linear(self.seq_len, self.pred_len))
 37 | 
 38 |                 self.Linear_Seasonal[i].weight = nn.Parameter(
 39 |                     (1 / self.seq_len) * torch.ones([self.pred_len, self.seq_len]))
 40 |                 self.Linear_Trend[i].weight = nn.Parameter(
 41 |                     (1 / self.seq_len) * torch.ones([self.pred_len, self.seq_len]))
 42 |         else:
 43 |             self.Linear_Seasonal = nn.Linear(self.seq_len, self.pred_len)
 44 |             self.Linear_Trend = nn.Linear(self.seq_len, self.pred_len)
 45 | 
 46 |             self.Linear_Seasonal.weight = nn.Parameter(
 47 |                 (1 / self.seq_len) * torch.ones([self.pred_len, self.seq_len]))
 48 |             self.Linear_Trend.weight = nn.Parameter(
 49 |                 (1 / self.seq_len) * torch.ones([self.pred_len, self.seq_len]))
 50 | 
 51 |         if self.task_name == 'classification':
 52 |             self.projection = nn.Linear(
 53 |                 configs.enc_in * configs.seq_len, configs.num_class)
 54 | 
 55 |     def encoder(self, x):
 56 |         seasonal_init, trend_init = self.decompsition(x)
 57 |         seasonal_init, trend_init = seasonal_init.permute(
 58 |             0, 2, 1), trend_init.permute(0, 2, 1)
 59 |         if self.individual:
 60 |             seasonal_output = torch.zeros([seasonal_init.size(0), seasonal_init.size(1), self.pred_len],
 61 |                                           dtype=seasonal_init.dtype).to(seasonal_init.device)
 62 |             trend_output = torch.zeros([trend_init.size(0), trend_init.size(1), self.pred_len],
 63 |                                        dtype=trend_init.dtype).to(trend_init.device)
 64 |             for i in range(self.channels):
 65 |                 seasonal_output[:, i, :] = self.Linear_Seasonal[i](
 66 |                     seasonal_init[:, i, :])
 67 |                 trend_output[:, i, :] = self.Linear_Trend[i](
 68 |                     trend_init[:, i, :])
 69 |         else:
 70 |             seasonal_output = self.Linear_Seasonal(seasonal_init)
 71 |             trend_output = self.Linear_Trend(trend_init)
 72 |         x = seasonal_output + trend_output
 73 |         return x.permute(0, 2, 1)
 74 | 
 75 |     def forecast(self, x_enc):
 76 |         # Encoder
 77 |         return self.encoder(x_enc)
 78 | 
 79 |     def imputation(self, x_enc):
 80 |         # Encoder
 81 |         return self.encoder(x_enc)
 82 | 
 83 |     def anomaly_detection(self, x_enc):
 84 |         # Encoder
 85 |         return self.encoder(x_enc)
 86 | 
 87 |     def classification(self, x_enc):
 88 |         # Encoder
 89 |         enc_out = self.encoder(x_enc)
 90 |         # Output
 91 |         # (batch_size, seq_len * enc_in)
 92 |         output = enc_out.reshape(enc_out.shape[0], -1)
 93 |         # (batch_size, num_classes)
 94 |         output = self.projection(output)
 95 |         return output
 96 | 
 97 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
 98 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
 99 |             dec_out = self.forecast(x_enc)
100 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
101 |         if self.task_name == 'imputation':
102 |             dec_out = self.imputation(x_enc)
103 |             return dec_out  # [B, L, D]
104 |         if self.task_name == 'anomaly_detection':
105 |             dec_out = self.anomaly_detection(x_enc)
106 |             return dec_out  # [B, L, D]
107 |         if self.task_name == 'classification':
108 |             dec_out = self.classification(x_enc)
109 |             return dec_out  # [B, N]
110 |         return None
111 | 


--------------------------------------------------------------------------------
/models/ETSformer.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | from layers.Embed import DataEmbedding
  4 | from layers.ETSformer_EncDec import EncoderLayer, Encoder, DecoderLayer, Decoder, Transform
  5 | 
  6 | 
  7 | class Model(nn.Module):
  8 |     """
  9 |     Paper link: https://arxiv.org/abs/2202.01381
 10 |     """
 11 | 
 12 |     def __init__(self, configs):
 13 |         super(Model, self).__init__()
 14 |         self.task_name = configs.task_name
 15 |         self.seq_len = configs.seq_len
 16 |         self.label_len = configs.label_len
 17 |         if self.task_name == 'classification' or self.task_name == 'anomaly_detection' or self.task_name == 'imputation':
 18 |             self.pred_len = configs.seq_len
 19 |         else:
 20 |             self.pred_len = configs.pred_len
 21 | 
 22 |         assert configs.e_layers == configs.d_layers, "Encoder and decoder layers must be equal"
 23 | 
 24 |         # Embedding
 25 |         self.enc_embedding = DataEmbedding(configs.enc_in, configs.d_model, configs.embed, configs.freq,
 26 |                                            configs.dropout)
 27 | 
 28 |         # Encoder
 29 |         self.encoder = Encoder(
 30 |             [
 31 |                 EncoderLayer(
 32 |                     configs.d_model, configs.n_heads, configs.enc_in, configs.seq_len, self.pred_len, configs.top_k,
 33 |                     dim_feedforward=configs.d_ff,
 34 |                     dropout=configs.dropout,
 35 |                     activation=configs.activation,
 36 |                 ) for _ in range(configs.e_layers)
 37 |             ]
 38 |         )
 39 |         # Decoder
 40 |         self.decoder = Decoder(
 41 |             [
 42 |                 DecoderLayer(
 43 |                     configs.d_model, configs.n_heads, configs.c_out, self.pred_len,
 44 |                     dropout=configs.dropout,
 45 |                 ) for _ in range(configs.d_layers)
 46 |             ],
 47 |         )
 48 |         self.transform = Transform(sigma=0.2)
 49 | 
 50 |         if self.task_name == 'classification':
 51 |             self.act = torch.nn.functional.gelu
 52 |             self.dropout = nn.Dropout(configs.dropout)
 53 |             self.projection = nn.Linear(configs.d_model * configs.seq_len, configs.num_class)
 54 | 
 55 |     def forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
 56 |         with torch.no_grad():
 57 |             if self.training:
 58 |                 x_enc = self.transform.transform(x_enc)
 59 |         res = self.enc_embedding(x_enc, x_mark_enc)
 60 |         level, growths, seasons = self.encoder(res, x_enc, attn_mask=None)
 61 | 
 62 |         growth, season = self.decoder(growths, seasons)
 63 |         preds = level[:, -1:] + growth + season
 64 |         return preds
 65 | 
 66 |     def imputation(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask):
 67 |         res = self.enc_embedding(x_enc, x_mark_enc)
 68 |         level, growths, seasons = self.encoder(res, x_enc, attn_mask=None)
 69 |         growth, season = self.decoder(growths, seasons)
 70 |         preds = level[:, -1:] + growth + season
 71 |         return preds
 72 | 
 73 |     def anomaly_detection(self, x_enc):
 74 |         res = self.enc_embedding(x_enc, None)
 75 |         level, growths, seasons = self.encoder(res, x_enc, attn_mask=None)
 76 |         growth, season = self.decoder(growths, seasons)
 77 |         preds = level[:, -1:] + growth + season
 78 |         return preds
 79 | 
 80 |     def classification(self, x_enc, x_mark_enc):
 81 |         res = self.enc_embedding(x_enc, None)
 82 |         _, growths, seasons = self.encoder(res, x_enc, attn_mask=None)
 83 | 
 84 |         growths = torch.sum(torch.stack(growths, 0), 0)[:, :self.seq_len, :]
 85 |         seasons = torch.sum(torch.stack(seasons, 0), 0)[:, :self.seq_len, :]
 86 | 
 87 |         enc_out = growths + seasons
 88 |         output = self.act(enc_out)  # the output transformer encoder/decoder embeddings don't include non-linearity
 89 |         output = self.dropout(output)
 90 | 
 91 |         # Output
 92 |         output = output * x_mark_enc.unsqueeze(-1)  # zero-out padding embeddings
 93 |         output = output.reshape(output.shape[0], -1)  # (batch_size, seq_length * d_model)
 94 |         output = self.projection(output)  # (batch_size, num_classes)
 95 |         return output
 96 | 
 97 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
 98 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
 99 |             dec_out = self.forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
100 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
101 |         if self.task_name == 'imputation':
102 |             dec_out = self.imputation(x_enc, x_mark_enc, x_dec, x_mark_dec, mask)
103 |             return dec_out  # [B, L, D]
104 |         if self.task_name == 'anomaly_detection':
105 |             dec_out = self.anomaly_detection(x_enc)
106 |             return dec_out  # [B, L, D]
107 |         if self.task_name == 'classification':
108 |             dec_out = self.classification(x_enc, x_mark_enc)
109 |             return dec_out  # [B, N]
110 |         return None
111 | 


--------------------------------------------------------------------------------
/models/FreTS.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | import numpy as np
  5 | 
  6 | 
  7 | class Model(nn.Module):
  8 |     """
  9 |     Paper link: https://arxiv.org/pdf/2311.06184.pdf
 10 |     """
 11 | 
 12 |     def __init__(self, configs):
 13 |         super(Model, self).__init__()
 14 |         self.task_name = configs.task_name
 15 |         if self.task_name == 'classification' or self.task_name == 'anomaly_detection' or self.task_name == 'imputation':
 16 |             self.pred_len = configs.seq_len
 17 |         else:
 18 |             self.pred_len = configs.pred_len
 19 |         self.embed_size = 128  # embed_size
 20 |         self.hidden_size = 256  # hidden_size
 21 |         self.pred_len = configs.pred_len
 22 |         self.feature_size = configs.enc_in  # channels
 23 |         self.seq_len = configs.seq_len
 24 |         self.channel_independence = configs.channel_independence
 25 |         self.sparsity_threshold = 0.01
 26 |         self.scale = 0.02
 27 |         self.embeddings = nn.Parameter(torch.randn(1, self.embed_size))
 28 |         self.r1 = nn.Parameter(self.scale * torch.randn(self.embed_size, self.embed_size))
 29 |         self.i1 = nn.Parameter(self.scale * torch.randn(self.embed_size, self.embed_size))
 30 |         self.rb1 = nn.Parameter(self.scale * torch.randn(self.embed_size))
 31 |         self.ib1 = nn.Parameter(self.scale * torch.randn(self.embed_size))
 32 |         self.r2 = nn.Parameter(self.scale * torch.randn(self.embed_size, self.embed_size))
 33 |         self.i2 = nn.Parameter(self.scale * torch.randn(self.embed_size, self.embed_size))
 34 |         self.rb2 = nn.Parameter(self.scale * torch.randn(self.embed_size))
 35 |         self.ib2 = nn.Parameter(self.scale * torch.randn(self.embed_size))
 36 | 
 37 |         self.fc = nn.Sequential(
 38 |             nn.Linear(self.seq_len * self.embed_size, self.hidden_size),
 39 |             nn.LeakyReLU(),
 40 |             nn.Linear(self.hidden_size, self.pred_len)
 41 |         )
 42 | 
 43 |     # dimension extension
 44 |     def tokenEmb(self, x):
 45 |         # x: [Batch, Input length, Channel]
 46 |         x = x.permute(0, 2, 1)
 47 |         x = x.unsqueeze(3)
 48 |         # N*T*1 x 1*D = N*T*D
 49 |         y = self.embeddings
 50 |         return x * y
 51 | 
 52 |     # frequency temporal learner
 53 |     def MLP_temporal(self, x, B, N, L):
 54 |         # [B, N, T, D]
 55 |         x = torch.fft.rfft(x, dim=2, norm='ortho')  # FFT on L dimension
 56 |         y = self.FreMLP(B, N, L, x, self.r2, self.i2, self.rb2, self.ib2)
 57 |         x = torch.fft.irfft(y, n=self.seq_len, dim=2, norm="ortho")
 58 |         return x
 59 | 
 60 |     # frequency channel learner
 61 |     def MLP_channel(self, x, B, N, L):
 62 |         # [B, N, T, D]
 63 |         x = x.permute(0, 2, 1, 3)
 64 |         # [B, T, N, D]
 65 |         x = torch.fft.rfft(x, dim=2, norm='ortho')  # FFT on N dimension
 66 |         y = self.FreMLP(B, L, N, x, self.r1, self.i1, self.rb1, self.ib1)
 67 |         x = torch.fft.irfft(y, n=self.feature_size, dim=2, norm="ortho")
 68 |         x = x.permute(0, 2, 1, 3)
 69 |         # [B, N, T, D]
 70 |         return x
 71 | 
 72 |     # frequency-domain MLPs
 73 |     # dimension: FFT along the dimension, r: the real part of weights, i: the imaginary part of weights
 74 |     # rb: the real part of bias, ib: the imaginary part of bias
 75 |     def FreMLP(self, B, nd, dimension, x, r, i, rb, ib):
 76 |         o1_real = torch.zeros([B, nd, dimension // 2 + 1, self.embed_size],
 77 |                               device=x.device)
 78 |         o1_imag = torch.zeros([B, nd, dimension // 2 + 1, self.embed_size],
 79 |                               device=x.device)
 80 | 
 81 |         o1_real = F.relu(
 82 |             torch.einsum('bijd,dd->bijd', x.real, r) - \
 83 |             torch.einsum('bijd,dd->bijd', x.imag, i) + \
 84 |             rb
 85 |         )
 86 | 
 87 |         o1_imag = F.relu(
 88 |             torch.einsum('bijd,dd->bijd', x.imag, r) + \
 89 |             torch.einsum('bijd,dd->bijd', x.real, i) + \
 90 |             ib
 91 |         )
 92 | 
 93 |         y = torch.stack([o1_real, o1_imag], dim=-1)
 94 |         y = F.softshrink(y, lambd=self.sparsity_threshold)
 95 |         y = torch.view_as_complex(y)
 96 |         return y
 97 | 
 98 |     def forecast(self, x_enc):
 99 |         # x: [Batch, Input length, Channel]
100 |         B, T, N = x_enc.shape
101 |         # embedding x: [B, N, T, D]
102 |         x = self.tokenEmb(x_enc)
103 |         bias = x
104 |         # [B, N, T, D]
105 |         if str(self.channel_independence) == '0':  # matches both int 0 and str '0'
106 |             x = self.MLP_channel(x, B, N, T)
107 |         # [B, N, T, D]
108 |         x = self.MLP_temporal(x, B, N, T)
109 |         x = x + bias
110 |         x = self.fc(x.reshape(B, N, -1)).permute(0, 2, 1)
111 |         return x
112 | 
113 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
114 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
115 |             dec_out = self.forecast(x_enc)
116 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
117 |         else:
118 |             raise ValueError('Only forecast tasks are implemented so far')
119 | 
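
The `FreMLP` method above assembles a complex-valued affine map from real tensors: for an input a + bi and a weight r + si, the real part of the product is ar - bs and the imaginary part is as + br. The following is a minimal pure-Python sketch of that identity only. The helper name `complex_affine` is hypothetical and not part of the repository, and the ReLU and soft-shrinkage steps that `FreMLP` applies afterwards are left out.

```python
def complex_affine(x_real, x_imag, w_real, w_imag, b_real=0.0, b_imag=0.0):
    """Return (real, imag) of (x_real + j*x_imag) * (w_real + j*w_imag) + bias.

    Hypothetical helper for illustration; it mirrors the real/imaginary
    split used in FreMLP but works on plain floats instead of tensors.
    """
    out_real = x_real * w_real - x_imag * w_imag + b_real
    out_imag = x_imag * w_real + x_real * w_imag + b_imag
    return out_real, out_imag


# Cross-check against Python's built-in complex arithmetic.
x = complex(1.5, -2.0)
w = complex(0.25, 0.5)
b = complex(0.1, -0.3)
real, imag = complex_affine(x.real, x.imag, w.real, w.imag, b.real, b.imag)
expected = x * w + b
```

Keeping the real and imaginary parts as separate real-valued quantities is what allows the model to store `r`, `i`, `rb`, and `ib` as ordinary `nn.Parameter` tensors.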


--------------------------------------------------------------------------------
/models/Informer.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | from layers.Transformer_EncDec import Decoder, DecoderLayer, Encoder, EncoderLayer, ConvLayer
  5 | from layers.SelfAttention_Family import ProbAttention, AttentionLayer
  6 | from layers.Embed import DataEmbedding
  7 | 
  8 | 
  9 | class Model(nn.Module):
 10 |     """
 11 |     Informer with ProbSparse attention in O(L log L) complexity
 12 |     Paper link: https://ojs.aaai.org/index.php/AAAI/article/view/17325/17132
 13 |     """
 14 | 
 15 |     def __init__(self, configs):
 16 |         super(Model, self).__init__()
 17 |         self.task_name = configs.task_name
 18 |         self.pred_len = configs.pred_len
 19 |         self.label_len = configs.label_len
 20 | 
 21 |         # Embedding
 22 |         self.enc_embedding = DataEmbedding(configs.enc_in, configs.d_model, configs.embed, configs.freq,
 23 |                                            configs.dropout)
 24 |         self.dec_embedding = DataEmbedding(configs.dec_in, configs.d_model, configs.embed, configs.freq,
 25 |                                            configs.dropout)
 26 | 
 27 |         # Encoder
 28 |         self.encoder = Encoder(
 29 |             [
 30 |                 EncoderLayer(
 31 |                     AttentionLayer(
 32 |                         ProbAttention(False, configs.factor, attention_dropout=configs.dropout,
 33 |                                       output_attention=False),
 34 |                         configs.d_model, configs.n_heads),
 35 |                     configs.d_model,
 36 |                     configs.d_ff,
 37 |                     dropout=configs.dropout,
 38 |                     activation=configs.activation
 39 |                 ) for l in range(configs.e_layers)
 40 |             ],
 41 |             [
 42 |                 ConvLayer(
 43 |                     configs.d_model
 44 |                 ) for l in range(configs.e_layers - 1)
 45 |             ] if configs.distil and ('forecast' in configs.task_name) else None,
 46 |             norm_layer=torch.nn.LayerNorm(configs.d_model)
 47 |         )
 48 |         # Decoder
 49 |         self.decoder = Decoder(
 50 |             [
 51 |                 DecoderLayer(
 52 |                     AttentionLayer(
 53 |                         ProbAttention(True, configs.factor, attention_dropout=configs.dropout, output_attention=False),
 54 |                         configs.d_model, configs.n_heads),
 55 |                     AttentionLayer(
 56 |                         ProbAttention(False, configs.factor, attention_dropout=configs.dropout, output_attention=False),
 57 |                         configs.d_model, configs.n_heads),
 58 |                     configs.d_model,
 59 |                     configs.d_ff,
 60 |                     dropout=configs.dropout,
 61 |                     activation=configs.activation,
 62 |                 )
 63 |                 for l in range(configs.d_layers)
 64 |             ],
 65 |             norm_layer=torch.nn.LayerNorm(configs.d_model),
 66 |             projection=nn.Linear(configs.d_model, configs.c_out, bias=True)
 67 |         )
 68 |         if self.task_name == 'imputation':
 69 |             self.projection = nn.Linear(configs.d_model, configs.c_out, bias=True)
 70 |         if self.task_name == 'anomaly_detection':
 71 |             self.projection = nn.Linear(configs.d_model, configs.c_out, bias=True)
 72 |         if self.task_name == 'classification':
 73 |             self.act = F.gelu
 74 |             self.dropout = nn.Dropout(configs.dropout)
 75 |             self.projection = nn.Linear(configs.d_model * configs.seq_len, configs.num_class)
 76 | 
 77 |     def long_forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
 78 |         enc_out = self.enc_embedding(x_enc, x_mark_enc)
 79 |         dec_out = self.dec_embedding(x_dec, x_mark_dec)
 80 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
 81 | 
 82 |         dec_out = self.decoder(dec_out, enc_out, x_mask=None, cross_mask=None)
 83 | 
 84 |         return dec_out  # [B, L, D]
 85 |     
 86 |     def short_forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
 87 |         # Normalization
 88 |         mean_enc = x_enc.mean(1, keepdim=True).detach()  # B x 1 x E
 89 |         x_enc = x_enc - mean_enc
 90 |         std_enc = torch.sqrt(torch.var(x_enc, dim=1, keepdim=True, unbiased=False) + 1e-5).detach()  # B x 1 x E
 91 |         x_enc = x_enc / std_enc
 92 | 
 93 |         enc_out = self.enc_embedding(x_enc, x_mark_enc)
 94 |         dec_out = self.dec_embedding(x_dec, x_mark_dec)
 95 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
 96 | 
 97 |         dec_out = self.decoder(dec_out, enc_out, x_mask=None, cross_mask=None)
 98 | 
 99 |         dec_out = dec_out * std_enc + mean_enc
100 |         return dec_out  # [B, L, D]
101 | 
102 |     def imputation(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask):
103 |         # enc
104 |         enc_out = self.enc_embedding(x_enc, x_mark_enc)
105 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
106 |         # final
107 |         dec_out = self.projection(enc_out)
108 |         return dec_out
109 | 
110 |     def anomaly_detection(self, x_enc):
111 |         # enc
112 |         enc_out = self.enc_embedding(x_enc, None)
113 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
114 |         # final
115 |         dec_out = self.projection(enc_out)
116 |         return dec_out
117 | 
118 |     def classification(self, x_enc, x_mark_enc):
119 |         # enc
120 |         enc_out = self.enc_embedding(x_enc, None)
121 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
122 | 
123 |         # Output
124 |         output = self.act(enc_out)  # the output transformer encoder/decoder embeddings don't include non-linearity
125 |         output = self.dropout(output)
126 |         output = output * x_mark_enc.unsqueeze(-1)  # zero-out padding embeddings
127 |         output = output.reshape(output.shape[0], -1)  # (batch_size, seq_length * d_model)
128 |         output = self.projection(output)  # (batch_size, num_classes)
129 |         return output
130 | 
131 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
132 |         if self.task_name == 'long_term_forecast':
133 |             dec_out = self.long_forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
134 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
135 |         if self.task_name == 'short_term_forecast':
136 |             dec_out = self.short_forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
137 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
138 |         if self.task_name == 'imputation':
139 |             dec_out = self.imputation(x_enc, x_mark_enc, x_dec, x_mark_dec, mask)
140 |             return dec_out  # [B, L, D]
141 |         if self.task_name == 'anomaly_detection':
142 |             dec_out = self.anomaly_detection(x_enc)
143 |             return dec_out  # [B, L, D]
144 |         if self.task_name == 'classification':
145 |             dec_out = self.classification(x_enc, x_mark_enc)
146 |             return dec_out  # [B, N]
147 |         return None
148 | 
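
`short_forecast` above normalizes each input series with its own mean and standard deviation along the time axis, then restores those statistics on the decoder output. The round trip can be sketched for a single univariate series in plain Python. The helper names `normalize` and `denormalize` are hypothetical, and the `1e-5` stabilizer is copied from the code above.

```python
import math

EPS = 1e-5  # same stabilizer as in short_forecast above


def normalize(series):
    """Subtract the mean, then divide by sqrt(biased variance + EPS)."""
    mean = sum(series) / len(series)
    centered = [v - mean for v in series]
    var = sum(c * c for c in centered) / len(centered)  # unbiased=False
    std = math.sqrt(var + EPS)
    return [c / std for c in centered], mean, std


def denormalize(series, mean, std):
    """Undo normalize() using the stored statistics."""
    return [v * std + mean for v in series]


history = [10.0, 12.0, 11.0, 13.0]
normed, mean, std = normalize(history)
restored = denormalize(normed, mean, std)
```

In the model the statistics come from the encoder input but are applied to the decoder output, so the forecast is expressed on the scale of the observed window.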


--------------------------------------------------------------------------------
/models/LightTS.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | 
  5 | 
  6 | class IEBlock(nn.Module):
  7 |     def __init__(self, input_dim, hid_dim, output_dim, num_node):
  8 |         super(IEBlock, self).__init__()
  9 | 
 10 |         self.input_dim = input_dim
 11 |         self.hid_dim = hid_dim
 12 |         self.output_dim = output_dim
 13 |         self.num_node = num_node
 14 | 
 15 |         self._build()
 16 | 
 17 |     def _build(self):
 18 |         self.spatial_proj = nn.Sequential(
 19 |             nn.Linear(self.input_dim, self.hid_dim),
 20 |             nn.LeakyReLU(),
 21 |             nn.Linear(self.hid_dim, self.hid_dim // 4)
 22 |         )
 23 | 
 24 |         self.channel_proj = nn.Linear(self.num_node, self.num_node)
 25 |         torch.nn.init.eye_(self.channel_proj.weight)
 26 | 
 27 |         self.output_proj = nn.Linear(self.hid_dim // 4, self.output_dim)
 28 | 
 29 |     def forward(self, x):
 30 |         x = self.spatial_proj(x.permute(0, 2, 1))
 31 |         x = x.permute(0, 2, 1) + self.channel_proj(x.permute(0, 2, 1))
 32 |         x = self.output_proj(x.permute(0, 2, 1))
 33 | 
 34 |         x = x.permute(0, 2, 1)
 35 | 
 36 |         return x
 37 | 
 38 | 
 39 | class Model(nn.Module):
 40 |     """
 41 |     Paper link: https://arxiv.org/abs/2207.01186
 42 |     """
 43 | 
 44 |     def __init__(self, configs, chunk_size=24):
 45 |         """
 46 |         chunk_size: int, reshape T into [num_chunks, chunk_size]
 47 |         """
 48 |         super(Model, self).__init__()
 49 |         self.task_name = configs.task_name
 50 |         self.seq_len = configs.seq_len
 51 |         if self.task_name == 'classification' or self.task_name == 'anomaly_detection' or self.task_name == 'imputation':
 52 |             self.pred_len = configs.seq_len
 53 |         else:
 54 |             self.pred_len = configs.pred_len
 55 | 
 56 |         if configs.task_name == 'long_term_forecast' or configs.task_name == 'short_term_forecast':
 57 |             self.chunk_size = min(configs.pred_len, configs.seq_len, chunk_size)
 58 |         else:
 59 |             self.chunk_size = min(configs.seq_len, chunk_size)
 60 |         # assert (self.seq_len % self.chunk_size == 0)
 61 |         if self.seq_len % self.chunk_size != 0:
 62 |             self.seq_len += (self.chunk_size - self.seq_len % self.chunk_size)  # padding in order to ensure complete division
 63 |         self.num_chunks = self.seq_len // self.chunk_size
 64 | 
 65 |         self.d_model = configs.d_model
 66 |         self.enc_in = configs.enc_in
 67 |         self.dropout = configs.dropout
 68 |         if self.task_name == 'classification':
 69 |             self.act = F.gelu
 70 |             self.dropout = nn.Dropout(configs.dropout)
 71 |             self.projection = nn.Linear(configs.enc_in * configs.seq_len, configs.num_class)
 72 |         self._build()
 73 | 
 74 |     def _build(self):
 75 |         self.layer_1 = IEBlock(
 76 |             input_dim=self.chunk_size,
 77 |             hid_dim=self.d_model // 4,
 78 |             output_dim=self.d_model // 4,
 79 |             num_node=self.num_chunks
 80 |         )
 81 | 
 82 |         self.chunk_proj_1 = nn.Linear(self.num_chunks, 1)
 83 | 
 84 |         self.layer_2 = IEBlock(
 85 |             input_dim=self.chunk_size,
 86 |             hid_dim=self.d_model // 4,
 87 |             output_dim=self.d_model // 4,
 88 |             num_node=self.num_chunks
 89 |         )
 90 | 
 91 |         self.chunk_proj_2 = nn.Linear(self.num_chunks, 1)
 92 | 
 93 |         self.layer_3 = IEBlock(
 94 |             input_dim=self.d_model // 2,
 95 |             hid_dim=self.d_model // 2,
 96 |             output_dim=self.pred_len,
 97 |             num_node=self.enc_in
 98 |         )
 99 | 
100 |         self.ar = nn.Linear(self.seq_len, self.pred_len)
101 | 
102 |     def encoder(self, x):
103 |         B, T, N = x.size()
104 | 
105 |         highway = self.ar(x.permute(0, 2, 1))
106 |         highway = highway.permute(0, 2, 1)
107 | 
108 |         # continuous sampling
109 |         x1 = x.reshape(B, self.num_chunks, self.chunk_size, N)
110 |         x1 = x1.permute(0, 3, 2, 1)
111 |         x1 = x1.reshape(-1, self.chunk_size, self.num_chunks)
112 |         x1 = self.layer_1(x1)
113 |         x1 = self.chunk_proj_1(x1).squeeze(dim=-1)
114 | 
115 |         # interval sampling
116 |         x2 = x.reshape(B, self.chunk_size, self.num_chunks, N)
117 |         x2 = x2.permute(0, 3, 1, 2)
118 |         x2 = x2.reshape(-1, self.chunk_size, self.num_chunks)
119 |         x2 = self.layer_2(x2)
120 |         x2 = self.chunk_proj_2(x2).squeeze(dim=-1)
121 | 
122 |         x3 = torch.cat([x1, x2], dim=-1)
123 | 
124 |         x3 = x3.reshape(B, N, -1)
125 |         x3 = x3.permute(0, 2, 1)
126 | 
127 |         out = self.layer_3(x3)
128 | 
129 |         out = out + highway
130 |         return out
131 | 
132 |     def forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
133 |         return self.encoder(x_enc)
134 | 
135 |     def imputation(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask):
136 |         return self.encoder(x_enc)
137 | 
138 |     def anomaly_detection(self, x_enc):
139 |         return self.encoder(x_enc)
140 | 
141 |     def classification(self, x_enc, x_mark_enc):
142 |         # padding
143 |         x_enc = torch.cat([x_enc, torch.zeros((x_enc.shape[0], self.seq_len-x_enc.shape[1], x_enc.shape[2])).to(x_enc.device)], dim=1)
144 | 
145 |         enc_out = self.encoder(x_enc)
146 | 
147 |         # Output
148 |         output = enc_out.reshape(enc_out.shape[0], -1)  # (batch_size, seq_len * enc_in)
149 |         output = self.projection(output)  # (batch_size, num_classes)
150 |         return output
151 | 
152 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
153 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
154 |             dec_out = self.forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
155 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
156 |         if self.task_name == 'imputation':
157 |             dec_out = self.imputation(x_enc, x_mark_enc, x_dec, x_mark_dec, mask)
158 |             return dec_out  # [B, L, D]
159 |         if self.task_name == 'anomaly_detection':
160 |             dec_out = self.anomaly_detection(x_enc)
161 |             return dec_out  # [B, L, D]
162 |         if self.task_name == 'classification':
163 |             dec_out = self.classification(x_enc, x_mark_enc)
164 |             return dec_out  # [B, N]
165 |         return None
166 | 
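
The LightTS `encoder` above views the same sequence in two ways: continuous sampling groups consecutive time steps into a chunk, while interval sampling groups every `num_chunks`-th step. The sketch below reproduces the index pattern of the two `reshape` calls on a plain Python list, together with the padding rule from `__init__`. All three helper names are hypothetical and not part of the repository.

```python
def pad_to_multiple(seq_len, chunk_size):
    """Smallest length >= seq_len that is divisible by chunk_size."""
    remainder = seq_len % chunk_size
    return seq_len if remainder == 0 else seq_len + (chunk_size - remainder)


def continuous_chunks(series, chunk_size):
    """Chunk k holds chunk_size consecutive time steps."""
    return [series[i:i + chunk_size] for i in range(0, len(series), chunk_size)]


def interval_chunks(series, chunk_size):
    """Chunk k holds every num_chunks-th time step, starting at offset k."""
    num_chunks = len(series) // chunk_size
    return [series[k::num_chunks] for k in range(num_chunks)]


series = list(range(6))  # time indices 0..5
contiguous = continuous_chunks(series, chunk_size=3)  # [[0, 1, 2], [3, 4, 5]]
strided = interval_chunks(series, chunk_size=3)       # [[0, 2, 4], [1, 3, 5]]
```

Contiguous chunks expose short-term local patterns, while strided chunks expose longer-range structure at a coarser resolution; both views have the same shape, so they can be processed by identically configured blocks.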


--------------------------------------------------------------------------------
/models/Mamba.py:
--------------------------------------------------------------------------------
 1 | import math
 2 | 
 3 | import torch
 4 | import torch.nn as nn
 5 | import torch.nn.functional as F
 6 | 
 7 | from mamba_ssm import Mamba
 8 | 
 9 | from layers.Embed import DataEmbedding
10 | 
11 | class Model(nn.Module):
12 |     
13 |     def __init__(self, configs):
14 |         super(Model, self).__init__()
15 |         self.task_name = configs.task_name
16 |         self.pred_len = configs.pred_len
17 | 
18 |         self.d_inner = configs.d_model * configs.expand
19 |         self.dt_rank = math.ceil(configs.d_model / 16) # TODO implement "auto"
20 |         
21 |         self.embedding = DataEmbedding(configs.enc_in, configs.d_model, configs.embed, configs.freq, configs.dropout)
22 | 
23 |         self.mamba = Mamba(
24 |             d_model = configs.d_model,
25 |             d_state = configs.d_ff,
26 |             d_conv = configs.d_conv,
27 |             expand = configs.expand,
28 |         )
29 | 
30 |         self.out_layer = nn.Linear(configs.d_model, configs.c_out, bias=False)
31 | 
32 |     def forecast(self, x_enc, x_mark_enc):
33 |         mean_enc = x_enc.mean(1, keepdim=True).detach()
34 |         x_enc = x_enc - mean_enc
35 |         std_enc = torch.sqrt(torch.var(x_enc, dim=1, keepdim=True, unbiased=False) + 1e-5).detach()
36 |         x_enc = x_enc / std_enc
37 | 
38 |         x = self.embedding(x_enc, x_mark_enc)
39 |         x = self.mamba(x)
40 |         x_out = self.out_layer(x)
41 | 
42 |         x_out = x_out * std_enc + mean_enc
43 |         return x_out
44 | 
45 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
46 |         if self.task_name in ['short_term_forecast', 'long_term_forecast']:
47 |             x_out = self.forecast(x_enc, x_mark_enc)
48 |             return x_out[:, -self.pred_len:, :]
49 | 
50 |         # other tasks not implemented


--------------------------------------------------------------------------------
/models/MambaSimple.py:
--------------------------------------------------------------------------------
  1 | import math
  2 | 
  3 | import torch
  4 | import torch.nn as nn
  5 | import torch.nn.functional as F
  6 | from einops import rearrange, repeat, einsum
  7 | 
  8 | from layers.Embed import DataEmbedding
  9 | 
 10 | 
 11 | class Model(nn.Module):
 12 |     """
 13 |     Mamba, linear-time sequence modeling with selective state spaces O(L)
 14 |     Paper link: https://arxiv.org/abs/2312.00752
 15 |     Implementation reference: https://github.com/johnma2006/mamba-minimal/
 16 |     """
 17 | 
 18 |     def __init__(self, configs):
 19 |         super(Model, self).__init__()
 20 |         self.task_name = configs.task_name
 21 |         self.pred_len = configs.pred_len
 22 | 
 23 |         self.d_inner = configs.d_model * configs.expand
 24 |         self.dt_rank = math.ceil(configs.d_model / 16)
 25 | 
 26 |         self.embedding = DataEmbedding(configs.enc_in, configs.d_model, configs.embed, configs.freq, configs.dropout)
 27 | 
 28 |         self.layers = nn.ModuleList([ResidualBlock(configs, self.d_inner, self.dt_rank) for _ in range(configs.e_layers)])
 29 |         self.norm = RMSNorm(configs.d_model)
 30 | 
 31 |         self.out_layer = nn.Linear(configs.d_model, configs.c_out, bias=False)
 32 | 
 33 |     def forecast(self, x_enc, x_mark_enc):
 34 |         mean_enc = x_enc.mean(1, keepdim=True).detach()
 35 |         x_enc = x_enc - mean_enc
 36 |         std_enc = torch.sqrt(torch.var(x_enc, dim=1, keepdim=True, unbiased=False) + 1e-5).detach()
 37 |         x_enc = x_enc / std_enc
 38 | 
 39 |         x = self.embedding(x_enc, x_mark_enc)
 40 |         for layer in self.layers:
 41 |             x = layer(x)
 42 | 
 43 |         x = self.norm(x)
 44 |         x_out = self.out_layer(x)
 45 | 
 46 |         x_out = x_out * std_enc + mean_enc
 47 |         return x_out
 48 | 
 49 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
 50 |         if self.task_name in ['short_term_forecast', 'long_term_forecast']:
 51 |             x_out = self.forecast(x_enc, x_mark_enc)
 52 |             return x_out[:, -self.pred_len:, :]
 53 | 
 54 | 
 55 | class ResidualBlock(nn.Module):
 56 |     def __init__(self, configs, d_inner, dt_rank):
 57 |         super(ResidualBlock, self).__init__()
 58 |         
 59 |         self.mixer = MambaBlock(configs, d_inner, dt_rank)
 60 |         self.norm = RMSNorm(configs.d_model)
 61 | 
 62 |     def forward(self, x):
 63 |         output = self.mixer(self.norm(x)) + x
 64 |         return output
 65 | 
 66 | class MambaBlock(nn.Module):
 67 |     def __init__(self, configs, d_inner, dt_rank):
 68 |         super(MambaBlock, self).__init__()
 69 |         self.d_inner = d_inner
 70 |         self.dt_rank = dt_rank
 71 | 
 72 |         self.in_proj = nn.Linear(configs.d_model, self.d_inner * 2, bias=False)
 73 |         
 74 |         self.conv1d = nn.Conv1d(
 75 |             in_channels = self.d_inner,
 76 |             out_channels = self.d_inner,
 77 |             bias = True,
 78 |             kernel_size = configs.d_conv,
 79 |             padding = configs.d_conv - 1,
 80 |             groups = self.d_inner
 81 |         )
 82 | 
 83 |         # takes in x and outputs the input-specific delta, B, C
 84 |         self.x_proj = nn.Linear(self.d_inner, self.dt_rank + configs.d_ff * 2, bias=False)
 85 | 
 86 |         # projects delta
 87 |         self.dt_proj = nn.Linear(self.dt_rank, self.d_inner, bias=True)
 88 | 
 89 |         A = repeat(torch.arange(1, configs.d_ff + 1), "n -> d n", d=self.d_inner)
 90 |         self.A_log = nn.Parameter(torch.log(A))
 91 |         self.D = nn.Parameter(torch.ones(self.d_inner))
 92 | 
 93 |         self.out_proj = nn.Linear(self.d_inner, configs.d_model, bias=False)
 94 | 
 95 |     def forward(self, x):
 96 |         """
 97 |         Figure 3 in Section 3.4 in the paper
 98 |         """
 99 |         (b, l, d) = x.shape
100 | 
101 |         x_and_res = self.in_proj(x) # [B, L, 2 * d_inner]
102 |         (x, res) = x_and_res.split(split_size=[self.d_inner, self.d_inner], dim=-1)
103 | 
104 |         x = rearrange(x, "b l d -> b d l")
105 |         x = self.conv1d(x)[:, :, :l]
106 |         x = rearrange(x, "b d l -> b l d")
107 | 
108 |         x = F.silu(x)
109 | 
110 |         y = self.ssm(x)
111 |         y = y * F.silu(res)
112 | 
113 |         output = self.out_proj(y)
114 |         return output
115 | 
116 | 
117 |     def ssm(self, x):
118 |         """
119 |         Algorithm 2 in Section 3.2 in the paper
120 |         """
121 |         
122 |         (d_in, n) = self.A_log.shape
123 | 
124 |         A = -torch.exp(self.A_log.float()) # [d_in, n]
125 |         D = self.D.float() # [d_in]
126 | 
127 |         x_dbl = self.x_proj(x) # [B, L, d_rank + 2 * d_ff]
128 |         (delta, B, C) = x_dbl.split(split_size=[self.dt_rank, n, n], dim=-1) # delta: [B, L, d_rank]; B, C: [B, L, n]
129 |         delta = F.softplus(self.dt_proj(delta)) # [B, L, d_in]
130 |         y = self.selective_scan(x, delta, A, B, C, D)
131 | 
132 |         return y
133 | 
134 |     def selective_scan(self, u, delta, A, B, C, D):
135 |         (b, l, d_in) = u.shape
136 |         n = A.shape[1]
137 | 
138 |         deltaA = torch.exp(einsum(delta, A, "b l d, d n -> b l d n")) # A is discretized using zero-order hold (ZOH) discretization
139 |         deltaB_u = einsum(delta, B, u, "b l d, b l n, b l d -> b l d n") # B is discretized using a simplified Euler discretization instead of ZOH. From a discussion with authors: "A is the more important term and the performance doesn't change much with the simplification on B"
140 | 
141 |         # selective scan, sequential instead of parallel
142 |         x = torch.zeros((b, d_in, n), device=deltaA.device)
143 |         ys = []
144 |         for i in range(l):
145 |             x = deltaA[:, i] * x + deltaB_u[:, i]
146 |             y = einsum(x, C[:, i, :], "b d n, b n -> b d")
147 |             ys.append(y)
148 | 
149 |         y = torch.stack(ys, dim=1) # [B, L, d_in]
150 |         y = y + u * D
151 | 
152 |         return y
153 | 
154 | class RMSNorm(nn.Module):
155 |     def __init__(self, d_model, eps=1e-5):
156 |         super(RMSNorm, self).__init__()
157 |         self.eps = eps
158 |         self.weight = nn.Parameter(torch.ones(d_model))
159 | 
160 |     def forward(self, x):
161 |         output = x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps) * self.weight
162 |         return output
163 | 
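The sequential recurrence in `selective_scan` above may be easier to follow for a single channel. The following is a minimal sketch in plain Python (no PyTorch or einops), not the repository's implementation. The helper name `selective_scan_1d` and the list-based inputs are assumptions made for illustration. It mirrors the state update `x = deltaA * x + deltaB_u` and the readout `y = <x, C> + u * D`.

```python
import math


def selective_scan_1d(u, delta, A, B, C, D):
    """Sequential selective scan for one channel (illustrative only).

    u, delta: lists of length L (input and step size per time step)
    A:        list of length N (state decay rates, normally negative)
    B, C:     lists of length L, each a list of length N (input-dependent)
    D:        scalar skip-connection weight
    """
    n = len(A)
    state = [0.0] * n  # hidden state, one entry per state dimension
    ys = []
    for t in range(len(u)):
        for k in range(n):
            decay = math.exp(delta[t] * A[k])   # ZOH discretization of A
            drive = delta[t] * B[t][k] * u[t]   # simplified Euler discretization of B
            state[k] = decay * state[k] + drive
        readout = sum(state[k] * C[t][k] for k in range(n))
        ys.append(readout + D * u[t])           # skip connection: y + u * D
    return ys


# With A = 0 the decay is exp(0) = 1, so the state accumulates delta * B * u.
ys = selective_scan_1d(
    u=[1.0, 2.0, 3.0],
    delta=[1.0, 1.0, 1.0],
    A=[0.0],
    B=[[1.0], [1.0], [1.0]],
    C=[[1.0], [1.0], [1.0]],
    D=0.0,
)
print(ys)
```

With a negative entry in `A`, older inputs are discounted by `exp(delta * A)` at every step, which is how the input-dependent `delta` controls how much history the state keeps.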


--------------------------------------------------------------------------------
/models/Pyraformer.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | from layers.Pyraformer_EncDec import Encoder
  4 | 
  5 | 
  6 | class Model(nn.Module):
  7 |     """ 
  8 |     Pyraformer: Pyramidal attention to reduce complexity
  9 |     Paper link: https://openreview.net/pdf?id=0EXmFzUn5I
 10 |     """
 11 | 
 12 |     def __init__(self, configs, window_size=[4,4], inner_size=5):
 13 |         """
 14 |         window_size: list, the downsample window size in pyramidal attention.
 15 |         inner_size: int, the size of neighbour attention
 16 |         """
 17 |         super().__init__()
 18 |         self.task_name = configs.task_name
 19 |         self.pred_len = configs.pred_len
 20 |         self.d_model = configs.d_model
 21 | 
 22 |         if self.task_name == 'short_term_forecast':
 23 |             window_size = [2,2]
 24 |         self.encoder = Encoder(configs, window_size, inner_size)
 25 | 
 26 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
 27 |             self.projection = nn.Linear(
 28 |                 (len(window_size)+1)*self.d_model, self.pred_len * configs.enc_in)
 29 |         elif self.task_name == 'imputation' or self.task_name == 'anomaly_detection':
 30 |             self.projection = nn.Linear(
 31 |                 (len(window_size)+1)*self.d_model, configs.enc_in, bias=True)
 32 |         elif self.task_name == 'classification':
 33 |             self.act = torch.nn.functional.gelu
 34 |             self.dropout = nn.Dropout(configs.dropout)
 35 |             self.projection = nn.Linear(
 36 |                 (len(window_size)+1)*self.d_model * configs.seq_len, configs.num_class)
 37 | 
 38 |     def long_forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
 39 |         enc_out = self.encoder(x_enc, x_mark_enc)[:, -1, :]
 40 |         dec_out = self.projection(enc_out).view(
 41 |             enc_out.size(0), self.pred_len, -1)
 42 |         return dec_out
 43 |     
 44 |     def short_forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
 45 |         # Normalization
 46 |         mean_enc = x_enc.mean(1, keepdim=True).detach()  # B x 1 x E
 47 |         x_enc = x_enc - mean_enc
 48 |         std_enc = torch.sqrt(torch.var(x_enc, dim=1, keepdim=True, unbiased=False) + 1e-5).detach()  # B x 1 x E
 49 |         x_enc = x_enc / std_enc
 50 | 
 51 |         enc_out = self.encoder(x_enc, x_mark_enc)[:, -1, :]
 52 |         dec_out = self.projection(enc_out).view(
 53 |             enc_out.size(0), self.pred_len, -1)
 54 |         
 55 |         dec_out = dec_out * std_enc + mean_enc
 56 |         return dec_out
 57 | 
 58 |     def imputation(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask):
 59 |         enc_out = self.encoder(x_enc, x_mark_enc)
 60 |         dec_out = self.projection(enc_out)
 61 |         return dec_out
 62 | 
 63 |     def anomaly_detection(self, x_enc, x_mark_enc):
 64 |         enc_out = self.encoder(x_enc, x_mark_enc)
 65 |         dec_out = self.projection(enc_out)
 66 |         return dec_out
 67 | 
 68 |     def classification(self, x_enc, x_mark_enc):
 69 |         # enc
 70 |         enc_out = self.encoder(x_enc, x_mark_enc=None)
 71 | 
 72 |         # Output
 73 |         # the output transformer encoder/decoder embeddings don't include non-linearity
 74 |         output = self.act(enc_out)
 75 |         output = self.dropout(output)
 76 |         # zero-out padding embeddings
 77 |         output = output * x_mark_enc.unsqueeze(-1)
 78 |         # (batch_size, seq_length * d_model)
 79 |         output = output.reshape(output.shape[0], -1)
 80 |         output = self.projection(output)  # (batch_size, num_classes)
 81 | 
 82 |         return output
 83 | 
 84 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
 85 |         if self.task_name == 'long_term_forecast':
 86 |             dec_out = self.long_forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
 87 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
 88 |         if self.task_name == 'short_term_forecast':
 89 |             dec_out = self.short_forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
 90 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
 91 |         if self.task_name == 'imputation':
 92 |             dec_out = self.imputation(
 93 |                 x_enc, x_mark_enc, x_dec, x_mark_dec, mask)
 94 |             return dec_out  # [B, L, D]
 95 |         if self.task_name == 'anomaly_detection':
 96 |             dec_out = self.anomaly_detection(x_enc, x_mark_enc)
 97 |             return dec_out  # [B, L, D]
 98 |         if self.task_name == 'classification':
 99 |             dec_out = self.classification(x_enc, x_mark_enc)
100 |             return dec_out  # [B, N]
101 |         return None
102 | 
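`short_forecast` above normalizes each input series before encoding and rescales the prediction afterwards. The following is a minimal sketch of that round trip for a single series in plain Python, assuming the hypothetical helper names `normalize` and `denormalize`. It uses the population variance plus `1e-5`, matching `unbiased=False` in the code above.

```python
import math


def normalize(series, eps=1e-5):
    """Return (normalized, mean, std) using the population variance plus eps."""
    mean = sum(series) / len(series)
    centered = [v - mean for v in series]
    var = sum(c * c for c in centered) / len(series)  # unbiased=False
    std = math.sqrt(var + eps)
    return [c / std for c in centered], mean, std


def denormalize(values, mean, std):
    """Map values on the normalized scale back to the original scale."""
    return [v * std + mean for v in values]


history = [10.0, 12.0, 14.0, 16.0]
normed, mean, std = normalize(history)
restored = denormalize(normed, mean, std)
print(mean, restored)
```

In the model, `denormalize` is applied to the forecast rather than to the input, so the prediction inherits the level and scale of the window it was conditioned on.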


--------------------------------------------------------------------------------
/models/Reformer.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | from layers.Transformer_EncDec import Encoder, EncoderLayer
  5 | from layers.SelfAttention_Family import ReformerLayer
  6 | from layers.Embed import DataEmbedding
  7 | 
  8 | 
  9 | class Model(nn.Module):
 10 |     """
 11 |     Reformer with O(LlogL) complexity
 12 |     Paper link: https://openreview.net/forum?id=rkgNKkHtvB
 13 |     """
 14 | 
 15 |     def __init__(self, configs, bucket_size=4, n_hashes=4):
 16 |         """
 17 |         bucket_size: int, the size of each LSH attention bucket.
 18 |         n_hashes: int, the number of hashing rounds used by LSH attention.
 19 |         """
 20 |         super(Model, self).__init__()
 21 |         self.task_name = configs.task_name
 22 |         self.pred_len = configs.pred_len
 23 |         self.seq_len = configs.seq_len
 24 | 
 25 |         self.enc_embedding = DataEmbedding(configs.enc_in, configs.d_model, configs.embed, configs.freq,
 26 |                                            configs.dropout)
 27 |         # Encoder
 28 |         self.encoder = Encoder(
 29 |             [
 30 |                 EncoderLayer(
 31 |                     ReformerLayer(None, configs.d_model, configs.n_heads,
 32 |                                   bucket_size=bucket_size, n_hashes=n_hashes),
 33 |                     configs.d_model,
 34 |                     configs.d_ff,
 35 |                     dropout=configs.dropout,
 36 |                     activation=configs.activation
 37 |                 ) for l in range(configs.e_layers)
 38 |             ],
 39 |             norm_layer=torch.nn.LayerNorm(configs.d_model)
 40 |         )
 41 | 
 42 |         if self.task_name == 'classification':
 43 |             self.act = F.gelu
 44 |             self.dropout = nn.Dropout(configs.dropout)
 45 |             self.projection = nn.Linear(
 46 |                 configs.d_model * configs.seq_len, configs.num_class)
 47 |         else:
 48 |             self.projection = nn.Linear(
 49 |                 configs.d_model, configs.c_out, bias=True)
 50 | 
 51 |     def long_forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
 52 |         # add placeholder
 53 |         x_enc = torch.cat([x_enc, x_dec[:, -self.pred_len:, :]], dim=1)
 54 |         if x_mark_enc is not None:
 55 |             x_mark_enc = torch.cat(
 56 |                 [x_mark_enc, x_mark_dec[:, -self.pred_len:, :]], dim=1)
 57 | 
 58 |         enc_out = self.enc_embedding(x_enc, x_mark_enc)  # [B,T,C]
 59 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
 60 |         dec_out = self.projection(enc_out)
 61 | 
 62 |         return dec_out  # [B, L, D]
 63 |     
 64 |     def short_forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
 65 |         # Normalization
 66 |         mean_enc = x_enc.mean(1, keepdim=True).detach()  # B x 1 x E
 67 |         x_enc = x_enc - mean_enc
 68 |         std_enc = torch.sqrt(torch.var(x_enc, dim=1, keepdim=True, unbiased=False) + 1e-5).detach()  # B x 1 x E
 69 |         x_enc = x_enc / std_enc
 70 | 
 71 |         # add placeholder
 72 |         x_enc = torch.cat([x_enc, x_dec[:, -self.pred_len:, :]], dim=1)
 73 |         if x_mark_enc is not None:
 74 |             x_mark_enc = torch.cat(
 75 |                 [x_mark_enc, x_mark_dec[:, -self.pred_len:, :]], dim=1)
 76 | 
 77 |         enc_out = self.enc_embedding(x_enc, x_mark_enc)  # [B,T,C]
 78 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
 79 |         dec_out = self.projection(enc_out)
 80 | 
 81 |         dec_out = dec_out * std_enc + mean_enc
 82 |         return dec_out  # [B, L, D]
 83 | 
 84 |     def imputation(self, x_enc, x_mark_enc):
 85 |         enc_out = self.enc_embedding(x_enc, x_mark_enc)  # [B,T,C]
 86 | 
 87 |         enc_out, attns = self.encoder(enc_out)
 88 |         enc_out = self.projection(enc_out)
 89 | 
 90 |         return enc_out  # [B, L, D]
 91 | 
 92 |     def anomaly_detection(self, x_enc):
 93 |         enc_out = self.enc_embedding(x_enc, None)  # [B,T,C]
 94 | 
 95 |         enc_out, attns = self.encoder(enc_out)
 96 |         enc_out = self.projection(enc_out)
 97 | 
 98 |         return enc_out  # [B, L, D]
 99 | 
100 |     def classification(self, x_enc, x_mark_enc):
101 |         # enc
102 |         enc_out = self.enc_embedding(x_enc, None)
103 |         enc_out, attns = self.encoder(enc_out)
104 | 
105 |         # Output
106 |         # the output transformer encoder/decoder embeddings don't include non-linearity
107 |         output = self.act(enc_out)
108 |         output = self.dropout(output)
109 |         # zero-out padding embeddings
110 |         output = output * x_mark_enc.unsqueeze(-1)
111 |         # (batch_size, seq_length * d_model)
112 |         output = output.reshape(output.shape[0], -1)
113 |         output = self.projection(output)  # (batch_size, num_classes)
114 |         return output
115 | 
116 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
117 |         if self.task_name == 'long_term_forecast':
118 |             dec_out = self.long_forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
119 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
120 |         if self.task_name == 'short_term_forecast':
121 |             dec_out = self.short_forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
122 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
123 |         if self.task_name == 'imputation':
124 |             dec_out = self.imputation(x_enc, x_mark_enc)
125 |             return dec_out  # [B, L, D]
126 |         if self.task_name == 'anomaly_detection':
127 |             dec_out = self.anomaly_detection(x_enc)
128 |             return dec_out  # [B, L, D]
129 |         if self.task_name == 'classification':
130 |             dec_out = self.classification(x_enc, x_mark_enc)
131 |             return dec_out  # [B, N]
132 |         return None
133 | 
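The forecasting methods above use an encoder-only layout: the history is extended with `pred_len` placeholder steps taken from the decoder input, and only the last `pred_len` outputs are kept. The following is a minimal sketch of that bookkeeping with plain Python lists. The helper names are hypothetical, the zero placeholder is an assumption about how the decoder input is built, and the index shift only stands in for the embedding, encoder and projection.

```python
def append_placeholder(x_enc, x_dec, pred_len):
    """Concatenate the history with the last pred_len decoder steps along time."""
    return x_enc + x_dec[-pred_len:]


def take_forecast(outputs, pred_len):
    """Keep only the positions that correspond to the forecast horizon."""
    return outputs[-pred_len:]


seq_len, label_len, pred_len = 4, 2, 3
x_enc = [1.0, 2.0, 3.0, 4.0]            # observed history
x_dec = [3.0, 4.0] + [0.0] * pred_len   # known label part + zero placeholder (assumed)
extended = append_placeholder(x_enc, x_dec, pred_len)

# Stand-in for embedding + encoder + projection: any map that keeps the length.
outputs = [v + i for i, v in enumerate(extended)]
forecast = take_forecast(outputs, pred_len)
print(len(extended), forecast)
```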


--------------------------------------------------------------------------------
/models/SCINet.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | import math
  5 | 
  6 | class Splitting(nn.Module):
  7 |     def __init__(self):
  8 |         super(Splitting, self).__init__()
  9 | 
 10 |     def even(self, x):
 11 |         return x[:, ::2, :]
 12 | 
 13 |     def odd(self, x):
 14 |         return x[:, 1::2, :]
 15 | 
 16 |     def forward(self, x):
 17 |         # return the even and odd parts, in that order
 18 |         return self.even(x), self.odd(x)
 19 | 
 20 | 
 21 | class CausalConvBlock(nn.Module):
 22 |     def __init__(self, d_model, kernel_size=5, dropout=0.0):
 23 |         super(CausalConvBlock, self).__init__()
 24 |         module_list = [
 25 |             nn.ReplicationPad1d((kernel_size - 1, kernel_size - 1)),
 26 | 
 27 |             nn.Conv1d(d_model, d_model,
 28 |                       kernel_size=kernel_size),
 29 |             nn.LeakyReLU(negative_slope=0.01, inplace=True),
 30 | 
 31 |             nn.Dropout(dropout),
 32 |             nn.Conv1d(d_model, d_model,
 33 |                       kernel_size=kernel_size),
 34 |             nn.Tanh()
 35 |         ]
 36 |         self.causal_conv = nn.Sequential(*module_list)
 37 | 
 38 |     def forward(self, x):
 39 |         return self.causal_conv(x)  # return value is the same as input dimension
 40 | 
 41 | 
 42 | class SCIBlock(nn.Module):
 43 |     def __init__(self, d_model, kernel_size=5, dropout=0.0):
 44 |         super(SCIBlock, self).__init__()
 45 |         self.splitting = Splitting()
 46 |         self.modules_even, self.modules_odd, self.interactor_even, self.interactor_odd = [CausalConvBlock(d_model, kernel_size, dropout) for _ in range(4)]
 47 | 
 48 |     def forward(self, x):
 49 |         x_even, x_odd = self.splitting(x)
 50 |         x_even = x_even.permute(0, 2, 1)
 51 |         x_odd = x_odd.permute(0, 2, 1)
 52 | 
 53 |         x_even_temp = x_even.mul(torch.exp(self.modules_even(x_odd)))
 54 |         x_odd_temp = x_odd.mul(torch.exp(self.modules_odd(x_even)))
 55 | 
 56 |         x_even_update = x_even_temp + self.interactor_even(x_odd_temp)
 57 |         x_odd_update = x_odd_temp - self.interactor_odd(x_even_temp)
 58 | 
 59 |         return x_even_update.permute(0, 2, 1), x_odd_update.permute(0, 2, 1)
 60 | 
 61 | 
 62 | class SCINet(nn.Module):
 63 |     def __init__(self, d_model, current_level=3, kernel_size=5, dropout=0.0):
 64 |         super(SCINet, self).__init__()
 65 |         self.current_level = current_level
 66 |         self.working_block = SCIBlock(d_model, kernel_size, dropout)
 67 | 
 68 |         if current_level != 0:
 69 |             self.SCINet_Tree_odd = SCINet(d_model, current_level-1, kernel_size, dropout)
 70 |             self.SCINet_Tree_even = SCINet(d_model, current_level-1, kernel_size, dropout)
 71 | 
 72 |     def forward(self, x):
 73 |         odd_flag = False
 74 |         if x.shape[1] % 2 == 1:
 75 |             odd_flag = True
 76 |             x = torch.cat((x, x[:, -1:, :]), dim=1)
 77 |         x_even_update, x_odd_update = self.working_block(x)
 78 |         if odd_flag:
 79 |             x_odd_update = x_odd_update[:, :-1]
 80 | 
 81 |         if self.current_level == 0:
 82 |             return self.zip_up_the_pants(x_even_update, x_odd_update)
 83 |         else:
 84 |             return self.zip_up_the_pants(self.SCINet_Tree_even(x_even_update), self.SCINet_Tree_odd(x_odd_update))
 85 | 
 86 |     def zip_up_the_pants(self, even, odd):
 87 |         even = even.permute(1, 0, 2)
 88 |         odd = odd.permute(1, 0, 2)
 89 |         even_len = even.shape[0]
 90 |         odd_len = odd.shape[0]
 91 |         min_len = min(even_len, odd_len)
 92 | 
 93 |         zipped_data = []
 94 |         for i in range(min_len):
 95 |             zipped_data.append(even[i].unsqueeze(0))
 96 |             zipped_data.append(odd[i].unsqueeze(0))
 97 |         if even_len > odd_len:
 98 |             zipped_data.append(even[-1].unsqueeze(0))
 99 |         return torch.cat(zipped_data,0).permute(1, 0, 2)
100 | 
101 | 
102 | class Model(nn.Module):
103 |     def __init__(self, configs):
104 |         super(Model, self).__init__()
105 |         self.task_name = configs.task_name
106 |         self.seq_len = configs.seq_len
107 |         self.label_len = configs.label_len
108 |         self.pred_len = configs.pred_len
109 | 
110 |         # Set the number of SCINet stacks with the "d_layers" argument; it should be 1 or 2 (any value other than 1 builds two stacks).
111 |         self.num_stacks = configs.d_layers
112 |         if self.num_stacks == 1:
113 |             self.sci_net_1 = SCINet(configs.enc_in, dropout=configs.dropout)
114 |             self.projection_1 = nn.Conv1d(self.seq_len, self.seq_len + self.pred_len, kernel_size=1, stride=1, bias=False)
115 |         else:
116 |             self.sci_net_1, self.sci_net_2 = [SCINet(configs.enc_in, dropout=configs.dropout) for _ in range(2)]
117 |             self.projection_1 = nn.Conv1d(self.seq_len, self.pred_len, kernel_size=1, stride=1, bias=False)
118 |             self.projection_2 = nn.Conv1d(self.seq_len+self.pred_len, self.seq_len+self.pred_len,
119 |                                                 kernel_size = 1, bias = False)
120 | 
121 |         # For positional encoding
122 |         self.pe_hidden_size = configs.enc_in
123 |         if self.pe_hidden_size % 2 == 1:
124 |             self.pe_hidden_size += 1
125 | 
126 |         num_timescales = self.pe_hidden_size // 2
127 |         max_timescale = 10000.0
128 |         min_timescale = 1.0
129 | 
130 |         log_timescale_increment = (
131 |                 math.log(float(max_timescale) / float(min_timescale)) /
132 |                 max(num_timescales - 1, 1))
133 |         inv_timescales = min_timescale * torch.exp(
134 |             torch.arange(num_timescales, dtype=torch.float32) *
135 |             -log_timescale_increment)
136 |         self.register_buffer('inv_timescales', inv_timescales)
137 | 
138 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
139 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
140 |             dec_out = self.forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)  # [B,pred_len,C]
141 |             dec_out = torch.cat([torch.zeros_like(x_enc), dec_out], dim=1)
142 |             return dec_out  # [B, T, D]
143 |         return None
144 | 
145 |     def forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
146 |         # Normalization from Non-stationary Transformer
147 |         means = x_enc.mean(1, keepdim=True).detach()
148 |         x_enc = x_enc - means
149 |         stdev = torch.sqrt(torch.var(x_enc, dim=1, keepdim=True, unbiased=False) + 1e-5)
150 |         x_enc /= stdev
151 | 
152 |         # position-encoding
153 |         pe = self.get_position_encoding(x_enc)
154 |         if pe.shape[2] > x_enc.shape[2]:
155 |             x_enc += pe[:, :, :-1]
156 |         else:
157 |             x_enc += pe
158 | 
159 |         # SCINet
160 |         dec_out = self.sci_net_1(x_enc)
161 |         dec_out += x_enc
162 |         dec_out = self.projection_1(dec_out)
163 |         if self.num_stacks != 1:
164 |             dec_out = torch.cat((x_enc, dec_out), dim=1)
165 |             temp = dec_out
166 |             dec_out = self.sci_net_2(dec_out)
167 |             dec_out += temp
168 |             dec_out = self.projection_2(dec_out)
169 | 
170 |         # De-Normalization from Non-stationary Transformer
171 |         dec_out = dec_out * \
172 |                   (stdev[:, 0, :].unsqueeze(1).repeat(
173 |                       1, self.pred_len + self.seq_len, 1))
174 |         dec_out = dec_out + \
175 |                   (means[:, 0, :].unsqueeze(1).repeat(
176 |                       1, self.pred_len + self.seq_len, 1))
177 |         return dec_out
178 | 
179 |     def get_position_encoding(self, x):
180 |         max_length = x.size()[1]
181 |         position = torch.arange(max_length, dtype=torch.float32,
182 |                                 device=x.device)  # [T]
183 |         scaled_time = position.unsqueeze(1) * self.inv_timescales.unsqueeze(0)  # [T, num_timescales]
184 |         signal = torch.cat([torch.sin(scaled_time), torch.cos(scaled_time)], dim=1)  # [T, C]
185 |         signal = F.pad(signal, (0, 0, 0, self.pe_hidden_size % 2))
186 |         signal = signal.view(1, max_length, self.pe_hidden_size)
187 | 
188 |         return signal
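The `Splitting` module and `zip_up_the_pants` above are inverse operations along the time axis. The following is a minimal sketch on a plain Python list, with hypothetical helper names, showing the even/odd split and the interleaving that restores temporal order, including the case where the even part is one element longer.

```python
def split_even_odd(seq):
    """Split a sequence into its even- and odd-indexed elements."""
    return seq[::2], seq[1::2]


def zip_even_odd(even, odd):
    """Interleave the even and odd parts back into temporal order."""
    zipped = []
    for e, o in zip(even, odd):
        zipped.extend([e, o])
    if len(even) > len(odd):  # odd-length input leaves one extra even element
        zipped.append(even[-1])
    return zipped


seq = [0, 1, 2, 3, 4]
even, odd = split_even_odd(seq)
print(even, odd, zip_even_odd(even, odd))
```

In `SCINet.forward`, an odd-length input is first padded by repeating its last step so that both halves have equal length inside `SCIBlock`. The extra odd element is then dropped before zipping.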


--------------------------------------------------------------------------------
/models/SegRNN.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | from layers.Autoformer_EncDec import series_decomp
  5 | 
  6 | 
  7 | class Model(nn.Module):
  8 |     """
  9 |     Paper link: https://arxiv.org/abs/2308.11200.pdf
 10 |     """
 11 | 
 12 |     def __init__(self, configs):
 13 |         super(Model, self).__init__()
 14 | 
 15 |         # get parameters
 16 |         self.seq_len = configs.seq_len
 17 |         self.enc_in = configs.enc_in
 18 |         self.d_model = configs.d_model
 19 |         self.dropout = configs.dropout
 20 | 
 21 |         self.task_name = configs.task_name
 22 |         if self.task_name == 'classification' or self.task_name == 'anomaly_detection' or self.task_name == 'imputation':
 23 |             self.pred_len = configs.seq_len
 24 |         else:
 25 |             self.pred_len = configs.pred_len
 26 | 
 27 |         self.seg_len = configs.seg_len
 28 |         self.seg_num_x = self.seq_len // self.seg_len
 29 |         self.seg_num_y = self.pred_len // self.seg_len
 30 | 
 31 |         # building model
 32 |         self.valueEmbedding = nn.Sequential(
 33 |             nn.Linear(self.seg_len, self.d_model),
 34 |             nn.ReLU()
 35 |         )
 36 |         self.rnn = nn.GRU(input_size=self.d_model, hidden_size=self.d_model, num_layers=1, bias=True,
 37 |                               batch_first=True, bidirectional=False)
 38 |         self.pos_emb = nn.Parameter(torch.randn(self.seg_num_y, self.d_model // 2))
 39 |         self.channel_emb = nn.Parameter(torch.randn(self.enc_in, self.d_model // 2))
 40 | 
 41 |         self.predict = nn.Sequential(
 42 |             nn.Dropout(self.dropout),
 43 |             nn.Linear(self.d_model, self.seg_len)
 44 |         )
 45 | 
 46 |         if self.task_name == 'classification':
 47 |             self.act = F.gelu
 48 |             self.dropout = nn.Dropout(configs.dropout)
 49 |             self.projection = nn.Linear(
 50 |                 configs.enc_in * configs.seq_len, configs.num_class)
 51 | 
 52 |     def encoder(self, x):
 53 |         # b:batch_size c:channel_size s:seq_len
 54 |         # d:d_model w:seg_len n:seg_num_x m:seg_num_y
 55 |         batch_size = x.size(0)
 56 | 
 57 |         # normalization and permute     b,s,c -> b,c,s
 58 |         seq_last = x[:, -1:, :].detach()
 59 |         x = (x - seq_last).permute(0, 2, 1) # b,c,s
 60 | 
 61 |         # segment and embedding    b,c,s -> bc,n,w -> bc,n,d
 62 |         x = self.valueEmbedding(x.reshape(-1, self.seg_num_x, self.seg_len))
 63 | 
 64 |         # encoding
 65 |         _, hn = self.rnn(x) # bc,n,d  1,bc,d
 66 | 
 67 |         # m,d//2 -> 1,m,d//2 -> c,m,d//2
 68 |         # c,d//2 -> c,1,d//2 -> c,m,d//2
 69 |         # c,m,d -> cm,1,d -> bcm, 1, d
 70 |         pos_emb = torch.cat([
 71 |             self.pos_emb.unsqueeze(0).repeat(self.enc_in, 1, 1),
 72 |             self.channel_emb.unsqueeze(1).repeat(1, self.seg_num_y, 1)
 73 |         ], dim=-1).view(-1, 1, self.d_model).repeat(batch_size,1,1)
 74 | 
 75 |         _, hy = self.rnn(pos_emb, hn.repeat(1, 1, self.seg_num_y).view(1, -1, self.d_model)) # bcm,1,d  1,bcm,d
 76 | 
 77 |         # 1,bcm,d -> 1,bcm,w -> b,c,s
 78 |         y = self.predict(hy).view(-1, self.enc_in, self.pred_len)
 79 | 
 80 |         # permute and denorm
 81 |         y = y.permute(0, 2, 1) + seq_last
 82 |         return y
 83 | 
 84 |     def forecast(self, x_enc):
 85 |         # Encoder
 86 |         return self.encoder(x_enc)
 87 | 
 88 |     def imputation(self, x_enc):
 89 |         # Encoder
 90 |         return self.encoder(x_enc)
 91 | 
 92 |     def anomaly_detection(self, x_enc):
 93 |         # Encoder
 94 |         return self.encoder(x_enc)
 95 | 
 96 |     def classification(self, x_enc):
 97 |         # Encoder
 98 |         enc_out = self.encoder(x_enc)
 99 |         # Output
100 |         # (batch_size, seq_length * enc_in)
101 |         output = enc_out.reshape(enc_out.shape[0], -1)
102 |         # (batch_size, num_classes)
103 |         output = self.projection(output)
104 |         return output
105 | 
106 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
107 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
108 |             dec_out = self.forecast(x_enc)
109 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
110 |         if self.task_name == 'imputation':
111 |             dec_out = self.imputation(x_enc)
112 |             return dec_out  # [B, L, D]
113 |         if self.task_name == 'anomaly_detection':
114 |             dec_out = self.anomaly_detection(x_enc)
115 |             return dec_out  # [B, L, D]
116 |         if self.task_name == 'classification':
117 |             dec_out = self.classification(x_enc)
118 |             return dec_out  # [B, N]
119 |         return None
120 | 
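The `encoder` method above subtracts the last observed value, cuts each channel into `seg_num_x` segments of length `seg_len`, and adds the last value back to the prediction. The following is a minimal sketch of that pre- and post-processing for one channel in plain Python. The helper names `segment` and `restore` are hypothetical, and the sketch assumes the series length is a multiple of `seg_len`.

```python
def segment(series, seg_len):
    """Subtract the last observed value, then cut the series into segments."""
    last = series[-1]
    shifted = [v - last for v in series]
    seg_num = len(series) // seg_len
    segments = [shifted[i * seg_len:(i + 1) * seg_len] for i in range(seg_num)]
    return segments, last


def restore(pred_segments, last):
    """Flatten predicted segments and add the last observed value back."""
    return [v + last for seg in pred_segments for v in seg]


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
segments, last = segment(series, seg_len=2)
print(segments, restore(segments, last))
```

Each segment is what `valueEmbedding` maps to a `d_model` vector before the GRU. The number of GRU steps is therefore `seq_len // seg_len` rather than `seq_len`.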


--------------------------------------------------------------------------------
/models/TSMixer.py:
--------------------------------------------------------------------------------
 1 | import torch.nn as nn
 2 | 
 3 | 
 4 | class ResBlock(nn.Module):
 5 |     def __init__(self, configs):
 6 |         super(ResBlock, self).__init__()
 7 | 
 8 |         self.temporal = nn.Sequential(
 9 |             nn.Linear(configs.seq_len, configs.d_model),
10 |             nn.ReLU(),
11 |             nn.Linear(configs.d_model, configs.seq_len),
12 |             nn.Dropout(configs.dropout)
13 |         )
14 | 
15 |         self.channel = nn.Sequential(
16 |             nn.Linear(configs.enc_in, configs.d_model),
17 |             nn.ReLU(),
18 |             nn.Linear(configs.d_model, configs.enc_in),
19 |             nn.Dropout(configs.dropout)
20 |         )
21 | 
22 |     def forward(self, x):
23 |         # x: [B, L, D]
24 |         x = x + self.temporal(x.transpose(1, 2)).transpose(1, 2)
25 |         x = x + self.channel(x)
26 | 
27 |         return x
28 | 
29 | 
30 | class Model(nn.Module):
31 |     def __init__(self, configs):
32 |         super(Model, self).__init__()
33 |         self.task_name = configs.task_name
34 |         self.layer = configs.e_layers
35 |         self.model = nn.ModuleList([ResBlock(configs)
36 |                                     for _ in range(configs.e_layers)])
37 |         self.pred_len = configs.pred_len
38 |         self.projection = nn.Linear(configs.seq_len, configs.pred_len)
39 | 
40 |     def forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
41 | 
42 |         # x: [B, L, D]
43 |         for i in range(self.layer):
44 |             x_enc = self.model[i](x_enc)
45 |         enc_out = self.projection(x_enc.transpose(1, 2)).transpose(1, 2)
46 | 
47 |         return enc_out
48 | 
49 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
50 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
51 |             dec_out = self.forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
52 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
53 |         else:
54 |             raise ValueError('Only forecast tasks are implemented so far')
55 | 
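Both the temporal MLP and the final `projection` above act along the time axis, which is why the input is transposed before the linear layer and transposed back afterwards. The following is a minimal sketch of such a time-axis linear map on nested Python lists. The helper name `mix_time` and the hand-picked weights are assumptions for illustration, not the trained layer.

```python
def mix_time(x, weight, bias):
    """Apply a linear map along the time axis, independently per channel.

    x:      [L][D] nested list (time steps x channels)
    weight: [H][L] nested list
    bias:   [H] list
    returns [H][D] nested list
    """
    seq_len, channels = len(x), len(x[0])
    out = []
    for h in range(len(weight)):
        row = [
            sum(weight[h][t] * x[t][d] for t in range(seq_len)) + bias[h]
            for d in range(channels)
        ]
        out.append(row)
    return out


x = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]   # L=3 time steps, D=2 channels
weight = [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]   # H=2 outputs: "last value" and "sum"
y = mix_time(x, weight, bias=[0.0, 0.0])
print(y)
```

The same weights are shared by every channel, so this step mixes information across time only. The `channel` MLP in `ResBlock` is the part that mixes across channels.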


--------------------------------------------------------------------------------
/models/TiDE.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | 
  5 | 
  6 | class LayerNorm(nn.Module):
  7 |     """ LayerNorm but with an optional bias. PyTorch doesn't support simply bias=False """
  8 | 
  9 |     def __init__(self, ndim, bias):
 10 |         super().__init__()
 11 |         self.weight = nn.Parameter(torch.ones(ndim))
 12 |         self.bias = nn.Parameter(torch.zeros(ndim)) if bias else None
 13 | 
 14 |     def forward(self, input):
 15 |         return F.layer_norm(input, self.weight.shape, self.weight, self.bias, 1e-5)
 16 | 
 17 | 
 18 | 
 19 | class ResBlock(nn.Module):
 20 |     def __init__(self, input_dim, hidden_dim, output_dim, dropout=0.1, bias=True): 
 21 |         super().__init__()
 22 | 
 23 |         self.fc1 = nn.Linear(input_dim, hidden_dim, bias=bias) 
 24 |         self.fc2 = nn.Linear(hidden_dim, output_dim, bias=bias)
 25 |         self.fc3 = nn.Linear(input_dim, output_dim, bias=bias)
 26 |         self.dropout = nn.Dropout(dropout)
 27 |         self.relu = nn.ReLU()
 28 |         self.ln = LayerNorm(output_dim, bias=bias)
 29 |         
 30 |     def forward(self, x):
 31 | 
 32 |         out = self.fc1(x)
 33 |         out = self.relu(out)
 34 |         out = self.fc2(out)
 35 |         out = self.dropout(out)
 36 |         out = out + self.fc3(x)
 37 |         out = self.ln(out)
 38 |         return out
 39 | 
 40 | 
 41 | #TiDE
 42 | class Model(nn.Module):  
 43 |     """
 44 |     paper: https://arxiv.org/pdf/2304.08424.pdf 
 45 |     """
 46 |     def __init__(self, configs, bias=True, feature_encode_dim=2): 
 47 |         super(Model, self).__init__()
 48 |         self.configs = configs
 49 |         self.task_name = configs.task_name
 50 |         self.seq_len = configs.seq_len  #L 
 51 |         self.label_len = configs.label_len
 52 |         self.pred_len = configs.pred_len  #H 
 53 |         self.hidden_dim=configs.d_model
 54 |         self.res_hidden=configs.d_model 
 55 |         self.encoder_num=configs.e_layers
 56 |         self.decoder_num=configs.d_layers
 57 |         self.freq=configs.freq
 58 |         self.feature_encode_dim=feature_encode_dim
 59 |         self.decode_dim = configs.c_out
 60 |         self.temporalDecoderHidden=configs.d_ff
 61 |         dropout=configs.dropout
 62 | 
 63 |         
 64 |         freq_map = {'h': 4, 't': 5, 's': 6,
 65 |                     'm': 1, 'a': 1, 'w': 2, 'd': 3, 'b': 3}
 66 |         
 67 |         self.feature_dim=freq_map[self.freq]
 68 | 
 69 | 
 70 |         flatten_dim = self.seq_len + (self.seq_len + self.pred_len) * self.feature_encode_dim
 71 | 
 72 |         self.feature_encoder = ResBlock(self.feature_dim, self.res_hidden, self.feature_encode_dim, dropout, bias)
 73 |         self.encoders = nn.Sequential(ResBlock(flatten_dim, self.res_hidden, self.hidden_dim, dropout, bias), *[ResBlock(self.hidden_dim, self.res_hidden, self.hidden_dim, dropout, bias) for _ in range(self.encoder_num - 1)])  # build a distinct ResBlock per layer; `[block] * n` would repeat one shared instance
 74 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
 75 |             self.decoders = nn.Sequential(*[ResBlock(self.hidden_dim, self.res_hidden, self.hidden_dim, dropout, bias) for _ in range(self.decoder_num - 1)], ResBlock(self.hidden_dim, self.res_hidden, self.decode_dim * self.pred_len, dropout, bias))  # distinct ResBlock per layer, no shared weights
 76 |             self.temporalDecoder = ResBlock(self.decode_dim + self.feature_encode_dim, self.temporalDecoderHidden, 1, dropout, bias)
 77 |             self.residual_proj = nn.Linear(self.seq_len, self.pred_len, bias=bias)
 78 |         if self.task_name == 'imputation':
 79 |             self.decoders = nn.Sequential(*[ResBlock(self.hidden_dim, self.res_hidden, self.hidden_dim, dropout, bias) for _ in range(self.decoder_num - 1)], ResBlock(self.hidden_dim, self.res_hidden, self.decode_dim * self.seq_len, dropout, bias))  # distinct ResBlock per layer, no shared weights
 80 |             self.temporalDecoder = ResBlock(self.decode_dim + self.feature_encode_dim, self.temporalDecoderHidden, 1, dropout, bias)
 81 |             self.residual_proj = nn.Linear(self.seq_len, self.seq_len, bias=bias)
 82 |         if self.task_name == 'anomaly_detection':
 83 |             self.decoders = nn.Sequential(*[ResBlock(self.hidden_dim, self.res_hidden, self.hidden_dim, dropout, bias) for _ in range(self.decoder_num - 1)], ResBlock(self.hidden_dim, self.res_hidden, self.decode_dim * self.seq_len, dropout, bias))  # distinct ResBlock per layer, no shared weights
 84 |             self.temporalDecoder = ResBlock(self.decode_dim + self.feature_encode_dim, self.temporalDecoderHidden, 1, dropout, bias)
 85 |             self.residual_proj = nn.Linear(self.seq_len, self.seq_len, bias=bias)
 86 |             
 87 |         
 88 |     def forecast(self, x_enc, x_mark_enc, x_dec, batch_y_mark):
 89 |         # Normalization
 90 |         means = x_enc.mean(1, keepdim=True).detach()
 91 |         x_enc = x_enc - means
 92 |         stdev = torch.sqrt(torch.var(x_enc, dim=1, keepdim=True, unbiased=False) + 1e-5)
 93 |         x_enc /= stdev
 94 |         
 95 |         feature = self.feature_encoder(batch_y_mark)
 96 |         hidden = self.encoders(torch.cat([x_enc, feature.reshape(feature.shape[0], -1)], dim=-1))
 97 |         decoded = self.decoders(hidden).reshape(hidden.shape[0], self.pred_len, self.decode_dim)
 98 |         dec_out = self.temporalDecoder(torch.cat([feature[:,self.seq_len:], decoded], dim=-1)).squeeze(-1) + self.residual_proj(x_enc)
 99 |         
100 |         
101 |         # De-Normalization 
102 |         dec_out = dec_out * (stdev[:, 0].unsqueeze(1).repeat(1, self.pred_len))
103 |         dec_out = dec_out + (means[:, 0].unsqueeze(1).repeat(1, self.pred_len))
104 |         return dec_out
105 |     
106 |     def imputation(self, x_enc, x_mark_enc, x_dec, batch_y_mark, mask):
107 |         # Normalization
108 |         means = x_enc.mean(1, keepdim=True).detach()
109 |         x_enc = x_enc - means
110 |         stdev = torch.sqrt(torch.var(x_enc, dim=1, keepdim=True, unbiased=False) + 1e-5)
111 |         x_enc /= stdev
112 | 
113 |         feature = self.feature_encoder(x_mark_enc)
114 |         hidden = self.encoders(torch.cat([x_enc, feature.reshape(feature.shape[0], -1)], dim=-1))
115 |         decoded = self.decoders(hidden).reshape(hidden.shape[0], self.seq_len, self.decode_dim)
116 |         dec_out = self.temporalDecoder(torch.cat([feature[:,:self.seq_len], decoded], dim=-1)).squeeze(-1) + self.residual_proj(x_enc)
117 |     
118 |         # De-Normalization 
119 |         dec_out = dec_out * (stdev[:, 0].unsqueeze(1).repeat(1, self.seq_len))
120 |         dec_out = dec_out + (means[:, 0].unsqueeze(1).repeat(1, self.seq_len))
121 |         return dec_out
122 |     
123 |     
124 |     def forward(self, x_enc, x_mark_enc, x_dec, batch_y_mark, mask=None):
125 |         '''x_mark_enc is the exogenous dynamic feature described in the original paper'''
126 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
127 |             if batch_y_mark is None:
128 |                 batch_y_mark = torch.zeros((x_enc.shape[0], self.seq_len+self.pred_len, self.feature_dim)).to(x_enc.device).detach()
129 |             else:
130 |                 batch_y_mark = torch.concat([x_mark_enc, batch_y_mark[:, -self.pred_len:, :]],dim=1)
131 |             dec_out = torch.stack([self.forecast(x_enc[:, :, feature], x_mark_enc, x_dec, batch_y_mark) for feature in range(x_enc.shape[-1])],dim=-1)
132 |             return dec_out # [B, L, D]
133 |         if self.task_name == 'imputation':
134 |             dec_out = torch.stack([self.imputation(x_enc[:, :, feature], x_mark_enc, x_dec, batch_y_mark, mask) for feature in range(x_enc.shape[-1])],dim=-1)
135 |             return dec_out  # [B, L, D]
136 |         if self.task_name == 'anomaly_detection':
137 |             raise NotImplementedError("Task anomaly_detection is not yet supported for TiDE")
138 |         if self.task_name == 'classification':
139 |             raise NotImplementedError("Task classification is not yet supported for TiDE")
140 |         return None
141 |     
142 |     
143 | 
144 | 
145 | 
146 | 


--------------------------------------------------------------------------------
/models/Transformer.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | from layers.Transformer_EncDec import Decoder, DecoderLayer, Encoder, EncoderLayer, ConvLayer
  5 | from layers.SelfAttention_Family import FullAttention, AttentionLayer
  6 | from layers.Embed import DataEmbedding
  7 | import numpy as np
  8 | 
  9 | 
 10 | class Model(nn.Module):
 11 |     """
 12 |     Vanilla Transformer
 13 |     with O(L^2) complexity
 14 |     Paper link: https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
 15 |     """
 16 | 
 17 |     def __init__(self, configs):
 18 |         super(Model, self).__init__()
 19 |         self.task_name = configs.task_name
 20 |         self.pred_len = configs.pred_len
 21 |         # Embedding
 22 |         self.enc_embedding = DataEmbedding(configs.enc_in, configs.d_model, configs.embed, configs.freq,
 23 |                                            configs.dropout)
 24 |         # Encoder
 25 |         self.encoder = Encoder(
 26 |             [
 27 |                 EncoderLayer(
 28 |                     AttentionLayer(
 29 |                         FullAttention(False, configs.factor, attention_dropout=configs.dropout,
 30 |                                       output_attention=False), configs.d_model, configs.n_heads),
 31 |                     configs.d_model,
 32 |                     configs.d_ff,
 33 |                     dropout=configs.dropout,
 34 |                     activation=configs.activation
 35 |                 ) for l in range(configs.e_layers)
 36 |             ],
 37 |             norm_layer=torch.nn.LayerNorm(configs.d_model)
 38 |         )
 39 |         # Decoder
 40 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
 41 |             self.dec_embedding = DataEmbedding(configs.dec_in, configs.d_model, configs.embed, configs.freq,
 42 |                                                configs.dropout)
 43 |             self.decoder = Decoder(
 44 |                 [
 45 |                     DecoderLayer(
 46 |                         AttentionLayer(
 47 |                             FullAttention(True, configs.factor, attention_dropout=configs.dropout,
 48 |                                           output_attention=False),
 49 |                             configs.d_model, configs.n_heads),
 50 |                         AttentionLayer(
 51 |                             FullAttention(False, configs.factor, attention_dropout=configs.dropout,
 52 |                                           output_attention=False),
 53 |                             configs.d_model, configs.n_heads),
 54 |                         configs.d_model,
 55 |                         configs.d_ff,
 56 |                         dropout=configs.dropout,
 57 |                         activation=configs.activation,
 58 |                     )
 59 |                     for l in range(configs.d_layers)
 60 |                 ],
 61 |                 norm_layer=torch.nn.LayerNorm(configs.d_model),
 62 |                 projection=nn.Linear(configs.d_model, configs.c_out, bias=True)
 63 |             )
 64 |         if self.task_name == 'imputation':
 65 |             self.projection = nn.Linear(configs.d_model, configs.c_out, bias=True)
 66 |         if self.task_name == 'anomaly_detection':
 67 |             self.projection = nn.Linear(configs.d_model, configs.c_out, bias=True)
 68 |         if self.task_name == 'classification':
 69 |             self.act = F.gelu
 70 |             self.dropout = nn.Dropout(configs.dropout)
 71 |             self.projection = nn.Linear(configs.d_model * configs.seq_len, configs.num_class)
 72 | 
 73 |     def forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
 74 |         # Embedding
 75 |         enc_out = self.enc_embedding(x_enc, x_mark_enc)
 76 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
 77 | 
 78 |         dec_out = self.dec_embedding(x_dec, x_mark_dec)
 79 |         dec_out = self.decoder(dec_out, enc_out, x_mask=None, cross_mask=None)
 80 |         return dec_out
 81 | 
 82 |     def imputation(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask):
 83 |         # Embedding
 84 |         enc_out = self.enc_embedding(x_enc, x_mark_enc)
 85 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
 86 | 
 87 |         dec_out = self.projection(enc_out)
 88 |         return dec_out
 89 | 
 90 |     def anomaly_detection(self, x_enc):
 91 |         # Embedding
 92 |         enc_out = self.enc_embedding(x_enc, None)
 93 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
 94 | 
 95 |         dec_out = self.projection(enc_out)
 96 |         return dec_out
 97 | 
 98 |     def classification(self, x_enc, x_mark_enc):
 99 |         # Embedding
100 |         enc_out = self.enc_embedding(x_enc, None)
101 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
102 | 
103 |         # Output
104 |         output = self.act(enc_out)  # the output transformer encoder/decoder embeddings don't include non-linearity
105 |         output = self.dropout(output)
106 |         output = output * x_mark_enc.unsqueeze(-1)  # zero-out padding embeddings
107 |         output = output.reshape(output.shape[0], -1)  # (batch_size, seq_length * d_model)
108 |         output = self.projection(output)  # (batch_size, num_classes)
109 |         return output
110 | 
111 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
112 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
113 |             dec_out = self.forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
114 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
115 |         if self.task_name == 'imputation':
116 |             dec_out = self.imputation(x_enc, x_mark_enc, x_dec, x_mark_dec, mask)
117 |             return dec_out  # [B, L, D]
118 |         if self.task_name == 'anomaly_detection':
119 |             dec_out = self.anomaly_detection(x_enc)
120 |             return dec_out  # [B, L, D]
121 |         if self.task_name == 'classification':
122 |             dec_out = self.classification(x_enc, x_mark_enc)
123 |             return dec_out  # [B, N]
124 |         return None
125 | 


--------------------------------------------------------------------------------
/models/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thuml/TimeXer/76011909357972bd55a27adba2e1be994d81b327/models/__init__.py


--------------------------------------------------------------------------------
/models/iTransformer.py:
--------------------------------------------------------------------------------
  1 | import torch
  2 | import torch.nn as nn
  3 | import torch.nn.functional as F
  4 | from layers.Transformer_EncDec import Encoder, EncoderLayer
  5 | from layers.SelfAttention_Family import FullAttention, AttentionLayer
  6 | from layers.Embed import DataEmbedding_inverted
  7 | import numpy as np
  8 | 
  9 | 
 10 | class Model(nn.Module):
 11 |     """
 12 |     Paper link: https://arxiv.org/abs/2310.06625
 13 |     """
 14 | 
 15 |     def __init__(self, configs):
 16 |         super(Model, self).__init__()
 17 |         self.task_name = configs.task_name
 18 |         self.seq_len = configs.seq_len
 19 |         self.pred_len = configs.pred_len
 20 |         # Embedding
 21 |         self.enc_embedding = DataEmbedding_inverted(configs.seq_len, configs.d_model, configs.embed, configs.freq,
 22 |                                                     configs.dropout)
 23 |         # Encoder
 24 |         self.encoder = Encoder(
 25 |             [
 26 |                 EncoderLayer(
 27 |                     AttentionLayer(
 28 |                         FullAttention(False, configs.factor, attention_dropout=configs.dropout,
 29 |                                       output_attention=False), configs.d_model, configs.n_heads),
 30 |                     configs.d_model,
 31 |                     configs.d_ff,
 32 |                     dropout=configs.dropout,
 33 |                     activation=configs.activation
 34 |                 ) for l in range(configs.e_layers)
 35 |             ],
 36 |             norm_layer=torch.nn.LayerNorm(configs.d_model)
 37 |         )
 38 |         # Decoder
 39 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
 40 |             self.projection = nn.Linear(configs.d_model, configs.pred_len, bias=True)
 41 |         if self.task_name == 'imputation':
 42 |             self.projection = nn.Linear(configs.d_model, configs.seq_len, bias=True)
 43 |         if self.task_name == 'anomaly_detection':
 44 |             self.projection = nn.Linear(configs.d_model, configs.seq_len, bias=True)
 45 |         if self.task_name == 'classification':
 46 |             self.act = F.gelu
 47 |             self.dropout = nn.Dropout(configs.dropout)
 48 |             self.projection = nn.Linear(configs.d_model * configs.enc_in, configs.num_class)
 49 | 
 50 |     def forecast(self, x_enc, x_mark_enc, x_dec, x_mark_dec):
 51 |         # Normalization from Non-stationary Transformer
 52 |         means = x_enc.mean(1, keepdim=True).detach()
 53 |         x_enc = x_enc - means
 54 |         stdev = torch.sqrt(torch.var(x_enc, dim=1, keepdim=True, unbiased=False) + 1e-5)
 55 |         x_enc /= stdev
 56 | 
 57 |         _, _, N = x_enc.shape
 58 | 
 59 |         # Embedding
 60 |         enc_out = self.enc_embedding(x_enc, x_mark_enc)
 61 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
 62 | 
 63 |         dec_out = self.projection(enc_out).permute(0, 2, 1)[:, :, :N]
 64 |         # De-Normalization from Non-stationary Transformer
 65 |         dec_out = dec_out * (stdev[:, 0, :].unsqueeze(1).repeat(1, self.pred_len, 1))
 66 |         dec_out = dec_out + (means[:, 0, :].unsqueeze(1).repeat(1, self.pred_len, 1))
 67 |         return dec_out
 68 | 
 69 |     def imputation(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask):
 70 |         # Normalization from Non-stationary Transformer
 71 |         means = x_enc.mean(1, keepdim=True).detach()
 72 |         x_enc = x_enc - means
 73 |         stdev = torch.sqrt(torch.var(x_enc, dim=1, keepdim=True, unbiased=False) + 1e-5)
 74 |         x_enc /= stdev
 75 | 
 76 |         _, L, N = x_enc.shape
 77 | 
 78 |         # Embedding
 79 |         enc_out = self.enc_embedding(x_enc, x_mark_enc)
 80 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
 81 | 
 82 |         dec_out = self.projection(enc_out).permute(0, 2, 1)[:, :, :N]
 83 |         # De-Normalization from Non-stationary Transformer
 84 |         dec_out = dec_out * (stdev[:, 0, :].unsqueeze(1).repeat(1, L, 1))
 85 |         dec_out = dec_out + (means[:, 0, :].unsqueeze(1).repeat(1, L, 1))
 86 |         return dec_out
 87 | 
 88 |     def anomaly_detection(self, x_enc):
 89 |         # Normalization from Non-stationary Transformer
 90 |         means = x_enc.mean(1, keepdim=True).detach()
 91 |         x_enc = x_enc - means
 92 |         stdev = torch.sqrt(torch.var(x_enc, dim=1, keepdim=True, unbiased=False) + 1e-5)
 93 |         x_enc /= stdev
 94 | 
 95 |         _, L, N = x_enc.shape
 96 | 
 97 |         # Embedding
 98 |         enc_out = self.enc_embedding(x_enc, None)
 99 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
100 | 
101 |         dec_out = self.projection(enc_out).permute(0, 2, 1)[:, :, :N]
102 |         # De-Normalization from Non-stationary Transformer
103 |         dec_out = dec_out * (stdev[:, 0, :].unsqueeze(1).repeat(1, L, 1))
104 |         dec_out = dec_out + (means[:, 0, :].unsqueeze(1).repeat(1, L, 1))
105 |         return dec_out
106 | 
107 |     def classification(self, x_enc, x_mark_enc):
108 |         # Embedding
109 |         enc_out = self.enc_embedding(x_enc, None)
110 |         enc_out, attns = self.encoder(enc_out, attn_mask=None)
111 | 
112 |         # Output
113 |         output = self.act(enc_out)  # the output transformer encoder/decoder embeddings don't include non-linearity
114 |         output = self.dropout(output)
115 |         output = output.reshape(output.shape[0], -1)  # (batch_size, c_in * d_model)
116 |         output = self.projection(output)  # (batch_size, num_classes)
117 |         return output
118 | 
119 |     def forward(self, x_enc, x_mark_enc, x_dec, x_mark_dec, mask=None):
120 |         if self.task_name == 'long_term_forecast' or self.task_name == 'short_term_forecast':
121 |             dec_out = self.forecast(x_enc, x_mark_enc, x_dec, x_mark_dec)
122 |             return dec_out[:, -self.pred_len:, :]  # [B, L, D]
123 |         if self.task_name == 'imputation':
124 |             dec_out = self.imputation(x_enc, x_mark_enc, x_dec, x_mark_dec, mask)
125 |             return dec_out  # [B, L, D]
126 |         if self.task_name == 'anomaly_detection':
127 |             dec_out = self.anomaly_detection(x_enc)
128 |             return dec_out  # [B, L, D]
129 |         if self.task_name == 'classification':
130 |             dec_out = self.classification(x_enc, x_mark_enc)
131 |             return dec_out  # [B, N]
132 |         return None
133 | 


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
 1 | einops==0.4.0
 2 | matplotlib==3.7.0
 3 | numpy==1.23.5
 4 | pandas==1.5.3
 5 | patool==1.12
 6 | reformer-pytorch==1.4.4
 7 | scikit-learn==1.2.2
 8 | scipy==1.10.1
 9 | sktime==0.16.1
10 | sympy==1.11.1
11 | torch==2.0.0
12 | tqdm==4.64.1
13 | 


--------------------------------------------------------------------------------
/scripts/forecast_exogenous/ECL/TimeXer.sh:
--------------------------------------------------------------------------------
 1 | export CUDA_VISIBLE_DEVICES=1
 2 | 
 3 | model_name=TimeXer
 4 | des='Timexer-MS'
 5 | 
 6 | 
 7 | python3 -u run.py \
 8 |   --task_name long_term_forecast \
 9 |   --is_training 1 \
10 |   --root_path ./dataset/electricity/ \
11 |   --data_path electricity.csv \
12 |   --model_id ECL_96_96 \
13 |   --model $model_name \
14 |   --data custom \
15 |   --features MS \
16 |   --seq_len 96 \
17 |   --label_len 48 \
18 |   --pred_len 96 \
19 |   --e_layers 1 \
20 |   --factor 3 \
21 |   --enc_in 321 \
22 |   --dec_in 321 \
23 |   --c_out 321 \
24 |   --des $des \
25 |   --batch_size 4 \
26 |   --itr 1
27 | 
28 | python3 -u run.py \
29 |   --task_name long_term_forecast \
30 |   --is_training 1 \
31 |   --root_path ./dataset/electricity/ \
32 |   --data_path electricity.csv \
33 |   --model_id ECL_96_192 \
34 |   --model $model_name \
35 |   --data custom \
36 |   --features MS \
37 |   --seq_len 96 \
38 |   --label_len 48 \
39 |   --pred_len 192 \
40 |   --e_layers 1 \
41 |   --factor 3 \
42 |   --enc_in 321 \
43 |   --dec_in 321 \
44 |   --c_out 321 \
45 |   --des $des \
46 |   --batch_size 32 \
47 |   --itr 1
48 | 
49 | python3 -u run.py \
50 |   --task_name long_term_forecast \
51 |   --is_training 1 \
52 |   --root_path ./dataset/electricity/ \
53 |   --data_path electricity.csv \
54 |   --model_id ECL_96_336 \
55 |   --model $model_name \
56 |   --data custom \
57 |   --features MS \
58 |   --seq_len 96 \
59 |   --label_len 48 \
60 |   --pred_len 336 \
61 |   --e_layers 1 \
62 |   --factor 3 \
63 |   --enc_in 321 \
64 |   --dec_in 321 \
65 |   --c_out 321 \
66 |   --des $des \
67 |   --batch_size 32 \
68 |   --itr 1
69 | 
70 | python3 -u run.py \
71 |   --task_name long_term_forecast \
72 |   --is_training 1 \
73 |   --root_path ./dataset/electricity/ \
74 |   --data_path electricity.csv \
75 |   --model_id ECL_96_720 \
76 |   --model $model_name \
77 |   --data custom \
78 |   --features MS \
79 |   --seq_len 96 \
80 |   --label_len 48 \
81 |   --pred_len 720 \
82 |   --e_layers 3 \
83 |   --factor 3 \
84 |   --enc_in 321 \
85 |   --dec_in 321 \
86 |   --c_out 321 \
87 |   --des $des \
88 |   --d_model 512 \
89 |   --itr 1
90 | 


--------------------------------------------------------------------------------
/scripts/forecast_exogenous/EPF/TimeXer.sh:
--------------------------------------------------------------------------------
  1 | export CUDA_VISIBLE_DEVICES=0
  2 | 
  3 | model_name=TimeXer
  4 | des='Timexer-MS'
  5 | patch_len=24
  6 | 
  7 | 
  8 | python3 -u run.py \
  9 |   --is_training 1 \
 10 |   --task_name long_term_forecast \
 11 |   --root_path ./dataset/EPF/ \
 12 |   --data_path NP.csv \
 13 |   --model_id NP_168_24 \
 14 |   --model $model_name \
 15 |   --data custom \
 16 |   --features MS \
 17 |   --seq_len 168 \
 18 |   --pred_len 24 \
 19 |   --e_layers 3 \
 20 |   --enc_in 3 \
 21 |   --dec_in 3 \
 22 |   --c_out 1 \
 23 |   --des $des \
 24 |   --patch_len $patch_len \
 25 |   --d_model 512 \
 26 |   --d_ff 512 \
 27 |   --batch_size 4 \
 28 |   --itr 1
 29 | 
 30 | python3 -u run.py \
 31 |   --is_training 1 \
 32 |   --task_name long_term_forecast \
 33 |   --root_path ./dataset/EPF/ \
 34 |   --data_path PJM.csv \
 35 |   --model_id PJM_168_24 \
 36 |   --model $model_name \
 37 |   --data custom \
 38 |   --features MS \
 39 |   --seq_len 168 \
 40 |   --pred_len 24 \
 41 |   --e_layers 3 \
 42 |   --enc_in 3 \
 43 |   --dec_in 3 \
 44 |   --c_out 1 \
 45 |   --des $des \
 46 |   --patch_len $patch_len \
 47 |   --d_model 512 \
 48 |   --batch_size 16 \
 49 |   --itr 1
 50 | 
 51 | python3 -u run.py \
 52 |   --is_training 1 \
 53 |   --task_name long_term_forecast \
 54 |   --root_path ./dataset/EPF/ \
 55 |   --data_path BE.csv \
 56 |   --model_id BE_168_24 \
 57 |   --model $model_name \
 58 |   --data custom \
 59 |   --features MS \
 60 |   --seq_len 168 \
 61 |   --pred_len 24 \
 62 |   --e_layers 2 \
 63 |   --enc_in 3 \
 64 |   --dec_in 3 \
 65 |   --c_out 1 \
 66 |   --des $des \
 67 |   --patch_len $patch_len \
 68 |   --d_model 512 \
 69 |   --d_ff 512 \
 70 |   --batch_size 16 \
 71 |   --itr 1
 72 | 
 73 | 
 74 | python3 -u run.py \
 75 |   --is_training 1 \
 76 |   --task_name long_term_forecast \
 77 |   --root_path ./dataset/EPF/ \
 78 |   --data_path FR.csv \
 79 |   --model_id FR_168_24 \
 80 |   --model $model_name \
 81 |   --data custom \
 82 |   --features MS \
 83 |   --seq_len 168 \
 84 |   --pred_len 24 \
 85 |   --e_layers 2 \
 86 |   --enc_in 3 \
 87 |   --dec_in 3 \
 88 |   --c_out 1 \
 89 |   --des $des \
 90 |   --patch_len $patch_len \
 91 |   --batch_size 16 \
 92 |   --d_model 512 \
 93 |   --itr 1
 94 | 
 95 | python3 -u run.py \
 96 |   --is_training 1 \
 97 |   --task_name long_term_forecast \
 98 |   --root_path ./dataset/EPF/ \
 99 |   --data_path DE.csv \
100 |   --model_id DE_168_24 \
101 |   --model $model_name \
102 |   --data custom \
103 |   --features MS \
104 |   --seq_len 168 \
105 |   --pred_len 24 \
106 |   --e_layers 1 \
107 |   --enc_in 3 \
108 |   --dec_in 3 \
109 |   --c_out 1 \
110 |   --des $des \
111 |   --patch_len $patch_len \
112 |   --batch_size 4 \
113 |   --d_model 512 \
114 |   --itr 1
115 | 


--------------------------------------------------------------------------------
/scripts/forecast_exogenous/ETTh1/TimeXer.sh:
--------------------------------------------------------------------------------
 1 | export CUDA_VISIBLE_DEVICES=1
 2 | 
 3 | model_name=TimeXer
 4 | des='Timexer-MS'
 5 | 
 6 | python3 -u run.py \
 7 |   --task_name long_term_forecast \
 8 |   --is_training 1 \
 9 |   --root_path ./dataset/ETT-small/ \
10 |   --data_path ETTh1.csv \
11 |   --model_id ETTh1_96_96 \
12 |   --model $model_name \
13 |   --data ETTh1 \
14 |   --features MS \
15 |   --seq_len 96 \
16 |   --label_len 48 \
17 |   --pred_len 96 \
18 |   --e_layers 2 \
19 |   --factor 3 \
20 |   --enc_in 7 \
21 |   --dec_in 7 \
22 |   --c_out 7 \
23 |   --d_model 512 \
24 |   --d_ff 512 \
25 |   --des $des \
26 |   --itr 1
27 | 
28 | python3 -u run.py \
29 |   --task_name long_term_forecast \
30 |   --is_training 1 \
31 |   --root_path ./dataset/ETT-small/ \
32 |   --data_path ETTh1.csv \
33 |   --model_id ETTh1_96_192 \
34 |   --model $model_name \
35 |   --data ETTh1 \
36 |   --features MS \
37 |   --seq_len 96 \
38 |   --label_len 48 \
39 |   --pred_len 192 \
40 |   --e_layers 2 \
41 |   --factor 3 \
42 |   --enc_in 7 \
43 |   --dec_in 7 \
44 |   --c_out 7 \
45 |   --d_model 128 \
46 |   --d_ff 128 \
47 |   --batch_size 4 \
48 |   --des $des \
49 |   --itr 1
50 | 
51 | python3 -u run.py \
52 |   --task_name long_term_forecast \
53 |   --is_training 1 \
54 |   --root_path ./dataset/ETT-small/ \
55 |   --data_path ETTh1.csv \
56 |   --model_id ETTh1_96_336 \
57 |   --model $model_name \
58 |   --data ETTh1 \
59 |   --features MS \
60 |   --seq_len 96 \
61 |   --label_len 48 \
62 |   --pred_len 336 \
63 |   --e_layers 2 \
64 |   --factor 3 \
65 |   --enc_in 7 \
66 |   --dec_in 7 \
67 |   --c_out 7 \
68 |   --d_model 512 \
69 |   --d_ff 512 \
70 |   --batch_size 32 \
71 |   --des $des \
72 |   --itr 1
73 | 
74 | python3 -u run.py \
75 |   --task_name long_term_forecast \
76 |   --is_training 1 \
77 |   --root_path ./dataset/ETT-small/ \
78 |   --data_path ETTh1.csv \
79 |   --model_id ETTh1_96_720 \
80 |   --model $model_name \
81 |   --data ETTh1 \
82 |   --features MS \
83 |   --seq_len 96 \
84 |   --label_len 48 \
85 |   --pred_len 720 \
86 |   --e_layers 2 \
87 |   --factor 3 \
88 |   --enc_in 7 \
89 |   --dec_in 7 \
90 |   --c_out 7 \
91 |   --d_model 512 \
92 |   --batch_size 128 \
93 |   --des $des \
94 |   --itr 1
95 | 


--------------------------------------------------------------------------------
/scripts/forecast_exogenous/ETTh2/TimeXer.sh:
--------------------------------------------------------------------------------
 1 | export CUDA_VISIBLE_DEVICES=3
 2 | 
 3 | model_name=TimeXer
 4 | des='Timexer-MS'
 5 | 
 6 | python3 -u run.py \
 7 |   --task_name long_term_forecast \
 8 |   --is_training 1 \
 9 |   --root_path ./dataset/ETT-small/ \
10 |   --data_path ETTh2.csv \
11 |   --model_id ETTh2_96_96 \
12 |   --model $model_name \
13 |   --data ETTh2 \
14 |   --features MS \
15 |   --seq_len 96 \
16 |   --label_len 48 \
17 |   --pred_len 96 \
18 |   --e_layers 1 \
19 |   --factor 3 \
20 |   --enc_in 7 \
21 |   --dec_in 7 \
22 |   --c_out 7 \
23 |   --d_model 128 \
24 |   --d_ff 128 \
25 |   --batch_size 128 \
26 |   --des $des \
27 |   --itr 1
28 | 
29 | python3 -u run.py \
30 |   --task_name long_term_forecast \
31 |   --is_training 1 \
32 |   --root_path ./dataset/ETT-small/ \
33 |   --data_path ETTh2.csv \
34 |   --model_id ETTh2_96_192 \
35 |   --model $model_name \
36 |   --data ETTh2 \
37 |   --features MS \
38 |   --seq_len 96 \
39 |   --label_len 48 \
40 |   --pred_len 192 \
41 |   --e_layers 1 \
42 |   --factor 3 \
43 |   --enc_in 7 \
44 |   --dec_in 7 \
45 |   --c_out 7 \
46 |   --d_model 128 \
47 |   --d_ff 512 \
48 |   --batch_size 128 \
49 |   --des $des \
50 |   --itr 1
51 | 
52 | python3 -u run.py \
53 |   --task_name long_term_forecast \
54 |   --is_training 1 \
55 |   --root_path ./dataset/ETT-small/ \
56 |   --data_path ETTh2.csv \
57 |   --model_id ETTh2_96_336 \
58 |   --model $model_name \
59 |   --data ETTh2 \
60 |   --features MS \
61 |   --seq_len 96 \
62 |   --label_len 48 \
63 |   --pred_len 336 \
64 |   --e_layers 2 \
65 |   --factor 3 \
66 |   --enc_in 7 \
67 |   --dec_in 7 \
68 |   --c_out 7 \
69 |   --d_model 128 \
70 |   --d_ff 256 \
71 |   --batch_size 16 \
72 |   --des $des \
73 |   --itr 1
74 | 
75 | python3 -u run.py \
76 |   --task_name long_term_forecast \
77 |   --is_training 1 \
78 |   --root_path ./dataset/ETT-small/ \
79 |   --data_path ETTh2.csv \
80 |   --model_id ETTh2_96_720 \
81 |   --model $model_name \
82 |   --data ETTh2 \
83 |   --features MS \
84 |   --seq_len 96 \
85 |   --label_len 48 \
86 |   --pred_len 720 \
87 |   --e_layers 1 \
88 |   --factor 3 \
89 |   --enc_in 7 \
90 |   --dec_in 7 \
91 |   --c_out 7 \
92 |   --d_model 256 \
93 |   --d_ff 512 \
94 |   --des $des \
95 |   --itr 1


--------------------------------------------------------------------------------
/scripts/forecast_exogenous/ETTm1/TimeXer.sh:
--------------------------------------------------------------------------------
 1 | export CUDA_VISIBLE_DEVICES=2
 2 | 
 3 | model_name=TimeXer
 4 | des='Timexer-MS'
 5 | 
 6 | python3 -u run.py \
 7 |   --task_name long_term_forecast \
 8 |   --is_training 1 \
 9 |   --root_path ./dataset/ETT-small/ \
10 |   --data_path ETTm1.csv \
11 |   --model_id ETTm1_96_96 \
12 |   --model $model_name \
13 |   --data ETTm1 \
14 |   --features MS \
15 |   --seq_len 96 \
16 |   --label_len 48 \
17 |   --pred_len 96 \
18 |   --e_layers 1 \
19 |   --factor 3 \
20 |   --enc_in 7 \
21 |   --dec_in 7 \
22 |   --c_out 7 \
23 |   --d_model 256 \
24 |   --batch_size 128 \
25 |   --des $des \
26 |   --itr 1
27 | 
28 | python3 -u run.py \
29 |   --task_name long_term_forecast \
30 |   --is_training 1 \
31 |   --root_path ./dataset/ETT-small/ \
32 |   --data_path ETTm1.csv \
33 |   --model_id ETTm1_96_192 \
34 |   --model $model_name \
35 |   --data ETTm1 \
36 |   --features MS \
37 |   --seq_len 96 \
38 |   --label_len 48 \
39 |   --pred_len 192 \
40 |   --e_layers 1 \
41 |   --factor 3 \
42 |   --enc_in 7 \
43 |   --dec_in 7 \
44 |   --c_out 7 \
45 |   --d_model 128 \
46 |   --batch_size 128 \
47 |   --des $des \
48 |   --itr 1
49 | 
50 | python3 -u run.py \
51 |   --task_name long_term_forecast \
52 |   --is_training 1 \
53 |   --root_path ./dataset/ETT-small/ \
54 |   --data_path ETTm1.csv \
55 |   --model_id ETTm1_96_336 \
56 |   --model $model_name \
57 |   --data ETTm1 \
58 |   --features MS \
59 |   --seq_len 96 \
60 |   --label_len 48 \
61 |   --pred_len 336 \
62 |   --e_layers 1 \
63 |   --factor 3 \
64 |   --enc_in 7 \
65 |   --dec_in 7 \
66 |   --c_out 7 \
67 |   --d_model 128 \
68 |   --batch_size 128 \
69 |   --des $des \
70 |   --itr 1
71 | 
72 | python3 -u run.py \
73 |   --task_name long_term_forecast \
74 |   --is_training 1 \
75 |   --root_path ./dataset/ETT-small/ \
76 |   --data_path ETTm1.csv \
77 |   --model_id ETTm1_96_720 \
78 |   --model $model_name \
79 |   --data ETTm1 \
80 |   --features MS \
81 |   --seq_len 96 \
82 |   --label_len 48 \
83 |   --pred_len 720 \
84 |   --e_layers 1 \
85 |   --factor 3 \
86 |   --enc_in 7 \
87 |   --dec_in 7 \
88 |   --c_out 7 \
89 |   --d_model 128 \
90 |   --batch_size 128 \
91 |   --des $des \
92 |   --itr 1
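
Note: the four ETTm1 runs above differ only in `--pred_len`, `--model_id` and `--d_model`. As a minimal sketch (not part of the repository), the same invocations could be generated from one table. The helper name `build_command` and the dictionary `D_MODEL_BY_PRED_LEN` are hypothetical; every flag value is copied from the script above. The sketch only prints the commands and assumes `CUDA_VISIBLE_DEVICES` is set separately in the environment.

```python
import shlex

# Per-horizon d_model, copied from the four ETTm1 runs in the script above.
D_MODEL_BY_PRED_LEN = {96: 256, 192: 128, 336: 128, 720: 128}


def build_command(pred_len, d_model, des="Timexer-MS"):
    """Return the argv list for one run (hypothetical helper, not in the repo)."""
    return [
        "python3", "-u", "run.py",
        "--task_name", "long_term_forecast",
        "--is_training", "1",
        "--root_path", "./dataset/ETT-small/",
        "--data_path", "ETTm1.csv",
        "--model_id", f"ETTm1_96_{pred_len}",
        "--model", "TimeXer",
        "--data", "ETTm1",
        "--features", "MS",
        "--seq_len", "96",
        "--label_len", "48",
        "--pred_len", str(pred_len),
        "--e_layers", "1",
        "--factor", "3",
        "--enc_in", "7",
        "--dec_in", "7",
        "--c_out", "7",
        "--d_model", str(d_model),
        "--batch_size", "128",
        "--des", des,
        "--itr", "1",
    ]


# One command per forecasting horizon, in the same order as the script.
commands = [build_command(p, d) for p, d in D_MODEL_BY_PRED_LEN.items()]

if __name__ == "__main__":
    for cmd in commands:
        # shlex.join quotes each argument so the line can be pasted into a shell.
        print(shlex.join(cmd))
```

To launch the runs instead of printing them, each list could be passed to `subprocess.run(cmd, check=True)`.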


--------------------------------------------------------------------------------
/scripts/forecast_exogenous/ETTm2/TimeXer.sh:
--------------------------------------------------------------------------------
 1 | export CUDA_VISIBLE_DEVICES=2
 2 | 
 3 | model_name=TimeXer
 4 | des='Timexer-MS'
 5 | 
 6 | python3 -u run.py \
 7 |   --task_name long_term_forecast \
 8 |   --is_training 1 \
 9 |   --root_path ./dataset/ETT-small/ \
10 |   --data_path ETTm2.csv \
11 |   --model_id ETTm2_96_96 \
12 |   --model $model_name \
13 |   --data ETTm2 \
14 |   --features MS \
15 |   --seq_len 96 \
16 |   --label_len 48 \
17 |   --pred_len 96 \
18 |   --e_layers 1 \
19 |   --factor 3 \
20 |   --enc_in 7 \
21 |   --dec_in 7 \
22 |   --c_out 7 \
23 |   --d_model 512 \
24 |   --batch_size 16 \
25 |   --des $des \
26 |   --itr 1
27 | 
28 | python3 -u run.py \
29 |   --task_name long_term_forecast \
30 |   --is_training 1 \
31 |   --root_path ./dataset/ETT-small/ \
32 |   --data_path ETTm2.csv \
33 |   --model_id ETTm2_96_192 \
34 |   --model $model_name \
35 |   --data ETTm2 \
36 |   --features MS \
37 |   --seq_len 96 \
38 |   --label_len 48 \
39 |   --pred_len 192 \
40 |   --e_layers 1 \
41 |   --factor 3 \
42 |   --enc_in 7 \
43 |   --dec_in 7 \
44 |   --c_out 7 \
45 |   --d_model 256 \
46 |   --batch_size 4 \
47 |   --des $des \
48 |   --itr 1
49 | 
50 | python3 -u run.py \
51 |   --task_name long_term_forecast \
52 |   --is_training 1 \
53 |   --root_path ./dataset/ETT-small/ \
54 |   --data_path ETTm2.csv \
55 |   --model_id ETTm2_96_336 \
56 |   --model $model_name \
57 |   --data ETTm2 \
58 |   --features MS \
59 |   --seq_len 96 \
60 |   --label_len 48 \
61 |   --pred_len 336 \
62 |   --e_layers 1 \
63 |   --factor 3 \
64 |   --enc_in 7 \
65 |   --dec_in 7 \
66 |   --c_out 7 \
67 |   --d_model 128 \
68 |   --batch_size 128 \
69 |   --des $des \
70 |   --itr 1
71 | 
72 | python3 -u run.py \
73 |   --task_name long_term_forecast \
74 |   --is_training 1 \
75 |   --root_path ./dataset/ETT-small/ \
76 |   --data_path ETTm2.csv \
77 |   --model_id ETTm2_96_720 \
78 |   --model $model_name \
79 |   --data ETTm2 \
80 |   --features MS \
81 |   --seq_len 96 \
82 |   --label_len 48 \
83 |   --pred_len 720 \
84 |   --e_layers 1 \
85 |   --factor 3 \
86 |   --enc_in 7 \
87 |   --dec_in 7 \
88 |   --c_out 7 \
89 |   --d_model 128 \
90 |   --batch_size 128 \
91 |   --des $des \
92 |   --itr 1
93 | 


--------------------------------------------------------------------------------
/scripts/forecast_exogenous/Traffic/TimeXer.sh:
--------------------------------------------------------------------------------
 1 | export CUDA_VISIBLE_DEVICES=0
 2 | 
 3 | model_name=TimeXer
 4 | des='Timexer-MS'
 5 | 
 6 | python3 -u run.py \
 7 |   --task_name long_term_forecast \
 8 |   --is_training 1 \
 9 |   --root_path ./dataset/traffic/ \
10 |   --data_path traffic.csv \
11 |   --model_id traffic_96_96 \
12 |   --model $model_name \
13 |   --data custom \
14 |   --features MS \
15 |   --seq_len 96 \
16 |   --label_len 48 \
17 |   --pred_len 96 \
18 |   --e_layers 1 \
19 |   --d_layers 1 \
20 |   --factor 3 \
21 |   --enc_in 862 \
22 |   --dec_in 862 \
23 |   --c_out 862 \
24 |   --d_model 512 \
25 |   --des $des \
26 |   --batch_size 4 \
27 |   --itr 1
28 | 
29 | python3 -u run.py \
30 |   --task_name long_term_forecast \
31 |   --is_training 1 \
32 |   --root_path ./dataset/traffic/ \
33 |   --data_path traffic.csv \
34 |   --model_id traffic_96_192 \
35 |   --model $model_name \
36 |   --data custom \
37 |   --features MS \
38 |   --seq_len 96 \
39 |   --label_len 48 \
40 |   --pred_len 192 \
41 |   --e_layers 1 \
42 |   --d_layers 1 \
43 |   --factor 3 \
44 |   --enc_in 862 \
45 |   --dec_in 862 \
46 |   --c_out 862 \
47 |   --d_model 512 \
48 |   --des $des \
49 |   --batch_size 4 \
50 |   --itr 1
51 | 
52 | python3 -u run.py \
53 |   --task_name long_term_forecast \
54 |   --is_training 1 \
55 |   --root_path ./dataset/traffic/ \
56 |   --data_path traffic.csv \
57 |   --model_id traffic_96_336 \
58 |   --model $model_name \
59 |   --data custom \
60 |   --features MS \
61 |   --seq_len 96 \
62 |   --label_len 48 \
63 |   --pred_len 336 \
64 |   --e_layers 1 \
65 |   --d_layers 1 \
66 |   --factor 3 \
67 |   --enc_in 862 \
68 |   --dec_in 862 \
69 |   --c_out 862 \
70 |   --d_model 512 \
71 |   --des $des \
72 |   --batch_size 4 \
73 |   --itr 1
74 | 
75 | python3 -u run.py \
76 |   --task_name long_term_forecast \
77 |   --is_training 1 \
78 |   --root_path ./dataset/traffic/ \
79 |   --data_path traffic.csv \
80 |   --model_id traffic_96_720 \
81 |   --model $model_name \
82 |   --data custom \
83 |   --features MS \
84 |   --seq_len 96 \
85 |   --label_len 48 \
86 |   --pred_len 720 \
87 |   --e_layers 1 \
88 |   --d_layers 1 \
89 |   --factor 3 \
90 |   --enc_in 862 \
91 |   --dec_in 862 \
92 |   --c_out 862 \
93 |   --d_model 512 \
94 |   --des $des \
95 |   --batch_size 4 \
96 |   --itr 1
97 | 


--------------------------------------------------------------------------------
/scripts/forecast_exogenous/Weather/TimeXer.sh:
--------------------------------------------------------------------------------
 1 | export CUDA_VISIBLE_DEVICES=3
 2 | 
 3 | model_name=TimeXer
 4 | des='Timexer-MS'
 5 | 
 6 | 
 7 | python3 -u run.py \
 8 |   --task_name long_term_forecast \
 9 |   --is_training 1 \
10 |   --root_path ./dataset/weather/ \
11 |   --data_path weather.csv \
12 |   --model_id weather_96_96 \
13 |   --model $model_name \
14 |   --data custom \
15 |   --features MS \
16 |   --seq_len 96 \
17 |   --label_len 48 \
18 |   --pred_len 96 \
19 |   --e_layers 1 \
20 |   --factor 3 \
21 |   --enc_in 21 \
22 |   --dec_in 21 \
23 |   --c_out 21 \
24 |   --des $des \
25 |   --d_model 128 \
26 |   --itr 1
27 | 
28 | python3 -u run.py \
29 |   --task_name long_term_forecast \
30 |   --is_training 1 \
31 |   --root_path ./dataset/weather/ \
32 |   --data_path weather.csv \
33 |   --model_id weather_96_192 \
34 |   --model $model_name \
35 |   --data custom \
36 |   --features MS \
37 |   --seq_len 96 \
38 |   --label_len 48 \
39 |   --pred_len 192 \
40 |   --e_layers 1 \
41 |   --factor 3 \
42 |   --enc_in 21 \
43 |   --dec_in 21 \
44 |   --c_out 21 \
45 |   --des $des \
46 |   --d_model 128 \
47 |   --itr 1
48 | 
49 | python3 -u run.py \
50 |   --task_name long_term_forecast \
51 |   --is_training 1 \
52 |   --root_path ./dataset/weather/ \
53 |   --data_path weather.csv \
54 |   --model_id weather_96_336 \
55 |   --model $model_name \
56 |   --data custom \
57 |   --features MS \
58 |   --seq_len 96 \
59 |   --label_len 48 \
60 |   --pred_len 336 \
61 |   --e_layers 1 \
62 |   --factor 3 \
63 |   --enc_in 21 \
64 |   --dec_in 21 \
65 |   --c_out 21 \
66 |   --des $des \
67 |   --d_model 128 \
68 |   --itr 1
69 | 
70 | python3 -u run.py \
71 |   --task_name long_term_forecast \
72 |   --is_training 1 \
73 |   --root_path ./dataset/weather/ \
74 |   --data_path weather.csv \
75 |   --model_id weather_96_720 \
76 |   --model $model_name \
77 |   --data custom \
78 |   --features MS \
79 |   --seq_len 96 \
80 |   --label_len 48 \
81 |   --pred_len 720 \
82 |   --e_layers 1 \
83 |   --factor 3 \
84 |   --enc_in 21 \
85 |   --dec_in 21 \
86 |   --c_out 21 \
87 |   --des $des \
88 |   --d_model 128 \
89 |   --itr 1


--------------------------------------------------------------------------------
/scripts/forecast_exogenous/meteorology/temp.sh:
--------------------------------------------------------------------------------
 1 | export CUDA_VISIBLE_DEVICES=0
 2 | model_name=TimeXer
 3 | 
 4 | python3 -u run.py \
 5 |   --task_name long_term_forecast \
 6 |   --is_training 1 \
 7 |   --root_path ./dataset/meteorology \
 8 |   --data_path temp.npy \
 9 |   --model_id temp \
10 |   --model $model_name \
11 |   --data Meteorology \
12 |   --features MS \
13 |   --seq_len 168 \
14 |   --label_len 1 \
15 |   --pred_len 72 \
16 |   --patch_len 8 \
17 |   --e_layers 2 \
18 |   --enc_in 37 \
19 |   --d_model 512 \
20 |   --d_ff 512 \
21 |   --des 'global_temp' \
22 |   --learning_rate 0.0001 \
23 |   --batch_size 4096


--------------------------------------------------------------------------------
/scripts/forecast_exogenous/meteorology/wind.sh:
--------------------------------------------------------------------------------
 1 | export CUDA_VISIBLE_DEVICES=1
 2 | model_name=TimeXer
 3 | 
 4 | python3 -u run.py \
 5 |   --task_name long_term_forecast \
 6 |   --is_training 1 \
 7 |   --root_path ./dataset/meteorology \
 8 |   --data_path wind.npy \
 9 |   --model_id wind \
10 |   --model $model_name \
11 |   --data Meteorology \
12 |   --features MS \
13 |   --seq_len 168 \
14 |   --label_len 1 \
15 |   --pred_len 72 \
16 |   --patch_len 8 \
17 |   --e_layers 2 \
18 |   --enc_in 37 \
19 |   --d_model 512 \
20 |   --d_ff 512 \
21 |   --des 'global_wind' \
22 |   --learning_rate 0.0001 \
23 |   --batch_size 4096


--------------------------------------------------------------------------------
/scripts/multivariate/ECL/TimeXer.sh:
--------------------------------------------------------------------------------
 1 | export CUDA_VISIBLE_DEVICES=1
 2 | 
 3 | model_name=TimeXer
 4 | 
 5 | python3 -u run.py \
 6 |   --task_name long_term_forecast \
 7 |   --is_training 1 \
 8 |   --root_path ./dataset/electricity/ \
 9 |   --data_path electricity.csv \
10 |   --model_id ECL_96_96 \
11 |   --model $model_name \
12 |   --data custom \
13 |   --features M \
14 |   --seq_len 96 \
15 |   --label_len 48 \
16 |   --pred_len 96 \
17 |   --e_layers 4 \
18 |   --factor 3 \
19 |   --enc_in 321 \
20 |   --dec_in 321 \
21 |   --c_out 321 \
22 |   --des 'Exp' \
23 |   --d_ff 512 \
24 |   --batch_size 4 \
25 |   --itr 1
26 | 
27 | python3 -u run.py \
28 |   --task_name long_term_forecast \
29 |   --is_training 1 \
30 |   --root_path ./dataset/electricity/ \
31 |   --data_path electricity.csv \
32 |   --model_id ECL_96_192 \
33 |   --model $model_name \
34 |   --data custom \
35 |   --features M \
36 |   --seq_len 96 \
37 |   --label_len 48 \
38 |   --pred_len 192 \
39 |   --e_layers 3 \
40 |   --factor 3 \
41 |   --enc_in 321 \
42 |   --dec_in 321 \
43 |   --c_out 321 \
44 |   --des 'Exp' \
45 |   --batch_size 4 \
46 |   --itr 1
47 | 
48 | python3 -u run.py \
49 |   --task_name long_term_forecast \
50 |   --is_training 1 \
51 |   --root_path ./dataset/electricity/ \
52 |   --data_path electricity.csv \
53 |   --model_id ECL_96_336 \
54 |   --model $model_name \
55 |   --data custom \
56 |   --features M \
57 |   --seq_len 96 \
58 |   --label_len 48 \
59 |   --pred_len 336 \
60 |   --e_layers 4 \
61 |   --factor 3 \
62 |   --enc_in 321 \
63 |   --dec_in 321 \
64 |   --c_out 321 \
65 |   --des 'Exp' \
66 |   --batch_size 4 \
67 |   --itr 1
68 | 
69 | python3 -u run.py \
70 |   --task_name long_term_forecast \
71 |   --is_training 1 \
72 |   --root_path ./dataset/electricity/ \
73 |   --data_path electricity.csv \
74 |   --model_id ECL_96_720 \
75 |   --model $model_name \
76 |   --data custom \
77 |   --features M \
78 |   --seq_len 96 \
79 |   --label_len 48 \
80 |   --pred_len 720 \
81 |   --e_layers 3 \
82 |   --factor 3 \
83 |   --enc_in 321 \
84 |   --dec_in 321 \
85 |   --c_out 321 \
86 |   --des 'Exp' \
87 |   --batch_size 4 \
88 |   --itr 1
89 | 


--------------------------------------------------------------------------------
/scripts/multivariate/ETT/TimeXer_ETTh1.sh:
--------------------------------------------------------------------------------
 1 | export CUDA_VISIBLE_DEVICES=1
 2 | 
 3 | model_name=TimeXer
 4 | 
 5 | python3 -u run.py \
 6 |   --task_name long_term_forecast \
 7 |   --is_training 1 \
 8 |   --root_path ./dataset/ETT-small/ \
 9 |   --data_path ETTh1.csv \
10 |   --model_id ETTh1_96_96 \
11 |   --model $model_name \
12 |   --data ETTh1 \
13 |   --features M \
14 |   --seq_len 96 \
15 |   --label_len 48 \
16 |   --pred_len 96 \
17 |   --e_layers 1 \
18 |   --factor 3 \
19 |   --enc_in 7 \
20 |   --dec_in 7 \
21 |   --c_out 7 \
22 |   --d_model 256 \
23 |   --batch_size 4 \
24 |   --des 'Exp' \
25 |   --itr 1
26 | 
27 | 
28 | python3 -u run.py \
29 |   --task_name long_term_forecast \
30 |   --is_training 1 \
31 |   --root_path ./dataset/ETT-small/ \
32 |   --data_path ETTh1.csv \
33 |   --model_id ETTh1_96_192 \
34 |   --model $model_name \
35 |   --data ETTh1 \
36 |   --features M \
37 |   --seq_len 96 \
38 |   --label_len 48 \
39 |   --pred_len 192 \
40 |   --e_layers 2 \
41 |   --factor 3 \
42 |   --enc_in 7 \
43 |   --dec_in 7 \
44 |   --c_out 7 \
45 |   --des 'Exp' \
46 |   --d_model 128 \
47 |   --batch_size 4 \
48 |   --itr 1
49 | 
50 | python3 -u run.py \
51 |   --task_name long_term_forecast \
52 |   --is_training 1 \
53 |   --root_path ./dataset/ETT-small/ \
54 |   --data_path ETTh1.csv \
55 |   --model_id ETTh1_96_336 \
56 |   --model $model_name \
57 |   --data ETTh1 \
58 |   --features M \
59 |   --seq_len 96 \
60 |   --label_len 48 \
61 |   --pred_len 336 \
62 |   --e_layers 1 \
63 |   --factor 3 \
64 |   --enc_in 7 \
65 |   --dec_in 7 \
66 |   --c_out 7 \
67 |   --des 'Exp' \
68 |   --d_model 512 \
69 |   --d_ff 1024 \
70 |   --batch_size 16 \
71 |   --itr 1
72 | 
73 | python3 -u run.py \
74 |   --task_name long_term_forecast \
75 |   --is_training 1 \
76 |   --root_path ./dataset/ETT-small/ \
77 |   --data_path ETTh1.csv \
78 |   --model_id ETTh1_96_720 \
79 |   --model $model_name \
80 |   --data ETTh1 \
81 |   --features M \
82 |   --seq_len 96 \
83 |   --label_len 48 \
84 |   --pred_len 720 \
85 |   --e_layers 1 \
86 |   --factor 3 \
87 |   --enc_in 7 \
88 |   --dec_in 7 \
89 |   --c_out 7 \
90 |   --des 'Exp' \
91 |   --d_model 256 \
92 |   --d_ff 1024 \
93 |   --batch_size 16 \
94 |   --itr 1
95 | 


--------------------------------------------------------------------------------
/scripts/multivariate/ETT/TimeXer_ETTh2.sh:
--------------------------------------------------------------------------------
 1 | export CUDA_VISIBLE_DEVICES=0
 2 | 
 3 | model_name=TimeXer
 4 | 
 5 | python3 -u run.py \
 6 |   --task_name long_term_forecast \
 7 |   --is_training 1 \
 8 |   --root_path ./dataset/ETT-small/ \
 9 |   --data_path ETTh2.csv \
10 |   --model_id ETTh2_96_96 \
11 |   --model $model_name \
12 |   --data ETTh2 \
13 |   --features M \
14 |   --seq_len 96 \
15 |   --label_len 48 \
16 |   --pred_len 96 \
17 |   --e_layers 1 \
18 |   --factor 3 \
19 |   --enc_in 7 \
20 |   --dec_in 7 \
21 |   --c_out 7 \
22 |   --des 'Exp' \
23 |   --d_model 256 \
24 |   --d_ff 1024 \
25 |   --batch_size 16 \
26 |   --itr 1
27 | 
28 | python3 -u run.py \
29 |   --task_name long_term_forecast \
30 |   --is_training 1 \
31 |   --root_path ./dataset/ETT-small/ \
32 |   --data_path ETTh2.csv \
33 |   --model_id ETTh2_96_192 \
34 |   --model $model_name \
35 |   --data ETTh2 \
36 |   --features M \
37 |   --seq_len 96 \
38 |   --label_len 48 \
39 |   --pred_len 192 \
40 |   --e_layers 1 \
41 |   --factor 3 \
42 |   --enc_in 7 \
43 |   --dec_in 7 \
44 |   --c_out 7 \
45 |   --des 'Exp' \
46 |   --d_model 256 \
47 |   --d_ff 1024 \
48 |   --itr 1
49 | 
50 | python3 -u run.py \
51 |   --task_name long_term_forecast \
52 |   --is_training 1 \
53 |   --root_path ./dataset/ETT-small/ \
54 |   --data_path ETTh2.csv \
55 |   --model_id ETTh2_96_336 \
56 |   --model $model_name \
57 |   --data ETTh2 \
58 |   --features M \
59 |   --seq_len 96 \
60 |   --label_len 48 \
61 |   --pred_len 336 \
62 |   --e_layers 2 \
63 |   --factor 3 \
64 |   --enc_in 7 \
65 |   --dec_in 7 \
66 |   --c_out 7 \
67 |   --des 'Exp' \
68 |   --d_model 512 \
69 |   --d_ff 1024 \
70 |   --batch_size 4 \
71 |   --itr 1
72 | 
73 | python3 -u run.py \
74 |   --task_name long_term_forecast \
75 |   --is_training 1 \
76 |   --root_path ./dataset/ETT-small/ \
77 |   --data_path ETTh2.csv \
78 |   --model_id ETTh2_96_720 \
79 |   --model $model_name \
80 |   --data ETTh2 \
81 |   --features M \
82 |   --seq_len 96 \
83 |   --label_len 48 \
84 |   --pred_len 720 \
85 |   --e_layers 2 \
86 |   --factor 3 \
87 |   --enc_in 7 \
88 |   --dec_in 7 \
89 |   --c_out 7 \
90 |   --des 'Exp' \
91 |   --d_model 256 \
92 |   --d_ff 1024 \
93 |   --batch_size 16 \
94 |   --itr 1


--------------------------------------------------------------------------------
/scripts/multivariate/ETT/TimeXer_ETTm1.sh:
--------------------------------------------------------------------------------
 1 | export CUDA_VISIBLE_DEVICES=0
 2 | 
 3 | model_name=TimeXer
 4 | 
 5 | python3 -u run.py \
 6 |   --task_name long_term_forecast \
 7 |   --is_training 1 \
 8 |   --root_path ./dataset/ETT-small/ \
 9 |   --data_path ETTm1.csv \
10 |   --model_id ETTm1_96_96 \
11 |   --model $model_name \
12 |   --data ETTm1 \
13 |   --features M \
14 |   --seq_len 96 \
15 |   --label_len 48 \
16 |   --pred_len 96 \
17 |   --e_layers 1 \
18 |   --factor 3 \
19 |   --enc_in 7 \
20 |   --dec_in 7 \
21 |   --c_out 7 \
22 |   --d_model 256 \
23 |   --batch_size 4 \
24 |   --des 'Exp' \
25 |   --itr 1
26 | 
27 | python3 -u run.py \
28 |   --task_name long_term_forecast \
29 |   --is_training 1 \
30 |   --root_path ./dataset/ETT-small/ \
31 |   --data_path ETTm1.csv \
32 |   --model_id ETTm1_96_192 \
33 |   --model $model_name \
34 |   --data ETTm1 \
35 |   --features M \
36 |   --seq_len 96 \
37 |   --label_len 48 \
38 |   --pred_len 192 \
39 |   --e_layers 1 \
40 |   --factor 3 \
41 |   --enc_in 7 \
42 |   --dec_in 7 \
43 |   --c_out 7 \
44 |   --d_model 256 \
45 |   --d_ff 256 \
46 |   --batch_size 4 \
47 |   --des 'Exp' \
48 |   --itr 1
49 | 
50 | python3 -u run.py \
51 |   --task_name long_term_forecast \
52 |   --is_training 1 \
53 |   --root_path ./dataset/ETT-small/ \
54 |   --data_path ETTm1.csv \
55 |   --model_id ETTm1_96_336 \
56 |   --model $model_name \
57 |   --data ETTm1 \
58 |   --features M \
59 |   --seq_len 96 \
60 |   --label_len 48 \
61 |   --pred_len 336 \
62 |   --e_layers 1 \
63 |   --factor 3 \
64 |   --enc_in 7 \
65 |   --dec_in 7 \
66 |   --c_out 7 \
67 |   --d_model 256 \
68 |   --d_ff 1024 \
69 |   --batch_size 4 \
70 |   --des 'Exp' \
71 |   --itr 1
72 | 
73 | python3 -u run.py \
74 |   --task_name long_term_forecast \
75 |   --is_training 1 \
76 |   --root_path ./dataset/ETT-small/ \
77 |   --data_path ETTm1.csv \
78 |   --model_id ETTm1_96_720 \
79 |   --model $model_name \
80 |   --data ETTm1 \
81 |   --features M \
82 |   --seq_len 96 \
83 |   --label_len 48 \
84 |   --pred_len 720 \
85 |   --e_layers 1 \
86 |   --factor 3 \
87 |   --enc_in 7 \
88 |   --dec_in 7 \
89 |   --c_out 7 \
90 |   --d_model 256 \
91 |   --d_ff 512 \
92 |   --batch_size 4 \
93 |   --des 'Exp' \
94 |   --itr 1


--------------------------------------------------------------------------------
/scripts/multivariate/ETT/TimeXer_ETTm2.sh:
--------------------------------------------------------------------------------
 1 | export CUDA_VISIBLE_DEVICES=0
 2 | 
 3 | model_name=TimeXer
 4 | 
 5 | python3 -u run.py \
 6 |   --task_name long_term_forecast \
 7 |   --is_training 1 \
 8 |   --root_path ./dataset/ETT-small/ \
 9 |   --data_path ETTm2.csv \
10 |   --model_id ETTm2_96_96 \
11 |   --model $model_name \
12 |   --data ETTm2 \
13 |   --features M \
14 |   --seq_len 96 \
15 |   --label_len 48 \
16 |   --pred_len 96 \
17 |   --e_layers 1 \
18 |   --d_layers 1 \
19 |   --factor 3 \
20 |   --enc_in 7 \
21 |   --dec_in 7 \
22 |   --c_out 7 \
23 |   --d_model 256 \
24 |   --des 'Exp' \
25 |   --itr 1
26 | 
27 | python3 -u run.py \
28 |   --task_name long_term_forecast \
29 |   --is_training 1 \
30 |   --root_path ./dataset/ETT-small/ \
31 |   --data_path ETTm2.csv \
32 |   --model_id ETTm2_96_192 \
33 |   --model $model_name \
34 |   --data ETTm2 \
35 |   --features M \
36 |   --seq_len 96 \
37 |   --label_len 48 \
38 |   --pred_len 192 \
39 |   --e_layers 1 \
40 |   --d_layers 1 \
41 |   --factor 3 \
42 |   --enc_in 7 \
43 |   --dec_in 7 \
44 |   --c_out 7 \
45 |   --d_model 256 \
46 |   --d_ff 1024 \
47 |   --batch_size 16 \
48 |   --des 'Exp' \
49 |   --itr 1
50 | 
51 | python3 -u run.py \
52 |   --task_name long_term_forecast \
53 |   --is_training 1 \
54 |   --root_path ./dataset/ETT-small/ \
55 |   --data_path ETTm2.csv \
56 |   --model_id ETTm2_96_336 \
57 |   --model $model_name \
58 |   --data ETTm2 \
59 |   --features M \
60 |   --seq_len 96 \
61 |   --label_len 48 \
62 |   --pred_len 336 \
63 |   --e_layers 1 \
64 |   --d_layers 1 \
65 |   --factor 3 \
66 |   --enc_in 7 \
67 |   --dec_in 7 \
68 |   --c_out 7 \
69 |   --d_model 512 \
70 |   --d_ff 1024 \
71 |   --des 'Exp' \
72 |   --itr 1
73 | 
74 | 
75 | python3 -u run.py \
76 |   --task_name long_term_forecast \
77 |   --is_training 1 \
78 |   --root_path ./dataset/ETT-small/ \
79 |   --data_path ETTm2.csv \
80 |   --model_id ETTm2_96_720 \
81 |   --model $model_name \
82 |   --data ETTm2 \
83 |   --features M \
84 |   --seq_len 96 \
85 |   --label_len 48 \
86 |   --pred_len 720 \
87 |   --e_layers 1 \
88 |   --d_layers 1 \
89 |   --factor 3 \
90 |   --enc_in 7 \
91 |   --dec_in 7 \
92 |   --c_out 7 \
93 |   --d_model 512 \
94 |   --des 'Exp' \
95 |   --itr 1


--------------------------------------------------------------------------------
/scripts/multivariate/Traffic/TimeXer.sh:
--------------------------------------------------------------------------------
  1 | export CUDA_VISIBLE_DEVICES=3
  2 | 
  3 | model_name=TimeXer
  4 | 
  5 | python3 -u run.py \
  6 |   --task_name long_term_forecast \
  7 |   --is_training 1 \
  8 |   --root_path ./dataset/traffic/ \
  9 |   --data_path traffic.csv \
 10 |   --model_id traffic_96_96 \
 11 |   --model $model_name \
 12 |   --data custom \
 13 |   --features M \
 14 |   --seq_len 96 \
 15 |   --label_len 48 \
 16 |   --pred_len 96 \
 17 |   --e_layers 3 \
 18 |   --factor 3 \
 19 |   --enc_in 862 \
 20 |   --dec_in 862 \
 21 |   --c_out 862 \
 22 |   --d_model 512 \
 23 |   --d_ff 512 \
 24 |   --des 'Exp' \
 25 |   --batch_size 16 \
 26 |   --learning_rate 0.001 \
 27 |   --itr 1
 28 | 
 29 | python3 -u run.py \
 30 |   --task_name long_term_forecast \
 31 |   --is_training 1 \
 32 |   --root_path ./dataset/traffic/ \
 33 |   --data_path traffic.csv \
 34 |   --model_id traffic_96_192 \
 35 |   --model $model_name \
 36 |   --data custom \
 37 |   --features M \
 38 |   --seq_len 96 \
 39 |   --label_len 48 \
 40 |   --pred_len 192 \
 41 |   --e_layers 3 \
 42 |   --factor 3 \
 43 |   --enc_in 862 \
 44 |   --dec_in 862 \
 45 |   --c_out 862 \
 46 |   --d_model 512 \
 47 |   --d_ff 512 \
 48 |   --des 'Exp' \
 49 |   --batch_size 16 \
 50 |   --learning_rate 0.001 \
 51 |   --itr 1
 52 | 
 53 | python3 -u run.py \
 54 |   --task_name long_term_forecast \
 55 |   --is_training 1 \
 56 |   --root_path ./dataset/traffic/ \
 57 |   --data_path traffic.csv \
 58 |   --model_id traffic_96_336 \
 59 |   --model $model_name \
 60 |   --data custom \
 61 |   --features M \
 62 |   --seq_len 96 \
 63 |   --label_len 48 \
 64 |   --pred_len 336 \
 65 |   --e_layers 2 \
 66 |   --factor 3 \
 67 |   --enc_in 862 \
 68 |   --dec_in 862 \
 69 |   --c_out 862 \
 70 |   --d_model 512 \
 71 |   --d_ff 512 \
 72 |   --des 'Exp' \
 73 |   --batch_size 16 \
 74 |   --learning_rate 0.001 \
 75 |   --itr 1
 76 | 
 77 | python3 -u run.py \
 78 |   --task_name long_term_forecast \
 79 |   --is_training 1 \
 80 |   --root_path ./dataset/traffic/ \
 81 |   --data_path traffic.csv \
 82 |   --model_id traffic_96_720 \
 83 |   --model $model_name \
 84 |   --data custom \
 85 |   --features M \
 86 |   --seq_len 96 \
 87 |   --label_len 48 \
 88 |   --pred_len 720 \
 89 |   --e_layers 2 \
 90 |   --factor 3 \
 91 |   --enc_in 862 \
 92 |   --dec_in 862 \
 93 |   --c_out 862 \
 94 |   --d_model 512 \
 95 |   --d_ff 512 \
 96 |   --des 'Exp' \
 97 |   --batch_size 16 \
 98 |   --learning_rate 0.001 \
 99 |   --itr 1
100 | 


--------------------------------------------------------------------------------
/scripts/multivariate/Weather/TimeXer.sh:
--------------------------------------------------------------------------------
 1 | export CUDA_VISIBLE_DEVICES=0
 2 | 
 3 | model_name=TimeXer
 4 | 
 5 | python3 -u run.py \
 6 |   --task_name long_term_forecast \
 7 |   --is_training 1 \
 8 |   --root_path ./dataset/weather/ \
 9 |   --data_path weather.csv \
10 |   --model_id weather_96_96 \
11 |   --model $model_name \
12 |   --data custom \
13 |   --features M \
14 |   --seq_len 96 \
15 |   --label_len 48 \
16 |   --pred_len 96 \
17 |   --e_layers 1 \
18 |   --factor 3 \
19 |   --enc_in 21 \
20 |   --dec_in 21 \
21 |   --c_out 21 \
22 |   --des 'Exp' \
23 |   --d_model 256 \
24 |   --d_ff 512 \
25 |   --batch_size 4 \
26 |   --itr 1
27 | 
28 | python3 -u run.py \
29 |   --task_name long_term_forecast \
30 |   --is_training 1 \
31 |   --root_path ./dataset/weather/ \
32 |   --data_path weather.csv \
33 |   --model_id weather_96_192 \
34 |   --model $model_name \
35 |   --data custom \
36 |   --features M \
37 |   --seq_len 96 \
38 |   --label_len 48 \
39 |   --pred_len 192 \
40 |   --e_layers 3 \
41 |   --factor 3 \
42 |   --enc_in 21 \
43 |   --dec_in 21 \
44 |   --c_out 21 \
45 |   --des 'Exp' \
46 |   --d_model 128 \
47 |   --d_ff 1024 \
48 |   --batch_size 4 \
49 |   --itr 1
50 | 
51 | python3 -u run.py \
52 |   --task_name long_term_forecast \
53 |   --is_training 1 \
54 |   --root_path ./dataset/weather/ \
55 |   --data_path weather.csv \
56 |   --model_id weather_96_336 \
57 |   --model $model_name \
58 |   --data custom \
59 |   --features M \
60 |   --seq_len 96 \
61 |   --label_len 48 \
62 |   --pred_len 336 \
63 |   --e_layers 1 \
64 |   --factor 3 \
65 |   --enc_in 21 \
66 |   --dec_in 21 \
67 |   --c_out 21 \
68 |   --des 'Exp' \
69 |   --d_model 256 \
70 |   --batch_size 4 \
71 |   --itr 1
72 | 
73 | python3 -u run.py \
74 |   --task_name long_term_forecast \
75 |   --is_training 1 \
76 |   --root_path ./dataset/weather/ \
77 |   --data_path weather.csv \
78 |   --model_id weather_96_720 \
79 |   --model $model_name \
80 |   --data custom \
81 |   --features M \
82 |   --seq_len 96 \
83 |   --label_len 48 \
84 |   --pred_len 720 \
85 |   --e_layers 1 \
86 |   --factor 3 \
87 |   --enc_in 21 \
88 |   --dec_in 21 \
89 |   --c_out 21 \
90 |   --des 'Exp' \
91 |   --d_model 128 \
92 |   --batch_size 4 \
93 |   --itr 1
94 | 


--------------------------------------------------------------------------------
/utils/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/thuml/TimeXer/76011909357972bd55a27adba2e1be994d81b327/utils/__init__.py


--------------------------------------------------------------------------------
/utils/losses.py:
--------------------------------------------------------------------------------
 1 | # This source code is provided for the purposes of scientific reproducibility
 2 | # under the following limited license from Element AI Inc. The code is an
 3 | # implementation of the N-BEATS model (Oreshkin et al., N-BEATS: Neural basis
 4 | # expansion analysis for interpretable time series forecasting,
 5 | # https://arxiv.org/abs/1905.10437). The copyright to the source code is
 6 | # licensed under the Creative Commons - Attribution-NonCommercial 4.0
 7 | # International license (CC BY-NC 4.0):
 8 | # https://creativecommons.org/licenses/by-nc/4.0/.  Any commercial use (whether
 9 | # for the benefit of third parties or internally in production) requires an
10 | # explicit license. The subject-matter of the N-BEATS model and associated
11 | # materials are the property of Element AI Inc. and may be subject to patent
12 | # protection. No license to patents is granted hereunder (whether express or
13 | # implied). Copyright © 2020 Element AI Inc. All rights reserved.
14 | 
15 | """
16 | Loss functions for PyTorch.
17 | """
18 | 
19 | import torch as t
20 | import torch.nn as nn
21 | import numpy as np
22 | import pdb
23 | 
24 | 
25 | def divide_no_nan(a, b):
26 |     """
27 |     a/b where the resulting NaN or Inf values are replaced by 0.
28 |     """
29 |     result = a / b
30 |     result[result != result] = .0  # NaN is the only value that is not equal to itself
31 |     result[result.abs() == np.inf] = .0  # covers both +Inf and -Inf
32 |     return result
33 | 
34 | 
35 | class mape_loss(nn.Module):
36 |     def __init__(self):
37 |         super(mape_loss, self).__init__()
38 | 
39 |     def forward(self, insample: t.Tensor, freq: int,
40 |                 forecast: t.Tensor, target: t.Tensor, mask: t.Tensor) -> t.Tensor:
41 |         """
42 |         MAPE loss as defined in: https://en.wikipedia.org/wiki/Mean_absolute_percentage_error
43 | 
44 |         :param forecast: Forecast values. Shape: batch, time
45 |         :param target: Target values. Shape: batch, time
46 |         :param mask: 0/1 mask. Shape: batch, time
47 |         :return: Loss value
48 |         """
49 |         weights = divide_no_nan(mask, target)
50 |         return t.mean(t.abs((forecast - target) * weights))
51 | 
52 | 
53 | class smape_loss(nn.Module):
54 |     def __init__(self):
55 |         super(smape_loss, self).__init__()
56 | 
57 |     def forward(self, insample: t.Tensor, freq: int,
58 |                 forecast: t.Tensor, target: t.Tensor, mask: t.Tensor) -> t.Tensor:
59 |         """
60 |         sMAPE loss as defined in https://robjhyndman.com/hyndsight/smape/ (Makridakis 1993)
61 | 
62 |         :param forecast: Forecast values. Shape: batch, time
63 |         :param target: Target values. Shape: batch, time
64 |         :param mask: 0/1 mask. Shape: batch, time
65 |         :return: Loss value
66 |         """
67 |         return 200 * t.mean(divide_no_nan(t.abs(forecast - target),
68 |                                           t.abs(forecast.data) + t.abs(target.data)) * mask)
69 | 
70 | 
71 | class mase_loss(nn.Module):
72 |     def __init__(self):
73 |         super(mase_loss, self).__init__()
74 | 
75 |     def forward(self, insample: t.Tensor, freq: int,
76 |                 forecast: t.Tensor, target: t.Tensor, mask: t.Tensor) -> t.Tensor:
77 |         """
78 |         MASE loss as defined in "Scaled Errors" https://robjhyndman.com/papers/mase.pdf
79 | 
80 |         :param insample: Insample values. Shape: batch, time_i
81 |         :param freq: Frequency value
82 |         :param forecast: Forecast values. Shape: batch, time_o
83 |         :param target: Target values. Shape: batch, time_o
84 |         :param mask: 0/1 mask. Shape: batch, time_o
85 |         :return: Loss value
86 |         """
87 |         masep = t.mean(t.abs(insample[:, freq:] - insample[:, :-freq]), dim=1)
88 |         masked_masep_inv = divide_no_nan(mask, masep[:, None])
89 |         return t.mean(t.abs(target - forecast) * masked_masep_inv)
90 | 
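
Note: to make the three formulas above easier to check by hand, here is a minimal pure-Python sketch for a single series (batch size 1). It is not part of the repository and does not use PyTorch. The function names `mape`, `smape` and `mase` are illustrative stand-ins for the `forward` methods of the classes above, and the sample numbers are made up for the example.

```python
import math


def divide_no_nan(a, b):
    """Scalar a / b, with NaN or infinite results replaced by 0.0."""
    try:
        result = a / b
    except ZeroDivisionError:
        # Plain Python raises here where a tensor division would yield Inf or NaN.
        return 0.0
    return result if math.isfinite(result) else 0.0


def mape(forecast, target, mask):
    # mean(|(forecast - target) * mask / target|)
    terms = [abs((f - y) * divide_no_nan(m, y))
             for f, y, m in zip(forecast, target, mask)]
    return sum(terms) / len(terms)


def smape(forecast, target, mask):
    # 200 * mean(|forecast - target| / (|forecast| + |target|) * mask)
    terms = [divide_no_nan(abs(f - y), abs(f) + abs(y)) * m
             for f, y, m in zip(forecast, target, mask)]
    return 200 * sum(terms) / len(terms)


def mase(insample, freq, forecast, target, mask):
    # Scale: mean absolute seasonal-naive error on the in-sample history.
    diffs = [abs(insample[i] - insample[i - freq])
             for i in range(freq, len(insample))]
    masep = sum(diffs) / len(diffs)
    terms = [abs(y - f) * divide_no_nan(m, masep)
             for f, y, m in zip(forecast, target, mask)]
    return sum(terms) / len(terms)


if __name__ == "__main__":
    insample = [1.0, 2.0, 4.0, 7.0]  # in-sample steps of 1, 2, 3 give a scale of 2.0
    forecast = [9.0, 12.0]
    target = [10.0, 10.0]
    mask = [1.0, 1.0]
    print("MAPE :", mape(forecast, target, mask))
    print("sMAPE:", smape(forecast, target, mask))
    print("MASE :", mase(insample, 1, forecast, target, mask))
```

With these inputs the absolute errors are 1 and 2 and the in-sample scale is 2.0, so MASE is (0.5 + 1.0) / 2 = 0.75. A zero target with a nonzero mask contributes 0 to MAPE, mirroring the Inf-to-0 replacement in `divide_no_nan`.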


--------------------------------------------------------------------------------
/utils/m4_summary.py:
--------------------------------------------------------------------------------
  1 | # This source code is provided for the purposes of scientific reproducibility
  2 | # under the following limited license from Element AI Inc. The code is an
  3 | # implementation of the N-BEATS model (Oreshkin et al., N-BEATS: Neural basis
  4 | # expansion analysis for interpretable time series forecasting,
  5 | # https://arxiv.org/abs/1905.10437). The copyright to the source code is
  6 | # licensed under the Creative Commons - Attribution-NonCommercial 4.0
  7 | # International license (CC BY-NC 4.0):
  8 | # https://creativecommons.org/licenses/by-nc/4.0/.  Any commercial use (whether
  9 | # for the benefit of third parties or internally in production) requires an
 10 | # explicit license. The subject-matter of the N-BEATS model and associated
 11 | # materials are the property of Element AI Inc. and may be subject to patent
 12 | # protection. No license to patents is granted hereunder (whether express or
 13 | # implied). Copyright 2020 Element AI Inc. All rights reserved.
 14 | 
 15 | """
 16 | M4 Summary
 17 | """
 18 | from collections import OrderedDict
 19 | 
 20 | import numpy as np
 21 | import pandas as pd
 22 | 
 23 | from data_provider.m4 import M4Dataset
 24 | from data_provider.m4 import M4Meta
 25 | import os
 26 | 
 27 | 
 28 | def group_values(values, groups, group_name):
 29 |     return np.array([v[~np.isnan(v)] for v in values[groups == group_name]])
 30 | 
 31 | 
 32 | def mase(forecast, insample, outsample, frequency):
 33 |     return np.mean(np.abs(forecast - outsample)) / np.mean(np.abs(insample[:-frequency] - insample[frequency:]))
 34 | 
 35 | 
 36 | def smape_2(forecast, target):
 37 |     denom = np.abs(target) + np.abs(forecast)
 38 |     # Divide by 1.0 instead of 0.0: when denom is zero the numerator is 0.0 as well.
 39 |     denom[denom == 0.0] = 1.0
 40 |     return 200 * np.abs(forecast - target) / denom
 41 | 
 42 | 
 43 | def mape(forecast, target):
 44 |     denom = np.abs(target)
 45 |     # Divide by 1.0 instead of 0.0 to avoid division by zero (the numerator may still be nonzero when target is 0).
 46 |     denom[denom == 0.0] = 1.0
 47 |     return 100 * np.abs(forecast - target) / denom
 48 | 
 49 | 
 50 | class M4Summary:
 51 |     def __init__(self, file_path, root_path):
 52 |         self.file_path = file_path
 53 |         self.training_set = M4Dataset.load(training=True, dataset_file=root_path)
 54 |         self.test_set = M4Dataset.load(training=False, dataset_file=root_path)
 55 |         self.naive_path = os.path.join(root_path, 'submission-Naive2.csv')
 56 | 
 57 |     def evaluate(self):
 58 |         """
 59 |         Evaluate forecasts using M4 test dataset.
 60 | 
 61 |         Forecasts are read from "<file_path><group_name>_forecast.csv" for each seasonal pattern.
 62 |         :return: sMAPE, OWA, MAPE and MASE grouped by seasonal patterns.
 63 |         """
 64 |         grouped_owa = OrderedDict()
 65 | 
 66 |         naive2_forecasts = pd.read_csv(self.naive_path).values[:, 1:].astype(np.float32)
 67 |         naive2_forecasts = np.array([v[~np.isnan(v)] for v in naive2_forecasts])
 68 | 
 69 |         model_mases = {}
 70 |         naive2_smapes = {}
 71 |         naive2_mases = {}
 72 |         grouped_smapes = {}
 73 |         grouped_mapes = {}
 74 |         for group_name in M4Meta.seasonal_patterns:
 75 |             file_name = self.file_path + group_name + "_forecast.csv"
 76 |             # pd.read_csv raises FileNotFoundError if a group's forecast file is missing
 77 |             model_forecast = pd.read_csv(file_name).values
 78 | 
 79 |             naive2_forecast = group_values(naive2_forecasts, self.test_set.groups, group_name)
 80 |             target = group_values(self.test_set.values, self.test_set.groups, group_name)
 81 |             # all timeseries within group have same frequency
 82 |             frequency = self.training_set.frequencies[self.test_set.groups == group_name][0]
 83 |             insample = group_values(self.training_set.values, self.test_set.groups, group_name)
 84 | 
 85 |             model_mases[group_name] = np.mean([mase(forecast=model_forecast[i],
 86 |                                                     insample=insample[i],
 87 |                                                     outsample=target[i],
 88 |                                                     frequency=frequency) for i in range(len(model_forecast))])
 89 |             naive2_mases[group_name] = np.mean([mase(forecast=naive2_forecast[i],
 90 |                                                      insample=insample[i],
 91 |                                                      outsample=target[i],
 92 |                                                      frequency=frequency) for i in range(len(model_forecast))])
 93 | 
 94 |             naive2_smapes[group_name] = np.mean(smape_2(naive2_forecast, target))
 95 |             grouped_smapes[group_name] = np.mean(smape_2(forecast=model_forecast, target=target))
 96 |             grouped_mapes[group_name] = np.mean(mape(forecast=model_forecast, target=target))
 97 | 
 98 |         grouped_smapes = self.summarize_groups(grouped_smapes)
 99 |         grouped_mapes = self.summarize_groups(grouped_mapes)
100 |         grouped_model_mases = self.summarize_groups(model_mases)
101 |         grouped_naive2_smapes = self.summarize_groups(naive2_smapes)
102 |         grouped_naive2_mases = self.summarize_groups(naive2_mases)
103 |         for k in grouped_model_mases.keys():
104 |             grouped_owa[k] = (grouped_model_mases[k] / grouped_naive2_mases[k] +
105 |                               grouped_smapes[k] / grouped_naive2_smapes[k]) / 2
106 | 
107 |         def round_all(d):
108 |             return dict(map(lambda kv: (kv[0], np.round(kv[1], 3)), d.items()))
109 | 
110 |         return round_all(grouped_smapes), round_all(grouped_owa), round_all(grouped_mapes), round_all(
111 |             grouped_model_mases)
112 | 
113 |     def summarize_groups(self, scores):
114 |         """
115 |         Re-group scores respecting M4 rules.
116 |         :param scores: Scores per group.
117 |         :return: Grouped scores.
118 |         """
119 |         scores_summary = OrderedDict()
120 | 
121 |         def group_count(group_name):
122 |             return len(np.where(self.test_set.groups == group_name)[0])
123 | 
124 |         weighted_score = {}
125 |         for g in ['Yearly', 'Quarterly', 'Monthly']:
126 |             weighted_score[g] = scores[g] * group_count(g)
127 |             scores_summary[g] = scores[g]
128 | 
129 |         others_score = 0
130 |         others_count = 0
131 |         for g in ['Weekly', 'Daily', 'Hourly']:
132 |             others_score += scores[g] * group_count(g)
133 |             others_count += group_count(g)
134 |         weighted_score['Others'] = others_score
135 |         scores_summary['Others'] = others_score / others_count
136 | 
137 |         average = np.sum(list(weighted_score.values())) / len(self.test_set.groups)
138 |         scores_summary['Average'] = average
139 | 
140 |         return scores_summary
141 | 
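
Worked example (not part of the repository): the OWA score computed in `M4Summary.evaluate` averages the model's MASE and sMAPE after dividing each by the Naive2 baseline value. The sketch below reproduces that arithmetic on a single toy series. The data, the helper names `smape_mean` and `owa`, and the use of a repeat-last-value forecast as a stand-in for Naive2 are all assumptions made for illustration.

```python
import numpy as np


def smape_mean(forecast, target):
    # Mean sMAPE in percent; a zero denominator is replaced by 1.0
    # (the numerator is zero in that case as well).
    denom = np.abs(target) + np.abs(forecast)
    denom = np.where(denom == 0.0, 1.0, denom)
    return 200 * np.mean(np.abs(forecast - target) / denom)


def mase(forecast, insample, outsample, frequency):
    # Forecast MAE scaled by the in-sample seasonal-naive MAE.
    scale = np.mean(np.abs(insample[frequency:] - insample[:-frequency]))
    return np.mean(np.abs(forecast - outsample)) / scale


def owa(model_smape, model_mase, naive_smape, naive_mase):
    # Average of the two metrics, each relative to the baseline.
    return (model_mase / naive_mase + model_smape / naive_smape) / 2


# Toy data (hypothetical): four in-sample points, two-step horizon.
insample = np.array([10.0, 12.0, 11.0, 13.0])
outsample = np.array([14.0, 15.0])
model_forecast = np.array([13.0, 15.0])
naive_forecast = np.array([13.0, 13.0])  # repeat last value, stand-in for Naive2

model_mase = mase(model_forecast, insample, outsample, frequency=1)
naive_mase = mase(naive_forecast, insample, outsample, frequency=1)
model_smape = smape_mean(model_forecast, outsample)
naive_smape = smape_mean(naive_forecast, outsample)

owa_value = owa(model_smape, model_mase, naive_smape, naive_mase)
print(round(owa_value, 3))  # about 0.337; values below 1 beat the baseline
```

A forecast identical to the baseline gives an OWA of exactly 1, which is why the score is read as "relative to Naive2".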


--------------------------------------------------------------------------------
/utils/metrics.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | 
 3 | 
 4 | def RSE(pred, true):
 5 |     return np.sqrt(np.sum((true - pred) ** 2)) / np.sqrt(np.sum((true - true.mean()) ** 2))
 6 | 
 7 | 
 8 | def CORR(pred, true):
 9 |     u = ((true - true.mean(0)) * (pred - pred.mean(0))).sum(0)
10 |     d = np.sqrt(((true - true.mean(0)) ** 2).sum(0) * ((pred - pred.mean(0)) ** 2).sum(0))
11 |     return (u / d).mean(-1)
12 | 
13 | 
14 | def MAE(pred, true):
15 |     return np.mean(np.abs(pred - true))
16 | 
17 | 
18 | def MSE(pred, true):
19 |     return np.mean((pred - true) ** 2)
20 | 
21 | 
22 | def RMSE(pred, true):
23 |     return np.sqrt(MSE(pred, true))
24 | 
25 | 
26 | def MAPE(pred, true):
27 |     return np.mean(np.abs((pred - true) / true))
28 | 
29 | 
30 | def MSPE(pred, true):
31 |     return np.mean(np.square((pred - true) / true))
32 | 
33 | # Standard deviation
34 | def STD(pred, true):
35 |     return np.std(pred - true)
36 | 
37 | def metric(pred, true):
38 |     mae = MAE(pred, true)
39 |     mse = MSE(pred, true)
40 |     rmse = RMSE(pred, true)
41 |     mape = MAPE(pred, true)
42 |     mspe = MSPE(pred, true)
43 | 
44 |     return mae, mse, rmse, mape, mspe
45 | 
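
Worked example (not part of the repository): `CORR` reads as a correlation coefficient computed per column along the first axis and then averaged. As a point of comparison, the sketch below implements the standard Pearson coefficient in that column-wise form and checks it against `np.corrcoef`. The function name `pearson_corr` and the random test data are assumptions made for illustration.

```python
import numpy as np


def pearson_corr(pred, true):
    # Center each column, then divide the covariance sum by the product
    # of the two root sums of squares (standard Pearson form).
    t = true - true.mean(0)
    p = pred - pred.mean(0)
    u = (t * p).sum(0)
    d = np.sqrt((t ** 2).sum(0) * (p ** 2).sum(0))
    return (u / d).mean(-1)


# Hypothetical data: 50 time steps, 3 variables, noisy predictions.
rng = np.random.default_rng(0)
true = rng.normal(size=(50, 3))
pred = true + 0.1 * rng.normal(size=(50, 3))

# Reference value: per-column np.corrcoef, averaged over columns.
expected = np.mean(
    [np.corrcoef(pred[:, j], true[:, j])[0, 1] for j in range(true.shape[1])]
)
corr_value = pearson_corr(pred, true)
print(np.isclose(corr_value, expected))
```

Note that a constant column makes the denominator zero, so callers may want to guard against that case before averaging.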


--------------------------------------------------------------------------------
/utils/print_args.py:
--------------------------------------------------------------------------------
 1 | def print_args(args):
 2 |     print("\033[1m" + "Basic Config" + "\033[0m")
 3 |     print(f'  {"Task Name:":<20}{args.task_name:<20}{"Is Training:":<20}{args.is_training:<20}')
 4 |     print(f'  {"Model ID:":<20}{args.model_id:<20}{"Model:":<20}{args.model:<20}')
 5 |     print()
 6 | 
 7 |     print("\033[1m" + "Data Loader" + "\033[0m")
 8 |     print(f'  {"Data:":<20}{args.data:<20}{"Root Path:":<20}{args.root_path:<20}')
 9 |     print(f'  {"Data Path:":<20}{args.data_path:<20}{"Features:":<20}{args.features:<20}')
10 |     print(f'  {"Target:":<20}{args.target:<20}{"Freq:":<20}{args.freq:<20}')
11 |     print(f'  {"Checkpoints:":<20}{args.checkpoints:<20}')
12 |     print()
13 | 
14 |     if args.task_name in ['long_term_forecast', 'short_term_forecast']:
15 |         print("\033[1m" + "Forecasting Task" + "\033[0m")
16 |         print(f'  {"Seq Len:":<20}{args.seq_len:<20}{"Label Len:":<20}{args.label_len:<20}')
17 |         print(f'  {"Pred Len:":<20}{args.pred_len:<20}{"Seasonal Patterns:":<20}{args.seasonal_patterns:<20}')
18 |         print(f'  {"Inverse:":<20}{args.inverse:<20}')
19 |         print()
20 | 
21 |     if args.task_name == 'imputation':
22 |         print("\033[1m" + "Imputation Task" + "\033[0m")
23 |         print(f'  {"Mask Rate:":<20}{args.mask_rate:<20}')
24 |         print()
25 | 
26 |     if args.task_name == 'anomaly_detection':
27 |         print("\033[1m" + "Anomaly Detection Task" + "\033[0m")
28 |         print(f'  {"Anomaly Ratio:":<20}{args.anomaly_ratio:<20}')
29 |         print()
30 | 
31 |     print("\033[1m" + "Model Parameters" + "\033[0m")
32 |     print(f'  {"Top k:":<20}{args.top_k:<20}{"Num Kernels:":<20}{args.num_kernels:<20}')
33 |     print(f'  {"Enc In:":<20}{args.enc_in:<20}{"Dec In:":<20}{args.dec_in:<20}')
34 |     print(f'  {"C Out:":<20}{args.c_out:<20}{"d model:":<20}{args.d_model:<20}')
35 |     print(f'  {"n heads:":<20}{args.n_heads:<20}{"e layers:":<20}{args.e_layers:<20}')
36 |     print(f'  {"d layers:":<20}{args.d_layers:<20}{"d FF:":<20}{args.d_ff:<20}')
37 |     print(f'  {"Moving Avg:":<20}{args.moving_avg:<20}{"Factor:":<20}{args.factor:<20}')
38 |     print(f'  {"Distil:":<20}{args.distil:<20}{"Dropout:":<20}{args.dropout:<20}')
39 |     print(f'  {"Embed:":<20}{args.embed:<20}{"Activation:":<20}{args.activation:<20}')
40 |     print()
41 | 
42 |     print("\033[1m" + "Run Parameters" + "\033[0m")
43 |     print(f'  {"Num Workers:":<20}{args.num_workers:<20}{"Itr:":<20}{args.itr:<20}')
44 |     print(f'  {"Train Epochs:":<20}{args.train_epochs:<20}{"Batch Size:":<20}{args.batch_size:<20}')
45 |     print(f'  {"Patience:":<20}{args.patience:<20}{"Learning Rate:":<20}{args.learning_rate:<20}')
46 |     print(f'  {"Des:":<20}{args.des:<20}{"Loss:":<20}{args.loss:<20}')
47 |     print(f'  {"Lradj:":<20}{args.lradj:<20}{"Use Amp:":<20}{args.use_amp:<20}')
48 |     print()
49 | 
50 |     print("\033[1m" + "GPU" + "\033[0m")
51 |     print(f'  {"Use GPU:":<20}{args.use_gpu:<20}{"GPU:":<20}{args.gpu:<20}')
52 |     print(f'  {"Use Multi GPU:":<20}{args.use_multi_gpu:<20}{"Devices:":<20}{args.devices:<20}')
53 |     print()
54 | 
55 |     print("\033[1m" + "De-stationary Projector Params" + "\033[0m")
56 |     p_hidden_dims_str = ', '.join(map(str, args.p_hidden_dims))
57 |     print(f'  {"P Hidden Dims:":<20}{p_hidden_dims_str:<20}{"P Hidden Layers:":<20}{args.p_hidden_layers:<20}')
58 |     print()


--------------------------------------------------------------------------------
/utils/timefeatures.py:
--------------------------------------------------------------------------------
  1 | # From: gluonts/src/gluonts/time_feature/_base.py
  2 | # Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
  3 | #
  4 | # Licensed under the Apache License, Version 2.0 (the "License").
  5 | # You may not use this file except in compliance with the License.
  6 | # A copy of the License is located at
  7 | #
  8 | #     http://www.apache.org/licenses/LICENSE-2.0
  9 | #
 10 | # or in the "license" file accompanying this file. This file is distributed
 11 | # on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
 12 | # express or implied. See the License for the specific language governing
 13 | # permissions and limitations under the License.
 14 | 
 15 | from typing import List
 16 | 
 17 | import numpy as np
 18 | import pandas as pd
 19 | from pandas.tseries import offsets
 20 | from pandas.tseries.frequencies import to_offset
 21 | 
 22 | 
 23 | class TimeFeature:
 24 |     def __init__(self):
 25 |         pass
 26 | 
 27 |     def __call__(self, index: pd.DatetimeIndex) -> np.ndarray:
 28 |         pass
 29 | 
 30 |     def __repr__(self):
 31 |         return self.__class__.__name__ + "()"
 32 | 
 33 | 
 34 | class SecondOfMinute(TimeFeature):
 35 |     """Second of minute encoded as value between [-0.5, 0.5]"""
 36 | 
 37 |     def __call__(self, index: pd.DatetimeIndex) -> np.ndarray:
 38 |         return index.second / 59.0 - 0.5
 39 | 
 40 | 
 41 | class MinuteOfHour(TimeFeature):
 42 |     """Minute of hour encoded as value between [-0.5, 0.5]"""
 43 | 
 44 |     def __call__(self, index: pd.DatetimeIndex) -> np.ndarray:
 45 |         return index.minute / 59.0 - 0.5
 46 | 
 47 | 
 48 | class HourOfDay(TimeFeature):
 49 |     """Hour of day encoded as value between [-0.5, 0.5]"""
 50 | 
 51 |     def __call__(self, index: pd.DatetimeIndex) -> np.ndarray:
 52 |         return index.hour / 23.0 - 0.5
 53 | 
 54 | 
 55 | class DayOfWeek(TimeFeature):
 56 |     """Day of week encoded as value between [-0.5, 0.5]"""
 57 | 
 58 |     def __call__(self, index: pd.DatetimeIndex) -> np.ndarray:
 59 |         return index.dayofweek / 6.0 - 0.5
 60 | 
 61 | 
 62 | class DayOfMonth(TimeFeature):
 63 |     """Day of month encoded as value between [-0.5, 0.5]"""
 64 | 
 65 |     def __call__(self, index: pd.DatetimeIndex) -> np.ndarray:
 66 |         return (index.day - 1) / 30.0 - 0.5
 67 | 
 68 | 
 69 | class DayOfYear(TimeFeature):
 70 |     """Day of year encoded as value between [-0.5, 0.5]"""
 71 | 
 72 |     def __call__(self, index: pd.DatetimeIndex) -> np.ndarray:
 73 |         return (index.dayofyear - 1) / 365.0 - 0.5
 74 | 
 75 | 
 76 | class MonthOfYear(TimeFeature):
 77 |     """Month of year encoded as value between [-0.5, 0.5]"""
 78 | 
 79 |     def __call__(self, index: pd.DatetimeIndex) -> np.ndarray:
 80 |         return (index.month - 1) / 11.0 - 0.5
 81 | 
 82 | 
 83 | class WeekOfYear(TimeFeature):
 84 |     """Week of year encoded as value between [-0.5, 0.5]"""
 85 | 
 86 |     def __call__(self, index: pd.DatetimeIndex) -> np.ndarray:
 87 |         return (index.isocalendar().week - 1) / 52.0 - 0.5
 88 | 
 89 | 
 90 | def time_features_from_frequency_str(freq_str: str) -> List[TimeFeature]:
 91 |     """
 92 |     Returns a list of time features that will be appropriate for the given frequency string.
 93 |     Parameters
 94 |     ----------
 95 |     freq_str
 96 |         Frequency string of the form [multiple][granularity] such as "12H", "5min", "1D" etc.
 97 |     """
 98 | 
 99 |     features_by_offsets = {
100 |         offsets.YearEnd: [],
101 |         offsets.QuarterEnd: [MonthOfYear],
102 |         offsets.MonthEnd: [MonthOfYear],
103 |         offsets.Week: [DayOfMonth, WeekOfYear],
104 |         offsets.Day: [DayOfWeek, DayOfMonth, DayOfYear],
105 |         offsets.BusinessDay: [DayOfWeek, DayOfMonth, DayOfYear],
106 |         offsets.Hour: [HourOfDay, DayOfWeek, DayOfMonth, DayOfYear],
107 |         offsets.Minute: [
108 |             MinuteOfHour,
109 |             HourOfDay,
110 |             DayOfWeek,
111 |             DayOfMonth,
112 |             DayOfYear,
113 |         ],
114 |         offsets.Second: [
115 |             SecondOfMinute,
116 |             MinuteOfHour,
117 |             HourOfDay,
118 |             DayOfWeek,
119 |             DayOfMonth,
120 |             DayOfYear,
121 |         ],
122 |     }
123 | 
124 |     offset = to_offset(freq_str)
125 | 
126 |     for offset_type, feature_classes in features_by_offsets.items():
127 |         if isinstance(offset, offset_type):
128 |             return [cls() for cls in feature_classes]
129 | 
130 |     supported_freq_msg = f"""
131 |     Unsupported frequency {freq_str}
132 |     The following frequencies are supported:
133 |         Y   - yearly
134 |             alias: A
135 |         M   - monthly
136 |         W   - weekly
137 |         D   - daily
138 |         B   - business days
139 |         H   - hourly
140 |         T   - minutely
141 |             alias: min
142 |         S   - secondly
143 |     """
144 |     raise RuntimeError(supported_freq_msg)
145 | 
146 | 
147 | def time_features(dates, freq='h'):
148 |     return np.vstack([feat(dates) for feat in time_features_from_frequency_str(freq)])
149 | 
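
Worked example (not part of the repository): each `TimeFeature` maps a calendar field linearly onto [-0.5, 0.5], and `time_features` stacks the results into an array of shape (num_features, num_timestamps). The sketch below reproduces two of those encodings directly with pandas so that it runs without importing this module. The timestamps are hypothetical.

```python
import numpy as np
import pandas as pd

# Four hypothetical timestamps on Monday, 2024-01-01.
index = pd.DatetimeIndex(
    [
        "2024-01-01 00:00",
        "2024-01-01 06:00",
        "2024-01-01 12:00",
        "2024-01-01 23:00",
    ]
)

# Same formulas as HourOfDay and DayOfWeek above.
hour_of_day = np.asarray(index.hour, dtype=float) / 23.0 - 0.5
day_of_week = np.asarray(index.dayofweek, dtype=float) / 6.0 - 0.5

# One row per feature, one column per timestamp.
features = np.vstack([hour_of_day, day_of_week])
print(features.shape)  # (2, 4)
```

Hour 0 maps to -0.5 and hour 23 to 0.5; Monday has `dayofweek == 0`, so the second row is -0.5 throughout.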


--------------------------------------------------------------------------------
/utils/tools.py:
--------------------------------------------------------------------------------
  1 | import collections
  2 | import os
  3 | import sys
  4 | 
  5 | import numpy as np
  6 | import torch
  7 | import matplotlib.pyplot as plt
  8 | import pandas as pd
  9 | from torch.utils.data._utils.collate import np_str_obj_array_pattern, default_collate_err_msg_format
 10 | 
 11 | plt.switch_backend('agg')
 12 | 
 13 | 
 14 | def adjust_learning_rate(optimizer, epoch, args):
 15 |     # lr = args.learning_rate * (0.2 ** (epoch // 2))
 16 |     if args.lradj == 'type1':
 17 |         lr_adjust = {epoch: args.learning_rate * (0.5 ** ((epoch - 1) // 1))}
 18 |     elif args.lradj == 'type2':
 19 |         lr_adjust = {
 20 |             2: 5e-5, 4: 1e-5, 6: 5e-6, 8: 1e-6,
 21 |             10: 5e-7, 15: 1e-7, 20: 5e-8
 22 |         }
 23 |     if epoch in lr_adjust.keys():
 24 |         lr = lr_adjust[epoch]
 25 |         for param_group in optimizer.param_groups:
 26 |             param_group['lr'] = lr
 27 |         print('Updating learning rate to {}'.format(lr))
 28 | 
 29 | 
 30 | class EarlyStopping:
 31 |     def __init__(self, patience=7, verbose=False, delta=0):
 32 |         self.patience = patience
 33 |         self.verbose = verbose
 34 |         self.counter = 0
 35 |         self.best_score = None
 36 |         self.early_stop = False
 37 |         self.val_loss_min = np.inf
 38 |         self.delta = delta
 39 | 
 40 |     def __call__(self, val_loss, model, path):
 41 |         score = -val_loss
 42 |         if self.best_score is None:
 43 |             self.best_score = score
 44 |             self.save_checkpoint(val_loss, model, path)
 45 |         elif score < self.best_score + self.delta:
 46 |             self.counter += 1
 47 |             print(f'EarlyStopping counter: {self.counter} out of {self.patience}')
 48 |             if self.counter >= self.patience:
 49 |                 self.early_stop = True
 50 |         else:
 51 |             self.best_score = score
 52 |             self.save_checkpoint(val_loss, model, path)
 53 |             self.counter = 0
 54 | 
 55 |     def save_checkpoint(self, val_loss, model, path):
 56 |         if self.verbose:
 57 |             print(f'Validation loss decreased ({self.val_loss_min:.6f} --> {val_loss:.6f}).  Saving model ...')
 58 |         torch.save(model.state_dict(), path + '/' + 'checkpoint.pth')
 59 |         self.val_loss_min = val_loss
 60 | 
 61 | 
 62 | class dotdict(dict):
 63 |     """dot.notation access to dictionary attributes"""
 64 |     __getattr__ = dict.get
 65 |     __setattr__ = dict.__setitem__
 66 |     __delattr__ = dict.__delitem__
 67 | 
 68 | 
 69 | class StandardScaler():
 70 |     def __init__(self, mean, std):
 71 |         self.mean = mean
 72 |         self.std = std
 73 | 
 74 |     def transform(self, data):
 75 |         return (data - self.mean) / self.std
 76 | 
 77 |     def inverse_transform(self, data):
 78 |         return (data * self.std) + self.mean
 79 | 
 80 | 
 81 | def visual(true, preds=None, name='./pic/test.pdf'):
 82 |     """
 83 |     Results visualization
 84 |     """
 85 |     plt.figure()
 86 |     plt.plot(true, label='GroundTruth', linewidth=2)
 87 |     if preds is not None:
 88 |         plt.plot(preds, label='Prediction', linewidth=2)
 89 |     plt.legend()
 90 |     plt.savefig(name, bbox_inches='tight')
 91 | 
 92 | 
 93 | def adjustment(gt, pred):
 94 |     anomaly_state = False
 95 |     for i in range(len(gt)):
 96 |         if gt[i] == 1 and pred[i] == 1 and not anomaly_state:
 97 |             anomaly_state = True
 98 |             for j in range(i, 0, -1):
 99 |                 if gt[j] == 0:
100 |                     break
101 |                 else:
102 |                     if pred[j] == 0:
103 |                         pred[j] = 1
104 |             for j in range(i, len(gt)):
105 |                 if gt[j] == 0:
106 |                     break
107 |                 else:
108 |                     if pred[j] == 0:
109 |                         pred[j] = 1
110 |         elif gt[i] == 0:
111 |             anomaly_state = False
112 |         if anomaly_state:
113 |             pred[i] = 1
114 |     return gt, pred
115 | 
116 | 
117 | def cal_accuracy(y_pred, y_true):
118 |     return np.mean(y_pred == y_true)
119 | 
120 | 
121 | def custom_collate(batch):
122 |     r"""source: pytorch 1.9.0, only one modification to original code """
123 | 
124 |     elem = batch[0]
125 |     elem_type = type(elem)
126 |     if isinstance(elem, torch.Tensor):
127 |         out = None
128 |         if torch.utils.data.get_worker_info() is not None:
129 |             # If we're in a background process, concatenate directly into a
130 |             # shared memory tensor to avoid an extra copy
131 |             numel = sum([x.numel() for x in batch])
132 |             storage = elem.storage()._new_shared(numel)
133 |             out = elem.new(storage)
134 |         return torch.stack(batch, 0, out=out)
135 |     elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \
136 |             and elem_type.__name__ != 'string_':
137 |         if elem_type.__name__ == 'ndarray' or elem_type.__name__ == 'memmap':
138 |             # array of string classes and object
139 |             if np_str_obj_array_pattern.search(elem.dtype.str) is not None:
140 |                 raise TypeError(default_collate_err_msg_format.format(elem.dtype))
141 | 
142 |             return custom_collate([torch.as_tensor(b) for b in batch])
143 |         elif elem.shape == ():  # scalars
144 |             return torch.as_tensor(batch)
145 |     elif isinstance(elem, float):
146 |         return torch.tensor(batch, dtype=torch.float64)
147 |     elif isinstance(elem, int):
148 |         return torch.tensor(batch)
149 |     elif isinstance(elem, str):
150 |         return batch
151 |     elif isinstance(elem, collections.abc.Mapping):
152 |         return {key: custom_collate([d[key] for d in batch]) for key in elem}
153 |     elif isinstance(elem, tuple) and hasattr(elem, '_fields'):  # namedtuple
154 |         return elem_type(*(custom_collate(samples) for samples in zip(*batch)))
155 |     elif isinstance(elem, collections.abc.Sequence):
156 |         # check to make sure that the elements in batch have consistent size
157 |         it = iter(batch)
158 |         elem_size = len(next(it))
159 |         if not all(len(elem) == elem_size for elem in it):
160 |             raise RuntimeError('each element in list of batch should be of equal size')
161 |         transposed = zip(*batch)
162 |         return [custom_collate(samples) for samples in transposed]
163 | 
164 |     raise TypeError(default_collate_err_msg_format.format(elem_type))
165 | 
166 | class HiddenPrints:
167 |     def __init__(self, rank):
168 |         # If rank is None we are on a single machine with a single GPU, so printing need not be hidden; set rank to 0
169 |         if rank is None:
170 |             rank = 0
171 |         self.rank = rank
172 |     def __enter__(self):
173 |         if self.rank == 0:
174 |             return
175 |         self._original_stdout = sys.stdout
176 |         sys.stdout = open(os.devnull, 'w')
177 | 
178 |     def __exit__(self, exc_type, exc_val, exc_tb):
179 |         if self.rank == 0:
180 |             return
181 |         sys.stdout.close()
182 |         sys.stdout = self._original_stdout
183 | 
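
Worked example (not part of the repository): `adjustment` applies the point-adjustment convention used in anomaly detection, where a ground-truth anomaly segment counts as fully detected once any point inside it is predicted anomalous. The sketch below is a simplified segment-wise version of that idea, not the function above; the name `point_adjust` and the toy labels are assumptions made for illustration, and it returns a copy instead of modifying `pred` in place.

```python
import numpy as np


def point_adjust(gt, pred):
    # For every contiguous run of 1s in gt, mark the whole run as
    # detected if at least one prediction inside the run is 1.
    gt = np.asarray(gt)
    adjusted = np.asarray(pred).copy()
    start = None
    for i in range(len(gt) + 1):
        inside = i < len(gt) and gt[i] == 1
        if inside and start is None:
            start = i  # a new ground-truth segment begins
        elif not inside and start is not None:
            if adjusted[start:i].any():
                adjusted[start:i] = 1
            start = None  # segment closed
    return adjusted


# Toy labels (hypothetical): two anomaly segments, only the first is hit.
gt = [0, 1, 1, 1, 0, 0, 1, 1]
pred = [0, 0, 1, 0, 0, 0, 0, 0]
adjusted = point_adjust(gt, pred)
print(adjusted.tolist())  # [0, 1, 1, 1, 0, 0, 0, 0]
```

The first segment is filled in because one of its points was flagged; the second stays undetected. Predictions outside ground-truth segments are left unchanged.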


--------------------------------------------------------------------------------