├── .gitignore
├── README.md
├── __init__.py
├── archive
│   ├── __init__.py
│   ├── io_utils.py
│   └── load_usr_dataset.py
├── baselines
│   ├── __init__.py
│   ├── evaluate_baselines.py
│   ├── feature_based.py
│   └── show_baseline_results.py
├── config.py
├── dataset
│   └── UCRArchive_2018
│       ├── Earthquakes
│       │   ├── Earthquakes_TEST.arff
│       │   ├── Earthquakes_TEST.tsv
│       │   ├── Earthquakes_TRAIN.arff
│       │   ├── Earthquakes_TRAIN.tsv
│       │   ├── README.md
│       │   └── desktop.ini
│       ├── Strawberry
│       │   ├── README.md
│       │   ├── Strawberry_TEST.tsv
│       │   ├── Strawberry_TRAIN.tsv
│       │   └── desktop.ini
│       └── WormsTwoClass
│           ├── README.md
│           ├── WormsTwoClass_TEST.tsv
│           ├── WormsTwoClass_TRAIN.tsv
│           └── desktop.ini
├── docs
│   ├── README.md
│   ├── _config.yml
│   ├── _layouts
│   │   └── default.html
│   ├── exp.jpg
│   ├── motiv.jpg
│   └── vis.jpg
├── evaluate_paras.py
├── requirements.txt
├── scripts
│   ├── cache
│   │   ├── ucr-Earthquakes_embedding_t2g_model.cache
│   │   ├── ucr-Earthquakes_greedy_50_24_shapelets.cache
│   │   ├── ucr-Strawberry_embedding_t2g_model.cache
│   │   ├── ucr-Strawberry_greedy_50_15_shapelets.cache
│   │   ├── ucr-WormsTwoClass_embedding_t2g_model.cache
│   │   └── ucr-WormsTwoClass_greedy_20_30_shapelets.cache
│   ├── run.py
│   └── std_test.py
├── setup.py
└── time2graph
    ├── __init__.py
    ├── core
    │   ├── __init__.py
    │   ├── distance_utils.py
    │   ├── model.py
    │   ├── model_embeds.py
    │   ├── model_sequence.py
    │   ├── model_utils.py
    │   ├── rnn
    │   │   ├── __init__.py
    │   │   ├── deep_models.py
    │   │   └── deep_utils.py
    │   ├── shapelet_embedding.py
    │   ├── shapelet_utils.py
    │   └── time_aware_shapelets.py
    └── utils
        ├── __init__.py
        ├── base_utils.py
        └── mp_utils.py
/.gitignore:
--------------------------------------------------------------------------------
1 | # Byte-compiled / optimized / DLL files
2 | __pycache__/
3 | *.py[cod]
4 | *$py.class
5 |
6 | # Distribution / packaging
7 | .Python
8 | env/
9 | build/
10 | develop-eggs/
11 | dist/
12 | downloads/
13 | eggs/
14 | .eggs/
15 | lib/
16 | lib64/
17 | parts/
18 | sdist/
19 | var/
20 | wheels/
21 | *.egg-info/
22 | .installed.cfg
23 | *.egg
24 |
25 | # PyInstaller
26 | # Usually these files are written by a python script from a template
27 | # before PyInstaller builds the exe, so as to inject date/other infos into it.
28 | *.manifest
29 | *.spec
30 |
31 | # Installer logs
32 | pip-log.txt
33 | pip-delete-this-directory.txt
34 |
35 | # Unit test / coverage reports
36 | htmlcov/
37 | .tox/
38 | .coverage
39 | .coverage.*
40 | .cache
41 | nosetests.xml
42 | coverage.xml
43 | *.cover
44 | .hypothesis/
45 |
46 | # Translations
47 | *.mo
48 | *.pot
49 |
50 | # Django stuff:
51 | *.log
52 | local_settings.py
53 |
54 | # Flask stuff:
55 | instance/
56 | .webassets-cache
57 |
58 | # Scrapy stuff:
59 | .scrapy
60 |
61 | # Sphinx documentation
62 | docs/_build/
63 |
64 | # Jupyter Notebook
65 | .ipynb_checkpoints
66 |
67 | # pyenv
68 | .python-version
69 |
70 | # celery beat schedule file
71 | celerybeat-schedule
72 |
73 | # dotenv
74 | .env
75 |
76 | # virtualenv
77 | .venv
78 | venv/
79 | ENV/
80 |
81 | # Spyder project settings
82 | .spyderproject
83 | .spyproject
84 |
85 | # Rope project settings
86 | .ropeproject
87 |
88 | # mkdocs documentation
89 | /site
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Time2Graph
2 | This project implements the Time2Graph model[1], which focuses on time series modeling with dynamic shapelets.
3 |
4 | ## Quick Links
5 |
6 | - [Building and Testing](#building-and-testing)
7 | - [Usage](#usage)
8 | - [Performance](#performance)
9 | - [Reference](#reference)
10 |
11 | ## Building and Testing
12 |
13 | This project is implemented primarily in Python 3.6, with several dependencies listed below. We have tested the whole framework on Ubuntu 16.04.5 LTS with kernel 4.4.0, and it is expected to easily build and run under a regular Unix-like system.
14 |
15 | ### Dependencies
16 |
17 | - [Python 3.6](https://www.python.org).
18 | Version 3.6.5 has been tested. Higher versions are expected to be compatible with the current implementation, while there may be syntax errors or conflicts under Python 2.x.
19 |
20 | - [DeepWalk](https://github.com/phanein/deepwalk)
21 | We use a modified version of the original *deepwalk* implementation to support directed and weighted graphs. The source code with these minor modifications can be found at [weighted_deepwalk](https://github.com/petecheng/weighted_deepwalk).
22 |
23 | - [PyTorch](https://pytorch.org).
24 |
25 | Version 0.4.1 has been tested. You can find installation instructions [here](https://pytorch.org/get-started/locally/). Note that GPU support is **ENCOURAGED** as it greatly boosts training efficiency.
26 |
27 | - [XGBoost](https://github.com/dmlc/xgboost)
28 |
29 | Version 0.80 has been tested. You can find installation instructions [here](https://xgboost.readthedocs.io/en/latest/build.html).
30 |
31 | - [Other Python modules](https://pypi.python.org). Some other Python module dependencies are listed in ```requirements.txt```, which can be easily installed with pip:
32 |
33 | ```bash
34 | pip install -r requirements.txt
35 | ```
36 |
37 | Although not all dependencies are mentioned in the installation instruction links above, you can find most of the libraries in the package repository of a regular Linux distribution.
38 |
39 | ### Building the Project
40 |
41 | Before building the project, we recommend switching the working directory to the project root. Assuming the project root is at ````, run the command
42 |
43 | ```bash
44 | cd
45 | ```
46 |
47 | Note that we assume ```` is your working directory in all the commands presented in the rest of this documentation. Then make sure that the environment variable ``PYTHONPATH`` is properly set, by running the following command (on a Linux distribution):
48 |
49 | ```bash
50 | export PYTHONPATH=`readlink -f ./`
51 | ```
52 |
53 | ### Testing the Project (Reproducibility)
54 |
55 | A test script ```scripts/std_test.py``` is available for reproducibility on the benchmark datasets:
56 |
57 | ```markdown
58 | python . -h
59 |
60 | usage: . [-h] [--dataset] [--n_splits] [--model_cache] [--shapelet_cache] [--gpu_enable]
61 |
62 | optional arguments:
63 | -h, --help show this help message and exit
64 | --dataset str, one of `ucr-Earthquakes`, `ucr-WormsTwoClass` and `ucr-Strawberry`,
65 | for which we have set the optimal parameters after fine-tuning.
66 | (default: `ucr-Earthquakes`)
67 | --n_splits int, number of splits in cross-validation. (default: 5)
68 | --model_cache bool, whether to use a pretrained model. (default: False)
69 | --shapelet_cache bool, whether to use a pretrained shapelet set. (default: False)
70 | --gpu_enable bool, whether to enable GPU usage. (default: False)
71 | ```
72 |
73 | To quickly and exactly reproduce the results reported in the paper, we highly **RECOMMEND** setting ``model_cache`` to True, since there is unavoidable randomness in the process of shapelet learning and graph embedding. If only `shapelet_cache` is True, a new set of shapelet embeddings will be learned, which may cause small fluctuations in performance. So the easiest way to reproduce the results and test the project is to run the following command:
74 |
75 | ```bash
76 | python scripts/std_test.py --model_cache --dataset *OPTION* --gpu_enable
77 | ```
78 |
79 | ## Usage
80 |
81 | Given a set of time series data and the corresponding labels, the **Time2Graph** framework aims to learn representations of the original time series, and to conduct time series classification under a supervised learning setting.
82 |
83 | ### Input Format
84 |
85 | The input time series data and labels are expected to be ```numpy.ndarray```:
86 |
87 | ```markdown
88 | Time_Series X:
89 | numpy.ndarray with shape (N x L x data_size),
90 | where N is the number of time series, L is the time series length,
91 | and data_size is the data dimension.
92 | Labels Y:
93 | numpy.ndarray with shape (N x 1), with 0 as negative, and 1 as positive samples.
94 | ```
95 |
96 | We organize the preprocessing code that loads the *UCR* dataset in the `archive/` directory; if you want to use the framework on other datasets, just preprocess the original data into the format described above. Note that the time series data does not need to be normalized or scaled in advance, since you can set the parameter `scaled` to True when initializing the **Time2Graph** model.
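For illustration, here is a minimal sketch that fabricates inputs of the expected shapes with synthetic data; the values are random and purely illustrative:

```python
import numpy as np

# N univariate series (data_size = 1) of length L = num_segment * seg_length
N, L, data_size = 100, 504, 1
X = np.random.randn(N, L, data_size)      # Time_Series X: (N x L x data_size)
Y = np.random.randint(0, 2, size=N)       # Labels Y: 0 = negative, 1 = positive
```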
97 |
98 | ### Main Script
99 |
100 | Now that the input data is ready, the main script `scripts/run.py` gives a pipeline example to train and test the whole framework. First, you need to modify the code in the following block (*lines 46-51*) to load your datasets, by reassigning `x_train, y_train, x_test, y_test` respectively.
101 |
102 | ```python
103 | if args.dataset.startswith('ucr'):
104 | dataset = args.dataset.rstrip('\n\r').split('-')[-1]
105 | x_train, y_train, x_test, y_test = load_usr_dataset_by_name(
106 | fname=dataset, length=args.seg_length * args.num_segment)
107 | else:
108 | raise NotImplementedError()
109 | ```
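If you plug in your own data instead, the `else` branch could be replaced along the lines of the sketch below, where `load_my_dataset` is a hypothetical placeholder for your own loader, not a function provided by this project:

```python
else:
    # hypothetical loader: must return arrays shaped as described in Input Format
    x_train, y_train, x_test, y_test = load_my_dataset()
    assert x_train.shape[1] == args.seg_length * args.num_segment
```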
110 |
111 | The help information of the main script `scripts/run.py` is listed as follows:
112 |
113 | ```markdown
114 | python . -h
115 |
116 | usage: . [-h] [--dataset] [--K] [--C] [--num_segment] [--seg_length] [--data_size]
117 | [--n_splits] [--njobs] [--optimizer] [--alpha] [--beta] [--init]
118 | [--gpu_enable] [--opt_metric] [--cache] [--embed] [--embed_size] [--warp]
119 | [--cmethod] [--kernel] [--percentile] [--measurement] [--batch_size]
120 | [--tflag] [--scaled] [--norm] [--no_global]
121 |
122 | optional arguments:
123 | -h, --help show this help message and exit
124 | --dataset str, indicate which dataset to load;
125 | need to modify the code in lines 46-51.
126 | --K int, number of shapelets to learn
127 | --C int, number of shapelet candidates used for learning shapelets
128 | --num_segment int, number of segments that a time series has
129 | --seg_length int, the segment length,
130 | so the length of a time series is num_segment * seg_length
131 | --data_size int, the dimension of time series data
132 | --n_splits int, number of cross-validation splits, default 5.
133 | --njobs int, number of threads if using multiprocessing.
134 | --optimizer str, optimizer used for learning shapelets, default `Adam`.
135 | --alpha float, penalty for local timing factor, default 0.1.
136 | --beta float, penalty for global timing factor, default 0.05.
137 | --init int, init offset for time series, default 0.
138 | --gpu_enable bool, whether to use GPU, default False.
139 | --opt_metric str, metric for optimizing out-classifier, default `accuracy`.
140 | --cache bool, whether to save model cache, default False.
141 | --embed str, embedding mode, one of `aggregate` and `concate`.
142 | --embed_size int, embedding size in deepwalk, default 256.
143 | --warp int, warp size in greedy-dtw, default 2.
144 | --cmethod str, candidate generation method, one of `cluster` and `greedy`.
145 | --kernel str, choice of outer classifier, default `xgb`.
146 | --percentile int, distance threshold (percentile) in graph construction, default 10.
147 | --measurement str, distance measurement, default `gdtw`.
148 | --batch_size int, batch size, default 50.
149 | --tflag bool, whether to use timing factors, default True.
150 | --scaled bool, whether to scale time series by z-normalization, default False.
151 | --norm bool, whether to normalize handcraft-features, default False.
152 | --no_global bool, whether to use global timing factor
153 | when constructing shapelet evolution graph, default False.
154 | ```
155 |
156 | Some of the arguments may require further explanation:
157 |
158 | - ``--K/--C``: the number of shapelets should be carefully selected, as it is highly related to the intrinsic properties of the dataset. In our extensive experiments, `C` is often set to 10 or 20 times `K` to ensure that we learn from a large pool of candidates.
159 | - ``--percentile``, ``--alpha`` and `--beta`: we have conducted fine-tuning on several datasets, and in most cases we recommend the default settings, although modifying them may improve performance as well as degrade it.
160 |
161 | ### Demo
162 |
163 | We include all three benchmark *UCR* datasets in the ``dataset`` directory; they are a subset of the *UCR-Archive* time series datasets. See [Data Sets](#data-sets) for more details. A demo is then available by calling `scripts/run.py`, as follows:
164 |
165 | ```shell
166 | python scripts/run.py --dataset ucr-Earthquakes --K 50 --C 500
167 | --num_segment 21 --seg_length 24 --data_size 1 --embed concate --percentile 5 --gpu_enable
168 | ```
169 |
170 | ## Evaluation
171 |
172 | ### Data Sets
173 |
174 | The three benchmark datasets reported in [1] were made public by [UCR](https://www.cs.ucr.edu/%7Eeamonn/time_series_data_2018/), which consists of many time series datasets. We select several *UCR* datasets from the many candidates for the following reasons: 1) to maintain the consistency of evaluation metrics between the real-world and public datasets, we only consider binary-label ones in *UCR*; 2) we have to make sure that there are enough training cases, because we need sufficient samples to capture the normal transitions between shapelets (many binary-label datasets in *UCR* have fewer than 100 training samples); and 3) we omit all datasets categorized as “image”, because the proposed intuition (timing factors, shapelet evolutions) may not be appropriate for time series transformed from images. After filtering based on the abovementioned criteria, and due to space limitations, we only present those three in [1]. We have tested some others such as *Ham* and *Computers*, and also achieved competitive results compared with baseline methods.
175 |
176 | Furthermore, we apply the proposed *Time2Graph* model in two real-world scenarios: Electricity Consumption Records (**ECR**) provided by State Grid of China, and Network Traffic Flow (**NTF**) from China Telecom. Detailed dataset descriptions can be found in our paper. The performance gains over existing models clearly demonstrate the effectiveness of the framework, and below we list the final results along with several popular baselines.
177 |
178 | ### Performance
179 |
180 | | Accuracy on UCR(%) | Earthquakes | WormsTwoClass | Strawberry |
181 | | :----------------: | :---------: | :-----------: | :--------: |
182 | | NN-DTW | 70.31 | 68.16 | 95.53 |
183 | | TSF | 74.67 | 68.51 | 96.27 |
184 | | FS | 74.66 | 70.58 | 91.66 |
185 | | Time2Graph | **79.14** | **72.73** | **96.76** |
186 |
187 | | Performance on ECR(%) | Precision | Recall | F1 |
188 | | :-------------------: | :-------: | :-------: | :-------: |
189 | | NN-DTW | 15.52 | 18.15 | 16.73 |
190 | | TSF | 26.32 | 2.02 | 3.75 |
191 | | FS | 10.45 | 79.84* | 18.48 |
192 | | Time2Graph | **30.10** | **40.26** | **34.44** |
193 |
194 | | Performance on NTF(%) | Precision | Recall | F1 |
195 | | :-------------------: | :-------: | :-------: | :-------: |
196 | | NN-DTW | 33.20 | 43.75 | 37.75 |
197 | | TSF | 57.52 | 33.85 | 42.62 |
198 | | FS | 63.55 | 35.42 | 45.49 |
199 | | Time2Graph | **71.52** | **56.25** | **62.97** |
200 |
201 | Please refer to our paper [1] for detailed information about the experimental settings, the description of unpublished data sets, the full results of our experiments, along with ablation and observational studies.
202 |
203 | ## Reference
204 |
205 | [1] Cheng, Z; Yang, Y; Wang, W; Hu, W; Zhuang, Y and Song, G, 2020, Time2Graph: Revisiting Time Series Modeling with Dynamic Shapelets, In AAAI, 2020
206 |
207 | ```
208 | @inproceedings{cheng2020time2graph,
209 | title = "{Time2Graph: Revisiting Time Series Modeling with Dynamic Shapelets}",
210 | author = {{Cheng}, Z. and {Yang}, Y. and {Wang}, W. and {Hu}, W. and {Zhuang}, Y. and {Song}, G.},
211 | booktitle={Proceedings of Association for the Advancement of Artificial Intelligence (AAAI)},
212 | year = 2020,
213 | }
214 | ```
--------------------------------------------------------------------------------
/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/petecheng/Time2Graph/f3a7387d04869f2388bdda4b900c50149b57698e/__init__.py
--------------------------------------------------------------------------------
/archive/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/petecheng/Time2Graph/f3a7387d04869f2388bdda4b900c50149b57698e/archive/__init__.py
--------------------------------------------------------------------------------
/archive/io_utils.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | import time
3 | import datetime
4 |
5 |
6 | def convert_str2float(strr):
7 | if strr == '':
8 | return -1.0
9 | else:
10 | return float(strr)
11 |
12 |
13 | def convert_str2int(strr):
14 | if strr == '':
15 | return -1
16 | else:
17 | return int(strr)
18 |
19 |
20 | def get_month_in_year(timestamp):
21 | return int(time.localtime(timestamp).tm_mon) - 1
22 |
23 |
24 | def get_day_in_month(timestamp):
25 | return int(time.localtime(timestamp).tm_mday) - 1
26 |
27 |
28 | def get_day_in_year(timestamp):
29 | return int(time.localtime(timestamp).tm_yday) - 1
30 |
31 |
32 | def get_year(timestamp):
33 | return int(time.localtime(timestamp).tm_year)
34 |
35 |
36 | def format_time_from_str(time_str, tfmt):
37 | return int(time.mktime(time.strptime(time_str, tfmt)))
38 |
39 |
40 | def generate_time_series_time(begin, end, tfmt, duration):
41 | ret = []
42 | d_begin = datetime.datetime.fromtimestamp(format_time_from_str(time_str=begin, tfmt=tfmt))
43 | d_end = datetime.datetime.fromtimestamp(format_time_from_str(time_str=end, tfmt=tfmt))
44 | while d_begin <= d_end:
45 | ret.append(d_begin.strftime(tfmt))
46 | if duration == 'day':
47 | d_begin += datetime.timedelta(days=1)
48 | elif duration == 'hour':
49 | d_begin += datetime.timedelta(hours=1)
50 | else:
51 | raise NotImplementedError()
52 | return ret
53 |
54 |
55 | def generate_time_index(begin, end, tfmt, duration):
56 | ret = {}
57 | d_begin = datetime.datetime.fromtimestamp(format_time_from_str(time_str=begin, tfmt=tfmt))
58 | d_end = datetime.datetime.fromtimestamp(format_time_from_str(time_str=end, tfmt=tfmt))
59 | cnt = 0
60 | while d_begin <= d_end:
61 | ret[d_begin.strftime(tfmt)] = cnt
62 | cnt += 1
63 | if duration == 'day':
64 | d_begin += datetime.timedelta(days=1)
65 | elif duration == 'hour':
66 | d_begin += datetime.timedelta(hours=1)
67 | else:
68 | raise NotImplementedError()
69 | return ret
70 |
71 |
72 | def transform_np2tsv(x, y, fpath):
73 | output = open(fpath, 'w')
74 | for k in range(len(y)):
75 | data = x[k]
76 | output.write('{}'.format(y[k]))
77 | for i in range(len(data)):
78 | for j in range(len(data[i])):
79 | output.write('\t{}'.format(data[i, j]))
80 | output.write('\n')
81 |     output.close()
--------------------------------------------------------------------------------
/archive/load_usr_dataset.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | import pandas
3 | from config import *
4 |
5 |
6 | def load_usr_dataset_by_name(fname, length):
7 | """
8 | load UCR dataset given dataset name.
9 | :param fname:
10 | dataset name, e.g., Earthquakes.
11 | :param length:
12 |         time series length to load.
13 | :return:
14 | """
15 | dir_path = '{}/dataset/UCRArchive_2018'.format(module_path)
16 | assert path.isfile('{}/{}/{}_TEST.tsv'.format(dir_path, fname, fname)), '{} NOT EXIST in UCR!'.format(fname)
17 | train_data = pandas.read_csv('{}/{}/{}_TRAIN.tsv'.format(dir_path, fname, fname), sep='\t', header=None)
18 | test_data = pandas.read_csv('{}/{}/{}_TEST.tsv'.format(dir_path, fname, fname), sep='\t', header=None)
19 | init = train_data.shape[1] - length
20 |     x_train, y_train = train_data.values[:, init:].astype(np.float64).reshape(-1, length, 1), \
21 |                        train_data[0].values.astype(np.int64)
22 |     x_test, y_test = test_data.values[:, init:].astype(np.float64).reshape(-1, length, 1), \
23 |                      test_data[0].values.astype(np.int64)
24 | lbs = np.unique(y_train)
25 | y_train_return, y_test_return = np.copy(y_train), np.copy(y_test)
26 | for idx, val in enumerate(lbs):
27 | y_train_return[y_train == val] = idx
28 | y_test_return[y_test == val] = idx
29 | return x_train, y_train_return, x_test, y_test_return
30 |
31 |
32 |
--------------------------------------------------------------------------------
/baselines/__init__.py:
--------------------------------------------------------------------------------
1 | """
2 | SAXVSM: pyts.Classification
3 | LS: tslearn.Shapelet
4 | """
--------------------------------------------------------------------------------
/baselines/evaluate_baselines.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | import argparse
3 | import warnings
4 | import os
5 | from config import *
6 | from time2graph.utils.base_utils import Debugger
7 | """
8 | script for generating the java commands that run the baseline algorithms.
9 | """
10 |
11 | if __name__ == '__main__':
12 | warnings.filterwarnings(module='sklearn*', action='ignore', category=DeprecationWarning)
13 | parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
14 | parser.add_argument('--dataset', type=str, default='stealing')
15 | parser.add_argument('--classpath', type=str,
16 | default='{}/baselines/TimeSeriesClassification/'.format(module_path))
17 | parser.add_argument('--input', type=str, default='{}/dataset/'.format(module_path))
18 | parser.add_argument('--output', type=str, default='{}/dataset/'.format(module_path))
19 | parser.add_argument('--top', type=str, default='{}/baselines/TimeSeriesClassification/'
20 | 'out/production/TimeSeriesClassification'.format(module_path))
21 | parser.add_argument('--gpu_number', type=int, default=0)
22 | parser.add_argument('--clf', type=str, required=True)
23 |
24 | opt = parser.parse_args()
25 | all_clf = [
26 | 'CID_DTW', 'DD_DTW', 'WDTW', 'ED', 'DTW',
27 | 'LearnShapelets', 'FastShapelets', 'BagOfPatterns',
28 | 'TSF', 'TSBF', 'LPS', 'ST', 'COTE'
29 | ]
30 |
31 | classpath = []
32 | for dirpath, dirnames, fnamesList in os.walk(opt.classpath):
33 | Debugger.info_print('{}'.format(dirpath))
34 | for fname in fnamesList:
35 | if fname.endswith('.jar'):
36 | classpath.append('{}{}'.format(dirpath, fname))
37 | break
38 | Debugger.info_print('{}'.format(classpath))
39 |
40 | cmd = 'CUDA_VISIBLE_DEVICES={} java -classpath {}'.format(opt.gpu_number, opt.top)
41 | if opt.clf != 'all':
42 | for p in classpath:
43 | cmd += ':{}'.format(p)
44 | dataset_cmd = cmd + ' development.DataSets -i {} -o {} -t {}'.format(opt.input, opt.output, opt.dataset)
45 | predict_cmd = cmd + ' timeseriesweka.examples.ClassificationExamples -i {} -o {} -t {} -c {}'.format(
46 | opt.input, opt.output, opt.dataset, opt.clf
47 | )
48 | output = open('{}/evaluate_baselines_{}_{}.sh'.format(opt.top, opt.dataset, opt.clf), 'w')
49 | output.write('#!/usr/bin/env bash\n{}\n{}\n'.format(dataset_cmd, predict_cmd))
50 | output.close()
51 | else:
52 | for p in classpath:
53 | cmd += ':{}'.format(p)
54 | dataset_cmd = cmd + ' development.DataSets -i {} -o {} -t {}'.format(opt.input, opt.output, opt.dataset)
55 | output = open('{}/evaluate_baselines_{}_{}.sh'.format(opt.top, opt.dataset, opt.clf), 'w')
56 | output.write('#!/usr/bin/env bash\n{}\n'.format(dataset_cmd))
57 | for clf in all_clf:
58 | predict_cmd = cmd + ' timeseriesweka.examples.ClassificationExamples -i {} -o {} -t {} -c {}'.format(
59 | opt.input, opt.output, opt.dataset, clf
60 | )
61 | output.write('{}\n'.format(predict_cmd))
62 | output.close()
63 |
--------------------------------------------------------------------------------
/baselines/feature_based.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | from config import *
3 | from time2graph.utils.base_utils import ModelUtils
4 | from sklearn.model_selection import StratifiedKFold
5 | from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
6 |
7 |
8 | class FeatureModel(ModelUtils):
9 | """
10 | Class for Handcraft-feature Model for time series classification.
11 | Feature list:
12 | a) mean, std of whole time series.
13 |         b) mean, std of each segment.
14 | c) mean of the std of segments.
15 | d) std of the mean of segments.
16 | """
17 | def __init__(self, seg_length, kernel='xgb', opt_metric='f1', **kwargs):
18 | super(FeatureModel, self).__init__(kernel=kernel, **kwargs)
19 | self.clf = None
20 | self.seg_length = seg_length
21 | self.opt_metric = opt_metric
22 |
23 | def extract_features(self, samples):
24 | num_samples, data_size = samples.shape[0], samples.shape[-1]
25 | samples = samples.reshape(num_samples, -1, self.seg_length, data_size)
26 | series_mean = np.mean(samples.reshape(num_samples, -1, data_size), axis=1).reshape(num_samples, -1)
27 | series_std = np.std(samples.reshape(num_samples, -1, data_size), axis=1).reshape(num_samples, -1)
28 |         seg_mean, seg_std = np.mean(samples, axis=2), np.std(samples, axis=2)
29 | seg_mean_std, seg_std_mean = np.std(seg_mean, axis=1), np.mean(seg_std, axis=1)
30 | seg_mean = seg_mean.reshape(num_samples, -1)
31 | seg_std = seg_std.reshape(num_samples, -1)
32 | seg_mean_std = seg_mean_std.reshape(num_samples, -1)
33 | seg_std_mean = seg_std_mean.reshape(num_samples, -1)
34 | return np.concatenate((series_mean, series_std, seg_mean, seg_std, seg_mean_std, seg_std_mean), axis=1)
35 |
36 | def fit(self, X, Y, n_splits=5, balanced=True):
37 | x = self.extract_features(samples=X)
38 | max_accu, max_prec, max_recall, max_f1, max_metric = -1, -1, -1, -1, -1
39 | arguments, opt_args = self.clf_paras(balanced=balanced), None
40 | metric_measure = self.return_metric_method(opt_metric=self.opt_metric)
41 | for args in arguments:
42 | self.clf.set_params(**args)
43 | skf = StratifiedKFold(n_splits=n_splits, shuffle=True)
44 | tmp = np.zeros(5, dtype=np.float32).reshape(-1)
45 | measure_vector = [metric_measure, accuracy_score, precision_score, recall_score, f1_score]
46 | for train_idx, test_idx in skf.split(x, Y):
47 | self.clf.fit(x[train_idx], Y[train_idx])
48 | y_pred, y_true = self.clf.predict(x[test_idx]), Y[test_idx]
49 | for k in range(5):
50 | tmp[k] += measure_vector[k](y_true=y_true, y_pred=y_pred)
51 | tmp /= n_splits
52 | if max_metric < tmp[0]:
53 |                 max_metric = tmp[0]
54 | opt_args = args
55 | max_accu, max_prec, max_recall, max_f1 = tmp[1:]
56 | Debugger.info_print('args {} for clf {}, performance: {:.4f}, {:.4f}, {:.4f}, {:.4f}'.format(
57 | opt_args, self.kernel, max_accu, max_prec, max_recall, max_f1))
58 | self.clf.set_params(**opt_args)
59 |
60 | def predict(self, X, **kwargs):
61 | x = self.extract_features(samples=X)
62 | return self.clf.predict(x)
63 |
--------------------------------------------------------------------------------
/baselines/show_baseline_results.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | import argparse
3 | import warnings
4 | from config import *
5 | from time2graph.utils.base_utils import Debugger
6 | from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
7 | """
8 | script for showing baseline results generated by the java package provided with UCR.
9 | """
10 |
11 |
12 | def load_baseline_results(fpath):
13 | y_pred, y_test = [], []
14 | with open(fpath, 'r') as f:
15 | cnt = 0
16 | for line in f:
17 | if cnt < 3:
18 | cnt += 1
19 | continue
20 | line = line.rstrip('\n').split(',')
21 | if len(line) <= 4:
22 | continue
23 | y_test.append(int(line[0]))
24 | y_pred.append(int(line[1]))
25 | f.close()
26 | return y_pred, y_test
27 |
28 |
29 | if __name__ == '__main__':
30 | warnings.filterwarnings(module='sklearn*', action='ignore', category=DeprecationWarning)
31 | parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
32 | parser.add_argument('--dataset', type=str, default='stealing')
33 | parser.add_argument('--clf', type=str, required=True)
34 |
35 | opt = parser.parse_args()
36 | all_clf = [
37 | 'CID_DTW', 'DD_DTW', 'WDTW', 'ED', 'DTW',
38 | 'LearnShapelets', 'FastShapelets', 'BagOfPatterns',
39 | 'TSF', 'TSBF', 'LPS', 'SAX', 'ST', 'COTE', 'EE'
40 | ]
41 | assert opt.clf in all_clf
42 | fpath = '{}/dataset/{}/Predictions/{}/testFold0.csv'.format(module_path, opt.clf, opt.dataset)
43 | y_pred, y_test = load_baseline_results(fpath=fpath)
44 | Debugger.info_print('{} test samples with {:.4f} positive'.format(len(y_test), sum(y_test) / len(y_test)))
45 | accu = accuracy_score(y_true=y_test, y_pred=y_pred)
46 | prec = precision_score(y_true=y_test, y_pred=y_pred)
47 | recall = recall_score(y_true=y_test, y_pred=y_pred)
48 | f1 = f1_score(y_true=y_test, y_pred=y_pred)
49 | Debugger.info_print('res: accu {:.4f}, prec {:.4f}, recall {:.4f}, f1 {:.4f}'.format(
50 | accu, prec, recall, f1
51 | ))
52 |
--------------------------------------------------------------------------------
/config.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | import numpy as np
3 | from os import path
4 | from time2graph.utils.base_utils import Debugger
5 | """
6 | configuration file for benchmark datasets from UCR.
7 | Earthquakes (EQS).
8 | WormsTwoClass (WTC).
9 | Strawberry (STB).
10 | including hyper-parameters and optimal arguments in xgboost.
11 | """
12 |
13 | module_path = path.dirname(path.abspath(__file__))
14 |
15 |
16 | EQS = {
17 | 'K': 50,
18 | 'C': 800,
19 | 'seg_length': 24,
20 | 'num_segment': 21,
21 | 'percentile': 5
22 | }
23 |
24 | WTC = {
25 | 'K': 20,
26 | 'C': 400,
27 | 'seg_length': 30,
28 | 'num_segment': 30,
29 | 'percentile': 5,
30 | 'global_flag': False
31 | }
32 |
33 | STB = {
34 | 'K': 50,
35 | 'C': 800,
36 | 'seg_length': 15,
37 | 'num_segment': 15,
38 | 'percentile': 10,
39 | 'embed': 'aggregate'
40 | }
41 |
42 | model_args = {
43 | 'ucr-Earthquakes': EQS,
44 | 'ucr-WormsTwoClass': WTC,
45 | 'ucr-Strawberry': STB
46 | }
47 |
48 | xgb_args = {
49 | 'ucr-Earthquakes': {
50 | 'max_depth': 16,
51 | 'learning_rate': 0.2,
52 | 'scale_pos_weight': 1,
53 | 'booster': 'gbtree'
54 | },
55 | 'ucr-WormsTwoClass': {
56 | 'max_depth': 2,
57 | 'learning_rate': 0.2,
58 | 'scale_pos_weight': 1,
59 | 'booster': 'gbtree'
60 | },
61 | 'ucr-Strawberry': {
62 | 'max_depth': 8,
63 | 'learning_rate': 0.2,
64 | 'scale_pos_weight': 1,
65 | 'booster': 'gbtree'
66 | }
67 | }
68 |
69 | __all__ = [
70 | 'np',
71 | 'path',
72 | 'Debugger',
73 | 'module_path',
74 | 'model_args',
75 | 'xgb_args'
76 | ]
77 |
--------------------------------------------------------------------------------
/dataset/UCRArchive_2018/Earthquakes/README.md:
--------------------------------------------------------------------------------
1 | # Earthquakes
2 |
3 | The earthquake classification problem involves predicting whether a major event is about to occur based on the most recent readings in the surrounding area. The data are taken from the Northern California Earthquake Data Center, and each data point is an averaged reading for one hour, with the first reading taken on Dec 1st 1967 and the last in 2003. This single time series is then turned into a classification problem of differentiating between positive and negative major earthquake events.
4 |
5 | A major event is defined as any reading of over 5 on the Richter scale. Major events are often followed by aftershocks. (The physics of these are well understood and their detection is not the objective of this dataset.) A positive case is defined as a major event which is not preceded by another major event for at least 512 hours.
6 |
7 | Negative cases are instances where there is a reading below 4 (to avoid blurring of the boundaries between major and non-major events) that is preceded by at least 20 non-zero readings in the previous 512 hours (to avoid trivial negative cases).
8 |
9 | In total, 368 negative and 93 positive cases were extracted from 86,066 hourly readings. None of the cases overlap in time (i.e. a segmentation is used instead of a sliding window).
10 |
11 | Train size: 322
12 |
13 | Test size: 139
14 |
15 | Missing value: No
16 |
17 | Number of classes: 2
18 |
19 | Time series length: 512
20 |
21 | Data donated by Anthony Bagnall (see [1]).
22 |
23 | [1] http://www.timeseriesclassification.com/description.php?Dataset=Earthquakes
24 |
--------------------------------------------------------------------------------
/dataset/UCRArchive_2018/Earthquakes/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/dataset/UCRArchive_2018/Strawberry/README.md:
--------------------------------------------------------------------------------
1 | # Strawberry
2 |
3 | Food spectrographs are used in chemometrics to classify food types, a task that has obvious applications in food safety and quality assurance. This data was processed using Fourier transform infrared (FTIR) spectroscopy with attenuated total reflectance (ATR) sampling. More details are provided in [1][2].
4 |
5 | The classes are strawberry purees (authentic samples) and non-strawberry purees (adulterated strawberries and other fruits).
6 |
7 | Train size: 613
8 |
9 | Test size: 370
10 |
11 | Missing value: No
12 |
13 | Number of classes: 2
14 |
15 | Time series length: 235
16 |
17 | Data donated by Katherine Kemsley and Anthony Bagnall (see [1], [2], [3]).
18 |
19 | [1] Holland, J. K., E. K. Kemsley, and R. H. Wilson. "Use of Fourier transform infrared spectroscopy and partial least squares regression for the detection of adulteration of strawberry purees." Journal of the Science of Food and Agriculture 76.2 (1998): 263-269.
20 |
21 | [2] https://csr.quadram.ac.uk/example-datasets-for-download/
22 |
23 | [3] http://www.timeseriesclassification.com/description.php?Dataset=Strawberry
--------------------------------------------------------------------------------
/dataset/UCRArchive_2018/Strawberry/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/dataset/UCRArchive_2018/WormsTwoClass/README.md:
--------------------------------------------------------------------------------
1 | # WormsTwoClass
2 |
3 | Caenorhabditis elegans (C. elegans) is a roundworm commonly used as a model organism in genetics studies. The movement of these worms is known to be a useful indicator for understanding behavioural genetics.
4 |
5 | The data were originally from [1][2], in which the authors described a system for recording the motion of worms on an agar plate and measuring a range of human-defined features.
6 |
7 | It has been shown that the space of shapes Caenorhabditis elegans adopts on an agar plate can be represented by combinations of four base shapes, or eigenworms. Once the worm outline is extracted, each frame of worm motion can be captured by four scalars representing the amplitudes along each dimension when the shape is projected onto the four eigenworms.
8 |
9 | The data were formatted for the time series classification task and used in [3]. Each case is a series of the first eigenworm only, down-sampled to second-long intervals and averaged so that all series are of length 900. There are 258 cases in total; each belongs to one of five types: one wild-type (the N2 reference strain - 109 cases) and four mutants: goa-1 (44 cases), unc-1 (35 cases), unc-38 (45 cases) and unc-63 (25 cases).
10 |
11 | In the case of the *WormsTwoClass* dataset, the task is to classify worms as wild-type or mutant-type.
12 |
13 | In the case of the *Worms* dataset, the task is to classify worms into one of the five categories.
14 |
15 | Train size: 181
16 |
17 | Test size: 77
18 |
19 | Missing value: No
20 |
21 | Number of classes: 2
22 |
23 | Time series length: 900
24 |
25 | Data donated by Andre Brown and Anthony Bagnall (see [1], [3]).
26 |
27 | [1] Brown, André EX, et al. "A dictionary of behavioral motifs reveals clusters of genes affecting Caenorhabditis elegans locomotion." Proceedings of the National Academy of Sciences 110.2 (2013): 791-796.
28 |
29 | [2] Yemini, Eviatar, et al. "A database of Caenorhabditis elegans behavioral phenotypes." Nature methods 10.9 (2013): 877.
30 |
31 | [3] Bagnall, Anthony, et al. "Time-series classification with COTE: the collective of transformation-based ensembles." IEEE Transactions on Knowledge and Data Engineering 27.9 (2015): 2522-2535.
32 |
33 | [4] http://www.timeseriesclassification.com/description.php?Dataset=Worms
--------------------------------------------------------------------------------
/dataset/UCRArchive_2018/WormsTwoClass/desktop.ini:
--------------------------------------------------------------------------------
1 | [.ShellClassInfo]
2 | InfoTip=This folder is shared online.
3 | IconFile=C:\Program Files\Google\Drive\googledrivesync.exe
4 | IconIndex=16
5 |
--------------------------------------------------------------------------------
/docs/README.md:
--------------------------------------------------------------------------------
1 | ### Time Series Modeling
2 |
3 | Time series modeling aims to discover the temporal relationships within chronologically arranged data. It has attracted extensive research across a wide range of fields, such as image alignment [2], speech recognition [3], etc. The key issue is how to extract representative features of a time series. Previous frameworks range from classical feature engineering and representation learning to deep learning based models. While these methods have achieved good performance [4, 5], they have also been criticized for their lack of interpretability.
4 |
5 | ### Intuition: Shapelet Dynamics
6 |
7 | ***Shapelets***, the time series subsequences that are representative of a class [6], can offer directly interpretable and explanatory insights in the classification scenario, and shapelet-based models have proven to be promising in various practical domains [7,8,9].
8 |
9 | Existing efforts have mainly considered shapelets as static. However, in the real world, shapelets are often dynamic, which is reflected in two respects:
10 |
11 | * First, the same shapelet appearing at different time slices may have a range of different impacts. For instance, in the scenario of detecting electricity theft, low electricity consumption in summer or winter is more suspicious than it is in spring, as refrigeration or heating equipment consumes more electrical power.
12 | * Second, determining the ways in which shapelets evolve is vital to a full understanding of a time series. In fact, shapelets with small values at a particular time can hardly distinguish an electricity thief from a normal user who indeed consumes a low level of electricity. An alternative method would involve identifying users who once had high electricity consumption shapelets but suddenly consume very little electrical power. In other words, an important clue here is how shapelets evolve over time.
13 |
14 | We refer to the subsequences of a time series that are able to reflect its representativeness at different time slices as *time-aware shapelets*. Furthermore, to deeply mine the dynamics and correlations of shapelets, we propose a novel approach that learns the representations of a time series by extracting time-aware shapelets and constructing a shapelet evolution graph, as presented in our AAAI'2020 paper [1].
15 |
16 |
17 | ![motivating example](motiv.jpg)
18 |
19 |
20 | The figure above shows a concrete example from real-world electricity consumption records, which may better explain our motivation: Fig. a demonstrates the one-year electricity usage of a user who stole electrical power from January to May while using electrical power normally in the remaining months. We assign each month the most representative shapelet at that time, and present shapelets *#72* and *#67*, along with their timing factors, in Fig. b, where dark areas indicate that the corresponding shapelet is more discriminative relative to light areas. The shapelet evolution graph is presented in Fig. c, illustrating how a shapelet would transfer from one to another *in a normal case*: for the normal electricity consumption record, there is a clear path for its shapelet transition (*#90* → *#67* → *#85*) in the graph. For the abnormal data, however, the path (*#85* → *#72* → *#7*) does not exist, indicating that the connectivity of the shapelet transition path provides an evidential basis for detecting an abnormal time series. Finally, we translate the problem of learning representations of shapelets and time series into a graph embedding problem.
21 |
22 | ### Extracting Time-aware Shapelets
23 |
24 | Formally, a shapelet $$v$$ is a segment that is representative of a certain class. More precisely, it can separate $$T$$ into two smaller sets, one close to $$v$$ and another far from $$v$$ by some specific criterion, such that for a time series classification task, positive and negative samples can be put into different groups. The criterion can be formalized as
25 |
26 | $$\mathcal{L} = -g(S_{pos}(v, T), S_{neg}(v, T))$$
27 |
28 | where $$S_{*}(v, T)$$ denotes the set of distances with respect to a specific group $$T_{*}$$, and the function $$g$$ takes two finite sets as input and returns a scalar value indicating how far apart these two sets are; it can be *information gain*, or some dissimilarity measurement on sets, e.g., *KL* divergence.
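As a concrete sketch, one common choice of $$g$$ is the information gain of the best split threshold over the pooled distances. The following is an illustrative implementation, not necessarily the exact one used in `time2graph/core/time_aware_shapelets.py`:

```python
import numpy as np

def entropy(n_pos, n_neg):
    # binary entropy of a (positive, negative) count pair
    probs = np.array([n_pos, n_neg], dtype=float) / (n_pos + n_neg)
    probs = probs[probs > 0]
    return -np.sum(probs * np.log2(probs))

def information_gain(s_pos, s_neg):
    # g(S_pos, S_neg): best information gain over candidate split thresholds
    s_pos, s_neg = np.asarray(s_pos), np.asarray(s_neg)
    dists = np.sort(np.concatenate([s_pos, s_neg]))
    base = entropy(len(s_pos), len(s_neg))
    best = 0.0
    for thr in (dists[:-1] + dists[1:]) / 2:    # midpoints between sorted distances
        lp, ln = int(np.sum(s_pos <= thr)), int(np.sum(s_neg <= thr))
        rp, rn = len(s_pos) - lp, len(s_neg) - ln
        if lp + ln == 0 or rp + rn == 0:
            continue
        children = (lp + ln) * entropy(lp, ln) + (rp + rn) * entropy(rp, rn)
        best = max(best, base - children / len(dists))
    return best
```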
29 |
30 | To capture the shapelet dynamics, we define two factors for quantitatively measuring the timing effects of shapelets at different levels. Specifically, we introduce the *local factor* $$w_n$$ to denote the inner importance of the *n-th* element of a particular shapelet; the distance between a shapelet $$v$$ and a segment $$s$$ is then redefined as
31 |
32 | $$\hat{d}(v, s|w) = \tau(v, s | a^*, w) = (\sum\nolimits_{k=1}^{p}\ w_{a^*_1(k)} \cdot (v_{a^*_1(k)} - s_{a^*_2(k)})^2)^{\frac{1}{2}}$$
33 |
34 | where $$a^*$$ refers to the best alignment for DTW distance. On the other hand, at a *global level*, we aim to measure the timing effects across segments on the discriminatory power of shapelets. This is inspired by the intuition that shapelets may have totally different meanings at different time steps, and it is straightforward to measure such deviations by adding segment-level weights. Formally, we set a *global factor* $$u_m$$ to capture the cross-segment influence; the distance between a shapelet $$v$$ and a time series $$t$$ can then be rewritten as
35 |
36 | $$\hat{D}(v, t | w, u) = \min\nolimits_{1\le k \le m} u_k \cdot \hat{d}(v, s_k | w)$$
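A minimal numpy sketch of these two distances, assuming the best DTW alignment $$a^*$$ is given as a list of index pairs (the actual implementation lives in `time2graph/core`, and its details may differ):

```python
import numpy as np

def timing_distance(v, s, w, alignment):
    # d_hat(v, s | w): aligned squared differences weighted by the local factor w;
    # alignment holds (i, j) pairs from the best DTW path a*, i indexing v, j indexing s
    return np.sqrt(sum(w[i] * (v[i] - s[j]) ** 2 for i, j in alignment))

def series_timing_distance(v, segments, w, u, alignments):
    # D_hat(v, t | w, u): minimum over segments s_k, each scaled by the global factor u_k
    return min(u[k] * timing_distance(v, seg, w, alignments[k])
               for k, seg in enumerate(segments))
```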
37 |
38 | Then, given a classification task, we establish a supervised learning method to select the most important time-aware shapelets and learn the corresponding timing factors $$w_i$$ and $$u_i$$ for each shapelet $$v_i$$. In particular, we have a pool of segments as shapelet candidates selected from all subsequences, and a set of time series $$T$$ with labels. For each candidate $$v$$, we have the following objective function:
39 |
40 | $$\hat{\mathcal{L}} = -g(S_{pos}(v, T), S_{neg}(v, T)) + \lambda ||w|| + \epsilon ||u||$$
41 |
42 | and after learning the timing factors for each shapelet candidate separately, we select the top *K* shapelets with minimal loss as our final time-aware shapelets.
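Schematically, the selection step scores every candidate with this regularized loss and keeps the $$K$$ best; `learn_timing_factors` below is a hypothetical stand-in for the actual optimization of $$w$$ and $$u$$:

```python
import numpy as np

def select_shapelets(candidates, K, lam, eps):
    scored = []
    for v in candidates:
        # hypothetical helper: optimizes w, u for candidate v and returns them
        # together with the separation term g(S_pos, S_neg) at the optimum
        w, u, g_val = learn_timing_factors(v)
        loss = -g_val + lam * np.linalg.norm(w) + eps * np.linalg.norm(u)
        scored.append((loss, v, w, u))
    scored.sort(key=lambda item: item[0])    # smaller loss = better separation
    return scored[:K]                        # top-K time-aware shapelets
```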
43 |
44 | ### Constructing Shapelet Evolution Graph
45 |
46 | A ***Shapelet Evolution Graph*** is a directed and weighted graph $$G = (V,E)$$ in which $$V$$ consists of $$K$$ vertices, each denoting a shapelet, and each directed edge $$e_{i, j} \in E$$ is associated with a weight $$w_{i, j}$$, indicating the probability that shapelet $$v_i \in V$$ is followed by shapelet $$v_j \in V$$ in the same time series. The key idea here is that shapelet evolution and transition patterns are naturally reflected in the paths of the graph, so graph embedding methodologies can be applied to learn shapelet, as well as time series, representations.
47 |
48 | We first assign each segment $$s_i$$ of each time series to several shapelets that have the closest distances to $$s_i$$ according to the time-aware dissimilarity. In detail, we standardize the shapelet assignment probability as
49 |
50 | $$p_{i, j} = \frac{
51 | \max(\hat{d_{i,*}}(v_{i, *}, s_i)) - \hat{d_{i,j}}(v_{i, j}, s_i)
52 | }{
53 | \max(\hat{d_{i,*}}(v_{i, *}, s_i)) - \min(\hat{d_{i,*}}(v_{i, *}, s_i))
54 | }$$
55 |
56 | where
57 |
58 | $$\hat{d_{i,*}}(v_{i, *}, s_i) = u_{*}[i] \cdot \hat{d}(v_{i, *}, s_i | w_{*})$$
59 |
60 | with a predefined constraint that $$\hat{d_{i, *}} \le \delta$$. Then, for each pair $$(j, k)$$, we create a weighted edge from shapelet $$v_{i, j}$$ to $$v_{i+1, k}$$ with weight $$p_{i, j} \cdot p_{i+1, k}$$, and merge all duplicated edges into one by summing up their weights. Finally, we normalize the edge weights sourced from each node to sum to 1, which naturally interprets the edge weight between each pair of nodes $$v_i$$ and $$v_j$$ as the conditional probability that shapelet $$v_i$$ is transformed into $$v_j$$ in an adjacent time step.
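A schematic of this construction, assuming that for every time series we already have, per segment position $$i$$, the list of (shapelet index, $$p_{i,j}$$) pairs that pass the $$\delta$$ threshold:

```python
from collections import defaultdict

def build_evolution_graph(all_assignments):
    # accumulate edge weights p_{i,j} * p_{i+1,k} over all series, then row-normalize
    edges = defaultdict(float)
    for assignments in all_assignments:              # one list per time series
        for i in range(len(assignments) - 1):
            for j, p_ij in assignments[i]:           # shapelets assigned to segment i
                for k, p_next in assignments[i + 1]:
                    edges[(j, k)] += p_ij * p_next   # duplicated edges merge by summing
    out_weight = defaultdict(float)
    for (j, _), wgt in edges.items():
        out_weight[j] += wgt
    # normalize so that the weights out of each node sum to 1 (transition probabilities)
    return {(j, k): wgt / out_weight[j] for (j, k), wgt in edges.items()}
```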
61 |
62 | ### Time Series Representation Learning
63 |
64 | Finally, we learn the representations of both the shapelets and the given time series by using the shapelet evolution graph constructed as above. We first employ an existing graph embedding algorithm, DeepWalk [10], to obtain vertex (shapelet) representation vectors $$\mu \in \mathbb{R}^B$$. Then, for each segment $$s_i$$ in a time series, we retrieve the embeddings of its assigned shapelets, as discussed above, and sum them up weighted by assignment probability, denoted as
65 |
66 | $$\Phi_i=(\sum\nolimits_{j}p_{i,j}\cdot\mu(v_{i,j})), \ 1 \le i \le m$$
67 |
68 | and finally concatenate or aggregate all those $$m$$ segment embedding vectors to obtain the representation vector of the original time series $$t$$. The time series embeddings can then be applied to various downstream tasks; see the experiment section of our paper [1].
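A minimal readout sketch, where `mu` is the $$K \times B$$ matrix of DeepWalk shapelet embeddings and `assignments` holds the per-segment (shapelet index, probability) pairs from the previous step; the mode names mirror the `--embed` options of `scripts/run.py`:

```python
import numpy as np

def series_embedding(assignments, mu, mode='concate'):
    # Phi_i = sum_j p_{i,j} * mu[v_{i,j}], then concatenate or aggregate over segments
    phis = []
    for seg_assign in assignments:        # (shapelet index, p_{i,j}) pairs for segment i
        phi = np.zeros(mu.shape[1])
        for j, p in seg_assign:
            phi += p * mu[j]
        phis.append(phi)
    if mode == 'concate':
        return np.concatenate(phis)       # shape (m * B,)
    return np.sum(phis, axis=0)           # 'aggregate': shape (B,)
```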
69 |
70 | ### Evaluation Results
71 |
72 | We conduct time series classification tasks on three public benchmark datasets from the *UCR-Archive* [11] and two real-world datasets from State Grid of China and China Telecom. Experimental results are shown in the following table:
73 |
74 |
75 | ![evaluation results](exp.jpg)
76 |
77 |
78 | We have also conducted extensive ablation and observational studies to validate the proposed framework. Here we construct the shapelet evolution graphs at different time steps for a deeper understanding of shapelet dynamics, as seen in the figure below. It shows two graphs, one for January and another for July. In January, shapelet *#45* has large in/out degrees, and its corresponding timing factor is highlighted in January and February (dark areas). This indicates that shapelet *#45* is likely to be a common pattern at the beginning of a year. As for July, shapelet *#45* is no longer as important as it was in January. Meanwhile, shapelet *#42*, which is almost an isolated point in January, becomes very important in July. Although we do not explicitly take seasonal information into consideration when constructing shapelet evolution graphs, the inclusion of the timing factors means that they are already incorporated into the process of graph generation.
79 |
80 |
81 |
82 | ![shapelet evolution graphs at different time steps](vis.jpg)
83 |
84 |
85 |
86 | ### Reference
87 |
88 | [1] Cheng, Z; Yang, Y; Wang, W; Hu, W; Zhuang, Y and Song, G, 2020, Time2Graph: Revisiting Time Series Modeling with Dynamic Shapelets, In AAAI, 2020
89 |
90 | [2] Peng, X.; Huang, J.; Hu, Q.; Zhang, S.; and Metaxas, D. N. 2014. Head pose estimation by instance parameterization. In *ICPR’14*, 1800–1805.
91 |
92 | [3] Shimodaira, H.; Noma, K.-i.; Nakai, M.; and Sagayama, S. 2002. Dynamic time-alignment kernel in support vector machine. In *NIPS’02*, 921–928.
93 |
94 | [4] Malhotra, P.; Ramakrishnan, A.; Anand, G.; Vig, L.; Agar- wal, P.; and Shroff, G. 2016. Lstm-based encoder- decoder for multi-sensor anomaly detection. *arXiv preprint arXiv:1607.00148*.
95 |
96 | [5] Johnson, M.; Duvenaud, D. K.; Wiltschko, A.; Adams, R. P.; and Datta, S. R. 2016. Composing graphical models with neu- ral networks for structured representations and fast inference. In *NIPS’16*, 2946–2954.
97 |
98 | [6] Ye, L., and Keogh, E. 2011. Time series shapelets: a novel technique that allows accurate, interpretable and fast classifi- cation. *DMKD.* 22(1):149–182.
99 |
100 | [7] Bostrom, A., and Bagnall, A. 2017. Binary shapelet trans- form for multiclass time series classification. In *TLSD- KCS’17.* 24–46.
101 |
102 | [8] Hills, J.; Lines, J.; Baranauskas, E.; Mapp, J.; and Bagnall, A. 2014. Classification of time series by shapelet transformation. *DMKD.* 28(4):851–881
103 |
104 | [9] Lines, J.; Davis, L. M.; Hills, J.; and Bagnall, A. 2012. A shapelet transform for time series classification. In *KDD’12*, 289–297.
105 |
106 | [10] Perozzi, B.; Al-Rfou, R.; and Skiena, S. 2014. Deepwalk: Online learning of social representations. In *KDD*, 701–710.
107 |
108 | [11] Dau, H. A.; Keogh, E.; Kamgar, K.; Yeh, C.-C. M.; Zhu, Y.; Gharghabi, S.; Ratanamahatana, C. A.; Yanping; Hu, B.; Begum, N.; Bagnall, A.; Mueen, A.; and Batista, G. 2018. The ucr time series classification archive. https://www.cs.ucr.edu/~eamonn/time_series_data_2018/.
--------------------------------------------------------------------------------
/docs/_config.yml:
--------------------------------------------------------------------------------
1 | theme: jekyll-theme-slate
2 | title: Time2Graph
3 | description: "Time2Graph: Revisiting Time Series Modeling with Dynamic Shapelets"
4 |
--------------------------------------------------------------------------------
/docs/_layouts/default.html:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
9 |
10 |
11 |
12 | {% seo %}
13 |
14 |
15 |
16 |
17 |
18 |