├── .gitignore ├── CITATION.cff ├── LICENSE ├── Makefile ├── README.md ├── doc └── figures │ ├── eval.png │ ├── gutentag.png │ ├── results.png │ └── timeeval-architecture.png ├── images ├── gutentag.png └── timeeval.png ├── requirements.txt ├── timeeval-icon.png └── timeeval_gui ├── __init__.py ├── __main__.py ├── _pages ├── __init__.py ├── eval │ ├── __init__.py │ └── param_opt.py ├── gutentag.py ├── page.py └── results.py ├── config.py ├── files.py ├── pages ├── 1_📈_GutenTAG.py ├── 2_🚀_Eval.py └── 3_🎯_Results.py ├── st_redirect.py ├── timeseries_config.py ├── utils.py └── 🏠_Home.py /.gitignore: -------------------------------------------------------------------------------- 1 | timeeval-files 2 | 3 | __pycache__ 4 | .idea 5 | *.egg-info 6 | dist 7 | build 8 | .mypy_cache 9 | .pytest_cache 10 | venv 11 | .coverage 12 | coverage.xml 13 | -------------------------------------------------------------------------------- /CITATION.cff: -------------------------------------------------------------------------------- 1 | cff-version: 1.2.0 2 | message: "If you use this software, please cite it as below." 
3 | authors: 4 | - family-names: Schmidl 5 | given-names: Sebastian 6 | orcid: https://orcid.org/0000-0002-6597-9809 7 | - family-names: Wenig 8 | given-names: Phillip 9 | orcid: https://orcid.org/0000-0002-8942-4322 10 | title: "TimeEval: A Benchmarking Toolkit for Time Series Anomaly Detection Algorithms" 11 | date-released: 2022 12 | url: "https://github.com/TimeEval/timeeval-gui" 13 | preferred-citation: 14 | type: article 15 | authors: 16 | - family-names: Wenig 17 | given-names: Phillip 18 | orcid: https://orcid.org/0000-0002-8942-4322 19 | - family-names: Schmidl 20 | given-names: Sebastian 21 | orcid: https://orcid.org/0000-0002-6597-9809 22 | - family-names: Papenbrock 23 | given-names: Thorsten 24 | orcid: https://orcid.org/0000-0002-4019-8221 25 | doi: 10.14778/3554821.3554873 26 | journal: "Proceedings of the VLDB Endowment (PVLDB)" 27 | title: "TimeEval: A Benchmarking Toolkit for Time Series Anomaly Detection Algorithms" 28 | issue: 12 29 | volume: 15 30 | year: 2022 31 | start: 3678 32 | end: 3681 33 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2022 Phillip Wenig and Sebastian Schmidl 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | install: 2 | pip install -r requirements.txt 3 | 4 | run: 5 | python -m timeeval_gui 6 | 7 | clean: 8 | rm -r timeeval-files 9 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 |
2 | TimeEval logo
3 | TimeEval GUI / Toolkit
4 |
5 | A Benchmarking Toolkit for Time Series Anomaly Detection Algorithms
6 |
7 | 8 | [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT) 9 | ![python version 3.7|3.8|3.9](https://img.shields.io/badge/python-3.7%20%7C%203.8%20%7C%203.9-blue) 10 | 11 |
12 | 13 | > If you use our artifacts, please consider [citing our papers](#citation). 14 | 15 | This repository hosts an extensible, scalable, and automatic benchmarking toolkit for time series anomaly detection algorithms. 16 | TimeEval includes an extensive data generator and supports both interactive and batch evaluation scenarios. 17 | With our novel toolkit, we aim to ease the evaluation effort and help the community provide more meaningful evaluations. 18 | 19 | The following figure shows the architecture of the TimeEval Toolkit: 20 | 21 |
22 | 23 | ![TimeEval architecture](./doc/figures/timeeval-architecture.png) 24 | 25 |
26 | 27 | It consists of four main components: a visual frontend for interactive experiments, the Python API to programmatically configure systematic batch experiments, the dataset generator GutenTAG, and the core evaluation engine (Time)Eval. 28 | While the frontend is hosted in this repository, GutenTAG and Eval are hosted in separate repositories. 29 | Those repositories also include their respective Python APIs: 30 | 31 | [![GutenTAG Badge](https://img.shields.io/badge/Repository-GutenTAG-blue?style=for-the-badge)](https://github.com/TimeEval/gutentag) 32 | [![Eval Badge](https://img.shields.io/badge/Repository-Eval-blue?style=for-the-badge)](https://github.com/TimeEval/timeeval) 33 | 34 | As initial resources for evaluations, we provide over 1,000 benchmark datasets and an increasing number (currently over 70) of time series anomaly detection algorithms: 35 | 36 | [![Datasets Badge](https://img.shields.io/badge/Repository-Datasets-3a4750?style=for-the-badge)](https://timeeval.github.io/evaluation-paper/notebooks/Datasets.html) 37 | [![Algorithms Badge](https://img.shields.io/badge/Repository-Algorithms-3a4750?style=for-the-badge)](https://github.com/TimeEval/TimeEval-algorithms) 38 | 39 | ## Installation and Usage (tl;dr) 40 | 41 | TimeEval is tested on Linux and macOS and supports Python 3.7 through 3.9. 42 | We don't support Python 3.10 or higher at the moment because downstream libraries are incompatible. 43 | 44 | > We haven't tested whether TimeEval runs on Windows. 45 | > If you use Windows, please help us by testing whether TimeEval runs correctly. 46 | > If there are any issues, don't hesitate to contact us. 47 | 48 | By default, TimeEval does not automatically download all available algorithms (Docker images) because there are just too many. 49 | However, you can download them easily [from our registry](https://github.com/orgs/TimeEval/packages?repo_name=TimeEval-algorithms) using Docker.
50 | Please download the correct tag for the algorithm, compatible with your version of TimeEval: 51 | 52 | ```bash 53 | docker pull ghcr.io/timeeval/kmeans:0.3.0 54 | ``` 55 | 56 | After you have downloaded the algorithm images, you need to restart the GUI so that it can find the new images. 57 | 58 | ### Web frontend 59 | 60 | ```shell 61 | # install all dependencies 62 | make install 63 | 64 | # execute streamlit and display frontend in default browser 65 | make run 66 | ``` 67 | 68 | Screenshots of the web frontend: 69 | 70 | ![GutenTAG page](./doc/figures/gutentag.png) 71 | ![Eval page](./doc/figures/eval.png) 72 | ![Results page](./doc/figures/results.png) 73 | 74 | ### Python APIs 75 | 76 | Install the required components using pip: 77 | 78 | ```bash 79 | # eval component: 80 | pip install timeeval 81 | 82 | # dataset generator component: 83 | pip install timeeval-gutentag 84 | ``` 85 | 86 | For usage instructions for the respective Python APIs, please consult the projects' documentation: 87 | 88 | [![GutenTAG Badge](https://img.shields.io/badge/Repository-GutenTAG-blue?style=for-the-badge)](https://github.com/TimeEval/gutentag) 89 | [![Eval Badge](https://img.shields.io/badge/Repository-Eval-blue?style=for-the-badge)](https://github.com/TimeEval/timeeval) 90 | 91 | ## Citation 92 | 93 | If you use the TimeEval toolkit or any of its components in your project or research, please cite our demonstration paper: 94 | 95 | > Phillip Wenig, Sebastian Schmidl, and Thorsten Papenbrock. 96 | > TimeEval: A Benchmarking Toolkit for Time Series Anomaly Detection Algorithms. PVLDB, 15(12): 3678 - 3681, 2022. 97 | > doi:[10.14778/3554821.3554873](https://doi.org/10.14778/3554821.3554873) 98 | 99 | If you use our evaluation results or our benchmark datasets and algorithms, please cite our evaluation paper: 100 | 101 | > Sebastian Schmidl, Phillip Wenig, and Thorsten Papenbrock. 102 | > Anomaly Detection in Time Series: A Comprehensive Evaluation.
PVLDB, 15(9): 1779 - 1797, 2022. 103 | > doi:[10.14778/3538598.3538602](https://doi.org/10.14778/3538598.3538602) 104 | 105 | You can use the following BibTeX entries: 106 | 107 | ```bibtex 108 | @article{WenigEtAl2022TimeEval, 109 | title = {TimeEval: {{A}} Benchmarking Toolkit for Time Series Anomaly Detection Algorithms}, 110 | author = {Wenig, Phillip and Schmidl, Sebastian and Papenbrock, Thorsten}, 111 | date = {2022}, 112 | journaltitle = {Proceedings of the {{VLDB Endowment}} ({{PVLDB}})}, 113 | volume = {15}, 114 | number = {12}, 115 | pages = {3678--3681}, 116 | doi = {10.14778/3554821.3554873} 117 | } 118 | @article{SchmidlEtAl2022Anomaly, 119 | title = {Anomaly Detection in Time Series: {{A}} Comprehensive Evaluation}, 120 | author = {Schmidl, Sebastian and Wenig, Phillip and Papenbrock, Thorsten}, 121 | date = {2022}, 122 | journaltitle = {Proceedings of the {{VLDB Endowment}} ({{PVLDB}})}, 123 | volume = {15}, 124 | number = {9}, 125 | pages = {1779--1797}, 126 | doi = {10.14778/3538598.3538602} 127 | } 128 | ``` 129 | -------------------------------------------------------------------------------- /doc/figures/eval.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimeEval/TimeEval-GUI/c1b0f90f3048f282427fa68ce5def504df142acd/doc/figures/eval.png -------------------------------------------------------------------------------- /doc/figures/gutentag.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimeEval/TimeEval-GUI/c1b0f90f3048f282427fa68ce5def504df142acd/doc/figures/gutentag.png -------------------------------------------------------------------------------- /doc/figures/results.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimeEval/TimeEval-GUI/c1b0f90f3048f282427fa68ce5def504df142acd/doc/figures/results.png 
-------------------------------------------------------------------------------- /doc/figures/timeeval-architecture.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimeEval/TimeEval-GUI/c1b0f90f3048f282427fa68ce5def504df142acd/doc/figures/timeeval-architecture.png -------------------------------------------------------------------------------- /images/gutentag.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimeEval/TimeEval-GUI/c1b0f90f3048f282427fa68ce5def504df142acd/images/gutentag.png -------------------------------------------------------------------------------- /images/timeeval.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimeEval/TimeEval-GUI/c1b0f90f3048f282427fa68ce5def504df142acd/images/timeeval.png -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | streamlit==1.11.1 2 | timeeval>=1.4,<1.5 3 | timeeval-gutentag==0.2.0 4 | pyyaml 5 | numpy 6 | pandas 7 | matplotlib 8 | requests 9 | protobuf>=3.20,<4 10 | watchdog==2.1.9 11 | plotly==5.10.* 12 | altair==4.2.2 # newer versions of altair are not compatible with streamlit 1.11.1 13 | requests==2.31.0 # newer versions break the docker connection with "Error while fetching server API version: Not supported URL scheme http+docker" 14 | -------------------------------------------------------------------------------- /timeeval-icon.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimeEval/TimeEval-GUI/c1b0f90f3048f282427fa68ce5def504df142acd/timeeval-icon.png -------------------------------------------------------------------------------- /timeeval_gui/__init__.py: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/TimeEval/TimeEval-GUI/c1b0f90f3048f282427fa68ce5def504df142acd/timeeval_gui/__init__.py -------------------------------------------------------------------------------- /timeeval_gui/__main__.py: -------------------------------------------------------------------------------- 1 | import sys 2 | from pathlib import Path 3 | 4 | from streamlit import cli as stcli 5 | 6 | 7 | index_path = str(Path(__file__).parent.absolute() / "🏠_Home.py") 8 | sys.argv = ["streamlit", "run", index_path] 9 | sys.exit(stcli.main()) 10 | -------------------------------------------------------------------------------- /timeeval_gui/_pages/__init__.py: -------------------------------------------------------------------------------- 1 | import streamlit as st 2 | 3 | from .gutentag import GutenTAGPage 4 | from .eval import EvalPage 5 | from .page import Page 6 | from .results import ResultsPage 7 | -------------------------------------------------------------------------------- /timeeval_gui/_pages/eval/__init__.py: -------------------------------------------------------------------------------- 1 | import logging 2 | import sys 3 | from typing import List, Union 4 | 5 | import docker 6 | import psutil 7 | import streamlit as st 8 | from docker.errors import DockerException 9 | from durations import Duration 10 | from timeeval import Algorithm, ResourceConstraints, DefaultMetrics, TimeEval 11 | from timeeval.params import FixedParameters, FullParameterGrid, IndependentParameterGrid 12 | from timeeval.resource_constraints import GB 13 | from timeeval import algorithms as timeeval_algorithms 14 | 15 | import timeeval_gui.st_redirect as rd 16 | from .param_opt import InputParam 17 | from ..page import Page 18 | from ...config import SKIP_DOCKER_PULL 19 | from ...files import Files 20 | 21 | # keep this import! 
22 | from timeeval.algorithms import * 23 | 24 | 25 | def create_algorithm_list() -> List[Algorithm]: 26 | algorithms = [eval(f"{a}(skip_pull={SKIP_DOCKER_PULL})") for a in dir(timeeval_algorithms) if "__" not in a] 27 | if SKIP_DOCKER_PULL: 28 | # filter out non-existent images from algorithm choices 29 | try: 30 | docker_client = docker.from_env() 31 | except DockerException as e: 32 | print(f"Could not connect to docker! {e}", file=sys.stderr) 33 | return [] 34 | 35 | def image_exists(name: str, tag: str) -> bool: 36 | images = docker_client.images.list(name=f"{name}:{tag}") 37 | return len(images) > 0 38 | 39 | algorithms = [a for a in algorithms if image_exists(a.main.image_name, a.main.tag)] 40 | del docker_client 41 | return algorithms 42 | 43 | 44 | algos: List[Algorithm] = create_algorithm_list() 45 | 46 | for algo in algos: 47 | st.session_state.setdefault(f"eval-{algo.name}-n_params", 0) 48 | 49 | 50 | def inc_n_params(algo_name: str): 51 | value = st.session_state.get(f"eval-{algo_name}-n_params", 0) 52 | st.session_state[f"eval-{algo_name}-n_params"] = value + 1 53 | 54 | 55 | def dec_n_params(algo_name: str): 56 | value = st.session_state.get(f"eval-{algo_name}-n_params", 0) 57 | if value > 0: 58 | st.session_state[f"eval-{algo_name}-n_params"] = value - 1 59 | 60 | 61 | def parse_list_value(tpe: str, value: str) -> List[Union[float, int, str, bool]]: 62 | subtype = tpe.split("[")[1].split("]")[0].lower() 63 | cc_subtype = { 64 | "int": int, 65 | "float": float, 66 | "bool": bool 67 | } 68 | value_str = value.split("[")[1].split("]")[0] 69 | 70 | values = value_str.split(",") 71 | return [cc_subtype.get(subtype, str)(v) for v in values] 72 | 73 | 74 | class EvalPage(Page): 75 | 76 | def _get_name(self) -> str: 77 | return "Eval" 78 | 79 | def render(self): 80 | st.image("images/timeeval.png") 81 | st.title("Eval") 82 | 83 | st.write("## Algorithms") 84 | 85 | algo_names: List[str] = st.multiselect("Algorithms", options=[a.name for a in algos]) 86 | 
algorithms = [a for a in algos if a.name in algo_names] 87 | 88 | st.write("### Parameters") 89 | 90 | for algorithm in algorithms: 91 | algo_name = algorithm.name 92 | with st.expander(algo_name): 93 | if not algorithm.param_schema: 94 | st.info("Algorithm has no parameters.") 95 | continue 96 | 97 | param_config_tpe = st.selectbox("Parameter configuration type", 98 | [FixedParameters, FullParameterGrid, IndependentParameterGrid], 99 | format_func=lambda x: x.__name__, 100 | help="FixedParameters - Single parameters setting with one value for each.\n" 101 | "FullParameterGrid - Grid of parameters with a discrete number of " 102 | "values for each. Yields the full cartesian product of all " 103 | "available parameter combinations.\n" 104 | "IndependentParameterGrid - Grid of parameters with a discrete " 105 | "number of values for each. The parameters in the dict are " 106 | "considered independent and explored one after the other.", 107 | key=f"eval-{algo_name}-config-tpe") 108 | st.write("---") 109 | 110 | n_params = st.session_state.get(f"eval-{algo_name}-n_params", 0) 111 | displayed_params = [] 112 | param_grid = {} 113 | optim = param_config_tpe.__name__ != FixedParameters.__name__ 114 | for i in range(n_params): 115 | param = st.selectbox("Parameter", 116 | options=[param for param in algorithm.param_schema if 117 | param not in displayed_params], 118 | format_func=lambda p: algorithm.param_schema[p]["name"], 119 | key=f"{algo_name}-parameter-name-{i}") 120 | value = InputParam.from_type(algorithm, param, optim, key=f"{algo_name}-parameter-value-{i}").render() 121 | displayed_params.append(param) 122 | param_grid[param] = value 123 | 124 | bt_col1, bt_col2, _ = st.columns((1, 1, 18)) 125 | with bt_col1: 126 | st.button("-", 127 | help="Remove a parameter configuration", 128 | on_click=dec_n_params, args=[algo_name], 129 | key=f"eval-{algo_name}-button-") 130 | with bt_col2: 131 | st.button("+", 132 | help="Add a parameter configuration", 133 | 
on_click=inc_n_params, args=[algo_name], 134 | key=f"eval-{algo_name}-button+") 135 | algorithm.param_config = param_config_tpe(param_grid) 136 | 137 | st.write("## Datasets") 138 | 139 | dmgr = Files().dmgr() 140 | available_datasets = dmgr.df().index.values 141 | datasets = st.multiselect("Datasets", options=available_datasets, format_func=lambda x: f"{x[0]}/{x[1]}") 142 | 143 | st.write("## General Settings") 144 | 145 | repetitions = st.slider("Repetitions", value=1, min_value=1, max_value=1000, step=1) 146 | metric_options = [ 147 | "ROC_AUC", "PR_AUC", "RANGE_PR_AUC", "AVERAGE_PRECISION", "RANGE_PRECISION", 148 | "RANGE_RECALL", "RANGE_F1", "FIXED_RANGE_PR_AUC", 149 | ] 150 | metric_names = st.multiselect("Metrics", options=metric_options, default="ROC_AUC") 151 | metrics = [getattr(DefaultMetrics, m) for m in metric_names] 152 | force_training_type_match = st.checkbox("Force training type match between algorithm and dataset", value=False) 153 | force_dimensionality_match = st.checkbox( 154 | "Force dimensionality match between algorithm and dataset (uni- or multivariate)", 155 | value=False 156 | ) 157 | 158 | # with st.expander("Remote Configuration"): 159 | # st.text_input("Scheduler Host") 160 | # st.text_area("Worker Hosts") 161 | 162 | with st.expander("Resource Constraints"): 163 | rc = ResourceConstraints.default_constraints() 164 | rc.tasks_per_host = st.number_input("Parallel tasks (distributes CPUs and memory evenly across tasks)", 165 | value=rc.tasks_per_host, 166 | min_value=1, 167 | max_value=psutil.cpu_count()) 168 | rc.train_timeout = Duration(st.text_input( 169 | "Train Timeout", 170 | value=rc.train_timeout.representation, 171 | help="Timeout for the training step of the algorithms as a duration (e.g., '2 minutes' or '1 hour').", 172 | )) 173 | rc.execute_timeout = Duration(st.text_input( 174 | "Execute Timeout", 175 | value=rc.execute_timeout.representation, 176 | help="Timeout for the execution step of the algorithms as a duration (e.g., 
'2 minutes' or '1 hour').", 177 | )) 178 | cpu_limit = rc.task_cpu_limit if rc.task_cpu_limit else 0. 179 | cpu_limit = st.number_input("CPU Limit (overwrites default constraints)", 180 | value=cpu_limit, 181 | min_value=0., 182 | max_value=float(psutil.cpu_count()), 183 | help="Maximum amount of CPU shares to be used per task, where 2.5 = 2.5 CPU cores and 0 = no limit.") 184 | if cpu_limit > 0: 185 | rc.task_cpu_limit = cpu_limit 186 | else: 187 | rc.task_cpu_limit = None 188 | memory_limit = rc.task_memory_limit if rc.task_memory_limit else 0. 189 | memory_limit = st.number_input("Memory Limit (GB) (overwrites default constraints)", 190 | value=memory_limit, 191 | min_value=0., 192 | max_value=float(psutil.virtual_memory().total / GB), 193 | help="Maximum amount of memory (in GB) to be used per task, where 0 = no limit.") 194 | if memory_limit > 0: 195 | rc.task_memory_limit = int(memory_limit * GB) 196 | else: 197 | rc.task_memory_limit = None 198 | limits = rc.get_compute_resource_limits() 199 | rc.task_memory_limit = int(limits[0]) 200 | st.info(f"Resulting resource limits: cpu={limits[1]:.2f}, mem={limits[0] / GB:.0f} GB") 201 | 202 | if st.button("Start Experiment"): 203 | with st.spinner('Running evaluation - please wait...'): 204 | timeeval = TimeEval( 205 | dmgr, datasets, algorithms, 206 | results_path=Files().results_folder(), 207 | distributed=False, 208 | repetitions=repetitions, 209 | resource_constraints=rc, 210 | metrics=metrics, 211 | skip_invalid_combinations=True, 212 | force_training_type_match=force_training_type_match, 213 | force_dimensionality_match=force_dimensionality_match, 214 | disable_progress_bar=True 215 | ) 216 | 217 | # reset logging backend 218 | logging.root.handlers = [] 219 | logging.basicConfig( 220 | filename=timeeval.results_path / "timeeval.log", 221 | filemode="w", 222 | level=logging.INFO, 223 | format="%(asctime)s %(levelname)6.6s - %(name)20.20s: %(message)s", 224 | ) 225 | 226 | st_out = st.empty() 227 | with 
rd.stdouterr(to=st_out): 228 | timeeval.run() 229 | st.success(f"... evaluation done!") 230 | 231 | st.write("## Results") 232 | 233 | df_results = timeeval.get_results(aggregated=True, short=True) 234 | st.dataframe(df_results) 235 | -------------------------------------------------------------------------------- /timeeval_gui/_pages/eval/param_opt.py: -------------------------------------------------------------------------------- 1 | from __future__ import annotations 2 | 3 | from abc import abstractmethod 4 | from typing import Any, List, Optional 5 | 6 | import numpy as np 7 | import streamlit as st 8 | from streamlit.state import NoValue 9 | 10 | 11 | class InputParam: 12 | def __init__(self, algorithm, selected, optim: bool, key=""): 13 | self.algorithm = algorithm 14 | self.selected = selected 15 | self.optim = optim 16 | self.key = key 17 | 18 | def get_default_value(self, cc) -> Any: 19 | v = self.algorithm.param_schema[self.selected]["defaultValue"] 20 | if v is None: 21 | return NoValue() 22 | else: 23 | return cc(v) 24 | 25 | @abstractmethod 26 | def _render_optim(self, help: str) -> Any: 27 | ... 28 | 29 | @abstractmethod 30 | def _render_fixed(self, help: str) -> Any: 31 | ... 
32 | 33 | def render(self) -> Any: 34 | help = self.algorithm.param_schema[self.selected]["description"] 35 | 36 | if self.optim: 37 | return self._render_optim(help) 38 | else: 39 | return self._render_fixed(help) 40 | 41 | @staticmethod 42 | def from_type(algorithm, selected, optim: bool, key="") -> Optional[InputParam]: 43 | tpe = algorithm.param_schema[selected]["type"] 44 | if tpe.lower() == "int": 45 | cl = IntegerInputParam 46 | elif tpe.lower() == "float": 47 | cl = FloatInputParam 48 | elif tpe.lower().startswith("bool"): 49 | cl = BoolInputParam 50 | elif tpe.lower().startswith("enum"): 51 | cl = EnumInputParam 52 | elif tpe.lower().startswith("list") and optim: 53 | st.error("A list parameter cannot be optimized yet") 54 | return None 55 | else: 56 | cl = StringInputParam 57 | 58 | return cl(algorithm, selected, optim, key) 59 | 60 | 61 | class IntegerInputParam(InputParam): 62 | def _render_fixed(self, help: str) -> Any: 63 | return int(st.number_input("Value", 64 | value=self.get_default_value(int), 65 | help=help, 66 | step=1, 67 | key=self.key)) 68 | 69 | def _render_optim(self, help: str) -> Any: 70 | col1, col2 = st.columns(2) 71 | with col1: 72 | start_value = int(st.number_input("Start Value", 73 | value=self.get_default_value(int), 74 | help=help, 75 | step=1, 76 | key=f"{self.key}-start")) 77 | with col2: 78 | end_value = int(st.number_input("End Value", 79 | value=self.get_default_value(int), 80 | step=1, 81 | key=f"{self.key}-end")) 82 | if start_value > end_value: 83 | st.error("Start value must be smaller or equal to end value") 84 | 85 | return list(range(start_value, end_value + 1)) 86 | 87 | 88 | class FloatInputParam(InputParam): 89 | def _render_fixed(self, help: str) -> Any: 90 | return st.number_input("Value", 91 | value=self.get_default_value(float), 92 | help=help, 93 | step=None, 94 | format="%f", 95 | key=self.key) 96 | 97 | def _render_optim(self, help: str) -> Any: 98 | col1, col2, col3 = st.columns((2,2,1)) 99 | with col1: 100 
| start_value = st.number_input("Start value", 101 | value=self.get_default_value(float), 102 | help=help, 103 | step=None, 104 | format="%f", 105 | key=f"{self.key}-start") 106 | with col2: 107 | end_value = st.number_input("End value", 108 | value=self.get_default_value(float), 109 | help=help, 110 | step=None, 111 | format="%f", 112 | key=f"{self.key}-end") 113 | with col3: 114 | number_steps = int(st.number_input("Steps", 115 | value=2, 116 | step=1, 117 | key=f"{self.key}-steps")) 118 | return np.linspace(start_value, end_value, number_steps).tolist() 119 | 120 | 121 | class BoolInputParam(InputParam): 122 | def _render_optim(self, help: str) -> Any: 123 | return st.multiselect("Values", 124 | options=[True, False], 125 | help=help, 126 | key=self.key) 127 | 128 | def _render_fixed(self, help: str) -> Any: 129 | st.markdown("Value") 130 | return st.checkbox("", 131 | value=self.get_default_value(bool), 132 | help=help, 133 | key=self.key) 134 | 135 | 136 | def parse_enum_param_type(tpe: str) -> List[str]: 137 | option_str = tpe.split("[")[1].split("]")[0] 138 | return option_str.split(",") 139 | 140 | 141 | class EnumInputParam(InputParam): 142 | def _render_enum(self, help: str, input_field_class, with_index=True) -> Any: 143 | default_value = self.algorithm.param_schema[self.selected]["defaultValue"] 144 | tpe = self.algorithm.param_schema[self.selected]["type"] 145 | 146 | try: 147 | default_index = parse_enum_param_type(tpe).index(default_value) 148 | except ValueError: 149 | default_index = 0 150 | 151 | kwargs = {} 152 | if with_index: 153 | kwargs["index"] = default_index 154 | 155 | return input_field_class("Value", 156 | options=parse_enum_param_type(tpe), 157 | help=help, 158 | key=self.key, **kwargs) 159 | 160 | def _render_optim(self, help: str) -> Any: 161 | return self._render_enum(help, st.multiselect, with_index=False) 162 | 163 | def _render_fixed(self, help: str) -> Any: 164 | return self._render_enum(help, st.selectbox) 165 | 166 | 167 | 
class StringInputParam(InputParam): 168 | def _render_optim(self, help: str) -> Any: 169 | value = st.text_input("Value (comma separated)", 170 | value=self.algorithm.param_schema[self.selected]["defaultValue"], 171 | help=help, 172 | key=self.key) 173 | return list(map(lambda x: x.strip(), value.split(","))) 174 | 175 | def _render_fixed(self, help: str) -> Any: 176 | return st.text_input("Value", 177 | value=self.algorithm.param_schema[self.selected]["defaultValue"], 178 | help=help, 179 | key=self.key) 180 | -------------------------------------------------------------------------------- /timeeval_gui/_pages/gutentag.py: -------------------------------------------------------------------------------- 1 | import warnings 2 | from typing import Tuple, Dict, Union, Optional 3 | 4 | import streamlit as st 5 | from gutenTAG import GutenTAG 6 | from gutenTAG.generator.timeseries import TrainingType 7 | 8 | from timeeval_gui.timeseries_config import TimeSeriesConfig 9 | from timeeval_gui.utils import get_base_oscillations, get_anomaly_types, get_anomaly_params, \ 10 | get_base_oscillation_parameters 11 | from .page import Page 12 | from ..files import Files 13 | 14 | 15 | def general_area(ts_config: TimeSeriesConfig) -> TimeSeriesConfig: 16 | ts_config.set_name(st.text_input("Name")) 17 | ts_config.set_length(st.number_input("Length", min_value=10, value=1000)) 18 | 19 | if st.checkbox("Generate training time series for supervised methods"): 20 | ts_config.set_supervised() 21 | if st.checkbox("Generate training time series for semi-supervised methods"): 22 | ts_config.set_semi_supervised() 23 | return ts_config 24 | 25 | 26 | def select_base_oscillation(key="base-oscillation") -> Tuple[str, str]: 27 | bos = get_base_oscillations() 28 | value = st.selectbox("Base-Oscillation", bos.items(), format_func=lambda x: x[1], key=key) 29 | return value 30 | 31 | 32 | def select_anomaly_type(key: str, bo_kind: str) -> Tuple[str, str]: 33 | anomaly_types =
get_anomaly_types(bo_kind) 34 | return st.selectbox("Anomaly Type", anomaly_types.items(), format_func=lambda x: x[1], key=key) 35 | 36 | 37 | def base_oscillation_area(c, ts_config: Optional[TimeSeriesConfig], return_dict: bool = False) -> Union[TimeSeriesConfig, Dict]: 38 | key = f"base-oscillation-{c}" 39 | base_oscillation = select_base_oscillation(key) 40 | parameters = get_base_oscillation_parameters(base_oscillation[0]) 41 | param_config = {} 42 | for p in parameters: 43 | if p.tpe == "number": 44 | param_config[p.key] = st.number_input(p.name, key=f"{p.key}-{c}", help=p.help) 45 | elif p.tpe == "integer": 46 | param_config[p.key] = int(st.number_input(p.name, key=f"{p.key}-{c}", help=p.help)) 47 | elif p.tpe == "object" and p.key == "trend": 48 | if st.checkbox("add Trend", key=f"{key}-add-trend"): 49 | st.markdown("---") 50 | param_config[p.key] = base_oscillation_area(f"{key}-{p.name}", None, return_dict=True) 51 | st.markdown("---") 52 | else: 53 | warn_msg = f"Input type ({p.tpe}) for parameter {p.name} of BO {base_oscillation[1]} not supported yet!" 
54 | warnings.warn(warn_msg) 55 | st.warning(warn_msg) 56 | 57 | if return_dict: 58 | param_config["kind"] = base_oscillation[0] 59 | return param_config 60 | 61 | ts_config.add_base_oscillation(base_oscillation[0], **param_config) 62 | 63 | return ts_config 64 | 65 | 66 | def anomaly_area(a, ts_config: TimeSeriesConfig) -> TimeSeriesConfig: 67 | position = st.selectbox("Position", key=f"anomaly-position-{a}", options=["beginning", "middle", "end"], index=1) 68 | length = int(st.number_input("Length", key=f"anomaly-length-{a}", min_value=1)) 69 | channel = st.selectbox("Channel", key=f"anomaly-channel-{a}", 70 | options=list(range(len(ts_config.config["base-oscillations"])))) 71 | 72 | n_kinds = st.number_input("Number of Anomaly Types", key=f"anomaly-types-{a}", min_value=1) 73 | kinds = [] 74 | for t in range(int(n_kinds)): 75 | st.write(f"##### Type {t}") 76 | bo_kind = ts_config.config["base-oscillations"][channel]["kind"] 77 | anomaly_type, _ = select_anomaly_type(f"anomaly-type-{a}-{t}", bo_kind) 78 | parameters = parameter_area(a, t, anomaly_type, bo_kind) 79 | kinds.append({"kind": anomaly_type, "parameters": parameters}) 80 | 81 | ts_config.add_anomaly(position=position, length=length, channel=channel, kinds=kinds) 82 | return ts_config 83 | 84 | 85 | def parameter_area(a, t, anomaly_type: str, bo_kind: str) -> Dict: 86 | param_conf = {} 87 | parameters = get_anomaly_params(anomaly_type) 88 | for name, p, desc in parameters: 89 | if name.lower() == "sinusoid_k" and bo_kind != "sine": 90 | continue 91 | if name.lower() == "cbf_pattern_factor" and bo_kind != "cylinder-bell-funnel": 92 | continue 93 | 94 | key = f"{a}-{t}-{name}" 95 | if p == str: 96 | param_conf[name] = st.text_input(name.upper(), key=key, help=desc) 97 | elif p == bool: 98 | param_conf[name] = st.checkbox(name.upper(), key=key, help=desc) 99 | elif p == int: 100 | param_conf[name] = st.number_input(name.upper(), key=key, step=1, help=desc) 101 | elif p == float: 102 | param_conf[name] = 
st.number_input(name.upper(), key=key, help=desc) 103 | return param_conf 104 | 105 | 106 | class GutenTAGPage(Page): 107 | def _get_name(self) -> str: 108 | return "GutenTAG" 109 | 110 | def render(self): 111 | st.image("images/gutentag.png") 112 | 113 | timeseries_config = TimeSeriesConfig() 114 | 115 | st.write("## General Settings") 116 | timeseries_config = general_area(timeseries_config) 117 | 118 | st.write("## Channels") 119 | n_channels = st.number_input("Number of Channels", min_value=1) 120 | for c in range(n_channels): 121 | with st.expander(f"Channel {c}"): 122 | timeseries_config = base_oscillation_area(c, timeseries_config) 123 | 124 | st.write("## Anomalies") 125 | n_anomalies = st.number_input("Number of Anomalies", min_value=0) 126 | for a in range(n_anomalies): 127 | with st.expander(f"Anomaly {a}"): 128 | timeseries_config = anomaly_area(a, timeseries_config) 129 | 130 | st.write("---") 131 | 132 | gt = None 133 | if st.button("Build Timeseries"): 134 | if gt is None: 135 | gt = GutenTAG.from_dict({"timeseries": [timeseries_config.config]}, plot=False) 136 | gt.generate() 137 | 138 | ts = gt.timeseries[0] 139 | 140 | test_data = ts.to_dataframe(training_type=TrainingType.TEST) 141 | st.write("### Test Data") 142 | st.line_chart(data=test_data) 143 | 144 | if ts.semi_supervised: 145 | semi_supervised_data = ts.to_dataframe(training_type=TrainingType.TRAIN_NO_ANOMALIES) 146 | st.write("### Semi-Supervised Training Data (no anomalies)") 147 | st.line_chart(data=semi_supervised_data.iloc[:, :-1]) 148 | 149 | if ts.supervised: 150 | supervised_data = ts.to_dataframe(training_type=TrainingType.TRAIN_ANOMALIES) 151 | st.write("### Supervised Training Data (with anomalies)") 152 | st.line_chart(data=supervised_data) 153 | 154 | if st.button("Save"): 155 | if gt is None: 156 | gt = GutenTAG.from_dict({"timeseries": [timeseries_config.config]}, plot=False) 157 | gt.generate() 158 | Files().store_ts(gt) 159 | st.success(f"> Successfully saved new time 
series dataset '{timeseries_config.config['name']}' to disk.") 160 | -------------------------------------------------------------------------------- /timeeval_gui/_pages/page.py: -------------------------------------------------------------------------------- 1 | from abc import ABC, abstractmethod 2 | import streamlit as st 3 | 4 | 5 | class Page(ABC): 6 | def __init__(self): 7 | super().__init__() 8 | st.set_page_config(page_title=f"{self.name} | TimeEval - A Time Series Anomaly Detection Toolkit") 9 | 10 | @property 11 | def name(self) -> str: 12 | return self._get_name() 13 | 14 | @abstractmethod 15 | def _get_name(self) -> str: 16 | raise NotImplementedError() 17 | 18 | @abstractmethod 19 | def render(self): 20 | raise NotImplementedError() 21 | -------------------------------------------------------------------------------- /timeeval_gui/_pages/results.py: -------------------------------------------------------------------------------- 1 | import os 2 | from pathlib import Path 3 | from typing import Optional 4 | 5 | import numpy as np 6 | import pandas as pd 7 | import streamlit as st 8 | from timeeval import DatasetManager, Datasets 9 | import plotly.graph_objects as go 10 | 11 | from .page import Page 12 | from ..files import Files 13 | 14 | 15 | @st.cache(show_spinner=True, max_entries=1) 16 | def load_results(results_path: Path) -> pd.DataFrame: 17 | res = pd.read_csv(results_path / "results.csv") 18 | res["dataset_name"] = res["dataset"] 19 | res["overall_time"] = res["execute_main_time"].fillna(0) + res["train_main_time"].fillna(0) 20 | res["algorithm-index"] = res.algorithm + "-" + res.index.astype(str) 21 | res = res.drop_duplicates() 22 | return res 23 | 24 | 25 | @st.cache(show_spinner=True, max_entries=1) 26 | def create_dmgr(data_path: Path) -> Datasets: 27 | return DatasetManager(data_path, create_if_missing=False) 28 | 29 | 30 | @st.cache(show_spinner=True, max_entries=100, hash_funcs={pd.DataFrame: pd.util.hash_pandas_object, 
"builtins.function": lambda _: None}) 31 | def plot_boxplot(df, n_show: Optional[int] = None, title="Box plots", ax_label="values", metric="ROC_AUC", _fmt_label=lambda x: x, log: bool = False) -> go.Figure: 32 | df_asl = df.copy() 33 | df_asl["dataset_name"] = df_asl["dataset_name"].str.split(".").str[0] 34 | df_asl = df_asl.pivot(index="algorithm-index", columns="dataset_name", values=metric) 35 | df_asl = df_asl.dropna(axis=0, how="all").dropna(axis=1, how="all") 36 | df_asl["median"] = df_asl.median(axis=1) 37 | df_asl = df_asl.sort_values(by="median", ascending=True) 38 | df_asl = df_asl.drop(columns="median").T 39 | 40 | fig = go.Figure() 41 | for i, c in enumerate(df_asl.columns): 42 | fig.add_trace(go.Box( 43 | x=df_asl[c], 44 | name=_fmt_label(c), 45 | boxpoints=False, 46 | visible="legendonly" if n_show is not None and n_show < i < len(df_asl.columns) - n_show else None 47 | )) 48 | fig.update_layout( 49 | title={"text": title, "xanchor": "center", "x": 0.5}, 50 | xaxis_title=ax_label, 51 | legend_title="Algorithms" 52 | ) 53 | if log: 54 | fig.update_xaxes(type="log") 55 | return fig 56 | 57 | 58 | def load_scores_df(algorithm_name, dataset_id, df, result_path, repetition=1): 59 | params_id = df.loc[(df["algorithm"] == algorithm_name) & (df["collection"] == dataset_id[0]) & (df["dataset"] == dataset_id[1]), "hyper_params_id"].item() 60 | path = ( 61 | result_path / 62 | algorithm_name / 63 | params_id / 64 | dataset_id[0] / 65 | dataset_id[1] / 66 | str(repetition) / 67 | "anomaly_scores.ts" 68 | ) 69 | return pd.read_csv(path, header=None) 70 | 71 | 72 | def plot_scores(algorithm_name, collection_name, dataset_name, df, dmgr, result_path, **kwargs): 73 | if not isinstance(algorithm_name, list): 74 | algorithms = [algorithm_name] 75 | else: 76 | algorithms = algorithm_name 77 | # construct dataset ID 78 | if collection_name == "GutenTAG" and not dataset_name.endswith("supervised"): 79 | dataset_id = (collection_name, f"{dataset_name}.unsupervised") 80 | 
else: 81 | dataset_id = (collection_name, dataset_name) 82 | 83 | # load dataset details 84 | df_dataset = dmgr.get_dataset_df(dataset_id) 85 | 86 | # check if dataset is multivariate 87 | dataset_dim = df.loc[(df["collection"] == collection_name) & (df["dataset_name"] == dataset_name), "dataset_input_dimensionality"].unique().item() 88 | dataset_dim = dataset_dim.lower() 89 | 90 | auroc = {} 91 | df_scores = pd.DataFrame(index=df_dataset.index) 92 | skip_algos = [] 93 | algos = [] 94 | for algo in algorithms: 95 | algos.append(algo) 96 | # get algorithm metric results 97 | try: 98 | auroc[algo] = df.loc[(df["algorithm"] == algo) & (df["collection"] == collection_name) & (df["dataset_name"] == dataset_name), "ROC_AUC"].item() 99 | except ValueError: 100 | st.warning(f"No ROC_AUC score found! Probably {algo} was not executed on {dataset_name}.") 101 | auroc[algo] = -1 102 | skip_algos.append(algo) 103 | continue 104 | 105 | # load scores 106 | try: 107 | df_scores[algo] = load_scores_df(algo, dataset_id, df, result_path).iloc[:, 0] 108 | except (ValueError, FileNotFoundError): 109 | st.warning(f"No anomaly scores found! 
Probably {algo} was not executed on {dataset_name}.") 110 | df_scores[algo] = np.nan 111 | skip_algos.append(algo) 112 | algorithms = [a for a in algos if a not in skip_algos] 113 | 114 | fig = plot_scores_plotly(algorithms, auroc, df_scores, df_dataset, dataset_dim, dataset_name.split(".")[0]) 115 | st.plotly_chart(fig) 116 | 117 | 118 | def plot_scores_plotly(algorithms, auroc, df_scores, df_dataset, dataset_dim, dataset_name, **kwargs) -> go.Figure: 119 | import plotly.graph_objects as go 120 | from plotly.subplots import make_subplots 121 | 122 | # Create plot 123 | fig = make_subplots(2, 1) 124 | if dataset_dim == "multivariate": 125 | for i in range(1, df_dataset.shape[1] - 1): 126 | fig.add_trace(go.Scatter(x=df_dataset.index, y=df_dataset.iloc[:, i], name=f"channel-{i}"), 1, 1) 127 | else: 128 | fig.add_trace(go.Scatter(x=df_dataset.index, y=df_dataset.iloc[:, 1], name="timeseries"), 1, 1) 129 | fig.add_trace(go.Scatter(x=df_dataset.index, y=df_dataset["is_anomaly"], name="label"), 2, 1) 130 | 131 | for algo in algorithms: 132 | fig.add_trace(go.Scatter(x=df_scores.index, y=df_scores[algo], name=f"{algo}={auroc[algo]:.4f}"), 2, 1) 133 | fig.update_xaxes(matches="x") 134 | fig.update_layout( 135 | title=f"Results of {','.join(np.unique(algorithms))} on {dataset_name}", 136 | height=400 137 | ) 138 | return fig 139 | 140 | 141 | class ResultsPage(Page): 142 | def _get_name(self) -> str: 143 | return "Results" 144 | 145 | def _overall_results(self, res: pd.DataFrame): 146 | st.header("Experiment run results") 147 | st.dataframe(res) 148 | 149 | def _error_summary(self, res: pd.DataFrame): 150 | st.header("Errors") 151 | 152 | index_columns = ["algo_training_type", "algo_input_dimensionality", "algorithm"] 153 | df_error_counts = res.pivot_table(index=index_columns, columns=["status"], values="repetition", aggfunc="count") 154 | df_error_counts = df_error_counts.fillna(value=0).astype(np.int64) 155 | if "Status.ERROR" in df_error_counts: 156 | sort_by = 
["algo_input_dimensionality", "Status.ERROR"] 157 | else: 158 | sort_by = ["algo_input_dimensionality"] 159 | df_error_counts = df_error_counts.reset_index().sort_values(by=sort_by, 160 | ascending=False).set_index(index_columns) 161 | 162 | df_error_counts["ALL"] = \ 163 | df_error_counts.get("Status.ERROR", 0) + \ 164 | df_error_counts.get("Status.OK", 0) + \ 165 | df_error_counts.get("Status.TIMEOUT", 0) 166 | 167 | for tpe in ["SEMI_SUPERVISED", "SUPERVISED", "UNSUPERVISED"]: 168 | if tpe in df_error_counts.index: 169 | st.write(tpe) 170 | st.dataframe(df_error_counts.loc[tpe]) 171 | 172 | def _plot_experiment(self, res: pd.DataFrame, dmgr: Datasets, results_path: Path): 173 | st.header("Plot Single Experiment") 174 | col1, col2, col3 = st.columns(3) 175 | with col1: 176 | collection = st.selectbox("Collection", options=res["collection"].unique()) 177 | with col2: 178 | dataset = st.selectbox("Dataset", res[res.collection == collection]["dataset_name"].unique(), format_func=lambda x: x.split(".")[0]) 179 | with col3: 180 | options = res[(res.collection == collection) & (res.dataset_name == dataset) & (res.status.isin(["Status.OK", "OK"]))]["algorithm"].unique() 181 | options = [None] + list(options) 182 | algorithm_name = st.selectbox("Algorithm", options, index=0) 183 | if algorithm_name is not None: 184 | plot_scores(algorithm_name, collection, dataset, res, dmgr, results_path) 185 | 186 | def _df_overall_scores(self, res: pd.DataFrame) -> pd.DataFrame: 187 | aggregations = ["min", "mean", "median", "max"] 188 | df_overall_scores = res.pivot_table(index="algorithm-index", values="ROC_AUC", aggfunc=aggregations) 189 | df_overall_scores.columns = aggregations 190 | df_overall_scores = df_overall_scores.sort_values(by="mean", ascending=False) 191 | return df_overall_scores 192 | 193 | def _quality_summary(self, res: pd.DataFrame): 194 | df_lut = self._df_overall_scores(res) 195 | 196 | st.header("Quality Summary") 197 | if len(res.algorithm.unique()) > 2 and 
st.checkbox("Show only best and worst", key="nshow-check-quality", value=True): 198 | n_show = st.number_input("Show worst and best n algorithms", key="nshow_roc", min_value=2, max_value=df_lut.shape[0], value=min(df_lut.shape[0], 10)) 199 | else: 200 | n_show = None 201 | fmt_label = lambda c: f"{c} (ROC_AUC={df_lut.loc[c, 'mean']:.2f})" 202 | 203 | fig = plot_boxplot(res, n_show=n_show, title="AUC_ROC box plots", ax_label="AUC_ROC score", metric="ROC_AUC", _fmt_label=fmt_label) 204 | st.plotly_chart(fig) 205 | 206 | def _runtime_summary(self, res: pd.DataFrame): 207 | df_lut = self._df_overall_scores(res) 208 | 209 | st.header("Runtime Summary") 210 | if len(res.algorithm.unique()) > 2 and st.checkbox("Show only best and worst", key="nshow-check-rt", value=True): 211 | n_show = st.number_input("Show worst and best n algorithms", key="nshow_rt", min_value=2, max_value=df_lut.shape[0], value=min(df_lut.shape[0], 10)) 212 | else: 213 | n_show = None 214 | fmt_label = lambda c: f"{c} (ROC_AUC={df_lut.loc[c, 'mean']:.2f})" if c in df_lut.index else c 215 | 216 | fig = plot_boxplot(res, n_show=n_show, title="Overall runtime box plots", ax_label="Overall runtime (in seconds)", metric="overall_time", _fmt_label=fmt_label, log=True) 217 | st.plotly_chart(fig) 218 | 219 | def render(self): 220 | st.title(self.name) 221 | files = Files() 222 | 223 | col1, col2 = st.columns(2) 224 | 225 | with col1: 226 | results_dir = st.text_input( 227 | "Choose experiment run results parent folder", 228 | placeholder="/home/user/results", 229 | value=files.results_folder() 230 | ) 231 | 232 | with col2: 233 | experiments = [exp for exp in os.listdir(results_dir) if os.path.isdir(Path(results_dir) / exp) and ("results.csv" in os.listdir(Path(results_dir) / exp))] 234 | results_path = st.selectbox("Choose experiment run results folder", experiments) 235 | 236 | data_path = st.text_input( 237 | "Choose location of datasets folder", 238 | placeholder="/home/user/data", 239 | 
value=files.timeseries_folder() 240 | ) 241 | if results_dir != "" and results_path != "" and data_path != "" and len(experiments) > 0: 242 | results_path = Path(results_dir) / results_path 243 | data_path = Path(data_path) 244 | res = load_results(results_path) 245 | dmgr = create_dmgr(data_path) 246 | 247 | self._overall_results(res) 248 | self._error_summary(res) 249 | self._quality_summary(res) 250 | self._runtime_summary(res) 251 | self._plot_experiment(res, dmgr, results_path) 252 | -------------------------------------------------------------------------------- /timeeval_gui/config.py: -------------------------------------------------------------------------------- 1 | import gutenTAG 2 | from pathlib import Path 3 | 4 | GUTENTAG_CONFIG_SCHEMA_ANOMALY_KIND_URL: str = f"https://github.com/TimeEval/gutentag/raw/v{gutenTAG.__version__}/generation-config-schema/anomaly-kind.guten-tag-generation-config.schema.yaml" 5 | TIMEEVAL_FILES_PATH: Path = Path("timeeval-files") 6 | 7 | SKIP_DOCKER_PULL: bool = True 8 | -------------------------------------------------------------------------------- /timeeval_gui/files.py: -------------------------------------------------------------------------------- 1 | import tempfile 2 | from argparse import Namespace 3 | from pathlib import Path 4 | from typing import Dict, Hashable, Any, Optional 5 | 6 | import pandas as pd 7 | import requests 8 | import yaml 9 | from gutenTAG import GutenTAG 10 | from gutenTAG.addons.timeeval import TimeEvalAddOn 11 | from timeeval import Datasets, DatasetManager 12 | 13 | from timeeval_gui.config import GUTENTAG_CONFIG_SCHEMA_ANOMALY_KIND_URL, TIMEEVAL_FILES_PATH 14 | 15 | 16 | class Files: 17 | _instance: Optional['Files'] = None 18 | 19 | def __new__(cls, *args, **kwargs): 20 | if not cls._instance: 21 | cls._instance = super(Files, cls).__new__(cls, *args, **kwargs) 22 | return cls._instance 23 | 24 | def __init__(self): 25 | if TIMEEVAL_FILES_PATH.is_absolute(): 26 | self._files_path = 
TIMEEVAL_FILES_PATH 27 | else: 28 | self._files_path = (Path.cwd() / TIMEEVAL_FILES_PATH).absolute() 29 | self._files_path.mkdir(parents=True, exist_ok=True) 30 | self._anomaly_kind_schema_path = self._files_path / "cache" / "anomaly-kind.guten-tag-generation-config.schema.yaml" 31 | self._anomaly_kind_schema_path.parent.mkdir(exist_ok=True) 32 | self._ts_path = self._files_path / "timeseries" 33 | self._ts_path.mkdir(exist_ok=True) 34 | self._results_path = self._files_path / "results" 35 | self._results_path.mkdir(exist_ok=True) 36 | 37 | def anomaly_kind_configuration_schema(self) -> Dict[Hashable, Any]: 38 | # load parameter configuration only once 39 | if not self._anomaly_kind_schema_path.exists(): 40 | self._load_anomaly_kind_configuration_schema() 41 | with self._anomaly_kind_schema_path.open("r") as fh: 42 | return yaml.load(fh, Loader=yaml.FullLoader) 43 | 44 | def store_ts(self, gt: GutenTAG) -> None: 45 | # process time series with TimeEvalAddOn to create dataset metadata 46 | with tempfile.TemporaryDirectory() as tmp_path: 47 | tmp_path = Path(tmp_path) 48 | TimeEvalAddOn().process(gt.overview, gt, Namespace(output_dir=tmp_path, no_save=False)) 49 | df_index = pd.read_csv(tmp_path / "datasets.csv").set_index(["collection_name", "dataset_name"]) 50 | 51 | # store index file (and potentially merge with existing beforehand) 52 | if (self._ts_path / "datasets.csv").exists(): 53 | df_existing_index = pd.read_csv(self._ts_path / "datasets.csv").set_index( 54 | ["collection_name", "dataset_name"]) 55 | df_index = pd.concat([df_existing_index[~df_existing_index.index.isin(df_index.index)], df_index]) 56 | df_index.to_csv(self._ts_path / "datasets.csv") 57 | 58 | # save time series 59 | gt.save_timeseries(self._ts_path) 60 | 61 | # remove overview file (contains outdated information) 62 | (self._ts_path / "overview.yaml").unlink() 63 | 64 | def dmgr(self) -> Datasets: 65 | return DatasetManager(self._ts_path, create_if_missing=False) 66 | 67 | def 
results_folder(self) -> Path: 68 | return self._results_path 69 | 70 | def timeseries_folder(self) -> Path: 71 | return self._ts_path 72 | 73 | def _load_anomaly_kind_configuration_schema(self) -> None: 74 | result = requests.get(GUTENTAG_CONFIG_SCHEMA_ANOMALY_KIND_URL) 75 | with self._anomaly_kind_schema_path.open("w") as fh: 76 | fh.write(result.text) 77 | -------------------------------------------------------------------------------- /timeeval_gui/pages/1_📈_GutenTAG.py: -------------------------------------------------------------------------------- 1 | from timeeval_gui._pages import GutenTAGPage 2 | 3 | 4 | def main(): 5 | GutenTAGPage().render() 6 | 7 | 8 | if __name__ == '__main__': 9 | main() 10 | -------------------------------------------------------------------------------- /timeeval_gui/pages/2_🚀_Eval.py: -------------------------------------------------------------------------------- 1 | from timeeval_gui._pages import EvalPage 2 | 3 | 4 | def main(): 5 | EvalPage().render() 6 | 7 | 8 | if __name__ == '__main__': 9 | main() 10 | -------------------------------------------------------------------------------- /timeeval_gui/pages/3_🎯_Results.py: -------------------------------------------------------------------------------- 1 | from timeeval_gui._pages import ResultsPage 2 | 3 | 4 | def main(): 5 | ResultsPage().render() 6 | 7 | 8 | if __name__ == '__main__': 9 | main() 10 | -------------------------------------------------------------------------------- /timeeval_gui/st_redirect.py: -------------------------------------------------------------------------------- 1 | import streamlit as st 2 | import io 3 | import contextlib 4 | 5 | 6 | class _Redirect: 7 | """Taken from https://gist.github.com/schaumb/037f139035d93cff3ad9f4f7e5f739ce""" 8 | class IOStuff(io.StringIO): 9 | def __init__(self, trigger): 10 | super().__init__() 11 | self._trigger = trigger 12 | 13 | def write(self, __s: str) -> int: 14 | res = super().write(__s) 15 | 
self._trigger(self.getvalue()) 16 | return res 17 | 18 | def __init__(self, stdout=None, stderr=False, format=None, to=None): 19 | self.io = _Redirect.IOStuff(self._write) 20 | self.redirections = [] 21 | self.st = None 22 | self.stderr = stderr is True 23 | self.stdout = stdout is True or (stdout is None and not self.stderr) 24 | self.format = format or 'code' 25 | self.to = to 26 | self.fun = None 27 | 28 | if not self.stdout and not self.stderr: 29 | raise ValueError("one of stdout or stderr must be True") 30 | 31 | if self.format not in ['text', 'markdown', 'latex', 'code', 'write']: 32 | raise ValueError( 33 | f"format must be one of the following: {', '.join(['text', 'markdown', 'latex', 'code', 'write'])}") 34 | 35 | if self.to and (not hasattr(self.to, 'text') or not hasattr(self.to, 'empty')): 36 | raise ValueError("'to' is not a streamlit container object") 37 | 38 | def __enter__(self): 39 | if self.st is not None: 40 | raise Exception("Already entered") 41 | to = self.to or st 42 | 43 | to.text( 44 | f"Redirected output from {'stdout and stderr' if self.stdout and self.stderr else 'stdout' if self.stdout else 'stderr'}:") 45 | self.st = to.empty() 46 | 47 | if self.stdout: 48 | self.redirections.append(contextlib.redirect_stdout(self.io)) 49 | if self.stderr: 50 | self.redirections.append(contextlib.redirect_stderr(self.io)) 51 | 52 | self.fun = getattr(self.st, self.format) 53 | for redirection in self.redirections: 54 | redirection.__enter__() 55 | 56 | return self.io 57 | 58 | def __call__(self, to=None, format=None): 59 | return _Redirect(self.stdout, self.stderr, format=format, to=to) 60 | 61 | def __exit__(self, *exc): 62 | res = None 63 | for redirection in self.redirections: 64 | res = redirection.__exit__(*exc) 65 | 66 | self._write(self.io.getvalue()) 67 | 68 | self.redirections = [] 69 | self.st = None 70 | self.fun = None 71 | self.io = _Redirect.IOStuff(self._write) 72 | return res 73 | 74 | def _write(self, data): 75 | self.fun(data) 76 | 77 
| 78 | stdout = _Redirect() 79 | stderr = _Redirect(stderr=True) 80 | stdouterr = _Redirect(stdout=True, stderr=True) 81 | -------------------------------------------------------------------------------- /timeeval_gui/timeseries_config.py: -------------------------------------------------------------------------------- 1 | from typing import Any, List, Dict 2 | 3 | from gutenTAG.anomalies import Anomaly 4 | from gutenTAG.base_oscillations import BaseOscillationInterface 5 | from gutenTAG.generator import TimeSeries 6 | from gutenTAG.generator.parser import ConfigParser 7 | 8 | 9 | class TimeSeriesConfig: 10 | def __init__(self): 11 | self.config: Dict[str, Any] = { 12 | "name": "", 13 | "length": 10, 14 | "semi-supervised": False, 15 | "supervised": False, 16 | "base-oscillations": [], 17 | "anomalies": [] 18 | } 19 | 20 | def set_name(self, name: str): 21 | self.config["name"] = name 22 | 23 | def set_length(self, length: int): 24 | self.config["length"] = length 25 | 26 | def set_supervised(self): 27 | self.config["supervised"] = True 28 | 29 | def set_semi_supervised(self): 30 | self.config["semi-supervised"] = True 31 | 32 | def add_base_oscillation(self, kind: str, **kwargs): 33 | self.config["base-oscillations"].append({"kind": kind, **kwargs}) 34 | 35 | def add_anomaly(self, **kwargs): 36 | self.config["anomalies"].append(kwargs) 37 | 38 | def generate_base_oscillations(self) -> List[BaseOscillationInterface]: 39 | parser = ConfigParser() 40 | return parser._build_base_oscillations(self.config) 41 | 42 | def generate_anomalies(self) -> List[Anomaly]: 43 | parser = ConfigParser() 44 | anomalies = parser._build_anomalies(self.config) 45 | return anomalies 46 | 47 | def generate_timeseries(self) -> TimeSeries: 48 | return TimeSeries(self.generate_base_oscillations(), self.generate_anomalies(), self.name, 49 | supervised=self.config["supervised"], 50 | semi_supervised=self.config["semi-supervised"]) 51 | 52 | def __getattr__(self, item): 53 | return 
self.config[item] 54 | 55 | def __repr__(self): 56 | return f"TimeSeriesConfig(config={self.config})" 57 | -------------------------------------------------------------------------------- /timeeval_gui/utils.py: -------------------------------------------------------------------------------- 1 | from dataclasses import dataclass 2 | from typing import Dict, Tuple, List, Type 3 | 4 | from timeeval_gui.files import Files 5 | 6 | 7 | def get_base_oscillations() -> Dict[str, str]: 8 | return { 9 | "sine": "Sine", 10 | "random-walk": "Random Walk", 11 | "ecg": "ECG", 12 | "polynomial": "Polynomial", 13 | "cylinder-bell-funnel": "Cylinder Bell Funnel", 14 | "random-mode-jump": "Random Mode Jump", 15 | # "formula": "Formula" 16 | } 17 | 18 | 19 | @dataclass 20 | class BOParameter: 21 | key: str 22 | name: str 23 | tpe: str 24 | help: str 25 | 26 | 27 | def get_base_oscillation_parameters(bo: str) -> List[BOParameter]: 28 | common = [ 29 | BOParameter(key="variance", name="Variance", tpe="number", help="Noise factor dependent on amplitude"), 30 | BOParameter(key="trend", name="Trend", tpe="object", 31 | help="Defines another base oscillation as trend that gets added to its parent object. " 32 | "Can be recursively used!"), 33 | BOParameter(key="offset", name="Offset", tpe="number", help="Gets added to the generated time series"), 34 | ] 35 | return { 36 | "sine": [ 37 | BOParameter(key="frequency", name="Frequency", tpe="number", 38 | help="Number of sine waves per 100 points"), 39 | BOParameter(key="amplitude", name="Amplitude", tpe="number", help="+/- deviation from 0"), 40 | BOParameter(key="freq-mod", name="Frequency modulation", tpe="number", 41 | help="Factor (of base frequency) of the frequency modulation that changes the amplitude of the " 42 | "sine wave over time. 
The carrier wave always has an amplitude of 1.") 43 | ], 44 | "random-walk": [ 45 | BOParameter(key="amplitude", name="Amplitude", tpe="number", help="+/- deviation from 0"), 46 | BOParameter(key="smoothing", name="Smoothing factor", tpe="number", 47 | help="Smoothing factor for convolution dependent on length") 48 | ], 49 | "cylinder-bell-funnel": [ 50 | BOParameter(key="avg-pattern-length", name="Average pattern length", tpe="integer", 51 | help="Average length of pattern in time series"), 52 | BOParameter(key="amplitude", name="Amplitude", tpe="number", 53 | help="Average amplitude of pattern in time series"), 54 | BOParameter(key="variance-pattern-length", name="Variance pattern length", tpe="number", 55 | help="Variance of pattern length in time series"), 56 | BOParameter(key="variance-amplitude", name="Variance amplitude", tpe="number", 57 | help="Variance of amplitude of pattern in time series"), 58 | ], 59 | "ecg": [ 60 | BOParameter(key="frequency", name="Frequency", tpe="number", 61 | help="Number of heartbeats per 100 points") 62 | ], 63 | "polynomial": [ 64 | BOParameter(key="polynomial", name="Polynomial parameters", tpe="list[number]", 65 | help="See numpy documentation: https://numpy.org/doc/stable/reference/generated/numpy.polynomial.polynomial.Polynomial.html#numpy.polynomial.polynomial.Polynomial") 66 | ], 67 | "random-mode-jump": [ 68 | BOParameter(key="frequency", name="Frequency", tpe="number", 69 | help="Number of jumps in the time series"), 70 | BOParameter(key="channel_diff", name="Channel mode difference", tpe="number", 71 | help="Value difference of absolute mode values between channels"), 72 | BOParameter(key="channel_offset", name="Channel offset", tpe="number", 73 | help="Value offset from 0 in both directions"), 74 | BOParameter(key="random_seed", name="Random seed", tpe="integer", 75 | help="Random seed used to generate similar channels"), 76 | ] 77 | }.get(bo, []) + common 78 | 79 | 80 | def get_anomaly_types(bo_kind: str) -> Dict[str, str]: 
81 | name_mapping = { 82 | "amplitude": "Amplitude", 83 | "extremum": "Extremum", 84 | "frequency": "Frequency", 85 | "mean": "Mean", 86 | "pattern": "Pattern", 87 | "pattern-shift": "Pattern Shift", 88 | "platform": "Platform", 89 | "trend": "Trend", 90 | "variance": "Variance", 91 | "mode-correlation": "Mode Correlation", 92 | } 93 | supported_anomalies = { 94 | "sine": ["amplitude", "extremum", "frequency", "mean", "pattern", "pattern-shift", "platform", "trend", 95 | "variance"], 96 | "random-walk": ["amplitude", "extremum", "mean", "platform", "trend", "variance"], 97 | "ecg": ["amplitude", "extremum", "frequency", "mean", "pattern", "pattern-shift", "platform", "trend", 98 | "variance"], 99 | "polynomial": ["extremum", "mean", "platform", "trend", "variance"], 100 | "cylinder-bell-funnel": ["amplitude", "extremum", "mean", "pattern", "platform", "trend", "variance"], 101 | "random-mode-jump": ["mode-correlation"], 102 | "formula": ["extremum"] 103 | } 104 | return dict(map(lambda x: (x, name_mapping[x]), supported_anomalies.get(bo_kind, []))) 105 | 106 | 107 | def map_types(t: str) -> Type: 108 | return { 109 | "boolean": bool, 110 | "string": str, 111 | "integer": int, 112 | "number": float 113 | }.get(t, str) 114 | 115 | 116 | def get_anomaly_params(anomaly: str) -> List[Tuple[str, Type, str]]: 117 | params = [] 118 | param_config = Files().anomaly_kind_configuration_schema() 119 | 120 | for param_name, param in param_config["definitions"].get(f"{anomaly}-params", {}).get("properties", {}).items(): 121 | params.append((param_name, map_types(param.get("type")), param.get("description", ""))) 122 | 123 | return params 124 | -------------------------------------------------------------------------------- /timeeval_gui/🏠_Home.py: -------------------------------------------------------------------------------- 1 | import streamlit as st 2 | 3 | 4 | def main(): 5 | st.markdown(""" 6 | # Welcome to the TimeEval GUI 7 | 8 | TimeEval includes an extensive data 
generator and supports both interactive and batch evaluation scenarios. 9 | This novel toolkit aims to ease the evaluation effort and help the community provide more meaningful evaluations 10 | in the time series anomaly detection field. 11 | 12 | This tool has three main components: 13 | 14 | 1. [GutenTAG](/GutenTAG) to generate time series 15 | 2. [Eval](/Eval) to run multiple anomaly detectors on multiple datasets 16 | 3. [Results](/Results) to compare the quality of multiple anomaly detectors 17 | """) 18 | 19 | st.info("For more detailed documentation on the tools: " 20 | "[GutenTAG Documentation](https://github.com/TimeEval/gutentag/blob/main/doc/index.md) and " 21 | "[Eval Documentation](https://timeeval.readthedocs.io/en/latest/)") 22 | 23 | 24 | if __name__ == '__main__': 25 | main() 26 | --------------------------------------------------------------------------------