├── .github
│   ├── FUNDING.yml
│   └── ISSUE_TEMPLATE
│       ├── bug_report.md
│       └── feature_request.md
├── .gitignore
├── LICENSE
├── README.md
├── SECURITY.md
├── aggregator.py
├── requirements.in
└── requirements.txt

--------------------------------------------------------------------------------
/.github/FUNDING.yml:
--------------------------------------------------------------------------------
# These are supported funding model platforms

github: Spenhouet

--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/bug_report.md:
--------------------------------------------------------------------------------
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''

---

**Describe the bug**
A clear and concise description of what the bug is.

**To Reproduce**
Steps to reproduce the behavior.

**Expected behavior**
A clear and concise description of what you expected to happen.

**Screenshots**
If applicable, add screenshots to help explain your problem.

**Desktop (please complete the following information):**
 - OS:
 - python version
 - tensorflow version
 - numpy version

**Additional context**
Add any other context about the problem here.

--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/feature_request.md:
--------------------------------------------------------------------------------
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''

---

**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

**Describe the solution you'd like**
A clear and concise description of what you want to happen.

**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.

**Additional context**
Add any other context or screenshots about the feature request here.

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2019 Sebastian Penhouet

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# tensorboard-aggregator

This project provides an easy-to-use way to aggregate multiple tensorboard runs. The max, min, mean, median, standard deviation and variance of the scalars from multiple runs are saved either as a new tensorboard summary or as a `.csv` table.

There is a similar tool which uses PyTorch to output the tensorboard summary: [TensorBoard Reducer](https://github.com/janosh/tensorboard-reducer)

## Feature Overview

- Aggregates scalars of multiple tensorboard files
- Saves aggregates as a new tensorboard summary or as `.csv`
- Aggregates by any numpy function (default: max, min, mean, median, std, var)
- Allows any number of subpath structures
- Keeps the step numbering
- Saves the average wall time per step

## Setup and run configuration

1. Download or clone the repository files to your computer
1. Go into the repository folder
1. Install the requirements: `pip3 install -r requirements.txt --upgrade`
1. You can now run the aggregation with: `python aggregator.py`
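For example, to aggregate the runs in a hypothetical base folder `test_param_xy` into `.csv` tables (note the quotes around the list, so the shell passes it to the script as a single argument):

    cd test_param_xy
    python static/path/to/aggregator.py --subpaths "['test', 'train']" --output csv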
### Parameters

| Parameter    |          | Default                   | Description |
| ------------ | -------- | ------------------------- | ----------- |
| _--path_     | optional | current working directory | Path to the folder containing the runs |
| _--subpaths_ | optional | `['.']`                   | List of all subpaths |
| _--output_   | optional | `summary`                 | Possible values: `summary`, `csv` |

### Recommendation

- Add the repository folder to the PATH (global environment variables).
- Create an additional script file within the repository folder containing `python static/path/to/aggregator.py`
  - Script name: `aggregate.sh` / `aggregate.bat` / ... (depending on your OS)
  - Change the default behavior via parameters
  - Do not change the `path` parameter, since by default it is the path the script is run from
- Workflow from here: open a folder with tensorboard files and call the script: aggregate files will be created for the current directory

## Explanation

Example folder structure:

    .
    ├── ...
    ├── test_param_xy          # Folder containing the runs for aggregation
    │   ├── run_1              # Folder containing tensorboard files of one run
    │   │   ├── test           # Subpath containing one tensorboard file
    │   │   │   └── events.out.tfevents. ...
    │   │   └── train
    │   │       └── events.out.tfevents. ...
    │   ├── run_2
    │   ├── ...
    │   └── run_X
    └── ...

The folder `test_param_xy` is the base path (`cd test_param_xy`).
The tensorboard summaries for the aggregation are created by calling the `aggregate` script (containing: `python static/path/to/aggregator.py --subpaths "['test', 'train']" --output summary`).

The base folder contains multiple subfolders. Each subfolder contains the tensorboard files of a different run of the same model and configuration.

The resulting folder structure for `summary` looks like this (the output folder and path order follow `aggregate_to_summary` in `aggregator.py`):

    .
    ├── ...
    ├── test_param_xy
    │   ├── ...
    │   └── aggregates
    │       ├── max
    │       │   └── test_param_xy
    │       │       ├── test
    │       │       │   └── events.out.tfevents. ...
    │       │       └── train
    │       ├── min
    │       ├── mean
    │       ├── median
    │       ├── std
    │       └── var
    └── ...

Multiple aggregate summaries can be put together in one directory.
Since the original base folder name is kept as a subfolder of each aggregation-function folder, the summaries stay distinguishable within tensorboard.

    .
    ├── ...
    ├── max
    │   ├── test_param_x
    │   ├── test_param_y
    │   ├── test_param_z
    │   └── test_param_v
    ├── min
    ├── mean
    ├── median
    └── std

The `.csv` tables for the aggregation are created by calling the `aggregate` script (containing: `python static/path/to/aggregator.py --subpaths "['test', 'train']" --output csv`).

The resulting folder structure for `csv` looks like this (file names follow `write_csv` in `aggregator.py`, with one file per scalar tag and subpath):

    .
    ├── ...
    ├── test_param_xy
    │   ├── ...
    │   └── aggregates
    │       ├── <scalar_tag>-test-test_param_xy.csv
    │       └── <scalar_tag>-train-test_param_xy.csv
    └── ...

Each file contains one column per aggregation function, indexed by step.
The `.csv` files are primarily meant for LaTeX plots.
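A minimal sketch of loading one of these tables with pandas for plotting; `loss` is a hypothetical scalar tag used only for illustration, and the file name pattern is the one produced by `write_csv`:

    import pandas as pd

    # The tables are semicolon-separated; the first column is the step index.
    df = pd.read_csv('aggregates/loss-test-test_param_xy.csv', sep=';', index_col=0)

    # One column per aggregation function, e.g. inspect the mean and std per step.
    print(df[['mean', 'std']].head())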
## Limitations

- The aggregation only works for scalars, not for other types like histograms.
- All runs for one aggregation need exactly the same tags: the naming and the number of scalar metrics need to be equal for all runs.
- All runs for one aggregation need the same steps: the number of iterations, the epochs, and the saving frequency need to be equal for all runs of one scalar.

## Contributions

If there are potential problems (bugs, incompatibilities with newer library versions or with an OS) or feature requests, please create a GitHub issue [here](https://github.com/Spenhouet/tensorboard-aggregator/issues).

Dependencies are managed with [pip-tools](https://github.com/jazzband/pip-tools).
Just add new dependencies to `requirements.in` and generate a new `requirements.txt` by running `pip-compile` on the command line.

## License

[MIT License](LICENSE)

--------------------------------------------------------------------------------
/SECURITY.md:
--------------------------------------------------------------------------------
# Security Policy

## Supported Versions

No specific version support.

## Reporting a Vulnerability

If there is something you are concerned about, please write an issue.

--------------------------------------------------------------------------------
/aggregator.py:
--------------------------------------------------------------------------------
# MIT License
# Copyright (c) 2019 Sebastian Penhouet
# GitHub project: https://github.com/Spenhouet/tensorboard-aggregator
# ==============================================================================
"""Aggregates multiple tensorboard runs"""

import ast
import argparse
import os
import re
from pathlib import Path

import pandas as pd
import numpy as np
import tensorflow as tf
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator
from tensorflow.core.util.event_pb2 import Event

FOLDER_NAME = 'aggregates'


def extract(dpath, subpath):
    scalar_accumulators = [EventAccumulator(str(dpath / dname / subpath)).Reload(
    ).scalars for dname in os.listdir(dpath) if dname != FOLDER_NAME]

    # Filter non event files
    scalar_accumulators = [scalar_accumulator for scalar_accumulator in scalar_accumulators if scalar_accumulator.Keys()]

    # Get and validate all scalar keys
    all_keys = [tuple(scalar_accumulator.Keys()) for scalar_accumulator in scalar_accumulators]
    assert len(set(all_keys)) == 1, "All runs need to have the same scalar keys. There are mismatches in {}".format(all_keys)
    keys = all_keys[0]

    all_scalar_events_per_key = [[scalar_accumulator.Items(key) for scalar_accumulator in scalar_accumulators] for key in keys]

    # Get and validate all steps per key
    all_steps_per_key = [[tuple(scalar_event.step for scalar_event in scalar_events) for scalar_events in all_scalar_events]
                         for all_scalar_events in all_scalar_events_per_key]

    for i, all_steps in enumerate(all_steps_per_key):
        assert len(set(all_steps)) == 1, "For scalar {} the step numbering or count doesn't match. Step count for all runs: {}".format(
            keys[i], [len(steps) for steps in all_steps])

    steps_per_key = [all_steps[0] for all_steps in all_steps_per_key]

    # Get and average wall times per step per key
    wall_times_per_key = [np.mean([tuple(scalar_event.wall_time for scalar_event in scalar_events) for scalar_events in all_scalar_events], axis=0)
                          for all_scalar_events in all_scalar_events_per_key]

    # Get values per step per key
    values_per_key = [[[scalar_event.value for scalar_event in scalar_events] for scalar_events in all_scalar_events]
                      for all_scalar_events in all_scalar_events_per_key]

    all_per_key = dict(zip(keys, zip(steps_per_key, wall_times_per_key, values_per_key)))

    return all_per_key
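
# Illustrative sketch of the structure extract() returns, for a hypothetical
# scalar tag 'loss' logged by two runs at three steps (values invented):
#
# {'loss': ((0, 1, 2),                    # steps, identical across all runs
#           array([12.3, 15.6, 18.9]),    # wall time per step, averaged over runs
#           [[0.9, 0.5, 0.3],             # values of run 1
#            [1.0, 0.6, 0.2]])}           # values of run 2
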
def aggregate_to_summary(dpath, aggregation_ops, extracts_per_subpath):
    for op in aggregation_ops:
        for subpath, all_per_key in extracts_per_subpath.items():
            path = dpath / FOLDER_NAME / op.__name__ / dpath.name / subpath
            aggregations_per_key = {key: (steps, wall_times, op(values, axis=0)) for key, (steps, wall_times, values) in all_per_key.items()}
            write_summary(path, aggregations_per_key)


def write_summary(dpath, aggregations_per_key):
    writer = tf.summary.create_file_writer(str(dpath))

    for key, (steps, wall_times, aggregations) in aggregations_per_key.items():
        for step, wall_time, aggregation in zip(steps, wall_times, aggregations):
            with writer.as_default():
                tf.summary.scalar(key, aggregation, step=step)
        writer.flush()


def aggregate_to_csv(dpath, aggregation_ops, extracts_per_subpath):
    for subpath, all_per_key in extracts_per_subpath.items():
        for key, (steps, wall_times, values) in all_per_key.items():
            aggregations = [op(values, axis=0) for op in aggregation_ops]
            write_csv(dpath, subpath, key, dpath.name, aggregations, steps, aggregation_ops)


def get_valid_filename(s):
    s = str(s).strip().replace(' ', '_')
    return re.sub(r'(?u)[^-\w.]', '', s)


def write_csv(dpath, subpath, key, fname, aggregations, steps, aggregation_ops):
    path = dpath / FOLDER_NAME

    if not path.exists():
        os.makedirs(path)

    file_name = get_valid_filename(key) + '-' + get_valid_filename(subpath) + '-' + fname + '.csv'
    aggregation_ops_names = [aggregation_op.__name__ for aggregation_op in aggregation_ops]
    df = pd.DataFrame(np.transpose(aggregations), index=steps, columns=aggregation_ops_names)
    df.to_csv(path / file_name, sep=';')


def aggregate(dpath, output, subpaths):
    name = dpath.name

    aggregation_ops = [np.mean, np.min, np.max, np.median, np.std, np.var]

    ops = {
        'summary': aggregate_to_summary,
        'csv': aggregate_to_csv
    }

    print("Started aggregation {}".format(name))

    extracts_per_subpath = {subpath: extract(dpath, subpath) for subpath in subpaths}

    ops.get(output)(dpath, aggregation_ops, extracts_per_subpath)

    print("Ended aggregation {}".format(name))


if __name__ == '__main__':
    def param_list(param):
        p_list = ast.literal_eval(param)
        if type(p_list) is not list:
            raise argparse.ArgumentTypeError("Parameter {} is not a list".format(param))
        return p_list

    parser = argparse.ArgumentParser()
    parser.add_argument("--path", type=str, help="main path for tensorboard files", default=os.getcwd())
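    # Note: on a shell, quote the list (e.g. --subpaths "['test', 'train']", a
    # hypothetical example) so it reaches argparse as a single argument;
    # param_list() then turns that string into a Python list via ast.literal_eval.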
    parser.add_argument("--subpaths", type=param_list, help="subpath structures", default=['.'])
    parser.add_argument("--output", type=str, help="aggregation can be saved as tensorboard file (summary) or as table (csv)", default='summary')

    args = parser.parse_args()

    path = Path(args.path)

    if not path.exists():
        raise argparse.ArgumentTypeError("Parameter {} is not a valid path".format(path))

    subpaths = [path / dname / subpath for subpath in args.subpaths for dname in os.listdir(path) if dname != FOLDER_NAME]

    for subpath in subpaths:
        if not os.path.exists(subpath):
            raise argparse.ArgumentTypeError("Parameter {} is not a valid path".format(subpath))

    if args.output not in ['summary', 'csv']:
        raise argparse.ArgumentTypeError("Parameter {} is not summary or csv".format(args.output))

    aggregate(path, args.output, args.subpaths)

--------------------------------------------------------------------------------
/requirements.in:
--------------------------------------------------------------------------------
pip-tools
pandas
numpy
tensorflow
tensorboard

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
#
# This file is autogenerated by pip-compile with Python 3.11
# by the following command:
#
#    pip-compile
#
absl-py==1.4.0
    # via
    #   tensorboard
    #   tensorflow-macos
astunparse==1.6.3
    # via tensorflow-macos
build==0.10.0
    # via pip-tools
cachetools==5.3.1
    # via google-auth
certifi==2023.7.22
    # via requests
charset-normalizer==3.2.0
    # via requests
click==8.1.7
    # via pip-tools
flatbuffers==23.5.26
    # via tensorflow-macos
gast==0.4.0
    # via tensorflow-macos
google-auth==2.22.0
    # via
    #   google-auth-oauthlib
    #   tensorboard
google-auth-oauthlib==1.0.0
    # via tensorboard
google-pasta==0.2.0
    # via tensorflow-macos
grpcio==1.57.0
    # via
    #   tensorboard
    #   tensorflow-macos
h5py==3.9.0
    # via tensorflow-macos
idna==3.4
    # via requests
keras==2.13.1
    # via tensorflow-macos
libclang==16.0.6
    # via tensorflow-macos
markdown==3.4.4
    # via tensorboard
markupsafe==2.1.3
    # via werkzeug
numpy==1.24.3
    # via
    #   -r requirements.in
    #   h5py
    #   opt-einsum
    #   pandas
    #   tensorboard
    #   tensorflow-macos
oauthlib==3.2.2
    # via requests-oauthlib
opt-einsum==3.3.0
    # via tensorflow-macos
packaging==23.1
    # via
    #   build
    #   tensorflow-macos
pandas==2.0.3
    # via -r requirements.in
pip-tools==7.3.0
    # via -r requirements.in
protobuf==4.24.1
    # via
    #   tensorboard
    #   tensorflow-macos
pyasn1==0.5.0
    # via
    #   pyasn1-modules
    #   rsa
pyasn1-modules==0.3.0
    # via google-auth
pyproject-hooks==1.0.0
    # via build
python-dateutil==2.8.2
    # via pandas
pytz==2023.3
    # via pandas
requests==2.31.0
    # via
    #   requests-oauthlib
    #   tensorboard
requests-oauthlib==1.3.1
    # via google-auth-oauthlib
rsa==4.9
    # via google-auth
six==1.16.0
    # via
    #   astunparse
    #   google-auth
    #   google-pasta
    #   python-dateutil
    #   tensorflow-macos
tensorboard==2.13.0
    # via
    #   -r requirements.in
    #   tensorflow-macos
tensorboard-data-server==0.7.1
    # via tensorboard
tensorflow==2.13.0
    # via -r requirements.in
tensorflow-estimator==2.13.0
    # via tensorflow-macos
tensorflow-macos==2.13.0
    # via tensorflow
termcolor==2.3.0
    # via tensorflow-macos
typing-extensions==4.5.0
    # via tensorflow-macos
tzdata==2023.3
    # via pandas
urllib3==1.26.16
    # via
    #   google-auth
    #   requests
werkzeug==2.3.7
    # via tensorboard
wheel==0.41.2
    # via
    #   astunparse
    #   pip-tools
    #   tensorboard
wrapt==1.15.0
    # via tensorflow-macos

# The following packages are considered to be unsafe in a requirements file:
# pip
# setuptools
--------------------------------------------------------------------------------