├── tests ├── __init__.py └── test_titles.py ├── doc └── source │ ├── specs │ ├── index.rst │ └── redirect.py ├── specs ├── queens │ ├── implemented │ │ ├── .gitkeep │ │ ├── parallel_future.jpg │ │ ├── parallel_current.jpg │ │ ├── support_mark_down_action_for_instances.rst │ │ ├── snmp-parsing-service.rst │ │ ├── webhooks.rst │ │ ├── alarm-counts-api.rst │ │ ├── parallel-evaluation.rst │ │ ├── template-include.rst │ │ ├── event-persistor.rst │ │ ├── template-CRUD.rst │ │ ├── refactor-execute-mistral.rst │ │ └── db-support.rst │ └── approved │ │ └── Vitrage-Template_CRUD.jpg ├── mitaka │ ├── synchronizer_init.jpg │ ├── ui-system-health-visualization.rst │ ├── networkx-performance-improvement.rst │ ├── vitrage-template-validator.rst │ ├── entity-graph-consistency-validator.rst │ ├── vitrage-resource-processor.rst │ ├── entity-graph-change-notifications.rst │ ├── vitrage-cli.rst │ ├── networkx-graph-driver.rst │ ├── vitrage-support-deduced-alarms.rst │ ├── vitrage-evaluator-engine.rst │ ├── vitrage-support-rca.rst │ ├── nova-entity-transformer.rst │ ├── synchronizer-nagios-get-all.rst │ └── aodh-notifier.rst ├── ocata │ ├── collectd-datasource.rst │ ├── doctor-datasource.rst │ ├── aodh-message-bus-notifications.rst │ ├── multi-tenancy-support.rst │ ├── vitrage-id.rst │ ├── event-api.rst │ └── static-datasource-configuration.rst ├── pike │ └── implemented │ │ ├── integration-with-mistral.rst │ │ ├── template-not-operator-support.rst │ │ ├── keycloak.rst │ │ ├── osprofiler-support.rst │ │ ├── resource-show-api.rst │ │ ├── snmp-notifications.rst │ │ ├── resource-list-api.rst │ │ └── entity-equivalence.rst ├── rocky │ ├── implemented │ │ ├── rpc-collector.rst │ │ ├── graph_fast_failover.rst │ │ ├── datasource-scaffold.rst │ │ └── k8s-datasource.rst │ └── approved │ │ └── add-action-list-panel.rst ├── newton │ ├── template-validate-api.rst │ ├── heat-datasource.rst │ └── template-list-api.rst └── stein │ └── implemented │ ├── services_list_and_statuses.rst │ └── 
short_template_format.rst ├── .zuul.yaml ├── .stestr.conf ├── .gitreview ├── LICENSE ├── .gitignore ├── bindep.txt ├── requirements.txt ├── setup.cfg ├── setup.py ├── tox.ini └── README.rst /tests/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /doc/source/specs: -------------------------------------------------------------------------------- 1 | ../../specs/ -------------------------------------------------------------------------------- /specs/queens/implemented/.gitkeep: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /.zuul.yaml: -------------------------------------------------------------------------------- 1 | - project: 2 | templates: 3 | - openstack-specs-jobs 4 | -------------------------------------------------------------------------------- /.stestr.conf: -------------------------------------------------------------------------------- 1 | [DEFAULT] 2 | test_path=${TEST_PATH:-./vitrage_specs/tests} 3 | top_dir=./ 4 | 5 | -------------------------------------------------------------------------------- /.gitreview: -------------------------------------------------------------------------------- 1 | [gerrit] 2 | host=review.opendev.org 3 | port=29418 4 | project=openstack/vitrage-specs.git 5 | -------------------------------------------------------------------------------- /specs/mitaka/synchronizer_init.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/openstack/vitrage-specs/HEAD/specs/mitaka/synchronizer_init.jpg -------------------------------------------------------------------------------- /specs/queens/implemented/parallel_future.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/openstack/vitrage-specs/HEAD/specs/queens/implemented/parallel_future.jpg -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | This work is licensed under a Creative Commons Attribution 3.0 Unported License. 2 | 3 | http://creativecommons.org/licenses/by/3.0/legalcode -------------------------------------------------------------------------------- /specs/queens/approved/Vitrage-Template_CRUD.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/openstack/vitrage-specs/HEAD/specs/queens/approved/Vitrage-Template_CRUD.jpg -------------------------------------------------------------------------------- /specs/queens/implemented/parallel_current.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/openstack/vitrage-specs/HEAD/specs/queens/implemented/parallel_current.jpg -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | *.iml 2 | AUTHORS 3 | ChangeLog 4 | build 5 | .tox 6 | .venv 7 | *.egg* 8 | *.swp 9 | *.swo 10 | *.pyc 11 | .stestr/ 12 | #IntelJ Idea 13 | .idea/ 14 | -------------------------------------------------------------------------------- /bindep.txt: -------------------------------------------------------------------------------- 1 | # This is a cross-platform list tracking distribution packages needed for install and tests; 2 | # see https://docs.openstack.org/infra/bindep/ for additional information. 
3 | 4 | # graphviz is necessary for documentation build 5 | graphviz 6 | 7 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | # The order of packages is significant, because pip processes them in the order 2 | # of appearance. Changing the order has an impact on the overall integration 3 | # process, which may cause wedges in the gate later. 4 | 5 | pbr!=2.1.0 # Apache-2.0 6 | sphinx>=2.0.0,!=2.1.0 # BSD 7 | graphviz>=0.4,!=0.5.0 # MIT License 8 | stestr>=2.0.0 9 | testtools>=0.9.34 10 | yasfb 11 | openstackdocstheme>=2.2.1 # Apache-2.0 12 | -------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | [metadata] 2 | name = vitrage-specs 3 | summary = OpenStack Vitrage Project Development Specs 4 | description_file = 5 | README.rst 6 | author = OpenStack 7 | author_email = openstack-discuss@lists.openstack.org 8 | home_page = http://specs.openstack.org/openstack/vitrage-specs/ 9 | classifier = 10 | Intended Audience :: Developers 11 | License :: OSI Approved :: Apache Software License 12 | Operating System :: POSIX :: Linux 13 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # Copyright (c) 2013 Hewlett-Packard Development Company, L.P. 3 | # 4 | # Licensed under the Apache License, Version 2.0 (the "License"); 5 | # you may not use this file except in compliance with the License. 
6 | # You may obtain a copy of the License at 7 | # 8 | # http://www.apache.org/licenses/LICENSE-2.0 9 | # 10 | # Unless required by applicable law or agreed to in writing, software 11 | # distributed under the License is distributed on an "AS IS" BASIS, 12 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 13 | # implied. 14 | # See the License for the specific language governing permissions and 15 | # limitations under the License. 16 | 17 | # THIS FILE IS MANAGED BY THE GLOBAL REQUIREMENTS REPO - DO NOT EDIT 18 | import setuptools 19 | 20 | setuptools.setup( 21 | setup_requires=['pbr'], 22 | pbr=True) -------------------------------------------------------------------------------- /specs/mitaka/ui-system-health-visualization.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | =================================== 8 | Vitrage System Health Visualization 9 | =================================== 10 | 11 | https://blueprints.launchpad.net/vitrage-dashboard/+spec/ui-system-health-sunburst 12 | 13 | A macro view to show the general health of the system. 14 | 15 | Problem description 16 | =================== 17 | 18 | Missing visualization of the system status. 
19 | 20 | Proposed change 21 | =============== 22 | 23 | We will present the system health in a sunburst graph, with colors and labels 24 | that represent their status. 25 | 26 | Implementation 27 | ============== 28 | 29 | Assignee(s) 30 | ----------- 31 | 32 | Primary assignee: 33 | alon-heller 34 | -------------------------------------------------------------------------------- /tox.ini: -------------------------------------------------------------------------------- 1 | [tox] 2 | minversion = 3.18.0 3 | envlist = docs,py3 4 | skipsdist = True 5 | ignore_basepython_conflict = True 6 | 7 | [testenv] 8 | basepython = python3 9 | usedevelop = True 10 | setenv = VIRTUAL_ENV={envdir} 11 | deps = -r{toxinidir}/requirements.txt 12 | commands = stestr run --slowest {posargs} 13 | 14 | [testenv:venv] 15 | commands = {posargs} 16 | 17 | [testenv:docs] 18 | deps = 19 | -c{env:UPPER_CONSTRAINTS_FILE:https://opendev.org/openstack/requirements/raw/branch/master/upper-constraints.txt} 20 | -r{toxinidir}/requirements.txt 21 | commands = sphinx-build -W -E -b html -d doc/build/doctrees doc/source doc/build/html 22 | 23 | [testenv:doc8] 24 | deps = -r{toxinidir}/requirements.txt doc8 25 | commands = doc8 doc/source 26 | 27 | [testenv:autobuild] 28 | allowlist_externals = 29 | sphinx-autobuild 30 | commands = 31 | sphinx-autobuild --watch specs --open-browser doc/source doc/build 32 | -------------------------------------------------------------------------------- /specs/mitaka/networkx-performance-improvement.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License.
4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ================================ 8 | NetworkX Performance Improvement 9 | ================================ 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/networkx-performance-improvement 12 | 13 | Many CRUD operations can be performed on the Entity Graph simultaneously. 14 | This can cause inconsistency regarding the resources and states on the graph. 15 | 16 | Problem description 17 | =================== 18 | 19 | Many CRUD operations can be performed on the Entity Graph simultaneously. 20 | This can cause inconsistency regarding the resources and states on the graph. 21 | 22 | Proposed change 23 | =============== 24 | 25 | We would like to have graph versioning, which will hold two copies of the 26 | Entity Graph: (1) a stable copy of the last consistent graph, and (2) a dynamic 27 | copy of the graph, which is updated by the Vitrage Resource Processor. 28 | At a set interval, the changes made to the dynamic graph will be applied 29 | to the stable graph, bringing the stable graph up to date. 30 | 31 | 32 | Implementation 33 | ============== 34 | 35 | Assignee(s) 36 | ----------- 37 | 38 | Primary assignee: 39 | alexey_weyl 40 | 41 | -------------------------------------------------------------------------------- /specs/mitaka/vitrage-template-validator.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ================== 8 | Template Validator 9 | ================== 10 | 11 | Launchpad blueprint: 12 | 13 | https://blueprints.launchpad.net/vitrage/+spec/template-validator 14 | 15 | Vitrage Evaluator serves as a workflow manager controlling the analysis and 16 | activation of templates and the execution of template actions.
17 | 18 | Template Validator ensures that a new template is correct, meaning that it conforms 19 | to the Vitrage Template Standard. 20 | 21 | 22 | Problem description 23 | =================== 24 | 25 | Templates do not always meet the Vitrage Template Standard. For example, they may contain 26 | an unsupported action, an invalid alarm name, or an incorrect graph template. 27 | 28 | Proposed change 29 | =============== 30 | 31 | Template validator is a part of Vitrage Evaluator. It receives a template, 32 | runs over it and checks its correctness. 33 | If the template is valid, it notifies the Evaluator Engine, which inserts the 34 | template into the template DB. Otherwise, the insertion fails. 35 | 36 | Alternatives 37 | ------------ 38 | A template editor that prevents invalid templates. 39 | 40 | Data model impact 41 | ----------------- 42 | None 43 | 44 | REST API impact 45 | --------------- 46 | The validator runs when a new template is added through the API create call. 47 | -------------------------------------------------------------------------------- /doc/source/index.rst: -------------------------------------------------------------------------------- 1 | .. vitrage-specs documentation master file 2 | 3 | ============================== 4 | Vitrage Project Specifications 5 | ============================== 6 | 7 | Specifications 8 | ============== 9 | 10 | Mitaka 11 | ------ 12 | 13 | .. toctree:: 14 | :glob: 15 | :maxdepth: 1 16 | 17 | specs/mitaka/* 18 | 19 | Newton 20 | ------ 21 | 22 | .. toctree:: 23 | :glob: 24 | :maxdepth: 1 25 | 26 | specs/newton/* 27 | 28 | Ocata 29 | ----- 30 | 31 | .. toctree:: 32 | :glob: 33 | :maxdepth: 1 34 | 35 | specs/ocata/* 36 | 37 | Pike 38 | ---- 39 | 40 | Implemented 41 | ^^^^^^^^^^^ 42 | 43 | .. toctree:: 44 | :glob: 45 | :maxdepth: 1 46 | 47 | specs/pike/implemented/* 48 | 49 | Queens 50 | ------ 51 | 52 | Implemented 53 | ^^^^^^^^^^^ 54 | 55 | .. 
toctree:: 56 | :glob: 57 | :maxdepth: 1 58 | 59 | specs/queens/implemented/* 60 | 61 | Approved 62 | ^^^^^^^^ 63 | 64 | .. toctree:: 65 | :glob: 66 | :maxdepth: 1 67 | 68 | specs/queens/approved/* 69 | 70 | Rocky 71 | ----- 72 | 73 | Implemented 74 | ^^^^^^^^^^^ 75 | 76 | .. toctree:: 77 | :glob: 78 | :maxdepth: 1 79 | 80 | specs/rocky/implemented/* 81 | 82 | Approved 83 | ^^^^^^^^ 84 | 85 | .. toctree:: 86 | :glob: 87 | :maxdepth: 1 88 | 89 | specs/rocky/approved/* 90 | 91 | Stein 92 | ----- 93 | 94 | Implemented 95 | ^^^^^^^^^^^ 96 | 97 | .. toctree:: 98 | :glob: 99 | :maxdepth: 1 100 | 101 | specs/stein/implemented/* 102 | 103 | Indices and tables 104 | ================== 105 | 106 | * :ref:`search` 107 | -------------------------------------------------------------------------------- /specs/mitaka/entity-graph-consistency-validator.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ========================================== 8 | Vitrage Entity Graph Consistency Validator 9 | ========================================== 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/entity-graph-consistency-validator 12 | 13 | The Entity Graph may have mistakes, such as an incorrect resource state, and 14 | we would like to detect and repair such errors. 15 | 16 | Problem description 17 | =================== 18 | 19 | The Entity Graph is a living graph which is updated all the time with the data 20 | received from the synchronizer(s). The Entity Graph may have errors and become 21 | inconsistent compared to the state of the Cloud, which may occur due to: 22 | (1) The Vitrage Resource Processor may perform incorrect operations during 23 | graph updates, for example due to problems resulting from multi-threading. 24 | (2) Missing updates from the synchronizer(s). 
25 | 26 | Proposed change 27 | =============== 28 | 29 | We would like to check the Entity Graph at a set (configurable) interval and 30 | validate the consistency of its resources and state. If the Entity Graph is 31 | incorrect, the validator will repair it. When needed, it may consult the synchronizer(s) 32 | to retrieve missing data. 33 | 34 | Alternatives 35 | ------------ 36 | 37 | None 38 | 39 | REST API impact 40 | --------------- 41 | 42 | None 43 | 44 | 45 | Implementation 46 | ============== 47 | 48 | Assignee(s) 49 | ----------- 50 | 51 | Primary assignee: 52 | alexey_weyl 53 | 54 | -------------------------------------------------------------------------------- /specs/mitaka/vitrage-resource-processor.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ========================== 8 | Vitrage Resource Processor 9 | ========================== 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/vitrage-resource-processor 12 | 13 | When Vitrage initializes, we need to create the Entity Graph on which 14 | Vitrage will run its algorithms (subgraph matching, BFS, DFS, etc.) and 15 | perform its actions (RCA, deduced alarms, etc.). After the graph is 16 | initialized, resource changes are processed and pushed to the 17 | Entity Graph. 18 | 19 | Problem description 20 | =================== 21 | 22 | Vitrage does not have a full Entity Graph of the resources and their state 23 | when it initializes. 24 | Vitrage needs to be consistent with the updated and new resources. 25 | 26 | Proposed change 27 | =============== 28 | 29 | When Vitrage initializes, we need to build the up-to-date Entity Graph, so we 30 | can run the different algorithms on it and perform the required actions. 31 | To do this, we require the full resource list from the synchronizer.
32 | In order for the Entity Graph to be constructed, it will need updates on 33 | changes in the cloud (e.g., machine creation/termination), and guidance on how 34 | resources should be linked to each other (e.g., a virtual machine resides on 35 | a physical host, which in turn belongs to an availability zone, etc.). 36 | For this purpose, each synchronizer will also update a configuration file - 37 | the entity graph template - which specifies for each resource what and how to 38 | locate the specific resources to link to. 39 | 40 | Implementation 41 | ============== 42 | 43 | Assignee(s) 44 | ----------- 45 | 46 | Primary assignee: 47 | alexey_weyl 48 | dan-ofek 49 | -------------------------------------------------------------------------------- /doc/source/redirect.py: -------------------------------------------------------------------------------- 1 | # A simple sphinx plugin which creates HTML redirections from old names 2 | # to new names. It does this by looking for files named "redirect" in 3 | # the documentation source and using the contents to create simple HTML 4 | # redirection pages for changed filenames. 
5 | 6 | # Stolen from openstack/nova-specs 7 | 8 | import os.path 9 | 10 | from sphinx.application import ENV_PICKLE_FILENAME 11 | from sphinx.util.console import bold 12 | 13 | 14 | def setup(app): 15 | from sphinx.application import Sphinx 16 | if not isinstance(app, Sphinx): 17 | return 18 | app.connect('build-finished', emit_redirects) 19 | 20 | 21 | def process_redirect_file(app, path, ent): 22 | parent_path = path.replace(app.builder.srcdir, app.builder.outdir) 23 | with open(os.path.join(path, ent)) as redirects: 24 | for line in redirects.readlines(): 25 | from_path, to_path = line.rstrip().split(' ') 26 | from_path = from_path.replace('.rst', '.html') 27 | to_path = to_path.replace('.rst', '.html') 28 | 29 | redirected_filename = os.path.join(parent_path, from_path) 30 | redirected_directory = os.path.dirname(redirected_filename) 31 | if not os.path.exists(redirected_directory): 32 | os.makedirs(redirected_directory) 33 | with open(redirected_filename, 'w') as f: 34 | f.write('<html><head><meta http-equiv="refresh" content="0; ' 35 | 'url=%s" /></head></html>' 36 | % to_path) 37 | 38 | 39 | def emit_redirects(app, exc): 40 | app.builder.info(bold('scanning %s for redirects...') % app.builder.srcdir) 41 | 42 | def process_directory(path): 43 | for ent in os.listdir(path): 44 | p = os.path.join(path, ent) 45 | if os.path.isdir(p): 46 | process_directory(p) 47 | elif ent == 'redirects': 48 | app.builder.info(' found redirects at %s' % p) 49 | process_redirect_file(app, path, ent) 50 | 51 | process_directory(app.builder.srcdir) 52 | app.builder.info('...done') -------------------------------------------------------------------------------- /specs/ocata/collectd-datasource.rst: -------------------------------------------------------------------------------- 1 | This work is licensed under a Creative Commons Attribution 3.0 Unported 2 | License.
3 | 4 | http://creativecommons.org/licenses/by/3.0/legalcode 5 | 6 | ==================== 7 | Collectd Data Source 8 | ==================== 9 | 10 | https://blueprints.launchpad.net/vitrage/+spec/collectd-datasource 11 | 12 | This blueprint describes the datasource that will receive notifications from 13 | collectd. 14 | 15 | Problem description 16 | =================== 17 | Vitrage should be able to accept a collectd notification. 18 | 19 | 20 | Proposed change 21 | =============== 22 | The Collectd datasource will receive notifications in the following format: 23 | 24 | :: 25 | 26 | { 27 | "host": "compute-1", 28 | "plugin": "ovs_events", 29 | "plugin_instance": "br-ex", 30 | "type": "gauge", 31 | "type_instance": "link_status", 32 | "message": "link state of \"br-ex\" interface has been changed to \"WARNING,\"", 33 | "severity": "WARNING", 34 | "time": 1482409029.062524, 35 | "id": "46c7eba7753efb0e6f6a8de24c949c52" 36 | } 37 | 38 | 39 | Upon receiving such a notification, the Collectd datasource will create a 40 | corresponding alarm in Vitrage. When an OK notification is received, 41 | the alarm will be deleted.
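The create-and-delete behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the datasource implementation: the names ``alarm_key`` and ``handle_notification`` are hypothetical, the alarm store is a plain dict, an alarm is assumed to be identified by host, plugin and plugin instance, and an "ok" notification is assumed to carry the severity ``OK``.

```python
import json


def alarm_key(notification):
    # Assumption: an alarm is identified by host, plugin and plugin instance.
    return (
        notification["host"],
        notification["plugin"],
        notification.get("plugin_instance", ""),
    )


def handle_notification(raw, alarms):
    """Apply one collectd notification to a dict of active alarms.

    Returns "raised", "deleted" or "ignored".
    """
    notification = json.loads(raw)
    key = alarm_key(notification)
    # Assumption: an "ok" notification carries the severity "OK".
    if notification["severity"].upper() == "OK":
        if key in alarms:
            del alarms[key]
            return "deleted"
        return "ignored"
    # Any other severity creates (or refreshes) the alarm.
    alarms[key] = notification
    return "raised"
```

Keeping the store as an argument makes the sketch easy to exercise: feed it a WARNING notification followed by an OK notification for the same host and plugin, and the dict ends up empty again.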
42 | 43 | In addition, a new evaluator template will be added in order to: 44 | - Create deduced alarms on the VMs running on the host 45 | - Modify the states of the host and the VMs to ERROR 46 | 47 | Alternatives 48 | ------------ 49 | 50 | None 51 | 52 | Data model impact 53 | ----------------- 54 | 55 | None 56 | 57 | REST API impact 58 | --------------- 59 | 60 | None 61 | 62 | 63 | Implementation 64 | ============== 65 | 66 | Assignee(s) 67 | ----------- 68 | 69 | Primary assignee: 70 | eyal bar ilan 71 | 72 | Work Items 73 | ---------- 74 | 75 | - Implement the Collectd datasource 76 | - Write a template for creating deduced alarms on the VMs and calling Nova 77 | mark host down 78 | 79 | Testing 80 | ======= 81 | 82 | The changes will be tested by unit tests. 83 | 84 | References 85 | ========== 86 | 87 | - https://collectd.org/ 88 | -------------------------------------------------------------------------------- /specs/mitaka/entity-graph-change-notifications.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ================================= 8 | Entity Graph Change Notifications 9 | ================================= 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/entity-graph-change-notifications 12 | 13 | When the Entity Graph is changed, different services would like to be informed of the change, and perform operations 14 | accordingly. 15 | 16 | Problem description 17 | =================== 18 | 19 | When the Entity Graph is changed, different services would like to be informed of the change, and perform operations 20 | accordingly. For example: 21 | (1) When edges \ vertices are added \ removed, the Vitrage Evaluator will run the subgraph matching algorithm on the 22 | graph to find the matching templates and perform the required actions.
23 | (2) When properties of existing edges \ vertices are modified, this can also impact subgraph matching outcome and thus 24 | the Vitrage Evaluator will need to be notified. 25 | 26 | Proposed change 27 | =============== 28 | 29 | For each CRUD operation on the Entity Graph, we will support adding listeners that can be notified, and receive updates 30 | when each change occurs. The update must include information about each change. 31 | 32 | Alternatives 33 | ------------ 34 | 35 | None 36 | 37 | Data model impact 38 | ----------------- 39 | 40 | None 41 | 42 | REST API impact 43 | --------------- 44 | 45 | None 46 | 47 | Security impact 48 | --------------- 49 | 50 | None 51 | 52 | Pipeline impact 53 | --------------- 54 | 55 | None 56 | 57 | Other end user impact 58 | --------------------- 59 | 60 | None 61 | 62 | Performance/Scalability Impacts 63 | ------------------------------- 64 | 65 | None 66 | 67 | 68 | Other deployer impact 69 | --------------------- 70 | 71 | None 72 | 73 | Developer impact 74 | ---------------- 75 | 76 | None 77 | 78 | 79 | Implementation 80 | ============== 81 | 82 | Assignee(s) 83 | ----------- 84 | 85 | Primary assignee: 86 | alexey_weyl 87 | 88 | Work Items 89 | ---------- 90 | 91 | None 92 | 93 | Future lifecycle 94 | ================ 95 | 96 | None 97 | 98 | Dependencies 99 | ============ 100 | 101 | None 102 | 103 | Testing 104 | ======= 105 | 106 | None 107 | 108 | Documentation Impact 109 | ==================== 110 | 111 | None 112 | 113 | References 114 | ========== 115 | 116 | None 117 | -------------------------------------------------------------------------------- /specs/pike/implemented/integration-with-mistral.rst: -------------------------------------------------------------------------------- 1 | .. 2 | 3 | This work is licensed under a Creative Commons Attribution 3.0 Unported 4 | License. 
5 | 6 | http://creativecommons.org/licenses/by/3.0/legalcode 7 | 8 | ======================== 9 | Integration with Mistral 10 | ======================== 11 | 12 | launchpad blueprint: 13 | https://blueprints.launchpad.net/vitrage/+spec/integration-with-mistral 14 | 15 | Support executing Mistral workflows from Vitrage. 16 | 17 | Problem description 18 | =================== 19 | 20 | Vitrage provides insights about the state of the cloud, but is not meant to be 21 | a policy engine. In order to take corrective actions, for example, we need to 22 | integrate an external engine like Mistral - the OpenStack workflow engine. 23 | 24 | Proposed change 25 | =============== 26 | 27 | It will be possible to define in Vitrage templates that under certain 28 | conditions, a Mistral workflow should be executed. This gives the user the 29 | power to decide, for example, that different corrective actions should be taken 30 | based on the root cause of the problem (as identified by Vitrage). 31 | 32 | Note that this blueprint is based on the external-actions blueprint, that 33 | handles the more general case. 34 | 35 | Examples 36 | -------- 37 | 38 | .. code-block:: yaml 39 | 40 | - scenario: 41 | condition: host_down_alarm_on_host 42 | actions: 43 | - action: 44 | action_type: execute_mistral 45 | properties: 46 | workflow: wf1 47 | 48 | 49 | Alternatives 50 | ------------ 51 | Discussed in the external-actions blueprint. 
52 | 53 | Data model impact 54 | ----------------- 55 | 56 | None 57 | 58 | REST API impact 59 | --------------- 60 | 61 | None 62 | 63 | Versioning impact 64 | ----------------- 65 | 66 | None 67 | 68 | Other end user impact 69 | --------------------- 70 | 71 | None 72 | 73 | Deployer impact 74 | --------------- 75 | 76 | None 77 | 78 | Developer impact 79 | ---------------- 80 | 81 | None 82 | 83 | Horizon impact 84 | -------------- 85 | 86 | None 87 | 88 | 89 | Implementation 90 | ============== 91 | 92 | Assignee(s) 93 | ----------- 94 | 95 | Primary assignee: 96 | ifat-afek 97 | 98 | Work Items 99 | ---------- 100 | 101 | * Implement the Mistral notifier 102 | * Update the documentation 103 | 104 | Dependencies 105 | ============ 106 | 107 | None 108 | 109 | Testing 110 | ======= 111 | 112 | The implementation will be covered by unit tests and tempest tests. 113 | 114 | Documentation Impact 115 | ==================== 116 | 117 | The new action should be documented 118 | 119 | References 120 | ========== 121 | 122 | None 123 | -------------------------------------------------------------------------------- /specs/ocata/doctor-datasource.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ================== 8 | Doctor Data Source 9 | ================== 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/doctor-datasource 12 | 13 | This blueprint describes the datasource that will receive notifications from 14 | OPNFV Doctor monitor 15 | 16 | Problem description 17 | =================== 18 | In order for Vitrage to be accepted as a reference implementation for the 19 | OPNFV Doctor Inspector component, it should be able to receive alarm 20 | notifications from the Doctor monitor. 
21 | 22 | Proposed change 23 | =============== 24 | The Doctor datasource will receive notifications in the format defined by 25 | Doctor SB API (see the reference below): 26 | 27 | :: 28 | 29 | { 30 | 'event': { 31 | 'time': '2016-04-12T08:00:00', 32 | 'type': 'compute.host.down', 33 | 'details': { 34 | 'hostname': 'compute-1', 35 | 'source': 'sample_monitor', 36 | 'cause': 'link-down', 37 | 'severity': 'critical', 38 | 'status': 'down', 39 | 'monitor_id': 'monitor-1', 40 | 'monitor_event_id': '123', 41 | } 42 | } 43 | } 44 | 45 | 46 | Upon receiving such a notification, the Doctor datasource will create or 47 | delete a corresponding alarm in Vitrage, based on the 'status' field. 48 | 49 | In addition, a new evaluator template will be added in order to: 50 | - Create deduced alarms on the VMs running on the host 51 | - Modify the states of the host and the VMs to ERROR 52 | - Call Nova force-down API to mark that the host is down 53 | 54 | 55 | REST API impact 56 | --------------- 57 | 58 | The Doctor monitor sends its alarms in REST format. Another blueprint discusses 59 | the SB API that should be added to Vitrage in order to support it. 60 | See https://blueprints.launchpad.net/vitrage/+spec/support-inspector-sb-api 61 | 62 | 63 | Implementation 64 | ============== 65 | 66 | Assignee(s) 67 | ----------- 68 | 69 | Primary assignee: 70 | ifat-afek 71 | 72 | Work Items 73 | ---------- 74 | 75 | - Implement the Doctor datasource 76 | - Write a template for creating deduced alarms on the VMs and calling Nova 77 | mark host down 78 | 79 | Testing 80 | ======= 81 | 82 | The changes will be tested by unit tests, and later on also by Doctor test 83 | scripts. 
84 | 85 | References 86 | ========== 87 | 88 | - https://wiki.opnfv.org/display/doctor/Doctor+Home 89 | - http://artifacts.opnfv.org/doctor/docs/requirements/05-implementation.html 90 | section 4.5.6 91 | 92 | -------------------------------------------------------------------------------- /README.rst: -------------------------------------------------------------------------------- 1 | ======================== 2 | Team and repository tags 3 | ======================== 4 | 5 | .. image:: https://governance.openstack.org/tc/badges/vitrage-specs.svg 6 | :target: https://governance.openstack.org/tc/reference/tags/index.html 7 | 8 | .. Change things from this point on 9 | 10 | ================================ 11 | OpenStack Vitrage Specifications 12 | ================================ 13 | 14 | This git repository is used to hold approved design specifications for additions 15 | to the Vitrage project. Reviews of the specs are done in gerrit, using a similar 16 | workflow to how we review and merge changes to the code itself. 17 | 18 | The layout of this repository is:: 19 | 20 | specs// 21 | 22 | Where there are two sub-directories: 23 | 24 | specs//approved: specifications approved but not yet implemented 25 | specs//implemented: implemented specifications 26 | 27 | This directory structure allows you to see what we thought about doing, 28 | decided to do, and actually got done. Users interested in functionality in a 29 | given release should only refer to the ``implemented`` directory. 30 | 31 | You can find an example spec in `doc/source/specs/template.rst`. 32 | 33 | Specifications are proposed for a given release by adding them to the 34 | `specs/` directory and posting it for review. The implementation 35 | status of a blueprint for a given release can be found by looking at the 36 | blueprint in launchpad. Not all approved blueprints will get fully implemented. 37 | 38 | Specifications have to be re-proposed for every release. 
The review may be 39 | quick, but even if something was previously approved, it should be re-reviewed 40 | to make sure it still makes sense as written. 41 | 42 | Launchpad blueprints:: 43 | 44 | http://blueprints.launchpad.net/vitrage 45 | 46 | Starting from the Mitaka-1 development milestone Vitrage performs the pilot of 47 | the specs repos approach. 48 | 49 | Please note, Launchpad blueprints are still used for tracking the 50 | current status of blueprints. For more information, see:: 51 | 52 | https://wiki.openstack.org/wiki/Blueprints 53 | 54 | For more information about working with gerrit, see:: 55 | 56 | http://docs.openstack.org/infra/manual/developers.html#development-workflow 57 | 58 | To validate that the specification is syntactically correct (i.e. get more 59 | confidence in the Jenkins result), please execute the following command:: 60 | 61 | $ tox 62 | 63 | After running ``tox``, the documentation will be available for viewing in HTML 64 | format in the ``doc/build/`` directory. 65 | 66 | To build the document automatically on changes, use the command:: 67 | 68 | $ tox -e autobuild 69 | 70 | Then open in a browser http://localhost:8000 71 | -------------------------------------------------------------------------------- /specs/pike/implemented/template-not-operator-support.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ============================= 8 | Template not operator Support 9 | ============================= 10 | 11 | We would like the templates language to support the "not" operator in addition 12 | to "and" and "or" operators. 13 | 14 | Problem description 15 | =================== 16 | 17 | Today the templates language support the "and" and "or" operators, but this is 18 | not enough. 
19 | Many scenarios can't be described by those two operators, and thus we would 20 | like to add support for the "not" operator. 21 | 22 | Proposed change 23 | =============== 24 | 25 | Remark: positive vertex = vertex that has an edge with "is_deleted"=False property. 26 | negative vertex = the opposite of positive vertex. Meaning that the 27 | vertex has only edges with "is_deleted"=True property. 28 | 29 | The following changes need to be made to support the "not" operator. 30 | 31 | 1. Template validation: check that the "not" operator appears only 32 | in the following way: 33 | 34 | The "not" operator can appear before edges. 35 | 36 | 2. Scenario Evaluator: 37 | 38 | Check whether the "not" operator appears on an element; if so, add a property named 39 | "negative_condition" to it, and update its "is_deleted" property to True. 40 | 41 | 3. Networkx Algorithm: 42 | 43 | In the subgraph_matching method, if the match is not a "negative condition" 44 | then perform regular subgraph_matching. 45 | Otherwise, if the match is a "negative condition", perform 46 | subgraph_matching on the negative edges. 47 | 48 | 49 | Performance/Scalability Impacts 50 | ------------------------------- 51 | 52 | The performance of the subgraph matching algorithm is a bit slower due to 53 | the steps above. 54 | 55 | 56 | Other deployer impact 57 | --------------------- 58 | 59 | None 60 | 61 | Developer impact 62 | ---------------- 63 | 64 | None 65 | 66 | 67 | Implementation 68 | ============== 69 | 70 | Assignee(s) 71 | ----------- 72 | 73 | Primary assignee: 74 | alexey_weyl 75 | 76 | Work Items 77 | ---------- 78 | 79 | None 80 | 81 | Future lifecycle 82 | ================ 83 | 84 | None 85 | 86 | Dependencies 87 | ============ 88 | 89 | None 90 | 91 | Testing 92 | ======= 93 | 94 | Added tests that check different uses of the "not" operator.
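The positive/negative vertex remark in "Proposed change" can be illustrated with a small sketch. It is a simplification under stated assumptions: edges are plain dictionaries rather than Vitrage graph objects, the helper names are hypothetical, and a vertex with no edges at all is treated as negative because the spec defines a negative vertex as the opposite of a positive one.

```python
# Hypothetical sketch; edges are plain dicts, not Vitrage graph objects.

def is_positive_vertex(vertex_id, edges):
    """True if the vertex has at least one edge with is_deleted == False."""
    return any(
        vertex_id in (edge['source'], edge['target'])
        and not edge['is_deleted']
        for edge in edges
    )


def condition_holds(vertex_id, edges, negative_condition=False):
    """Evaluate a plain condition, or its "not" form when requested."""
    positive = is_positive_vertex(vertex_id, edges)
    # A "negative condition" matches only when no live edge exists.
    return not positive if negative_condition else positive


edges = [
    {'source': 'host-1', 'target': 'vm-1', 'is_deleted': False},
    {'source': 'host-1', 'target': 'vm-2', 'is_deleted': True},
]

vm1_regular = condition_holds('vm-1', edges)
vm2_negated = condition_holds('vm-2', edges, negative_condition=True)
```

The extra pass over deleted edges is where the small slowdown noted under "Performance/Scalability Impacts" would come from.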
95 | 96 | Documentation Impact 97 | ==================== 98 | 99 | Added documentation in Vitrage: 100 | https://github.com/openstack/vitrage/blob/master/doc/source/not_operator_support.rst 101 | 102 | References 103 | ========== 104 | 105 | None 106 | -------------------------------------------------------------------------------- /specs/ocata/aodh-message-bus-notifications.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ============================================== 8 | Get Aodh Alarms From Message Bus Notifications 9 | ============================================== 10 | 11 | Get Aodh alarms immediately via notification from the message bus. 12 | 13 | Problem description 14 | =================== 15 | 16 | Currently Aodh datasource queries Aodh alarms periodically. It is not in time 17 | and it is also not efficient, since most query will get no change when the 18 | interval of alarm evaluation in Aodh is larger than the interval of pulling 19 | Aodh alarm in Vitrage. 20 | 21 | Proposed change 22 | =============== 23 | 24 | Aodh `update_method` in vitrage config file will be configured with: 25 | 26 | /etc/vitrage/vitrage.conf 27 | :: 28 | 29 | [aodh] 30 | update_method = push 31 | 32 | Add a new notification topic in Aodh config file: 33 | 34 | /etc/aodh/aodh.conf 35 | :: 36 | 37 | [oslo_messaging_notifications] 38 | topics = vitrage_notifications 39 | 40 | Vitrage listener will get the alarm events from the message bus. Aodh driver 41 | will filter the needed event types, enrich the events and then send them to 42 | the queue for the Aodh transformer to create, update or delete the entity 43 | vertex. 
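The push flow described above (filter the needed event types, enrich the event, enqueue it for the transformer) might look like the sketch below. The event type names, the enrichment field names and the function name are assumptions made for illustration; the spec does not list them.

```python
import queue

# Assumption: illustrative event type names, not taken from this spec.
NEEDED_EVENT_TYPES = {
    'alarm.creation',
    'alarm.rule_change',
    'alarm.state_transition',
    'alarm.deletion',
}


def handle_notification(event_type, payload, out_queue):
    """Filter, enrich and enqueue one message bus notification.

    Returns True if the event was forwarded, False if it was ignored.
    """
    if event_type not in NEEDED_EVENT_TYPES:
        return False
    enriched = dict(payload)  # do not mutate the caller's payload
    # Hypothetical enrichment fields for the transformer to consume.
    enriched['event_type'] = event_type
    enriched['datasource'] = 'aodh'
    out_queue.put(enriched)
    return True


events = queue.Queue()
forwarded = handle_notification('alarm.creation', {'alarm_id': 'a1'}, events)
ignored = handle_notification('compute.instance.create.end', {}, events)
```

Filtering before enrichment keeps unrelated notifications on the shared topic from reaching the transformer queue.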
44 | 45 | Alternatives 46 | ------------ 47 | 48 | None 49 | 50 | Data model impact 51 | ----------------- 52 | 53 | None 54 | 55 | REST API impact 56 | --------------- 57 | 58 | None 59 | 60 | Security impact 61 | --------------- 62 | 63 | None 64 | 65 | Pipeline impact 66 | --------------- 67 | 68 | None 69 | 70 | Other end user impact 71 | --------------------- 72 | 73 | None 74 | 75 | Performance/Scalability Impacts 76 | ------------------------------- 77 | 78 | performance improvement of getting Aodh alarms 79 | 80 | 81 | Other deployer impact 82 | --------------------- 83 | 84 | - Config `update_method` as 'push' in `Aodh` section in vitrage config file. 85 | - Add the `vitrage_notifications` topic in `oslo_messaging_notifications` 86 | section in Aodh config file. 87 | 88 | Developer impact 89 | ---------------- 90 | 91 | None 92 | 93 | 94 | Implementation 95 | ============== 96 | 97 | Assignee(s) 98 | ----------- 99 | 100 | Primary assignee: 101 | dongwenjuan 102 | 103 | Work Items 104 | ---------- 105 | 106 | None 107 | 108 | Future lifecycle 109 | ================ 110 | 111 | None 112 | 113 | Dependencies 114 | ============ 115 | 116 | None 117 | 118 | Testing 119 | ======= 120 | 121 | Unit tests and tempest tests. 122 | 123 | Documentation Impact 124 | ==================== 125 | 126 | Documentation will be modified to describe how to configure the notification 127 | topics in Aodh when using devstack to deploy OpenStack env. 128 | 129 | References 130 | ========== 131 | 132 | None 133 | -------------------------------------------------------------------------------- /specs/pike/implemented/keycloak.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 
4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ================ 8 | Keycloak support 9 | ================ 10 | 11 | launchpad blueprint: 12 | https://blueprints.launchpad.net/vitrage/+spec/keycloak-support 13 | 14 | As part of an ongoing effort to make Vitrage able to work in a non 15 | OpenStack environment as well (in addition to the default OpenStack environment), 16 | we should be able to make Vitrage work with a different authorization server 17 | instead of Keystone. One optional authorization server is Keycloak, which is 18 | an open source Identity and Access Management solution aimed at modern 19 | applications and services. 20 | 21 | 22 | Problem description 23 | =================== 24 | 25 | Vitrage at the moment can only work in an OpenStack environment because it needs 26 | Keystone for authorization. We should support other authorization servers such as Keycloak. 27 | 28 | 29 | 30 | Proposed change 31 | =============== 32 | 33 | New auth_mode in api section in Vitrage config file:: 34 | 35 | [api] 36 | auth_mode = keycloak 37 | 38 | New keycloak section with the auth_url in Vitrage config:: 39 | 40 | [keycloak] 41 | auth_url = http://[keycloak server]:[keycloak port]/auth 42 | 43 | The Vitrage server will use a new middleware which will authenticate with the 44 | Keycloak server once an api request is received. 45 | 46 | A new auth plugin will be added to the vitrage client which will get the token 47 | from the Keycloak server and send it with the api request.
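To make the two configuration options above concrete, the sketch below reads them and decides which authentication path applies. It uses the standard library ``configparser`` as a stand-in for Vitrage's real configuration handling; the function name, the sample URL and the assumption that Keystone is the default when ``auth_mode`` is unset are all hypothetical.

```python
import configparser

# Sample values only; the host and port are placeholders.
SAMPLE_CONF = """
[api]
auth_mode = keycloak

[keycloak]
auth_url = http://keycloak.example.org:8080/auth
"""


def select_auth(conf_text):
    """Return (auth_mode, auth_url) for the given config text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Assumption: keystone remains the default when auth_mode is not set.
    auth_mode = parser.get('api', 'auth_mode', fallback='keystone')
    auth_url = None
    if auth_mode == 'keycloak':
        # Only the keycloak mode needs the [keycloak] section.
        auth_url = parser.get('keycloak', 'auth_url')
    return auth_mode, auth_url


mode, url = select_auth(SAMPLE_CONF)
```

A server-side middleware would use the returned URL to validate the incoming token, while the client plugin would use it to obtain one.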
48 | 49 | Alternatives 50 | ------------ 51 | 52 | None 53 | 54 | Data model impact 55 | ----------------- 56 | 57 | None 58 | 59 | REST API impact 60 | --------------- 61 | When using the client we should use the keycloak-plugin 62 | 63 | Versioning impact 64 | ----------------- 65 | 66 | None 67 | 68 | Other end user impact 69 | --------------------- 70 | 71 | None 72 | 73 | Deployer impact 74 | --------------- 75 | 76 | To use the Keycloak Authorization there is a need to define it in the 77 | Vitrage config file. 78 | 79 | Developer impact 80 | ---------------- 81 | 82 | None 83 | 84 | Horizon impact 85 | -------------- 86 | 87 | None 88 | 89 | Implementation 90 | ============== 91 | 92 | Assignee(s) 93 | ----------- 94 | 95 | Primary assignee: 96 | eyalb1 97 | 98 | Work Items 99 | ---------- 100 | 101 | - Create Keycloak plugin in client 102 | 103 | - Create Keycloak plugin in server 104 | 105 | Dependencies 106 | ============ 107 | 108 | None 109 | 110 | Testing 111 | ======= 112 | 113 | This blueprint requires unit tests. 114 | 115 | Documentation Impact 116 | ==================== 117 | 118 | The usage of the KeyCloak authorization will be documented 119 | 120 | 121 | References 122 | ========== 123 | 124 | `keycloak-config.rst `_ 125 | -------------------------------------------------------------------------------- /specs/pike/implemented/osprofiler-support.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ================== 8 | OSProfiler Support 9 | ================== 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/osprofiler-support 12 | 13 | `OSProfiler`_ is an OpenStack cross-project profiling library. It allows user 14 | to generate 1 trace per request which is processed in multiple services. 
15 | 16 | Problem description 17 | =================== 18 | 19 | It is quite common that Vitrage is integrated in a large system to provide a 20 | complete solution e.g. as inspector in `OPNFV/Doctor`_ for fault management. 21 | 22 | When performance is a critical issue, it is very complicated to analyse the 23 | event processing workflow and locate the bottleneck in case something works 24 | slowly. 25 | 26 | Proposed change 27 | =============== 28 | 29 | Add osprofiler support in vitrage and vitrage-client. 30 | 31 | OSProfiler will help to generate a tree of calls (see `example`_) which is 32 | intuitive to understand what exactly is going on. 33 | 34 | Alternatives 35 | ------------ 36 | 37 | Explained in `why not cprofile and etc`_ 38 | 39 | .. _why not cprofile and etc: https://osprofiler.readthedocs.io/en/latest/#why-not-cprofile-and-etc 40 | 41 | Data model impact 42 | ----------------- 43 | 44 | None 45 | 46 | REST API impact 47 | --------------- 48 | 49 | Additional HTTP header inserted by profiler should be checked when it is 50 | enabled. Besides that, there is no impact on the business logic in api handler. 51 | 52 | Versioning impact 53 | ----------------- 54 | 55 | None 56 | 57 | Other end user impact 58 | --------------------- 59 | 60 | None 61 | 62 | Deployer impact 63 | --------------- 64 | 65 | None 66 | 67 | Developer impact 68 | ---------------- 69 | 70 | None 71 | 72 | Horizon impact 73 | -------------- 74 | 75 | None 76 | 77 | Implementation 78 | ============== 79 | 80 | Assignee(s) 81 | ----------- 82 | 83 | Primary assignee: 84 | yujunz 85 | 86 | Other contributors: 87 | None 88 | 89 | Work Items 90 | ---------- 91 | 92 | - add configuration options for osprofiler 93 | - add initialization of osprofiler in service startup 94 | - add osprofiler wsgi middleware to trace HTTP calls 95 | - trace RPC calls 96 | 97 | Refer to the changes proposed in `similar topic in ironic`_ for an overview. 
98 | 99 | Dependencies 100 | ============ 101 | 102 | None 103 | 104 | Testing 105 | ======= 106 | 107 | Covered by unit tests. 108 | 109 | Documentation Impact 110 | ==================== 111 | 112 | Add a section to the developer guide on how to use osprofiler. 113 | 114 | References 115 | ========== 116 | 117 | .. _OSProfiler: https://docs.openstack.org/developer/osprofiler/index.html 118 | .. _OPNFV/Doctor: https://wiki.opnfv.org/display/doctor 119 | .. _similar topic in ironic: https://review.openstack.org/#/q/topic:bug/1560704 120 | .. _example: http://doctor.surge.sh/ 121 | -------------------------------------------------------------------------------- /specs/rocky/implemented/rpc-collector.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | =========================== 8 | Vitrage-Collector on demand 9 | =========================== 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/rpc-collector 12 | 13 | For a simpler implementation of Vitrage high availability, the collector service 14 | should meet these new requirements: 15 | 16 | - The collector should be active-active. 17 | - Dependency between the collector and graph services should be removed. 18 | - vitrage-graph should be able to request updates from the collector when needed. 19 | 20 | Problem description 21 | =================== 22 | 23 | The current implementation, where vitrage collector and vitrage graph are 24 | always restarted simultaneously, is not desired, but is due to the lack of a 25 | better triggering event for get_all. In addition, the message bus may occasionally 26 | overload due to this behaviour, when vitrage-graph is down and does not 27 | consume the messages. 28 | 29 | 30 | Proposed change 31 | =============== 32 | 33 | The collector receives a synchronous RPC call and retrieves a list of all the events.
34 | 35 | vitrage-collector implementation: 36 | - Remove timers for get_all and get_changes 37 | - get_all and get_changes can be called by RPC 38 | - get_all and get_changes receive a list parameter `datasource_names`. 39 | - Will return all the entities from all the specified data-sources 40 | - Will open a thread to write all the events to the Persistor message queue 41 | in the background. 42 | - Remove dependency between the collector and graph services (.service files) 43 | 44 | vitrage-graph implementation: 45 | - Should manage the RPC calls to get_all and get changes. 46 | - Add appropriate timers calling get_all and get_changes by RPC 47 | - List of events should be processed using an event coordinator 48 | (before `TwoPrirityQueue`) 49 | - A snapshot should be taken after each get_all once the events have been processed. 50 | So the majority of events do not need to be replayed 51 | 52 | 53 | 54 | Alternatives 55 | ------------ 56 | 57 | None 58 | 59 | 60 | Data model impact 61 | ----------------- 62 | 63 | None 64 | 65 | REST API impact 66 | --------------- 67 | 68 | None 69 | 70 | Versioning impact 71 | ----------------- 72 | 73 | None 74 | 75 | Other end user impact 76 | --------------------- 77 | 78 | None 79 | 80 | Deployer impact 81 | --------------- 82 | 83 | vitrage-collector could be restarted individually from vitrage-graph 84 | 85 | Developer impact 86 | ---------------- 87 | 88 | None 89 | 90 | Horizon impact 91 | -------------- 92 | 93 | None 94 | 95 | Implementation 96 | ============== 97 | 98 | Assignee(s) 99 | ----------- 100 | 101 | Primary assignee: 102 | idan-hefetz 103 | 104 | Other contributors: 105 | None 106 | 107 | Work Items 108 | ---------- 109 | 110 | None 111 | 112 | Dependencies 113 | ============ 114 | 115 | None 116 | 117 | Testing 118 | ======= 119 | 120 | New behaviour is already covered by existing tempest 121 | 122 | Documentation Impact 123 | ==================== 124 | 125 | None 126 | 127 | References 128 | ========== 
129 | 130 | None 131 | 132 | -------------------------------------------------------------------------------- /specs/ocata/multi-tenancy-support.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ============================= 8 | Vitrage Multi Tenancy Support 9 | ============================= 10 | 11 | When a tenant uses Vitrage APIs we would like to show them only the data that is relevant to them. 12 | 13 | Problem description 14 | =================== 15 | 16 | Vitrage needs to show in its API and CLI only the data relevant to the requesting tenant (it can't show all the data, both for relevance and for the privacy of each tenant). 17 | Thus for each datasource and entity we need to know what relevant data to show for that tenant. 18 | We would also like to show all of the data if someone adds the all_tenants property. 19 | 20 | Proposed change 21 | =============== 22 | 23 | Here is a description, for each of the Vitrage APIs, of how it should behave for each tenant: 24 | 25 | Get alarms: 26 | 1. Find all the alarms with the requested project_id (if the project is admin then also show alarms that have no project_id property) 27 | 2. Find all the resources with the requested project_id and return all the alarms that are attached to them. 28 | 3. Merge the results from the previous steps and return them. 29 | 30 | Get RCA: 31 | 1. Find all the alarms that this alarm has caused, recursively. When reaching an alarm that is not of the same project_id (or on a resource of the same project_id), stop the recursion, including this last alarm in the response. 32 | 2. Find all the alarms that caused this alarm, recursively. When reaching an alarm that is not of the same project_id (or on a resource of the same project_id), stop the recursion, including this last alarm in the response. 33 | 3.
Merge the results from the previous steps and return it. 34 | 35 | Get Topology: 36 | 1. Find all the connected components for project_id. 37 | 2. For each component, select 1 entity and find all paths (without cycles) to "openstack.cluster" entity. Add all of the vertices in the path to the component. 38 | 3. Merge all the components and return it. 39 | 40 | 41 | Remark: API and CLI needs to behave the same. 42 | 43 | Alternatives 44 | ------------ 45 | 46 | None 47 | 48 | Data model impact 49 | ----------------- 50 | 51 | None 52 | 53 | REST API impact 54 | --------------- 55 | 56 | None 57 | 58 | Security impact 59 | --------------- 60 | 61 | None 62 | 63 | Pipeline impact 64 | --------------- 65 | 66 | None 67 | 68 | Other end user impact 69 | --------------------- 70 | 71 | None 72 | 73 | Performance/Scalability Impacts 74 | ------------------------------- 75 | 76 | None 77 | 78 | 79 | Other deployer impact 80 | --------------------- 81 | 82 | None 83 | 84 | Developer impact 85 | ---------------- 86 | 87 | None 88 | 89 | 90 | Implementation 91 | ============== 92 | 93 | Assignee(s) 94 | ----------- 95 | 96 | Primary assignee: 97 | alexey_weyl 98 | 99 | Work Items 100 | ---------- 101 | 102 | None 103 | 104 | Future lifecycle 105 | ================ 106 | 107 | None 108 | 109 | Dependencies 110 | ============ 111 | 112 | None 113 | 114 | Testing 115 | ======= 116 | 117 | None 118 | 119 | Documentation Impact 120 | ==================== 121 | 122 | None 123 | 124 | References 125 | ========== 126 | 127 | None 128 | -------------------------------------------------------------------------------- /specs/ocata/vitrage-id.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ========== 8 | Vitrage ID 9 | ========== 10 | 11 | Vitrage ID will be standard generated UUID. 
12 | 13 | Problem description 14 | =================== 15 | 16 | Currently Vitrage ID is actually a set of properties. This can create duplicates and is very misleading. 17 | Furthermore, it will hinder us once we have history, since the same alarm can happen multiple times. Thus, 18 | Vitrage should behave like any other service in OpenStack and provide an ID based on the OpenStack UUID 19 | generating algorithm. 20 | 21 | Proposed change 22 | =============== 23 | 24 | Change Vitrage ID from its current algorithm to an OpenStack compliant "UUIDUtils" generated UUID. 25 | 26 | All the documentation and examples in the Vitrage project should be updated to use an OpenStack compliant UUID. 27 | 28 | A few mock JSON files exist for “mock api” request purposes, for example, alarms.sample.json. 29 | They contain the “old” Vitrage ID. The “mock” api should be deleted, along with 30 | these files. Alternatively, we can just fix the examples in the mock. 31 | 32 | 33 | Alarm IDs: When creating a Vitrage ID for an alarm, we also need to put it in the metadata when 34 | updating AODH / other. Afterwards, when getting the updated alarm back, we will need to update the 35 | alarm ID in Vitrage. In case we have multiple simultaneous alarm engines, we might need to 36 | have a “ServiceName / ID” map for the alarm, and the Alarm ID in Vitrage will be “Vitrage ID”. 37 | 38 | 39 | Template IDs should be changed back from a calculated String to generated uuid, in scenario_repository. 40 | 41 | 42 | Datasources: 43 | - key / value tests : fix field names. 44 | - Transformers: No change is needed in the Transformers. 45 | - Processor: Checking if an entity exists in the graph: The entity is currently queried in the 46 | Graph according to its Vitrage ID. Instead, it will be queried according to the parameters set. 47 | If the entity exists, its original Vitrage ID will be used. Otherwise, a new UUID will be 48 | generated for vitrage ID via openstack UUIDUtils' generate_uuid.
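The Processor behaviour described in the last bullet (look the entity up by its identifying parameters, reuse its Vitrage ID if found, otherwise generate a new UUID) can be sketched as below. This is only an illustration: a plain dictionary stands in for the entity graph, the function name is hypothetical, and the standard library ``uuid.uuid4`` is used in place of the OpenStack UUIDUtils ``generate_uuid`` call named in the spec.

```python
import uuid


def get_or_create_vitrage_id(graph_index, entity_key):
    """Return the Vitrage ID for an entity, creating one if needed.

    graph_index: dict mapping a tuple of identifying parameters to a
    Vitrage ID (a stand-in for querying the entity graph by parameters).
    """
    existing = graph_index.get(entity_key)
    if existing is not None:
        # The entity is already in the graph: keep its original ID.
        return existing
    # Stand-in for the UUIDUtils generate_uuid call named in the spec.
    new_id = str(uuid.uuid4())
    graph_index[entity_key] = new_id
    return new_id


index = {}
key = ('RESOURCE', 'nova.instance', 'dc35fa2f-4515-1653-ef6b-03b471bb395b')
first = get_or_create_vitrage_id(index, key)
second = get_or_create_vitrage_id(index, key)
```

Because the ID is random rather than derived from the entity's properties, two independent processes would assign different IDs to the same entity, which is the HA concern raised under "Performance/Scalability Impacts".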
49 | 50 | 51 | Update all necessary tests. 52 | 53 | 54 | No changes are needed in Evaluator Action / Recipes or in Consistency enforcer. 55 | 56 | 57 | Performance/Scalability Impacts 58 | ------------------------------- 59 | 60 | Performance needs to be tested, since after the development of this blueprint, all graph queries will use parameters 61 | instead of a single index. 62 | 63 | As long as Vitrage uses an in memory graph database, starting from this change, standard HA will be buggy, 64 | to say the least. An entity's Vitrage ID will have different values in each HA "instance". Using Pacemaker 65 | equivalent HA (stonith) will solve this. 66 | 67 | 68 | Implementation 69 | ============== 70 | 71 | Assignee(s) 72 | ----------- 73 | 74 | Primary assignee: 75 | doffek 76 | 77 | 78 | Dependencies 79 | ============ 80 | 81 | Aodh : Need to change the notification from Vitrage to Aodh. 82 | 83 | Testing 84 | ======= 85 | 86 | Unit tests and tempest tests. 87 | 88 | Documentation Impact 89 | ==================== 90 | 91 | All documentation regarding the creation of Vitrage ID will be updated. 92 | 93 | References 94 | ========== 95 | 96 | None 97 | -------------------------------------------------------------------------------- /specs/queens/implemented/support_mark_down_action_for_instances.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ============================= 8 | Add Nova reset-state notifier 9 | ============================= 10 | 11 | Include the URL of your launchpad blueprint: 12 | 13 | https://blueprints.launchpad.net/vitrage/+spec/add-nova-reset-state-notifier 14 | 15 | When a host is marked as down, all the servers which are launched at the host 16 | should be with error status. Nova notifier will be extended to reset server 17 | state. 
18 | 19 | Problem description 20 | =================== 21 | 22 | As Vitrage works as the OPNFV Doctor Inspector component, when Vitrage 23 | receives alarm notifications from the Doctor monitor, it should map the 24 | physical resources to virtual resources and set their states appropriately. 25 | For the host down scenario, currently Vitrage only support to mark host down, 26 | the states of the servers which are launched at the host are still 'Ok' in 27 | Nova. And also in a real scenario, notifying Nova would help the user to get 28 | a clear picture of the state of its instances. So Nova notifier should be 29 | extended to call 'reset-state' API to reset server state. 30 | 31 | Proposed change 32 | =============== 33 | 34 | Reuse 'mark_down' action type and set 'instance' as 'action_target' in Vitrage 35 | template which will call Nova api: 'reset-state' to set instance state. 36 | 37 | Doctor Example 38 | --------------- 39 | 40 | .. code-block:: yaml 41 | 42 | - scenario: 43 | condition: host_down_alarm_on_host and host_contains_instance and alarm_on_instance 44 | actions: 45 | - action: 46 | action_type: mark_down 47 | action_target: 48 | target: instance 49 | 50 | Alternatives 51 | ------------ 52 | 53 | None 54 | 55 | Data model impact 56 | ----------------- 57 | 58 | None 59 | 60 | REST API impact 61 | --------------- 62 | 63 | None 64 | 65 | Versioning impact 66 | ----------------- 67 | 68 | None 69 | 70 | Other end user impact 71 | --------------------- 72 | 73 | None 74 | 75 | Deployer impact 76 | --------------- 77 | 78 | To use the Nova notifier, there is a need to define it in the Vitrage config 79 | file, and in addition use the 'mark_down' action for instances in Vitrage 80 | template. 
81 | 82 | Developer impact 83 | ---------------- 84 | 85 | None 86 | 87 | Horizon impact 88 | -------------- 89 | 90 | None 91 | 92 | Implementation 93 | ============== 94 | 95 | Assignee(s) 96 | ----------- 97 | 98 | Primary assignee: 99 | dong wenjuan 100 | 101 | Work Items 102 | ---------- 103 | 104 | - Implement the 'mark_down' action for instances and tests 105 | - Modify the host_down_scenario template for calling Nova reset-state 106 | 107 | Dependencies 108 | ============ 109 | 110 | None 111 | 112 | Testing 113 | ======= 114 | 115 | Unit tests and tempest tests need to be added. 116 | 117 | Documentation Impact 118 | ==================== 119 | 120 | The usage of the 'mark_down' action for instances will be documented. 121 | 122 | 123 | References 124 | ========== 125 | 126 | `Doctor inspector design guideline `_ 127 | `Support external actions in Vitrage templates `_ 128 | -------------------------------------------------------------------------------- /specs/mitaka/vitrage-cli.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | =========== 8 | Vitrage CLI 9 | =========== 10 | 11 | https://blueprints.launchpad.net/python-vitrageclient/+spec/vitrage-cli 12 | 13 | Vitrage Project introduces a Root Cause Analysis (RCA) engine 14 | for organizing, analyzing and expanding OpenStack alarms & events. 15 | 16 | In order to communicate with Vitrage a command line utility can be used 17 | that will issue some REST command to the API service. 18 | 19 | The CLI will use a ``python-vitrageclient`` which is a client library built 20 | on the Vitrage API. 
21 | 22 | :: 23 | 24 | +-----------------+ +-----------------+ 25 | | *CLI* | | | 26 | | | | | 27 | | RCA | | | 28 | | | HTTP/Vitrage API | vitrage api | 29 | | CRUD Templates |+--------------------> | 30 | | | | service | 31 | | Topology | | | 32 | | | | | 33 | +-----------------+ +-----------------+ 34 | 35 | Problem description 36 | =================== 37 | 38 | As a user I would like to be able to see the root cause of any alerts or events in the system. 39 | A command line utility will be used to communicate with the Vitrage API service. 40 | The CLI will support 3 types of commands: 41 | 42 | #. RCA - find the root cause for an alert/event 43 | 44 | #. CRUD Templates - Create/Read/Update/Delete Templates 45 | 46 | #. Topology - get the topology of the system 47 | 48 | 49 | Proposed change 50 | =============== 51 | 52 | The CLI and the vitrage client are part of a new project for Root Cause Analysis 53 | called Vitrage. 54 | 55 | Alternatives 56 | ------------ 57 | None 58 | 59 | Data model impact 60 | ----------------- 61 | 62 | No data is stored or cached. 63 | 64 | REST API impact 65 | --------------- 66 | 67 | The CLI will implement the API of the vitrage-api service. 68 | 69 | Versioning impact 70 | ----------------- 71 | 72 | Discuss how your change affects versioning and backward compatibility: 73 | 74 | None 75 | 76 | Other end user impact 77 | --------------------- 78 | 79 | The user will be able to interact using any HTTP REST client. 80 | The user will also have a UI. 81 | 82 | Deployer impact 83 | --------------- 84 | 85 | A new project called Vitrage will deploy the vitrage client and CLI. 86 | 87 | Developer impact 88 | ---------------- 89 | 90 | None 91 | 92 | Horizon impact 93 | -------------- 94 | 95 | A new UI will be added to Horizon to support the Vitrage project; 96 | a separate blueprint will be supplied.
97 | 98 | 99 | Implementation 100 | ============== 101 | 102 | Assignee(s) 103 | ----------- 104 | 105 | None 106 | 107 | Work Items 108 | ---------- 109 | 110 | None 111 | 112 | 113 | Dependencies 114 | ============ 115 | 116 | None 117 | 118 | 119 | Testing 120 | ======= 121 | 122 | All code will be tested. 123 | 124 | Documentation Impact 125 | ==================== 126 | 127 | None 128 | 129 | 130 | References 131 | ========== 132 | 133 | `Vitrage project `_ 134 | the get topology api blueprint https://blueprints.launchpad.net/vitrage/+spec/get-topology-api 135 | -------------------------------------------------------------------------------- /specs/pike/implemented/resource-show-api.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ================= 8 | Resource show API 9 | ================= 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/resource-show-api 12 | 13 | An API to show the details of the specified resource. 14 | 15 | Problem description 16 | =================== 17 | 18 | As a user, I want to get the details of a specified resource. 19 | 20 | Proposed change 21 | =============== 22 | 23 | Add an API to show the details of specified resource. 
24 | 25 | Alternatives 26 | ------------ 27 | 28 | None 29 | 30 | Data model impact 31 | ----------------- 32 | 33 | None 34 | 35 | REST API impact 36 | --------------- 37 | 38 | Resource show 39 | ^^^^^^^^^^^^^ 40 | 41 | Returns details of the resource 42 | 43 | GET /v1/resources/vitrage_id 44 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 45 | 46 | Headers 47 | ^^^^^^^ 48 | 49 | - X-Auth-Token (string, required) - Keystone auth token 50 | - Accept (string) - application/json 51 | - User-Agent (String) 52 | 53 | Path Parameters 54 | ^^^^^^^^^^^^^^^ 55 | 56 | - vitrage_id 57 | 58 | Query Parameters 59 | ^^^^^^^^^^^^^^^^ 60 | 61 | None 62 | 63 | Request Body 64 | ^^^^^^^^^^^^ 65 | 66 | None 67 | 68 | Request Examples 69 | ^^^^^^^^^^^^^^^^ 70 | :: 71 | 72 | GET /v1/resources/`` 73 | Host: 127.0.0.1:8999 74 | User-Agent: keystoneauth1/2.3.0 python-requests/2.9.1 CPython/2.7.6 75 | Accept: application/json 76 | X-Auth-Token: 2b8882ba2ec44295bf300aecb2caa4f7 77 | 78 | Response 79 | ~~~~~~~~ 80 | 81 | Status code 82 | ^^^^^^^^^^^ 83 | 84 | - 200 - OK 85 | - 404 - Not Found 86 | 87 | Response Body 88 | ^^^^^^^^^^^^^ 89 | 90 | Returns details of the requested resource. 
91 | 92 | Response Examples 93 | ^^^^^^^^^^^^^^^^^ 94 | 95 | :: 96 | 97 | { 98 | "category": "RESOURCE", 99 | "is_placeholder": false, 100 | "is_deleted": false, 101 | "name": "vm-1", 102 | "update_timestamp": "2015-12-01T12:46:41Z", 103 | "state": "ACTIVE", 104 | "project_id": "0683517e1e354d2ba25cba6937f44e79", 105 | "type": "nova.instance", 106 | "id": "dc35fa2f-4515-1653-ef6b-03b471bb395b", 107 | "vitrage_id": "RESOURCE:nova.instance:dc35fa2f-4515-1653-ef6b-03b471bb395b" 108 | } 109 | 110 | Security impact 111 | --------------- 112 | 113 | None 114 | 115 | Pipeline impact 116 | --------------- 117 | 118 | None 119 | 120 | Other end user impact 121 | --------------------- 122 | 123 | None 124 | 125 | Performance/Scalability Impacts 126 | ------------------------------- 127 | 128 | None 129 | 130 | 131 | Other deployer impact 132 | --------------------- 133 | 134 | None 135 | 136 | Developer impact 137 | ---------------- 138 | 139 | None 140 | 141 | 142 | Implementation 143 | ============== 144 | 145 | Assignee(s) 146 | ----------- 147 | 148 | dong wenjuan 149 | 150 | 151 | Work Items 152 | ---------- 153 | 154 | * Implement the API and tests 155 | * Implement the client and tests 156 | 157 | Future lifecycle 158 | ================ 159 | 160 | None 161 | 162 | Dependencies 163 | ============ 164 | 165 | None 166 | 167 | Testing 168 | ======= 169 | 170 | Unit tests and tempest tests need to be added. 171 | 172 | Documentation Impact 173 | ==================== 174 | 175 | The new api should be documented 176 | 177 | References 178 | ========== 179 | None 180 | -------------------------------------------------------------------------------- /specs/queens/implemented/snmp-parsing-service.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 
4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ==================== 8 | Snmp Parsing Service 9 | ==================== 10 | 11 | launchpad blueprint: 12 | https://blueprints.launchpad.net/vitrage/+spec/snmp-support 13 | 14 | This blueprint describes the implementation of an SNMP parsing service that transforms SNMP alarm 15 | messages into alarm details and distributes them to the corresponding datasource. 16 | 17 | Problem description 18 | =================== 19 | 20 | The following use case should be supported: 21 | 22 | The Vitrage datasources module can handle alarms from some of the monitored systems, 23 | but there is currently no support for systems that report alarms over SNMP. 24 | 25 | Proposed change 26 | =============== 27 | An SNMP service module is presented here. It parses alarms reported by SNMP-managed 28 | systems and sends them to the OpenStack message bus, for further processing by the datasources. 29 | 30 | Since the SNMP service is a common service for alarm datasources, it starts just after the api, 31 | graph and notifier services. Once started, the SNMP parsing service can receive and 32 | decode alarm messages. Decoded alarm details consist of alarm/object info and the corresponding values, 33 | e.g. alarm_code. The SNMP parsing service determines the alarm datasource from the corresponding 34 | OID, then constructs a message marked with the datasource information and distributes it to the 35 | RabbitMQ queue. Based on the OID values, the alarm datasource can extract information from the decoded 36 | alarm details.
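The OID-based dispatch described above can be sketched as follows. This is only an illustration under stated assumptions: the mapping entry, the exact-match lookup and the message layout (``datasource``, ``system``, ``details``) are hypothetical, and ``build_datasource_message`` is not a Vitrage function.

```python
# Hypothetical in-memory form of the OID mapping; the entry is illustrative.
OID_MAPPING = [
    {
        "oid": "1.3.6.1.4.1.3902.4101.1.3.1.12",
        "system": "iaas_platform",
        "datasource": "new_datasource",
    },
]


def build_datasource_message(trap_oid, alarm_details, oid_mapping=OID_MAPPING):
    """Mark decoded alarm details with the datasource that owns trap_oid.

    Assumption: an exact OID match selects the datasource. Returns None when
    no datasource is configured for the OID.
    """
    for entry in oid_mapping:
        if entry["oid"] == trap_oid:
            return {
                "datasource": entry["datasource"],
                "system": entry["system"],
                "details": dict(alarm_details),
            }
    return None


message = build_datasource_message(
    "1.3.6.1.4.1.3902.4101.1.3.1.12", {"alarm_code": "1001"}
)
unknown = build_datasource_message("1.3.6.1.4.1.0", {"alarm_code": "1001"})
```

In a real service the returned message would then be published to the message bus; that step is omitted here.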
37 | 38 | The configuration for snmp service: 39 | [snmp_parsing] 40 | 41 | # snmp listening port (integer value) 42 | snmp_listening_port = xxx 43 | 44 | # traps oid mapping yaml file path(string value) 45 | #oid_mapping = /etc/vitrage/snmp_parsing_conf.yaml 46 | 47 | An example of config for snmp_parsing_conf.yaml: 48 | 49 | - oid: 1.3.6.1.4.1.3902.4101.1.3.1.12 # for example 50 | system: iaas_platform # for example 51 | datasource: new_datasource 52 | - oid: xxxx 53 | system: xxx 54 | datasource: xxx 55 | 56 | 57 | Alternatives 58 | ------------ 59 | 60 | None 61 | 62 | Data model impact 63 | ----------------- 64 | 65 | None 66 | 67 | REST API impact 68 | --------------- 69 | 70 | None 71 | 72 | Versioning impact 73 | ----------------- 74 | 75 | None 76 | 77 | Other end user impact 78 | --------------------- 79 | 80 | None 81 | 82 | Deployer impact 83 | --------------- 84 | 85 | None 86 | 87 | Developer impact 88 | ---------------- 89 | 90 | The snmp parsing service does not support lost notifications at the moment. If one 91 | needs the solution in the future, the service should be enhanced. 92 | 93 | Horizon impact 94 | -------------- 95 | 96 | None 97 | 98 | 99 | Implementation 100 | ============== 101 | 102 | Assignee(s) 103 | ----------- 104 | 105 | Primary assignee: 106 | xupeipei 107 | 108 | Work Items 109 | ---------- 110 | 111 | * Add a new SNMP parsing service 112 | 113 | 114 | Dependencies 115 | ============ 116 | 117 | None 118 | 119 | Testing 120 | ======= 121 | 122 | The implementation will be covered by unit tests and tempest tests. 123 | 124 | Documentation Impact 125 | ==================== 126 | 127 | The new SNMP configuration should be documented 128 | 129 | References 130 | ========== 131 | 132 | None 133 | 134 | -------------------------------------------------------------------------------- /specs/pike/implemented/snmp-notifications.rst: -------------------------------------------------------------------------------- 1 | .. 
2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ================== 8 | SNMP Notifications 9 | ================== 10 | 11 | launchpad blueprint: 12 | https://blueprints.launchpad.net/vitrage/+spec/snmp-notifications 13 | 14 | The Evaluator performs root cause analysis on the Vitrage Graph and may 15 | determine that an alarm should be created, deleted or otherwise updated. 16 | Other components are notified of such changes by the Vitrage Notifier service. 17 | Among others, one Vitrage Notifier is responsible for sending SNMP traps on 18 | Vitrage deduced alarms. 19 | 20 | This blueprint describes the implementation of a Vitrage Notifier that sends 21 | SNMP notifications on Vitrage deduced alarms. 22 | 23 | 24 | Problem description 25 | =================== 26 | 27 | Vitrage should support registering for SNMP notifications, and sending traps 28 | on raised alarms and deactivated alarms to any registered targets. 29 | 30 | 31 | Proposed change 32 | =============== 33 | 34 | The SNMP notifier is enabled by the following definition in the Vitrage config file:: 35 | 36 | [DEFAULT] 37 | notifiers = snmp 38 | 39 | The Vitrage listener will then get the alarm events from the message bus and the SNMP 40 | notifier will send SNMP traps on raised deduced alarms and deleted deduced alarms. 41 | 42 | The traps will be sent to the destinations specified in the consumers yaml file. 43 | 44 | The traps will be sent only on alarms specified in a yaml file which contains 45 | the oid mapping for each alarm name. 46 | 47 | The format of the sent traps will be specified in another yaml file. 48 | 49 | The paths of all those yaml files should be specified in the Vitrage config:: 50 | 51 | [snmp] 52 | consumers = 53 | alarm_oid_mapping = 54 | oid_tree = 55 | 56 | The SNMP notifier is pluggable: you can implement your own SNMP sender and use 57 | it (it must inherit from the base class). A default implementation is also provided.
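The pluggable sender can be pictured with the sketch below. The class and method names (``SnmpSenderBase``, ``send_trap``, ``notify``) and the dictionary-based alarm are assumptions made for illustration; they are not taken from the Vitrage code base, and no real SNMP library is used.

```python
import abc


class SnmpSenderBase(abc.ABC):
    """Hypothetical base class that a custom SNMP sender would inherit from."""

    def __init__(self, consumers, alarm_oid_mapping):
        self.consumers = consumers                  # trap destinations
        self.alarm_oid_mapping = alarm_oid_mapping  # alarm name -> oid

    def notify(self, alarm):
        """Send a trap to every consumer, only for alarms present in the mapping."""
        oid = self.alarm_oid_mapping.get(alarm["name"])
        if oid is None:
            return 0  # alarm is not configured for SNMP notifications
        for consumer in self.consumers:
            self.send_trap(consumer, oid, alarm)
        return len(self.consumers)

    @abc.abstractmethod
    def send_trap(self, consumer, oid, alarm):
        """Deliver one trap; concrete senders implement the transport."""


class RecordingSender(SnmpSenderBase):
    """Stand-in sender that records traps instead of sending them."""

    def __init__(self, consumers, alarm_oid_mapping):
        super().__init__(consumers, alarm_oid_mapping)
        self.sent = []

    def send_trap(self, consumer, oid, alarm):
        self.sent.append((consumer, oid, alarm["name"]))


sender = RecordingSender(["10.0.0.5:162"], {"host_down": "1.3.6.1.4.1.1.1"})
sent_count = sender.notify({"name": "host_down"})
skipped_count = sender.notify({"name": "unmapped_alarm"})
```

Keeping the filtering in the base class means a custom sender only has to implement the transport.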
58 | 59 | Alternatives 60 | ------------ 61 | 62 | None 63 | 64 | Data model impact 65 | ----------------- 66 | 67 | None 68 | 69 | REST API impact 70 | --------------- 71 | 72 | None 73 | 74 | Versioning impact 75 | ----------------- 76 | 77 | None 78 | 79 | Other end user impact 80 | --------------------- 81 | 82 | None 83 | 84 | Deployer impact 85 | --------------- 86 | 87 | To use the SNMP notifier there is a need to define it in the Vitrage config 88 | file, and in addition create three yaml files and define them in Vitrage config file. 89 | 90 | Developer impact 91 | ---------------- 92 | 93 | None 94 | 95 | Horizon impact 96 | -------------- 97 | 98 | None 99 | 100 | Implementation 101 | ============== 102 | 103 | Assignee(s) 104 | ----------- 105 | 106 | Primary assignee: 107 | annarez 108 | 109 | Work Items 110 | ---------- 111 | 112 | - Create SNMP notifier 113 | 114 | - Create SNMP sender 115 | 116 | - create base class 117 | - Create unit test for SNMP sender 118 | 119 | - test snmp notifier 120 | - test snmp sender 121 | 122 | Dependencies 123 | ============ 124 | 125 | None 126 | 127 | Testing 128 | ======= 129 | 130 | This blueprint requires unit tests. 131 | 132 | Documentation Impact 133 | ==================== 134 | 135 | The usage of the SNMP notifier will be documented 136 | 137 | 138 | References 139 | ========== 140 | 141 | `notifier-snmp-plugin.rst `_ 142 | -------------------------------------------------------------------------------- /specs/queens/implemented/webhooks.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 
4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ======== 8 | Webhooks 9 | ======== 10 | 11 | launchpad blueprint: 12 | https://blueprints.launchpad.net/vitrage/+spec/configurable-notifications 13 | 14 | The Evaluator performs root cause analysis on the Vitrage Graph and may 15 | determine that an alarm should be created, deleted or otherwise updated. 16 | Other components are notified of such changes by the Vitrage Notifier service. 17 | Among others, Vitrage Notifier is responsible for sending HTTP POST 18 | notifications on Vitrage deduced alarms. 19 | 20 | This blueprint describes the implementation of Vitrage Notifier for 21 | webhooks on Vitrage alarms and state changes. 22 | 23 | 24 | Problem description 25 | =================== 26 | 27 | Vitrage should support webhooks for notifications, which are sent on raised 28 | alarms, deactivated alarms, state changes, RCA or other events to any 29 | registered targets. 30 | Furthermore, any registered recipient should supply a regex to filter the alarms 31 | sent to that recipient. 32 | 33 | 34 | Proposed change 35 | =============== 36 | 37 | Required definitions in the Vitrage config file:: 38 | 39 | [DEFAULT] 40 | notifiers = webhook 41 | 42 | The Vitrage listener will get the alarm events from the message bus and the webhook 43 | notifier will send HTTP POST notifications on raised deduced alarms and deleted deduced alarms. 44 | 45 | The filtered notifications will be sent to the destinations that are written in 46 | the database, as configured via API requests. 47 | 48 | The notifications will be sent only on alarms which match the regex filter specified in the 49 | webhook specification. 50 | 51 | The format of the sent notifications will be hard coded. 52 | 53 | As Vitrage notifiers are pluggable, you can write your own notifier and use it. 54 | Specifically in this case, you can inherit the webhook base class and implement your own webhook notifier.
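The regex filtering described above can be sketched as follows. The shape of the filter (a mapping from alarm field to regex, where every entry must match) and the ``address``/``filter`` keys are assumptions for illustration; the spec only states that each recipient supplies a regex.

```python
import re


def matches_filter(alarm, regex_filter):
    """Return True when every field/regex pair in the filter matches the alarm.

    Assumption: an empty filter matches every alarm.
    """
    return all(
        re.search(pattern, str(alarm.get(field, ""))) is not None
        for field, pattern in regex_filter.items()
    )


def select_recipients(alarm, webhooks):
    """Return the addresses of the webhooks whose filter accepts the alarm."""
    return [hook["address"] for hook in webhooks if matches_filter(alarm, hook["filter"])]


# Hypothetical registrations and alarm.
webhooks = [
    {"address": "http://a.example/hook", "filter": {"name": "^CPU"}},
    {"address": "http://b.example/hook", "filter": {"severity": "WARNING"}},
    {"address": "http://c.example/hook", "filter": {}},
]
alarm = {"name": "CPU load high", "severity": "CRITICAL"}
recipients = select_recipients(alarm, webhooks)
```

The notifier would then send one HTTP POST per selected address; the delivery itself is left out of the sketch.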
55 | 56 | 57 | 58 | Alternatives 59 | ------------ 60 | 61 | None 62 | 63 | Data model impact 64 | ----------------- 65 | 66 | New DB table, to represent registration details 67 | Preliminary columns : 68 | 69 | - ID 70 | 71 | - Date 72 | 73 | - Address 74 | 75 | - Headers 76 | 77 | - Filter 78 | 79 | REST API impact 80 | --------------- 81 | 82 | An API which supports adding, removing and listing webhooks 83 | 84 | Versioning impact 85 | ----------------- 86 | 87 | None 88 | 89 | Other end user impact 90 | --------------------- 91 | 92 | None 93 | 94 | Deployer impact 95 | --------------- 96 | 97 | To use webhooks one needs to define it in the Vitrage config file. 98 | 99 | Developer impact 100 | ---------------- 101 | 102 | None 103 | 104 | Horizon impact 105 | -------------- 106 | 107 | Future support for webhooks in Horizon 108 | 109 | Implementation 110 | ============== 111 | 112 | Assignee(s) 113 | ----------- 114 | 115 | Primary assignee: 116 | nivo 117 | 118 | Work Items 119 | ---------- 120 | - Add DB table 121 | - Add API 122 | - Implement notifier 123 | - Update docs 124 | - Tests 125 | 126 | Dependencies 127 | ============ 128 | 129 | None 130 | 131 | Testing 132 | ======= 133 | 134 | This blueprint requires tempest tests and unit tests. 135 | 136 | Documentation Impact 137 | ==================== 138 | 139 | The usage of webhooks will be documented 140 | 141 | 142 | References 143 | ========== 144 | 145 | Example on http post notifications in AODH 146 | `http post request `_ 147 | -------------------------------------------------------------------------------- /specs/newton/template-validate-api.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 
4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | =============================== 8 | Vitrage Template Validation API 9 | =============================== 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/template-validate-api 12 | 13 | An API for validating templates 14 | 15 | Problem description 16 | =================== 17 | 18 | We would like to be able to validate a single template (or several templates) 19 | through the API before uploading it to Vitrage. 20 | 21 | Proposed change 22 | =============== 23 | Create an API to validate Vitrage templates in terms of content and syntax. 24 | 25 | #. Given the full path to a template file, validate a single template. 26 | #. Given the full path to a directory, validate all template files inside it. 27 | 28 | The template validate API returns a result that contains the following fields: 29 | 30 | #. status - validation succeeded/failed 31 | #. file path - the full path to the template file 32 | #. description 33 | #. message - error message 34 | #. status code 35 | 36 | REST API impact 37 | --------------- 38 | 39 | Template Validate 40 | ^^^^^^^^^^^^^^^^^ 41 | 42 | Validate Vitrage template(s) 43 | 44 | POST / 45 | ~~~~~~ 46 | 47 | Headers 48 | ^^^^^^^ 49 | 50 | - X-Auth-Token (string, required) - Keystone auth token 51 | - Accept (string) - application/json 52 | - User-Agent (String) 53 | - Content-Type (String): application/json 54 | 55 | Path Parameters 56 | ^^^^^^^^^^^^^^^ 57 | 58 | None. 59 | 60 | Query Parameters 61 | ^^^^^^^^^^^^^^^^ 62 | - path (string(255), required) - the path to the template file or directory 63 | 64 | 65 | Request Body 66 | ^^^^^^^^^^^^ 67 | 68 | None.
69 | 70 | Request Examples 71 | ^^^^^^^^^^^^^^^^ 72 | :: 73 | 74 | POST /v1/template/?path=[file/dir path] 75 | Host: 135.248.18.122:8999 76 | User-Agent: keystoneauth1/2.3.0 python-requests/2.9.1 CPython/2.7.6 77 | Content-Type: application/json 78 | Accept: application/json 79 | X-Auth-Token: 2b8882ba2ec44295bf300aecb2caa4f7 80 | 81 | Response 82 | ~~~~~~~~ 83 | 84 | Status code 85 | ^^^^^^^^^^^ 86 | 87 | - 200 - OK 88 | - 400 - Bad request 89 | 90 | Response Body 91 | ^^^^^^^^^^^^^ 92 | 93 | Returns a JSON object that is a list of results. 94 | Each result describes the full validation (syntax and content) of one template file. 95 | 96 | Response Examples 97 | ^^^^^^^^^^^^^^^^^ 98 | 99 | .. code-block:: json 100 | 101 | { 102 | "results": [ 103 | { 104 | "status": "validation failed", 105 | "file path": "/tmp/templates/basic_no_meta.yaml", 106 | "description": "Template syntax validation", 107 | "message": "metadata is a mandatory section.", 108 | "status code": 62 109 | }, 110 | { 111 | "status": "validation OK", 112 | "file path": "/tmp/templates/basic.yaml", 113 | "description": "Template validation", 114 | "message": "Template validation is OK", 115 | "status code": 4 116 | } 117 | ] 118 | } 119 | 120 | Implementation 121 | ============== 122 | 123 | Assignee(s) 124 | ----------- 125 | 126 | liat har-tal 127 | 128 | Dependencies 129 | ============ 130 | 131 | Depends on the template validation blueprints 132 | 133 | Testing 134 | ======= 135 | 136 | Tempest tests also need to be added in order to test: 137 | 138 | #. Validate single template 139 | #. Validate several templates 140 | 141 | 142 | Documentation Impact 143 | ==================== 144 | The new api should be documented 145 | 146 | References 147 | ========== 148 | None 149 | -------------------------------------------------------------------------------- /specs/pike/implemented/resource-list-api.rst: -------------------------------------------------------------------------------- 1 | .. 
2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ================= 8 | Resource List API 9 | ================= 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/resource-list-api 12 | 13 | An API to list the resources of a specified type, or all the resources. 14 | 15 | Problem description 16 | =================== 17 | 18 | Currently Vitrage has APIs for getting the topology and alarms, but a user may 19 | want to get only the specific resources they care about. 20 | 21 | Proposed change 22 | =============== 23 | Add an API to list resources. If the user specifies the resource type, list only the 24 | resources of the given type. 25 | 26 | Alternatives 27 | ------------ 28 | 29 | None 30 | 31 | Data model impact 32 | ----------------- 33 | 34 | None 35 | 36 | REST API impact 37 | --------------- 38 | 39 | Resource List 40 | ^^^^^^^^^^^^^ 41 | 42 | Returns resource list 43 | 44 | GET /v1/resources/ 45 | ~~~~~~~~~~~~~~~~~~ 46 | 47 | Headers 48 | ^^^^^^^ 49 | 50 | - X-Auth-Token (string, required) - Keystone auth token 51 | - Accept (string) - application/json 52 | - User-Agent (String) 53 | 54 | Path Parameters 55 | ^^^^^^^^^^^^^^^ 56 | 57 | None. 58 | 59 | Query Parameters 60 | ^^^^^^^^^^^^^^^^ 61 | 62 | None 63 | 64 | Request Body 65 | ^^^^^^^^^^^^ 66 | 67 | * resource_type - (string, optional) the type of resource. If omitted, all resources are returned. 68 | * all_tenants - (boolean, optional) shows the resources of all tenants (in case the user has the permissions).
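The effect of the two request body fields can be sketched as a plain filter over resource dictionaries. This is an assumption-laden illustration: ``list_resources`` is a hypothetical helper, and treating ``all_tenants=False`` as "only the caller's project" is an assumed reading of the spec, not confirmed server behaviour.

```python
def list_resources(resources, project_id, resource_type=None, all_tenants=False):
    """Filter resources by type and, unless all_tenants is set, by project."""
    selected = []
    for resource in resources:
        # resource_type is optional; None means "all types".
        if resource_type is not None and resource["type"] != resource_type:
            continue
        # Assumption: without all_tenants, only the caller's project is shown.
        if not all_tenants and resource.get("project_id") != project_id:
            continue
        selected.append(resource)
    return selected


# Illustrative data only.
resources = [
    {"name": "vm-1", "type": "nova.instance", "project_id": "p1"},
    {"name": "vm-2", "type": "nova.instance", "project_id": "p2"},
    {"name": "net-1", "type": "neutron.network", "project_id": "p1"},
]
own_instances = list_resources(resources, "p1", resource_type="nova.instance")
everything = list_resources(resources, "p1", all_tenants=True)
```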
69 | 70 | Request Examples 71 | ^^^^^^^^^^^^^^^^ 72 | :: 73 | 74 | GET /v1/resources/ 75 | Host: 127.0.0.1:8999 76 | User-Agent: keystoneauth1/2.3.0 python-requests/2.9.1 CPython/2.7.6 77 | Accept: application/json 78 | X-Auth-Token: 2b8882ba2ec44295bf300aecb2caa4f7 79 | 80 | Response 81 | ~~~~~~~~ 82 | 83 | Status code 84 | ^^^^^^^^^^^ 85 | 86 | - 200 - OK 87 | - 400 - Bad request 88 | 89 | Response Body 90 | ^^^^^^^^^^^^^ 91 | 92 | Returns a list with all the resources requested. 93 | 94 | Response Examples 95 | ^^^^^^^^^^^^^^^^^ 96 | 97 | :: 98 | 99 | [ 100 | { 101 | "category": "RESOURCE", 102 | "is_placeholder": false, 103 | "is_deleted": false, 104 | "name": "vm-1", 105 | "update_timestamp": "2015-12-01T12:46:41Z", 106 | "state": "ACTIVE", 107 | "project_id": "0683517e1e354d2ba25cba6937f44e79", 108 | "type": "nova.instance", 109 | "id": "dc35fa2f-4515-1653-ef6b-03b471bb395b", 110 | "vitrage_id": "RESOURCE:nova.instance:dc35fa2f-4515-1653-ef6b-03b471bb395b" 111 | } 112 | ] 113 | 114 | Security impact 115 | --------------- 116 | 117 | None 118 | 119 | Pipeline impact 120 | --------------- 121 | 122 | None 123 | 124 | Other end user impact 125 | --------------------- 126 | 127 | None 128 | 129 | Performance/Scalability Impacts 130 | ------------------------------- 131 | 132 | None 133 | 134 | 135 | Other deployer impact 136 | --------------------- 137 | 138 | None 139 | 140 | Developer impact 141 | ---------------- 142 | 143 | None 144 | 145 | 146 | Implementation 147 | ============== 148 | 149 | Assignee(s) 150 | ----------- 151 | 152 | dong wenjuan 153 | 154 | 155 | Work Items 156 | ---------- 157 | 158 | * Implement the API and tests 159 | * Implement the client and tests 160 | 161 | Future lifecycle 162 | ================ 163 | 164 | None 165 | 166 | Dependencies 167 | ============ 168 | 169 | None 170 | 171 | Testing 172 | ======= 173 | 174 | Unit tests and tempest tests need to be added. 
175 | 176 | Documentation Impact 177 | ==================== 178 | The new api should be documented 179 | 180 | References 181 | ========== 182 | None 183 | -------------------------------------------------------------------------------- /specs/rocky/implemented/graph_fast_failover.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | =========================== 8 | Vitrage-graph fast failover 9 | =========================== 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/vitrage-fast-failover 12 | 13 | vitrage-graph high availability should meet these requirements: 14 | - Single active instance of vitrage-graph (managed by pacemaker). 15 | - Initialize quickly upon failover without requesting updates. 16 | - In case of a long downtime, vitrage-graph startup will request 17 | collector updates 18 | 19 | 20 | Problem description 21 | =================== 22 | 23 | Vitrage-graph is active standby. Currently on a failover, vitrage-graph 24 | needs to pull all the data again from the collector data-sources. 25 | This takes a considerable amount of time, in which data is inconsistent. 26 | As we wish to continue working with an in-memory graph (due to performance), 27 | vitrage-graph service will remain active-standby. Therefore, downtime must 28 | be minimized in failover events. 29 | 30 | Proposed change 31 | =============== 32 | 33 | - after every get_all, vitrage-graph stores a full entity graph snapshot in 34 | the db, so the majority of events do not need to be replayed. 35 | - Vitrage-graph sends each processed event to vitrage-persistor so these 36 | are stored in the order of handling. 37 | - Upon init vitrage-graph queries the db table graph_snapshots, fetching the 38 | latest entry, it will be used if it is not older than snapshot_interval. 
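The snapshot selection rule in the last item above can be sketched as follows. The row layout (a ``created_at`` timestamp per snapshot) and the function name ``choose_snapshot`` are assumptions for illustration; only the rule itself (use the latest entry unless it is older than ``snapshot_interval``) comes from the text.

```python
from datetime import datetime, timedelta


def choose_snapshot(snapshots, now, snapshot_interval):
    """Return the latest snapshot if it is fresh enough, otherwise None.

    None means a fresh start is needed (no snapshot, or the latest one is
    older than snapshot_interval).
    """
    latest = max(snapshots, key=lambda row: row["created_at"], default=None)
    if latest is None:
        return None
    if now - latest["created_at"] > snapshot_interval:
        return None
    return latest


# Illustrative rows standing in for the graph_snapshots table.
now = datetime(2018, 5, 1, 12, 0, 0)
rows = [
    {"id": 1, "created_at": datetime(2018, 5, 1, 10, 0, 0)},
    {"id": 2, "created_at": datetime(2018, 5, 1, 11, 30, 0)},
]
fresh = choose_snapshot(rows, now, timedelta(hours=1))
stale = choose_snapshot(rows, now, timedelta(minutes=10))
```

A ``None`` result corresponds to initializing without a snapshot; otherwise initialization can continue from the returned entry.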
39 | 40 | Init with a snapshot - on failover 41 | - Unpickle stored snapshot to get the graph. 42 | - Run the processor on all the events (from events table) that occurred after 43 | the snapshot. 44 | - Enable the evaluators. 45 | - Process all the events that are waiting in the message bus. 46 | 47 | Init without a snapshot - a fresh start (This is the current behaviour). 48 | - Start with a new empty graph. 49 | - RPC to Collector to run get_all for all drivers, then process the events. 50 | - Process all the events that are waiting in the message bus. 51 | - Enable the evaluator and iterate over the entire graph. 52 | 53 | Alternatives 54 | ------------ 55 | 56 | Using a persistent graph database can improve vitrage-graph high availability 57 | as fail-over will be quick due to running active-active. This may be a 58 | preferred solution in terms of high availability, but overall, the 59 | performance degradation compared to in-memory networkx is not reasonable. 60 | 61 | 62 | Data model impact 63 | ----------------- 64 | 65 | May require minor changes, TBD. 66 | 67 | 68 | REST API impact 69 | --------------- 70 | 71 | None 72 | 73 | Versioning impact 74 | ----------------- 75 | 76 | None 77 | 78 | Other end user impact 79 | --------------------- 80 | 81 | None 82 | 83 | Deployer impact 84 | --------------- 85 | 86 | This will be enabled by default.
87 | The deployer may disable it by adding the following to vitrage.conf 88 | [persistancy] 89 | enable_persistancy=false 90 | 91 | Developer impact 92 | ---------------- 93 | 94 | None 95 | 96 | Horizon impact 97 | -------------- 98 | 99 | None 100 | 101 | Implementation 102 | ============== 103 | 104 | Assignee(s) 105 | ----------- 106 | 107 | Primary assignee: 108 | idan-hefetz 109 | 110 | Other contributors: 111 | None 112 | 113 | Work Items 114 | ---------- 115 | 116 | None 117 | 118 | Dependencies 119 | ============ 120 | 121 | None 122 | 123 | Testing 124 | ======= 125 | 126 | Additional tempest tests will be added for fail-over, as persistence is already 127 | covered by existing tempest tests. 128 | Unit tests will not be effective here as changes are mostly in the init process 129 | and scheduler. This feature mostly reuses existing (tested) functionality. 130 | 131 | Documentation Impact 132 | ==================== 133 | 134 | None 135 | 136 | References 137 | ========== 138 | 139 | None 140 | 141 | -------------------------------------------------------------------------------- /specs/rocky/implemented/datasource-scaffold.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | =================== 8 | Datasource Scaffold 9 | =================== 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/datasource-scaffold 12 | 13 | A command line tool to generate the skeleton of a new datasource. A skeleton contains 14 | stubs of the required classes and methods as described in the `design specs`_, 15 | without detailed implementation. It aims to bootstrap the development of a new 16 | datasource. 17 | 18 | Problem description 19 | =================== 20 | 21 | The `design specs`_ give detailed instructions on how to add a new data 22 | source.
However, creating one from scratch involves a lot of overhead. Developers 23 | usually copy an existing datasource as a starting point, but it is sometimes out of date 24 | and always contains a lot of datasource-specific code. 25 | 26 | Proposed change 27 | =============== 28 | 29 | Create templates for the datasource skeleton with placeholders for names and 30 | render the Python source file on demand. 31 | 32 | Example template in `Jinja2`_:: 33 | 34 | from oslo_config import cfg 35 | from vitrage.common.constants import UpdateMethod 36 | 37 | {{ name|upper }}_DATASOURCE = '{{ name }}' 38 | 39 | # define needed options 40 | OPTS = [ 41 | # Transformer with the path to your transformer classes 42 | cfg.StrOpt('transformer', 43 | default='vitrage.datasources.{{ name }}_datasource.transformer.' 44 | '{{ name|capitalize }}Transformer', 45 | help='{{ name|capitalize }} transformer class path', 46 | required=True), 47 | 48 | Given ``name=foo``, it will generate the skeleton source file in Python:: 49 | 50 | from oslo_config import cfg 51 | from vitrage.common.constants import UpdateMethod 52 | 53 | FOO_DATASOURCE = 'foo' 54 | 55 | # define needed options 56 | OPTS = [ 57 | # Transformer with the path to your transformer classes 58 | cfg.StrOpt('transformer', 59 | default='vitrage.datasources.foo_datasource.transformer.' 60 | 'FooTransformer', 61 | help='Foo transformer class path', 62 | required=True), 63 | 64 | Alternatives 65 | ------------ 66 | 67 | Create and maintain a sample datasource that users can modify as a base. With this 68 | approach, a developer is likely to miss a string replacement somewhere, as we 69 | experienced in the `abandoned patch set`_.
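To illustrate why rendering from a single name avoids missed replacements, here is a minimal sketch of the rendering step. It deliberately swaps in the standard library's ``string.Template`` for Jinja2 so that it runs without extra dependencies; the skeleton text is shortened, and ``render_skeleton`` and the ``TRANSFORMER`` constant are hypothetical names, not part of the proposed tool.

```python
from string import Template

# Shortened, hypothetical skeleton; the real templates are Jinja2 files.
SKELETON = Template(
    "${NAME_UPPER}_DATASOURCE = '${name}'\n"
    "TRANSFORMER = 'vitrage.datasources.${name}_datasource.transformer."
    "${Name}Transformer'\n"
)


def render_skeleton(name):
    """Derive every spelling of the datasource name from one input."""
    return SKELETON.substitute(
        name=name,
        NAME_UPPER=name.upper(),   # plays the role of {{ name|upper }}
        Name=name.capitalize(),    # plays the role of {{ name|capitalize }}
    )


rendered = render_skeleton("foo")
print(rendered)
```

Because all variants are computed from the same ``name`` argument, no occurrence can be left unreplaced by hand.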
70 | 71 | Data model impact 72 | ----------------- 73 | 74 | None 75 | 76 | REST API impact 77 | --------------- 78 | 79 | None 80 | 81 | Versioning impact 82 | ----------------- 83 | 84 | None 85 | 86 | Other end user impact 87 | --------------------- 88 | 89 | None 90 | 91 | Deployer impact 92 | --------------- 93 | 94 | None 95 | 96 | Developer impact 97 | ---------------- 98 | 99 | None 100 | 101 | Horizon impact 102 | -------------- 103 | 104 | None 105 | 106 | Implementation 107 | ============== 108 | 109 | Assignee(s) 110 | ----------- 111 | 112 | Primary assignee: 113 | yujunz 114 | 115 | Other contributors: 116 | None 117 | 118 | Work Items 119 | ---------- 120 | 121 | - Create datasource skeleton template 122 | - driver 123 | - transformer 124 | - Create unit test skeleton template 125 | - test driver 126 | - test transformer 127 | - mock configuration 128 | - mock driver 129 | - trace generator 130 | - Templates for datasource with different update methods 131 | 132 | Dependencies 133 | ============ 134 | 135 | None 136 | 137 | Testing 138 | ======= 139 | 140 | The changes shall be covered by new unit test. 141 | 142 | Documentation Impact 143 | ==================== 144 | 145 | How to use the generator will be documented. 146 | 147 | References 148 | ========== 149 | 150 | .. _design specs: http://docs.openstack.org/developer/vitrage/add-new-datasource.html 151 | .. _Jinja2: http://jinja.pocoo.org 152 | .. _abandoned patch set: https://review.openstack.org/#/c/396974 153 | -------------------------------------------------------------------------------- /specs/queens/implemented/alarm-counts-api.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 
4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ======================== 8 | Vitrage Alarm Counts API 9 | ======================== 10 | 11 | Extend the Vitrage REST API to support a GET of the Active Alarm Counts, for 12 | each alarm severity level, in Vitrage. 13 | 14 | 15 | Problem description 16 | =================== 17 | Provide REST API access to the Vitrage Active Alarm Counts in support of the 18 | Horizon blueprint, "Vitrage Alarm Banner in Top Navbar". 19 | 20 | 21 | Proposed change 22 | =============== 23 | Support the following REST API to Vitrage for calculating and returning 24 | the Active Alarm Counts for each alarm severity level:: 25 | 26 | GET /v1/alarm/count 27 | 28 | Headers 29 | X-Auth-Token (string, required) - Keystone auth token 30 | Accept (string) - application/json 31 | 32 | Path Parameters 33 | None. 34 | 35 | Query Parameters 36 | None. 37 | 38 | Request Body 39 | all_tenants - (boolean, optional) shows the alarm counts 40 | summed across all tenants (in case the user 41 | has the permissions). 42 | 43 | Request Examples 44 | GET /v1/alarm/count HTTP/1.1 45 | Host: 135.248.19.18:8999 46 | X-Auth-Token: 2b8882ba2ec44295bf300aecb2caa4f7 47 | Accept: application/json 48 | 49 | Response Status code 50 | 200 - OK 51 | 52 | Response Body 53 | Returns a JSON object containing the alarm counts for the 54 | different alarm severities. 55 | 56 | Response Examples 57 | { 58 | "critical_alarm_count": 1, 59 | "major_alarm_count": 0, 60 | "minor_alarm_count": 1, 61 | "warning_alarm_count": 3 62 | } 63 | 64 | NOTE: The vitrage CLI and client will be updated for this new API. 65 | e.g. "vitrage alarm count" 66 | 67 | 68 | 69 | Alternatives 70 | ------------ 71 | For performance reasons, maintain the Active Alarm Counts in Vitrage Entity 72 | Graph, and just return these counts when the REST API command is received. 
73 | 74 | Although decided against this due to: 75 | 76 | - keeping a counter, in addition to the graph, might be buggy (multi threading 77 | issues etc.), 78 | - calculating the counter means traversing once all of the vertices in the graph, 79 | get all alarms, and count. It shouldn't be too expensive, it's just like 80 | 'get alarms' api, 81 | - since the result of this api is used for ui query (and not for notification or 82 | corrective actions for example), the performance is not that critical. 83 | 84 | Data model impact 85 | ----------------- 86 | None 87 | 88 | REST API impact 89 | --------------- 90 | Extending REST API with new GET /v1/alarm/count API. 91 | 92 | Versioning impact 93 | ----------------- 94 | None ... just extending API, no changes. 95 | 96 | Other end user impact 97 | --------------------- 98 | None 99 | 100 | Deployer impact 101 | --------------- 102 | None 103 | 104 | Developer impact 105 | ---------------- 106 | None 107 | 108 | Horizon impact 109 | -------------- 110 | Horizon will use this API to populate the counts in its new "Vitrage 111 | Alarm Banner" in its Top Navbar; a proposed Horizon blueprint. 112 | 113 | 114 | 115 | Implementation 116 | ============== 117 | 118 | Assignee(s) 119 | ----------- 120 | Primary assignee: 121 | gwaines 122 | 123 | Other contributors: 124 | None 125 | 126 | Work Items 127 | ---------- 128 | - Implement new REST API in Vitrage API: GET /v1/alarm/count API, 129 | to calculate and return the Vitrage Active Alarm Counts for each 130 | alarm severity level, 131 | - Update Vitrage client for new API 132 | - Add the new "vitrage alarm count" CLI command 133 | 134 | 135 | Dependencies 136 | ============ 137 | None 138 | 139 | 140 | Testing 141 | ======= 142 | The changes shall be covered by new unit test and tempest test. 143 | 144 | 145 | Documentation Impact 146 | ==================== 147 | Update to Vitrage API Documentation; i.e. 
the new API will be added under 148 | https://github.com/openstack/vitrage/blob/master/doc/source/contributor/vitrage-api.rst 149 | 150 | 151 | References 152 | ========== 153 | None. 154 | 155 | -------------------------------------------------------------------------------- /specs/mitaka/networkx-graph-driver.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ===================== 8 | NetworkX Graph driver 9 | ===================== 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/networkx-graph-driver 12 | 13 | Graph Driver is the defined API for access and manipulation of the underlying graph used for storing the Entity Graph. 14 | This API should be implemented for the NetworkX graph package and possibly for other graph tools, allowing Vitrage a seamless transition between different underlying graph implementations. 15 | 16 | 17 | Problem description 18 | =================== 19 | 20 | Vitrage will have a graph data structure that will hold a list of physical and virtual entities and their relationships to one another, in what we call the Entity Graph. 21 | The specific implementation of this graph should be interchangeable allowing a stateless or state-full implementations. 22 | 23 | Proposed change 24 | =============== 25 | 26 | **1. Graph Driver** 27 | 28 | Create a Graph Driver, which defines a set of graph methods, to be implemented over NetworkX. 29 | This blueprint describes the addition of the Graph Driver and NetworkX Driver. 30 | 31 | :: 32 | 33 | +-------------------+ 34 | | | 35 | +------------------+ +-------> NetworkX Driver | 36 | | | | | | 37 | | Graph | Impl | +-------------------+ 38 | | |-------+ 39 | | Driver | | +-------------------+ 40 | | | | | | 41 | +------------------+ +-------> Other Drivers | 42 | | (BulbFlow,etc..) 
| 43 | +-------------------+ 44 | 45 | **Specification of the Graph Driver API:** 46 | 47 | Note that the entity graph is a property graph, where edges and vertices can also have a set of key-value properties, which can be added, updated and removed as well. 48 | 49 | *Graph CRUD* 50 | - init 51 | - num_of_edges 52 | - num_of_vertices 53 | - copy //deep copy of the graph 54 | 55 | *Vertex CRUD* 56 | - add_vertex 57 | - add_vertices 58 | - get_vertex 59 | - update_vertex 60 | - remove_vertex 61 | 62 | *Edges CRUD* 63 | - add_edge 64 | - get_edge 65 | - update_edge 66 | - remove_edge 67 | 68 | *Graph algorithms* 69 | - subgraph_matching (sub-graph isomorphism) 70 | - BFS 71 | - DFS 72 | 73 | **2. NetworkX Driver** 74 | 75 | NetworkX is a pure python library for graphs. It is stateless and suitable for operation on large real world graphs. 76 | The NetworkX Driver will implement Graph Driver 77 | 78 | 79 | Alternatives 80 | ------------ 81 | 82 | None 83 | 84 | Data model impact 85 | ----------------- 86 | 87 | None 88 | 89 | REST API impact 90 | --------------- 91 | 92 | None 93 | 94 | Security impact 95 | --------------- 96 | 97 | None 98 | 99 | Pipeline impact 100 | --------------- 101 | 102 | None 103 | 104 | Other end user impact 105 | --------------------- 106 | 107 | None 108 | 109 | Performance/Scalability Impacts 110 | ------------------------------- 111 | 112 | None 113 | 114 | 115 | Other deployer impact 116 | --------------------- 117 | 118 | None 119 | 120 | Developer impact 121 | ---------------- 122 | 123 | None 124 | 125 | 126 | Implementation 127 | ============== 128 | 129 | Assignee(s) 130 | ----------- 131 | 132 | Primary assignee: 133 | None. 
134 | 135 | Work Items 136 | ---------- 137 | 138 | - Create the GraphDriver skeleton 139 | - Implement Graph Driver for NetworkX 140 | - Testing of GraphDriver over NetworkX 141 | 142 | 143 | Future lifecycle 144 | ================ 145 | 146 | None 147 | 148 | Dependencies 149 | ============ 150 | 151 | None 152 | 153 | Testing 154 | ======= 155 | 156 | This change needs to be tested by unit tests. 157 | 158 | Documentation Impact 159 | ==================== 160 | 161 | 162 | References 163 | ========== -------------------------------------------------------------------------------- /specs/queens/implemented/parallel-evaluation.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ======================================== 8 | Parallel evaluation of Vitrage templates 9 | ======================================== 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/parallel-evaluation 12 | 13 | Currently Vitrage sequentially processes and evaluates incoming events. 14 | While events must be processed sequentially, template evaluation 15 | for a single event is independent and can be parallelized. 16 | The challenge is that the NetworkX in-memory graph is held by a single 17 | process, which prevents parallelism. This can be addressed by creating 18 | evaluator worker processes that maintain clones of the graph. 19 | 20 | 21 | Problem description 22 | =================== 23 | 24 | Every graph change triggers evaluation of all relevant template 25 | conditions, so the run time of event processing depends on the number 26 | of loaded templates.
27 | 28 | - Processor receives an event, from either the datasource or evaluator queue 29 | - Processor updates the graph 30 | - Evaluator is triggered to run relevant templates 31 | - Evaluation may result in an event written to the evaluator queue 32 | 33 | The above flow runs on an in-memory NetworkX graph in a single thread. 34 | 35 | .. figure:: ./parallel_current.jpg 36 | :width: 100% 37 | :align: center 38 | :alt: Problem description 39 | 40 | Proposed change 41 | =============== 42 | 43 | One or more EvaluatorWorker processes will be added to vitrage-graph service. 44 | These processes keep their own graph instance. A new component 45 | EvaluatorManager manages the communication with the EvaluatorWorkers. 46 | For each graph change made by the processor, it will request the 47 | EvaluatorManager to inform all the EvaluatorWorkers, so in effect these hold 48 | an identical graph clone. 49 | Each EvaluatorWorker runs a portion of the templates, writing its results to 50 | the evaluator queue. 51 | 52 | The flow will be as follows: 53 | 54 | - Processor receives an event, from either the datasource or evaluator queue 55 | - Processor updates the graph 56 | - EvaluatorManager is triggered, sending the event to the N EvaluatorWorkers, 57 | via N multiprocessing queues, then waits for their ack signal 58 | - Each EvaluatorWorker updates its own graph 59 | - In each EvaluatorWorker the evaluator is triggered to run a portion of the 60 | templates 61 | - Evaluation may result in an event written to the evaluator queue 62 | 63 | ..
figure:: ./parallel_future.jpg 64 | :width: 100% 65 | :align: center 66 | :alt: Proposed change 67 | 68 | 69 | Alternatives 70 | ------------ 71 | 72 | None 73 | 74 | 75 | Data model impact 76 | ----------------- 77 | 78 | None 79 | 80 | REST API impact 81 | --------------- 82 | 83 | None 84 | 85 | Versioning impact 86 | ----------------- 87 | 88 | None 89 | 90 | Other end user impact 91 | --------------------- 92 | 93 | None 94 | 95 | Deployer impact 96 | --------------- 97 | 98 | Each EvaluatorWorker holds a clone of the in memory Entity Graph, hence memory 99 | consumption will increase as the configured number of workers increases. 100 | 101 | Developer impact 102 | ---------------- 103 | 104 | None 105 | 106 | Horizon impact 107 | -------------- 108 | 109 | None 110 | 111 | Implementation 112 | ============== 113 | 114 | Assignee(s) 115 | ----------- 116 | 117 | Primary assignee: 118 | idan-hefetz 119 | 120 | Other contributors: 121 | None 122 | 123 | Work Items 124 | ---------- 125 | 126 | - processor should not hold a ScenarioEvaluator 127 | - create EvaluatorManager 128 | - create EvaluatorWorker 129 | - change main in graph.py 130 | - change GraphService to handle these changes 131 | - Choose the best way to assign tasks to workers 132 | 133 | Dependencies 134 | ============ 135 | 136 | None 137 | 138 | Testing 139 | ======= 140 | 141 | The implementation will be covered by additional unit test 142 | 143 | Documentation Impact 144 | ==================== 145 | 146 | None 147 | 148 | References 149 | ========== 150 | 151 | None 152 | 153 | -------------------------------------------------------------------------------- /tests/test_titles.py: -------------------------------------------------------------------------------- 1 | # Licensed under the Apache License, Version 2.0 (the "License"); you may 2 | # not use this file except in compliance with the License. 
You may obtain 3 | # a copy of the License at 4 | # 5 | # http://www.apache.org/licenses/LICENSE-2.0 6 | # 7 | # Unless required by applicable law or agreed to in writing, software 8 | # distributed under the License is distributed on an "AS IS" BASIS, WITHOUT 9 | # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the 10 | # License for the specific language governing permissions and limitations 11 | # under the License. 12 | 13 | import glob 14 | import re 15 | 16 | import docutils.core 17 | import testtools 18 | 19 | 20 | class TestTitles(testtools.TestCase): 21 | def _get_title(self, section_tree): 22 | section = { 23 | 'subtitles': [], 24 | } 25 | for node in section_tree: 26 | if node.tagname == 'title': 27 | section['name'] = node.rawsource 28 | elif node.tagname == 'section': 29 | subsection = self._get_title(node) 30 | section['subtitles'].append(subsection['name']) 31 | return section 32 | 33 | def _get_titles(self, spec): 34 | titles = {} 35 | for node in spec: 36 | if node.tagname == 'section': 37 | section = self._get_title(node) 38 | titles[section['name']] = section['subtitles'] 39 | return titles 40 | 41 | def _check_titles(self, fname, titles): 42 | expected_titles = ('Problem description', 'Proposed change', 43 | 'Implementation', 'Dependencies', 44 | 'Testing', 'Documentation Impact', 45 | 'References') 46 | self.assertEqual( 47 | sorted(expected_titles), 48 | sorted(titles.keys()), 49 | "Expected titles not found in document %s" % fname) 50 | 51 | proposed = 'Proposed change' 52 | self.assertIn('Alternatives', titles[proposed]) 53 | self.assertIn('Data model impact', titles[proposed]) 54 | self.assertIn('REST API impact', titles[proposed]) 55 | self.assertIn('Versioning impact', titles[proposed]) 56 | self.assertIn('Other end user impact', titles[proposed]) 57 | self.assertIn('Deployer impact', titles[proposed]) 58 | self.assertIn('Developer impact', titles[proposed]) 59 | self.assertIn('Horizon impact', titles[proposed]) 60 | 
61 | impl = 'Implementation' 62 | self.assertIn('Assignee(s)', titles[impl]) 63 | self.assertIn('Work Items', titles[impl]) 64 | 65 | def _check_lines_wrapping(self, tpl, raw): 66 | for i, line in enumerate(raw.split("\n")): 67 | if "http://" in line or "https://" in line: 68 | continue 69 | self.assertTrue( 70 | len(line) < 80, 71 | msg="%s:%d: Line limited to a maximum of 79 characters." % 72 | (tpl, i+1)) 73 | 74 | def _check_no_cr(self, tpl, raw): 75 | matches = re.findall('\r', raw) 76 | self.assertEqual( 77 | len(matches), 0, 78 | "Found %s literal carriage returns in file %s" % 79 | (len(matches), tpl)) 80 | 81 | 82 | def _check_trailing_spaces(self, tpl, raw): 83 | for i, line in enumerate(raw.split("\n")): 84 | trailing_spaces = re.findall(" +$", line) 85 | self.assertEqual(len(trailing_spaces),0, 86 | "Found trailing spaces on line %s of %s" % (i+1, tpl)) 87 | 88 | 89 | def test_template(self): 90 | files = ['specs/template.rst'] + glob.glob('specs/*/*/*') 91 | for filename in files: 92 | self.assertTrue(filename.endswith(".rst"), 93 | "spec's file must uses 'rst' extension.") 94 | with open(filename) as f: 95 | data = f.read() 96 | 97 | spec = docutils.core.publish_doctree(data) 98 | titles = self._get_titles(spec) 99 | self._check_titles(filename, titles) 100 | self._check_lines_wrapping(filename, data) 101 | self._check_no_cr(filename, data) 102 | self._check_trailing_spaces(filename, data) -------------------------------------------------------------------------------- /specs/queens/implemented/template-include.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 
4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | =================================================== 8 | Include template definitions from an external file 9 | =================================================== 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/definition-templates 12 | 13 | Define special template files that contain only definitions 14 | (entities / relationships and not scenarios), that can then be included within 15 | other template files and used there to create new scenarios. 16 | 17 | Problem description 18 | =================== 19 | 20 | A lot of templates were redefining the same entities and relationships for use 21 | in their scenarios. 22 | 23 | Proposed change 24 | =============== 25 | 26 | Add a new file (or a set of files) into a "def_templates" directory. These 27 | files are the same as templates but do not contain scenarios or an "include" 28 | section (only definitions): 29 | 30 | .. code-block:: yaml 31 | 32 | metadata: 33 | name: alarm_on_host_defs 34 | description: basic def_template example 35 | definitions: 36 | entities: 37 | - entity: 38 | category: ALARM 39 | type: nagios 40 | name: host_problem 41 | template_id: alarm 42 | - entity: 43 | category: RESOURCE 44 | type: nova.host 45 | template_id: resource 46 | relationships: 47 | - relationship: 48 | source: alarm 49 | target: resource 50 | relationship_type: on 51 | template_id : alarm_on_host 52 | 53 | 54 | Add an "include" section within templates, which states the name that should be 55 | included as it appears in the metadata of the definition template. Multiple 56 | definition templates can be added: 57 | 58 | .. code-block:: yaml 59 | 60 | metadata: 61 | name: alarm_on_host_scenario 62 | description: basic template with an include section example 63 | definitions: 64 | entities: 65 | - entity: 66 | ... 67 | relationships: 68 | - relationship: 69 | ... 70 | include: 71 | - name: alarm_on_host_defs 72 | - name: ...
73 | scenarios: 74 | - scenario: 75 | condition: alarm_on_host 76 | actions: 77 | - action: 78 | action_type: set_state 79 | properties: 80 | state: SUBOPTIMAL 81 | action_target: 82 | target: resource11 83 | 84 | Alternatives 85 | ------------ 86 | 87 | None 88 | 89 | Data model impact 90 | ----------------- 91 | 92 | None 93 | 94 | REST API impact 95 | --------------- 96 | 97 | None. 98 | Should be addressed in a future template CRUD implementation 99 | 100 | Versioning impact 101 | ----------------- 102 | 103 | None - Old template formats will still be supported. Introduces an alternative 104 | version. 105 | 106 | Other end user impact 107 | --------------------- 108 | 109 | None 110 | 111 | Deployer impact 112 | --------------- 113 | 114 | None 115 | 116 | Developer impact 117 | ---------------- 118 | 119 | None 120 | 121 | Horizon impact 122 | -------------- 123 | 124 | Template UI should be changed to show definition template files. 125 | 126 | Topology view 127 | ^^^^^^^^^^^^^ 128 | 129 | No impact 130 | 131 | RCA view 132 | ^^^^^^^^ 133 | 134 | No impact 135 | 136 | 137 | Entity graph 138 | ^^^^^^^^^^^^ 139 | 140 | No impact 141 | 142 | Summary 143 | ^^^^^^^ 144 | 145 | No impacts 146 | 147 | Implementation 148 | ============== 149 | 150 | Assignee(s) 151 | ----------- 152 | 153 | Primary assignee: 154 | nivolas 155 | 156 | Other contributors: 157 | None 158 | 159 | Work Items 160 | ---------- 161 | In scope: 162 | 163 | - Loading definition templates. 164 | - Validating definition templates. 165 | - Tests 166 | - Documentation 167 | 168 | The following items are not in scope: 169 | 170 | - Definition templates with scenarios. 171 | - Recursive includes (a definition template can not include other definition 172 | templates). 173 | 174 | Dependencies 175 | ============ 176 | 177 | None 178 | 179 | Testing 180 | ======= 181 | 182 | The implementation will be covered by additional unit tests and tempest tests. 
183 | 184 | Documentation Impact 185 | ==================== 186 | 187 | Documentation on how to define definition template files and when to use them 188 | 189 | References 190 | ========== 191 | 192 | None 193 | -------------------------------------------------------------------------------- /specs/mitaka/vitrage-support-deduced-alarms.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ================================== 8 | Vitrage Support for Deduced Alarms 9 | ================================== 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/support-deduced-alarms 12 | 13 | Vitrage should support raising deduced alarms. 14 | When an alarm is raised on a certain resource, we might conclude that it results in problems in other related 15 | resources. In this case, we would like to raise deduced alarms on these other resources, even if we got no indication 16 | of a problem from other sources of information. 17 | 18 | Problem description 19 | =================== 20 | 21 | There are cases where we know that a failure in one resource should result in problems in other resources. For example, 22 | high CPU load on a host may cause suboptimal performance of all instances running on this host, thus performance 23 | problems in the applications running on these instances. As far as Nova is concerned, all instances are up and 24 | running, and there is no indication of a problem. 25 | 26 | 27 | Proposed change 28 | =============== 29 | 30 | The Vitrage Evaluator serves as workflow manager controlling the analysis and activation of templates and execution 31 | of template actions. One of its responsibilities is to listen to changes in Vitrage Graph, and upon a change execute 32 | the matching templates. 
This is a general mechanism that should work for all kinds of templates and perform several 33 | kinds of actions. 34 | 35 | The aim of this blueprint is to make sure deduced alarms functionality works properly end to end. 36 | 37 | Whenever a new alarm is raised, Vitrage Graph is updated with a new vertex for this alarm, connected to the relevant 38 | resource. Then, the vitrage evaluator engine looks for templates that could match this new alarm. If deduced alarms 39 | templates are found, the engine will try to find a full match for the entire template in Vitrage Graph. If found, 40 | the engine will ask the notifier to raise new alarms according to the template. 41 | 42 | 43 | Example for a graph with deduced alarms: 44 | 45 | :: 46 | 47 | Original alarm: 48 | +-----------+ +------------+ 49 | |Host High | on | | 50 | |CPU load | +-----------> | Host | 51 | | | | | 52 | +-----------+ +-----+------+ 53 | | 54 | | contains 55 | Deduced alarm | 56 | to raise: | 57 | +-----------+ +-----v------+ 58 | |Instance | on | | 59 | |Suboptimal | +-----------> | Instance | 60 | |Performance| | | 61 | +-----------+ +------------+ 62 | 63 | 64 | 65 | 66 | Alternatives 67 | ------------ 68 | 69 | None 70 | 71 | Data model impact 72 | ----------------- 73 | 74 | None 75 | 76 | REST API impact 77 | --------------- 78 | 79 | None 80 | 81 | Security impact 82 | --------------- 83 | 84 | None 85 | 86 | Pipeline impact 87 | --------------- 88 | 89 | None 90 | 91 | Other end user impact 92 | --------------------- 93 | 94 | None 95 | 96 | Performance/Scalability Impacts 97 | ------------------------------- 98 | 99 | None 100 | 101 | Other deployer impact 102 | --------------------- 103 | 104 | None 105 | 106 | Developer impact 107 | ---------------- 108 | 109 | None 110 | 111 | Horizon impact 112 | -------------- 113 | 114 | None 115 | 116 | 117 | Implementation 118 | ============== 119 | 120 | Assignee(s) 121 | ----------- 122 | 123 | Primary assignee: 124 | ifat_afek 125 | 126 | 
Work Items 127 | ---------- 128 | 129 | The blueprint includes: 130 | 131 | - Define the exact syntax for deduced alarms templates 132 | - Call the notifier to raise alarms 133 | 134 | 135 | Future lifecycle 136 | ================ 137 | 138 | None 139 | 140 | Dependencies 141 | ============ 142 | 143 | - Vitrage Graph 144 | - Vitrage Engine 145 | - Vitrage Notifier 146 | 147 | Testing 148 | ======= 149 | 150 | This change needs to be tested by unit tests. 151 | 152 | Documentation Impact 153 | ==================== 154 | 155 | None 156 | 157 | References 158 | ========== 159 | 160 | https://wiki.openstack.org/wiki/Vitrage 161 | 162 | -------------------------------------------------------------------------------- /specs/queens/implemented/event-persistor.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | =============== 8 | Event Persistor 9 | =============== 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/event-persistor 12 | 13 | The persistor service listens to RabbitMQ2 (on a different topic) and 14 | asynchronously writes the events to a relational database. All events are 15 | stored after the filter/enrich phase. 16 | 17 | Problem description 18 | =================== 19 | 20 | In order to support some of the main future use cases of Vitrage, including full 21 | HA support, alarm history and RCA history, we will need to store the events 22 | that arrive from the collector in a persistent database. 23 | This is explained in `Vitrage HA and History Vision`_. 24 | 25 | ..
_Vitrage HA and History Vision: https://docs.openstack.org/vitrage/latest/contributor/vitrage-ha-and-history-vision.html 26 | 27 | Example of a use case for the stored data: 28 | Reconstructing the graph from the historic data that is controlled by the 29 | processor. This will be used in two cases: 30 | 31 | - Upon failure, in order to initiate the standby vitrage-graph process 32 | - For RCA history 33 | 34 | Proposed change 35 | =============== 36 | 37 | .. _vitrage_persistor: new topic in rabbitMQ2 for the Persistor. 38 | 39 | Add an `event`_ table to the Vitrage database. 40 | Both the Datasource driver and the Service listener (collector) pass the 41 | filtered/enriched events to the persistor via the `vitrage_persistor`_ topic. 42 | Add a Persistor service which listens to the `vitrage_persistor`_ topic and 43 | writes the events to the `event`_ table. 44 | 45 | 46 | Alternatives 47 | ------------ 48 | 49 | None 50 | 51 | Data model impact 52 | ----------------- 53 | 54 | .. _event: a relational database table to store the filtered/enriched events.
55 | 56 | The table will have the following fields: 57 | 58 | +----------------------+------------------------------------------------------+-------------------------------------+ 59 | | Field | Description | Examples | 60 | +======================+======================================================+=====================================+ 61 | | id | INTEGER, Auto-Increment | 19588 | 62 | +----------------------+------------------------------------------------------+-------------------------------------+ 63 | | collector_timestamp | The time the event filtered/enriched in the driver | 2017-10-09 09:19:50 | 64 | +----------------------+------------------------------------------------------+-------------------------------------+ 65 | | payload | The enriched event sent from the collector | JSON representation of the event | 66 | +----------------------+------------------------------------------------------+-------------------------------------+ 67 | 68 | 69 | REST API impact 70 | --------------- 71 | 72 | None 73 | 74 | Versioning impact 75 | ----------------- 76 | 77 | None 78 | 79 | Other end user impact 80 | --------------------- 81 | 82 | None 83 | 84 | Deployer impact 85 | --------------- 86 | 87 | None 88 | 89 | Developer impact 90 | ---------------- 91 | 92 | None 93 | 94 | Horizon impact 95 | -------------- 96 | 97 | None 98 | 99 | Implementation 100 | ============== 101 | 102 | Assignee(s) 103 | ----------- 104 | 105 | Primary assignee: 106 | 7mode3294 107 | 108 | Other contributors: 109 | None 110 | 111 | Work Items 112 | ---------- 113 | - Add new topic `vitrage_persistor`_ in rabbitMQ2 for the Persistor. 114 | - Add `event`_ table in Vitrage database. 115 | - Both Datasource driver and Service listener (collector) passes the filtered/enriched 116 | events to the persistor via `vitrage_persistor`_ topic. 117 | - Add a Persistor service which listens to `vitrage_persistor`_ topic and writes the 118 | events to `event`_ table. 
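The write path of the Persistor can be sketched as follows. This is a minimal illustration under stated assumptions, not the Vitrage implementation: it uses Python's standard ``sqlite3`` module in place of the real Vitrage database layer, it omits the RabbitMQ consumer entirely, and the names ``open_store``, ``persist_event`` and ``load_events`` are hypothetical. Only the three columns (``id``, ``collector_timestamp``, ``payload``) come from the Data model impact section.

```python
import json
import sqlite3
from datetime import datetime

# Hypothetical schema: only the three column names are taken from the spec.
SCHEMA = """
CREATE TABLE IF NOT EXISTS event (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    collector_timestamp TEXT NOT NULL,
    payload TEXT NOT NULL
)
"""


def open_store(path=":memory:"):
    """Open a database and make sure the event table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def persist_event(conn, event, collector_timestamp):
    """Store one filtered/enriched event and return its generated id."""
    cur = conn.execute(
        "INSERT INTO event (collector_timestamp, payload) VALUES (?, ?)",
        (collector_timestamp.strftime("%Y-%m-%d %H:%M:%S"),
         json.dumps(event)),
    )
    conn.commit()
    return cur.lastrowid


def load_events(conn):
    """Return all stored events in insertion order, with decoded payloads."""
    rows = conn.execute(
        "SELECT id, collector_timestamp, payload FROM event ORDER BY id")
    return [(rid, ts, json.loads(payload)) for rid, ts, payload in rows]
```

In a real service, ``persist_event`` would presumably be called from the listener on the ``vitrage_persistor`` topic, and ``load_events`` hints at how a standby process might replay history in order.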
119 | 120 | Dependencies 121 | ============ 122 | 123 | None 124 | 125 | Testing 126 | ======= 127 | 128 | The implementation will be covered by additional unit tests and tempest tests. 129 | 130 | Documentation Impact 131 | ==================== 132 | 133 | None 134 | 135 | References 136 | ========== 137 | 138 | - https://docs.openstack.org/vitrage/latest/contributor/vitrage-ha-and-history-vision.html 139 | -------------------------------------------------------------------------------- /specs/mitaka/vitrage-evaluator-engine.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ================ 8 | Evaluator Engine 9 | ================ 10 | 11 | Launchpad blueprint: 12 | 13 | https://blueprints.launchpad.net/vitrage/+spec/evaluator-engine 14 | 15 | The Vitrage Evaluator serves as a workflow manager controlling the analysis and activation of templates and execution of template actions. 16 | 17 | The Evaluator engine is the core of the Vitrage Evaluator. It is responsible for managing the templates, executing them over the Vitrage Graph and running their actions.
18 | 19 | :: 20 | 21 | +---------------------+ 22 | | | 23 | | <------------------+ 24 | | Vitrage Graph | | 25 | | | +-------+-------------------+ 26 | | | | Vitrage Evaluator | 27 | +--------------+------+ | | 28 | | | +------+ | 29 | | | | RCA | | 30 | +-----------------> | | | 31 | | +------+ | 32 | | +------+ | 33 | +---------------------+ | |Deduce| | 34 | | <----------+ |Alarm | | 35 | | Vitrage Notifier | | +------+ | 36 | | | | | 37 | | | +---------------------------+ 38 | | | 39 | +---------------------+ 40 | 41 | 42 | Problem description 43 | =================== 44 | 45 | Vitrage requires a component that is responsible for managing and executing templates, which are the basis for the different algorithms used in Vitrage, such as RCA. 46 | 47 | **Use cases:** 48 | #. When a new **instance** is added/removed/updated in the Vitrage Graph, find the templates relevant to this change and compare the pattern specified in each template to the new graph structure. For each pattern match found, execute the actions specified in the template. 49 | 50 | #. When a new **alarm** is added/removed/updated in the Vitrage Graph, find the templates relevant to this change and compare the pattern specified in each template to the new graph structure. For each pattern match found, execute the actions specified in the template. 51 | 52 | 53 | Proposed change 54 | =============== 55 | 56 | **Managing Templates** 57 | 58 | Users can perform CRUD actions on templates, in order to make changes to the use cases Vitrage supports. Specifically, when adding a template it should be added and stored in graph format and saved in NetworkX (Graph-based in-memory DB). Similarly, this template can be modified or deleted as well. The functionality added here should have a corresponding set of API calls for users to perform these actions. 59 | 60 | **Executing Templates** 61 | 62 | When a change takes place in the Vitrage Graph, it informs the Evaluator Engine of the change. 
The engine will then search for templates relevant to this change and compare the pattern specified in each template to the new graph structure. For each pattern match found, execute the actions specified in the template. 63 | 64 | 65 | Alternatives 66 | ------------ 67 | None 68 | 69 | Data model impact 70 | ----------------- 71 | The templates are saved in NetworkX graph-base in memory DB. 72 | 73 | REST API impact 74 | --------------- 75 | The functionality added here for **managing** templates should have a corresponding set of API calls for users to perform these actions. 76 | 77 | Versioning impact 78 | ----------------- 79 | None 80 | 81 | Other end user impact 82 | --------------------- 83 | None 84 | 85 | Deployer impact 86 | --------------- 87 | None 88 | 89 | Developer impact 90 | ---------------- 91 | None 92 | 93 | Horizon impact 94 | -------------- 95 | None 96 | 97 | Implementation 98 | ============== 99 | 100 | Assignee(s) 101 | ----------- 102 | TBD 103 | 104 | Work Items 105 | ---------- 106 | 107 | 108 | Dependencies 109 | ============ 110 | 111 | * API with Vitrage Graph 112 | * API with Vitrage Notifier 113 | 114 | 115 | Testing 116 | ======= 117 | TBD 118 | 119 | 120 | Documentation Impact 121 | ==================== 122 | TBD 123 | 124 | 125 | References 126 | ========== 127 | TBD -------------------------------------------------------------------------------- /specs/ocata/event-api.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ========= 8 | Event API 9 | ========= 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/support-inspector-sb-api 12 | 13 | This blueprint defines an API for sending events to Vitrage datasources. 
14 | 15 | 16 | Problem description 17 | =================== 18 | As a reference implementation of the OPNFV Doctor Inspector project, Vitrage has 19 | to implement the Inspector SB API. This is a REST API that each monitoring 20 | service should call in order to push events to the Inspector. 21 | 22 | 23 | Proposed change 24 | =============== 25 | 26 | A new API will be added to allow sending events to the Vitrage datasources. 27 | The events will be sent to the Oslo message bus, and will be consumed by one of 28 | the datasource drivers according to the ``type`` property. 29 | 30 | For example, the Doctor datasource will handle events of type ``compute.host.down``, 31 | while the Nova instance datasource will handle events of type ``compute.instance.delete.start``. 32 | (Using this API for Nova is not a real use case, but can be used for debugging). 33 | 34 | For the Doctor datasource, the events will contain the details defined in the 35 | Doctor specification. Future datasources may also use this API in order to send 36 | their own events to Vitrage. 37 | 38 | 39 | REST API impact 40 | --------------- 41 | 42 | POST /v1.0/event/ 43 | ^^^^^^^^^^^^^^^^^ 44 | 45 | Post an event to the Vitrage message queue, to be consumed by a datasource driver. 46 | 47 | Headers 48 | """"""" 49 | 50 | - X-Auth-Token (string, required) - Keystone auth token 51 | - Accept (string) - application/json 52 | - User-Agent (string) 53 | - Content-Type (string) - application/json 54 | 55 | Path Parameters 56 | """"""""""""""" 57 | 58 | None. 59 | 60 | Query Parameters 61 | """""""""""""""" 62 | 63 | None. 64 | 65 | Request Body 66 | """""""""""" 67 | 68 | The event to be posted. It will contain the following fields: 69 | 70 | - time: a timestamp of the event. In case of a monitor event, it should specify when the fault occurred. 71 | - type: the type of the event. 72 | - details: a key-value map of metadata.
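The three fields above can be assembled into a request body as in the following sketch. It is illustrative only: ``build_event`` is a hypothetical helper, not part of the Vitrage client, and wrapping the fields in an ``event`` key with an ISO-style timestamp without timezone is an assumption taken from the request example in this spec.

```python
import json
from datetime import datetime


def build_event(event_type, time, details=None):
    """Build the JSON-serializable body for posting one event.

    Hypothetical helper: the 'event' wrapper and the timestamp format
    follow the request example in this spec, nothing more.
    """
    return {
        "event": {
            "time": time.strftime("%Y-%m-%dT%H:%M:%S"),
            "type": event_type,
            # Copy the caller's map so later changes do not leak into the body.
            "details": dict(details or {}),
        }
    }


body = build_event(
    "compute.host.down",
    datetime(2016, 4, 12, 8, 0, 0),
    {"hostname": "compute-1", "severity": "critical"},
)
encoded = json.dumps(body)
```

The ``encoded`` string is what a monitor would send as the request payload, together with the headers listed earlier.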

A list of some potential details, copied from the Doctor SB API reference:

- hostname: the hostname on which the event occurred.
- source: the display name of the reporter of this event. This is not limited to monitors; another entity can be specified, such as ‘KVM’.
- cause: a description of the cause of this event, which could be different from the type of this event.
- severity: the severity of this event, as set by the monitor.
- status: the status of the target object in which the error occurred.
- monitorID: the ID of the monitor sending this event.
- monitorEventID: the ID of the event in the monitor. This can be used by the operator while tracking the monitor log.
- relatedTo: an array of IDs related to this event.

Request Examples
""""""""""""""""
::

    POST /v1/event/
    Host: 135.248.18.122:8999
    User-Agent: keystoneauth1/2.3.0 python-requests/2.9.1 CPython/2.7.6
    Content-Type: application/json
    Accept: application/json
    X-Auth-Token: 2b8882ba2ec44295bf300aecb2caa4f7


::

    {
        "event": {
            "time": "2016-04-12T08:00:00",
            "type": "compute.host.down",
            "details": {
                "hostname": "compute-1",
                "source": "sample_monitor",
                "cause": "link-down",
                "severity": "critical",
                "status": "down",
                "monitor_id": "monitor-1",
                "monitor_event_id": "123"
            }
        }
    }

Response
""""""""

Status code
~~~~~~~~~~~

- 200 - OK
- 400 - Bad request

Response Body
~~~~~~~~~~~~~

Returns an empty response body if the request was OK.
Otherwise returns a detailed error message (e.g. 'missing time parameter').
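
As an illustration of the request described above, the following is a minimal sketch that builds (but does not send) such a POST using only the Python standard library. The helper name ``build_event_request``, the endpoint host and the token placeholder are assumptions made for this example; the URL path follows the request example above.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_event_request(base_url, token, event_type, details, when=None):
    """Build, without sending, a POST request carrying one event.

    Hypothetical helper: only the body layout (time/type/details) and the
    headers come from the spec above.
    """
    when = when or datetime.now(timezone.utc)
    body = {
        "event": {
            "time": when.strftime("%Y-%m-%dT%H:%M:%S"),
            "type": event_type,
            "details": details,
        }
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/event/",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "X-Auth-Token": token,
        },
    )


request = build_event_request(
    "http://vitrage.example.org:8999",  # assumed endpoint, not a real host
    "<keystone-token>",                 # placeholder, not a real token
    "compute.host.down",
    {"hostname": "compute-1", "source": "sample_monitor", "status": "down"},
    when=datetime(2016, 4, 12, 8, 0, 0),
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to ``urllib.request.urlopen`` would actually send it; per the spec, a 200 response carries an empty body and a 400 response carries an error message.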

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  ifat-afek

Testing
=======

The changes will be tested by unit tests and tempest tests.

Documentation Impact
====================
The new API should be documented.

References
==========

- https://wiki.opnfv.org/display/doctor/Doctor+Home
- http://artifacts.opnfv.org/doctor/docs/requirements/05-implementation.html
  section 4.5.6
- https://blueprints.launchpad.net/vitrage/+spec/doctor-datasource

--------------------------------------------------------------------------------
/specs/newton/heat-datasource.rst:
--------------------------------------------------------------------------------
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

===============================================
Heat Datasource Driver - get_all implementation
===============================================

launchpad blueprint:
https://blueprints.launchpad.net/vitrage/+spec/heat-datasource-get-all

This blueprint describes the Heat driver for Vitrage Datasource, and its
implementation for get_all Stacks and Stack Resources.

Problem description
===================

Heat stacks should be added to Vitrage Graph via Vitrage Datasources.
This requires writing a Datasource driver for Heat.

The driver should support two modes:

* get_all: query all Heat Stacks.
* notifications: notify the datasource upon a change in a stack definition or state.

This blueprint refers to the get_all implementation.

Proposed change
===============

The Heat driver will be configured with:

* Poll interval in seconds (default: 600)

Every poll-interval seconds, the Heat driver will call the Heat API that lists all stacks. The stacks will be converted to Vitrage datasource events and passed to the Vitrage Graph queue.


Alternatives
------------

None

Data model impact
-----------------

A Heat event will be sent to the Vitrage Graph queue with the following properties:

+-----------------+------------------------------------------+--------------------------------------+
| Field           | Description                              | Examples                             |
+=================+==========================================+======================================+
| stack_id        | The stack UUID                           | e47e1be6-3598-46e6-bb63-0cc9a4e35ad7 |
+-----------------+------------------------------------------+--------------------------------------+
| name            | The stack name                           | mystack                              |
+-----------------+------------------------------------------+--------------------------------------+
| description     | The stack description                    | mydescription                        |
+-----------------+------------------------------------------+--------------------------------------+
| user_project_id | The ID of the tenant that owns the stack | 5542b27142154f30b32dea6238aa81aa     |
+-----------------+------------------------------------------+--------------------------------------+
| owner           | The ID of the user that owns the stack   | 5555527142154f30b32dea6238aa81aa     |
+-----------------+------------------------------------------+--------------------------------------+
| status          | The stack status                         | in progress / failed                 |
+-----------------+------------------------------------------+--------------------------------------+
| status_reason   | The stack status reason                  | Stack create started ....            |
+-----------------+------------------------------------------+--------------------------------------+




REST API impact
---------------

None

Versioning impact
-----------------

None

Other end user impact
---------------------

None

Deployer impact
---------------

The Heat driver should be configured.

Developer impact
----------------

None

Horizon impact
--------------

None

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  dan-offek

Work Items
----------

None

Dependencies
============

None

Testing
=======

This blueprint requires unit tests and Tempest tests.

Documentation Impact
====================

None

References
==========

Datasource main blueprint:
https://github.com/openstack/vitrage-specs/blob/master/specs/mitaka/vitrage-datasource.rst

--------------------------------------------------------------------------------
/specs/ocata/static-datasource-configuration.rst:
--------------------------------------------------------------------------------
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==========================================
Static Data Source Configuration
==========================================

https://blueprints.launchpad.net/vitrage/+spec/static-datasource-config-format

The configuration of the static data source has a lot in common with the entity
and relationship definitions in the evaluator template. This blueprint proposes
a refactoring to reuse the template format and parsing methods in the static
data source. By doing so, we may reduce the maintenance work and bring in new
features more easily.

Problem description
===================

Currently the configuration of the static data source uses a dedicated format,
which overlaps considerably with the evaluator templates.

In the static configuration, there are ``entities`` and their ``relationships``

.. code-block:: yaml

    - entities
      - {entity}
      - {entity}

In each entity

.. code-block:: yaml

    - name:
      id:
      relationship:
        - {relationship}
        - {relationship}

In evaluator templates we define ``entities``, ``relationships`` and
``scenarios``. Each scenario has a condition and actions.

.. code-block:: yaml

    - definitions
      - entities
        - {entity}
        - {entity}
      - relationships
        - {relationship}
        - {relationship}

Though serving different purposes, they both

#. Describe ``entities`` and ``relationships``
#. Use a dedicated key (id/template_id) to reference the items
#.
   Include a source entity and a target entity in each relationship

The main differences between the two are

- Evaluator templates define a topology and scenarios based on it
- Static config defines a topology and **adds** it to the graph

We may define the static configuration using the same format as the evaluator
templates, and then simulate an entity discovery from the same file.

By reusing the template parsing engine and workflow, we may reduce the
maintenance work and bring in new features more easily.

Proposed change
===============

Refactor the static data source template to use the same parser as the
evaluator template.

Discover entities from the static data source template.

For backward compatibility, the static data source will take over control of
the default configuration folder ``/etc/vitrage/static_datasource``, which was
used by the static physical datasource.

Both the legacy format and the new format will be placed in the same folder.
The static datasource parses the file and checks for the existence of ``meta``
to decide which engine to use. If it is not found, the job is proxied to the
static physical datasource and a deprecation warning is printed.

The static physical datasource will be disabled by default and will throw an
exception if run standalone.

Alternatives
------------

Fix the issues found in the static data source without refactoring the format.
This would keep the best backward compatibility but would duplicate work done
in the scenario evaluator.

Data model impact
-----------------

None

REST API impact
---------------

None

Versioning impact
-----------------

- Backward compatibility with the old format will be kept.
- This introduces a new feature, so a minor version increment is required.
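
The backward-compatible dispatch described under "Proposed change" could look roughly like the sketch below. It assumes the configuration file has already been parsed into Python objects; the function name ``select_parser`` and the returned labels are hypothetical and are not taken from the Vitrage code base.

```python
import warnings


def select_parser(config):
    """Choose a parsing engine for an already-loaded static configuration.

    Hypothetical sketch: a top-level ``meta`` key marks the new,
    template-like format; anything else is treated as the legacy format.
    """
    if isinstance(config, dict) and "meta" in config:
        return "template"
    # Legacy file: hand over to the old parser and warn about deprecation.
    warnings.warn(
        "legacy static datasource format is deprecated",
        DeprecationWarning,
        stacklevel=2,
    )
    return "legacy"


new_style = {"meta": {"name": "demo"}, "definitions": {"entities": []}}
old_style = {"entities": []}

print(select_parser(new_style))
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    print(select_parser(old_style))
```

Keeping the decision in one small function means the legacy branch can be removed in a single place once the old format is dropped.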

Other end user impact
---------------------

The new format of the static data source configuration should be applied by
the end user.

Deployer impact
---------------

The old parser will be kept, but a deprecation warning will be shown.

Developer impact
----------------

None

Horizon impact
--------------

None

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  yujunz

Other contributors:
  None

Work Items
----------

- Reuse the parser of the evaluator template in the static data source configuration.
- Discover entities from the configuration.
- Add a deprecation warning for the old format.

Dependencies
============

None

Testing
=======

The changes shall be covered by new unit tests.

Documentation Impact
====================

The new format of the template shall be documented.

References
==========

None
--------------------------------------------------------------------------------
/specs/newton/template-list-api.rst:
--------------------------------------------------------------------------------
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

=============================
Vitrage Get Template List API
=============================

https://blueprints.launchpad.net/vitrage/+spec/template-list-api

An API for listing all templates loaded from the templates library, both those that passed validation and those that did not.

Problem description
===================

We would like to be able to list all templates loaded from /etc/vitrage/templates,
both those that passed validation and those that did not, before uploading them to Vitrage.

Proposed change
===============
Create an API to list all templates loaded by Vitrage.

#. Valid template - a template that passed validation and was loaded into the Scenario Repository.
#. Invalid template - a template that did not pass validation.

The template list API returns a table with the following columns:

#. uuid - a unique id generated by Vitrage
#. name - the template's name
#. status - whether the template passed validation or not
#. status details
#. date - when template validation occurred (before template loading is executed)

Alternatives
------------

None

Data model impact
-----------------

None

REST API impact
---------------

Template List
^^^^^^^^^^^^^

Returns the template list.

GET /
~~~~~

Headers
^^^^^^^

- X-Auth-Token (string, required) - Keystone auth token
- Accept (string) - application/json
- User-Agent (string)

Path Parameters
^^^^^^^^^^^^^^^

None.

Query Parameters
^^^^^^^^^^^^^^^^

None.


Request Body
^^^^^^^^^^^^

None.

Request Examples
^^^^^^^^^^^^^^^^
::

    GET /v1/template/
    Host: 135.248.18.122:8999
    User-Agent: keystoneauth1/2.3.0 python-requests/2.9.1 CPython/2.7.6
    Accept: application/json
    X-Auth-Token: 2b8882ba2ec44295bf300aecb2caa4f7

Response
~~~~~~~~

Status code
^^^^^^^^^^^

- 200 - OK
- 400 - Bad request

Response Body
^^^^^^^^^^^^^

Returns a table that is a list of all templates.
Each row describes a template and its status.

Response Examples
^^^^^^^^^^^^^^^^^

::

    +--------------------------------------+---------------------------------------+--------+--------------------------------------------------+----------------------+
    | uuid                                 | name                                  | status | status details                                   | date                 |
    +--------------------------------------+---------------------------------------+--------+--------------------------------------------------+----------------------+
    | 67bebcb4-53b1-4240-ad05-451f34db2438 | vm_down_causes_suboptimal_application | failed | Entity definition must contain template_id field | 2016-06-29T12:24:16Z |
    | 4cc899e6-f6cb-43d8-94a0-6fa937e41ae2 | host_cpu_load_causes_vm_problem       | pass   | Template validation is OK                        | 2016-06-29T12:24:16Z |
    | 0548367e-711a-4c08-9bdb-cb61f96fed04 | switch_connectivity_issues            | pass   | Template validation is OK                        | 2016-06-29T12:24:16Z |
    | 33cb4400-f846-4c64-b168-530824d38f3e | host_nic_down                         | pass   | Template validation is OK                        | 2016-06-29T12:24:16Z |
    | a04cd155-0fcf-4409-a27c-c83ba8b20a3c | disconnected_storage_problems         | pass   | Template validation is OK                        | 2016-06-29T12:24:16Z |
    +--------------------------------------+---------------------------------------+--------+--------------------------------------------------+----------------------+

Security impact
---------------

None

Pipeline impact
---------------

None

Other end user impact
---------------------

None

Performance/Scalability Impacts
-------------------------------

None


Other deployer impact
---------------------

None

Developer impact
----------------

None


Implementation
==============

Assignee(s)
-----------

liat har-tal


Work Items
----------

None

Future lifecycle
================

None

Dependencies
============

None

Testing
=======

Tempest tests also need to be added in order to test:

#. Get template list

Documentation Impact
====================
The new API should be documented.

References
==========
None

--------------------------------------------------------------------------------
/specs/queens/implemented/template-CRUD.rst:
--------------------------------------------------------------------------------
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

======================================
Add CRUD support for template addition
======================================

https://blueprints.launchpad.net/vitrage/+spec/crud-templates


Adding CRUD support means that templates can be added/removed in real time while Vitrage is up.

Templates added/removed via the API will be stored in the database so they remain after restarting Vitrage.
SQLAlchemy will be used for DB management.

Problem description
===================

Currently, Vitrage templates are loaded from a specific folder during Vitrage startup.
Adding/removing templates while Vitrage services are running requires a restart of the vitrage-graph service.


Proposed change
===============
Templates should be stored in the database instead of a specific folder in the file system.
That way they can be modified (added/deleted) while Vitrage is up.

The evaluator will perform a live update of the entity graph according to the actions specified in the added/removed templates.


Template add:

- Validate the template.
- Store the template in the database.
- Notify the evaluator.
- Evaluate the entity graph with the new actions.

Delete:

- Delete the template from the database.
- Notify the evaluator.
- Evaluate the entity graph to undo the template's actions.


Update - not supported at this stage.
In order to update a template, delete it and add the new version.

A few changes should be made in order to implement CRUD support:

- Support template Add/Remove commands in vitrageclient (Vitrage API)
- API handler: Store/Delete Templates to and from the Database.
- Graph cloning logic should be extracted to a base class.
- Add a new DB table called templates

Template DB Table:

::

    +----------------+--------------+------+-----+---------+-------+
    | Field          | Type         | Null | Key | Default | Extra |
    +----------------+--------------+------+-----+---------+-------+
    | created_at     | datetime     | YES  |     | NULL    |       |
    | updated_at     | datetime     | YES  |     | NULL    |       |
    | id             | varchar(64)  | NO   | PRI | NULL    |       |
    | status         | varchar(16)  | YES  |     | NULL    |       |
    | status_details | varchar(128) | YES  |     | NULL    |       |
    | name           | varchar(128) | NO   |     | NULL    |       |
    | file_content   | text         | NO   |     | NULL    |       |
    | type           | varchar(64)  | YES  |     | NULL    |       |
    +----------------+--------------+------+-----+---------+-------+



Alternatives
------------

None

Data model impact
-----------------

A new database table.

REST API impact
---------------

PUT and DELETE methods will be added.


Versioning impact
-----------------

None

Other end user impact
---------------------

New CLI commands added:

Vitrage template add

Vitrage template delete

Deployer impact
---------------

None

Developer impact
----------------

None

Horizon impact
--------------

None

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  ikinory

Other contributors:
  None

Work Items
----------

- API support for add/remove template.
- Implement the database table via SQLAlchemy.
- Implement queries to the database.
- Tests as explained in "Testing".
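
To make the table layout and the add/delete flow more concrete, here is a minimal sketch. The spec calls for SQLAlchemy; this example deliberately swaps in the standard-library ``sqlite3`` module so that it runs without extra dependencies. The function names, the ``status`` value and the sample template content are assumptions, and validation and evaluator notification are left out.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

# Column names and nullability follow the "Template DB Table" above.
SCHEMA = """
CREATE TABLE templates (
    created_at     TEXT,
    updated_at     TEXT,
    id             VARCHAR(64) NOT NULL PRIMARY KEY,
    status         VARCHAR(16),
    status_details VARCHAR(128),
    name           VARCHAR(128) NOT NULL,
    file_content   TEXT NOT NULL,
    type           VARCHAR(64)
)
"""


def add_template(conn, name, file_content, template_type="standard"):
    """Store a template row and return its generated id (hypothetical helper)."""
    template_id = str(uuid.uuid4())
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO templates "
        "(created_at, updated_at, id, status, name, file_content, type) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        # "pending" is an assumed status value, not one defined by the spec.
        (now, now, template_id, "pending", name, file_content, template_type),
    )
    conn.commit()
    return template_id


def delete_template(conn, template_id):
    """Delete a template row; return True if a row was removed."""
    cursor = conn.execute("DELETE FROM templates WHERE id = ?", (template_id,))
    conn.commit()
    return cursor.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
template_id = add_template(conn, "host_nic_down", "metadata:\n  name: host_nic_down\n")
print(conn.execute("SELECT name, type FROM templates").fetchall())
```

In the flow described in the spec, the evaluator would be notified after each of these calls so that the entity graph can be re-evaluated.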


Dependencies
============

None

Testing
=======

API:

- template add:

  * add all types of templates: standard, equivalence, definition.
  * add a corrupted template and check that adding it fails.
  * add a folder of templates.

- template delete:

  * check all types of templates: standard, equivalence, definition.

- template list.
- template show:

  * compare the CLI template content to the original file content.

e2e:
evaluate the added/deleted templates on the entire graph.

- test that the evaluator reloads templates:

  example:

  1. raise a trigger alarm (the template is not loaded yet).

  2. add the relevant template.

  3. check that the action is executed.

  This checks that the evaluators are reloaded and run on all existing vertices.

Documentation Impact
====================

Template add and delete should be added. Modify template validate and list.

Changes should be added to:

API description: vitrage/doc/source/contributor/vitrage-api.rst

CLI description: doc/source/contributor/cli.rst

References
==========

None

--------------------------------------------------------------------------------
/specs/mitaka/vitrage-support-rca.rst:
--------------------------------------------------------------------------------
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

===================
Vitrage RCA Support
===================

https://blueprints.launchpad.net/vitrage/+spec/support-rca

Vitrage should support RCA calculation.
This feature will allow us to track the cause-and-effect path of an alarm raised in OpenStack.
In order to track the
local causal relationships between alarm pairs, we shall use one or more RCA templates which will specify which alarms
cause which alarms.

Problem description
===================

In case of a major failure in the system, we might get a lot of alarms, which will be hard to track. We would like to
identify the root cause of the alarms, so the user can focus on understanding and fixing this alarm.

Proposed change
===============

The Vitrage Evaluator serves as a workflow manager controlling the analysis and activation of templates and the execution
of template actions. One of its responsibilities is to listen to changes in Vitrage Graph, and upon a change execute
the matching templates. This is a general mechanism that should work for all kinds of templates and perform several
kinds of actions.

The aim of this blueprint is to make sure RCA functionality works properly end to end.

Whenever the Vitrage Graph is updated, we will calculate RCA and optionally connect alarm vertices with "causes" edges.
When RCA relations are queried for a certain alarm (i.e. which alarm(s) caused it and which alarm(s) were caused by it),
we will traverse the already-existing "causes" edges and return the RCA tree.
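
The query step described above can be sketched as a walk over the "causes" edges in both directions. The plain dictionary used as a graph and the function name ``rca_tree`` are assumptions made for illustration; Vitrage itself stores these edges in its entity graph.

```python
from collections import deque


def rca_tree(causes, alarm):
    """Return the alarms that led to `alarm` and the alarms it led to.

    `causes` maps an alarm to the set of alarms it directly causes.
    Hypothetical sketch, not the Vitrage implementation.
    """
    # Build the reverse mapping so the edges can be followed backwards.
    caused_by = {}
    for source, targets in causes.items():
        for target in targets:
            caused_by.setdefault(target, set()).add(source)

    def reachable(start, neighbors):
        # Breadth-first traversal; `seen` also guards against cycles.
        seen, queue = set(), deque([start])
        while queue:
            node = queue.popleft()
            for nxt in neighbors.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    return {
        "caused_by": reachable(alarm, caused_by),
        "causes": reachable(alarm, causes),
    }


graph = {
    "switch alarm": {"host alarm"},
    "host alarm": {"instance alarm"},
}
print(rca_tree(graph, "host alarm"))
```

Because the edges are computed in advance when the graph changes, a query only pays for this traversal and not for re-evaluating templates.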

Example for a graph with causes edges:

::

    +---------------+              +-------------+
    |               |      on      |             |
    | switch alarm  | +----------> |   switch    |
    |               |              |             |
    +------+--------+              +-------+-----+
           |                               |
    causes |                               | attached
           |                               |
           v                               v

    +---------------+              +-------------+
    |               |      on      |             |
    |  host alarm   | +----------> |    host     |
    |               |              |             |
    +------+--------+              +-------+-----+
           |                               |
    causes |                               | contains
           |                               |
           |                               |
           v                               v

    +---------------+              +-------------+
    |               |      on      |             |
    | instance alarm| +----------> |  instance   |
    |               |              |             |
    +---------------+              +-------------+


Alternatives
------------

We could re-calculate the RCA relationship whenever someone queries it, but this would be inefficient. Calculating
in advance and keeping the results in Vitrage Graph makes more sense.

Data model impact
-----------------

None

REST API impact
---------------

The API is defined in a separate blueprint: https://blueprints.launchpad.net/vitrage/+spec/rca-api

Security impact
---------------

None

Pipeline impact
---------------

None

Other end user impact
---------------------

None

Performance/Scalability Impacts
-------------------------------

Performance should be tested.
Most of the performance risk is in the common blueprints like https://blueprints.launchpad.net/vitrage/+spec/networkx-graph-driver
(see also https://blueprints.launchpad.net/vitrage/+spec/networkx-performance-improvement). However, we will also need
to have specific tests for RCA.

Other deployer impact
---------------------

None

Developer impact
----------------

None

Horizon impact
--------------

We should develop a Horizon UI plugin for viewing the RCA relationship. This should be described in a separate blueprint.


Implementation
==============

Assignee(s)
-----------

Primary assignee:
  ifat_afek

Work Items
----------

The blueprint includes:

- Define the exact syntax for RCA templates
- Mark the causal relationship between two alarms. We would implement it using an action that adds a "causes" edge between the alarm vertices in Vitrage Graph.
- Define and implement the method to query the RCA relations for a given alarm


Future lifecycle
================

None

Dependencies
============

- Vitrage Graph
- Vitrage Engine

Testing
=======

This change needs to be tested by unit tests.

Documentation Impact
====================

None

References
==========

https://wiki.openstack.org/wiki/Vitrage


--------------------------------------------------------------------------------
/specs/queens/implemented/refactor-execute-mistral.rst:
--------------------------------------------------------------------------------
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

===============================
Refactor execute-mistral action
===============================

launchpad blueprint:
https://blueprints.launchpad.net/vitrage/+spec/refactor-execute-mistral-definition

The definition of the execute-mistral action should be changed, to better
support the integration of Vitrage and Mistral.

Problem description
===================

With the current execute-mistral action definition, it is possible to pass
string values to the Mistral workflow, but not dynamic attributes like the id
or IP address of the instance that was matched in the template condition.
We should enhance the template language to support passing dynamic attributes
to Mistral.

Another issue is that the structure of the action definition should be changed.
In the current structure, optional input parameters to the workflow appear on
the same level as the 'workflow' property, which is a mandatory part of the
action definition:

.. code-block:: yaml

    - action:
        action_type: execute_mistral
        properties:
          workflow: evacuate_host
          timeout: 10
          force: false

Instead, we should gather all optional parameters (timeout and force) under
an 'input' section.


Proposed change
===============

The first part of the change is to create a new 'input' section, under which
all input parameters of the workflow should be placed. A new versioning
mechanism should be introduced to Vitrage templates, to allow validating and
loading of both the old format and the new format. At the first stage both
formats will be supported, but in a version or two we should deprecate the old
format.

The second part is to define a way to describe which attribute of a specific
entity should be passed to the workflow.
The suggested solution is to use
a syntax that is similar to the HOT template.

We will introduce a get_attr() function with the following parameters:

* ``resource template_id``: the id of the resource inside the template. Note
  that the resource must be part of the condition.

* ``attribute``: the name of the attribute to use. The attribute will be taken
  from the resource vertex in the graph, if the condition is met.


.. code-block:: yaml

    - scenario:
        condition: host_down_alarm_on_host
        actions:
          - action:
              action_type: execute_mistral
              properties:
                workflow: evacuate_host
                input:
                  host_id: get_attr(host, "id")
                  host_ip_addr: get_attr(host, "ip_address")
                  timeout: 10
                  force: false


Alternatives
------------

One alternative is to replace get_attr with a shorter syntax. In this case,
we will refer to an entity attribute by the entity template-id and the
attribute name. For example, host.ip_address will mark the ip_address
attribute of the host.

The example above will look like:

.. code-block:: yaml

    - action:
        action_type: execute_mistral
        properties:
          workflow: evacuate_host
          input:
            host_id: host.id
            host_ip_addr: host.ip_address
            timeout: 10
            force: false

This looks much nicer, but has two main disadvantages:

* It is not clear, just by looking at the template, that this is a reference to
  an attribute. What if the user wants to pass a string which is "host.id"?

* It is less generic. On the other hand, if we add get_attr as a function
  syntax, we will be able to add other functions in the future with a similar
  syntax.
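
To illustrate how the proposed ``get_attr`` references might be resolved before the input is handed to Mistral, here is a minimal sketch. The regular expression, the function name ``resolve_input`` and the shape of the ``matched`` dictionary (template_id mapped to vertex properties) are assumptions for this example, not the Vitrage implementation.

```python
import re

# Matches strings such as: get_attr(host, "ip_address")
_GET_ATTR = re.compile(r'^get_attr\(\s*(\w+)\s*,\s*"?(\w+)"?\s*\)$')


def resolve_input(inputs, matched):
    """Replace get_attr(...) values with attributes of the matched entities.

    `matched` maps a template_id from the condition to the properties of the
    graph vertex that matched it. Other values are passed through unchanged.
    """
    resolved = {}
    for key, value in inputs.items():
        match = _GET_ATTR.match(value) if isinstance(value, str) else None
        if match:
            template_id, attribute = match.groups()
            resolved[key] = matched[template_id][attribute]
        else:
            resolved[key] = value
    return resolved


inputs = {
    "host_id": 'get_attr(host, "id")',
    "host_ip_addr": 'get_attr(host, "ip_address")',
    "timeout": 10,
    "force": False,
}
matched = {"host": {"id": "host-1", "ip_address": "10.0.0.5"}}
print(resolve_input(inputs, matched))
```

Because the function-call form is unambiguous, a literal string such as "host.id" passes through untouched, which is the first disadvantage of the shorter syntax listed above.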


Data model impact
-----------------

None

REST API impact
---------------

None

Versioning impact
-----------------

The suggested change is not backward-compatible with Pike. Vitrage templates
should be enhanced to support versioning, and both the old version and the new
one should be supported for now.

Other end user impact
---------------------

None

Deployer impact
---------------

None

Developer impact
----------------

None

Horizon impact
--------------

None


Implementation
==============

Assignee(s)
-----------

Primary assignee:
  ifat-afek

Work Items
----------

* Support versioning in Vitrage templates. Allow per-version validators and
  loaders for specific actions.
* Move optional input parameters under 'input' section
* Support get_attr

Dependencies
============

None

Testing
=======

The implementation will be covered by unit tests and tempest tests.

Documentation Impact
====================

The changes in the action definition should be documented.

References
==========

`Vitrage template format: `_

--------------------------------------------------------------------------------
/specs/mitaka/nova-entity-transformer.rst:
--------------------------------------------------------------------------------
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

=======================
Nova Entity Transformer
=======================

https://blueprints.launchpad.net/vitrage/+spec/nova-entity-transformer

When an entity event is introduced into the Vitrage graph, it must
be transformed into a type which can be handled by the graph.
The Entity Transformer is responsible for converting an entity event
into an entity wrapper. Each entity type has its own transformer.

All transformers should be written in Python and added to vitrage.conf.
The Entity Transformer Engine allows a service provider to extend
support for other entities by writing a designated transformer for each
entity type.

The first transformers to implement are:

* Nova Instance transformer
* Nova Host transformer

Problem description
===================

In the Vitrage graph, each (instance of an) entity is represented as a vertex.
Therefore, the output of a transformer is a vertex for the given entity
(an "entity vertex").
In addition, the transformer also returns a list of (vertex, edge) pairs.
The vertex in each pair describes a neighbor with limited properties, and the
edge represents the connection between both vertices, describing their
relationship.

There are different entity types that should be supported by Vitrage:

* OpenStack types (Nova.instance, Nova.host, etc.)
* Non-OpenStack types (Nagios tests, physical resources, etc.)

Note: in the Vitrage graph, each vertex contains a dictionary of key-value
pairs which represents the entity properties. Similarly, each edge contains a
dictionary of key-value pairs which represents aspects of the relation between
two vertices.
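
The output shape described above (one entity vertex plus a list of
(vertex, edge) pairs, each carrying a property dictionary) can be sketched in
Python as follows. The ``Vertex`` and ``Edge`` tuples, the property names and
the ``transform_instance`` function are hypothetical and only illustrate the
structure; they are not taken from the Vitrage code base.

```python
from collections import namedtuple

# Hypothetical containers: an id plus a dictionary of key-value properties.
Vertex = namedtuple("Vertex", ["vertex_id", "properties"])
Edge = namedtuple("Edge", ["source_id", "target_id", "properties"])


def transform_instance(event):
    """Turn an (assumed) Nova instance event into an entity vertex and its neighbors."""
    entity = Vertex(
        event["id"],
        {"type": "RESOURCE", "sub_type": "nova.instance", "name": event["name"]},
    )
    # The neighbor vertex carries only enough data to locate it in the graph.
    host = Vertex(event["host"], {"type": "RESOURCE", "sub_type": "nova.host"})
    # The edge describes the relationship: the host contains the instance.
    edge = Edge(host.vertex_id, entity.vertex_id, {"relation_type": "contains"})
    return entity, [(host, edge)]


entity_vertex, neighbors = transform_instance(
    {"id": "vm-1", "name": "web-server", "host": "compute-0"}
)
```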


Proposed change
===============

When the Entity Processor receives a new entity event, it asks the Entity
Transformer to convert the event into an object which the processor can then
enter into the Vitrage Graph.

Transformer Operation: When receiving an entity event, the Transformer Engine
first recognizes the entity type and accordingly activates the corresponding
transformer. Each transformer inherits from the Transformer base class and
implements three methods:

* A method that returns an entity wrapper
* A method that returns the key fields and their order. The key fields are
  mandatory
* A method that returns a key for a given entity event

Output Object
-------------

The transformer returns an Entity Wrapper, which is a tuple containing an
entity vertex and a list of (vertex, edge) pairs that describe the entity
neighbors (relationships).

**Entity (source) vertex description:**

Mandatory properties:

* ``key`` - For OpenStack entities this is the OpenStack ID. For non-OpenStack
  entities this is an ID which will be generated by Vitrage
* ``Type`` - Resource / Physical Resource / Alarm / Test Results
* ``Sub Type`` - Alarm Name, host, instance, switch, etc.
* ``Entity Name``
* ``Is Deleted`` - needed for graph maintenance; marks items that can be
  gathered by the garbage collector

Optional properties (vertex metadata):

* ``State``
* ``Project ID``

The optional properties list is flexible and can be changed as needed.
In addition, each entity type can have its own relevant properties.

**(vertex, edge) Pairs:**

The pair describes an entity's neighbor vertex and their relationship.
Relationships can be physical, virtual or (in the future) logical.

``Target vertex description``:

The vertex in the pair must have sufficient data to uniquely identify
which vertex in the Vitrage Graph will be connected to this entity.
Currently, the minimal information needed for this is:

* ID - identifies the vertex that the entity connects to
* Type - Resource / Physical Resource / Alarm / Test Results
* Sub Type - Alarm Name, host, instance, switch, etc.

``Edge description``:

* Source ID - the entity ID
* Target ID - identifies the vertex that the entity connects to
* Relation Type - contains, run, attached, etc.

**Event Type:**

The type of the event as it happened. Possible options:

* CREATE - A new entity has been created
* UPDATE - The entity has been updated
* DELETE - The entity has been deleted

Alternatives
------------

None

Data model impact
-----------------

TBD

REST API impact
---------------

None

Versioning impact
-----------------

None

Other end user impact
---------------------

None

Deployer impact
---------------

TBD

Developer impact
----------------

TBD

Horizon impact
--------------

None


Implementation
==============

Assignee(s)
-----------

liat har-tal


Work Items
----------
None


Dependencies
============

None


Testing
=======

All code will be tested


Documentation Impact
====================

None


References
==========

Vitrage project -------------------------------------------------------------------------------- /specs/mitaka/synchronizer-nagios-get-all.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | =================================================== 8 | Nagios Synchronizer Plugin - get_all implementation 9 | =================================================== 10 | 11 | launchpad blueprint: 12 | https://blueprints.launchpad.net/vitrage/+spec/synchronizer-nagios-get-all 13 | 14 | This blueprint describes the Nagios plugin for Vitrage Synchronizer, and its 15 | implementation for get_all nagios services (tests). 16 | 17 | Problem description 18 | =================== 19 | 20 | Nagios test results should be added to Vitrage Graph via Vitrage Synchronizer. 21 | This requires writing a Synchronizer plugin for Nagios. 22 | 23 | The plugin should support two modes: 24 | 25 | * get_all: query the results of all Nagios tests and send corresponding events to the Synchronizer 26 | * notifications: notify the Synchronizer upon a change in one of Nagios test results 27 | 28 | This blueprint refers to get_all implementation. 29 | 30 | As a first stage, we will support only Nagios 3. 31 | 32 | Proposed change 33 | =============== 34 | 35 | The current Nagios plugin will support Nagios 3.x. 36 | 37 | Nagios plugin will be configured with: 38 | 39 | * Nagios URI, for example: 40 | http://10.45.1.10/monitoring/nagios/cgi-bin/status.cgi?host=all&limit=0 41 | * Nagios credentials 42 | * Poll interval in seconds (default: 60) 43 | 44 | Every poll-interval seconds, Nagios plugin will call Nagios URI to query the 45 | results of Nagios services (tests). It will parse the returned html (there is 46 | no REST API for Nagios 3.x), create Nagios events and send them to Vitrage 47 | Graph queue. 
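
The polling cycle described above can be sketched in Python as follows. This
is only an outline under stated assumptions: ``fetch_html`` and
``parse_services`` are hypothetical callables standing in for the HTTP call to
the Nagios URI and for the HTML parsing, and a standard ``queue.Queue`` stands
in for the Vitrage Graph queue.

```python
import queue


def poll_once(fetch_html, parse_services, event_queue):
    """Run one poll cycle: fetch the status page, parse it, enqueue the events.

    fetch_html and parse_services are injected (hypothetical) callables, so
    that the cycle itself does not depend on the Nagios HTML layout.
    """
    html = fetch_html()
    count = 0
    for service in parse_services(html):
        # Mark every event with its source, a constant value for this plugin.
        event = dict(service, vitrage_entity_type="nagios")
        event_queue.put(event)
        count += 1
    return count


# Stubs used only to exercise the sketch; a real plugin would query Nagios.
def fake_fetch():
    return "compute-0-0.local;CPU load;OK"


def fake_parse(text):
    resource_name, service, status = text.split(";")
    return [{"resource_name": resource_name, "service": service, "status": status}]


events = queue.Queue()
sent = poll_once(fake_fetch, fake_parse, events)
first_event = events.get_nowait()

# A real plugin would repeat poll_once() every poll-interval seconds,
# for example in a loop that calls time.sleep(poll_interval).
```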
48 | 49 | 50 | Alternatives 51 | ------------ 52 | 53 | None 54 | 55 | Data model impact 56 | ----------------- 57 | 58 | Nagios event will look like that: 59 | 60 | +-----------------------+--------------------------------------+----------------------+ 61 | | Field | Description | Examples | 62 | +=======================+======================================+======================+ 63 | | resource_name | name of Nagios host | - compute-0-0.local | 64 | | | | - os-glance-00.local | 65 | | | | - ilo.node14 | 66 | +-----------------------+--------------------------------------+----------------------+ 67 | | resource_type | type of Nagios host | - nova.host | 68 | | | | - nova.instance | 69 | | | | - switch | 70 | +-----------------------+--------------------------------------+----------------------+ 71 | | service | name of Nagios service (test) | - CPU load | 72 | | | | - check_ceph_health | 73 | +-----------------------+--------------------------------------+----------------------+ 74 | | status | the status of the service | - OK | 75 | | | | - WARNING | 76 | | | | - CRITICAL | 77 | | | | - UNKNOWN | 78 | +-----------------------+--------------------------------------+----------------------+ 79 | | last_check | last time the service was checked | 2016-01-04 19:17:10 | 80 | +-----------------------+--------------------------------------+----------------------+ 81 | | duration | duration since the last status change| 1d 2h 55m 48s | 82 | +-----------------------+--------------------------------------+----------------------+ 83 | | attempt | how many attempts were made | 1/3 | 84 | +-----------------------+--------------------------------------+----------------------+ 85 | | status_info | additional information | OK - 15min load 1.66 | 86 | | | | at 32 CPUs | 87 | +-----------------------+--------------------------------------+----------------------+ 88 | | vitrage_entity_type | the source of information | nagios | 89 | | | | (constant value) | 90 | 
+-----------------------+--------------------------------------+----------------------+ 91 | 92 | 93 | 94 | REST API impact 95 | --------------- 96 | 97 | None 98 | 99 | Versioning impact 100 | ----------------- 101 | 102 | None 103 | 104 | Other end user impact 105 | --------------------- 106 | 107 | None 108 | 109 | Deployer impact 110 | --------------- 111 | 112 | Nagios plugin should be configured with Nagios URI, credentials and poll 113 | interval. 114 | 115 | Developer impact 116 | ---------------- 117 | 118 | None 119 | 120 | Horizon impact 121 | -------------- 122 | 123 | None 124 | 125 | Implementation 126 | ============== 127 | 128 | Assignee(s) 129 | ----------- 130 | 131 | Primary assignee: 132 | ifat-afek 133 | 134 | Work Items 135 | ---------- 136 | 137 | None 138 | 139 | Dependencies 140 | ============ 141 | 142 | None 143 | 144 | Testing 145 | ======= 146 | 147 | This blueprint requires unit tests and Tempest tests. 148 | 149 | Documentation Impact 150 | ==================== 151 | 152 | Nagios Configuration should be documented 153 | 154 | References 155 | ========== 156 | 157 | Nagios Configuration: 158 | https://github.com/openstack/vitrage/blob/master/doc/source/nagios-config.rst 159 | 160 | Synchronizer main blueprint: 161 | https://github.com/openstack/vitrage-specs/blob/master/specs/mitaka/vitrage-synchronizer.rst 162 | -------------------------------------------------------------------------------- /specs/rocky/implemented/k8s-datasource.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 

 http://creativecommons.org/licenses/by/3.0/legalcode

=====================================================
Kubernetes Datasource Driver - get_all implementation
=====================================================

launchpad blueprint:
https://blueprints.launchpad.net/vitrage/+spec/k8s-datasource

This blueprint describes the Kubernetes datasource, and its
implementation for get_all nodes (VMs).

Problem description
===================

Kubernetes nodes should be added to the Vitrage Graph via Vitrage Datasources.
This requires writing a Datasource for Kubernetes.

The datasource should support get_all: periodically query all Kubernetes nodes.


Proposed change
===============

The Kubernetes datasource will be configured with:

* Poll interval in seconds (default: 600)

Every poll-interval seconds, the Kubernetes Driver will call the Kubernetes
client to retrieve all nodes.
The nodes will be converted to Vitrage datasource events and passed to the
Vitrage Graph queue.
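
As a rough illustration of the conversion step, the Python sketch below turns
one node into a flat event. It assumes that the node is available as a plain
dictionary with the same ``metadata`` / ``spec`` / ``status`` layout as the
client output; the ``node_to_event`` name, the event keys and the rule that
derives ``provider`` from the ``providerID`` prefix are assumptions made for
illustration, not part of the spec.

```python
def node_to_event(node):
    """Extract the fields of interest from a node dictionary (assumed layout)."""
    metadata = node.get("metadata", {})
    spec = node.get("spec", {})
    status = node.get("status", {})

    provider_id = spec.get("providerID", "")
    # Assumption: the provider is the scheme part, e.g. "openstack:///<id>".
    provider = provider_id.split(":", 1)[0] if ":" in provider_id else None

    return {
        "name": metadata.get("name"),
        "uid": metadata.get("uid"),
        "creation_timestamp": metadata.get("creationTimestamp"),
        # Keep IP addresses only; the Hostname entry is skipped.
        "addresses": [
            addr["address"]
            for addr in status.get("addresses", [])
            if addr.get("type", "").endswith("IP")
        ],
        "provider": provider,
        "provider_id": provider_id,
    }


sample_node = {
    "metadata": {
        "name": "bcmt-control-01",
        "uid": "68011206-d4d6-11e7-9c63-fa163e2e2123",
        "creationTimestamp": "2017-11-29T07:24:59Z",
    },
    "spec": {"providerID": "openstack:///41c40aab-80e9-4bb6-a280-27976bfc811f"},
    "status": {
        "addresses": [
            {"address": "172.16.1.12", "type": "InternalIP"},
            {"address": "10.5.138.49", "type": "InternalIP"},
            {"address": "bcmt-control-01", "type": "Hostname"},
        ]
    },
}
event = node_to_event(sample_node)
```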
36 | 37 | Example of kubernetes client return call for list_nodes() :: 38 | 39 | apiVersion: v1 40 | items: 41 | - apiVersion: v1 42 | kind: Node 43 | metadata: 44 | annotations: 45 | node.alpha.kubernetes.io/ttl: "0" 46 | volumes.kubernetes.io/controller-managed-attach-detach: "true" 47 | creationTimestamp: 2017-11-29T07:24:59Z 48 | labels: 49 | beta.kubernetes.io/arch: amd64 50 | beta.kubernetes.io/os: linux 51 | failure-domain.beta.kubernetes.io/region: regionOne 52 | failure-domain.beta.kubernetes.io/zone: zone0 53 | is_control: "true" 54 | kubernetes.io/hostname: bcmt-control-01 55 | name: bcmt-control-01 56 | namespace: "" 57 | resourceVersion: "10714282" 58 | selfLink: /api/v1/nodes/bcmt-control-01 59 | uid: 68011206-d4d6-11e7-9c63-fa163e2e2123 60 | spec: 61 | externalID: 41c40aab-80e9-4bb6-a280-27976bfc811f 62 | providerID: openstack:///41c40aab-80e9-4bb6-a280-27976bfc811f 63 | taints: 64 | - effect: NoExecute 65 | key: is_control 66 | timeAdded: null 67 | value: "true" 68 | status: 69 | addresses: 70 | - address: 172.16.1.12 71 | type: InternalIP 72 | - address: 10.5.138.49 73 | type: InternalIP 74 | - address: bcmt-control-01 75 | type: Hostname 76 | allocatable: 77 | cpu: "2" 78 | memory: 3779500Ki 79 | pods: "110" 80 | capacity: 81 | cpu: "2" 82 | memory: 3881900Ki 83 | pods: "110" 84 | conditions: 85 | - lastHeartbeatTime: 2018-02-15T12:08:58Z 86 | lastTransitionTime: 2018-02-14T13:39:53Z 87 | message: kubelet has sufficient disk space available 88 | reason: KubeletHasSufficientDisk 89 | status: "False" 90 | type: OutOfDisk 91 | - lastHeartbeatTime: 2018-02-15T12:08:58Z 92 | lastTransitionTime: 2018-02-14T13:39:53Z 93 | message: kubelet has sufficient memory available 94 | reason: KubeletHasSufficientMemory 95 | status: "False" 96 | type: MemoryPressure 97 | - lastHeartbeatTime: 2018-02-15T12:08:58Z 98 | lastTransitionTime: 2018-02-14T13:39:53Z 99 | message: kubelet has no disk pressure 100 | reason: KubeletHasNoDiskPressure 101 | status: "False" 102 | 
type: DiskPressure 103 | - lastHeartbeatTime: 2018-02-15T12:08:58Z 104 | lastTransitionTime: 2018-02-14T13:39:53Z 105 | message: kubelet is posting ready status 106 | reason: KubeletReady 107 | status: "True" 108 | type: Ready 109 | daemonEndpoints: 110 | kubeletEndpoint: 111 | Port: 10250 112 | images: 113 | - names: 114 | - 172.16.1.4:5000/gcr.io/google-containers/hyperkube-amd64 115 | - 172.16.1.4:5000/gcr.io/google-containers/hyperkube-amd64:v1.7.4 116 | sizeBytes: 615424570 117 | nodeInfo: 118 | architecture: amd64 119 | bootID: 883c98a9-17ea-40f9-af7d-a448ad817249 120 | containerRuntimeVersion: docker://1.12.6 121 | kernelVersion: 3.10.0-514.21.1.el7.x86_64 122 | kubeProxyVersion: v1.7.4 123 | kubeletVersion: v1.7.4 124 | machineID: 10783ea106f742728fede153a98b035d 125 | operatingSystem: linux 126 | osImage: Red Hat Enterprise Linux Server 7.3 (Maipo) 127 | systemUUID: 41C40AAB-80E9-4BB6-A280-27976BFC811F 128 | 129 | Relevant data will be extracted: 130 | - Creation Timestamp. 131 | - Name. 132 | - Addresses (IP's) 133 | - kubernetes ID (uid) 134 | - provider 135 | - providerID 136 | 137 | 138 | Alternatives 139 | ------------ 140 | 141 | None 142 | 143 | Data model impact 144 | ----------------- 145 | New vertices will be added to the entity graph. This might be duplicate vertices. 146 | (VM's from Nova and kubernetes). 147 | Proposed solution is resource equivalence. 
(planned for future work) 148 | 149 | REST API impact 150 | --------------- 151 | 152 | None 153 | 154 | 155 | Versioning impact 156 | ----------------- 157 | 158 | None 159 | 160 | Other end user impact 161 | --------------------- 162 | 163 | None 164 | 165 | Deployer impact 166 | --------------- 167 | 168 | Kubernetes driver should be configured.(get access to master node) 169 | 170 | Developer impact 171 | ---------------- 172 | 173 | None 174 | 175 | Horizon impact 176 | -------------- 177 | 178 | None 179 | 180 | Implementation 181 | ============== 182 | 183 | Assignee(s) 184 | ----------- 185 | 186 | Primary assignee: 187 | Idan-kinory 188 | 189 | Work Items 190 | ---------- 191 | 192 | None 193 | 194 | Dependencies 195 | ============ 196 | 197 | None 198 | 199 | Testing 200 | ======= 201 | 202 | This blueprint requires unit tests. 203 | 204 | Documentation Impact 205 | ==================== 206 | 207 | Datasource configuration. 208 | 209 | References 210 | ========== 211 | 212 | Datasource main blueprint: 213 | https://blueprints.launchpad.net/vitrage/+spec/k8s-datasource -------------------------------------------------------------------------------- /specs/stein/implemented/services_list_and_statuses.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ================ 8 | Service List API 9 | ================ 10 | 11 | StoryBoard link: https://storyboard.openstack.org/#!/story/2004897 12 | 13 | This spec adds a new api for listing all vitrage services and their status. 14 | 15 | Problem description 16 | =================== 17 | 18 | In a production cloud environment, Vitrage will have multiple services 19 | deployed on multiple hosts. 
This will enable an admin to find these services 20 | and get details like: 21 | 22 | * what is the node on which vitrage service is running, 23 | * what is the running status of vitrage service. 24 | * How long the vitrage services are running successfully. 25 | 26 | 27 | Proposed change 28 | =============== 29 | 30 | A new api command ``vitrage service list`` will be added that will list all 31 | the vitrage services that are currently running, where are they running, their 32 | status and how long are they running. 33 | 34 | Example of a response 35 | --------------------- 36 | 37 | .. code-block:: json 38 | 39 | [ 40 | { 41 | "Created At": "2019-02-10T11:07:15+00:00", 42 | "Hostname": "controller-1", 43 | "Process Id": 23161, 44 | "Name": "ApiWorker worker(0)" 45 | }, 46 | { 47 | "Created At": "2019-02-10T11:07:15+00:00", 48 | "Hostname": "controller-1", 49 | "Process Id": 23153, 50 | "Name": "EvaluatorWorker worker(0)" 51 | }, 52 | { 53 | "Created At": "2019-02-10T11:07:15+00:00", 54 | "Hostname": "controller-1", 55 | "Process Id": 23155, 56 | "Name": "EvaluatorWorker worker(1)" 57 | }, 58 | { 59 | "Created At": "2019-02-10T11:07:15+00:00", 60 | "Hostname": "controller-1", 61 | "Process Id": 23157, 62 | "Name": "EvaluatorWorker worker(2)" 63 | }, 64 | { 65 | "Created At": "2019-02-10T11:07:15+00:00", 66 | "Hostname": "controller-1", 67 | "Process Id": 23158, 68 | "Name": "EvaluatorWorker worker(3)" 69 | }, 70 | { 71 | "Created At": "2019-02-10T11:07:33+00:00", 72 | "Hostname": "controller-1", 73 | "Process Id": 23366, 74 | "Name": "MachineLearningService worker(0)" 75 | }, 76 | { 77 | "Created At": "2019-02-10T11:07:35+00:00", 78 | "Hostname": "controller-1", 79 | "Process Id": 23475, 80 | "Name": "PersistorService worker(0)" 81 | }, 82 | { 83 | "Created At": "2019-02-10T11:07:15+00:00", 84 | "Hostname": "controller-1", 85 | "Process Id": 23164, 86 | "Name": "SnmpParsingService worker(0)" 87 | }, 88 | { 89 | "Created At": "2019-02-10T11:14:30+00:00", 90 | 
"Hostname": "controller-1", 91 | "Process Id": 25698, 92 | "Name": "vitrageuWSGI worker 1" 93 | }, 94 | { 95 | "Created At": "2019-02-10T11:14:30+00:00", 96 | "Hostname": "controller-1", 97 | "Process Id": 25699, 98 | "Name": "vitrageuWSGI worker 2" 99 | }, 100 | { 101 | "Created At": "2019-02-10T11:07:32+00:00", 102 | "Hostname": "controller-1", 103 | "Process Id": 23352, 104 | "Name": "VitrageNotifierService worker(0)" 105 | } 106 | ] 107 | 108 | 109 | 110 | CLI Example 111 | ----------- 112 | 113 | .. code-block:: console 114 | 115 | +----------------------------------+------------+--------------+---------------------------+ 116 | | Name | Process Id | Hostname | Created At | 117 | +----------------------------------+------------+--------------+---------------------------+ 118 | | ApiWorker worker(0) | 23161 | controller-1 | 2019-02-10T11:07:15+00:00 | 119 | | EvaluatorWorker worker(0) | 23153 | controller-1 | 2019-02-10T11:07:15+00:00 | 120 | | EvaluatorWorker worker(1) | 23155 | controller-1 | 2019-02-10T11:07:15+00:00 | 121 | | EvaluatorWorker worker(2) | 23157 | controller-1 | 2019-02-10T11:07:15+00:00 | 122 | | EvaluatorWorker worker(3) | 23158 | controller-1 | 2019-02-10T11:07:15+00:00 | 123 | | MachineLearningService worker(0) | 23366 | controller-1 | 2019-02-10T11:07:33+00:00 | 124 | | PersistorService worker(0) | 23475 | controller-1 | 2019-02-10T11:07:35+00:00 | 125 | | SnmpParsingService worker(0) | 23164 | controller-1 | 2019-02-10T11:07:15+00:00 | 126 | | vitrageuWSGI worker 1 | 25698 | controller-1 | 2019-02-10T11:14:30+00:00 | 127 | | vitrageuWSGI worker 2 | 25699 | controller-1 | 2019-02-10T11:14:30+00:00 | 128 | | VitrageNotifierService worker(0) | 23352 | controller-1 | 2019-02-10T11:07:32+00:00 | 129 | +----------------------------------+------------+--------------+---------------------------+ 130 | 131 | **Note:** 132 | The cloud operator must pre-install zookeeper or other tooz backend component. 
Otherwise, an exception will be raised when users call the service REST API.

If Vitrage is running in a k8s cluster, this API might be redundant, since
k8s handles pod health and topology. In this case, we might make the service
API communicate with the k8s API to get all Vitrage services and
their statuses.

Data model impact
-----------------

None.

REST API impact
---------------

A new API will be added to Vitrage to list the services.

Versioning impact
-----------------

None.

Other end user impact
---------------------

In order to support the API, we will need a backend to store the information.
We will use the tooz library, which supports multiple backends; see Tooz_.

.. _Tooz: https://docs.openstack.org/tooz/latest/

Deployer impact
---------------

The deployer must pre-install zookeeper or another tooz backend component
in order to support the API.

When deploying in a container, the hostname must be changed
so that the API output is readable.

Developer impact
----------------

We need to decide what to do in the case of a container deployment.
By default, Docker uses the container id as the hostname, but it can be changed.

In the case of a container, we might use an optional environment variable
(e.g. HOST_HOSTNAME), if it exists, for the host name; the variable can be
passed to the container.

Horizon impact
--------------

None.
185 | 186 | 187 | 188 | Implementation 189 | ============== 190 | 191 | Assignee(s) 192 | ----------- 193 | 194 | Primary assignee: 195 | Eyal 196 | 197 | Work Items 198 | ---------- 199 | 200 | * add tooz support 201 | * add new API to vitrage 202 | * add `service list` to vitrage client 203 | * Documentation and tests 204 | 205 | Dependencies 206 | ============ 207 | 208 | Depends on the tooz library with a backend configured. 209 | 210 | Testing 211 | ======= 212 | 213 | Unit tests, functional tests and tempest tests 214 | 215 | Documentation Impact 216 | ==================== 217 | 218 | The new api will be documented 219 | 220 | References 221 | ========== 222 | 223 | None 224 | -------------------------------------------------------------------------------- /specs/pike/implemented/entity-equivalence.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ============================================ 8 | Define and handle equivalence among entities 9 | ============================================ 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/entity-equivalence 12 | 13 | Define equivalence among entities to allow mapping two or more entities to same 14 | object and handle it properly in scenario evaluator. It is inspired by Idan 15 | Hefetz's proposal about `ZTE use case of alarm deduction`_, but designed in a 16 | generic way to allow extending to RESOURCE equivalence in future. 

Problem description
===================

Introducing entity equivalence will enhance the extensibility of Vitrage,
making it suitable for more use cases, such as

- early `deduction`_ of an alarm before it is reported, and handling of the
  real alarm that follows
- `aggregation`_ of equivalent alarms from multiple monitors
- aggregation of resource information from multiple data sources.

Proposed change
===============

Add a new file (or a set of files) that defines equivalence between entities

.. code-block:: yaml

    metadata:
      name: entity equivalence example
    equivalences:
      - equivalence:
          - entity:
              category: ALARM
              type: nagios
              name: host_problem
          - entity:
              category: ALARM
              type: zabbix
              name: host_problem
          - entity:
              category: ALARM
              type: vitrage
              name: host_problem
      - equivalence:
          ...

These definitions will take effect globally, i.e. for every other template.

The evaluator will duplicate every scenario for every equivalent alarm
automatically. For example, in case of the condition::

    condition: nagios_host_problem_on_host and host_contains_vm

Three conditions will be created internally::

    condition: nagios_host_problem_on_host and host_contains_vm
    condition: zabbix_host_problem_on_host and host_contains_vm
    condition: vitrage_host_problem_on_host and host_contains_vm

The idea is that the user will write a single condition, and all equivalent
conditions will be created and evaluated automatically.

Equivalences should be defined explicitly. Including one entity in two or more
equivalence definitions will result in implicit chaining, and is therefore
considered invalid. For example, if ``a eq b`` and ``b eq c`` are defined
separately, it will logically result in an implicit ``a eq c``.
This will introduce unnecessary
complexity in creating templates and should be restricted by the validator.

Alternatives
------------

Separate file vs embedded definition
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Instead of creating a separate file, we may embed the equivalence definitions in
templates by adding a new section ``equivalences``. Entities that are equivalent
to each other are grouped in arrays of their ``template_id``

.. code-block:: yaml

    metadata:
      name: entity equivalence example
    definitions:
      entities:
        - entity:
            category: ALARM
            type: nagios
            name: host_problem
            template_id: nagios_host_problem
        - entity:
            category: ALARM
            type: zabbix
            name: host_problem
            template_id: zabbix_host_problem
        - entity:
            category: ALARM
            type: vitrage
            name: host_problem
            template_id: vitrage_host_problem
        ...
      relationships:
        ...
    equivalences:
      - [nagios_host_problem, zabbix_host_problem, vitrage_host_problem]
    scenarios:
      ...

In this way, there will be less duplication of entity definitions.

However, once an ``equivalent`` edge is added between two alarms, it
*logically* means that they are equivalent in *all* other templates as well,
even if they are not specified this way in those templates. The other
templates will then be less clear, because the equivalence information is not
embedded in them.

The duplication of entity definitions might be resolved by implementing an
``import`` feature in another blueprint.

Adding equivalent edge vs not
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``equivalent`` edges could be created between every two equivalent alarms.
Since all related scenarios have been duplicated, this does not bring extra
value in the evaluator.
133 | 134 | The ``equivalent`` edge could be useful for future evolution such as alarm 135 | aggregation, UI optimization, alarm deduction. It may be implemented in those 136 | blueprints. 137 | 138 | Data model impact 139 | ----------------- 140 | 141 | None 142 | 143 | REST API impact 144 | --------------- 145 | 146 | None 147 | 148 | Versioning impact 149 | ----------------- 150 | 151 | None 152 | 153 | Other end user impact 154 | --------------------- 155 | 156 | None 157 | 158 | Deployer impact 159 | --------------- 160 | 161 | None 162 | 163 | Developer impact 164 | ---------------- 165 | 166 | None 167 | 168 | Horizon impact 169 | -------------- 170 | 171 | There are currently three views in ``vitrage-dashboard`` 172 | 173 | Topology view 174 | ^^^^^^^^^^^^^ 175 | 176 | No impact 177 | 178 | RCA view 179 | ^^^^^^^^ 180 | 181 | More alarms and more ``causes`` edges 182 | 183 | .. TODO:: (yujunz) include example graph 184 | 185 | Entity graph 186 | ^^^^^^^^^^^^ 187 | 188 | - separate vertices for equivalent alarms (nagios, zabbix, vitrage) 189 | - more edges (``equivalent`` and ``on``) 190 | 191 | Summary 192 | ^^^^^^^ 193 | 194 | The impacts on RCA view and Entity graph will only be relevant to cases where 195 | both ``equivalence`` and ``vitrage-dashboard`` are used. We will handle it in 196 | future blueprints. 
197 | 198 | Implementation 199 | ============== 200 | 201 | Assignee(s) 202 | ----------- 203 | 204 | Primary assignee: 205 | yujunz 206 | 207 | Other contributors: 208 | None 209 | 210 | Work Items 211 | ---------- 212 | 213 | - validate and parse equivalence definition in templates 214 | - duplicate scenarios in the scenario repository 215 | - no changes in sub-graph matching or the evaluator 216 | 217 | The following items are not in scope 218 | 219 | - aggregation of equivalent alarms 220 | - ``add-equivalent`` action 221 | - support alarm equivalence in UI 222 | - implement causal tree model for alarm deduction enhancement 223 | - resource equivalence 224 | 225 | Dependencies 226 | ============ 227 | 228 | None 229 | 230 | Testing 231 | ======= 232 | 233 | The implementation will be covered by additional unit test 234 | 235 | Documentation Impact 236 | ==================== 237 | 238 | - documentation on how to define equivalence and when to use it 239 | - declare limitation on resource equivalence 240 | - list known issues when use ``equivalence`` with ``vitrage-dashboard`` 241 | 242 | References 243 | ========== 244 | 245 | .. _ZTE use case of alarm deduction: https://goo.gl/FfDLi8 246 | .. _deduction: https://review.openstack.org/#/c/423000/ 247 | .. _aggregation: https://blueprints.launchpad.net/vitrage/+spec/alarm-aggregation 248 | -------------------------------------------------------------------------------- /specs/queens/implemented/db-support.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ========== 8 | DB Support 9 | ========== 10 | 11 | https://blueprints.launchpad.net/vitrage/+spec/db-support 12 | 13 | There is a need in Vitrage for persistent DB. 
The main reasons are history and 14 | high availability, but there are additional uses for a persistent DB, such as 15 | saving the registered REST notification recipients, and improving performance 16 | using DB capabilities. 17 | We will use SQLAlchemy for DB management, same as Nova, Heat, Aodh and other 18 | projects. 19 | 20 | 21 | Problem description 22 | =================== 23 | 24 | Vitrage should support persistent DB management. 25 | 26 | 27 | Proposed change 28 | =============== 29 | 30 | The development will happen in a few stages. To enable faster development 31 | of Vitrage History / HA / Vitrage notifications, in the first stage only basic 32 | SQLAlchemy support will be developed. 33 | 34 | Note: A good example of an SQLAlchemy implementation can be found in: 35 | 36 | https://github.com/openstack/aodh/blob/master/aodh/storage/impl_sqlalchemy.py 37 | 38 | | 39 | 40 | Basic configuration in vitrage.conf: 41 | 42 | # The SQLAlchemy connection string to use to connect to the database. 43 | 44 | # Example: 45 | 46 | # connection = mysql://root:pass@127.0.0.1:8999/vitrage 47 | 48 | #connection = 49 | 50 | | 51 | 52 | # The SQL mode to be used for MySQL sessions. This option, including the 53 | 54 | # default, overrides any server-set SQL mode. To use whatever SQL mode is set 55 | 56 | # by the server configuration, set this to no value. Example: mysql_sql_mode= 57 | 58 | # (string value) 59 | 60 | # Additional modes and their functionality: 61 | 62 | # https://dev.mysql.com/doc/refman/5.7/en/sql-mode.html#sql-mode-combo 63 | 64 | #mysql_sql_mode = TRADITIONAL 65 | 66 | | 67 | 68 | # Timeout before idle SQL connections are reaped. (integer value) 69 | 70 | #idle_timeout = 3600 71 | 72 | | 73 | 74 | # Minimum number of SQL connections to keep open in a pool. (integer value) 75 | 76 | #min_pool_size = 1 77 | 78 | | 79 | 80 | # Maximum number of SQL connections to keep open in a pool. Setting a value of 81 | 82 | # 0 indicates no limit. 
(integer value) 83 | 84 | #max_pool_size = 5 85 | 86 | | 87 | 88 | # Maximum number of database connection retries during startup. Set to -1 to 89 | 90 | # specify an infinite retry count. (integer value) 91 | 92 | #max_retries = 10 93 | 94 | | 95 | 96 | # Interval between retries of opening a SQL connection. (integer value) 97 | 98 | #retry_interval = 10 99 | 100 | | 101 | 102 | # If set, use this value for max_overflow with SQLAlchemy. (integer value) 103 | 104 | #max_overflow = 50 105 | 106 | | 107 | 108 | # Verbosity of SQL debugging information: 0=None, 100=Everything. (integer 109 | 110 | # value) 111 | 112 | # Minimum value: 0 113 | 114 | # Maximum value: 100 115 | 116 | #connection_debug = 0 117 | 118 | | 119 | 120 | # Add Python stack traces to SQL as comment strings. (boolean value) 121 | 122 | #connection_trace = false 123 | 124 | | 125 | 126 | # If set, use this value for pool_timeout with SQLAlchemy. (integer value) 127 | 128 | #pool_timeout = 129 | 130 | | 131 | 132 | # Enable the experimental use of database reconnect on connection lost. 133 | 134 | # (boolean value) 135 | 136 | #use_db_reconnect = false 137 | 138 | | 139 | 140 | # Seconds between retries of a database transaction. (integer value) 141 | 142 | #db_retry_interval = 1 143 | 144 | | 145 | 146 | # If True, increases the interval between retries of a database operation up to 147 | 148 | # db_max_retry_interval. (boolean value) 149 | 150 | #db_inc_retry_interval = true 151 | 152 | | 153 | 154 | # If db_inc_retry_interval is set, the maximum seconds between retries of a 155 | 156 | # database operation. (integer value) 157 | 158 | #db_max_retry_interval = 10 159 | 160 | | 161 | 162 | # Maximum retries in case of connection error or deadlock error before error is 163 | 164 | # raised. Set to -1 to specify an infinite retry count. 
(integer value) 165 | 166 | #db_max_retries = 20 167 | 168 | 169 | ------ 170 | Step 1 171 | ------ 172 | 173 | Build a basic SQLAlchemy implementation for connecting to the DB. 174 | Create a Vitrage schema in the DB and the required tables (from Vitrage history 175 | spec and Vitrage httppost registration spec). Each developer should add 176 | and test their own tables in the schema. 177 | 178 | The schema and tables should be created dynamically from Vitrage, each time 179 | Vitrage loads. Vitrage will test for the existence of the database tables and, if they 180 | do not exist, create them using a "model" data representation. 181 | Testing validity of existing tables or even fixing them is not required in the 182 | first stage. 183 | 184 | Cinder Data Model example: 185 | https://github.com/openstack/cinder/blob/master/cinder/db/sqlalchemy/models.py 186 | 187 | Create a DAO that will perform hard-coded CRUD on the tables. 188 | 189 | ------ 190 | Step 2 191 | ------ 192 | 193 | - Upgrade / versioning with Alembic. 194 | 195 | - Generic SQL support. 196 | 197 | - Pagination for big queries, such as all of the entity graph or events table. 198 | 199 | 200 | Alternatives 201 | ============ 202 | 203 | Not today. 204 | 205 | Data model impact 206 | ================= 207 | 208 | This whole design is one big Data Model impact. 209 | 210 | REST API impact 211 | =============== 212 | 213 | None 214 | 215 | Security impact 216 | =============== 217 | 218 | None 219 | 220 | Pipeline impact 221 | =============== 222 | 223 | None 224 | 225 | Other end user impact 226 | ===================== 227 | 228 | None 229 | 230 | Performance/Scalability Impacts 231 | =============================== 232 | 233 | Proper indexes should be applied, and, according to performance testing 234 | during development, multiply table data to smaller indexed tables for better data 235 | polling performance. 
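As a rough illustration of Step 1 and of the indexing note above, the sketch below creates a table and its index only when they are missing and wraps them in a small hard-coded CRUD class. It deliberately uses the standard-library ``sqlite3`` module as a stand-in so that it runs without extra dependencies; the spec itself calls for SQLAlchemy models. The ``events`` table, its columns, the index name and the ``EventsDao`` and ``ensure_schema`` names are all hypothetical and are not taken from the Vitrage code base.

```python
import sqlite3

# Hypothetical schema: the real tables come from the Vitrage history and
# httppost registration specs, and would be SQLAlchemy models.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS events ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " vitrage_id TEXT NOT NULL,"
    " collector_timestamp TEXT NOT NULL,"
    " payload TEXT NOT NULL)",
    "CREATE INDEX IF NOT EXISTS ix_events_vitrage_id ON events (vitrage_id)",
)


def ensure_schema(conn):
    """Create missing tables and indexes; existing ones are left untouched."""
    for statement in SCHEMA:
        conn.execute(statement)
    conn.commit()


class EventsDao:
    """Hard-coded CRUD on the hypothetical ``events`` table."""

    def __init__(self, conn):
        self._conn = conn

    def create(self, vitrage_id, timestamp, payload):
        cursor = self._conn.execute(
            "INSERT INTO events (vitrage_id, collector_timestamp, payload)"
            " VALUES (?, ?, ?)",
            (vitrage_id, timestamp, payload),
        )
        self._conn.commit()
        return cursor.lastrowid

    def query(self, vitrage_id):
        # The WHERE clause matches the indexed column.
        cursor = self._conn.execute(
            "SELECT id, vitrage_id, collector_timestamp, payload"
            " FROM events WHERE vitrage_id = ? ORDER BY id",
            (vitrage_id,),
        )
        return cursor.fetchall()

    def delete(self, event_id):
        self._conn.execute("DELETE FROM events WHERE id = ?", (event_id,))
        self._conn.commit()
```

Calling ``ensure_schema`` on every start-up is safe because both statements are no-ops when the objects already exist, which matches the "test existence, create if missing" behaviour described in Step 1. Validating or repairing existing tables is intentionally left out, as in the first stage of the spec.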
236 | 237 | 238 | Other deployer impact 239 | ===================== 240 | 241 | None 242 | 243 | Developer impact 244 | ================ 245 | 246 | None 247 | 248 | 249 | Implementation 250 | ============== 251 | 252 | Assignee(s) 253 | =========== 254 | 255 | Primary assignees: 256 | danoffek. 257 | alexey_weyl. 258 | 259 | Work Items 260 | ========== 261 | 262 | - Create a basic SQLAlchemy implementation to connect to the DB. 263 | 264 | - Create Data Model for events and other tables. 265 | 266 | - Create simple CRUD methods for event storage. 267 | 268 | - CRUD templates 269 | 270 | - Tests as explained in "Testing". 271 | 272 | 273 | Future lifecycle 274 | ================ 275 | 276 | See "Step 2" 277 | 278 | Dependencies 279 | ============ 280 | 281 | None 282 | 283 | Testing 284 | ======= 285 | 286 | Unit tests: 287 | 288 | - Data selection queries 289 | 290 | - Data update 291 | 292 | - Table create with indexes 293 | 294 | - (should not run in the gate or regularly) Performance tests on large table data with 295 | multiple inserts per second over a period of an hour. 296 | 297 | | 298 | 299 | Tempest tests: Each developer should create their own DB Tempest tests. 300 | Example: Create alarms / resources events, and poll the system afterwards. In case the data 301 | wasn't stored in the events table properly, errors should be issued. 302 | 303 | Documentation Impact 304 | ==================== 305 | The DB configuration should be documented in an .rst file, and there should be a 306 | link to it from 307 | https://github.com/openstack/vitrage/blob/master/doc/source/installation-and-configuration.rst 308 | 309 | References 310 | ========== -------------------------------------------------------------------------------- /specs/rocky/approved/add-action-list-panel.rst: -------------------------------------------------------------------------------- 1 | This work is licensed under a Creative Commons Attribution 3.0 Unported 2 | License. 
3 | 4 | http://creativecommons.org/licenses/by/3.0/legalcode 5 | 6 | ============================================= 7 | Add action list panel for entity click action 8 | ============================================= 9 | 10 | The URL of the launchpad blueprint: 11 | 12 | https://blueprints.launchpad.net/vitrage/+spec/add-action-list-panel 13 | 14 | Problem description 15 | =================== 16 | 17 | The Vitrage Dashboard's Entity Graph provides users with visual convenience. 18 | As a result, the cloud administrator or Vitrage users can easily identify the 19 | different situations for each entity. However, the Entity Graph is now provided 20 | for visual functionality and still has a small range of actions that can be 21 | taken in context. 22 | 23 | Proposed changes 24 | ================ 25 | 26 | For each entity in the current entity graph, the user can see information about the 27 | entity through a click action. We will add an action list panel that provides 28 | multiple actions to the user through existing click actions. Users can click an 29 | entity to see a list of available actions from the drop-down menu. The drop-down 30 | menu is located at the bottom of the existing Info panel and configures the action 31 | list based on a setting file. The user can select one of these action lists and enter 32 | the specific parameters required to execute the action through the additional UI. 33 | 34 | Examples of actions that can be provided: 35 | 36 | * Execute Mistral : If the user selects Mistral, a drop-down menu that is included in 37 | the new UI displays a list of workflows currently stored in Mistral. When the user 38 | selects a workflow from the list, the user can enter various parameter values 39 | (for example, Workflow_name, Workflow_input, params). Then ask them to run the 40 | workflow from Mistral server to the API. The functional scope of 41 | Mistral will be expanded in the future. 
42 | 43 | * Performance Test : The user can perform tests by clicking entities such as Nova, 44 | Neutron, and others displayed in the Entity Graph. Like Mistral, the user can view 45 | a list of test scenarios in a new UI, enter various parameter values for testing, 46 | and then request performance testing through the Openstack Rally API. 47 | The range of performance support for this feature will be expanded in the future. 48 | 49 | * Launch web page of monitoring tool : The web of the monitoring tool monitoring 50 | the VM or the physical node entity is displayed. 51 | 52 | * Open related UI for other projects : If the user requests an action from other project 53 | through the action list panel, the results of the action request(e.g. Execute Workflow, 54 | Performance Test) can be checked through the UI of other project. Therefore, this action 55 | opens the UI related with the action in the other project(e.g. Workflow's 'Workflow 56 | Executions' panel from Mistral) in a new tab to see the results of the request. 57 | 58 | The overall workflow is as follows:: 59 | 60 | +-----------------------+ 61 | | Vitrage Dashboard | 62 | | | 63 | | +------------------+ | 64 | | | | | 65 | | | Entity Click | | +-----------------------+ 66 | | | | | | | 67 | | +---------+--------+ | | Other Project's | 68 | | | | | | 69 | +-----------------------+ +-----------^-----------+ 70 | | | 71 | (1)| |(3) 72 | | | 73 | +-----------v-----------+ +-----------+-----------+ 74 | | | | | 75 | | Action list Panel +--------> Action Parameter UI | 76 | | | (2) | | 77 | +-----------+-----------+ +-----------------------+ 78 | | 79 | (4)| 80 | | 81 | +-----------v-----------+ 82 | | | 83 | | Other Project's UI | 84 | | | 85 | +-----------------------+ 86 | 87 | (1) Users can click an entity to see a list of actions that can be 88 | performed through the drop-down at the bottom of the Info panel. 89 | The action list is composed of the settings file configured by the user. 
90 | This means that the list of actions depends on user's environment. 91 | Also, depending on the entity type, the action may be restricted. 92 | 93 | (2) When the user selects an action from the action list, a detailed action 94 | list(e.g. Workflow list, Test scenario list) and a UI for inputting 95 | parameters are displayed. The user can then select detailed actions and enter 96 | parameters in the corresponding UI. 97 | 98 | (3) Enter all the parameters in the parameter UI and press the OK button to 99 | request it from the other project.(e.g. Mistral, Rally) 100 | 101 | (4) The result of the user's request can be viewed in a new tab with the 102 | related project's UI. 103 | 104 | An example configuration file is shown below: 105 | 106 | .. code-block:: ini 107 | 108 | [ACTION_LIST] 109 | mistral = [Mistral Endpoint] 110 | rally = [Rally Endpoint] 111 | monitor_url = [Monitoring Tool URL] 112 | 113 | .. end 114 | 115 | If the user does not enter information for a specific action in the above mentioned 116 | setting file, the action list will not include the corresponding action. 117 | This determines whether the project is installed to request action. 118 | Therefore, the action list is configured based on the setting file, so if 119 | the user wants to receive the action, the user should input the information according 120 | to the user's environment. 121 | 122 | Alternatives 123 | ------------ 124 | None 125 | 126 | Data model impact 127 | ----------------- 128 | None 129 | 130 | REST API impact 131 | --------------- 132 | None 133 | 134 | Versioning impact 135 | ----------------- 136 | None 137 | 138 | Other end user impact 139 | --------------------- 140 | None 141 | 142 | Deployer impact 143 | --------------- 144 | None 145 | 146 | Developer impact 147 | ---------------- 148 | None 149 | 150 | Horizon impact 151 | -------------- 152 | 153 | * When the user clicks an entity in the entity graph, a panel is added to 154 | display a list of actions. 
155 | * Additional actions for the action list can be configured by the user. 156 | * When the user selects an action, the UI for entering the required parameter 157 | values and selecting detailed actions (e.g. workflow list, test scenario) 158 | appears. 159 | 160 | Implementation 161 | ============== 162 | 163 | Assignee(s) 164 | ----------- 165 | 166 | Primary assignee: 167 | MinWookKim 168 | 169 | Work Items 170 | ---------- 171 | 172 | * Add a new panel for entity clicks in the entity graph. 173 | * Show the list of available actions in the new panel. 174 | * The action list can be selected and requested from another project. 175 | * Configure the settings file to organize the action list. 176 | 177 | Dependencies 178 | ============ 179 | None 180 | 181 | Testing 182 | ======= 183 | 184 | * New UI. (action list UI, parameter UI) 185 | * Request API through action list panel and check other project action. 186 | 187 | Documentation Impact 188 | ==================== 189 | 190 | Configuration of the additional action list and the procedure for 191 | adding actions should be documented. 192 | 193 | References 194 | ========== 195 | - https://wiki.openstack.org/wiki/Mistral 196 | - https://docs.openstack.org/rally/latest/ 197 | -------------------------------------------------------------------------------- /specs/stein/implemented/short_template_format.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ===================== 8 | Short Template Format 9 | ===================== 10 | 11 | StoryBoard link: https://storyboard.openstack.org/#!/story/2004871 12 | 13 | This spec suggests a shorter format for Vitrage template files. 14 | The new format should be shorter, faster to read and easier to write. 
15 | 16 | Problem description 17 | =================== 18 | 19 | Vitrage template language is powerful, yet a bit verbose, and it's quite easy 20 | to make syntax mistakes when creating a new template from scratch. 21 | We would like to make it easier to generate any template. 22 | 23 | For more information on the current format, see Template_format_. 24 | 25 | .. _Template_format: https://docs.openstack.org/vitrage/latest/contributor/vitrage-template-format.html 26 | 27 | Proposed change 28 | =============== 29 | 30 | Several simplifications that will result in a shorter template file 31 | while preserving its capabilities. 32 | 33 | 1. Remove ``definitions`` level nesting, ``entities`` will directly replace it 34 | 2. Remove ``relationships``, these will be represented inline in the condition, 35 | example usage ``condition: host_ssh_alarm [ on ] host`` 36 | 3. ``entities`` will be changed from a list to a dictionary, ``template_id`` 37 | is removed and the entity key will be used instead. 38 | 4. In ``actions`` the field ``action_type`` is removed and the action key will be used instead 39 | 5. In each action ``action_target`` nesting level is removed 40 | 6. In ``scenarios`` the ``scenario`` nesting level is removed 41 | 7. Action ``set_state`` will be replaced with ``set_suboptimal``, ``set_error`` and ``set_ok``. 42 | 8. The ``raise_alarm`` action will have a new optional field ``causing_alarm``, which adds a causal relationship 43 | between the new alarm and the causing alarm, removing the need for ``add_causal_relationship`` 44 | 45 | 46 | Example for a short format template 47 | ----------------------------------- 48 | 49 | .. 
code-block:: yaml 50 | 51 | metadata: 52 | version: 3 53 | name: zabbix alarm for network interface and ssh affects host instances 54 | description: zabbix alarm for network interface and ssh affects host instances 55 | entities: 56 | host_network_alarm: 57 | type: zabbix 58 | rawtext: host network interface is down 59 | host_ssh_alarm: 60 | type: zabbix 61 | rawtext: host ssh is down 62 | instance: 63 | type: nova.instance 64 | host: 65 | type: nova.host 66 | scenarios: 67 | - condition: host_ssh_alarm [ on ] host 68 | actions: 69 | - set_suboptimal: 70 | target: host 71 | - condition: host_network_alarm [ on ] host AND host_ssh_alarm [ on ] host 72 | actions: 73 | - add_causal_relationship: 74 | source: host_network_alarm 75 | target: host_ssh_alarm 76 | - condition: host_ssh_alarm [ on ] host AND host [ contains ] instance 77 | actions: 78 | - raise_alarm: 79 | target: instance 80 | alarm_name: instance is down 81 | severity: WARNING 82 | causing_alarm: host_ssh_alarm 83 | - set_error: 84 | target: instance 85 | 86 | Example for an equivalent template in the previous format 87 | --------------------------------------------------------- 88 | 89 | .. 
code-block:: yaml 90 | 91 | metadata: 92 | version: 2 93 | type: standard 94 | name: zabbix alarm for network interface and ssh affects host instances 95 | description: zabbix alarm for network interface and ssh affects host instances 96 | definitions: 97 | entities: 98 | - entity: 99 | category: ALARM 100 | type: zabbix 101 | rawtext: host network interface is down 102 | template_id: host_network_alarm 103 | - entity: 104 | category: ALARM 105 | type: zabbix 106 | rawtext: host ssh is down 107 | template_id: host_ssh_alarm 108 | - entity: 109 | category: ALARM 110 | type: vitrage 111 | name: instance is down 112 | template_id: instance_alarm 113 | - entity: 114 | category: RESOURCE 115 | type: nova.instance 116 | template_id: instance 117 | - entity: 118 | category: RESOURCE 119 | type: nova.host 120 | template_id: host 121 | relationships: 122 | - relationship: 123 | source: host_network_alarm 124 | relationship_type: on 125 | target: host 126 | template_id : network_alarm_on_host 127 | - relationship: 128 | source: host_ssh_alarm 129 | relationship_type: on 130 | target: host 131 | template_id : ssh_alarm_on_host 132 | - relationship: 133 | source: host 134 | relationship_type: contains 135 | target: instance 136 | template_id : host_contains_instance 137 | - relationship: 138 | source: instance_alarm 139 | relationship_type: on 140 | target: instance 141 | template_id : alarm_on_instance 142 | scenarios: 143 | - scenario: 144 | condition: ssh_alarm_on_host 145 | actions: 146 | - action: 147 | action_type: set_state 148 | action_target: 149 | target: host 150 | properties: 151 | state: SUBOPTIMAL 152 | - scenario: 153 | condition: network_alarm_on_host AND ssh_alarm_on_host 154 | actions: 155 | - action: 156 | action_type: add_causal_relationship 157 | action_target: 158 | source: host_network_alarm 159 | target: host_ssh_alarm 160 | - scenario: 161 | condition: ssh_alarm_on_host AND host_contains_instance 162 | actions: 163 | - action: 164 | action_type: 
raise_alarm 165 | action_target: 166 | target: instance 167 | properties: 168 | alarm_name: instance is down 169 | severity: WARNING 170 | - action: 171 | action_type: set_state 172 | action_target: 173 | target: instance 174 | properties: 175 | state: ERROR 176 | - scenario: 177 | condition: ssh_alarm_on_host AND host_contains_instance AND alarm_on_instance 178 | actions: 179 | - action: 180 | action_type: add_causal_relationship 181 | action_target: 182 | source: host_ssh_alarm 183 | target: instance_alarm 184 | 185 | 186 | Alternatives 187 | ------------ 188 | 189 | None. 190 | 191 | 192 | Data model impact 193 | ----------------- 194 | 195 | None. 196 | Templates will be stored in the same data model. 197 | 198 | REST API impact 199 | --------------- 200 | 201 | None 202 | 203 | Versioning impact 204 | ----------------- 205 | 206 | These changes will be part of Vitrage template version 3. 207 | 208 | Other end user impact 209 | --------------------- 210 | 211 | None 212 | 213 | Deployer impact 214 | --------------- 215 | 216 | None 217 | 218 | Developer impact 219 | ---------------- 220 | 221 | None 222 | 223 | Horizon impact 224 | -------------- 225 | 226 | None. 
227 | 228 | Implementation 229 | ============== 230 | 231 | Assignee(s) 232 | ----------- 233 | 234 | Primary assignee: 235 | idan_hefetz 236 | 237 | Work Items 238 | ---------- 239 | 240 | * Introduce Vitrage template version 3 241 | * Support template validation and loading 242 | * Documentation and tests 243 | 244 | Dependencies 245 | ============ 246 | 247 | None 248 | 249 | Testing 250 | ======= 251 | 252 | Unit tests, functional tests and tempest tests 253 | 254 | Documentation Impact 255 | ==================== 256 | 257 | The new template format will be documented 258 | 259 | References 260 | ========== 261 | 262 | None 263 | -------------------------------------------------------------------------------- /specs/mitaka/aodh-notifier.rst: -------------------------------------------------------------------------------- 1 | .. 2 | This work is licensed under a Creative Commons Attribution 3.0 Unported 3 | License. 4 | 5 | http://creativecommons.org/licenses/by/3.0/legalcode 6 | 7 | ============= 8 | Aodh Notifier 9 | ============= 10 | 11 | launchpad blueprint: 12 | https://blueprints.launchpad.net/vitrage/+spec/aodh-notifier 13 | 14 | The Evaluator performs root cause analysis on the Vitrage Graph and may determine that an alarm should be created, deleted or otherwise updated. 15 | Other components are notified of such changes by the Vitrage Notifier service. Among others, Vitrage Notifier is responsible for handling Aodh Alarms. 16 | 17 | This blueprint describes the implementation of Vitrage Notifier for notifying Aodh on Vitrage alarms. 
18 | 19 | :: 20 | 21 | +------------------+ +------------------+ +------------------+ 22 | | Aodh <--+ | | | | 23 | +------------------+ | Update | Vitrage | Raise | Vitrage | 24 | +--------| <----------| | 25 | +------------------+ | Alarm | Notifier | Alarm | Evaluator | 26 | | Other components <--+ | | | | 27 | +------------------+ +------------------+ +------------------+ 28 | 29 | 30 | Problem description 31 | =================== 32 | 33 | Vitrage should be capable of creating, deleting and otherwise updating alarms as requested by the Evaluator Engine. 34 | The notifier is responsible for ensuring these updates are executed. Specifically we will start here with Aodh alarms. 35 | 36 | Main challenges: 37 | 38 | * There is no way to define a 'custom alarm' in Aodh 39 | * Vitrage alarms are based on resources. There is a need to pass the resource information to Aodh 40 | * Several alarms of the same type can be triggered at the same time, each for a different resource. For example, in case there is an alarm on a host, Vitrage will raise a deduced alarm on every instance in this host. 41 | * How can someone ask for notifications on updates of Vitrage alarms? 42 | 43 | 44 | Proposed change 45 | =============== 46 | 47 | The Vitrage Notifier will be separate from the Evaluator, as the two will have different demands of scale and other performance considerations. 48 | The Vitrage Notifier will supply an API used by the Vitrage Evaluator, containing create/delete/update alarm. 49 | 50 | In Aodh, Vitrage alarms will be defined as event alarms, this seems like the most appropriate option. The resource id will be defined in the alarm query. 
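To make the event-alarm idea concrete, here is a minimal sketch of how one alarm definition per resource could be assembled before it is handed to Aodh. The property names and values (``type``, ``event_type``, ``query``, ``severity``, the ``vitrage.alarm.instance_unreachable`` event type) follow the example alarm in this spec. The nesting under ``event_rule``, the function names and the numbering of alarm names are assumptions made for illustration, and the exact request body should be checked against the Aodh version in use.

```python
def build_event_alarm(alarm_name, event_type, resource_id,
                      description, severity="moderate"):
    """Return a dict describing an event alarm bound to one resource.

    Assumption: the rule is nested under ``event_rule``; this spec only
    shows the flattened alarm properties.
    """
    return {
        "name": alarm_name,
        "type": "event",
        "description": description,
        "severity": severity,
        "enabled": True,
        "event_rule": {
            "event_type": event_type,
            # A single query clause pins the alarm to one resource.
            "query": [{
                "field": "resource_id",
                "op": "eq",
                "type": "",
                "value": resource_id,
            }],
        },
    }


def deduced_alarms_for_instances(instance_ids):
    """Build one alarm per instance, all sharing the same event type."""
    return [
        build_event_alarm(
            # Hypothetical naming scheme: a running counter as suffix.
            alarm_name="vitrage_instance_unreachable_%d" % index,
            event_type="vitrage.alarm.instance_unreachable",
            resource_id=instance_id,
            description="Instance is unreachable",
        )
        for index, instance_id in enumerate(instance_ids, start=1)
    ]
```

Building one definition per resource reflects the challenge listed earlier: several alarms of the same type may be raised at once, each differing only in the resource id carried in its query.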
51 | 52 | Vitrage deduced alarms will look like this: 53 | 54 | +---------------------------+---------------------------------------------------------+ 55 | | Property | Value | 56 | +---------------------------+---------------------------------------------------------+ 57 | | alarm_actions | [] | 58 | +---------------------------+---------------------------------------------------------+ 59 | | alarm_id | 4a3cb988-a620-4bf3-87f7-077c751c408f | 60 | +---------------------------+---------------------------------------------------------+ 61 | | description | Instance is unreachable | 62 | +---------------------------+---------------------------------------------------------+ 63 | | enabled | True | 64 | +---------------------------+---------------------------------------------------------+ 65 | | event_type | vitrage.alarm.instance_unreachable | 66 | +---------------------------+---------------------------------------------------------+ 67 | | insufficient_data_actions | [] | 68 | +---------------------------+---------------------------------------------------------+ 69 | | name | vitrage_instance_unreachable_1 | 70 | +---------------------------+---------------------------------------------------------+ 71 | | ok_actions | [] | 72 | +---------------------------+---------------------------------------------------------+ 73 | | project_id | 5542b27142154f30b32dea6238aa81aa | 74 | +---------------------------+---------------------------------------------------------+ 75 | | query | [{'field': 'resource_id', 'type': '', 'value': | 76 | | | 'b0bf3635-d9e8-4624-9793-7aac82948c0a', 'op': 'eq'}] | 77 | +---------------------------+---------------------------------------------------------+ 78 | | repeat_actions | False | 79 | +---------------------------+---------------------------------------------------------+ 80 | | severity | moderate | 81 | +---------------------------+---------------------------------------------------------+ 82 | | state | alarm | 83 | 
+---------------------------+---------------------------------------------------------+ 84 | | type | event | 85 | +---------------------------+---------------------------------------------------------+ 86 | | user_id | 8ab65ef808b245e3ba234b7b3554cb94 | 87 | +---------------------------+---------------------------------------------------------+ 88 | 89 | In this example, Vitrage triggers a deduced alarm that an instance is unreachable due to a failure in the public switch (which was detected by Nagios). 90 | There will be several alarms with the same event_type and different instance ids in their query. 91 | 92 | 93 | There are two options for triggering Vitrage alarms in Aodh; neither is perfect. 94 | 95 | 96 | Alternative 1 97 | ------------- 98 | 99 | Vitrage will create an event alarm in Aodh. 100 | Then, it will send a notification to the message bus. The notification will be converted to a Ceilometer event, which will trigger the Aodh alarm. 101 | 102 | The exact notification and event format are still TBD. 103 | 104 | The main problem with this solution is that the Aodh alarm will be created on-the-fly and triggered immediately, so it will be impossible for another project to register a web-hook on the alarm before it is triggered. 105 | It will be possible to see Vitrage alarms in list-alarms, but not to be notified when they are first triggered. 106 | 107 | 108 | Alternative 2 109 | ------------- 110 | 111 | Vitrage will create an event alarm in Aodh, with 'alarm' state. The event itself will never be sent, so the alarm state will remain 'alarm'. 112 | 113 | The problem with this solution is that Aodh will not send a notification about the alarm being triggered. But since in Alternative 1 it is also impossible to register on the alarm, there is no real difference between the two options. 
114 | 115 | 116 | Data model impact 117 | ----------------- 118 | 119 | None 120 | 121 | REST API impact 122 | --------------- 123 | 124 | None 125 | 126 | Versioning impact 127 | ----------------- 128 | 129 | None 130 | 131 | Other end user impact 132 | --------------------- 133 | 134 | None 135 | 136 | Deployer impact 137 | --------------- 138 | 139 | For Alternative 1 - there is a need to define the notification->event configuration 140 | 141 | For Alternative 2 - None 142 | 143 | Developer impact 144 | ---------------- 145 | 146 | None 147 | 148 | Horizon impact 149 | -------------- 150 | 151 | None 152 | 153 | Implementation 154 | ============== 155 | 156 | Assignee(s) 157 | ----------- 158 | 159 | Primary assignee: 160 | idan-hefetz 161 | 162 | Work Items 163 | ---------- 164 | 165 | None 166 | 167 | Dependencies 168 | ============ 169 | 170 | None 171 | 172 | Testing 173 | ======= 174 | 175 | This blueprint requires unit tests and Tempest tests. 176 | 177 | Documentation Impact 178 | ==================== 179 | 180 | For Alternative 1 - there is a need to document the notification->event configuration 181 | 182 | For Alternative 2 - None 183 | 184 | References 185 | ========== 186 | 187 | Vitrage wiki page: https://wiki.openstack.org/wiki/Vitrage 188 | 189 | Vitrage use cases: https://github.com/openstack/vitrage/blob/master/doc/source/vitrage-use-cases.rst 190 | --------------------------------------------------------------------------------