# chaos-engineering-tools
Tools for Chaos Engineers

# Prerequisites for Chaos Engineering

## 1. High Severity Incident Management (SEVs)

### Papers
* How To Establish a High Severity Incident Management Program https://www.gremlin.com/how-to-establish-a-high-severity-incident-management-program/

### Tools
* Banjaxed - Open source incident management tool https://github.com/intercom-archive/banjaxed
* Cyphon - Open source incident management and response platform https://github.com/dunbarcyber/cyphon
* Arcdata - Open source incident management and volunteer scheduling application for Red Cross Disaster Services https://github.com/redcross/arcdata

## 2. Monitoring & Observability

* Prometheus - The Prometheus monitoring system and time series database. https://github.com/prometheus/prometheus
* PromViz - Promviz is an application that helps you visualize the traffic of your cluster from Prometheus data. https://github.com/nghialv/promviz

## 3. Cost of SEVs

* Availability Calculator - Calculate how much downtime should be permitted in your SLA https://github.com/dastergon/availability-calculator

# Chaos Engineering

## Gremlin

Gremlin enables you to run proactive Chaos Experiments to verify that your system can withstand failure. https://gremlin.com
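
# Example: Downtime Allowed by an SLA

The Availability Calculator entry under "Cost of SEVs" is about working out how much downtime an SLA permits. The arithmetic behind that idea can be sketched in a few lines of Python. This is an illustrative sketch only, not the code of the linked project: the names `PERIODS` and `allowed_downtime` are hypothetical, and the period lengths (a 30-day month, a 365-day year) are assumptions.

```python
from datetime import timedelta

# Period lengths are assumptions for illustration (30-day month, 365-day year).
PERIODS = {
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),
    "year": timedelta(days=365),
}


def allowed_downtime(availability_percent: float, period: timedelta) -> timedelta:
    """Return the downtime budget for a target availability over a period.

    Hypothetical helper: budget = period * (100 - availability) / 100.
    """
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    unavailable_fraction = (100 - availability_percent) / 100
    return timedelta(seconds=period.total_seconds() * unavailable_fraction)


if __name__ == "__main__":
    # Print the downtime budget for a few common availability targets.
    for target in (99.0, 99.9, 99.99):
        for name, length in PERIODS.items():
            print(f"{target}% per {name}: {allowed_downtime(target, length)}")
```

Under these assumptions, a 99.9% target over a 365-day year leaves a budget of roughly 8 hours 45 minutes. Comparing that budget with the time your SEVs actually consumed is one simple way to estimate their cost.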