├── .gitignore
├── 0x01-云原生的来源.md
├── 0x02-监控系统.md
├── 0x03-搭建测试环境.md
├── 0x04-Prometheus-理论(入门篇).md
├── 0x05-Prometheus-理论(进阶篇).md
├── 0x06-Prometheus-实践(入门篇).md
├── 0x07-Prometheus-实践(进阶篇).md
├── 0x08-Alertmanager-告警系统.md
├── 0x09-高可用-Prometheus.md
├── LICENSE
├── README.md
├── example
    ├── grafana-docker.json
    ├── grafana-ingress.json
    ├── grafana-language-echo.json
    ├── grafana-node-exporter.json
    └── grafana-prometheus.json
└── images
    ├── alertmanager-arch.svg
    ├── alertmanager.png
    ├── bbox-alert.png
    ├── bbox-firing.png
    ├── bbox-pending.png
    ├── bbox-slack.png
    ├── bbox-top.png
    ├── cloudnative.png
    ├── cncf.png
    ├── grafana-datasource.png
    ├── grafana-docker.png
    ├── grafana-ingress.png
    ├── grafana-language-echo.png
    ├── grafana-login.png
    ├── grafana-node-exporter.png
    ├── grafana-prometheus.png
    ├── heartbeat.gif
    ├── k8s-dashboard.png
    ├── language-echo-prometheus.png
    ├── logo.png
    ├── monitoring.png
    ├── prometheus-architecture.png
    ├── prometheus-dashboard-0.jpg
    ├── prometheus-dashboard-1.jpg
    ├── prometheus-federation.png
    ├── prometheus-logo.png
    ├── prometheus-operator-architecture.png
    ├── remote-read.png
    └── remote-write.png


/.gitignore:
--------------------------------------------------------------------------------
1 | .DS_Store
2 | 


--------------------------------------------------------------------------------
/0x01-云原生的来源.md:
--------------------------------------------------------------------------------
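 1 | ## 1. The Origins of Cloud Native
 2 | 
 3 | ### 1.1 CNCF and Cloud Native
 4 | 
 5 | ![cncf](./images/cncf.png)
 6 | 
 7 | [CNCF](https://cncf.io), the Cloud Native Computing Foundation, was founded on July 21, 2015. Its original mission was to sustain and integrate open source technologies for orchestrating containers as part of a microservices architecture, and it has become a major force in promoting and popularizing cloud native applications.
 8 | 
 9 | As a vendor-neutral foundation, CNCF promotes fast-growing open source projects hosted on GitHub, such as Kubernetes, Prometheus, and Envoy, helping developers build great products faster and better.
10 | 
11 | Before turning to Prometheus, let's look at the concept of cloud native itself, which has been heavily hyped over the past two years. So what is cloud native, exactly? CNCF defines it as follows:
12 | 
13 | > Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach.
14 | > 
15 | > These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and **predictably** with minimal toil.
16 | >
17 | > The Cloud Native Computing Foundation seeks to drive adoption of this paradigm by fostering and sustaining an ecosystem of open source, vendor-neutral projects. We democratize state-of-the-art patterns to make these innovations accessible for everyone.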
18 | 
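19 | ### 1.2 Heroku and 12-Factor
20 | 
21 | If the definition above still feels abstract, the [12factor](https://12factor.net/zh_cn/) guidelines published by [Heroku](http://www.heroku.com/) help make it concrete. 12-Factor is a methodology for building SaaS applications that:
22 | 
23 | * Use declarative formats for setup automation, to minimize the time and cost for new developers joining the project.
24 | * Have a clean contract with the underlying operating system, offering maximum portability between execution environments.
25 | * Are suitable for deployment on modern cloud platforms, saving resources on servers and systems administration.
26 | * Minimize divergence between development and production, enabling continuous delivery for agile development.
27 | * Can scale up without significant changes to tooling, architecture, or development practices.
28 | 
29 | #### The Twelve Factors
30 | 
31 | 1. **Codebase**: one codebase, many deploys.
32 | 2. **Dependencies**: explicitly declare dependencies.
33 | 3. **Config**: store config in the environment.
34 | 4. **Backing services**: treat backing services as attached resources.
35 | 5. **Build, release, run**: strictly separate the build and run stages.
36 | 6. **Processes**: execute the app as one or more stateless processes.
37 | 7. **Port binding**: export services via port binding.
38 | 8. **Concurrency**: scale out via the process model.
39 | 9. **Disposability**: maximize robustness with fast startup and graceful shutdown.
40 | 10. **Dev/prod parity**: keep development, staging, and production as similar as possible.
41 | 11. **Logs**: treat logs as event streams.
42 | 12. **Admin processes**: run admin and management tasks as one-off processes.
43 | 
44 | ***The Building Blocks of Cloud Native***
45 | 
46 | DevOps, microservices, CI/CD, and containers together make up the broader concept of cloud native.
47 | 
48 | ![Cloud native diagram](./images/cloudnative.png)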
49 | 
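50 | ### 1.3 How Cloud Native Relates to Monitoring
51 | 
52 | Cloud native and monitoring are each a complete concept in their own right. Cloud native applications emphasize observability, which can be broadly summarized as three pillars: **monitoring**, **logging**, and **distributed tracing**. In other words, **monitoring can be regarded as one part of the cloud native picture**.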
53 | 


--------------------------------------------------------------------------------
/0x02-监控系统.md:
--------------------------------------------------------------------------------
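 1 | ## 2. Monitoring Systems
 2 | 
 3 | ![monitoring](./images/monitoring.png)
 4 | 
 5 | ### 2.1 Goals of Monitoring
 6 | 
 7 | Technology advances are always driven by demand, and the problems a monitoring system has to solve determine how it evolves. Our goals are:
 8 | 
 9 | * **Report the current state of the system in real time**: whether we monitor a piece of hardware or a whole system, we need to see at any moment whether it is healthy, degraded, or down.
10 | * **Keep the business running continuously and reliably**: the purpose of monitoring is to keep systems, services, and the business running normally. Even when a failure occurs, we are alerted immediately and can resolve it right away, which keeps the business stable.
11 | 
12 | Monitoring serves the system and the business: it improves robustness and helps us understand the current state of systems and services.
13 | 
14 | The core of monitoring comes down to four steps:
15 | 
16 | 1. Detect problems: when the system fails, we receive an alert about the failure.
17 | 2. Locate problems: alerts usually include the cause of the error; we analyze the alert content to pinpoint the specific cause of the failure.
18 | 3. Resolve problems: once the cause is understood, fix the problems in order of priority.
19 | 4. Review problems: after resolving a major incident, summarize its causes and the preventive measures so that it does not happen again.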
20 | 
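21 | ### 2.2 Monitoring Dimensions and Pipeline
22 | 
23 | #### 2.2.1 Monitoring Dimensions
24 | 
25 | * **Network**: protocols (HTTP, DNS, TCP, ICMP) and hardware (routers, switches, and so on).
26 | * **Disk**: resource usage.
27 | * **Host**: resource usage.
28 | * **Container**: resource usage.
29 | * **Application**: latency, errors, QPS, internal state, and so on.
30 | * **Middleware**: resource usage and service status.
31 | * **Orchestration tools**: cluster resource usage, scheduling, and so on.
32 | 
33 | #### 2.2.2 The Monitoring Pipeline
34 | 
35 | Generally speaking, a typical monitoring pipeline consists of the following steps:
36 | 
37 | 1. Data collection: from the server's point of view, collection can be active (pull) or passive (push).
38 | 2. Data storage: collected samples should be persisted to local or remote storage.
39 | 3. Data analysis and visualization: provide analysis tools and dashboards so that developers can examine problems more intuitively.
40 | 4. Alerting: automatically watch system behavior against user-defined alerting rules, and send an alert to the receivers as soon as a rule matches.
41 | 5. Incident handling: investigate and fix the fault.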
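To make the data collection step more concrete, here is a minimal sketch of what a pull-based target might return when it is scraped. It renders samples in a simplified form of the Prometheus text exposition format. The function name `render_metrics` and the `demo_*` metric names are made up for illustration, and the sketch assumes label values contain no quotes, backslashes, or newlines (real client libraries escape those).

```python
def render_metrics(samples):
    """Render (name, labels, value) tuples as text exposition lines."""
    lines = []
    for name, labels, value in samples:
        if labels:
            # Sort labels so the output is deterministic.
            pairs = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical samples an application might expose on /metrics.
samples = [
    ("demo_requests_total", {"method": "GET", "code": "200"}, 1027),
    ("demo_queue_length", {}, 3),
]
print(render_metrics(samples), end="")
```

In the pull model the server fetches this text periodically; in the push model the client would send the same payload to an intermediary instead.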
42 | 
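43 | ### 2.3 The Four Golden Signals
44 | 
45 | Google's book *Site Reliability Engineering* describes four golden signals of system monitoring. They reflect performance problems well in almost any system, hence the name.
46 | 
47 | #### 2.3.1 Latency
48 | 
49 | Latency is the time it takes to service a request. It is important to distinguish between the latency of successful requests and the latency of failed requests. For example, an HTTP 500 error triggered by a lost connection to a database or another critical backend may be served very quickly; but because an HTTP 500 indicates a failed request, folding all 500s into the overall latency can make the result misleading.
50 | 
51 | In addition, microservices generally favor failing fast. Developers should pay special attention to slow errors (possibly caused by slow SQL queries), because they noticeably hurt system performance, so tracking the latency of errors matters as well.
52 | 
53 | #### 2.3.2 Traffic
54 | 
55 | Traffic is a measure of how much demand is placed on the system, and it means different things for different kinds of systems.
56 | 
57 | * Web service: HTTP requests per second.
58 | * Audio streaming system: network I/O.
59 | * Database: query and write rates.
60 | 
61 | #### 2.3.3 Errors
62 | 
63 | Errors measure the rate of requests that fail. Most failures are explicit (for example, HTTP 500), but some are implicit (an HTTP 200 response whose business logic actually failed).
64 | 
65 | Explicit errors such as HTTP 500 can be caught at the load balancer (for example, Nginx), while internal exceptions may require hooks inside the service that count them and expose them (for example, through log output).
66 | 
67 | #### 2.3.4 Saturation
68 | 
69 | Saturation emphasizes the resources that are most constrained and that most affect service health. If the system is mainly memory-bound, watch memory; if it is mainly constrained by disk I/O, watch disk I/O.
70 | 
71 | Performance usually degrades noticeably once these resources saturate, and a clear bottleneck appears. Saturation can also be used for prediction, for example "will the disk fill up within 4 hours?".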
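A saturation forecast of this kind can be sketched as a least-squares line fitted to recent readings and extrapolated forward. The code below is an illustration only, not Prometheus's implementation: the function name mirrors PromQL's `predict_linear`, but here the extrapolation starts from the last sample, the free-space readings are invented, and at least two samples with distinct timestamps are assumed.

```python
def predict_linear(samples, seconds_ahead):
    """Fit value = intercept + slope * t by least squares over
    (timestamp, value) samples and extrapolate past the last timestamp."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    var = sum((t - mean_t) ** 2 for t, _ in samples)
    slope = cov / var
    intercept = mean_v - slope * mean_t
    last_t = samples[-1][0]
    return intercept + slope * (last_t + seconds_ahead)


# Hypothetical free-space readings in bytes, one every 15 minutes.
free_bytes = [(0, 10e9), (900, 9e9), (1800, 8e9), (2700, 7e9), (3600, 6e9)]
will_fill_up = predict_linear(free_bytes, 4 * 3600) < 0
print(will_fill_up)
```

A negative prediction means the fitted trend crosses zero free bytes before the horizon, which is the condition an alert would test.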
72 | 
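73 | ### 2.4 The RED Method
74 | 
75 | Building on the four golden signals, the RED method helps measure user experience in cloud native and microservice applications.
76 | 
77 | * **Rate**: the number of requests the service receives per second.
78 | * **Errors**: the number of failed requests per second.
79 | * **Duration**: the time each request takes.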
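As a rough sketch, the three RED figures can be derived from a list of request records collected over a time window. The record shape `(duration_seconds, succeeded)` and the function name `red_metrics` are assumptions made for this example; a real service would normally export counters and histograms and let the monitoring system do this arithmetic.

```python
def red_metrics(requests, window_seconds):
    """Compute Rate, Errors, and Duration figures from
    (duration_seconds, succeeded) records observed during the window."""
    total = len(requests)
    failed = sum(1 for _, ok in requests if not ok)
    durations = [d for d, _ in requests]
    return {
        "rate": total / window_seconds,      # requests per second
        "errors": failed / window_seconds,   # failed requests per second
        "duration_avg": sum(durations) / total if total else 0.0,
        "duration_max": max(durations, default=0.0),
    }


# Four hypothetical requests observed during a 2-second window.
observed = [(0.1, True), (0.3, True), (0.2, False), (0.4, True)]
print(red_metrics(observed, window_seconds=2))
```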
80 | 
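81 | ### 2.5 The USE Method
82 | 
83 | USE stands for Utilization, Saturation, and Errors. It is a method for analyzing system performance that helps users quickly identify resource bottlenecks and errors.
84 | 
85 | #### 2.5.1 Utilization
86 | 
87 | Utilization focuses on how much of a system resource is in use. Resources include, but are not limited to, CPU, memory, network, and disk. 100% utilization is usually a sign of a performance bottleneck.
88 | 
89 | #### 2.5.2 Saturation
90 | 
91 | Saturation is, for example, the average length of the CPU run queue. Any resource that is saturated to some degree can degrade system performance.
92 | 
93 | #### 2.5.3 Errors
94 | 
95 | Errors are counts of error events, for example "the network interface detected 14 Ethernet collisions while transmitting packets" or "the database connection dropped 3 times in 5 minutes".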
96 | 


--------------------------------------------------------------------------------
/0x03-搭建测试环境.md:
--------------------------------------------------------------------------------
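 1 | ## 3. Setting Up a Test Environment
 2 | 
 3 | ### 3.1 Cluster Environment
 4 | 
 5 | The main playground is a Kubernetes cluster set up locally on VMware virtual machines running CentOS 7, with three nodes (one master and two workers). How to build a Kubernetes cluster is not the focus of this article, so it is not covered in detail. It's 2020, so it's about time to learn to set one up yourself; it isn't hard if you follow the official tutorial.
 6 | 
 7 | 🤔 If you don't want to build a multi-node cluster, [kubernetes/minikube](https://github.com/kubernetes/minikube) works too and is perfectly adequate for testing.
 8 | 
 9 | ### 3.2 Cluster Node Information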
10 | 
11 | ```shell
12 | ~ 🐶 which k
13 | k: aliased to kubectl
14 | 
15 | ~ 🐶 k get nodes
16 | NAME     STATUS   ROLES    AGE   VERSION
17 | master   Ready    master   95d   v1.16.2
18 | node1    Ready    <none>   95d   v1.16.2
19 | node2    Ready    <none>   95d   v1.16.2
20 | 
21 | ~ 🐶 k top nodes
22 | NAME     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
23 | master   98m          4%     926Mi           41%
24 | node1    92m          4%     873Mi           50%
25 | node2    113m         5%     921Mi           53%
26 | ```
27 | 
28 | ***Kubernetes Dashboard***
29 | 
30 | ![Kubernetes dashboard](./images/k8s-dashboard.png)
31 | 
32 | Some recommended tutorials on deploying a Kubernetes cluster:
33 | 
34 | * [The latest 2019 k8s cluster setup tutorial (k8s on CentOS)](https://juejin.im/post/5cb7dde9f265da034d2a0dba)
35 | * [create-cluster-kubeadm](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/)
36 | * [Follow me to deploy a Kubernetes cluster step by step](https://github.com/opsnull/follow-me-install-kubernetes-cluster)
37 | 


--------------------------------------------------------------------------------
/0x04-Prometheus-理论(入门篇).md:
--------------------------------------------------------------------------------
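 1 | ## 4. Prometheus Theory (Basics)
 2 | 
 3 | <p align="center"><img src="./images/prometheus-logo.png" width="400px"></p>
 4 | 
 5 | ### 4.1 What Is Prometheus?
 6 | 
 7 | [Prometheus](https://prometheus.io) is an open source monitoring and alerting solution originally built at SoundCloud. The project started in 2012, and many companies now run Prometheus in production. Prometheus was the second project to graduate from [CNCF](https://www.cncf.io/), after Kubernetes, and in this era of cloud native it has become one of the most popular monitoring solutions. It has a healthy community that drives active development and iteration.
 8 | 
 9 | As a new-generation monitoring framework, Prometheus has the following features:
10 | 
11 | * A powerful multi-dimensional data model: time series are identified by a metric name and key/value pairs, and any metric can carry arbitrary labels.
12 | * A flexible query language (PromQL): a single query can multiply, add, join, or take quantiles across multiple metrics.
13 | * Easy deployment: the Prometheus server is a single binary that works locally and does not depend on distributed storage.
14 | * Efficient ingestion: a single Prometheus server can handle millions of time series.
15 | * Decoupled clients and server: time series are collected with a pull model, in which the server is configured with targets and scrapes their metrics. Metrics can also be pushed to a push gateway, which the server then scrapes.
16 | * Support for multiple service discovery mechanisms.
17 | * Rich visualization support.
18 | 
19 | Prometheus components:
20 | 
21 | * [Prometheus server](https://github.com/prometheus/prometheus): the server, which scrapes and stores time series data from clients.
22 | * [client libraries](https://prometheus.io/docs/instrumenting/clientlibs/): SDKs for instrumenting applications in different languages.
23 | * [push gateway](https://github.com/prometheus/pushgateway): a gateway that accepts pushed metrics, intended for short-lived or one-off jobs. Such jobs may finish between two scrapes and be missed, so they push their data themselves.
24 | * [alertmanager](https://github.com/prometheus/alertmanager): handles alerts and sends notifications to the configured receivers.
25 | * [exporters](https://prometheus.io/docs/instrumenting/exporters/): exporters are client-side components that collect data on behalf of other systems, such as databases or host nodes.
26 | 
27 | ***Prometheus Architecture***
28 | 
29 | ![Prometheus architecture diagram](./images/prometheus-architecture.png)
30 | 
31 | ### 4.2 What Prometheus Is For
32 | 
33 | **What is Prometheus good at?**
34 | 
35 | Prometheus was built for recording purely numeric time series. It fits both host-level monitoring and the monitoring of highly dynamic service-oriented architectures. In the world of microservices, its support for multi-dimensional data collection and querying is a particular strength. Prometheus is designed for reliability, so that we can quickly detect and diagnose problems. Each Prometheus server is standalone and does not depend on network storage or other remote services.
36 | 
37 | **What is Prometheus not good at?**
38 | 
39 | Prometheus values reliability: even under failure conditions you can always view the statistics available about your system. But if you need 100% accuracy (for example, in an auditing system), Prometheus is not a good choice, because the data it collects is not detailed or complete enough. In that case a traditional relational database is a better choice.
40 | 
41 | ### 4.3 Prometheus Metric Types
42 | 
43 | Prometheus has four basic metric types: Counter, Gauge, Histogram, and Summary.
44 | 
45 | #### 4.3.1 Counter
46 | 
47 | A Counter is monotonically increasing: its value can only go up, or be reset to 0 when the process restarts. Use it to count how often an event occurs, such as requests served or errors raised. Counter names conventionally end with the `_total` suffix.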
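Because a counter can drop back to 0, anything that computes its growth has to notice resets. The sketch below shows the basic idea under simplifying assumptions: any decrease between two consecutive readings is treated as a reset, and no extrapolation to window boundaries is done (which Prometheus's own `increase()` does perform). The function name and readings are illustrative.

```python
def increase(samples):
    """Total increase of a counter across consecutive readings,
    treating any drop in value as a reset to 0."""
    total = 0.0
    for prev, curr in zip(samples, samples[1:]):
        if curr >= prev:
            total += curr - prev
        else:
            # Reset detected: the counter restarted from 0 and reached curr.
            total += curr
    return total


readings = [100, 130, 160, 5, 25]  # the process restarted before the 4th reading
print(increase(readings))  # 30 + 30 + 5 + 20 = 85.0
```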
48 | 
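49 | #### 4.3.2 Gauge
50 | 
51 | A Gauge records an instantaneous value that can go up or down. Use it for the state of the system at a given moment, such as CPU usage, memory in use, or network I/O.
52 | 
53 | #### 4.3.3 Histogram
54 | 
55 | A Histogram consists of `<basename>_bucket{le="<upper inclusive bound>"}`, `<basename>_bucket{le="+Inf"}`, `<basename>_sum`, and `<basename>_count`. It samples observations (typically request durations or response sizes), counts them in configurable buckets, and also tracks their sum and total count; the data is usually displayed as a histogram. For example, `prometheus_local_storage_series_chunks_persisted` in the Prometheus server records the number of chunks persisted per time series, and can be used to compute quantiles of the data waiting to be persisted. (Most of the computation cost is on the server side.)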
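To show why the computation cost of a Histogram sits on the server side, here is a simplified sketch of how a quantile can be estimated from cumulative `le` buckets by linear interpolation. It follows the general idea behind PromQL's `histogram_quantile()`, but it is not a port of it: edge-case behavior is my own choice, and the bucket values are invented.

```python
def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets,
    sorted by upper bound and ending with float("inf")."""
    total = buckets[-1][1]
    if total == 0:
        return float("nan")
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if count >= rank and count > lower_count:
            if upper_bound == float("inf"):
                # Cannot interpolate into +Inf; use the last finite bound.
                return lower_bound
            # Linear interpolation inside the bucket containing the rank.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical cumulative bucket counts for request durations in seconds.
buckets = [(0.1, 50), (0.3, 80), (1.0, 95), (float("inf"), 100)]
print(histogram_quantile(0.9, buckets))
```

The estimate is only as precise as the bucket layout, which is the main trade-off against a Summary.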
56 | 
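57 | #### 4.3.4 Summary
58 | 
59 | A Summary is similar to a Histogram and consists of `<basename>{quantile="<φ>"}`, `<basename>_sum`, and `<basename>_count`. It also samples observations (typically request durations or response sizes), but it stores the quantiles directly rather than deriving them from buckets, so most of the computation cost is on the client side. An example is `prometheus_target_interval_length_seconds` in the Prometheus server.
60 | 
61 | For a detailed comparison of Histograms and Summaries, see the documentation: [practices/histograms](https://prometheus.io/docs/practices/histograms/).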
62 | 


--------------------------------------------------------------------------------
/0x05-Prometheus-理论(进阶篇).md:
--------------------------------------------------------------------------------
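  1 | ## 5. Prometheus Theory (Advanced)
  2 | 
  3 | ### 5.1 Storage Format
  4 | 
  5 | Prometheus groups data into two-hour time windows and stores the data produced in each window in a block. Each block is a separate directory containing all the sample data for that window (chunks), a metadata file (meta.json), and an index file (index).
  6 | 
  7 | The index file maps metric names and labels to the time series in the sample data. If series are deleted through the API in the meantime, the deletion records are kept in a separate tombstones file.
  8 | 
  9 | The block for the samples currently being collected is kept in memory and is not yet fully persisted to disk. To be able to recover after a crash or restart, Prometheus writes incoming samples to a write-ahead log (WAL) and replays it on startup. WAL files are stored in the wal directory in 128MB segments. They contain raw data that has not yet been compacted, so they are much larger than regular block files.
 10 | 
 11 | Prometheus normally keeps three WAL files, but on high-load servers that need to retain more than two hours of raw data, the number of WAL files will exceed three.
 12 | 
 13 | The directory structure in which Prometheus stores its block data looks like this: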
 14 | ```
 15 | ├── 01DVSBTTWHKTAZDJEZKS0KZR07
 16 | │   ├── chunks
 17 | │   │   └── 000001
 18 | │   ├── index
 19 | │   ├── meta.json
 20 | │   └── tombstones
 21 | ├── lock
 22 | ├── queries.active
 23 | └── wal
 24 |     ├── 00000086
 25 |     ├── 00000087
 26 |     ├── 00000088
 27 |     ├── 00000089
 28 |     ├── 00000090
 29 |     └── checkpoint.000085
 30 |         └── 00000000
 31 | 
 32 | 4 directories, 12 files
 33 | ```
 34 | 
 35 | For details on local storage, see the article [Container monitoring in practice: the Prometheus storage mechanism](https://segmentfault.com/a/1190000018479963).
 36 | 
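 37 | ### 5.2 PromQL
 38 | 
 39 | #### 5.2.1 Data Format
 40 | 
 41 | A time series is made up of samples, and each sample includes:
 42 | 
 43 | * a float64 value
 44 | * a millisecond-precision timestamp
 45 | * labels
 46 | 
 47 | Label names may contain ASCII letters, digits, and underscores, and must match the regular expression `[a-zA-Z_][a-zA-Z0-9_]*` (metric names may additionally contain colons and must match `[a-zA-Z_:][a-zA-Z0-9_:]*`). Label names beginning with `__` are reserved for internal use, such as the `__meta_*` labels in kubernetes_sd_configs. Label values may contain any Unicode characters, and a label with an empty value is equivalent to a label that does not exist.
 48 | 
 49 | A time series is written in the following notation: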
 50 | ```
 51 | <metric name>{<label name>=<label value>, ...}
 52 | ```
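The naming rules can be checked with a couple of regular expressions. This is a small sketch of the classic rules (label names `[a-zA-Z_][a-zA-Z0-9_]*`, metric names `[a-zA-Z_:][a-zA-Z0-9_:]*`); the helper names are invented for this example, and newer Prometheus releases may accept a wider character set than shown here.

```python
import re

# Classic Prometheus naming rules; fullmatch() anchors both ends.
LABEL_NAME = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")
METRIC_NAME = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")


def is_valid_label_name(name):
    return LABEL_NAME.fullmatch(name) is not None


def is_valid_metric_name(name):
    return METRIC_NAME.fullmatch(name) is not None


def is_reserved_label_name(name):
    # Names starting with a double underscore are reserved for internal use.
    return name.startswith("__")


for candidate in ["method", "2xx", "job:name", "__meta_kubernetes_pod_name"]:
    print(candidate, is_valid_label_name(candidate), is_reserved_label_name(candidate))
```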
 53 | 
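 54 | #### 5.2.2 The Four PromQL Label Matching Operators
 55 | 
 56 | * **=**: the label value is exactly equal to the given string.
 57 | * **!=**: the label value is not equal to the given string.
 58 | * **=~**: the label value matches the given regular expression.
 59 | * **!~**: the label value does not match the given regular expression.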
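The four matchers can be sketched as a small filter over label sets. Two assumptions are worth calling out: PromQL regular expressions are fully anchored, which `re.fullmatch` imitates, and Prometheus uses RE2 syntax, which differs from Python's `re` in some corner cases. The function names and the sample series are invented for illustration.

```python
import re


def matches(labels, name, op, value):
    """Evaluate one label matcher against a dict of labels.
    A missing label is treated as the empty string."""
    actual = labels.get(name, "")
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    if op == "=~":
        return re.fullmatch(value, actual) is not None
    if op == "!~":
        return re.fullmatch(value, actual) is None
    raise ValueError(f"unknown operator: {op}")


def select(series, matchers):
    """Keep the label sets that satisfy every (name, op, value) matcher."""
    return [s for s in series if all(matches(s, *m) for m in matchers)]


series = [
    {"environment": "staging", "method": "POST"},
    {"environment": "production", "method": "POST"},
    {"environment": "staging", "method": "GET"},
]
print(select(series, [("environment", "=~", "staging|testing|development"),
                      ("method", "!=", "GET")]))
```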
 60 | 
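 61 | #### 5.2.3 The Four PromQL Expression Types
 62 | 
 63 | **Instant vector**: a set of time series that share the same timestamp, each containing a single sample.
 64 | 
 65 | The expression below selects the HTTP request series whose environment is one of `staging|testing|development` and whose method is not `GET`. Pretty easy to follow 😎.
 66 | ```
 67 | http_requests_total{environment=~"staging|testing|development",method!="GET"}
 68 | 
 69 | # Equivalent expression; __name__ holds the metric name
 70 | {__name__="http_requests_total",environment=~"staging|testing|development",method!="GET"}
 71 | ```
 72 | 
 73 | Arithmetic operators can be applied between instant vectors:
 74 | ```
 75 | + (addition)
 76 | - (subtraction)
 77 | * (multiplication)
 78 | / (division)
 79 | % (modulo)
 80 | ^ (exponentiation)
 81 | ```
 82 | 
 83 | Comparison operators can also be applied between instant vectors:
 84 | ```
 85 | == (equal)
 86 | != (not equal)
 87 | > (greater than)
 88 | < (less than)
 89 | >= (greater than or equal)
 90 | <= (less than or equal)
 91 | ```
 92 | **Range vector**: a set of time series in which each series contains a range of data points over time.
 93 | 
 94 | To select samples over a span of time, use a range vector selector. It differs from an instant vector selector in that it also specifies how far back to look, using the range selector `[]`.
 95 | ```
 96 | # The last 5 minutes of samples for this series
 97 | http_requests_total{}[5m]
 98 | ```
 99 | 
100 | Prometheus supports the following time units:
101 | ```
102 | s - seconds
103 | m - minutes
104 | h - hours
105 | d - days
106 | w - weeks
107 | y - years
108 | ```
109 | 
110 | Range vector queries can also be shifted in time with `offset`, for example to select 1 minute of data as of 20 minutes ago.
111 | ```
112 | http_requests_total{}[1m] offset 20m
113 | ```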
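The interaction between the range and the offset is easier to see as plain arithmetic on timestamps. In this sketch, `range_select` is a hypothetical helper that returns the samples a selector like `metric[1m] offset 20m` would cover; treating the window as open on the left and closed on the right is an assumption, since boundary handling has differed between Prometheus versions.

```python
def range_select(samples, eval_time, range_seconds, offset_seconds=0):
    """Return the (timestamp, value) samples in the window that ends
    offset_seconds before eval_time and spans range_seconds."""
    end = eval_time - offset_seconds
    start = end - range_seconds
    return [(t, v) for t, v in samples if start < t <= end]


# Hypothetical series scraped every 15 seconds for 30 minutes.
samples = [(t, t / 15) for t in range(0, 1801, 15)]

# Evaluated at t=1800s: 1 minute of data as of 20 minutes (1200s) ago.
window = range_select(samples, eval_time=1800, range_seconds=60, offset_seconds=1200)
print([t for t, _ in window])
```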
114 | 
115 | **Scalar**: a simple floating point value.
116 | 
117 | Just a number, with no time dimension.
118 | 
119 | **String**: a simple string value; currently unused.
120 | 
121 | #### 5.2.4 PromQL Functions
122 | 
123 | Much of PromQL's power comes from its many built-in functions; for the full list, see [querying/functions](https://prometheus.io/docs/prometheus/latest/querying/functions/).
124 | 
125 | Only a few of the most commonly used functions are introduced here. For the rest, read the documentation and look things up as needed.
126 | ```
127 | # Per-second average request rate over the last 5 minutes.
128 | rate(http_requests_total[5m])
129 | 
130 | # Total number of HTTP requests across all series.
131 | sum(http_requests_total)
132 | 
133 | # The 5 series with the highest request counts.
134 | topk(5, http_requests_total)
135 | 
136 | # How much the request count grew over the last 5 minutes.
137 | increase(http_requests_total{job="api-server"}[5m])
138 | 
139 | # Per-second growth of the request count over the last 5 minutes.
140 | increase(http_requests_total{job="api-server"}[5m]) / 300
141 | 
142 | # Median of the current sample values.
143 | quantile(0.5, http_requests_total)
144 | 
145 | # Based on the disk usage trend of the last hour, will free space run out within 2 hours?
146 | predict_linear(node_filesystem_free{job="node1"}[1h], 2 * 3600) < 0
147 | 
148 | # Average HTTP request latency over the last 5 minutes.
149 | rate(http_request_duration_seconds_sum[5m]) / rate(http_request_duration_seconds_count[5m])
150 | 
151 | # Fraction of HTTP requests served in under 0.3 seconds over the last 5 minutes.
152 | sum(rate(http_request_duration_seconds_bucket{le="0.3"}[5m])) by (job) /
153 | sum(rate(http_request_duration_seconds_count[5m])) by (job)
154 | ```
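The average latency query works because `_sum` and `_count` are both counters: dividing their per-second rates cancels the window length and leaves "seconds spent per request". A minimal sketch of that arithmetic, assuming two readings per counter with no resets in between (the helper names and numbers are illustrative):

```python
def rate(first, last, window_seconds):
    """Per-second increase of a counter between two readings."""
    return (last - first) / window_seconds


def average_latency(sum_first, sum_last, count_first, count_last, window_seconds):
    """Average request duration over the window, from _sum and _count."""
    count_rate = rate(count_first, count_last, window_seconds)
    if count_rate == 0:
        return float("nan")  # no requests were observed in the window
    return rate(sum_first, sum_last, window_seconds) / count_rate


# Over 5 minutes, _sum grew from 120s to 150s and _count from 1000 to 1100.
print(average_latency(120.0, 150.0, 1000, 1100, window_seconds=300))
```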
155 | 
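156 | ### 5.3 Prometheus Configuration in Detail
157 | 
158 | The configuration is the heart of Prometheus. It is declarative, which I find quite nice because it lowers the barrier to entry. To use Prometheus proficiently, it is still worth reading the configuration documentation carefully several times; the official Prometheus documentation is of high quality. The most basic configuration looks like this: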
159 | ```yaml
160 | global:
161 |   # How frequently to scrape targets by default.
162 |   [ scrape_interval: <duration> | default = 1m ]
163 | 
164 |   # How long until a scrape request times out.
165 |   [ scrape_timeout: <duration> | default = 10s ]
166 | 
167 |   # How frequently to evaluate rules.
168 |   [ evaluation_interval: <duration> | default = 1m ]
169 | 
170 |   # The labels to add to any time series or alerts when communicating with
171 |   # external systems (federation, remote storage, Alertmanager).
172 |   external_labels:
173 |     [ <labelname>: <labelvalue> ... ]
174 | 
175 | # Rule files specifies a list of globs. Rules and alerts are read from
176 | # all matching files.
177 | rule_files:
178 |   [ - <filepath_glob> ... ]
179 | 
180 | # A list of scrape configurations.
181 | scrape_configs:
182 |   [ - <scrape_config> ... ]
183 | 
184 | # Alerting specifies settings related to the Alertmanager.
185 | alerting:
186 |   alert_relabel_configs:
187 |     [ - <relabel_config> ... ]
188 |   alertmanagers:
189 |     [ - <alertmanager_config> ... ]
190 | 
191 | # Settings related to the remote write feature.
192 | remote_write:
193 |   [ - <remote_write> ... ]
194 | 
195 | # Settings related to the remote read feature.
196 | remote_read:
197 |   [ - <remote_read> ... ]
198 | ```
199 | 
200 | #### 5.3.1 scrape_configs
201 | 
202 | scrape_configs is one of the most important parts of the Prometheus configuration; it specifies how a Prometheus instance scrapes metrics.
203 | ```yaml
204 | # scrape_configs
205 | 
206 | # A job is the basic scheduling unit in Prometheus. A job may have multiple instances, just as a service may have multiple replicas.
207 | job_name: <job_name>
208 | 
209 | # Scrape interval.
210 | [ scrape_interval: <duration> | default = <global_config.scrape_interval> ]
211 | 
212 | # Scrape timeout.
213 | [ scrape_timeout: <duration> | default = <global_config.scrape_timeout> ]
214 | 
215 | # The default web path is /metrics, which has become a community-wide convention and rarely changes.
216 | [ metrics_path: <path> | default = /metrics ]
217 | 
218 | # Protocol scheme: http or https.
219 | [ scheme: <scheme> | default = http ]
220 | 
221 | # Optional URL query parameters; usually unnecessary.
222 | params:
223 |   [ <string>: [<string>, ...] ]
224 | 
225 | # TLS settings. If all services run inside the Kubernetes cluster and are not exposed externally, TLS can be left out for now.
226 | tls_config:
227 |   [ <tls_config> ]
228 | 
229 | # Proxy settings.
230 | [ proxy_url: <string> ]
231 | 
232 | # The settings below configure service discovery; Prometheus natively supports many service discovery mechanisms.
233 | # Only static_configs and kubernetes_sd_configs are briefly introduced here.
234 | # Service discovery lets Prometheus pick up target changes automatically, without updating the configuration by hand.
235 | # prometheus-operator is essentially built on kubernetes_sd_configs; the operator just hides the complexity from us.
236 | # For other service discovery systems, see the configuration options on the official site.
237 | # https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
238 | 
239 | # kubernetes_sd_configs works by retrieving information about resources through the Kubernetes REST API.
240 | # Supported resources include node/service/pod/endpoints/ingress.
241 | # Kubernetes attaches its own information to each resource, and resources can also carry custom information such as labels/annotations.
242 | # The discovered data is exposed as labels prefixed with __meta. In Prometheus, labels prefixed with a double underscore are not exposed externally.
243 | # So relabel_configs can be used to convert __meta_* labels into whatever form you want.
244 | kubernetes_sd_configs:
245 |   [ - <kubernetes_sd_config> ... ]
246 | 
247 | # If you run on a single machine without any service discovery system, use static_configs.
248 | # <static_config>
249 | # targets:
250 | # targets covers the case above where a job has multiple instances; it is a list.
251 | #  [ - '<host>' ]
252 | # Labels assigned to all metrics scraped from the targets.
253 | # labels:
254 | # [ <labelname>: <labelvalue> ... ]
255 | static_configs:
256 |   [ - <static_config> ... ]
257 | 
258 | # Relabeling involves several actions. An action determines what to do with labels based on the regex match: drop them, rewrite them, and so on.
259 | # For details, see the following blog post:
260 | # https://www.li-rui.top/2019/04/16/monitor/Prometheus%E4%B8%ADrelabel_configs%E7%9A%84%E4%BD%BF%E7%94%A8/
261 | # 
262 | # The action types are:
263 | # 
264 | # replace       Replace a label's value based on a regex match against label values
265 | # keep          Keep targets whose label values match the regex
266 | # drop          Drop targets whose label values match the regex
267 | # hashmod       Set a label to the modulus of a hash of the source label values
268 | # labelmap      Map label names that match the regex to new label names
269 | # labeldrop     Remove labels whose names match the regex
270 | # labelkeep     Keep only labels whose names match the regex
271 | relabel_configs:
272 |   [ - <relabel_config> ... ]
273 | ```
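To get a feel for what the relabel actions do, here is a heavily simplified sketch of four of them applied to one target's labels. It is not Prometheus's implementation: only `replace`, `keep`, `drop`, and `labeldrop` are covered, Python's `re` stands in for RE2, and replacements use Python's `\1` group syntax where Prometheus uses `$1`. The rule dictionaries reuse the real field names (`source_labels`, `regex`, `target_label`, `replacement`, `action`), while the function itself is hypothetical.

```python
import re


def relabel(labels, rule):
    """Apply one simplified relabel rule to a dict of labels.
    Returns the new labels, or None if the target is dropped."""
    action = rule.get("action", "replace")
    regex = rule.get("regex", "(.*)")
    # Source label values are joined with ";" before matching.
    source = ";".join(labels.get(n, "") for n in rule.get("source_labels", []))
    match = re.fullmatch(regex, source)
    if action == "keep":
        return dict(labels) if match else None
    if action == "drop":
        return None if match else dict(labels)
    if action == "replace":
        result = dict(labels)
        if match:
            result[rule["target_label"]] = match.expand(rule.get("replacement", r"\1"))
        return result
    if action == "labeldrop":
        return {k: v for k, v in labels.items() if not re.fullmatch(regex, k)}
    raise ValueError(f"unsupported action: {action}")


discovered = {"__meta_kubernetes_namespace": "default",
              "__meta_kubernetes_pod_name": "web-0"}
step1 = relabel(discovered, {"source_labels": ["__meta_kubernetes_namespace"],
                             "target_label": "namespace"})
step2 = relabel(step1, {"action": "labeldrop", "regex": "__meta_.*"})
print(step2)
```

Chaining rules like this is how `__meta_*` labels from service discovery end up as ordinary labels on the scraped series.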
274 | 
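275 | #### 5.3.2 alertingRule
276 | 
277 | The collected metrics can be used by the alerting system, so we need to define alerting rules based on the data we have. Let's first look at a basic example from the official documentation.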
278 | ```yaml
279 | groups:
280 | - name: example
281 |   rules:
282 |   - alert: HighRequestLatency
283 |     expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
284 |     for: 10m
285 |     labels:
286 |       severity: page
287 |     annotations:
288 |       summary: High request latency
289 | ```
290 | 
291 | The rule template is as follows:
292 | ```yaml
293 | # The name of the alert. Must be a valid metric name.
294 | # It becomes the value of the alertname label.
295 | alert: <string>
296 | 
297 | # The PromQL expression to evaluate. Every evaluation cycle this is
298 | # evaluated at the current time, and all resultant time series become
299 | # pending/firing alerts.
300 | # Must be a valid PromQL expression.
301 | expr: <string>
302 | 
303 | # Alerts are considered firing once they have been returned for this long.
304 | # Alerts which have not yet fired for long enough are considered pending.
305 | # In other words, how long the expression must keep matching before the alert fires.
306 | [ for: <duration> | default = 0s ]
307 | 
308 | # Labels to add or overwrite for each alert.
309 | # Labels carry extra information that Alertmanager can use; values may use Go template syntax.
310 | labels:
311 |   [ <labelname>: <tmpl_string> ]
312 | 
313 | # Annotations to add to each alert.
314 | # Annotations carry extra information that Alertmanager can use; values may use Go template syntax.
315 | annotations:
316 |   [ <labelname>: <tmpl_string> ]
317 | ```
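The effect of `for` is easiest to see as a small state machine: an alert is inactive while the expression returns nothing, pending while it has matched for less than the `for` duration, and firing afterwards. The sketch below models a single alert evaluated at a fixed interval; the function name and its inputs are assumptions made for this example.

```python
def alert_states(matches, interval_seconds, for_seconds):
    """Yield the alert state after each evaluation cycle.
    matches: one boolean per cycle, True when expr returned a result."""
    active_for = None  # seconds the expression has matched continuously
    for matched in matches:
        if not matched:
            active_for = None
            yield "inactive"
            continue
        active_for = 0 if active_for is None else active_for + interval_seconds
        yield "firing" if active_for >= for_seconds else "pending"


# Evaluated every 60s with `for: 2m`.
print(list(alert_states([False, True, True, True, False, True], 60, 120)))
```

A single evaluation in which the expression stops matching resets the timer, which is why short `for` durations produce noisier alerts.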
318 | 
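319 | #### 5.3.3 recordingRule
320 | 
321 | Prometheus provides recording rules to precompute expressions in the background, which improves the performance of complex PromQL queries. The basic idea is that a recording rule lets us create new, custom time series derived from other time series.
322 | 
323 | Prometheus Operator already ships with a large number of such rules, for example: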
324 | ```yaml
325 | groups:
326 |   - name: k8s.rules
327 |     rules:
328 |     - expr: |
329 |         sum(rate(container_cpu_usage_seconds_total{image!="", container!=""}[5m])) by (namespace)
330 |       record: namespace:container_cpu_usage_seconds_total:sum_rate
331 |     - expr: |
332 |         sum(container_memory_usage_bytes{image!="", container!=""}) by (namespace)
333 |       record: namespace:container_memory_usage_bytes:sum
334 | ```
335 | 
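336 | These two rules are evaluated continuously, and their results are stored as a small number of new time series. Taking the first rule as an example, the expression
337 | ```
338 | sum(rate(container_cpu_usage_seconds_total{image!="", container!=""}[5m])) by (namespace)
339 | ```
340 | is evaluated at a predefined interval and stored as a new metric. Querying the new metric returns the same result as the original expression but is much cheaper:
341 | ```
342 | namespace:container_cpu_usage_seconds_total:sum_rate
343 | ```
344 | 
345 | Likewise, let's first look at the basic example from the official documentation.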
346 | ```yaml
347 | groups:
348 |   - name: example
349 |     rules:
350 |     - record: job:http_inprogress_requests:sum
351 |       expr: sum(http_inprogress_requests) by (job)
352 | ```
353 | 
354 | The rule template is as follows:
355 | ```yaml
356 | # The name of the time series to output to. Must be a valid metric name.
357 | record: <string>
358 | 
359 | # The PromQL expression to evaluate. Every evaluation cycle this is
360 | # evaluated at the current time, and the result recorded as a new set of
361 | # time series with the metric name as given by 'record'.
362 | # Must be a valid PromQL expression.
363 | expr: <string>
364 | 
365 | # Labels to add or overwrite before storing the result.
366 | labels:
367 |   [ <labelname>: <labelvalue> ]
368 | ```
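Conceptually, a recording rule runs an aggregation once per evaluation interval and stores the result under a new name, so dashboards read the small precomputed series instead of re-aggregating every raw series. A toy sketch of that idea for a `sum(...) by (namespace)` rule, with invented sample data and helper names:

```python
from collections import defaultdict


def sum_by(samples, label):
    """Aggregate (labels, value) samples by one label, like sum(...) by (label)."""
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels[label]] += value
    return dict(totals)


def evaluate_recording_rule(record, samples, label):
    """Return the series a rule would store: {(record, label_value): total}."""
    return {(record, key): total for key, total in sum_by(samples, label).items()}


# Hypothetical per-container memory usage samples.
raw = [
    ({"namespace": "default", "pod": "a"}, 100.0),
    ({"namespace": "default", "pod": "b"}, 50.0),
    ({"namespace": "monitoring", "pod": "p"}, 200.0),
]
print(evaluate_recording_rule("namespace:container_memory_usage_bytes:sum", raw, "namespace"))
```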
369 | 
370 | A recommended blog post: [Using Prometheus recording rules](https://www.qikqiak.com/post/recording-rules-on-prometheus/)
371 | 


--------------------------------------------------------------------------------
/0x06-Prometheus-实践(入门篇).md:
--------------------------------------------------------------------------------
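  1 | ## 6. Prometheus in Practice (Basics)
  2 | 
  3 | ### 6.1 What Is prometheus-operator?
  4 | 
  5 | To understand prometheus-operator, we first need to understand what an operator is. An operator is a way of deploying and managing Kubernetes applications. As a mature container orchestration system, Kubernetes already handles stateless services very well, letting developers quickly build and deploy highly available services.
  6 | 
  7 | Stateful applications are more complicated. How do you quickly deploy an easy-to-maintain Kafka or Mongo cluster on Kubernetes? Stateful applications require thinking about many things, including upgrades, configuration updates, backups, disaster recovery, and scaling, and sometimes services even have to be restarted. Some infrastructure components have complex architectures and are often distributed as well, so all of this is a real headache for people operating Kubernetes.
  8 | 
  9 | This is where operators come in. An operator takes advantage of Kubernetes's extensibility by registering [Kubernetes CRDs](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) to model and manage system components, and combines complex operational work with the Kubernetes API, hiding the operational complexity from users. A real blessing for operations engineers 👏.
 10 | 
 11 | [CoreOS](https://coreos.com/) first introduced the operator concept and open-sourced the [operator-framework](https://github.com/operator-framework), along with two operator implementations well known in the community: [prometheus-operator](https://github.com/coreos/prometheus-operator) and [etcd-operator](https://github.com/coreos/etcd-operator).
 12 | 
 13 | ***prometheus-operator Architecture***
 14 | 
 15 | ![prometheus-operator architecture diagram](./images/prometheus-operator-architecture.png)
 16 | 
 17 | prometheus-operator defines the following custom resources:
 18 | 
 19 | * **Prometheus**: a Prometheus server instance.
 20 | * **ServiceMonitor**: Service-based monitoring; describes the rules for scraping metrics from Services.
 21 | * **PodMonitor**: Pod-based monitoring; describes the rules for scraping metrics from Pods.
 22 | * **PrometheusRule**: alerting rules and recording rules.
 23 | * **Alertmanager**: an Alertmanager server instance.
 24 | 
 25 | ### 6.2 Deploying prometheus-operator
 26 | 
 27 | Let's deploy prometheus-operator on Kubernetes, following the official documentation: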
 28 | 
 29 | ```shell
 30 | $ git clone https://github.com/coreos/prometheus-operator.git
 31 | $ cd prometheus-operator
 32 | $ kubectl apply -f bundle.yaml
 33 | ```
 34 | 
 35 | > Note: Pay attention, this part is important!
 36 | 
 37 | For well-known reasons, you won't be able to pull images from quay.io unless you have a "special" network environment set up locally. There are two solutions:
 38 | 
 39 | * (Recommended) Configure a proxy; see my blog post [Working around blocked Docker image pulls](https://github.com/chenjiandongx/blog/blob/master/posts/fix-docker-pull-images.md).
 40 | * Use the Microsoft mirror [GCR Proxy Cache](http://mirror.azure.cn/help/gcr-proxy-cache.html): pull the image from the mirror, retag it, and manually sync it to every node.
 41 | 
 42 |     1. Get the image versions
 43 |     ```shell
 44 |     $ cat bundle.yaml | grep image
 45 |             - --config-reloader-image=jimmidyson/configmap-reload:v0.3.0
 46 |             image: quay.io/coreos/prometheus-operator:v0.35.0
 47 |     ```
 48 | 
 49 |     2. Replace the image address
 50 |     ```shell
 51 |     # Pull the image
 52 |     $ docker pull quay.azk8s.cn/coreos/prometheus-operator:v0.35.0
 53 |     # Retag it
 54 |     $ docker tag quay.azk8s.cn/coreos/prometheus-operator:v0.35.0 quay.io/coreos/prometheus-operator:v0.35.0
 55 |     ```
 56 | 
 57 |     3. Set the image pull policy to `imagePullPolicy: IfNotPresent`
 58 | 
 59 | Once everything is ready, the operator should install smoothly. Check the newly registered CRDs and the operator instance.
 60 | 
 61 | ```shell
 62 | ~ 🐶 k get crds | grep monitoring
 63 | alertmanagers.monitoring.coreos.com     2019-11-18T06:11:26Z
 64 | podmonitors.monitoring.coreos.com       2019-11-18T06:11:27Z
 65 | prometheuses.monitoring.coreos.com      2019-11-18T06:11:26Z
 66 | prometheusrules.monitoring.coreos.com   2019-11-18T06:11:27Z
 67 | servicemonitors.monitoring.coreos.com   2019-11-18T06:11:26Z
 68 | 
 69 | ~ 🐶 k get pods | grep operator
 70 | prometheus-operator-99dccdc56-89rlt      1/1     Running   25         78d
 71 | ```
 72 | 
 73 | ### 6.3 Using prometheus-operator
 74 | 
 75 | 1. Deploy a Prometheus server instance to scrape job metrics.
 76 |     ```yaml
 77 |     # prometheus-frontend.yaml
 78 |     apiVersion: monitoring.coreos.com/v1
 79 |     kind: Prometheus
 80 |     metadata:
 81 |       name: prometheus
 82 |     spec:
 83 |       serviceAccountName: prometheus
 84 |       # Prometheus version
 85 |       version: v2.14.0
 86 |       serviceMonitorSelector:
 87 |         # Label selector; let's pretend you are the frontend team 🐶
 88 |         matchLabels:
 89 |           team: frontend
 90 |       podMonitorSelector:
 91 |         matchLabels:
 92 |           team: frontend
 93 |       resources:
 94 |         requests:
 95 |           memory: 400Mi
 96 |       enableAdminAPI: false
 97 |       # Note: prometheus.spec has a storage option that lets you specify a PVC for persisting metric data.
 98 |       # Use a PVC in production. This is only a test, so no persistent volume is mounted.
 99 | 
100 |     # Note: If the image fails to pull, see the solutions above; they work every time.
101 |     # k apply -f prometheus-frontend.yaml
102 |     ```
103 | 
104 |     Check that the instance is running:
105 |     ```shell
106 |     ~ 🐶 k get pods | grep prometheus
107 |     prometheus-operator-99dccdc56-89rlt      1/1     Running   25         78d
108 |     prometheus-prometheus-0                  3/3     Running   73         76d
109 | 
110 |     # As you can see, a prometheus pod contains three containers.
111 |     # Besides the actual prometheus server container, the other two containers are used to reload configuration files.
112 |     ~ 🐶 k describe pod prometheus-prometheus-0 | grep Image:
113 |         Image:         quay.io/prometheus/prometheus:v2.14.0
114 |         Image:         quay.io/coreos/prometheus-config-reloader:v0.34.0
115 |         Image:         quay.io/coreos/configmap-reload:v0.0.1
116 |     ```
117 | 
118 | 2. Deploy prometheus-service
119 |     ```yaml
120 |     # prometheus-svc.yaml
121 |     apiVersion: v1
122 |     kind: Service
123 |     metadata:
124 |       # This service is used by the Ingress later on
125 |       name: prometheus
126 |     spec:
127 |       type: ClusterIP
128 |       ports:
129 |       - name: web
130 |         port: 9090
131 |         protocol: TCP
132 |         targetPort: web
133 |       selector:
134 |         prometheus: prometheus
135 | 
136 |     # k apply -f prometheus-svc.yaml
137 |     ```
138 | 
139 | 3. Deploy prometheus-pod-monitor. If you don't know what to monitor, start by monitoring Prometheus itself 😅.
140 |     ```yaml
141 |     # prometheus-pod-monitor.yaml
142 |     apiVersion: monitoring.coreos.com/v1
143 |     kind: PodMonitor
144 |     metadata:
145 |       name: prometheus-monitor
146 |       labels:
147 |         team: frontend
148 |     spec:
149 |       selector:
150 |         matchLabels:
151 |           # This selects the prometheus server instance we deployed earlier
152 |           prometheus: prometheus
153 |       jobLabel: "prometheus-monitor"
154 |       podMetricsEndpoints:
155 |       - port: web
156 | 
157 |     # k apply -f prometheus-pod-monitor.yaml
158 |     ```
159 | 
160 | Everything we have learned and seen so far feels rather... abstract 😜, so let's get a visual dashboard going. Prometheus ships with a built-in dashboard. It is fairly basic, but never mind, let's get it running first.
161 | 
162 | ***Prometheus Dashboard***
163 | ![Prometheus dashboard](./images/prometheus-dashboard-0.jpg)
164 | 
165 | Here I use Ingress-Nginx as a reverse proxy and edit the local hosts file, since I don't have a public server to run this on. See my other blog post [Kubernetes Ingress-Controller](https://github.com/chenjiandongx/blog/blob/master/posts/k8s-ingress-controller.md) for reference. It isn't complicated; the main steps are as follows.
166 | 
167 | 1. Install Ingress-nginx
168 |    ```shell
169 |    $ kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/mandatory.yaml
170 |    ```
171 | 
172 | 2. Deploy the Ingress resource
173 |     ```yaml
174 |     # ingress-prometheus.yaml
175 |     apiVersion: extensions/v1beta1
176 |     kind: Ingress
177 |     metadata:
178 |       name: ingress-prometheus
179 |       namespace: default
180 |       annotations:
181 |         kubernetes.io/ingress.class: "nginx"
182 |         nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
183 |     spec:
184 |       rules:
185 |       - host: prometheus.chenjiandongx.com
186 |         http:
187 |           paths:
188 |           - path: /
189 |             backend:
190 |               # We deployed prometheus-service earlier, exposing port 9090
191 |               serviceName: prometheus
192 |               servicePort: 9090
193 | 
194 |     # k apply -f ingress-prometheus.yaml
195 |     ```
196 | 
197 |     Check the Ingress resource:
198 |     ```shell
199 |     $ k get ingresses | grep prome
200 |     ingress-prometheus     prometheus.chenjiandongx.com     10.106.96.52   80      69d
201 |     ```
202 | 
203 | 3. Edit the local hosts file
204 |     ```shell
205 |     $ vim /etc/hosts
206 | 
207 |     # 192.168.2.11 is the IP of my Kubernetes cluster's master node
208 |     192.168.2.11 prometheus.chenjiandongx.com
209 |     ```
210 | 
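211 | 4. Add another Nginx layer as a second hop to make the URL look more elegant (self-deception!).
212 | 
213 |     When kubeadm installs a Kubernetes cluster, the NodePort range defaults to 30000-32767, which is why the HTTP/HTTPS ports of the ingress-controller Service are mapped to ports above 30000. So even with a domain name configured, the port still has to be included when accessing it, for example http://prometheus.chenjiandongx.com:30834. I prefer accessing http://prometheus.chenjiandongx.com directly (smug face!). So we can add another nginx on the master node to do layer 4 forwarding, which gives us exactly the effect we want!
214 | 
215 | To verify that monitoring Prometheus itself worked: there you go! The `prometheus_*` metrics are the ones reported by Prometheus itself.
216 | 
217 | ![Prometheus dashboard](./images/prometheus-dashboard-1.jpg)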
218 | 
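219 | ### 6.4 What Is Grafana?
220 | 
221 | [Grafana](https://grafana.com/) is a cross-platform, open source tool for metric analytics and visualization. It queries collected data, visualizes it, and sends timely notifications. Its six main features are:
222 | 
223 | 1. **Flexible visualization**: panel plugins offer many ways to visualize metrics and logs, and the official library has a rich set of dashboard plugins such as heatmaps, line charts, and graphs.
224 | 2. **Multiple data sources**: Graphite, InfluxDB, MySQL, OpenTSDB, Prometheus, Elasticsearch, CloudWatch, KairosDB, and more.
225 | 3. **Alerting**: visually define alert rules for your most important metrics. Grafana evaluates them continuously and sends notifications through Slack, PagerDuty, and others when the data reaches a threshold.
226 | 4. **Mixed data sources**: mix different data sources in the same graph; the data source can be specified per query, and custom data sources are supported too.
227 | 5. **Annotations**: annotate graphs with rich events from different data sources; hovering over an event shows its full metadata and tags.
228 | 6. **Filters**: ad-hoc filters let you create new key/value filters on the fly, which are automatically applied to all queries that use that data source.
229 | 
230 | Grafana is an excellent visualization component that gives developers endless possibilities; the only limit is the user's imagination 😳.
231 | 
232 | ### 6.5 Deploying Grafana
233 | 
234 | 1. Deploy the grafana StatefulSet to Kubernetes.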
235 |     ```yaml
236 |     # grafana-statefulset.yaml
237 |     apiVersion: apps/v1
238 |     kind: StatefulSet
239 |     metadata:
240 |       name: grafana
241 |     spec:
242 |       serviceName: "grafana-svc"
243 |       replicas: 1
244 |       selector:
245 |         matchLabels:
246 |           app: grafana
247 |       template:
248 |         metadata:
249 |           labels:
250 |             app: grafana
251 |         spec:
252 |           containers:
253 |           - name: grafana
254 |             image: grafana/grafana:6.4.4
255 |             ports:
256 |             # Expose port 3000, Grafana's default port
257 |             - containerPort: 3000
258 |               name: web
259 |             volumeMounts:
260 |             - name: grafana-data
261 |               mountPath: /var/lib/grafana
262 |       volumeClaimTemplates:
263 |       - metadata:
264 |           name: grafana-data
265 |           annotations:
266 |             # nfs is used as the StorageClass here; for how to deploy the StorageClass, see the blog post
267 |             # https://github.com/chenjiandongx/blog/blob/master/posts/k8s-nfs-storageclass.md
268 |             volume.beta.kubernetes.io/storage-class: nfs-storage
269 |         spec:
270 |           accessModes: [ "ReadWriteOnce" ]
271 |           resources:
272 |             requests:
273 |               storage: 1Gi
274 | 
275 |     # k apply -f grafana-statefulset.yaml
276 |     ```
277 | 
278 | 2. Deploy grafana-svc.
279 |     ```yaml
280 |     # grafana-svc.yaml
281 |     kind: Service
282 |     apiVersion: v1
283 |     metadata:
284 |       name: grafana-svc
285 |       labels:
286 |         app: grafana
287 |     spec:
288 |       selector:
289 |         app: grafana
290 |       ports:
291 |       - name: web
292 |         port: 3000
293 | 
294 |     # k apply -f grafana-svc.yaml
295 |     ```
296 | 
297 | 3. Deploy an Ingress that routes the http://grafana.chenjiandongx.com domain (🧐 optional; you can also expose grafana-svc as a NodePort Service and access it by IP).
298 |     ```yaml
299 |     # ingress-grafana.yaml
300 |     apiVersion: extensions/v1beta1
301 |     kind: Ingress
302 |     metadata:
303 |       name: ingress-grafana
304 |       namespace: default
305 |       annotations:
306 |         kubernetes.io/ingress.class: "nginx"
307 |         nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
308 |     spec:
309 |       rules:
310 |       - host: grafana.chenjiandongx.com
311 |         http:
312 |           paths:
313 |             - path: /
314 |               backend:
315 |                 # Matches the Service name above
316 |                 serviceName: grafana-svc
317 |                 servicePort: 3000
318 | 
319 |     # k apply -f ingress-grafana.yaml
320 |     # Remember to add a DNS record for grafana.chenjiandongx.com to /etc/hosts
321 |     ```
322 | 
323 | Visit http://grafana.chenjiandongx.com and you should see the login page. The default username and password are both admin.
324 | 
325 | ![Grafana login page](./images/grafana-login.png)
326 | 


--------------------------------------------------------------------------------
/0x07-Prometheus-实践(进阶篇).md:
--------------------------------------------------------------------------------
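  1 | ## 7. Prometheus in Practice (Advanced)
  2 | 
  3 | ### 7.1 Using the Fancy Grafana UI
  4 | 
  5 | Before using Grafana we need to add a data source. Since we only use Prometheus here, we add the prometheus Service deployed earlier as the default data source. That Service also lives in the default namespace and exposes port 9090, so the URL is `http://prometheus:9090`. Then click `Save & Test`.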
  6 | 
  7 | ![grafana-datasource](./images/grafana-datasource.png)
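
The URL above follows from how Kubernetes resolves Service names inside the cluster. As a minimal sketch (the helper `service_url` is hypothetical, and `cluster.local` is assumed to be the cluster domain, which is the usual default), the short form and the fully qualified form can be built like this:

```python
def service_url(service, port, namespace=None, cluster_domain="cluster.local"):
    """Build an in-cluster HTTP URL for a Kubernetes Service.

    Without a namespace the short name is used, which only resolves for
    clients running in the same namespace as the Service.
    """
    if namespace is None:
        host = service
    else:
        host = f"{service}.{namespace}.svc.{cluster_domain}"
    return f"http://{host}:{port}"


# Grafana and the prometheus Service both live in the default namespace,
# so the short form is enough for the data source.
print(service_url("prometheus", 9090))
# The fully qualified form also works from other namespaces.
print(service_url("prometheus", 9090, namespace="default"))
```

If Grafana were deployed in a different namespace than Prometheus, the second form would be the one to enter in the data source settings.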
  8 | 
  9 | #### 7.1.1 grafana-prometheus
 10 | 
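 11 | Now it is our time to shine. At the very beginning we had Prometheus monitor itself; now that we have Grafana, we can visualize the data with more "professional" (fancy!) charts.
 12 | 
 13 | > Dashboard JSON: [grafana-prometheus.json](./example/grafana-prometheus.json)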
 14 | 
 15 | ![grafana-prometheus](./images/grafana-prometheus.png)
 16 | 
 17 | #### 7.1.2 grafana-ingress
 18 | 
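 19 | Earlier we also deployed Ingress-Nginx for domain routing. The nginx-ingress-controller provided by the Kubernetes project does expose a `/metrics` endpoint, but to make it work with prometheus-operator we need to tweak it a little.
 20 | 
 21 | When a prometheus-operator monitor needs to specify a port, it must reference the port by its `name`, not by its number. I am not entirely sure why it has to be this way; that is simply how the source code defines it...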
 22 | 
 23 | ```shell
 24 | # Search for `ports` in vim, find the container port 10254, and add a `name: web` entry.
 25 | $ k edit deployments.apps -n ingress-nginx nginx-ingress-controller
 26 | 
 27 | # After the change it looks like this
 28 | - containerPort: 10254
 29 |   name: web
 30 |   protocol: TCP
 31 | ```
 32 | 
 33 | Once that is done, deploy ingress-pod-monitor to tell Prometheus to scrape the metrics reported by ingress.
 34 | ```yaml
 35 | # ingress-pod-monitor.yaml
 36 | apiVersion: monitoring.coreos.com/v1
 37 | kind: PodMonitor
 38 | metadata:
 39 |   labels:
 40 |     team: frontend
 41 |   name: ingress-monitor
 42 |   namespace: default
 43 | spec:
 44 |   namespaceSelector:
 45 |     matchNames:
 46 |     # ingress-nginx is deployed in the `ingress-nginx` namespace by default
 47 |     - ingress-nginx
 48 |   podMetricsEndpoints:
 49 |   # Must be the port name, not the port number
 50 |   - port: web
 51 |   selector:
 52 |     matchLabels:
 53 |       app.kubernetes.io/name: ingress-nginx
 54 |       app.kubernetes.io/part-of: ingress-nginx
 55 | 
 56 | # k apply -f ingress-pod-monitor.yaml
 57 | ```
 58 | 
 59 | > Dashboard JSON: [grafana-ingress.json](./example/grafana-ingress.json)
 60 | 
 61 | ![grafana-ingress](./images/grafana-ingress.png)
 62 | 
 63 | #### 7.1.3 grafana-docker
 64 | 
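 65 | [cadvisor](https://github.com/google/cadvisor) is Google's open-source Docker container monitoring component, and it exposes a native Prometheus metrics endpoint. Deployment can follow the official documentation, [deploy/kubernetes](https://github.com/google/cadvisor/tree/master/deploy/kubernetes), which is quite convenient. The official setup runs one cadvisor instance on every non-master node, as a DaemonSet (daemonsets.apps).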
 66 | ```shell
 67 | ~ 🐶 k get pods -n cadvisor -o wide
 68 | NAME             READY   STATUS    RESTARTS   AGE   IP             NODE    NOMINATED NODE   READINESS GATES
 69 | cadvisor-d7249   1/1     Running   8          26d   10.244.2.73    node2   <none>           <none>
 70 | cadvisor-pmkmx   1/1     Running   8          26d   10.244.1.130   node1   <none>           <none>
 71 | ```
 72 | 
 73 | Likewise, we need to tell Prometheus to scrape the metrics reported by cadvisor.
 74 | ```yaml
 75 | # cadvisor-svc-monitor.yaml
 76 | apiVersion: monitoring.coreos.com/v1
 77 | kind: ServiceMonitor
    | metadata:
 78 |   labels:
 79 |     team: frontend
 80 |   name: cadvisor
 81 |   namespace: default
 82 | spec:
 83 |   endpoints:
 84 |   - port: web
 85 |   namespaceSelector:
 86 |     matchNames:
 87 |     - cadvisor
 88 |   selector:
 89 |     matchLabels:
 90 |       app.kubernetes.io/name: cadvisor
 91 | 
 92 | # k apply -f cadvisor-svc-monitor.yaml
 93 | # Note: strictly speaking we should use a pod-monitor rather than a service-monitor here; this is just the lazy way out. (Guilty!)
 94 | ```
 95 | > Dashboard JSON: [grafana-docker.json](./example/grafana-docker.json)
 96 | 
 97 | ![grafana-docker](./images/grafana-docker.png)
 98 | 
 99 | #### 7.1.4 grafana-node-exporter
100 | 
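101 | [Node-exporter](https://github.com/prometheus/node_exporter) is a component developed by the Prometheus team for monitoring host metrics, and one of the most popular exporter implementations today. Its metrics are quite comprehensive; everything you would expect is there 🙌.
102 | 
103 | Let's deploy node-exporter as a DaemonSet too.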
104 | ```yaml
105 | # node-exporter-ds.yaml
106 | apiVersion: apps/v1
107 | kind: DaemonSet
108 | metadata:
109 |   name: node-exporter
110 |   labels:
111 |     name: node-exporter
112 | spec:
113 |   selector:
114 |     matchLabels:
115 |       name: node-exporter
116 |   template:
117 |     metadata:
118 |       labels:
119 |         name: node-exporter
120 |       annotations:
121 |          prometheus.io/scrape: "true"
122 |          prometheus.io/port: "9100"
123 |     spec:
124 |       hostPID: true
125 |       hostIPC: true
126 |       hostNetwork: true
127 |       containers:
128 |         - name: node-exporter
129 |           image: prom/node-exporter:latest
130 |           imagePullPolicy: IfNotPresent
131 |           securityContext:
132 |             privileged: true
133 |           args:
134 |             - --path.rootfs
135 |             - /host
136 |           ports:
137 |             - containerPort: 9100
138 |               protocol: TCP
139 |               name: web
140 |           volumeMounts:
141 |             - name: rootfs
142 |               mountPath: /host
143 |               readOnly: true
144 |       volumes:
145 |         - name: rootfs
146 |           hostPath:
147 |             path: /
148 | 
149 | # k apply -f node-exporter-ds.yaml
150 | ```
151 | 
152 | Then deploy the Prometheus pod-monitor rule as well.
153 | ```yaml
154 | # node-exporter-pod-monitor.yaml
155 | apiVersion: monitoring.coreos.com/v1
156 | kind: PodMonitor
157 | metadata:
158 |   name: node-exporter-monitor
159 |   labels:
160 |     team: frontend
161 | spec:
162 |   selector:
163 |     matchLabels:
164 |       name: node-exporter
165 |   jobLabel: "node-exporter-monitor"
166 |   podMetricsEndpoints:
167 |   - port: web
168 | 
169 | # k apply -f node-exporter-pod-monitor.yaml
170 | ```
171 | 
172 | > Dashboard JSON: [grafana-node-exporter.json](./example/grafana-node-exporter.json)
173 | 
174 | ![grafana-node-exporter](./images/grafana-node-exporter.png)
175 | 
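176 | ### 7.2 How to Write an Exporter
177 | 
178 | All the exporters used so far were written by other developers. What if we want to write our own exporter for a business need, exposing a `/metrics` route for Prometheus to scrape?
179 | 
180 | The Golang web framework I currently use for business development is [Gin](https://github.com/gin-gonic/gin). Gin is a neat framework with an elegant API and good performance, so I wrote a middleware for it, [ginprom](https://github.com/chenjiandongx/ginprom), along with a matching Grafana dashboard. The whole thing is only a little over 100 lines of code; let's walk through the source 🐶.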
181 | 
182 | ```golang
183 | // https://github.com/chenjiandongx/ginprom/blob/master/middleware.go
184 | // Only the highlights are shown; non-essential code is omitted.
185 | 
186 | // Namespace; the generated metrics are named service_*
187 | const namespace = "service"
188 | 
189 | var (
190 | 	// For each HTTP request we record these labels: status, endpoint, method
191 | 	labels = []string{"status", "endpoint", "method"}
192 | 
193 | 	// Service uptime in seconds; resets to zero on restart.
194 | 	uptime = prometheus.NewCounterVec(
195 | 		prometheus.CounterOpts{
196 | 			Namespace: namespace,
197 | 			Name:      "uptime",
198 | 			Help:      "HTTP service uptime.",
199 | 		}, nil,
200 | 	)
201 | 
202 | 	// A Counter only ever increases; this one records the total number of requests.
203 | 	reqCount = prometheus.NewCounterVec(
204 | 		prometheus.CounterOpts{
205 | 			Namespace: namespace,
206 | 			Name:      "http_request_count_total",
207 | 			Help:      "Total number of HTTP requests made.",
208 | 		}, labels,
209 | 	)
210 | 
211 | 	// Request latency
212 | 	reqDuration = prometheus.NewHistogramVec(
213 | 		prometheus.HistogramOpts{
214 | 			Namespace: namespace,
215 | 			Name:      "http_request_duration_seconds",
216 | 			Help:      "HTTP request latencies in seconds.",
217 | 		}, labels,
218 | 	)
219 | 
220 | 	// Request size
221 | 	reqSizeBytes = prometheus.NewSummaryVec(
222 | 		prometheus.SummaryOpts{
223 | 			Namespace: namespace,
224 | 			Name:      "http_request_size_bytes",
225 | 			Help:      "HTTP request sizes in bytes.",
226 | 		}, labels,
227 | 	)
228 | 
229 | 	// Response size
230 | 	respSizeBytes = prometheus.NewSummaryVec(
231 | 		prometheus.SummaryOpts{
232 | 			Namespace: namespace,
233 | 			Name:      "http_response_size_bytes",
234 | 			Help:      "HTTP response sizes in bytes.",
235 | 		}, labels,
236 | 	)
237 | )
238 | 
239 | // Initialization: register the metrics declared above with the default Prometheus registry
240 | func init() {
241 | 	prometheus.MustRegister(uptime, reqCount, reqDuration, reqSizeBytes, respSizeBytes)
242 | 	go recordUptime()
243 | }
244 | 
245 | // A little trick: increment by 1 every second to track how long the service has been running.
246 | func recordUptime() {
247 | 	for range time.Tick(time.Second) {
248 | 		uptime.WithLabelValues().Inc()
249 | 	}
250 | }
251 | 
252 | // Approximate the request size in bytes (URL, method, protocol, headers, host, and body)
253 | func calcRequestSize(r *http.Request) float64 {
254 | 	size := 0
255 | 	if r.URL != nil {
256 | 		size = len(r.URL.String())
257 | 	}
258 | 
259 | 	size += len(r.Method)
260 | 	size += len(r.Proto)
261 | 
262 | 	for name, values := range r.Header {
263 | 		size += len(name)
264 | 		for _, value := range values {
265 | 			size += len(value)
266 | 		}
267 | 	}
268 | 	size += len(r.Host)
269 | 
270 | 	// r.Form and r.MultipartForm are assumed to be included in r.URL.
271 | 	if r.ContentLength != -1 {
272 | 		size += int(r.ContentLength)
273 | 	}
274 | 	return float64(size)
275 | }
276 | 
277 | // The core middleware, written in decorator style.
278 | func PromMiddleware(promOpts *PromOpts) gin.HandlerFunc {
279 | 	// make sure promOpts is not nil
280 | 	if promOpts == nil {
281 | 		promOpts = defaultPromOpts
282 | 	}
283 | 
284 | 	return func(c *gin.Context) {
285 | 		start := time.Now()
286 | 		c.Next()
287 | 
288 | 		status := fmt.Sprintf("%d", c.Writer.Status())
289 | 		endpoint := c.Request.URL.Path
290 | 		method := c.Request.Method
291 | 
292 | 		lvs := []string{status, endpoint, method}
293 | 
294 | 		isOk := promOpts.checkLabel(status, promOpts.ExcludeRegexStatus) &&
295 | 			promOpts.checkLabel(endpoint, promOpts.ExcludeRegexEndpoint) &&
296 | 			promOpts.checkLabel(method, promOpts.ExcludeRegexMethod)
297 | 
298 | 		if !isOk {
299 | 			return
300 | 		}
301 | 
302 | 		// The data is kept in memory here until the Prometheus server comes to scrape it
303 | 		// Every sample carries status, endpoint, and method
304 | 		reqCount.WithLabelValues(lvs...).Inc()
305 | 		reqDuration.WithLabelValues(lvs...).Observe(time.Since(start).Seconds())
306 | 		reqSizeBytes.WithLabelValues(lvs...).Observe(calcRequestSize(c.Request))
307 | 		respSizeBytes.WithLabelValues(lvs...).Observe(float64(c.Writer.Size()))
308 | 	}
309 | }
310 | 
311 | // A generic decorator that wraps an http.Handler as a gin.HandlerFunc
312 | func PromHandler(handler http.Handler) gin.HandlerFunc {
313 | 	return func(c *gin.Context) {
314 | 		handler.ServeHTTP(c.Writer, c.Request)
315 | 	}
316 | }
317 | ```
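
To make the "kept in memory until scraped" idea concrete, here is a minimal Python sketch of a labelled counter that renders itself in the Prometheus text exposition format. The `CounterVec` class is a hypothetical stand-in written for illustration; it is not how `client_golang` or ginprom are implemented, and label-value escaping is deliberately left out.

```python
from collections import defaultdict


class CounterVec:
    """A tiny labelled counter, loosely mirroring the idea of NewCounterVec."""

    def __init__(self, name, help_text, label_names):
        self.name = name
        self.help_text = help_text
        self.label_names = tuple(label_names)
        self._values = defaultdict(float)  # label values -> current count

    def inc(self, *label_values):
        if len(label_values) != len(self.label_names):
            raise ValueError("label count mismatch")
        self._values[tuple(label_values)] += 1

    def render(self):
        """Return the samples as text, one line per label combination."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for label_values, value in sorted(self._values.items()):
            pairs = ",".join(
                f'{k}="{v}"' for k, v in zip(self.label_names, label_values)
            )
            lines.append(f"{self.name}{{{pairs}}} {value:g}")
        return "\n".join(lines) + "\n"


req_count = CounterVec(
    "service_http_request_count_total",
    "Total number of HTTP requests made.",
    ["status", "endpoint", "method"],
)

# What the middleware would do after each request.
req_count.inc("200", "/golang", "GET")
req_count.inc("200", "/golang", "GET")
req_count.inc("200", "/python", "GET")

# What a /metrics handler would return when Prometheus scrapes it.
print(req_count.render(), end="")
```

Nothing is pushed anywhere: the counter only changes in process memory, and the text is produced on demand at scrape time.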
318 | 
319 | Using it in a Gin web app is easy, just a few lines of code 🙂. To see it in action, we can build a small application and deploy it on Kubernetes. The source is below.
320 | ```golang
321 | package main
322 | 
323 | import (
324 | 	"log"
325 | 
326 | 	"github.com/chenjiandongx/ginprom"
327 | 	"github.com/gin-gonic/gin"
328 | 	"github.com/prometheus/client_golang/prometheus/promhttp"
329 | )
330 | 
331 | func main() {
332 | 	r := gin.Default()
333 | 
334 | 	r.Use(ginprom.PromMiddleware(&ginprom.PromOpts{ExcludeRegexStatus: "404"}))
335 | 	r.GET("/metrics", ginprom.PromHandler(promhttp.Handler()))
336 | 
337 | 	// No flame war intended; we are all friends here, so stay calm.
338 | 	r.GET("/python", func(c *gin.Context) {
339 | 		c.JSON(200, gin.H{"echo": "python is the best language in the world!"})
340 | 	})
341 | 
342 | 	r.GET("/php", func(c *gin.Context) {
343 | 		c.JSON(200, gin.H{"echo": "php is the best language in the world!"})
344 | 	})
345 | 
346 | 	r.GET("/java", func(c *gin.Context) {
347 | 		c.JSON(200, gin.H{"echo": "java is the best language in the world!"})
348 | 	})
349 | 
350 | 	r.GET("/golang", func(c *gin.Context) {
351 | 		c.JSON(200, gin.H{"echo": "golang is the best language in the world!"})
352 | 	})
353 | 
354 | 	r.GET("/ruby", func(c *gin.Context) {
355 | 		c.JSON(200, gin.H{"echo": "ruby is the best language in the world!"})
356 | 	})
357 | 
358 | 	if err := r.Run("0.0.0.0:8080"); err != nil {
359 | 		log.Fatalf("start server error: %+v", err)
360 | 	}
361 | }
362 | ```
363 | 
364 | Build the image and deploy it as a Deployment.
365 | ```yaml
366 | # language-echo-app.yaml
367 | apiVersion: apps/v1
368 | kind: Deployment
369 | metadata:
370 |   name: language-echo
371 | spec:
372 |   replicas: 1
373 |   selector:
374 |     matchLabels:
375 |       run: language-echo
376 |   template:
377 |     metadata:
378 |       labels:
379 |         run: language-echo
380 |     spec:
381 |       containers:
382 |         - image: chenjiandongx/language-echo:latest
383 |           imagePullPolicy: IfNotPresent
384 |           name: language-echo
385 |           ports:
386 |           - name: web
387 |             containerPort: 8080
388 | ---
389 | apiVersion: v1
390 | kind: Service
391 | metadata:
392 |   name: language-echo-svc
393 | spec:
394 |   ports:
395 |     - port: 8080
396 |       protocol: TCP
397 |       targetPort: 8080
398 |       nodePort: 30110
399 |       name: web
400 |   selector:
401 |     run: language-echo
402 |   # Exposed as a NodePort Service so it is easy to reach locally
403 |   type: NodePort
404 | 
405 | # k apply -f language-echo-app.yaml
406 | ```
407 | 
408 | Deploy the Prometheus pod-monitor rule.
409 | ```yaml
410 | # language-echo-pod-monitor.yaml
411 | apiVersion: monitoring.coreos.com/v1
412 | kind: PodMonitor
413 | metadata:
414 |   name: language-echo
415 |   labels:
416 |     team: frontend
417 | spec:
418 |   selector:
419 |     matchLabels:
420 |       run: language-echo
421 |   podMetricsEndpoints:
422 |   - port: web
423 | 
424 | # k apply -f language-echo-pod-monitor.yaml
425 | ```
426 | 
427 | Verify that Prometheus is scraping the service metrics. If metrics with the `service_http_*` prefix show up, scraping works.
428 | 
429 | ![language echo](./images/language-echo-prometheus.png)
430 | 
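431 | ### 7.3 Custom Grafana Dashboard
432 | 
433 | The middleware alone is not enough; we also need to build our own Grafana dashboard. The metrics are defined by you, so naturally the dashboard has to be defined by you as well.
434 | 
435 | There is one important point to consider here: the dashboard must pick up data from all instances of a service at once and aggregate it.
436 | 
437 | Totals need to be summed. For QPS, for example, the real QPS of the service is the sum of the QPS of all instances. Averages need to be summed first and then divided. For latency, for example, we want the average latency across all service instances.
438 | 
439 | * Time of the service's last start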
440 |   ```
441 |   max((time() - service_uptime{job=~"$job"}) * 1000)
442 |   ```
443 | 
444 | * Service uptime
445 |   ```
446 |   max(service_uptime{job=~"$job"})
447 |   ```
448 | 
449 | * Service QPS
450 |   ```
451 |   sum(rate(service_http_request_count_total{job=~"$job", exported_endpoint=~"$endpoint", method=~"$method", status=~"$status"}[$interval])) by (exported_endpoint)
452 |   ```
453 | 
454 | * Average service latency
455 |   ```
456 |   sum(rate(service_http_request_duration_seconds_sum{job=~"$job", exported_endpoint=~"$endpoint", method=~"$method", status=~"$status"}[$interval])
457 |   ) by (exported_endpoint)
458 |   /
459 |   sum(rate(service_http_request_duration_seconds_count{job=~"$job", exported_endpoint=~"$endpoint", method=~"$method", status=~"$status"}[$interval])) by (exported_endpoint)
460 |   ```
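
The aggregation behind the QPS and latency queries can be checked by hand. The sketch below uses made-up counter samples for two hypothetical instances (`pod-a`, `pod-b`) and a simplified `rate` that ignores counter resets, so it illustrates the arithmetic only and is not a reimplementation of PromQL.

```python
def rate(first, last, seconds):
    """Per-second increase of a counter between two samples (no reset handling)."""
    return (last - first) / seconds


# Hypothetical samples taken 60 s apart for two instances of one service:
# (request_count, duration_seconds_sum) at the start and end of the window.
samples = {
    "pod-a": {"start": (1000, 50.0), "end": (1600, 80.0)},
    "pod-b": {"start": (400, 30.0), "end": (700, 60.0)},
}
window = 60

count_rates = [rate(s["start"][0], s["end"][0], window) for s in samples.values()]
sum_rates = [rate(s["start"][1], s["end"][1], window) for s in samples.values()]

# QPS of the service: add up the per-instance rates.
qps = sum(count_rates)

# Average latency: sum first, then divide, as in the last query above.
avg_latency = sum(sum_rates) / sum(count_rates)

# Averaging the per-instance averages gives a different (skewed) number.
mean_of_means = sum(s / c for s, c in zip(sum_rates, count_rates)) / len(samples)

print(qps, avg_latency, mean_of_means)
```

The busier instance dominates the true average, which is why the query divides the summed `_sum` rate by the summed `_count` rate instead of averaging per-instance results.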
461 | 
462 | > Dashboard JSON: [grafana-language-echo.json](./example/grafana-language-echo.json)
463 | 
464 | Send a "large number of requests" to the service and see how it looks. Here is a test script.
465 | ```python
466 | # curl.py
467 | import random
468 | from multiprocessing.dummy import Pool
469 | 
470 | import requests
471 | 
472 | urls = ["python", "golang", "php", "java", "ruby"]
473 | 
474 | 
475 | count = 10000
476 | 
477 | 
478 | def curl(lang):
479 |     resp = requests.get("http://192.168.2.12:30110/{}".format(lang))
480 |     print(resp.json())
481 | 
482 | 
483 | if __name__ == "__main__":
484 |     pool = Pool(8)
485 |     reqs = [random.choice(urls) for _ in range(count)]
486 |     result = list(pool.map(curl, reqs))
487 |     pool.close()
488 |     pool.join()
489 | 
490 | # python3 curl.py
491 | ```
492 | 
493 | ![grafana-language-echo](./images/grafana-language-echo.png)
494 | 


--------------------------------------------------------------------------------
/0x08-Alertmanager-告警系统.md:
--------------------------------------------------------------------------------
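  1 | ## 8. The Alertmanager Alerting System
  2 | 
  3 | ### 8.1 What Is Alertmanager?
  4 | 
  5 | Prometheus has a fairly clear division of labor: the Prometheus server is responsible for generating alerts, while Alertmanager is responsible for consuming and handling them.
  6 | 
  7 | After receiving alerts from Prometheus, Alertmanager deduplicates and groups them, then routes them to the right receiver, such as email, Slack, or DingTalk. Alertmanager also supports silencing and inhibition.
  8 | 
  9 | ***Alertmanager architecture***
 10 | 
 11 | ![Alertmanager architecture](./images/alertmanager-arch.svg)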
 12 | 
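 13 | #### 8.1.1 Grouping
 14 | 
 15 | Grouping means that alerts of a similar nature are merged into a single notification. When a system goes down, hundreds or thousands of alerts may fire at the same time, and that is exactly when they need to be grouped.
 16 | 
 17 | For example, suppose dozens or hundreds of service instances are running when a network failure occurs, and half of them can no longer reach the database. If the alerting rules are configured to send an alert for every service instance, hundreds of alerts will be sent to Alertmanager. In this situation a single notification is enough; a flood of duplicates only gets in the way of diagnosing the fault.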
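
As a rough illustration of grouping, the sketch below buckets alerts by a chosen set of labels, so that each bucket would become one notification. The function `group_alerts`, the alert names, and the label values are all hypothetical; Alertmanager's real grouping also involves timing settings that are not modelled here.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alerts by the values of the given label names."""
    groups = defaultdict(list)
    for alert in alerts:
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


# 100 instances lose the database at once, plus one unrelated alert.
alerts = [
    {"labels": {"alertname": "DBUnreachable", "cluster": "prod", "instance": f"svc-{i}"}}
    for i in range(100)
]
alerts.append({"labels": {"alertname": "DiskFull", "cluster": "prod", "instance": "db-0"}})

groups = group_alerts(alerts, group_by=["alertname", "cluster"])
for key, members in groups.items():
    print(key, len(members))
```

With `alertname` and `cluster` as the grouping labels, the 101 alerts collapse into two notifications instead of 101.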
 18 | 
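 19 | #### 8.1.2 Inhibition
 20 | 
 21 | Inhibition is a mechanism that, once an alert has fired, stops sending the other alerts that were caused by it.
 22 | 
 23 | For example, when an alert fires saying that an entire cluster is unreachable, Alertmanager can be configured to ignore all other alerts triggered as a consequence of it. This prevents notifications for hundreds or thousands of firing alerts that are unrelated to the actual issue.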
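
A minimal sketch of how such an inhibition rule could be evaluated is shown below. The rule shape (`source_match`, `target_match`, `equal`) is modelled on Alertmanager's `inhibit_rules` configuration, but the function `is_inhibited` and the sample alerts are hypothetical and simplified.

```python
def is_inhibited(target, firing, rule):
    """Return True when `target` should be muted because a source alert is firing."""
    labels = target["labels"]
    if not all(labels.get(k) == v for k, v in rule["target_match"].items()):
        return False  # the rule does not apply to this alert
    for source in firing:
        if source is target:
            continue
        src = source["labels"]
        if not all(src.get(k) == v for k, v in rule["source_match"].items()):
            continue  # not a source alert for this rule
        # Source and target must agree on every label listed in `equal`.
        if all(src.get(k) == labels.get(k) for k in rule["equal"]):
            return True
    return False


rule = {
    "source_match": {"alertname": "ClusterUnreachable"},
    "target_match": {"severity": "warn"},
    "equal": ["cluster"],
}

cluster_down = {"labels": {"alertname": "ClusterUnreachable", "severity": "critical", "cluster": "prod"}}
pod_down_prod = {"labels": {"alertname": "PodDown", "severity": "warn", "cluster": "prod"}}
pod_down_staging = {"labels": {"alertname": "PodDown", "severity": "warn", "cluster": "staging"}}
firing = [cluster_down, pod_down_prod, pod_down_staging]

print(is_inhibited(pod_down_prod, firing, rule))     # same cluster as the outage
print(is_inhibited(pod_down_staging, firing, rule))  # different cluster
```

The `equal` list is what keeps the rule from muting alerts in clusters that are still healthy.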
 24 | 
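 25 | #### 8.1.3 Silences
 26 | 
 27 | A silence is a straightforward way to mute alerts for a given period of time (quiet, you are interrupting my binge-watching 🐶). A silence is configured with matchers, just like the routing tree. Incoming alerts are checked against the silence's matchers, which compare label values exactly or by regular expression; if they match, no notification is sent for that alert.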
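
The matching logic of a silence can be sketched in a few lines. The data layout below (a list of matchers plus a start and end time) and the helper names are assumptions made for illustration, and regular expressions are treated as fully anchored.

```python
import re
from datetime import datetime


def labels_match(matchers, labels):
    """Return True when every matcher accepts the alert's labels."""
    for matcher in matchers:
        value = labels.get(matcher["name"], "")
        if matcher["is_regex"]:
            # The whole label value has to match the pattern.
            if re.fullmatch(matcher["value"], value) is None:
                return False
        elif value != matcher["value"]:
            return False
    return True


def is_silenced(silence, labels, now):
    """A silence applies only inside its time window and when all matchers match."""
    in_window = silence["starts_at"] <= now < silence["ends_at"]
    return in_window and labels_match(silence["matchers"], labels)


silence = {
    "matchers": [
        {"name": "alertname", "value": "cpu-usage", "is_regex": False},
        {"name": "pod", "value": "bbox-.*", "is_regex": True},
    ],
    "starts_at": datetime(2020, 1, 1, 20, 0),
    "ends_at": datetime(2020, 1, 1, 22, 0),
}
alert_labels = {"alertname": "cpu-usage", "pod": "bbox-7db4d47f4b-425ng"}

print(is_silenced(silence, alert_labels, datetime(2020, 1, 1, 21, 0)))  # inside the window
print(is_silenced(silence, alert_labels, datetime(2020, 1, 1, 23, 0)))  # window has ended
```

Once the window ends, the same alert produces notifications again without any further configuration change.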
 28 | 
 29 | ### 8.2 Deploying Alertmanager
 30 | 
 31 | #### 8.2.1 Deploy the Alertmanager instance
 32 | 
 33 | Deploying Alertmanager with prometheus-operator is remarkably easy.
 34 | ```yaml
 35 | # alertmanager-dev.yaml
 36 | apiVersion: monitoring.coreos.com/v1
 37 | kind: Alertmanager
 38 | metadata:
 39 |   name: dev
 40 | spec:
 41 |   replicas: 1
 42 | 
 43 | # k apply -f alertmanager-dev.yaml
 44 | ```
 45 | 
 46 | Check that the resource is running:
 47 | ```shell
 48 | 🐶 k get pods | grep alert
 49 | alertmanager-dev-0                       2/2     Running   16         27d
 50 | 
 51 | # The pod has two containers; the second one is responsible for reloading the configuration.
 52 | 🐶 k describe pod alertmanager-dev-0 | grep Image:
 53 |     Image:         quay.io/prometheus/alertmanager:v0.17.0
 54 |     Image:         quay.io/coreos/configmap-reload:v0.0.1
 55 | ```
 56 | 
 57 | #### 8.2.2 Deploy alertmanager-svc
 58 | 
 59 | Alertmanager also comes with its own web UI, so let's deploy the Service first.
 60 | ```yaml
 61 | # alertmanager-dev-svc.yaml
 62 | apiVersion: v1
 63 | kind: Service
 64 | metadata:
 65 |   name: alertmanager-dev
 66 | spec:
 67 |   type: ClusterIP
 68 |   ports:
 69 |   - name: web
 70 |     port: 9093
 71 |     protocol: TCP
 72 |     targetPort: web
 73 |   selector:
 74 |     alertmanager: dev
 75 | 
 76 | # k apply -f alertmanager-dev-svc.yaml
 77 | ```
 78 | 
 79 | #### 8.2.3 Deploy the Ingress
 80 | 
 81 | As before, deploy an Ingress resource to route the http://alertmanager.chenjiandongx.com domain.
 82 | ```yaml
 83 | # ingress-alertmanager.yaml
 84 | apiVersion: extensions/v1beta1
 85 | kind: Ingress
 86 | metadata:
 87 |   name: ingress-alertmanager
 88 |   namespace: default
 89 |   annotations:
 90 |     kubernetes.io/ingress.class: "nginx"
 91 |     nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
 92 | spec:
 93 |   rules:
 94 |   - host: alertmanager.chenjiandongx.com
 95 |     http:
 96 |       paths:
 97 |         - path: /
 98 |           backend:
 99 |             serviceName: alertmanager-dev
100 |             servicePort: 9093
101 | 
102 | # k apply -f ingress-alertmanager.yaml
103 | # As usual, remember to update /etc/hosts
104 | ```
105 | 
106 | ![alertmanager](./images/alertmanager.png)
107 | 
108 | ### 8.3 Preparing the Alertmanager Test Environment
109 | 
110 | #### 8.3.1 Deploy a test busybox
111 | 
112 | With Alertmanager deployed, it is time to put it to use. For testing purposes, let's set up a little box first.
113 | ```yaml
114 | # bbox.yaml
115 | apiVersion: apps/v1
116 | kind: Deployment
117 | metadata:
118 |   labels:
119 |     run: bbox
120 |   name: bbox
121 | spec:
122 |   replicas: 1
123 |   selector:
124 |     matchLabels:
125 |       run: bbox
126 |   template:
127 |     metadata:
128 |       labels:
129 |         run: bbox
130 |     spec:
131 |       containers:
132 |       - image: busybox:1.28.4
133 |         imagePullPolicy: IfNotPresent
134 |         name: bbox
135 |         resources:
136 |           limits:
137 |             cpu: 200m
138 |             memory: 25Mi
139 |           requests:
140 |             cpu: 200m
141 |             memory: 25Mi
142 | 
143 | # k apply -f bbox.yaml
144 | ```
145 | 
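146 | #### 8.3.2 Create the Alertmanager configuration file
147 | 
148 | Per the prometheus-operator documentation, we first prepare a configuration file that defines the alert receivers, then create a secret from it.
149 | 
150 | Here I use Slack to receive alerts. You need to register a webhook and a channel in Slack; there is plenty of material about this online, so I will not repeat it.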
151 | ```yaml
152 | # alertmanager.yaml
153 | global:
154 |   resolve_timeout: 4m
155 | receivers:
156 | - name: slack_general
157 |   slack_configs:
158 |   - api_url: ${your_slack_api}
159 |     # `#nothing` is a channel name I picked at random; it has no special meaning
160 |     channel: '#nothing'
161 |     title: "{{ range .Alerts }}{{ .Annotations.summary }}\n{{ end }}"
162 |     text: "{{ range .Alerts }}{{ .Annotations.description }}\n{{ end }}"
163 | route:
164 |  receiver: slack_general
165 |  # No elaborate route tree is configured here; see the official documentation for details.
166 |  routes:
167 |   - match:
168 |       severity: warn
169 |     receiver: slack_general
170 | ```
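
How the `route` section picks a receiver can be sketched as a small tree walk: the first child route whose `match` labels all agree wins, otherwise the parent's receiver is used. This is a simplified model (no `continue`, no regex matching), the function `pick_receiver` is hypothetical, and the `pager` receiver is invented here only so that the two branches give different results.

```python
def pick_receiver(route, labels):
    """Walk the route tree depth-first and return the receiver name."""
    for child in route.get("routes", []):
        match = child.get("match", {})
        if all(labels.get(name) == value for name, value in match.items()):
            # A child without its own receiver inherits the parent's.
            inherited = {"receiver": route["receiver"], **child}
            return pick_receiver(inherited, labels)
    return route["receiver"]


route = {
    "receiver": "slack_general",
    "routes": [
        {"match": {"severity": "warn"}, "receiver": "slack_general"},
        # Hypothetical extra branch, not part of the configuration above.
        {"match": {"severity": "critical"}, "receiver": "pager"},
    ],
}

print(pick_receiver(route, {"alertname": "cpu-usage", "severity": "warn"}))
print(pick_receiver(route, {"alertname": "cpu-usage", "severity": "critical"}))
print(pick_receiver(route, {"alertname": "cpu-usage"}))  # falls back to the root
```

In the configuration above both the root and the only child point at `slack_general`, so every alert ends up in the same Slack channel regardless of its `severity` label.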
171 | 
172 | #### 8.3.3 Create the secret
173 | 
174 | Create the secret with `kubectl create`:
175 | ```shell
176 | # Mind the naming rule: the name must correspond to the Alertmanager instance deployed earlier, here alertmanager-dev
177 | $ kubectl create secret generic alertmanager-dev --from-file=alertmanager.yaml
178 | ```
179 | 
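180 | #### 8.3.4 Deploy kube-state-metrics
181 | 
182 | In the previous chapter we deployed cadvisor, which collects real-time container data. To compute utilization, though, you also need to know the resource limits of the container, and that calls for another component.
183 | 
184 | [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics) is a project developed by the Kubernetes team that collects information about containers and resources. Only by combining cadvisor and kube-state-metrics can resource utilization be calculated 🤭.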
185 | 
186 | ```shell
187 | $ git clone https://github.com/kubernetes/kube-state-metrics.git
188 | $ cd kube-state-metrics
189 | $ kubectl apply -f examples/standard
190 | ```
191 | 
192 | Check the deployment:
193 | ```shell
194 | ~ 🐶 k get pods -A | grep kube-state
195 | kube-system            kube-state-metrics-676545cdcc-wzlld          1/1     Running   8          27d
196 | ```
197 | 
198 | #### 8.3.5 Tell Prometheus to scrape kube-state-metrics
199 | ```yaml
200 | # kube-state-metrics-svc-monitor.yaml
201 | apiVersion: monitoring.coreos.com/v1
202 | kind: ServiceMonitor
203 | metadata:
204 |   labels:
205 |     team: frontend
206 |   name: kube-state-metrics
207 | spec:
208 |   endpoints:
209 |   - port: http-metrics
210 |   namespaceSelector:
211 |     matchNames:
212 |     - kube-system
213 |   selector:
214 |     matchLabels:
215 |       app.kubernetes.io/name: kube-state-metrics
216 | 
217 | # k apply -f kube-state-metrics-svc-monitor.yaml
218 | ```
219 | 
220 | #### 8.3.6 Define the alerting rules
221 | 
222 | We only check two rules here: CPU or memory utilization staying above 80% for 3 minutes.
223 | ```yaml
224 | # bbox-rules.yaml
225 | apiVersion: monitoring.coreos.com/v1
226 | kind: PrometheusRule
227 | metadata:
228 |   labels:
229 |     prometheus: prometheus
230 |     role: alert-rules
231 |   name: bbox-rules
232 | spec:
233 |   groups:
234 |   - name: alert-golang
235 |     rules:
236 |     - alert: cpu-usage
237 |       expr: sum(rate(container_cpu_usage_seconds_total{container_label_io_kubernetes_pod_name=~"bbox-.*"}[1m])) by (pod_name) / sum(kube_pod_container_resource_limits_cpu_cores{exported_pod=~"bbox.*"}) > 0.8
238 |       for: 3m
239 |       labels:
240 |         # Matches the label in alertmanager.yaml
241 |         severity: warn
242 |       annotations:
243 |         description: 'bbox 持续 3 分钟 CPU 使用率高于 80%'
244 |         summary: 'bbox 服务出现问题啦！'
245 |     - alert: memory-usage
246 |       expr: sum(container_memory_usage_bytes{container_label_io_kubernetes_pod_name=~"bbox-.*"}) / sum(kube_pod_container_resource_requests_memory_bytes{exported_pod=~"bbox.*"}) > 0.8
247 |       for: 3m
248 |       labels:
249 |         severity: warn
250 |       annotations:
251 |         description: 'bbox 持续 3 分钟内存使用率高于 80%'
252 |         summary: 'bbox 服务出现问题啦！'
253 | 
254 | # k apply -f bbox-rules.yaml
255 | ```
256 | 
257 | Here is what the cpu-usage and memory-usage expressions mean.
258 | 
259 | cpu-usage
260 | ```shell
261 | # A CPU has no built-in notion of "utilization"; it is derived from the CPU time a process consumes. See the Stack Overflow answer below.
262 | # https://stackoverflow.com/questions/40327062/how-to-calculate-containers-cpu-usage-in-kubernetes-with-prometheus-as-monitori
263 | # The result is an "absolute" number of CPU cores in use. It is collected by cadvisor.
264 | sum(rate(container_cpu_usage_seconds_total{container_label_io_kubernetes_pod_name=~"bbox-.*"}[1m])) by (pod_name) 
265 | /
266 | # Then sum the CPU core limits of all instances of the pod; the ratio of the two is the CPU utilization of the deployment as a whole.
267 | # It is collected by kube-state-metrics.
268 | sum(kube_pod_container_resource_limits_cpu_cores{exported_pod=~"bbox.*"}) > 0.8
269 | ```
270 | 
271 | memory-usage
272 | ```shell
273 | # Total memory used by all pods of the bbox deployment, in bytes. It is collected by cadvisor.
274 | sum(container_memory_usage_bytes{container_label_io_kubernetes_pod_name=~"bbox-.*"})
275 | /
276 | # Total memory requested by all pods of the bbox deployment, also in bytes (bbox sets requests equal to limits, 25Mi). It is collected by kube-state-metrics.
277 | # The ratio of the two is the memory utilization of the deployment as a whole.
278 | sum(kube_pod_container_resource_requests_memory_bytes{exported_pod=~"bbox.*"}) > 0.8
279 | ```
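
The two ratios can be reproduced with plain arithmetic. In the sketch below the counter readings and the memory usage are made-up sample values; only the limits (200m CPU, 25Mi memory) come from bbox.yaml.

```python
def cpu_usage_ratio(cpu_seconds_start, cpu_seconds_end, window_seconds, limit_cores):
    """CPU utilization = cores actually used / cores allowed."""
    used_cores = (cpu_seconds_end - cpu_seconds_start) / window_seconds
    return used_cores / limit_cores


def memory_usage_ratio(used_bytes, limit_bytes):
    """Memory utilization = bytes in use / bytes allowed."""
    return used_bytes / limit_bytes


# Hypothetical readings of container_cpu_usage_seconds_total, 60 s apart:
# 11.4 CPU-seconds consumed in 60 s is 0.19 cores, against a 200m (0.2 core) limit.
cpu_ratio = cpu_usage_ratio(120.0, 131.4, 60, 0.2)

# Hypothetical 22 MiB in use against the 25Mi configured for bbox.
mem_ratio = memory_usage_ratio(22 * 1024 * 1024, 25 * 1024 * 1024)

print(round(cpu_ratio, 2), cpu_ratio > 0.8)
print(round(mem_ratio, 2), mem_ratio > 0.8)
```

Both ratios exceed 0.8 in this example, so both rule expressions would return a result and the alerts would become active.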
280 | 
281 | ### 8.4 Alertmanager in Action
282 | 
283 | Next we do something to trigger the alerting rules. Use kubectl to get a shell inside the bbox container.
284 | ```shell
285 | $ k exec -it bbox-7db4d47f4b-425ng -- /bin/sh
286 | # This command drives CPU usage up quickly
287 | $ cat /dev/zero>/dev/null
288 | ```
289 | 
290 | Use `kubectl top` to check that it actually took effect. You can see CPU usage has gone up.
291 | 
292 | ![bbox top](./images/bbox-top.png)
293 | 
294 | Wait a minute, and the CPU alerting rule on the Prometheus server is in the pending state.
295 | 
296 | ![bbox pending](./images/bbox-pending.png)
297 | 
298 | Wait another three minutes, and the CPU alerting rule is in the firing state.
299 | 
300 | ![bbox firing](./images/bbox-firing.png)
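
The pending and firing states come from the `for: 3m` clause of the rule: the condition has to hold continuously for that long before the alert fires. Below is a minimal sketch of that state machine; the function `alert_state` and the sample values are hypothetical, and real rule evaluation has more details than shown here.

```python
def alert_state(samples, threshold, for_seconds):
    """Replay evaluations, oldest first, and return the final alert state.

    samples: list of (timestamp_seconds, value) evaluation results.
    """
    active_since = None
    state = "inactive"
    for ts, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = ts  # condition just became true
            held_for = ts - active_since
            state = "firing" if held_for >= for_seconds else "pending"
        else:
            # Any evaluation below the threshold resets the timer.
            active_since = None
            state = "inactive"
    return state


one_minute = [(0, 0.95), (60, 0.95)]
three_minutes = [(0, 0.95), (60, 0.95), (120, 0.95), (180, 0.95)]
with_dip = [(0, 0.95), (60, 0.95), (120, 0.40), (180, 0.95)]

print(alert_state(one_minute, 0.8, 180))
print(alert_state(three_minutes, 0.8, 180))
print(alert_state(with_dip, 0.8, 180))
```

The third case shows why a short dip delays the notification: the three-minute timer starts over as soon as the expression stops returning a result.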
301 | 
302 | Check the Alertmanager dashboard: the alert has been consumed as well.
303 | 
304 | ![bbox alert](./images/bbox-alert.png)
305 | 
306 | Last step: log in to Slack and check that the alert message really arrived.
307 | 
308 | ![bbox slack](./images/bbox-slack.png)
309 | 
310 | Perfect! ✿✿ヽ(°▽°)ノ✿
311 | 


--------------------------------------------------------------------------------
/0x09-高可用-Prometheus.md:
--------------------------------------------------------------------------------
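 1 | ## 9. Highly Available Prometheus
 2 | 
 3 | My dev machine has limited performance and I have no workload with that much data, so I never really got my hands dirty with the high-availability part (honest and unpretentious!). Still, let's go over some of the existing approaches in the community.
 4 | 
 5 | ### 9.1 Data High Availability
 6 | 
 7 | First, we know that Prometheus itself does not depend on network storage; time series data is stored "locally", so the bottleneck becomes obvious with very large volumes of data.
 8 | 
 9 | But no need to panic: the Prometheus team has thought about this as well. Prometheus provides two interfaces, remote-read and remote-write, which means it can integrate with third-party time series databases and offload older data elsewhere.
10 | 
11 | ***remote write***
12 | 
13 | Users can specify the `remote_write` URL in the Prometheus configuration file. Once it is set, Prometheus sends sample data over HTTP to an Adaptor, and inside the adaptor users can integrate with any external service. That service can be a real storage system, a public cloud storage service, a message queue, or anything else.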
14 | 
15 | ![remote write](./images/remote-write.png)
16 | 
17 | ***remote read***
18 | 
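19 | In the remote read flow, when a user issues a query, Prometheus sends a query request (matchers, ranges) to the URL configured in `remote_read`. The Adaptor fetches the corresponding data from the third-party storage service according to the request, converts it into raw Prometheus samples, and returns it to the Prometheus server.
20 | 
21 | Once the samples arrive, Prometheus processes them locally with PromQL.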
22 | 
23 | ![remote read](./images/remote-read.png)
24 | 
25 | The following integrations currently implement the remote-* interfaces.
26 | 
27 | * [AppOptics](https://github.com/solarwinds/prometheus2appoptics): write
28 | * [Azure Data Explorer](https://github.com/cosh/PrometheusToAdx): read and write
29 | * [Azure Event Hubs](https://github.com/bryanklewis/prometheus-eventhubs-adapter): write
30 | * [Chronix](https://github.com/ChronixDB/chronix.ingester): write
31 | * [Cortex](https://github.com/cortexproject/cortex): read and write
32 | * [CrateDB](https://github.com/crate/crate_adapter): read and write
33 | * [Elasticsearch](https://github.com/infonova/prometheusbeat): write
34 | * [Gnocchi](https://gnocchi.xyz/prometheus.html): write
35 | * [Graphite](https://github.com/prometheus/prometheus/tree/master/documentation/examples/remote_storage/remote_storage_adapter): write
36 | * [InfluxDB](https://docs.influxdata.com/influxdb/latest/supported_protocols/prometheus): read and write
37 | * [IRONdb](https://github.com/circonus-labs/irondb-prometheus-adapter): read and write
38 | * [Kafka](https://github.com/Telefonica/prometheus-kafka-adapter): write
39 | * [M3DB](https://m3db.github.io/m3/integrations/prometheus): read and write
40 | * [OpenTSDB](https://github.com/prometheus/prometheus/tree/master/documentation/examples/remote_storage/remote_storage_adapter): write
41 | * [PostgreSQL/TimescaleDB](https://github.com/timescale/prometheus-postgresql-adapter): read and write
42 | * [QuasarDB](https://doc.quasardb.net/master/user-guide/integration/prometheus.html): read and write
43 | * [SignalFx](https://github.com/signalfx/metricproxy#prometheus): write
44 | * [Splunk](https://github.com/kebe7jun/ropee): read and write
45 | * [TiKV](https://github.com/bragfoo/TiPrometheus): read and write
46 | * [Thanos](https://github.com/thanos-io/thanos): write
47 | * [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics): write
48 | * [Wavefront](https://github.com/wavefrontHQ/prometheus-storage-adapter): write
49 | 
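50 | ### 9.2 Prometheus Service High Availability
51 | 
52 | The Prometheus server has no native high-availability solution, but it does offer a federation mechanism. In other words, the data of one Prometheus instance can be collected by a higher-level Prometheus instance (the upper layer scrapes the lower layer's `/federate` endpoint), forming a hierarchy.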
53 | 
54 | ![prometheus-federation](./images/prometheus-federation.png)
55 | 
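56 | What problem does federation solve? For example, each business unit can run its own Prometheus server, while the data center runs one more Prometheus server that gathers the data from all of them and takes care of aggregation.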
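
To show what the upper-level server actually requests, here is a small sketch that builds a `/federate` URL with `match[]` selectors. The host name `prometheus.dept-a.example` and the helper `federate_url` are hypothetical; the selectors reuse the job and metric names from the earlier chapters.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def federate_url(base_url, matchers):
    """Build the URL a higher-level Prometheus would scrape on a lower-level one."""
    query = urlencode([("match[]", matcher) for matcher in matchers])
    return f"{base_url}/federate?{query}"


matchers = ['{job="language-echo"}', '{__name__=~"service_http_.*"}']
url = federate_url("http://prometheus.dept-a.example:9090", matchers)
print(url)

# Decoding the query string gives back the original selectors.
print(parse_qs(urlsplit(url).query)["match[]"])
```

Each `match[]` parameter is a series selector, so the upper layer can pull only the aggregated or business-critical series instead of everything the lower layer stores.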
57 | 
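58 | Then the next question comes up: if a business unit runs only one Prometheus server to scrape its data, that is a single point of failure. If that server goes down, important monitoring samples are lost. So, to be safe, should we deploy another replica that scrapes the same data as a redundant backup? And thinking further, do we then need yet another Prometheus to monitor how these two Prometheus servers are doing? 🐶
59 | 
60 | To each their own.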
61 | 
62 | ### 9.3 Alertmanager High Availability
63 | 
64 | Alertmanager itself ships with a high-availability mode, so with prometheus-operator all you need to do is raise the replica count (nice!).
65 | ```yaml
66 | apiVersion: monitoring.coreos.com/v1
67 | kind: Alertmanager
68 | metadata:
69 |   name: dev
70 | spec:
71 |   # Raise the replica count for high availability (1 means a single instance).
72 |   replicas: 3
73 | ```
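
When Prometheus is configured by hand rather than through prometheus-operator, it should be pointed at every Alertmanager replica directly instead of at a load-balanced address, so that each replica receives every alert and the cluster can deduplicate notifications. A minimal sketch, with hypothetical hostnames:

```yaml
# prometheus.yml (only relevant when Prometheus is not managed by the operator).
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            # Hypothetical addresses; list every replica, not a load balancer.
            - 'alertmanager-0.example.internal:9093'
            - 'alertmanager-1.example.internal:9093'
            - 'alertmanager-2.example.internal:9093'
```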
74 | 
75 | ### Key Takeaways
76 | 
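77 | That wraps up this round of tinkering with Prometheus, and I learned quite a lot along the way. The advice is the same as ever: read the docs and get your hands dirty. If there is a follow-up, it will probably be a source-code walkthrough 🙃.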
78 | 
79 | **:wq**
80 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2020~now chenjiandongx
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | <p align="center"><img src="./images/logo.png" width="360px"></p>
 2 | <h1 align="center">《 Prometheus Tinkering Notes 》</h1>
 3 | 
 4 | I have been meaning to write up what I learned from tinkering with Prometheus lately, but people are lazy by nature, and procrastination is addictive.
 5 | 
 6 | As the saying goes: putting off an update feels great for a moment, and putting it off forever feels great forever 🐶.
 7 | 
 8 | Motivation is a strange thing. You cannot say why, but one day the urge strikes and you just start writing, like this 😉!
 9 | 
10 | <p align="center"><img src="./images/heartbeat.gif"></p>
11 | 
12 | * [0x01 Origins of Cloud Native](./0x01-云原生的来源.md)
13 | * [0x02 Monitoring Systems](./0x02-监控系统.md)
14 | * [0x03 Setting Up a Test Environment](./0x03-搭建测试环境.md)
15 | * [0x04 Prometheus Theory (Basics)](./0x04-Prometheus-理论(入门篇).md)
16 | * [0x05 Prometheus Theory (Advanced)](./0x05-Prometheus-理论(进阶篇).md)
17 | * [0x06 Prometheus in Practice (Basics)](./0x06-Prometheus-实践(入门篇).md)
18 | * [0x07 Prometheus in Practice (Advanced)](./0x07-Prometheus-实践(进阶篇).md)
19 | * [0x08 Alertmanager Alerting System](./0x08-Alertmanager-告警系统.md)
20 | * [0x09 High-Availability Prometheus](./0x09-高可用-Prometheus.md)
21 | 
22 | #### License
23 | 
24 | MIT [©chenjiandongx](https://github.com/chenjiandongx)
25 | 


--------------------------------------------------------------------------------
/example/grafana-docker.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/example/grafana-docker.json


--------------------------------------------------------------------------------
/example/grafana-ingress.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/example/grafana-ingress.json


--------------------------------------------------------------------------------
/example/grafana-language-echo.json:
--------------------------------------------------------------------------------
  1 | {
  2 |     "annotations": {
  3 |       "list": [
  4 |         {
  5 |           "builtIn": 1,
  6 |           "datasource": "-- Grafana --",
  7 |           "enable": true,
  8 |           "hide": true,
  9 |           "iconColor": "rgba(0, 211, 255, 1)",
 10 |           "name": "Annotations & Alerts",
 11 |           "type": "dashboard"
 12 |         }
 13 |       ]
 14 |     },
 15 |     "editable": true,
 16 |     "gnetId": null,
 17 |     "graphTooltip": 1,
 18 |     "id": 3,
 19 |     "iteration": 1580895141536,
 20 |     "links": [],
 21 |     "panels": [
 22 |       {
 23 |         "cacheTimeout": null,
 24 |         "colorBackground": false,
 25 |         "colorValue": true,
 26 |         "colors": [
 27 |           "#5794F2",
 28 |           "#5794F2",
 29 |           "#5794F2"
 30 |         ],
 31 |         "datasource": "$datasource",
 32 |         "format": "dateTimeAsIso",
 33 |         "gauge": {
 34 |           "maxValue": 100,
 35 |           "minValue": 0,
 36 |           "show": false,
 37 |           "thresholdLabels": false,
 38 |           "thresholdMarkers": true
 39 |         },
 40 |         "gridPos": {
 41 |           "h": 3,
 42 |           "w": 12,
 43 |           "x": 0,
 44 |           "y": 0
 45 |         },
 46 |         "id": 14,
 47 |         "interval": null,
 48 |         "links": [],
 49 |         "mappingType": 1,
 50 |         "mappingTypes": [
 51 |           {
 52 |             "name": "value to text",
 53 |             "value": 1
 54 |           },
 55 |           {
 56 |             "name": "range to text",
 57 |             "value": 2
 58 |           }
 59 |         ],
 60 |         "maxDataPoints": 100,
 61 |         "nullPointMode": "connected",
 62 |         "nullText": null,
 63 |         "options": {},
 64 |         "postfix": "",
 65 |         "postfixFontSize": "50%",
 66 |         "prefix": "",
 67 |         "prefixFontSize": "50%",
 68 |         "rangeMaps": [
 69 |           {
 70 |             "from": "null",
 71 |             "text": "N/A",
 72 |             "to": "null"
 73 |           }
 74 |         ],
 75 |         "sparkline": {
 76 |           "fillColor": "rgba(31, 118, 189, 0.18)",
 77 |           "full": false,
 78 |           "lineColor": "rgb(31, 120, 193)",
 79 |           "show": false
 80 |         },
 81 |         "tableColumn": "",
 82 |         "targets": [
 83 |           {
 84 |             "expr": "max((time() - service_uptime{job=~\"$job\"}) * 1000)",
 85 |             "format": "time_series",
 86 |             "intervalFactor": 1,
 87 |             "refId": "A"
 88 |           }
 89 |         ],
 90 |         "thresholds": "",
 91 |         "timeFrom": null,
 92 |         "timeShift": null,
 93 |         "title": "Service last updated",
 94 |         "type": "singlestat",
 95 |         "valueFontSize": "100%",
 96 |         "valueMaps": [
 97 |           {
 98 |             "op": "=",
 99 |             "text": "N/A",
100 |             "value": "null"
101 |           }
102 |         ],
103 |         "valueName": "current"
104 |       },
105 |       {
106 |         "cacheTimeout": null,
107 |         "colorBackground": false,
108 |         "colorValue": true,
109 |         "colors": [
110 |           "#299c46",
111 |           "#73BF69",
112 |           "#d44a3a"
113 |         ],
114 |         "datasource": "$datasource",
115 |         "format": "s",
116 |         "gauge": {
117 |           "maxValue": 100,
118 |           "minValue": 0,
119 |           "show": false,
120 |           "thresholdLabels": false,
121 |           "thresholdMarkers": true
122 |         },
123 |         "gridPos": {
124 |           "h": 3,
125 |           "w": 12,
126 |           "x": 12,
127 |           "y": 0
128 |         },
129 |         "id": 12,
130 |         "interval": null,
131 |         "links": [],
132 |         "mappingType": 1,
133 |         "mappingTypes": [
134 |           {
135 |             "name": "value to text",
136 |             "value": 1
137 |           },
138 |           {
139 |             "name": "range to text",
140 |             "value": 2
141 |           }
142 |         ],
143 |         "maxDataPoints": 100,
144 |         "nullPointMode": "connected",
145 |         "nullText": null,
146 |         "options": {},
147 |         "pluginVersion": "6.2.1",
148 |         "postfix": "",
149 |         "postfixFontSize": "50%",
150 |         "prefix": "",
151 |         "prefixFontSize": "50%",
152 |         "rangeMaps": [
153 |           {
154 |             "from": "null",
155 |             "text": "N/A",
156 |             "to": "null"
157 |           }
158 |         ],
159 |         "sparkline": {
160 |           "fillColor": "rgba(31, 118, 189, 0.18)",
161 |           "full": false,
162 |           "lineColor": "rgb(31, 120, 193)",
163 |           "show": false
164 |         },
165 |         "tableColumn": "",
166 |         "targets": [
167 |           {
168 |             "expr": "max(service_uptime{job=~\"$job\"})",
169 |             "format": "time_series",
170 |             "intervalFactor": 1,
171 |             "refId": "A"
172 |           }
173 |         ],
174 |         "thresholds": "",
175 |         "timeFrom": null,
176 |         "timeShift": null,
177 |         "title": "Service Uptime",
178 |         "type": "singlestat",
179 |         "valueFontSize": "110%",
180 |         "valueMaps": [
181 |           {
182 |             "op": "=",
183 |             "text": "N/A",
184 |             "value": "null"
185 |           }
186 |         ],
187 |         "valueName": "current"
188 |       },
189 |       {
190 |         "aliasColors": {},
191 |         "bars": false,
192 |         "dashLength": 10,
193 |         "dashes": false,
194 |         "datasource": "$datasource",
195 |         "decimals": null,
196 |         "fill": 1,
197 |         "fillGradient": 0,
198 |         "gridPos": {
199 |           "h": 12,
200 |           "w": 12,
201 |           "x": 0,
202 |           "y": 3
203 |         },
204 |         "id": 2,
205 |         "legend": {
206 |           "alignAsTable": true,
207 |           "avg": true,
208 |           "current": true,
209 |           "max": true,
210 |           "min": false,
211 |           "rightSide": false,
212 |           "show": true,
213 |           "total": false,
214 |           "values": true
215 |         },
216 |         "lines": true,
217 |         "linewidth": 2,
218 |         "links": [],
219 |         "nullPointMode": "null as zero",
220 |         "options": {
221 |           "dataLinks": []
222 |         },
223 |         "percentage": false,
224 |         "pointradius": 2,
225 |         "points": false,
226 |         "renderer": "flot",
227 |         "seriesOverrides": [],
228 |         "spaceLength": 10,
229 |         "stack": false,
230 |         "steppedLine": false,
231 |         "targets": [
232 |           {
233 |             "expr": "sum(rate(service_http_request_count_total{job=~\"$job\", exported_endpoint=~\"$endpoint\", method=~\"$method\", status=~\"$status\"}[$interval])) by (exported_endpoint)",
234 |             "format": "time_series",
235 |             "intervalFactor": 1,
236 |             "legendFormat": "{{ exported_endpoint }}",
237 |             "refId": "A"
238 |           }
239 |         ],
240 |         "thresholds": [],
241 |         "timeFrom": null,
242 |         "timeRegions": [],
243 |         "timeShift": null,
244 |         "title": "Request QPS",
245 |         "tooltip": {
246 |           "shared": true,
247 |           "sort": 0,
248 |           "value_type": "individual"
249 |         },
250 |         "type": "graph",
251 |         "xaxis": {
252 |           "buckets": null,
253 |           "mode": "time",
254 |           "name": null,
255 |           "show": true,
256 |           "values": []
257 |         },
258 |         "yaxes": [
259 |           {
260 |             "format": "short",
261 |             "label": null,
262 |             "logBase": 1,
263 |             "max": null,
264 |             "min": null,
265 |             "show": true
266 |           },
267 |           {
268 |             "format": "short",
269 |             "label": null,
270 |             "logBase": 1,
271 |             "max": null,
272 |             "min": null,
273 |             "show": true
274 |           }
275 |         ],
276 |         "yaxis": {
277 |           "align": false,
278 |           "alignLevel": null
279 |         }
280 |       },
281 |       {
282 |         "aliasColors": {},
283 |         "bars": false,
284 |         "dashLength": 10,
285 |         "dashes": false,
286 |         "datasource": null,
287 |         "fill": 1,
288 |         "fillGradient": 0,
289 |         "gridPos": {
290 |           "h": 12,
291 |           "w": 12,
292 |           "x": 12,
293 |           "y": 3
294 |         },
295 |         "id": 16,
296 |         "legend": {
297 |           "alignAsTable": true,
298 |           "avg": true,
299 |           "current": true,
300 |           "max": false,
301 |           "min": true,
302 |           "show": true,
303 |           "total": false,
304 |           "values": true
305 |         },
306 |         "lines": true,
307 |         "linewidth": 2,
308 |         "nullPointMode": "connected",
309 |         "options": {
310 |           "dataLinks": []
311 |         },
312 |         "percentage": false,
313 |         "pluginVersion": "6.4.4",
314 |         "pointradius": 2,
315 |         "points": false,
316 |         "renderer": "flot",
317 |         "seriesOverrides": [],
318 |         "spaceLength": 10,
319 |         "stack": false,
320 |         "steppedLine": false,
321 |         "targets": [
322 |           {
323 |             "expr": "sum(rate(service_http_request_count_total{job=~\"$job\", exported_endpoint=~\"$endpoint\", method=~\"$method\", status=~\"^2.*\"}[$interval])) by (exported_endpoint) /sum(rate(service_http_request_count_total{job=~\"$job\", exported_endpoint=~\"$endpoint\", method=~\"$method\"}[$interval])) by (exported_endpoint)",
324 |             "legendFormat": "{{ exported_endpoint }}",
325 |             "refId": "A"
326 |           }
327 |         ],
328 |         "thresholds": [],
329 |         "timeFrom": null,
330 |         "timeRegions": [],
331 |         "timeShift": null,
332 |         "title": "Request success rate",
333 |         "tooltip": {
334 |           "shared": true,
335 |           "sort": 0,
336 |           "value_type": "individual"
337 |         },
338 |         "type": "graph",
339 |         "xaxis": {
340 |           "buckets": null,
341 |           "mode": "time",
342 |           "name": null,
343 |           "show": true,
344 |           "values": []
345 |         },
346 |         "yaxes": [
347 |           {
348 |             "format": "short",
349 |             "label": null,
350 |             "logBase": 1,
351 |             "max": null,
352 |             "min": null,
353 |             "show": true
354 |           },
355 |           {
356 |             "format": "short",
357 |             "label": null,
358 |             "logBase": 1,
359 |             "max": null,
360 |             "min": null,
361 |             "show": true
362 |           }
363 |         ],
364 |         "yaxis": {
365 |           "align": false,
366 |           "alignLevel": null
367 |         }
368 |       },
369 |       {
370 |         "aliasColors": {},
371 |         "bars": false,
372 |         "dashLength": 10,
373 |         "dashes": false,
374 |         "datasource": "$datasource",
375 |         "fill": 1,
376 |         "fillGradient": 0,
377 |         "gridPos": {
378 |           "h": 11,
379 |           "w": 12,
380 |           "x": 0,
381 |           "y": 15
382 |         },
383 |         "id": 4,
384 |         "legend": {
385 |           "alignAsTable": true,
386 |           "avg": true,
387 |           "current": true,
388 |           "max": true,
389 |           "min": false,
390 |           "show": true,
391 |           "total": false,
392 |           "values": true
393 |         },
394 |         "lines": true,
395 |         "linewidth": 2,
396 |         "links": [],
397 |         "nullPointMode": "null as zero",
398 |         "options": {
399 |           "dataLinks": []
400 |         },
401 |         "percentage": false,
402 |         "pointradius": 2,
403 |         "points": false,
404 |         "renderer": "flot",
405 |         "seriesOverrides": [],
406 |         "spaceLength": 10,
407 |         "stack": false,
408 |         "steppedLine": false,
409 |         "targets": [
410 |           {
411 |             "expr": "sum(rate(service_http_request_size_bytes_sum{job=~\"$job\", exported_endpoint=~\"$endpoint\", method=~\"$method\", status=~\"$status\"}[$interval])) by (exported_endpoint)\n/\nsum(rate(service_http_request_size_bytes_count{job=~\"$job\", exported_endpoint=~\"$endpoint\", method=~\"$method\", status=~\"$status\"}[$interval])) by (exported_endpoint)",
412 |             "format": "time_series",
413 |             "intervalFactor": 1,
414 |             "legendFormat": "{{ exported_endpoint }}",
415 |             "refId": "A"
416 |           }
417 |         ],
418 |         "thresholds": [],
419 |         "timeFrom": null,
420 |         "timeRegions": [],
421 |         "timeShift": null,
422 |         "title": "Average request size (bytes)",
423 |         "tooltip": {
424 |           "shared": true,
425 |           "sort": 0,
426 |           "value_type": "individual"
427 |         },
428 |         "type": "graph",
429 |         "xaxis": {
430 |           "buckets": null,
431 |           "mode": "time",
432 |           "name": null,
433 |           "show": true,
434 |           "values": []
435 |         },
436 |         "yaxes": [
437 |           {
438 |             "format": "short",
439 |             "label": null,
440 |             "logBase": 1,
441 |             "max": null,
442 |             "min": null,
443 |             "show": true
444 |           },
445 |           {
446 |             "format": "short",
447 |             "label": null,
448 |             "logBase": 1,
449 |             "max": null,
450 |             "min": null,
451 |             "show": true
452 |           }
453 |         ],
454 |         "yaxis": {
455 |           "align": false,
456 |           "alignLevel": null
457 |         }
458 |       },
459 |       {
460 |         "aliasColors": {},
461 |         "bars": false,
462 |         "dashLength": 10,
463 |         "dashes": false,
464 |         "datasource": "$datasource",
465 |         "fill": 1,
466 |         "fillGradient": 0,
467 |         "gridPos": {
468 |           "h": 11,
469 |           "w": 12,
470 |           "x": 12,
471 |           "y": 15
472 |         },
473 |         "id": 8,
474 |         "legend": {
475 |           "alignAsTable": true,
476 |           "avg": true,
477 |           "current": true,
478 |           "max": true,
479 |           "min": false,
480 |           "show": true,
481 |           "total": false,
482 |           "values": true
483 |         },
484 |         "lines": true,
485 |         "linewidth": 2,
486 |         "links": [],
487 |         "nullPointMode": "null as zero",
488 |         "options": {
489 |           "dataLinks": []
490 |         },
491 |         "percentage": false,
492 |         "pointradius": 2,
493 |         "points": false,
494 |         "renderer": "flot",
495 |         "seriesOverrides": [],
496 |         "spaceLength": 10,
497 |         "stack": false,
498 |         "steppedLine": false,
499 |         "targets": [
500 |           {
501 |             "expr": "sum(rate(service_http_request_duration_seconds_sum{job=~\"$job\", exported_endpoint=~\"$endpoint\", method=~\"$method\", status=~\"$status\"}[$interval])) by (exported_endpoint)\n/\nsum(rate(service_http_request_duration_seconds_count{job=~\"$job\", exported_endpoint=~\"$endpoint\", method=~\"$method\", status=~\"$status\"}[$interval])) by (exported_endpoint)",
502 |             "format": "time_series",
503 |             "intervalFactor": 1,
504 |             "legendFormat": "{{ exported_endpoint }}",
505 |             "refId": "A"
506 |           }
507 |         ],
508 |         "thresholds": [],
509 |         "timeFrom": null,
510 |         "timeRegions": [],
511 |         "timeShift": null,
512 |         "title": "Duration per request",
513 |         "tooltip": {
514 |           "shared": true,
515 |           "sort": 0,
516 |           "value_type": "individual"
517 |         },
518 |         "type": "graph",
519 |         "xaxis": {
520 |           "buckets": null,
521 |           "mode": "time",
522 |           "name": null,
523 |           "show": true,
524 |           "values": []
525 |         },
526 |         "yaxes": [
527 |           {
528 |             "format": "short",
529 |             "label": null,
530 |             "logBase": 1,
531 |             "max": null,
532 |             "min": null,
533 |             "show": true
534 |           },
535 |           {
536 |             "format": "short",
537 |             "label": null,
538 |             "logBase": 1,
539 |             "max": null,
540 |             "min": null,
541 |             "show": true
542 |           }
543 |         ],
544 |         "yaxis": {
545 |           "align": false,
546 |           "alignLevel": null
547 |         }
548 |       },
549 |       {
550 |         "aliasColors": {},
551 |         "bars": false,
552 |         "dashLength": 10,
553 |         "dashes": false,
554 |         "datasource": "$datasource",
555 |         "fill": 1,
556 |         "fillGradient": 0,
557 |         "gridPos": {
558 |           "h": 12,
559 |           "w": 12,
560 |           "x": 0,
561 |           "y": 26
562 |         },
563 |         "id": 10,
564 |         "legend": {
565 |           "avg": true,
566 |           "current": true,
567 |           "max": true,
568 |           "min": false,
569 |           "show": true,
570 |           "total": false,
571 |           "values": true
572 |         },
573 |         "lines": true,
574 |         "linewidth": 2,
575 |         "links": [],
576 |         "nullPointMode": "null as zero",
577 |         "options": {
578 |           "dataLinks": []
579 |         },
580 |         "percentage": false,
581 |         "pointradius": 2,
582 |         "points": false,
583 |         "renderer": "flot",
584 |         "seriesOverrides": [],
585 |         "spaceLength": 10,
586 |         "stack": false,
587 |         "steppedLine": false,
588 |         "targets": [
589 |           {
590 |             "expr": "histogram_quantile($quantile, sum(rate(service_http_request_duration_seconds_bucket{job=~\"$job\", exported_endpoint=~\"$endpoint\", method=~\"$method\", status=~\"$status\"}[$interval])) by (le, exported_endpoint))",
591 |             "format": "time_series",
592 |             "intervalFactor": 1,
593 |             "legendFormat": "{{ exported_endpoint }}",
594 |             "refId": "A"
595 |           }
596 |         ],
597 |         "thresholds": [],
598 |         "timeFrom": null,
599 |         "timeRegions": [],
600 |         "timeShift": null,
601 |         "title": "Quantile of requests",
602 |         "tooltip": {
603 |           "shared": true,
604 |           "sort": 0,
605 |           "value_type": "individual"
606 |         },
607 |         "type": "graph",
608 |         "xaxis": {
609 |           "buckets": null,
610 |           "mode": "time",
611 |           "name": null,
612 |           "show": true,
613 |           "values": []
614 |         },
615 |         "yaxes": [
616 |           {
617 |             "format": "short",
618 |             "label": null,
619 |             "logBase": 1,
620 |             "max": null,
621 |             "min": null,
622 |             "show": true
623 |           },
624 |           {
625 |             "format": "short",
626 |             "label": null,
627 |             "logBase": 1,
628 |             "max": null,
629 |             "min": null,
630 |             "show": true
631 |           }
632 |         ],
633 |         "yaxis": {
634 |           "align": false,
635 |           "alignLevel": null
636 |         }
637 |       },
638 |       {
639 |         "aliasColors": {},
640 |         "bars": false,
641 |         "dashLength": 10,
642 |         "dashes": false,
643 |         "datasource": "$datasource",
644 |         "description": "",
645 |         "fill": 1,
646 |         "fillGradient": 0,
647 |         "gridPos": {
648 |           "h": 12,
649 |           "w": 12,
650 |           "x": 12,
651 |           "y": 26
652 |         },
653 |         "id": 6,
654 |         "legend": {
655 |           "alignAsTable": true,
656 |           "avg": true,
657 |           "current": true,
658 |           "max": true,
659 |           "min": false,
660 |           "show": true,
661 |           "total": false,
662 |           "values": true
663 |         },
664 |         "lines": true,
665 |         "linewidth": 2,
666 |         "links": [],
667 |         "nullPointMode": "null as zero",
668 |         "options": {
669 |           "dataLinks": []
670 |         },
671 |         "percentage": false,
672 |         "pointradius": 2,
673 |         "points": false,
674 |         "renderer": "flot",
675 |         "seriesOverrides": [],
676 |         "spaceLength": 10,
677 |         "stack": false,
678 |         "steppedLine": false,
679 |         "targets": [
680 |           {
681 |             "expr": "sum(rate(service_http_response_size_bytes_sum{job=~\"$job\", exported_endpoint=~\"$endpoint\", method=~\"$method\", status=~\"$status\"}[$interval])) by (exported_endpoint)\n/\nsum(rate(service_http_response_size_bytes_count{job=~\"$job\", exported_endpoint=~\"$endpoint\", method=~\"$method\", status=~\"$status\"}[$interval])) by (exported_endpoint)",
682 |             "format": "time_series",
683 |             "intervalFactor": 1,
684 |             "legendFormat": "{{ exported_endpoint }}",
685 |             "refId": "A"
686 |           }
687 |         ],
688 |         "thresholds": [],
689 |         "timeFrom": null,
690 |         "timeRegions": [],
691 |         "timeShift": null,
692 |         "title": "Average response size (bytes)",
693 |         "tooltip": {
694 |           "shared": true,
695 |           "sort": 0,
696 |           "value_type": "individual"
697 |         },
698 |         "type": "graph",
699 |         "xaxis": {
700 |           "buckets": null,
701 |           "mode": "time",
702 |           "name": null,
703 |           "show": true,
704 |           "values": []
705 |         },
706 |         "yaxes": [
707 |           {
708 |             "format": "short",
709 |             "label": null,
710 |             "logBase": 1,
711 |             "max": null,
712 |             "min": null,
713 |             "show": true
714 |           },
715 |           {
716 |             "format": "short",
717 |             "label": null,
718 |             "logBase": 1,
719 |             "max": null,
720 |             "min": null,
721 |             "show": true
722 |           }
723 |         ],
724 |         "yaxis": {
725 |           "align": false,
726 |           "alignLevel": null
727 |         }
728 |       }
729 |     ],
730 |     "refresh": "10s",
731 |     "schemaVersion": 20,
732 |     "style": "dark",
733 |     "tags": [
734 |       "Service"
735 |     ],
736 |     "templating": {
737 |       "list": [
738 |         {
739 |           "allValue": null,
740 |           "current": {
741 |             "text": "default/language-echo",
742 |             "value": "default/language-echo"
743 |           },
744 |           "datasource": "$datasource",
745 |           "definition": "label_values(service_http_request_count_total, job)",
746 |           "hide": 0,
747 |           "includeAll": false,
748 |           "label": "Job",
749 |           "multi": false,
750 |           "name": "job",
751 |           "options": [],
752 |           "query": "label_values(service_http_request_count_total, job)",
753 |           "refresh": 1,
754 |           "regex": "",
755 |           "skipUrlSync": false,
756 |           "sort": 1,
757 |           "tagValuesQuery": "",
758 |           "tags": [],
759 |           "tagsQuery": "",
760 |           "type": "query",
761 |           "useTags": false
762 |         },
763 |         {
764 |           "current": {
765 |             "text": "Prometheus",
766 |             "value": "Prometheus"
767 |           },
768 |           "hide": 2,
769 |           "includeAll": false,
770 |           "label": null,
771 |           "multi": false,
772 |           "name": "datasource",
773 |           "options": [],
774 |           "query": "prometheus",
775 |           "refresh": 1,
776 |           "regex": "",
777 |           "skipUrlSync": false,
778 |           "type": "datasource"
779 |         },
780 |         {
781 |           "auto": false,
782 |           "auto_count": 30,
783 |           "auto_min": "10s",
784 |           "current": {
785 |             "text": "1m",
786 |             "value": "1m"
787 |           },
788 |           "hide": 0,
789 |           "label": "Interval",
790 |           "name": "interval",
791 |           "options": [
792 |             {
793 |               "selected": true,
794 |               "text": "1m",
795 |               "value": "1m"
796 |             },
797 |             {
798 |               "selected": false,
799 |               "text": "10m",
800 |               "value": "10m"
801 |             },
802 |             {
803 |               "selected": false,
804 |               "text": "30m",
805 |               "value": "30m"
806 |             },
807 |             {
808 |               "selected": false,
809 |               "text": "1h",
810 |               "value": "1h"
811 |             },
812 |             {
813 |               "selected": false,
814 |               "text": "6h",
815 |               "value": "6h"
816 |             },
817 |             {
818 |               "selected": false,
819 |               "text": "12h",
820 |               "value": "12h"
821 |             },
822 |             {
823 |               "selected": false,
824 |               "text": "1d",
825 |               "value": "1d"
826 |             },
827 |             {
828 |               "selected": false,
829 |               "text": "7d",
830 |               "value": "7d"
831 |             },
832 |             {
833 |               "selected": false,
834 |               "text": "14d",
835 |               "value": "14d"
836 |             },
837 |             {
838 |               "selected": false,
839 |               "text": "30d",
840 |               "value": "30d"
841 |             }
842 |           ],
843 |           "query": "1m,10m,30m,1h,6h,12h,1d,7d,14d,30d",
844 |           "refresh": 2,
845 |           "skipUrlSync": false,
846 |           "type": "interval"
847 |         },
848 |         {
849 |           "allValue": null,
850 |           "current": {
851 |             "text": "All",
852 |             "value": "$__all"
853 |           },
854 |           "datasource": "Prometheus",
855 |           "definition": "label_values(service_http_request_count_total{exported_endpoint!~\"/metrics\", job=~\"$job\"}, exported_endpoint)",
856 |           "hide": 0,
857 |           "includeAll": true,
858 |           "label": "Endpoint",
859 |           "multi": false,
860 |           "name": "endpoint",
861 |           "options": [],
862 |           "query": "label_values(service_http_request_count_total{exported_endpoint!~\"/metrics\", job=~\"$job\"}, exported_endpoint)",
863 |           "refresh": 1,
864 |           "regex": "",
865 |           "skipUrlSync": false,
866 |           "sort": 0,
867 |           "tagValuesQuery": "",
868 |           "tags": [],
869 |           "tagsQuery": "",
870 |           "type": "query",
871 |           "useTags": false
872 |         },
873 |         {
874 |           "allValue": null,
875 |           "current": {
876 |             "text": "GET",
877 |             "value": "GET"
878 |           },
879 |           "datasource": "Prometheus",
880 |           "definition": "label_values(service_http_request_count_total, method)",
881 |           "hide": 0,
882 |           "includeAll": true,
883 |           "label": "Method",
884 |           "multi": false,
885 |           "name": "method",
886 |           "options": [],
887 |           "query": "label_values(service_http_request_count_total, method)",
888 |           "refresh": 1,
889 |           "regex": "",
890 |           "skipUrlSync": false,
891 |           "sort": 0,
892 |           "tagValuesQuery": "",
893 |           "tags": [],
894 |           "tagsQuery": "",
895 |           "type": "query",
896 |           "useTags": false
897 |         },
898 |         {
899 |           "allValue": null,
900 |           "current": {
901 |             "text": "200",
902 |             "value": "200"
903 |           },
904 |           "datasource": "Prometheus",
905 |           "definition": "label_values(service_http_request_count_total, status)",
906 |           "hide": 0,
907 |           "includeAll": true,
908 |           "label": "Status",
909 |           "multi": false,
910 |           "name": "status",
911 |           "options": [],
912 |           "query": "label_values(service_http_request_count_total, status)",
913 |           "refresh": 1,
914 |           "regex": "",
915 |           "skipUrlSync": false,
916 |           "sort": 0,
917 |           "tagValuesQuery": "",
918 |           "tags": [],
919 |           "tagsQuery": "",
920 |           "type": "query",
921 |           "useTags": false
922 |         },
923 |         {
924 |           "allValue": null,
925 |           "current": {
926 |             "text": "0.25",
927 |             "value": "0.25"
928 |           },
929 |           "hide": 0,
930 |           "includeAll": false,
931 |           "label": "Quantile",
932 |           "multi": false,
933 |           "name": "quantile",
934 |           "options": [
935 |             {
936 |               "selected": false,
937 |               "text": "0.95",
938 |               "value": "0.95"
939 |             },
940 |             {
941 |               "selected": false,
942 |               "text": "0.75",
943 |               "value": "0.75"
944 |             },
945 |             {
946 |               "selected": true,
947 |               "text": "0.25",
948 |               "value": "0.25"
949 |             }
950 |           ],
951 |           "query": "0.95,0.75,0.25",
952 |           "skipUrlSync": false,
953 |           "type": "custom"
954 |         }
955 |       ]
956 |     },
957 |     "time": {
958 |       "from": "now-1h",
959 |       "to": "now"
960 |     },
961 |     "timepicker": {
962 |       "refresh_intervals": [
963 |         "10s",
964 |         "30s",
965 |         "1m",
966 |         "5m",
967 |         "15m",
968 |         "30m",
969 |         "1h",
970 |         "2h",
971 |         "1d"
972 |       ],
973 |       "time_options": [
974 |         "5m",
975 |         "15m",
976 |         "1h",
977 |         "6h",
978 |         "12h",
979 |         "24h",
980 |         "2d",
981 |         "7d",
982 |         "30d"
983 |       ]
984 |     },
985 |     "timezone": "browser",
986 |     "title": "HTTP Services",
987 |     "uid": "EyaoypOZz",
988 |     "version": 2
989 |   }


--------------------------------------------------------------------------------
/example/grafana-prometheus.json:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/example/grafana-prometheus.json


--------------------------------------------------------------------------------
/images/alertmanager-arch.svg:
--------------------------------------------------------------------------------
1 | <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
2 | <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" width="658px" height="494px" version="1.1" content="&lt;mxfile userAgent=&quot;Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36&quot; version=&quot;8.6.9&quot; editor=&quot;www.draw.io&quot; type=&quot;google&quot;&gt;&lt;diagram id=&quot;eda46daa-11a9-bbae-2e7a-d514aa78bb29&quot; name=&quot;Page-1&quot;&gt;7V1bk5s6Ev41rjr7YBfizuNcMjmnKtmaymxV9jzKINvaMJYP4MxMfv0KLGFQg41tge3JjCsVWwgB+r5utVrdYmTdPb9+TvBq8ZVFJB6ZRvQ6su5Hpols0+X/5SVvmxLPNDYF84RGotK24In+IqJQVlvTiKS1ihljcUZX9cKQLZckzGplOEnYS73ajMX1q67wnICCpxDHsPQ7jbLFptR3jG35n4TOF/LKyBBHnrGsLArSBY7YS6XI+jSy7hLGss2359c7EuedJ/tlc95Dy9HyxhKyzLqcYJMwCs0onEUBJgEJx8gMNm38xPFaPO6/WUZnNMQZZUt+5JGuSEyXRDxB9ia75WVBM/K0wmH++4VDP7JuF9lzzH8h/pUleJn36+2MxvEdi1lSnGY9PHy6u+M3eDtPcET5jctjS7bMq5d9ZOR1Ypym4nuaJewHqbTkG/mHH4lwuiCRuO5PkmT87uObmM6XvCxj+Y3N2DJ7Evee11qwhP7iZVjeblFhQz1kid8P+JnGOWlvEppXvBV9xa9AXlshQCWwXCIIeyZZ8sariBNsSRshDKa3+fmyZZZpiSqLCqvswBKMFmyel01vAedfBOYt+LszbBNvSgzPN6JoOrZcAP/IdGN+zdsp/zLPv3wjIaH8meUBfpHyWFk5UUsi+lMWfRp/xTSunF45diirNiSQYogO4stt8WmlSMLWy6igUX42FsdCjit/9v4I4SiEsE0bMMJrIgTSwAegD1wHEOKJZLxgoxW4Gv4ATAXMCzoBJqVarwBDvBok8CiZVgUVtPqdTBeM/WjXAVci6r2peilMh0q2DqLMfByYKIqCEPu+740RHOgBILDHq3As8Cqv9/w6z828ySxmL+ECJ9lklbCQ5FjcNiHaV++a1r5xFKEmKfRO71wTdOU9TVc4CxdbiapKgy7D6cH65O4wnFQhMIq/wySpm/XUG6bBXuOoEVTkaBAZhACsn1ma0hWElA+JWXywRXyKQrOsIHh4aADomUZRfv3BdJpbQ8hH3VSaowEfONTdPP51OghtYjYU6e26IpMiUO1Rv4Hypg7K+7BLY/4svOgzWZIEZyxJq/SvdLT7zzqfthbdMk6LmdMNr4DM1evGBBDHVaMgr39SQ388JvmTLsg6/VfF0ti026JyhxvZqvNIUw9BrLpStKzhRjppNVS6kkRzIsWCJdmCzdkSx5+2pYomqnQ0WUY3uT+G/5zGLPyxKXqgcV04K8Jo3+efoh6/8//mTU4c+fNvcQXySrPNIdsTP/+WV+TfH0lC+YPnc4J9qjF/tt0g8a5g6ySUtYT8ZDiZk3IsagYzITHOuM1da6wJG3HqI6OFoAgSeF6NBEidTWxuQZxUdQEp7fh1/e0pzWyeDjRT8KR8mCPnliZ0NjzRmCx5b5oGF+qfNMrtJ1V0hXRGOMMp10ekRRorPIvxlMSPLKWFC6syJZSD5xelQjmIgkmkOtxOWZax5x5HA+TVpR15zsQBAu82yLsOGwiCJnkGQYNQnW9CN5S5g6T4DDHXh1hAC+gbW2e/JRKm1
c3y7AkJOJv+jml2MA6H2/+djI69amyDFaRBfejWAlvdB2L6wznLrJkxJYFv+CHBBNv22LE+UDvKxTkkalDYHE+rEVpai45iLKbchsqkgVo6VXiZsFCNLibs/0iWvQlU8TpjhUNH3uAXliPYPpusMgLtsXqPs1jbB/iqDds29Gi3am3FHDWli3SPWavFHnWhOtBJtMujky7KOA3TnoZawYmUOV5jeDZA9iZOCI7eoC8vLZ79oTfkJa5V6JXJsCDHFnnjupH3rJ2jizExa0Jfrk0cq0xkFTabpUS/nnCgsfet5OhlWN3AdEj5XdDl/AuZZaLJ86x5WUMujkIN731o+KPk3O0m50YzFwbQ8C70XN+TaL26IJm8kPAEy+q26tyPBHrQffShO7sid1bd6ZkAuSuJBBosnuO8+MDZS4FMfsMfznSp+xSJaojAGc6VbkDE5GrrB2DiqLL6IVe+B8ALeAvtAdxOxkVakVu3U83pVNy5LgtTCkPNwlSD1trYcqLXSQ0zKm9mj9eJQ4DfKtVWeYV0x3WM+nXcQCHkpsFjZ6qQsdAoflpP0zChUxisxIU0q1M0ISn9hadFhRx58Xi8tnM7cu67KIhSlRRK6BaHP+aFRFSG8lnx18C6w3SKSPYQtzsqUyyqRGwR6V2OCUv2YembGJ3kmhAtj+seD7d+vha/BVztMADmGvQXqmqvrVe6RVtcnm7Tpb8khnvc5kPNkD0/dCI/MD1s+EEQOWO5Qqkrcqcd467w7MW5nWIaQ3bgGGNCIFE/445tqOOOstqhL/oG8EGmi/TLB8/RKdCDEMJxpobpIDPE0XRqTMOxc0ZCIGc4Qtj9m7f71jD7Vf26VL1UpXuWu/xzqXoZXP2BZB3J1viP3UC65wPSHQrIywOrh/XjZp3rKDrX7i/kAAI8UCTLdQB88vJRC8B+S9rdEADDGfhfywWd0oxB3977D9JE0uo8S2osXPy9i9dpdlE5emWE4NXk6CHHmihr/EbQEKIu9yuo4iwXTPTiDGXukZAErk/xdugqJaNtxk8YszVvdp9Hvbvc7Ui57OSdAA42HYC5rgqY1ZBR0ACX7fYBl17HRKtbvc1aPSZA63L9Um1Kb7eJu3FeDRG4od6f3YtT0hipDoISfVTFvsEpedXoty4K1+Bv5shZ0NfrhNqP/j7Zf1cuaRmxcVxw5gDoa/VTXB6OepZNd6NnnIpe85wJyVPaVkRbHJGwITVPVU1T1xjQD/jVkF9ahEyorPsNlj73mIS2WUcJGZMdOzS0rGnW53pjhXR6grEBxr7WEeSdLl32bP216RAlacxQmtCYW67SQq7ZfIj+waIfmFci+g3ZmyITHU71j4KZy2cqkKmgHhdhwG25mJtQ4f8Ugjy2T8b3gK2SUH3ENmWWyp5QOdSHq82B+Vb1XURVf9sXNu8VtH0ptOeDzanHF1kySuccHlLnAhYOr3mYrW0F0zD0tiwPnGPu5Qy2tPj+oW5d9bmQWDAI/gAu198F/PZdWfagP1g2tIq+q9fl+k4xrqLXEjjSQyiCYsO5yvpYjzMmV6sr9nqR1w+qaxwFKmjIbduEZYi9MZxry68dLKF9yHRaILRwnvUnvyQvufmJaYynNKZZfvPPLCIdohy2XZgjo3/bowtaBZdBQDIoyIdL4Mhze8KxOa+pngk4nydkjjPosqrzvJO8VbAIij+lkxt3m+otCdOob3DrwE2dfQ92vBq1pafjoZ/wM+9fqNZ66PTqLgDCqVTPNDf7BEHJhG3YZbjJW2T6p4MAAuwtrabHAFGQAN+7OxHmc1J8pH7TA9V9uyDysavpARoyejM9IDvModgxQDR7e4CYzsjatpeC7AkEME8k4fEKwPpQAP0ogKC++n68AlAb6qgADs2QbruOrhRpSL33lPV0Ju3SKfPVPXXDyKO1iwONa4B5bS5Tvj4OODWq3Su8GkrAt2Pkn9ERu26Ue2k0z27ZOsvfundXvtPQGME3FbDZjIZkErJlSFZZOgkFn
eDUuMnyF0YoTlebK8zoa35OfzaoW1/7CtyGsHHUMAdDGuKQW9M8f/upADLhCyUGmws0LOp/COslCisym5I8hpPWpvfcqQP5dlwNc/cUDffIZm24rHZ6HZJdUWInDrS7d79pMdAr/e80yKksO9GWtFBdTfjmcaakZSvqBoGAI03GpHrHcnuF3oxJp+ntqe+LlC3D5qWQsnzV8ums7JYkejInRYJOf5zssuXttXHyYuim2kCd6abaXO5AKlBcRxfdkOvZljlzncg2XBLhpteKTiYTwLj9oX2HBvKVMbcwJrBq9TXm3YrCDguITUm1kv9O0USGhYkYGJoML1tN10QONLyaXhGu5Y2AKsCy0SrA/Kf49/vBrAllV1EIZgAz5W09GPOfCcvfoLiV+Xw+8pVFJK/xfw==&lt;/diagram&gt;&lt;/mxfile&gt;"><defs/><g transform="translate(0.5,0.5)"><rect x="427" y="0" width="230" height="493" fill="#ffeccf" stroke="#808080" stroke-dasharray="3 3" pointer-events="none"/><g transform="translate(479.5,7.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="124" height="13" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 13px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 126px; white-space: nowrap; word-wrap: normal; font-weight: bold; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Notification Pipeline</div></div></foreignObject><text x="62" y="13" fill="#000000" text-anchor="middle" font-size="13px" font-family="Arial" font-weight="bold">Notification Pipeline</text></switch></g><rect x="527" y="397" width="70" height="41" fill="#ffffff" stroke="#b3b3b3" pointer-events="none"/><g transform="translate(536.5,404.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="50" height="26" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 52px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" 
style="display:inline-block;text-align:inherit;text-decoration:inherit;"><b>Receiver</b><br /><div>E-Mail</div></div></div></foreignObject><text x="25" y="19" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">[Not supported by viewer]</text></switch></g><rect x="527" y="452" width="70" height="30" fill="#ffffff" stroke="#b3b3b3" pointer-events="none"/><g transform="translate(530.5,459.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="62" height="12" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 62px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Set Notifies</div></div></foreignObject><text x="31" y="12" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">Set Notifies</text></switch></g><rect x="442" y="397" width="70" height="40" fill="#ffffff" stroke="#b3b3b3" pointer-events="none"/><g transform="translate(451.5,404.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="50" height="26" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 52px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;"><div><b>Receiver</b></div><div>Webhook<br /></div></div></div></foreignObject><text x="25" y="19" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">[Not supported by viewer]</text></switch></g><rect x="0" y="0" 
width="110" height="37" rx="2.22" ry="2.22" fill="#ffffff" stroke="#000000" pointer-events="none"/><rect x="267" y="0" width="110" height="150" fill="#fff3e6" stroke="#000000" pointer-events="none"/><g transform="translate(290.5,7.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="62" height="12" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 64px; white-space: nowrap; word-wrap: normal; font-weight: bold; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Dispatcher<br /></div></div></foreignObject><text x="31" y="12" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial" font-weight="bold">Dispatcher&lt;br&gt;</text></switch></g><rect x="443" y="54" width="70" height="45" fill="#ffffff" stroke="#3399ff" pointer-events="none"/><g transform="translate(458.5,63.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="38" height="26" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 38px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Gossip<br />Settle</div></div></foreignObject><text x="19" y="19" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">[Not supported by viewer]</text></switch></g><rect x="20" y="63" width="78" height="120" fill="#fff3e6" stroke="#000000" pointer-events="none"/><g transform="translate(48.5,116.5)"><switch><foreignObject style="overflow:visible;" 
pointer-events="all" width="20" height="12" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 22px; white-space: nowrap; word-wrap: normal; font-weight: bold; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">API</div></div></foreignObject><text x="10" y="12" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial" font-weight="bold">API</text></switch></g><rect x="7" y="6" width="110" height="37" rx="2.22" ry="2.22" fill="#ffffff" stroke="#000000" pointer-events="none"/><g transform="translate(17.5,11.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="88" height="26" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 90px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Alert Generators<br style="font-size: 12px" /><font style="font-size: 12px">(Prometheus)</font></div></div></foreignObject><text x="44" y="19" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">[Not supported by viewer]</text></switch></g><path d="M 59 43 L 59 54.88" fill="none" stroke="#4d4d4d" stroke-miterlimit="10" pointer-events="none"/><path d="M 59 61.88 L 55.5 54.88 L 62.5 54.88 Z" fill="#4d4d4d" stroke="#4d4d4d" stroke-miterlimit="10" pointer-events="none"/><path d="M 147 154.5 C 147 146.5 207 146.5 207 154.5 L 207 192.5 C 207 200.5 147 200.5 147 192.5 Z" fill="#ffffff" stroke="#000000" stroke-miterlimit="10" 
pointer-events="none"/><path d="M 147 154.5 C 147 160.5 207 160.5 207 154.5 M 147 157.5 C 147 163.5 207 163.5 207 157.5 M 147 160.5 C 147 166.5 207 166.5 207 160.5" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><g transform="translate(148.5,168.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="56" height="26" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 56px; white-space: normal; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Silence Provider</div></div></foreignObject><text x="28" y="19" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">Silence Provider</text></switch></g><rect x="443" y="159" width="70" height="30" fill="#ffffff" stroke="#b3b3b3" pointer-events="none"/><g transform="translate(455.5,166.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="44" height="12" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 44px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Silencer</div></div></foreignObject><text x="22" y="12" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">Silencer</text></switch></g><rect x="443" y="204" width="70" height="30" fill="#ffffff" stroke="#b3b3b3" pointer-events="none"/><g transform="translate(459.5,211.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" 
width="36" height="12" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 38px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Router</div></div></foreignObject><text x="18" y="12" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">Router</text></switch></g><rect x="442" y="262" width="70" height="30" fill="#ffffff" stroke="#3399ff" pointer-events="none"/><g transform="translate(464.5,269.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="24" height="12" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 24px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Wait</div></div></foreignObject><text x="12" y="12" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">Wait</text></switch></g><rect x="527" y="262" width="70" height="30" fill="#ffffff" stroke="#3399ff" pointer-events="none"/><g transform="translate(549.5,269.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="24" height="12" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 24px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" 
style="display:inline-block;text-align:inherit;text-decoration:inherit;">Wait</div></div></foreignObject><text x="12" y="12" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">Wait</text></switch></g><path d="M 478 189 L 478 195.88" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 478 202.88 L 474.5 195.88 L 481.5 195.88 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 477 337 L 477 357 L 477 332 L 477 343.88" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 477 350.88 L 473.5 343.88 L 480.5 343.88 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 433.88 322 L 367 322" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 440.88 322 L 433.88 325.5 L 433.88 318.5 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><g transform="translate(378.5,290.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="37" height="24" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 11px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; white-space: nowrap; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;background-color:#ffffff;">Already<br />sent?</div></div></foreignObject><text x="19" y="18" fill="#000000" text-anchor="middle" font-size="11px" font-family="Arial">Already&lt;br&gt;sent?</text></switch></g><rect x="442" y="352" width="70" height="30" fill="#ffffff" stroke="#b3b3b3" pointer-events="none"/><g transform="translate(462.5,359.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="28" height="12" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div 
xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 30px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Retry</div></div></foreignObject><text x="14" y="12" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">Retry</text></switch></g><path d="M 562 337 L 562 357 L 562 332 L 562 343.88" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 562 350.88 L 558.5 343.88 L 565.5 343.88 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><rect x="527" y="307" width="70" height="30" fill="#ffffff" stroke="#b3b3b3" pointer-events="none"/><g transform="translate(543.5,314.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="36" height="12" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 36px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Dedup</div></div></foreignObject><text x="18" y="12" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">Dedup</text></switch></g><rect x="527" y="352" width="70" height="30" fill="#ffffff" stroke="#b3b3b3" pointer-events="none"/><g transform="translate(547.5,359.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="28" height="12" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 
0); line-height: 1.2; vertical-align: top; width: 30px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Retry</div></div></foreignObject><text x="14" y="12" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">Retry</text></switch></g><rect x="442" y="452" width="70" height="30" fill="#ffffff" stroke="#b3b3b3" pointer-events="none"/><g transform="translate(445.5,459.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="62" height="12" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 62px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Set Notifies</div></div></foreignObject><text x="31" y="12" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">Set Notifies</text></switch></g><path d="M 307 303 C 307 295 367 295 367 303 L 367 341 C 367 349 307 349 307 341 Z" fill="#ffffff" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 307 303 C 307 309 367 309 367 303 M 307 306 C 307 312 367 312 367 306 M 307 309 C 307 315 367 315 367 309" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><g transform="translate(308.5,316.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="56" height="26" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 56px; white-space: normal; word-wrap: normal; text-align: 
center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Notify Provider</div></div></foreignObject><text x="28" y="19" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">Notify Provider</text></switch></g><path d="M 147 57 C 147 49 207 49 207 57 L 207 95 C 207 103 147 103 147 95 Z" fill="#ffffff" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 147 57 C 147 63 207 63 207 57 M 147 60 C 147 66 207 66 207 60 M 147 63 C 147 69 207 69 207 63" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><g transform="translate(148.5,70.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="56" height="26" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 56px; white-space: normal; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Alert Provider</div></div></foreignObject><text x="28" y="19" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">Alert Provider</text></switch></g><path d="M 177 51 L 177 42 L 271.88 42" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 278.88 42 L 271.88 45.5 L 271.88 38.5 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><g transform="translate(210.5,43.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="49" height="11" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 11px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; white-space: nowrap; 
text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;background-color:#ffffff;">Subscribe</div></div></foreignObject><text x="25" y="11" fill="#000000" text-anchor="middle" font-size="11px" font-family="Arial">Subscribe</text></switch></g><path d="M 207 174 L 434.88 174" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 441.88 174 L 434.88 177.5 L 434.88 170.5 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 367 77 L 434.88 77" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 441.88 77 L 434.88 80.5 L 434.88 73.5 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 367 110 L 405 110 L 405 88 L 434.88 88" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 441.88 88 L 434.88 91.5 L 434.88 84.5 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 477 292 L 477 298.88" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 477 305.88 L 473.5 298.88 L 480.5 298.88 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 562 292 L 562 298.88" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 562 305.88 L 558.5 298.88 L 565.5 298.88 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 477 382 L 477 388.88" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 477 395.88 L 473.5 388.88 L 480.5 388.88 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 562 382 L 562 388.88" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 562 395.88 L 558.5 388.88 L 565.5 388.88 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" 
pointer-events="none"/><rect x="443" y="114" width="70" height="30" fill="#ffffff" stroke="#b3b3b3" pointer-events="none"/><g transform="translate(456.5,121.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="42" height="12" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 44px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Inhibitor</div></div></foreignObject><text x="21" y="12" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">Inhibitor</text></switch></g><rect x="130.5" y="282.5" width="93" height="79" fill="#fff3e6" stroke="#3399ff" pointer-events="none"/><g transform="translate(156.5,290.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="41" height="12" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 43px; white-space: nowrap; word-wrap: normal; font-weight: bold; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Cluster<br /></div></div></foreignObject><text x="21" y="12" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial" font-weight="bold">Cluster&lt;br&gt;</text></switch></g><path d="M 159.25 319.5 C 146.65 319.5 143.5 331 153.58 333.3 C 143.5 338.36 154.84 349.4 163.03 344.8 C 168.7 354 187.6 354 193.9 344.8 C 206.5 344.8 206.5 335.6 198.63 331 C 206.5 321.8 193.9 312.6 182.88 317.2 C 175 310.3 162.4 310.3 159.25 319.5 Z" fill="#ffffff" stroke="#000000" 
stroke-miterlimit="10" pointer-events="none"/><g transform="translate(160.5,325.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="29" height="11" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 11px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 29px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Peers</div></div></foreignObject><text x="15" y="11" fill="#000000" text-anchor="middle" font-size="11px" font-family="Arial">Peers</text></switch></g><path d="M 177 274.88 L 177 207.12" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 177 281.88 L 173.5 274.88 L 180.5 274.88 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 177 200.12 L 180.5 207.12 L 173.5 207.12 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 298.88 322 L 232.12 322" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 305.88 322 L 298.88 325.5 L 298.88 318.5 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 225.12 322 L 232.12 318.5 L 232.12 325.5 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 442 467 L 337 467 L 337 355.12" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 337 348.12 L 340.5 355.12 L 333.5 355.12 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 98 76 L 138.88 76" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 145.88 76 L 138.88 79.5 L 138.88 72.5 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><g 
transform="translate(100.5,77.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="26" height="11" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 11px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; white-space: nowrap; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;background-color:#ffffff;">Store</div></div></foreignObject><text x="13" y="11" fill="#000000" text-anchor="middle" font-size="11px" font-family="Arial">Store</text></switch></g><path d="M 97 174 L 138.88 174" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 145.88 174 L 138.88 177.5 L 138.88 170.5 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><g transform="translate(100.5,159.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="26" height="11" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 11px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; white-space: nowrap; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;background-color:#ffffff;">Store</div></div></foreignObject><text x="13" y="11" fill="#000000" text-anchor="middle" font-size="11px" font-family="Arial">Store</text></switch></g><g transform="translate(189.5,234.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="42" height="11" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 11px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; 
vertical-align: top; white-space: nowrap;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Silences</div></div></foreignObject><text x="21" y="11" fill="#000000" text-anchor="middle" font-size="11px" font-family="Arial">Silences</text></switch></g><g transform="translate(242.5,325.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="54" height="24" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 11px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; white-space: nowrap; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Notification<br />Logs</div></div></foreignObject><text x="27" y="18" fill="#000000" text-anchor="middle" font-size="11px" font-family="Arial">Notification&lt;br&gt;Logs</text></switch></g><path d="M 478 99 L 478 105.88" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 478 112.88 L 474.5 105.88 L 481.5 105.88 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 478 144 L 478 150.88" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 478 157.88 L 474.5 150.88 L 481.5 150.88 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 478 234 L 477 234 L 477 253.88" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 477 260.88 L 473.5 253.88 L 480.5 253.88 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 478 242 L 562 242 L 562 253.88" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 562 260.88 L 558.5 253.88 L 565.5 253.88 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" 
pointer-events="none"/><path d="M 537 242 L 627 242 L 627 253.88" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 627 260.88 L 623.5 253.88 L 630.5 253.88 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><rect x="442" y="307" width="70" height="30" fill="#ffffff" stroke="#b3b3b3" pointer-events="none"/><g transform="translate(458.5,314.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="36" height="12" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 36px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Dedup</div></div></foreignObject><text x="18" y="12" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">Dedup</text></switch></g><rect x="1" y="458" width="176" height="30" rx="4.5" ry="4.5" fill="#ffffff" stroke="#3399ff" pointer-events="none"/><g transform="translate(34.5,467.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="108" height="11" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 11px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 110px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">High Availability mode<br /></div></div></foreignObject><text x="54" y="11" fill="#000000" text-anchor="middle" font-size="11px" font-family="Arial">High Availability mode&lt;br&gt;</text></switch></g><rect x="280" y="30" width="87" 
height="24" fill="#ffffff" stroke="#999999" pointer-events="none"/><g transform="translate(295.5,35.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="55" height="12" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 12px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 57px; white-space: nowrap; word-wrap: normal; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Aggregate</div></div></foreignObject><text x="28" y="12" fill="#000000" text-anchor="middle" font-size="12px" font-family="Arial">Aggregate</text></switch></g><rect x="307" y="63" width="60" height="28" fill="#ffffff" stroke="#999999" pointer-events="none"/><g transform="translate(311.5,71.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="28" height="10" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 10px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 29px; white-space: nowrap; word-wrap: normal;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Group</div></div></foreignObject><text x="14" y="10" fill="#000000" text-anchor="middle" font-size="10px" font-family="Arial">Group</text></switch></g><path d="M 489 397 L 489 390.12" fill="none" stroke="#cc0000" stroke-miterlimit="10" pointer-events="none"/><path d="M 489 383.12 L 492.5 390.12 L 485.5 390.12 Z" fill="#cc0000" stroke="#cc0000" stroke-miterlimit="10" pointer-events="none"/><path d="M 477 437 L 477 443.88" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 477 450.88 L 473.5 443.88 L 480.5 443.88 Z" fill="#000000" 
stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 574 397 L 574 390.12" fill="none" stroke="#cc0000" stroke-miterlimit="10" pointer-events="none"/><path d="M 574 383.12 L 577.5 390.12 L 570.5 390.12 Z" fill="#cc0000" stroke="#cc0000" stroke-miterlimit="10" pointer-events="none"/><path d="M 562 438 L 562 443.88" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 562 450.88 L 558.5 443.88 L 565.5 443.88 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 356.62 80.28 C 356.77 80.41 356.79 80.65 356.64 80.83 C 356.51 80.96 356.3 81.03 356.11 80.9 L 352.62 78.21 C 352.54 78.13 352.47 78.05 352.48 77.89 L 352.48 72.03 C 352.48 71.79 352.69 71.63 352.88 71.63 C 353.13 71.63 353.28 71.86 353.28 72.03 L 353.28 77.72 Z M 353.04 84.18 C 356.98 84.18 359.67 80.86 359.67 77.51 C 359.67 73.44 356.32 70.83 353.02 70.83 C 348.64 70.83 346.34 74.62 346.34 77.32 C 346.34 81.87 349.95 84.18 353.04 84.18 Z M 352.96 85.5 C 349.02 85.5 345.1 82.48 345 77.41 C 345 73.52 348.27 69.5 352.97 69.5 C 357.1 69.5 361 72.7 361 77.54 C 361 81.84 357.53 85.5 352.96 85.5 Z" fill="#505050" stroke="none" pointer-events="none"/><rect x="307" y="96" width="60" height="28" fill="#ffffff" stroke="#999999" pointer-events="none"/><g transform="translate(311.5,104.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="28" height="10" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 10px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; width: 29px; white-space: nowrap; word-wrap: normal;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">Group</div></div></foreignObject><text x="14" y="10" fill="#000000" text-anchor="middle" font-size="10px" 
font-family="Arial">Group</text></switch></g><path d="M 356.62 113.28 C 356.77 113.41 356.79 113.65 356.64 113.83 C 356.51 113.96 356.3 114.03 356.11 113.9 L 352.62 111.21 C 352.54 111.13 352.47 111.05 352.48 110.89 L 352.48 105.03 C 352.48 104.79 352.69 104.63 352.88 104.63 C 353.13 104.63 353.28 104.86 353.28 105.03 L 353.28 110.72 Z M 353.04 117.18 C 356.98 117.18 359.67 113.86 359.67 110.51 C 359.67 106.44 356.32 103.83 353.02 103.83 C 348.64 103.83 346.34 107.62 346.34 110.32 C 346.34 114.87 349.95 117.18 353.04 117.18 Z M 352.96 118.5 C 349.02 118.5 345.1 115.48 345 110.41 C 345 106.52 348.27 102.5 352.97 102.5 C 357.1 102.5 361 105.7 361 110.54 C 361 114.84 357.53 118.5 352.96 118.5 Z" fill="#505050" stroke="none" pointer-events="none"/><path d="M 287 55 L 287 77 L 300.63 77" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 305.88 77 L 298.88 80.5 L 300.63 77 L 298.88 73.5 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 287 73 L 287 110 L 300.63 110" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 305.88 110 L 298.88 113.5 L 300.63 110 L 298.88 106.5 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 287 101 L 287 138 L 300.63 138" fill="none" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><path d="M 305.88 138 L 298.88 141.5 L 300.63 138 L 298.88 134.5 Z" fill="#000000" stroke="#000000" stroke-miterlimit="10" pointer-events="none"/><g transform="translate(331.5,129.5)rotate(90,6.5,8)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="13" height="16" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 15px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; white-space: nowrap; font-weight: bold; text-align: center;"><div 
xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">...</div></div></foreignObject><text x="7" y="16" fill="#000000" text-anchor="middle" font-size="15px" font-family="Arial" font-weight="bold">...</text></switch></g><g transform="translate(616.5,265.5)"><switch><foreignObject style="overflow:visible;" pointer-events="all" width="21" height="16" requiredFeatures="http://www.w3.org/TR/SVG11/feature#Extensibility"><div xmlns="http://www.w3.org/1999/xhtml" style="display: inline-block; font-size: 15px; font-family: Arial; color: rgb(0, 0, 0); line-height: 1.2; vertical-align: top; white-space: nowrap; font-weight: bold; text-align: center;"><div xmlns="http://www.w3.org/1999/xhtml" style="display:inline-block;text-align:inherit;text-decoration:inherit;">. . .</div></div></foreignObject><text x="11" y="16" fill="#000000" text-anchor="middle" font-size="15px" font-family="Arial" font-weight="bold">. . .</text></switch></g></g></svg>


--------------------------------------------------------------------------------
/images/alertmanager.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/alertmanager.png


--------------------------------------------------------------------------------
/images/bbox-alert.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/bbox-alert.png


--------------------------------------------------------------------------------
/images/bbox-firing.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/bbox-firing.png


--------------------------------------------------------------------------------
/images/bbox-pending.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/bbox-pending.png


--------------------------------------------------------------------------------
/images/bbox-slack.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/bbox-slack.png


--------------------------------------------------------------------------------
/images/bbox-top.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/bbox-top.png


--------------------------------------------------------------------------------
/images/cloudnative.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/cloudnative.png


--------------------------------------------------------------------------------
/images/cncf.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/cncf.png


--------------------------------------------------------------------------------
/images/grafana-datasource.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/grafana-datasource.png


--------------------------------------------------------------------------------
/images/grafana-docker.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/grafana-docker.png


--------------------------------------------------------------------------------
/images/grafana-ingress.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/grafana-ingress.png


--------------------------------------------------------------------------------
/images/grafana-language-echo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/grafana-language-echo.png


--------------------------------------------------------------------------------
/images/grafana-login.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/grafana-login.png


--------------------------------------------------------------------------------
/images/grafana-node-exporter.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/grafana-node-exporter.png


--------------------------------------------------------------------------------
/images/grafana-prometheus.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/grafana-prometheus.png


--------------------------------------------------------------------------------
/images/heartbeat.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/heartbeat.gif


--------------------------------------------------------------------------------
/images/k8s-dashboard.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/k8s-dashboard.png


--------------------------------------------------------------------------------
/images/language-echo-prometheus.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/language-echo-prometheus.png


--------------------------------------------------------------------------------
/images/logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/logo.png


--------------------------------------------------------------------------------
/images/monitoring.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/monitoring.png


--------------------------------------------------------------------------------
/images/prometheus-architecture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/prometheus-architecture.png


--------------------------------------------------------------------------------
/images/prometheus-dashboard-0.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/prometheus-dashboard-0.jpg


--------------------------------------------------------------------------------
/images/prometheus-dashboard-1.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/prometheus-dashboard-1.jpg


--------------------------------------------------------------------------------
/images/prometheus-federation.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/prometheus-federation.png


--------------------------------------------------------------------------------
/images/prometheus-logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/prometheus-logo.png


--------------------------------------------------------------------------------
/images/prometheus-operator-architecture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/prometheus-operator-architecture.png


--------------------------------------------------------------------------------
/images/remote-read.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/remote-read.png


--------------------------------------------------------------------------------
/images/remote-write.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/chenjiandongx/prometheus101/2fa738363d03c796ee86f65b616010e8ad3ba1fa/images/remote-write.png


--------------------------------------------------------------------------------