├── .gitignore
├── 00-create-eks-cluster
│   ├── 01-AmazonEKSAdminPolicy.json
│   ├── 02-demo-cluster.yaml
│   └── README.md
├── 01-deploy-sample-application
│   ├── 01-deployment.yaml
│   ├── 02-service.yaml
│   ├── 03-update-deployment.yaml
│   └── README.md
├── 02-deploy-prometheus
│   ├── README.md
│   └── values.yaml
├── 03-deploy-grafana
│   ├── README.md
│   └── values.yaml
├── 04-deploy-argocd
│   ├── README.md
│   └── applications
│       ├── grafana
│       │   └── grafana.yaml
│       └── prometheus
│           └── prometheus.yaml
├── 05-deploy-EFK
│   ├── README.md
│   ├── elasticsearch.yaml
│   ├── filebeat.yaml
│   └── kibana.yaml
├── 06-deploy-keda
│   └── README.md
└── README.md
/.gitignore:
--------------------------------------------------------------------------------
1 | kubeconfig_*
2 | .idea
3 | .DS_Store
--------------------------------------------------------------------------------
/00-create-eks-cluster/01-AmazonEKSAdminPolicy.json:
--------------------------------------------------------------------------------
1 | {
2 | "Version": "2012-10-17",
3 | "Statement": [
4 | {
5 | "Effect": "Allow",
6 | "Action": [
7 | "eks:*"
8 | ],
9 | "Resource": "*"
10 | },
11 | {
12 | "Effect": "Allow",
13 | "Action": "iam:PassRole",
14 | "Resource": "*",
15 | "Condition": {
16 | "StringEquals": {
17 | "iam:PassedToService": "eks.amazonaws.com"
18 | }
19 | }
20 | }
21 | ]
22 | }
23 |
--------------------------------------------------------------------------------
/00-create-eks-cluster/02-demo-cluster.yaml:
--------------------------------------------------------------------------------
1 | ---
2 | apiVersion: eksctl.io/v1alpha5
3 | kind: ClusterConfig
4 | metadata:
5 | name: my-demo-cluster
6 | region: us-west-2
7 | addons:
8 | - name: vpc-cni
9 | version: latest
10 | resolveConflicts: overwrite
11 | - name: coredns
12 | version: latest
13 | configurationValues: "{\"replicaCount\":3}"
14 | resolveConflicts: overwrite
15 | - name: aws-ebs-csi-driver
16 | version: latest
17 | resolveConflicts: overwrite
18 | - name: kube-proxy
19 | version: latest
20 | resolveConflicts: overwrite
21 | managedNodeGroups:
22 | - name: my-demo-workers
23 | labels: { role: workers }
24 | instanceType: t3.large
25 | volumeSize: 100
26 | privateNetworking: true
27 | desiredCapacity: 2
28 | minSize: 1
29 | maxSize: 4
--------------------------------------------------------------------------------
/00-create-eks-cluster/README.md:
--------------------------------------------------------------------------------
1 | # deploy-eks-cluster
2 |
3 | This example is based on eksctl, a simple CLI tool for creating and managing clusters on EKS. EKS clusters can also be deployed and managed with a number of other solutions, including Terraform, CloudFormation, the AWS Console, and the AWS CLI.
4 |
5 | ## Prerequisites
6 |
7 | - An active AWS account
8 | - VPC - eksctl creates a new VPC named eksctl-my-demo-cluster-cluster/VPC in the target region (if you need to use a custom VPC configuration, refer to [this link](https://eksctl.io/usage/creating-and-managing-clusters/#:~:text=If%20you%20needed%20to%20use%20an%20existing%20VPC%2C%20you%20can%20use%20a%20config%20file%20like%20this%3A))
9 | - IAM permissions – The IAM security principal that you're using must have permissions to work with Amazon EKS IAM roles and service-linked roles, AWS CloudFormation, and a VPC and related resources.
10 | - Install [kubectl](https://kubernetes.io/docs/tasks/tools/), [eksctl](https://eksctl.io/introduction/?h=install#installation) and the [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) on your local machine or in your CI/CD setup (you can verify the installations with the commands shown below)
11 |
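A quick check that the tools are available on your PATH (version output will differ from setup to setup):

    kubectl version --client
    eksctl version
    aws --version
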
12 | ### IAM Setup
13 |
14 | - For this setup, create an IAM policy named AmazonEKSAdminPolicy with the policy details in `01-AmazonEKSAdminPolicy.json` and attach the policy to the principal creating the cluster (for example, with the AWS CLI as sketched below).
15 |
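A minimal sketch using the AWS CLI; the IAM user name and account ID are placeholders for your own principal:

    # Create the policy from the JSON document in this folder
    aws iam create-policy \
      --policy-name AmazonEKSAdminPolicy \
      --policy-document file://01-AmazonEKSAdminPolicy.json

    # Attach it to the IAM user creating the cluster (use attach-role-policy for a role)
    aws iam attach-user-policy \
      --user-name <your-iam-user> \
      --policy-arn arn:aws:iam::<account-id>:policy/AmazonEKSAdminPolicy
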
16 | ### Creating an EKS cluster
17 |
18 | The eksctl tool uses CloudFormation under the hood, creating one stack for the EKS control plane and another stack for the worker nodes. Refer to [eksctl](https://eksctl.io/introduction/) for all available configuration options.
19 |
20 | Run the command below to create a new cluster in the `us-west-2` region (expect this to take around 20 minutes); refer to the eksctl documentation for all available options to customize the cluster configuration.
21 |
22 | eksctl create cluster \
23 | --version 1.23 \
24 | --region us-west-2 \
25 | --node-type t3.medium \
26 | --nodes 3 \
27 | --nodes-min 1 \
28 | --nodes-max 4 \
29 | --name my-demo-cluster
30 |
31 | As an alternative, you can also use a YAML config file, which acts as a sort of DSL (domain-specific language) for describing EKS clusters; see `02-demo-cluster.yaml` in this folder.
32 |
33 | Run the below command to create the cluster (expect this to take around 20 minutes):
34 |
35 | eksctl create cluster -f 02-demo-cluster.yaml
36 |
37 | Sample log:
38 |
39 | ```
40 | ❯❯ eksctl create cluster -f 02-demo-cluster.yaml
41 | 2023-01-15 10:52:38 [ℹ] eksctl version 0.124.0-dev+ac917eb50.2022-12-23T08:05:44Z
42 | 2023-01-15 10:52:38 [ℹ] using region us-west-2
43 | 2023-01-15 10:52:39 [ℹ] setting availability zones to [us-west-2c us-west-2a us-west-2b]
44 | 2023-01-15 10:52:39 [ℹ] subnets for us-west-2c - public:192.168.0.0/19 private:192.168.96.0/19
45 | 2023-01-15 10:52:39 [ℹ] subnets for us-west-2a - public:192.168.32.0/19 private:192.168.128.0/19
46 | 2023-01-15 10:52:39 [ℹ] subnets for us-west-2b - public:192.168.64.0/19 private:192.168.160.0/19
47 | 2023-01-15 10:52:40 [ℹ] nodegroup "my-demo-workers" will use "ami-0d453cab46e7202b2" [AmazonLinux2/1.23]
48 | 2023-01-15 10:52:40 [ℹ] using Kubernetes version 1.23
49 | 2023-01-15 10:52:40 [ℹ] creating EKS cluster "my-demo-cluster" in "us-west-2" region with un-managed nodes
50 | 2023-01-15 10:52:40 [ℹ] 1 nodegroup (my-demo-workers) was included (based on the include/exclude rules)
51 | 2023-01-15 10:52:40 [ℹ] will create a CloudFormation stack for cluster itself and 1 nodegroup stack(s)
52 | 2023-01-15 10:52:40 [ℹ] will create a CloudFormation stack for cluster itself and 0 managed nodegroup stack(s)
53 | 2023-01-15 10:52:40 [ℹ] if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=us-west-2 --cluster=my-demo-cluster'
54 | 2023-01-15 10:52:40 [ℹ] Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "my-demo-cluster" in "us-west-2"
55 | 2023-01-15 10:52:40 [ℹ] CloudWatch logging will not be enabled for cluster "my-demo-cluster" in "us-west-2"
56 | 2023-01-15 10:52:40 [ℹ] you can enable it with 'eksctl utils update-cluster-logging --enable-types={SPECIFY-YOUR-LOG-TYPES-HERE (e.g. all)} --region=us-west-2 --cluster=my-demo-cluster'
57 | 2023-01-15 10:52:40 [ℹ]
58 | 2 sequential tasks: { create cluster control plane "my-demo-cluster",
59 | 2 sequential sub-tasks: {
60 | wait for control plane to become ready,
61 | create nodegroup "my-demo-workers",
62 | }
63 | }
64 | 2023-01-15 10:52:40 [ℹ] building cluster stack "eksctl-my-demo-cluster-cluster"
65 | 2023-01-15 10:52:42 [ℹ] deploying stack "eksctl-my-demo-cluster-cluster"
66 | 2023-01-15 10:53:12 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-cluster"
67 | 2023-01-15 10:53:44 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-cluster"
68 | 2023-01-15 10:54:45 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-cluster"
69 | 2023-01-15 10:55:46 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-cluster"
70 | 2023-01-15 10:56:47 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-cluster"
71 | 2023-01-15 10:57:48 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-cluster"
72 | 2023-01-15 10:58:49 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-cluster"
73 | 2023-01-15 10:59:50 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-cluster"
74 | 2023-01-15 11:00:51 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-cluster"
75 | 2023-01-15 11:01:53 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-cluster"
76 | 2023-01-15 11:02:54 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-cluster"
77 | 2023-01-15 11:03:55 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-cluster"
78 | 2023-01-15 11:06:02 [ℹ] building nodegroup stack "eksctl-my-demo-cluster-nodegroup-my-demo-workers"
79 | 2023-01-15 11:06:04 [ℹ] deploying stack "eksctl-my-demo-cluster-nodegroup-my-demo-workers"
80 | 2023-01-15 11:06:04 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-nodegroup-my-demo-workers"
81 | 2023-01-15 11:06:35 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-nodegroup-my-demo-workers"
82 | 2023-01-15 11:07:11 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-nodegroup-my-demo-workers"
83 | 2023-01-15 11:07:50 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-nodegroup-my-demo-workers"
84 | 2023-01-15 11:08:36 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-nodegroup-my-demo-workers"
85 | 2023-01-15 11:09:15 [ℹ] waiting for CloudFormation stack "eksctl-my-demo-cluster-nodegroup-my-demo-workers"
86 | 2023-01-15 11:09:15 [ℹ] waiting for the control plane to become ready
87 | 2023-01-15 11:09:16 [✔] saved kubeconfig as "/Users/chimbu/.kube/config"
88 | 2023-01-15 11:09:16 [ℹ] no tasks
89 | 2023-01-15 11:09:16 [✔] all EKS cluster resources for "my-demo-cluster" have been created
90 | 2023-01-15 11:09:17 [ℹ] adding identity "arn:aws:iam::317630533282:role/eksctl-my-demo-cluster-nodegroup-NodeInstanceRole-14J48FWWCMCO7" to auth ConfigMap
91 | 2023-01-15 11:09:17 [ℹ] nodegroup "my-demo-workers" has 0 node(s)
92 | 2023-01-15 11:09:17 [ℹ] waiting for at least 1 node(s) to become ready in "my-demo-workers"
93 | 2023-01-15 11:10:05 [ℹ] nodegroup "my-demo-workers" has 4 node(s)
94 | 2023-01-15 11:10:05 [ℹ] node "ip-192-168-51-22.us-west-2.compute.internal" is ready
95 | 2023-01-15 11:10:05 [ℹ] node "ip-192-168-62-41.us-west-2.compute.internal" is not ready
96 | 2023-01-15 11:10:05 [ℹ] node "ip-192-168-8-29.us-west-2.compute.internal" is not ready
97 | 2023-01-15 11:10:05 [ℹ] node "ip-192-168-84-235.us-west-2.compute.internal" is not ready
98 | 2023-01-15 11:10:07 [ℹ] kubectl command should work with "/Users/chimbu/.kube/config", try 'kubectl get nodes'
99 | 2023-01-15 11:10:07 [✔] EKS cluster "my-demo-cluster" in "us-west-2" region is ready
100 |
101 | ```
102 |
103 |
104 |
109 | eksctl automatically updates your kubeconfig with the new cluster's configuration. Run the command below to verify cluster connectivity:
110 |
111 |     kubectl get pods --all-namespaces
112 |
113 | Sample output:
114 |
115 | ```
116 | ❯❯ kubectl get pods --all-namespaces
117 | NAMESPACE NAME READY STATUS RESTARTS AGE
118 | default nginx 1/1 Running 0 36h
119 | elastic-system elastic-operator-0 1/1 Running 0 36h
120 | kube-system aws-node-26f7k 1/1 Running 0 37h
121 | kube-system aws-node-8x2fh 1/1 Running 0 37h
122 | kube-system aws-node-nsqjc 1/1 Running 0 37h
123 | kube-system coredns-57ff979f67-m2hlh 1/1 Running 0 37h
124 | kube-system coredns-57ff979f67-qvxqx 1/1 Running 0 37h
125 | kube-system ebs-csi-controller-6d4b84cd85-kfjz4 6/6 Running 0 36h
126 | kube-system ebs-csi-controller-6d4b84cd85-vvjlf 6/6 Running 0 36h
127 | kube-system ebs-csi-node-d9hkt 3/3 Running 0 36h
128 | kube-system ebs-csi-node-pv688 3/3 Running 0 36h
129 | kube-system ebs-csi-node-wmmq4 3/3 Running 0 36h
130 | kube-system kube-proxy-9hxsh 1/1 Running 0 37h
131 | kube-system kube-proxy-9jlqz 1/1 Running 0 37h
132 | kube-system kube-proxy-dgtgv 1/1 Running 0 37h
133 | ```
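
If you ever need to regenerate the kubeconfig for this cluster (for example, on another machine), the AWS CLI can do that as well; the region and cluster name below match `02-demo-cluster.yaml`:

    aws eks update-kubeconfig --region us-west-2 --name my-demo-cluster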
134 |
135 | When you are finished with the cluster, run the command below to destroy it (expect this to take around 20 minutes):
136 |
137 |     eksctl delete cluster -f 02-demo-cluster.yaml
138 |
--------------------------------------------------------------------------------
/01-deploy-sample-application/01-deployment.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: apps/v1
2 | kind: Deployment
3 | metadata:
4 | name: eks-sample-deployment
5 | namespace: eks-sample-app
6 | labels:
7 | app: eks-sample-app
8 | spec:
9 | replicas: 3
10 | selector:
11 | matchLabels:
12 | app: eks-sample-app
13 | template:
14 | metadata:
15 | labels:
16 | app: eks-sample-app
17 | spec:
18 | containers:
19 | - name: eks-sample-app
20 | image: nginx:1.22.1
21 | ports:
22 | - name: http
23 | containerPort: 80
--------------------------------------------------------------------------------
/01-deploy-sample-application/02-service.yaml:
--------------------------------------------------------------------------------
1 | ---
2 | apiVersion: v1
3 | kind: Service
4 | metadata:
5 | name: eks-sample-service
6 | namespace: eks-sample-app
7 | labels:
8 | app: eks-sample-app
9 | spec:
10 | type: LoadBalancer
11 | selector:
12 | app: eks-sample-app
13 | ports:
14 | - protocol: TCP
15 | port: 80
16 | targetPort: 80
17 |
--------------------------------------------------------------------------------
/01-deploy-sample-application/03-update-deployment.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: apps/v1
2 | kind: Deployment
3 | metadata:
4 | name: eks-sample-deployment
5 | namespace: eks-sample-app
6 | labels:
7 | app: eks-sample-app
8 | spec:
9 | replicas: 3
10 | selector:
11 | matchLabels:
12 | app: eks-sample-app
13 | template:
14 | metadata:
15 | labels:
16 | app: eks-sample-app
17 | spec:
18 | containers:
19 | - name: eks-sample-app
20 | image: nginx:1.23.3
21 | ports:
22 | - name: http
23 | containerPort: 80
--------------------------------------------------------------------------------
/01-deploy-sample-application/README.md:
--------------------------------------------------------------------------------
1 | # deploy-sample-application
2 |
3 | Follow these instructions after the EKS cluster setup is complete. We will deploy a few core Kubernetes objects and provision an external load balancer to access the application from outside the EKS cluster.
4 |
5 | ## Prerequisites
6 |
7 | - An active EKS cluster, with kubectl configured to point at the correct cluster
8 | - This repo cloned to your local machine, with the current working directory set to the cloned repo
9 |
10 | ## Create a Namespace
11 |
12 | In Kubernetes, namespaces provide a mechanism for isolating groups of resources within a single cluster. Names of resources need to be unique within a namespace, but not across namespaces. Run the command below to create the namespace:
13 |
14 | kubectl create namespace eks-sample-app
15 |
16 | ## Create a Kubernetes deployment
17 |
18 | A Kubernetes Deployment tells Kubernetes how to create or modify instances of the pods that hold a containerized application. Deployments can help to efficiently scale the number of replica pods, enable the rollout of updated code in a controlled manner, or roll back to an earlier deployment version if necessary. To learn more, see Deployments in the Kubernetes documentation.
19 |
20 | Apply the deployment manifest to your cluster.
21 |
22 | kubectl apply -f 01-deployment.yaml
23 |
24 | Review the deployment configurations.
25 |
26 | kubectl describe deployments.apps --namespace eks-sample-app eks-sample-deployment
27 |
28 |
29 | ## Create a service
30 |
31 | A service allows you to access all replicas through a single IP address or name. For more information, see Service in the Kubernetes documentation.
32 |
33 | There are different types of Service objects, and the one we want to use for testing is LoadBalancer, which means an external load balancer. Amazon EKS supports the LoadBalancer type using Elastic Load Balancing (by default, a Classic Load Balancer). EKS automatically provisions and de-provisions an ELB when we create and destroy Service objects.
34 |
35 | Apply the service manifest to your cluster.
36 |
37 | kubectl apply -f 02-service.yaml
38 |
39 | View all resources that exist in the eks-sample-app namespace.
40 |
41 | kubectl get all --namespace eks-sample-app
42 |
43 |
44 |
45 | You can see that AWS automatically provisioned an external load balancer for the LoadBalancer service, and you can access the application from outside the cluster using the DNS name shown under the EXTERNAL-IP field (see the commands below for a quick way to fetch and test it).
46 |
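A quick way to fetch the load balancer's DNS name and test it (service and namespace names as created above; the ELB can take a minute or two to become reachable):

    kubectl get service eks-sample-service --namespace eks-sample-app \
      -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'

    curl http://$(kubectl get service eks-sample-service --namespace eks-sample-app \
      -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')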
47 |
48 |
49 |
50 | ## Deploy a new application version
51 |
52 | In Kubernetes, you can easily roll out a new version of an existing Deployment by updating the image details.
53 |
54 | Apply `03-update-deployment.yaml` deployment manifest to your cluster.
55 |
56 | kubectl apply -f 03-update-deployment.yaml
57 |
58 | By default, Kubernetes performs a rolling update to minimize downtime during the upgrade: it creates a new ReplicaSet and gradually replaces the old pods with new ones.
59 | Review the deployment configuration and verify the new image details:
60 |
61 | kubectl describe deployments.apps --namespace eks-sample-app eks-sample-deployment
62 |
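You can also watch the rollout progress directly; a minimal check using the deployment and namespace created above:

    kubectl rollout status deployment/eks-sample-deployment --namespace eks-sample-app
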
63 | Once you're finished with the sample application, you can remove the sample namespace, service, and deployment with the following command.
64 |
65 | kubectl delete namespace eks-sample-app
66 |
--------------------------------------------------------------------------------
/02-deploy-prometheus/README.md:
--------------------------------------------------------------------------------
1 | # Deploy prometheus
2 |
3 | [Prometheus](https://prometheus.io/), a [Cloud Native Computing Foundation](https://cncf.io/) project, is a popular open-source monitoring and alerting solution optimized for container environments.
4 |
5 | It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true.
6 |
7 | Follow the instructions in this document to deploy a self-managed Prometheus in an EKS cluster; the instructions are based on the [Prometheus Community Kubernetes Helm Charts](https://github.com/prometheus-community/helm-charts).
8 |
9 | If you are looking for a fully managed prometheus offering then please refer to [Amazon Managed Service for Prometheus](https://aws.amazon.com/prometheus/).
10 |
11 | ## Prerequisites
12 |
13 | - Kubernetes 1.22+
14 | - Helm 3.9+
15 |
16 | ## Get Repository Info
17 |
18 | ```console
19 | helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
20 | helm repo update
21 | ```
22 |
23 | ## Install/Upgrade prometheus with default values
24 |
25 | ```console
26 | helm upgrade --install [RELEASE_NAME] prometheus-community/prometheus --namespace [K8S_NAMESPACE] --create-namespace --wait --debug
27 | ```
28 |
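For example, with an illustrative release name of `prometheus` in a `monitoring` namespace:

```console
helm upgrade --install prometheus prometheus-community/prometheus --namespace monitoring --create-namespace --wait --debug
```
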
29 | By default this chart installs additional, dependent charts:
30 |
31 | - [alertmanager](https://github.com/prometheus-community/helm-charts/tree/main/charts/alertmanager)
32 | - [kube-state-metrics](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-state-metrics)
33 | - [prometheus-node-exporter](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-node-exporter)
34 | - [prometheus-pushgateway](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-pushgateway)
35 |
36 | Run the following command to install prometheus without any additional add-ons
37 |
38 | ```console
39 | helm upgrade --install [RELEASE_NAME] prometheus-community/prometheus --set alertmanager.enabled=false --set kube-state-metrics.enabled=false --set prometheus-node-exporter.enabled=false --set prometheus-pushgateway.enabled=false --namespace [K8S_NAMESPACE] --create-namespace --wait --debug
40 | ```
41 |
42 | The above commands install the latest chart version; use the `--version` argument to install a specific version of the prometheus chart.
43 |
44 | ```console
45 | helm upgrade --install [RELEASE_NAME] prometheus-community/prometheus --namespace [K8S_NAMESPACE] --version 18.0.0 --create-namespace --wait --debug
46 | ```
47 |
48 | ## Install/Upgrade prometheus with custom values
49 |
50 | - Create a `values.yaml` file with custom helm chart inputs. Refer to the `values.yaml` file in this repo for sample configurations.
51 |
52 | - Refer to the [official prometheus chart](https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus) for the most recent configuration options.
53 |
54 | Run the following command to install prometheus with custom configurations
55 |
56 | ```console
57 | helm upgrade --install [RELEASE_NAME] prometheus-community/prometheus --namespace [K8S_NAMESPACE] -f values.yaml --create-namespace --wait --debug
58 | ```
59 |
60 | ## Scraping Pod Metrics
61 |
62 | This chart uses a default configuration that causes prometheus to scrape a variety of [kubernetes resource types](https://github.com/prometheus-community/helm-charts/blob/main/charts/prometheus/values.yaml#L614), provided they have the correct annotations.
63 |
64 | In order to get prometheus to scrape pods, you must add annotations to the required pods as below:
65 |
66 | ```yaml
67 | metadata:
68 | annotations:
69 | prometheus.io/scrape: "true"
70 | prometheus.io/path: /metrics
71 | prometheus.io/port: "8080"
72 | ```
73 |
74 | You should adjust `prometheus.io/path` to match the URL path where your pod serves metrics, and set `prometheus.io/port` to the port it serves them on. Note that the values for `prometheus.io/scrape` and `prometheus.io/port` must be enclosed in double quotes.
75 |
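A minimal sketch of where these annotations sit in a Deployment's pod template (the application name and port are illustrative and assume the container exposes metrics on 8080):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: /metrics
        prometheus.io/port: "8080"
    spec:
      containers:
        - name: my-app
          image: my-app:1.0.0
          ports:
            - containerPort: 8080
```
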
76 | ## View/Query Pod Metrics
77 |
78 | This chart creates a `prometheus-server` service of type `ClusterIP`, which is accessible only inside the cluster. Change the [service type](https://github.com/prometheus-community/helm-charts/blob/main/charts/prometheus/values.yaml#L562) to `LoadBalancer` if you want to access prometheus from outside the cluster.
79 |
80 | Implement [basic-auth](https://prometheus.io/docs/guides/basic-auth/) and IP restrictions if you are exposing prometheus outside the cluster.
81 |
82 | Run the following `kubectl port-forward` command to connect to prometheus-server and go to `localhost:8080` in the browser.
83 |
84 | ```console
85 | kubectl port-forward --namespace [K8S_NAMESPACE] svc/prometheus-server 8080:80
86 | ```
87 |
88 | Query the required metrics in the Prometheus UI; a couple of example queries are sketched below.
89 |
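For example (assuming the default scrape configs shipped in this chart's values, which collect node and cAdvisor metrics):

```
# Scrape targets that are currently up, grouped by job
sum(up) by (job)

# Per-namespace CPU usage over the last 5 minutes (from cAdvisor metrics)
sum(rate(container_cpu_usage_seconds_total[5m])) by (namespace)
```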
90 |
91 |
92 |
93 |
--------------------------------------------------------------------------------
/02-deploy-prometheus/values.yaml:
--------------------------------------------------------------------------------
1 | rbac:
2 | create: true
3 |
4 | podSecurityPolicy:
5 | enabled: false
6 |
7 | imagePullSecrets:
8 | # - name: "image-pull-secret"
9 |
10 | ## Define serviceAccount names for components. Defaults to component's fully qualified name.
11 | ##
12 | serviceAccounts:
13 | server:
14 | create: true
15 | name:
16 | annotations: {}
17 |
18 | ## Monitors ConfigMap changes and POSTs to a URL
19 | ## Ref: https://github.com/jimmidyson/configmap-reload
20 | ##
21 | configmapReload:
22 | prometheus:
23 | ## If false, the configmap-reload container will not be deployed
24 | ##
25 | enabled: true
26 |
27 | ## configmap-reload container name
28 | ##
29 | name: configmap-reload
30 |
31 | ## configmap-reload container image
32 | ##
33 | image:
34 | repository: jimmidyson/configmap-reload
35 | tag: v0.8.0
36 | # When digest is set to a non-empty value, images will be pulled by digest (regardless of tag value).
37 | digest: ""
38 | pullPolicy: IfNotPresent
39 |
40 | # containerPort: 9533
41 |
42 | ## Additional configmap-reload container arguments
43 | ##
44 | extraArgs: {}
45 | ## Additional configmap-reload volume directories
46 | ##
47 | extraVolumeDirs: []
48 |
49 |
50 | ## Additional configmap-reload mounts
51 | ##
52 | extraConfigmapMounts: []
53 | # - name: prometheus-alerts
54 | # mountPath: /etc/alerts.d
55 | # subPath: ""
56 | # configMap: prometheus-alerts
57 | # readOnly: true
58 |
59 | ## Security context to be added to configmap-reload container
60 | containerSecurityContext: {}
61 |
62 | ## configmap-reload resource requests and limits
63 | ## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
64 | ##
65 | resources: {}
66 |
67 | server:
68 | ## Prometheus server container name
69 | ##
70 | name: server
71 |
72 | ## Use a ClusterRole (and ClusterRoleBinding)
73 | ## - If set to false - we define a RoleBinding in the defined namespaces ONLY
74 | ##
75 | ## NB: because we need a Role with nonResourceURL's ("/metrics") - you must get someone with Cluster-admin privileges to define this role for you, before running with this setting enabled.
76 | ## This makes prometheus work - for users who do not have ClusterAdmin privs, but wants prometheus to operate on their own namespaces, instead of clusterwide.
77 | ##
78 | ## You MUST also set namespaces to the ones you have access to and want monitored by Prometheus.
79 | ##
80 | # useExistingClusterRoleName: nameofclusterrole
81 |
82 | ## namespaces to monitor (instead of monitoring all - clusterwide). Needed if you want to run without Cluster-admin privileges.
83 | # namespaces:
84 | # - yournamespace
85 |
86 | # sidecarContainers - add more containers to prometheus server
87 | # Key/Value where Key is the sidecar `- name: `
88 | # Example:
89 | # sidecarContainers:
90 | # webserver:
91 | # image: nginx
92 | sidecarContainers: {}
93 |
94 | # sidecarTemplateValues - context to be used in template for sidecarContainers
95 | # Example:
96 | # sidecarTemplateValues: *your-custom-globals
97 | # sidecarContainers:
98 | # webserver: |-
99 | # {{ include "webserver-container-template" . }}
100 | # Template for `webserver-container-template` might looks like this:
101 | # image: "{{ .Values.server.sidecarTemplateValues.repository }}:{{ .Values.server.sidecarTemplateValues.tag }}"
102 | # ...
103 | #
104 | sidecarTemplateValues: {}
105 |
106 | ## Prometheus server container image
107 | ##
108 | image:
109 | repository: quay.io/prometheus/prometheus
110 | # if not set appVersion field from Chart.yaml is used
111 | tag: ""
112 | # When digest is set to a non-empty value, images will be pulled by digest (regardless of tag value).
113 | digest: ""
114 | pullPolicy: IfNotPresent
115 |
116 | ## prometheus server priorityClassName
117 | ##
118 | priorityClassName: ""
119 |
120 | ## EnableServiceLinks indicates whether information about services should be injected
121 | ## into pod's environment variables, matching the syntax of Docker links.
122 | ## WARNING: the field is unsupported and will be skipped in K8s prior to v1.13.0.
123 | ##
124 | enableServiceLinks: true
125 |
126 | ## The URL prefix at which the container can be accessed. Useful in the case the '-web.external-url' includes a slug
127 | ## so that the various internal URLs are still able to access as they are in the default case.
128 | ## (Optional)
129 | prefixURL: ""
130 |
131 | ## External URL which can access prometheus
132 | ## Maybe same with Ingress host name
133 | baseURL: ""
134 |
135 | ## Additional server container environment variables
136 | ##
137 | ## You specify this manually like you would a raw deployment manifest.
138 | ## This means you can bind in environment variables from secrets.
139 | ##
140 | ## e.g. static environment variable:
141 | ## - name: DEMO_GREETING
142 | ## value: "Hello from the environment"
143 | ##
144 | ## e.g. secret environment variable:
145 | ## - name: USERNAME
146 | ## valueFrom:
147 | ## secretKeyRef:
148 | ## name: mysecret
149 | ## key: username
150 | env: []
151 |
152 | # List of flags to override default parameters, e.g:
153 | # - --enable-feature=agent
154 | # - --storage.agent.retention.max-time=30m
155 | defaultFlagsOverride: []
156 |
157 | extraFlags:
158 | - web.enable-lifecycle
159 | ## web.enable-admin-api flag controls access to the administrative HTTP API which includes functionality such as
160 | ## deleting time series. This is disabled by default.
161 | # - web.enable-admin-api
162 | ##
163 |     ## storage.tsdb.no-lockfile flag controls DB locking
164 | # - storage.tsdb.no-lockfile
165 | ##
166 | ## storage.tsdb.wal-compression flag enables compression of the write-ahead log (WAL)
167 | # - storage.tsdb.wal-compression
168 |
169 | ## Path to a configuration file on prometheus server container FS
170 | configPath: /etc/config/prometheus.yml
171 |
172 | ### The data directory used by prometheus to set --storage.tsdb.path
173 | ### When empty server.persistentVolume.mountPath is used instead
174 | storagePath: ""
175 |
176 | global:
177 | ## How frequently to scrape targets by default
178 | ##
179 | scrape_interval: 1m
180 | ## How long until a scrape request times out
181 | ##
182 | scrape_timeout: 10s
183 | ## How frequently to evaluate rules
184 | ##
185 | evaluation_interval: 1m
186 | ## https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write
187 | ##
188 | remoteWrite: []
189 | ## https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_read
190 | ##
191 | remoteRead: []
192 |
193 | ## Custom HTTP headers for Liveness/Readiness/Startup Probe
194 | ##
195 | ## Useful for providing HTTP Basic Auth to healthchecks
196 | probeHeaders: []
197 |
198 | ## Additional Prometheus server container arguments
199 | ##
200 | extraArgs: {}
201 |
202 | ## Additional InitContainers to initialize the pod
203 | ##
204 | extraInitContainers: []
205 |
206 | ## Additional Prometheus server Volume mounts
207 | ##
208 | extraVolumeMounts: []
209 |
210 | ## Additional Prometheus server Volumes
211 | ##
212 | extraVolumes: []
213 |
214 | ## Additional Prometheus server hostPath mounts
215 | ##
216 | extraHostPathMounts: []
217 | # - name: certs-dir
218 | # mountPath: /etc/kubernetes/certs
219 | # subPath: ""
220 | # hostPath: /etc/kubernetes/certs
221 | # readOnly: true
222 |
223 | extraConfigmapMounts: []
224 | # - name: certs-configmap
225 | # mountPath: /prometheus
226 | # subPath: ""
227 | # configMap: certs-configmap
228 | # readOnly: true
229 |
230 | ## Additional Prometheus server Secret mounts
231 | # Defines additional mounts with secrets. Secrets must be manually created in the namespace.
232 | extraSecretMounts: []
233 | # - name: secret-files
234 | # mountPath: /etc/secrets
235 | # subPath: ""
236 | # secretName: prom-secret-files
237 | # readOnly: true
238 |
239 | ## ConfigMap override where fullname is {{.Release.Name}}-{{.Values.server.configMapOverrideName}}
240 | ## Defining configMapOverrideName will cause templates/server-configmap.yaml
241 | ## to NOT generate a ConfigMap resource
242 | ##
243 | configMapOverrideName: ""
244 |
245 | ## Extra labels for Prometheus server ConfigMap (ConfigMap that holds serverFiles)
246 | extraConfigmapLabels: {}
247 |
248 | ingress:
249 | ## If true, Prometheus server Ingress will be created
250 | ##
251 | enabled: false
252 |
253 | # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
254 | # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
255 | # ingressClassName: nginx
256 |
257 | ## Prometheus server Ingress annotations
258 | ##
259 | annotations: {}
260 | # kubernetes.io/ingress.class: nginx
261 | # kubernetes.io/tls-acme: 'true'
262 |
263 | ## Prometheus server Ingress additional labels
264 | ##
265 | extraLabels: {}
266 |
267 | ## Prometheus server Ingress hostnames with optional path
268 | ## Must be provided if Ingress is enabled
269 | ##
270 | hosts: []
271 | # - prometheus.domain.com
272 | # - domain.com/prometheus
273 |
274 | path: /
275 |
276 | # pathType is only for k8s >= 1.18
277 | pathType: Prefix
278 |
279 | ## Extra paths to prepend to every host configuration. This is useful when working with annotation based services.
280 | extraPaths: []
281 | # - path: /*
282 | # backend:
283 | # serviceName: ssl-redirect
284 | # servicePort: use-annotation
285 |
286 | ## Prometheus server Ingress TLS configuration
287 | ## Secrets must be manually created in the namespace
288 | ##
289 | tls: []
290 | # - secretName: prometheus-server-tls
291 | # hosts:
292 | # - prometheus.domain.com
293 |
294 | ## Server Deployment Strategy type
295 | strategy:
296 | type: Recreate
297 |
298 | ## hostAliases allows adding entries to /etc/hosts inside the containers
299 | hostAliases: []
300 | # - ip: "127.0.0.1"
301 | # hostnames:
302 | # - "example.com"
303 |
304 | ## Node tolerations for server scheduling to nodes with taints
305 | ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
306 | ##
307 | tolerations: []
308 | # - key: "key"
309 | # operator: "Equal|Exists"
310 | # value: "value"
311 | # effect: "NoSchedule|PreferNoSchedule|NoExecute(1.6 only)"
312 |
313 | ## Node labels for Prometheus server pod assignment
314 | ## Ref: https://kubernetes.io/docs/user-guide/node-selection/
315 | ##
316 | nodeSelector: {}
317 |
318 | ## Pod affinity
319 | ##
320 | affinity: {}
321 |
322 | ## PodDisruptionBudget settings
323 | ## ref: https://kubernetes.io/docs/concepts/workloads/pods/disruptions/
324 | ##
325 | podDisruptionBudget:
326 | enabled: false
327 | maxUnavailable: 1
328 |
329 | ## Use an alternate scheduler, e.g. "stork".
330 | ## ref: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
331 | ##
332 | # schedulerName:
333 |
334 | persistentVolume:
335 | ## If true, Prometheus server will create/use a Persistent Volume Claim
336 | ## If false, use emptyDir
337 | ##
338 | enabled: true
339 |
340 | ## Prometheus server data Persistent Volume access modes
341 | ## Must match those of existing PV or dynamic provisioner
342 | ## Ref: http://kubernetes.io/docs/user-guide/persistent-volumes/
343 | ##
344 | accessModes:
345 | - ReadWriteOnce
346 |
347 | ## Prometheus server data Persistent Volume labels
348 | ##
349 | labels: {}
350 |
351 | ## Prometheus server data Persistent Volume annotations
352 | ##
353 | annotations: {}
354 |
355 | ## Prometheus server data Persistent Volume existing claim name
356 | ## Requires server.persistentVolume.enabled: true
357 | ## If defined, PVC must be created manually before volume will be bound
358 | existingClaim: ""
359 |
360 | ## Prometheus server data Persistent Volume mount root path
361 | ##
362 | mountPath: /data
363 |
364 | ## Prometheus server data Persistent Volume size
365 | ##
366 | size: 8Gi
367 |
368 | ## Prometheus server data Persistent Volume Storage Class
369 | ## If defined, storageClassName:
370 | ## If set to "-", storageClassName: "", which disables dynamic provisioning
371 | ## If undefined (the default) or set to null, no storageClassName spec is
372 | ## set, choosing the default provisioner. (gp2 on AWS, standard on
373 | ## GKE, AWS & OpenStack)
374 | ##
375 | # storageClass: "-"
376 |
377 | ## Prometheus server data Persistent Volume Binding Mode
378 | ## If defined, volumeBindingMode:
379 | ## If undefined (the default) or set to null, no volumeBindingMode spec is
380 | ## set, choosing the default mode.
381 | ##
382 | # volumeBindingMode: ""
383 |
384 | ## Subdirectory of Prometheus server data Persistent Volume to mount
385 | ## Useful if the volume's root directory is not empty
386 | ##
387 | subPath: ""
388 |
389 | ## Persistent Volume Claim Selector
390 | ## Useful if Persistent Volumes have been provisioned in advance
391 | ## Ref: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#selector
392 | ##
393 | # selector:
394 | # matchLabels:
395 | # release: "stable"
396 | # matchExpressions:
397 | # - { key: environment, operator: In, values: [ dev ] }
398 |
399 | ## Persistent Volume Name
400 | ## Useful if Persistent Volumes have been provisioned in advance and you want to use a specific one
401 | ##
402 | # volumeName: ""
403 |
404 | emptyDir:
405 | ## Prometheus server emptyDir volume size limit
406 | ##
407 | sizeLimit: ""
408 |
409 | ## Annotations to be added to Prometheus server pods
410 | ##
411 | podAnnotations: {}
412 | # iam.amazonaws.com/role: prometheus
413 |
414 | ## Labels to be added to Prometheus server pods
415 | ##
416 | podLabels: {}
417 |
418 | ## Prometheus AlertManager configuration
419 | ##
420 | alertmanagers: []
421 |
422 | ## Specify if a Pod Security Policy for node-exporter must be created
423 | ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/
424 | ##
425 | podSecurityPolicy:
426 | annotations: {}
427 | ## Specify pod annotations
428 | ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#apparmor
429 | ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#seccomp
430 | ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#sysctl
431 | ##
432 | # seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'
433 | # seccomp.security.alpha.kubernetes.io/defaultProfileName: 'docker/default'
434 | # apparmor.security.beta.kubernetes.io/defaultProfileName: 'runtime/default'
435 |
436 | ## Use a StatefulSet if replicaCount needs to be greater than 1 (see below)
437 | ##
438 | replicaCount: 1
439 |
440 | ## Annotations to be added to deployment
441 | ##
442 | deploymentAnnotations: {}
443 |
444 | statefulSet:
445 | ## If true, use a statefulset instead of a deployment for pod management.
446 | ## This allows to scale replicas to more than 1 pod
447 | ##
448 | enabled: false
449 |
450 | annotations: {}
451 | labels: {}
452 | podManagementPolicy: OrderedReady
453 |
454 | ## Alertmanager headless service to use for the statefulset
455 | ##
456 | headless:
457 | annotations: {}
458 | labels: {}
459 | servicePort: 80
460 | ## Enable gRPC port on service to allow auto discovery with thanos-querier
461 | gRPC:
462 | enabled: false
463 | servicePort: 10901
464 | # nodePort: 10901
465 |
466 | ## Prometheus server readiness and liveness probe initial delay and timeout
467 | ## Ref: https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
468 | ##
469 | tcpSocketProbeEnabled: false
470 | probeScheme: HTTP
471 | readinessProbeInitialDelay: 30
472 | readinessProbePeriodSeconds: 5
473 | readinessProbeTimeout: 4
474 | readinessProbeFailureThreshold: 3
475 | readinessProbeSuccessThreshold: 1
476 | livenessProbeInitialDelay: 30
477 | livenessProbePeriodSeconds: 15
478 | livenessProbeTimeout: 10
479 | livenessProbeFailureThreshold: 3
480 | livenessProbeSuccessThreshold: 1
481 | startupProbe:
482 | enabled: false
483 | periodSeconds: 5
484 | failureThreshold: 30
485 | timeoutSeconds: 10
486 |
487 | ## Prometheus server resource requests and limits
488 | ## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
489 | ##
490 | resources: {}
491 | # limits:
492 | # cpu: 500m
493 | # memory: 512Mi
494 | # requests:
495 | # cpu: 500m
496 | # memory: 512Mi
497 |
498 | # Required for use in managed kubernetes clusters (such as AWS EKS) with custom CNI (such as calico),
499 | # because control-plane managed by AWS cannot communicate with pods' IP CIDR and admission webhooks are not working
500 | ##
501 | hostNetwork: false
502 |
503 | # When hostNetwork is enabled, this will set to ClusterFirstWithHostNet automatically
504 | dnsPolicy: ClusterFirst
505 |
506 | # Use hostPort
507 | # hostPort: 9090
508 |
509 | ## Vertical Pod Autoscaler config
510 | ## Ref: https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler
511 | verticalAutoscaler:
512 |     ## If true a VPA object will be created for the controller (either StatefulSet or Deployment, based on above configs)
513 | enabled: false
514 | # updateMode: "Auto"
515 | # containerPolicies:
516 | # - containerName: 'prometheus-server'
517 |
518 | # Custom DNS configuration to be added to prometheus server pods
519 | dnsConfig: {}
520 | # nameservers:
521 | # - 1.2.3.4
522 | # searches:
523 | # - ns1.svc.cluster-domain.example
524 | # - my.dns.search.suffix
525 | # options:
526 | # - name: ndots
527 | # value: "2"
528 | # - name: edns0
529 |
530 | ## Security context to be added to server pods
531 | ##
532 | securityContext:
533 | runAsUser: 65534
534 | runAsNonRoot: true
535 | runAsGroup: 65534
536 | fsGroup: 65534
537 |
538 | ## Security context to be added to server container
539 | ##
540 | containerSecurityContext: {}
541 |
542 | service:
543 | ## If false, no Service will be created for the Prometheus server
544 | ##
545 | enabled: true
546 |
547 | annotations: {}
548 | labels: {}
549 | clusterIP: ""
550 |
551 | ## List of IP addresses at which the Prometheus server service is available
552 | ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
553 | ##
554 | externalIPs: []
555 |
556 | loadBalancerIP: ""
557 | loadBalancerSourceRanges: []
558 | servicePort: 80
559 | sessionAffinity: None
560 | type: ClusterIP
561 |
562 | ## Enable gRPC port on service to allow auto discovery with thanos-querier
563 | gRPC:
564 | enabled: false
565 | servicePort: 10901
566 | # nodePort: 10901
567 |
568 | ## If using a statefulSet (statefulSet.enabled=true), configure the
569 | ## service to connect to a specific replica to have a consistent view
570 | ## of the data.
571 | statefulsetReplica:
572 | enabled: false
573 | replica: 0
574 |
575 | ## Prometheus server pod termination grace period
576 | ##
577 | terminationGracePeriodSeconds: 300
578 |
579 | ## Prometheus data retention period (default if not specified is 15 days)
580 | ##
581 | retention: "15d"
582 |
583 | ## Prometheus server ConfigMap entries for rule files (allow prometheus labels interpolation)
584 | ruleFiles: {}
585 |
586 | ## Prometheus server ConfigMap entries
587 | ##
588 | serverFiles:
589 | ## Alerts configuration
590 | ## Ref: https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
591 | alerting_rules.yml: {}
592 | # groups:
593 | # - name: Instances
594 | # rules:
595 | # - alert: InstanceDown
596 | # expr: up == 0
597 | # for: 5m
598 | # labels:
599 | # severity: page
600 | # annotations:
601 | # description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
602 | # summary: 'Instance {{ $labels.instance }} down'
603 | ## DEPRECATED DEFAULT VALUE, unless explicitly naming your files, please use alerting_rules.yml
604 | alerts: {}
605 |
606 | ## Records configuration
607 | ## Ref: https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/
608 | recording_rules.yml: {}
609 | ## DEPRECATED DEFAULT VALUE, unless explicitly naming your files, please use recording_rules.yml
610 | rules: {}
611 |
612 | prometheus.yml:
613 | rule_files:
614 | - /etc/config/recording_rules.yml
615 | - /etc/config/alerting_rules.yml
616 | ## Below two files are DEPRECATED will be removed from this default values file
617 | - /etc/config/rules
618 | - /etc/config/alerts
619 |
620 | scrape_configs:
621 | - job_name: prometheus
622 | static_configs:
623 | - targets:
624 | - localhost:9090
625 |
626 | # A scrape configuration for running Prometheus on a Kubernetes cluster.
627 | # This uses separate scrape configs for cluster components (i.e. API server, node)
628 | # and services to allow each to use different authentication configs.
629 | #
630 | # Kubernetes labels will be added as Prometheus labels on metrics via the
631 | # `labelmap` relabeling action.
632 |
633 | # Scrape config for API servers.
634 | #
635 | # Kubernetes exposes API servers as endpoints to the default/kubernetes
636 | # service so this uses `endpoints` role and uses relabelling to only keep
637 | # the endpoints associated with the default/kubernetes service using the
638 | # default named port `https`. This works for single API server deployments as
639 | # well as HA API server deployments.
640 | - job_name: 'kubernetes-apiservers'
641 |
642 | kubernetes_sd_configs:
643 | - role: endpoints
644 |
645 | # Default to scraping over https. If required, just disable this or change to
646 | # `http`.
647 | scheme: https
648 |
649 | # This TLS & bearer token file config is used to connect to the actual scrape
650 | # endpoints for cluster components. This is separate to discovery auth
651 | # configuration because discovery & scraping are two separate concerns in
652 | # Prometheus. The discovery auth config is automatic if Prometheus runs inside
653 | # the cluster. Otherwise, more config options have to be provided within the
654 |         # <kubernetes_sd_config>.
655 | tls_config:
656 | ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
657 | # If your node certificates are self-signed or use a different CA to the
658 | # master CA, then disable certificate verification below. Note that
659 | # certificate verification is an integral part of a secure infrastructure
660 | # so this should only be disabled in a controlled environment. You can
661 | # disable certificate verification by uncommenting the line below.
662 | #
663 | insecure_skip_verify: true
664 | bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
665 |
666 | # Keep only the default/kubernetes service endpoints for the https port. This
667 | # will add targets for each API server which Kubernetes adds an endpoint to
668 | # the default/kubernetes service.
669 | relabel_configs:
670 | - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
671 | action: keep
672 | regex: default;kubernetes;https
673 |
674 | - job_name: 'kubernetes-nodes'
675 |
676 | # Default to scraping over https. If required, just disable this or change to
677 | # `http`.
678 | scheme: https
679 |
680 | # This TLS & bearer token file config is used to connect to the actual scrape
681 | # endpoints for cluster components. This is separate to discovery auth
682 | # configuration because discovery & scraping are two separate concerns in
683 | # Prometheus. The discovery auth config is automatic if Prometheus runs inside
684 | # the cluster. Otherwise, more config options have to be provided within the
685 |         # <kubernetes_sd_config>.
686 | tls_config:
687 | ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
688 | # If your node certificates are self-signed or use a different CA to the
689 | # master CA, then disable certificate verification below. Note that
690 | # certificate verification is an integral part of a secure infrastructure
691 | # so this should only be disabled in a controlled environment. You can
692 | # disable certificate verification by uncommenting the line below.
693 | #
694 | insecure_skip_verify: true
695 | bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
696 |
697 | kubernetes_sd_configs:
698 | - role: node
699 |
700 | relabel_configs:
701 | - action: labelmap
702 | regex: __meta_kubernetes_node_label_(.+)
703 | - target_label: __address__
704 | replacement: kubernetes.default.svc:443
705 | - source_labels: [__meta_kubernetes_node_name]
706 | regex: (.+)
707 | target_label: __metrics_path__
708 | replacement: /api/v1/nodes/$1/proxy/metrics
709 |
710 |
711 | - job_name: 'kubernetes-nodes-cadvisor'
712 |
713 | # Default to scraping over https. If required, just disable this or change to
714 | # `http`.
715 | scheme: https
716 |
717 | # This TLS & bearer token file config is used to connect to the actual scrape
718 | # endpoints for cluster components. This is separate to discovery auth
719 | # configuration because discovery & scraping are two separate concerns in
720 | # Prometheus. The discovery auth config is automatic if Prometheus runs inside
721 | # the cluster. Otherwise, more config options have to be provided within the
722 |         # <kubernetes_sd_config>.
723 | tls_config:
724 | ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
725 | # If your node certificates are self-signed or use a different CA to the
726 | # master CA, then disable certificate verification below. Note that
727 | # certificate verification is an integral part of a secure infrastructure
728 | # so this should only be disabled in a controlled environment. You can
729 | # disable certificate verification by uncommenting the line below.
730 | #
731 | insecure_skip_verify: true
732 | bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
733 |
734 | kubernetes_sd_configs:
735 | - role: node
736 |
737 | # This configuration will work only on kubelet 1.7.3+
738 | # As the scrape endpoints for cAdvisor have changed
739 | # if you are using older version you need to change the replacement to
740 | # replacement: /api/v1/nodes/$1:4194/proxy/metrics
741 | # more info here https://github.com/coreos/prometheus-operator/issues/633
742 | relabel_configs:
743 | - action: labelmap
744 | regex: __meta_kubernetes_node_label_(.+)
745 | - target_label: __address__
746 | replacement: kubernetes.default.svc:443
747 | - source_labels: [__meta_kubernetes_node_name]
748 | regex: (.+)
749 | target_label: __metrics_path__
750 | replacement: /api/v1/nodes/$1/proxy/metrics/cadvisor
751 |
752 | # Metric relabel configs to apply to samples before ingestion.
753 | # [Metric Relabeling](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#metric_relabel_configs)
754 | # metric_relabel_configs:
755 | # - action: labeldrop
756 | # regex: (kubernetes_io_hostname|failure_domain_beta_kubernetes_io_region|beta_kubernetes_io_os|beta_kubernetes_io_arch|beta_kubernetes_io_instance_type|failure_domain_beta_kubernetes_io_zone)
757 |
758 | # Scrape config for service endpoints.
759 | #
760 | # The relabeling allows the actual service scrape endpoint to be configured
761 | # via the following annotations:
762 | #
763 | # * `prometheus.io/scrape`: Only scrape services that have a value of
764 | # `true`, except if `prometheus.io/scrape-slow` is set to `true` as well.
765 | # * `prometheus.io/scheme`: If the metrics endpoint is secured then you will need
766 | # to set this to `https` & most likely set the `tls_config` of the scrape config.
767 | # * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
768 | # * `prometheus.io/port`: If the metrics are exposed on a different port to the
769 | # service then set this appropriately.
770 |       # * `prometheus.io/param_<parameter>`: If the metrics endpoint uses parameters
771 | # then you can set any parameter
772 | - job_name: 'kubernetes-service-endpoints'
773 | honor_labels: true
774 |
775 | kubernetes_sd_configs:
776 | - role: endpoints
777 |
778 | relabel_configs:
779 | - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
780 | action: keep
781 | regex: true
782 | - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape_slow]
783 | action: drop
784 | regex: true
785 | - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
786 | action: replace
787 | target_label: __scheme__
788 | regex: (https?)
789 | - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
790 | action: replace
791 | target_label: __metrics_path__
792 | regex: (.+)
793 | - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
794 | action: replace
795 | target_label: __address__
796 | regex: (.+?)(?::\d+)?;(\d+)
797 | replacement: $1:$2
798 | - action: labelmap
799 | regex: __meta_kubernetes_service_annotation_prometheus_io_param_(.+)
800 | replacement: __param_$1
801 | - action: labelmap
802 | regex: __meta_kubernetes_service_label_(.+)
803 | - source_labels: [__meta_kubernetes_namespace]
804 | action: replace
805 | target_label: namespace
806 | - source_labels: [__meta_kubernetes_service_name]
807 | action: replace
808 | target_label: service
809 | - source_labels: [__meta_kubernetes_pod_node_name]
810 | action: replace
811 | target_label: node
812 |
813 | # Scrape config for slow service endpoints; same as above, but with a larger
814 | # timeout and a larger interval
815 | #
816 | # The relabeling allows the actual service scrape endpoint to be configured
817 | # via the following annotations:
818 | #
819 | # * `prometheus.io/scrape-slow`: Only scrape services that have a value of `true`
820 | # * `prometheus.io/scheme`: If the metrics endpoint is secured then you will need
821 | # to set this to `https` & most likely set the `tls_config` of the scrape config.
822 | # * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
823 | # * `prometheus.io/port`: If the metrics are exposed on a different port to the
824 | # service then set this appropriately.
825 |       # * `prometheus.io/param_<parameter>`: If the metrics endpoint uses parameters
826 | # then you can set any parameter
827 | - job_name: 'kubernetes-service-endpoints-slow'
828 | honor_labels: true
829 |
830 | scrape_interval: 5m
831 | scrape_timeout: 30s
832 |
833 | kubernetes_sd_configs:
834 | - role: endpoints
835 |
836 | relabel_configs:
837 | - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape_slow]
838 | action: keep
839 | regex: true
840 | - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scheme]
841 | action: replace
842 | target_label: __scheme__
843 | regex: (https?)
844 | - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
845 | action: replace
846 | target_label: __metrics_path__
847 | regex: (.+)
848 | - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
849 | action: replace
850 | target_label: __address__
851 | regex: (.+?)(?::\d+)?;(\d+)
852 | replacement: $1:$2
853 | - action: labelmap
854 | regex: __meta_kubernetes_service_annotation_prometheus_io_param_(.+)
855 | replacement: __param_$1
856 | - action: labelmap
857 | regex: __meta_kubernetes_service_label_(.+)
858 | - source_labels: [__meta_kubernetes_namespace]
859 | action: replace
860 | target_label: namespace
861 | - source_labels: [__meta_kubernetes_service_name]
862 | action: replace
863 | target_label: service
864 | - source_labels: [__meta_kubernetes_pod_node_name]
865 | action: replace
866 | target_label: node
867 |
868 | - job_name: 'prometheus-pushgateway'
869 | honor_labels: true
870 |
871 | kubernetes_sd_configs:
872 | - role: service
873 |
874 | relabel_configs:
875 | - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
876 | action: keep
877 | regex: pushgateway
878 |
879 | # Example scrape config for probing services via the Blackbox Exporter.
880 | #
881 | # The relabeling allows the actual service scrape endpoint to be configured
882 | # via the following annotations:
883 | #
884 | # * `prometheus.io/probe`: Only probe services that have a value of `true`
885 | - job_name: 'kubernetes-services'
886 | honor_labels: true
887 |
888 | metrics_path: /probe
889 | params:
890 | module: [http_2xx]
891 |
892 | kubernetes_sd_configs:
893 | - role: service
894 |
895 | relabel_configs:
896 | - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_probe]
897 | action: keep
898 | regex: true
899 | - source_labels: [__address__]
900 | target_label: __param_target
901 | - target_label: __address__
902 | replacement: blackbox
903 | - source_labels: [__param_target]
904 | target_label: instance
905 | - action: labelmap
906 | regex: __meta_kubernetes_service_label_(.+)
907 | - source_labels: [__meta_kubernetes_namespace]
908 | target_label: namespace
909 | - source_labels: [__meta_kubernetes_service_name]
910 | target_label: service
911 |
912 | # Example scrape config for pods
913 | #
914 | # The relabeling allows the actual pod scrape endpoint to be configured via the
915 | # following annotations:
916 | #
917 | # * `prometheus.io/scrape`: Only scrape pods that have a value of `true`,
918 | # except if `prometheus.io/scrape-slow` is set to `true` as well.
919 | # * `prometheus.io/scheme`: If the metrics endpoint is secured then you will need
920 | # to set this to `https` & most likely set the `tls_config` of the scrape config.
921 | # * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
922 | # * `prometheus.io/port`: Scrape the pod on the indicated port instead of the default of `9102`.
923 | - job_name: 'kubernetes-pods'
924 | honor_labels: true
925 |
926 | kubernetes_sd_configs:
927 | - role: pod
928 |
929 | relabel_configs:
930 | - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
931 | action: keep
932 | regex: true
933 | - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape_slow]
934 | action: drop
935 | regex: true
936 | - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
937 | action: replace
938 | regex: (https?)
939 | target_label: __scheme__
940 | - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
941 | action: replace
942 | target_label: __metrics_path__
943 | regex: (.+)
944 | - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
945 | action: replace
946 | regex: (.+?)(?::\d+)?;(\d+)
947 | replacement: $1:$2
948 | target_label: __address__
949 | - action: labelmap
950 | regex: __meta_kubernetes_pod_annotation_prometheus_io_param_(.+)
951 | replacement: __param_$1
952 | - action: labelmap
953 | regex: __meta_kubernetes_pod_label_(.+)
954 | - source_labels: [__meta_kubernetes_namespace]
955 | action: replace
956 | target_label: namespace
957 | - source_labels: [__meta_kubernetes_pod_name]
958 | action: replace
959 | target_label: pod
960 | - source_labels: [__meta_kubernetes_pod_phase]
961 | regex: Pending|Succeeded|Failed|Completed
962 | action: drop
963 |
964 | # Example scrape config for pods which should be scraped slower. A useful example
965 | # would be stackdriver-exporter, which queries an API on every scrape of the pod
966 | #
967 | # The relabeling allows the actual pod scrape endpoint to be configured via the
968 | # following annotations:
969 | #
970 | # * `prometheus.io/scrape-slow`: Only scrape pods that have a value of `true`
971 | # * `prometheus.io/scheme`: If the metrics endpoint is secured then you will need
972 | # to set this to `https` & most likely set the `tls_config` of the scrape config.
973 | # * `prometheus.io/path`: If the metrics path is not `/metrics` override this.
974 | # * `prometheus.io/port`: Scrape the pod on the indicated port instead of the default of `9102`.
975 | - job_name: 'kubernetes-pods-slow'
976 | honor_labels: true
977 |
978 | scrape_interval: 5m
979 | scrape_timeout: 30s
980 |
981 | kubernetes_sd_configs:
982 | - role: pod
983 |
984 | relabel_configs:
985 | - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape_slow]
986 | action: keep
987 | regex: true
988 | - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scheme]
989 | action: replace
990 | regex: (https?)
991 | target_label: __scheme__
992 | - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
993 | action: replace
994 | target_label: __metrics_path__
995 | regex: (.+)
996 | - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
997 | action: replace
998 | regex: (.+?)(?::\d+)?;(\d+)
999 | replacement: $1:$2
1000 | target_label: __address__
1001 | - action: labelmap
1002 | regex: __meta_kubernetes_pod_annotation_prometheus_io_param_(.+)
1003 | replacement: __param_$1
1004 | - action: labelmap
1005 | regex: __meta_kubernetes_pod_label_(.+)
1006 | - source_labels: [__meta_kubernetes_namespace]
1007 | action: replace
1008 | target_label: namespace
1009 | - source_labels: [__meta_kubernetes_pod_name]
1010 | action: replace
1011 | target_label: pod
1012 | - source_labels: [__meta_kubernetes_pod_phase]
1013 | regex: Pending|Succeeded|Failed|Completed
1014 | action: drop
1015 |
1016 | # adds additional scrape configs to prometheus.yml
1017 | # must be a string so you have to add a | after extraScrapeConfigs:
1018 | # example adds prometheus-blackbox-exporter scrape config
1019 | extraScrapeConfigs:
1020 | # - job_name: 'prometheus-blackbox-exporter'
1021 | # metrics_path: /probe
1022 | # params:
1023 | # module: [http_2xx]
1024 | # static_configs:
1025 | # - targets:
1026 | # - https://example.com
1027 | # relabel_configs:
1028 | # - source_labels: [__address__]
1029 | # target_label: __param_target
1030 | # - source_labels: [__param_target]
1031 | # target_label: instance
1032 | # - target_label: __address__
1033 | # replacement: prometheus-blackbox-exporter:9115
1034 |
1035 | # Adds option to add alert_relabel_configs to avoid duplicate alerts in alertmanager
1036 | # useful in H/A prometheus with different external labels but the same alerts
1037 | alertRelabelConfigs:
1038 | # alert_relabel_configs:
1039 | # - source_labels: [dc]
1040 | # regex: (.+)\d+
1041 | # target_label: dc
1042 |
1043 | networkPolicy:
1044 | ## Enable creation of NetworkPolicy resources.
1045 | ##
1046 | enabled: false
1047 |
1048 | # Force namespace of namespaced resources
1049 | forceNamespace: null
1050 |
1051 | # Extra manifests to deploy as an array
1052 | extraManifests: []
1053 | # - apiVersion: v1
1054 | # kind: ConfigMap
1055 | # metadata:
1056 | # labels:
1057 | # name: prometheus-extra
1058 | # data:
1059 | # extra-data: "value"
1060 |
1061 | # Configuration of subcharts defined in Chart.yaml
1062 |
1063 | ## alertmanager sub-chart configurable values
1064 | ## Please see https://github.com/prometheus-community/helm-charts/tree/main/charts/alertmanager
1065 | ##
1066 | alertmanager:
1067 | ## If false, alertmanager will not be installed
1068 | ##
1069 | enabled: true
1070 |
1071 | persistence:
1072 | size: 2Gi
1073 |
1074 | podSecurityContext:
1075 | runAsUser: 65534
1076 | runAsNonRoot: true
1077 | runAsGroup: 65534
1078 | fsGroup: 65534
1079 |
1080 | ## kube-state-metrics sub-chart configurable values
1081 | ## Please see https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-state-metrics
1082 | ##
1083 | kube-state-metrics:
1084 | ## If false, kube-state-metrics sub-chart will not be installed
1085 | ##
1086 | enabled: true
1087 |
1088 | ## prometheus-node-exporter sub-chart configurable values
1089 | ## Please see https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-node-exporter
1090 | ##
1091 | prometheus-node-exporter:
1092 | ## If false, node-exporter will not be installed
1093 | ##
1094 | enabled: true
1095 |
1096 | rbac:
1097 | pspEnabled: false
1098 |
1099 | containerSecurityContext:
1100 | allowPrivilegeEscalation: false
1101 |
1102 | ## prometheus-pushgateway sub-chart configurable values
1103 | ## Please see https://github.com/prometheus-community/helm-charts/tree/main/charts/prometheus-pushgateway
1104 | ##
1105 | prometheus-pushgateway:
1106 | ## If false, pushgateway will not be installed
1107 | ##
1108 | enabled: true
1109 |
1110 | # Optional service annotations
1111 | serviceAnnotations:
1112 | prometheus.io/probe: pushgateway
1113 |
--------------------------------------------------------------------------------
/03-deploy-grafana/README.md:
--------------------------------------------------------------------------------
1 | # Deploy grafana
2 |
3 | [Grafana](https://grafana.com/) is an open-source observability and data visualization platform that allows you to query, visualize, alert on and understand your metrics no matter where they are stored.
4 |
5 | Follow the instructions in this document to deploy a self-managed Grafana instance in an EKS cluster. The instructions are based on the [Grafana Community Kubernetes Helm Charts](https://github.com/grafana/helm-charts/tree/main/charts/grafana).
6 |
7 | If you are looking for a fully managed Grafana solution, then please refer to [Amazon Managed Grafana](https://aws.amazon.com/grafana/) or [Grafana Cloud](https://grafana.com/products/cloud/).
8 |
9 | ## Prerequisites
10 |
11 | - Kubernetes 1.22+
12 | - Helm 3.9+
13 |
14 |
15 | ## Get Repository Info
16 |
17 | ```console
18 | helm repo add grafana https://grafana.github.io/helm-charts
19 | helm repo update
20 | ```
21 |
22 | ## Install/Upgrade grafana with default values
23 |
24 | ```console
25 | helm upgrade --install [RELEASE_NAME] grafana/grafana --namespace [K8S_NAMESPACE] --create-namespace --wait --debug
26 | ```
27 |
28 | The above command installs the latest chart version. Use the `--version` argument to install a specific version of the grafana chart, as shown below.
29 |
30 | ```console
31 | helm upgrade --install [RELEASE_NAME] grafana/grafana --namespace [K8S_NAMESPACE] --version 6.50.0 --create-namespace --wait --debug
32 | ```
33 |
34 | ## Install/Upgrade grafana with custom values
35 |
36 | - Create a `values.yaml` file with custom helm chart inputs. Refer to the `values.yaml` file in this repo for sample configurations.
37 |
38 | - Refer to the [official grafana chart](https://github.com/grafana/helm-charts/blob/main/charts/grafana/values.yaml) for the latest configuration options.
39 |
40 | Run the following command to install grafana with custom configurations. A minimal sample `values.yaml` is sketched after the command.
41 |
42 | ```console
43 | helm upgrade --install [RELEASE_NAME] grafana/grafana --namespace [K8S_NAMESPACE] -f values.yaml --create-namespace --wait --debug
44 | ```
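For orientation, a minimal custom `values.yaml` might look like the sketch below. The field names come from the chart's `values.yaml` in this repo; the concrete values (replica count, storage size, resources) are illustrative assumptions only — tune them for your environment.

```yaml
# Illustrative grafana chart overrides (adjust values to your environment)
replicas: 1

persistence:
  enabled: true        # persist dashboards and settings across pod restarts
  size: 10Gi

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi
```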
45 |
46 | ## Access Grafana
47 |
48 | This chart creates a `grafana` service of type `ClusterIP`, which is accessible only inside the cluster. Change the [service type](https://github.com/grafana/helm-charts/blob/main/charts/grafana/values.yaml#L173) to `LoadBalancer` if you want to access grafana from outside the cluster, as shown in the sketch below.
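A minimal sketch of that override (the `service` keys are from the chart's `values.yaml`; choosing `LoadBalancer` here is only an example — keep `ClusterIP` if in-cluster access is enough):

```yaml
service:
  enabled: true
  type: LoadBalancer   # chart default is ClusterIP
  port: 80
  targetPort: 3000
```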
49 |
50 | Run the following `kubectl port-forward` command to connect to grafana, then open `localhost:3000` in the browser (or use the load balancer DNS address if the service is exposed).
51 |
52 | ```console
53 | kubectl port-forward --namespace [K8S_NAMESPACE] svc/grafana 3000:80
54 | ```
55 |
56 | You can get the default username and password from the Kubernetes secret
57 |
58 | ```console
59 | kubectl get secrets grafana --namespace [K8S_NAMESPACE] --template='{{ range $key, $value := .data }}{{ printf "%s: %s\n" $key ($value | base64decode) }}{{ end }}'
60 | ```
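If you only need the admin password, a `jsonpath` query also works (this assumes the release, and therefore the secret, is named `grafana`):

```console
kubectl get secret grafana --namespace [K8S_NAMESPACE] -o jsonpath="{.data.admin-password}" | base64 --decode ; echo
```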
61 |
62 | Log in to grafana with the default username and password.
63 |
64 |
65 |
66 |
67 | ## Configure prometheus Datasource
68 |
69 | Follow the steps below to configure the prometheus data source and access the metrics stored in prometheus.
70 |
71 | Go to Data sources -> Add data source -> select Prometheus -> configure the data source with the prometheus endpoint -> click "Save & test"
72 |
73 |
74 |
75 | You should see a "Data source is working" success message if the prometheus endpoint is configured as expected.
76 |
77 |
78 |
79 | You can now query the metrics in grafana, create dashboards, and set up alerts.
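Alternatively, the Prometheus data source can be provisioned declaratively through the chart's `datasources` value (see the commented example in `values.yaml`). A minimal sketch, assuming Prometheus was installed with release name `prometheus` in the `prometheus` namespace — adjust the URL to your actual service:

```yaml
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        # assumed service name and namespace; verify with `kubectl get svc -n prometheus`
        url: http://prometheus-server.prometheus.svc.cluster.local
        access: proxy
        isDefault: true
```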
80 |
81 |
82 |
--------------------------------------------------------------------------------
/03-deploy-grafana/values.yaml:
--------------------------------------------------------------------------------
1 | global:
2 | # To help compatibility with other charts which use global.imagePullSecrets.
3 | # Allow either an array of {name: pullSecret} maps (k8s-style), or an array of strings (more common helm-style).
4 | # Can be templated.
5 | # global:
6 | # imagePullSecrets:
7 | # - name: pullSecret1
8 | # - name: pullSecret2
9 | # or
10 | # global:
11 | # imagePullSecrets:
12 | # - pullSecret1
13 | # - pullSecret2
14 | imagePullSecrets: []
15 |
16 | rbac:
17 | create: true
18 | ## Use an existing ClusterRole/Role (depending on rbac.namespaced false/true)
19 | # useExistingRole: name-of-some-(cluster)role
20 | pspEnabled: true
21 | pspUseAppArmor: true
22 | namespaced: false
23 | extraRoleRules: []
24 | # - apiGroups: []
25 | # resources: []
26 | # verbs: []
27 | extraClusterRoleRules: []
28 | # - apiGroups: []
29 | # resources: []
30 | # verbs: []
31 | serviceAccount:
32 | create: true
33 | name:
34 | nameTest:
35 | ## ServiceAccount labels.
36 | labels: {}
37 | ## Service account annotations. Can be templated.
38 | # annotations:
39 | # eks.amazonaws.com/role-arn: arn:aws:iam::123456789000:role/iam-role-name-here
40 | autoMount: true
41 |
42 | replicas: 1
43 |
44 | ## Create a headless service for the deployment
45 | headlessService: false
46 |
47 | ## Create HorizontalPodAutoscaler object for deployment type
48 | #
49 | autoscaling:
50 | enabled: false
51 | minReplicas: 1
52 | maxReplicas: 5
53 | targetCPU: "60"
54 | targetMemory: ""
55 | behavior: {}
56 |
57 | ## See `kubectl explain poddisruptionbudget.spec` for more
58 | ## ref: https://kubernetes.io/docs/tasks/run-application/configure-pdb/
59 | podDisruptionBudget: {}
60 | # minAvailable: 1
61 | # maxUnavailable: 1
62 |
63 | ## See `kubectl explain deployment.spec.strategy` for more
64 | ## ref: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy
65 | deploymentStrategy:
66 | type: RollingUpdate
67 |
68 | readinessProbe:
69 | httpGet:
70 | path: /api/health
71 | port: 3000
72 |
73 | livenessProbe:
74 | httpGet:
75 | path: /api/health
76 | port: 3000
77 | initialDelaySeconds: 60
78 | timeoutSeconds: 30
79 | failureThreshold: 10
80 |
81 | ## Use an alternate scheduler, e.g. "stork".
82 | ## ref: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
83 | ##
84 | # schedulerName: "default-scheduler"
85 |
86 | image:
87 | repository: grafana/grafana
88 | # Overrides the Grafana image tag whose default is the chart appVersion
89 | tag: ""
90 | sha: ""
91 | pullPolicy: IfNotPresent
92 |
93 | ## Optionally specify an array of imagePullSecrets.
94 | ## Secrets must be manually created in the namespace.
95 | ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
96 | ## Can be templated.
97 | ##
98 | pullSecrets: []
99 | # - myRegistryKeySecretName
100 |
101 | testFramework:
102 | enabled: true
103 | image: "bats/bats"
104 | tag: "v1.4.1"
105 | imagePullPolicy: IfNotPresent
106 | securityContext: {}
107 |
108 | securityContext:
109 | runAsUser: 472
110 | runAsGroup: 472
111 | fsGroup: 472
112 |
113 | containerSecurityContext: {}
114 |
115 | # Enable creating the grafana configmap
116 | createConfigmap: true
117 |
118 | # Extra configmaps to mount in grafana pods
119 | # Values are templated.
120 | extraConfigmapMounts: []
121 | # - name: certs-configmap
122 | # mountPath: /etc/grafana/ssl/
123 | # subPath: certificates.crt # (optional)
124 | # configMap: certs-configmap
125 | # readOnly: true
126 |
127 |
128 | extraEmptyDirMounts: []
129 | # - name: provisioning-notifiers
130 | # mountPath: /etc/grafana/provisioning/notifiers
131 |
132 |
133 | # Apply extra labels to common labels.
134 | extraLabels: {}
135 |
136 | ## Assign a PriorityClassName to pods if set
137 | # priorityClassName:
138 |
139 | downloadDashboardsImage:
140 | repository: curlimages/curl
141 | tag: 7.85.0
142 | sha: ""
143 | pullPolicy: IfNotPresent
144 |
145 | downloadDashboards:
146 | env: {}
147 | envFromSecret: ""
148 | resources: {}
149 | securityContext: {}
150 | envValueFrom: {}
151 | # ENV_NAME:
152 | # configMapKeyRef:
153 | # name: configmap-name
154 | # key: value_key
155 |
156 | ## Pod Annotations
157 | # podAnnotations: {}
158 |
159 | ## Pod Labels
160 | # podLabels: {}
161 |
162 | podPortName: grafana
163 | gossipPortName: grafana-alert
164 | ## Deployment annotations
165 | # annotations: {}
166 |
167 | ## Expose the grafana service to be accessed from outside the cluster (LoadBalancer service).
168 | ## or access it from within the cluster (ClusterIP service). Set the service type and the port to serve it.
169 | ## ref: http://kubernetes.io/docs/user-guide/services/
170 | ##
171 | service:
172 | enabled: true
173 | type: ClusterIP
174 | port: 80
175 | targetPort: 3000
176 | # targetPort: 4181 To be used with a proxy extraContainer
177 | ## Service annotations. Can be templated.
178 | annotations: {}
179 | labels: {}
180 | portName: service
181 | # Adds the appProtocol field to the service. This allows to work with istio protocol selection. Ex: "http" or "tcp"
182 | appProtocol: ""
183 |
184 | serviceMonitor:
185 | ## If true, a ServiceMonitor CRD is created for a prometheus operator
186 | ## https://github.com/coreos/prometheus-operator
187 | ##
188 | enabled: false
189 | path: /metrics
190 | # namespace: monitoring (defaults to use the namespace this chart is deployed to)
191 | labels: {}
192 | interval: 1m
193 | scheme: http
194 | tlsConfig: {}
195 | scrapeTimeout: 30s
196 | relabelings: []
197 | targetLabels: []
198 |
199 | extraExposePorts: []
200 | # - name: keycloak
201 | # port: 8080
202 | # targetPort: 8080
203 | # type: ClusterIP
204 |
205 | # overrides pod.spec.hostAliases in the grafana deployment's pods
206 | hostAliases: []
207 | # - ip: "1.2.3.4"
208 | # hostnames:
209 | # - "my.host.com"
210 |
211 | ingress:
212 | enabled: false
213 | # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
214 | # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
215 | # ingressClassName: nginx
216 | # Values can be templated
217 | annotations: {}
218 | # kubernetes.io/ingress.class: nginx
219 | # kubernetes.io/tls-acme: "true"
220 | labels: {}
221 | path: /
222 |
223 | # pathType is only for k8s >= 1.18
224 | pathType: Prefix
225 |
226 | hosts:
227 | - chart-example.local
228 | ## Extra paths to prepend to every host configuration. This is useful when working with annotation based services.
229 | extraPaths: []
230 | # - path: /*
231 | # backend:
232 | # serviceName: ssl-redirect
233 | # servicePort: use-annotation
234 | ## Or for k8s > 1.19
235 | # - path: /*
236 | # pathType: Prefix
237 | # backend:
238 | # service:
239 | # name: ssl-redirect
240 | # port:
241 | # name: use-annotation
242 |
243 |
244 | tls: []
245 | # - secretName: chart-example-tls
246 | # hosts:
247 | # - chart-example.local
248 |
249 | resources: {}
250 | # limits:
251 | # cpu: 100m
252 | # memory: 128Mi
253 | # requests:
254 | # cpu: 100m
255 | # memory: 128Mi
256 |
257 | ## Node labels for pod assignment
258 | ## ref: https://kubernetes.io/docs/user-guide/node-selection/
259 | #
260 | nodeSelector: {}
261 |
262 | ## Tolerations for pod assignment
263 | ## ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
264 | ##
265 | tolerations: []
266 |
267 | ## Affinity for pod assignment (evaluated as template)
268 | ## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
269 | ##
270 | affinity: {}
271 |
272 | ## Topology Spread Constraints
273 | ## ref: https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/
274 | ##
275 | topologySpreadConstraints: []
276 |
277 | ## Additional init containers (evaluated as template)
278 | ## ref: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/
279 | ##
280 | extraInitContainers: []
281 |
282 | ## Enable and specify containers in extraContainers. This is meant to allow adding an authentication proxy to a grafana pod
283 | extraContainers: ""
284 | # extraContainers: |
285 | # - name: proxy
286 | # image: quay.io/gambol99/keycloak-proxy:latest
287 | # args:
288 | # - -provider=github
289 | # - -client-id=
290 | # - -client-secret=
291 | # - -github-org=
292 | # - -email-domain=*
293 | # - -cookie-secret=
294 | # - -http-address=http://0.0.0.0:4181
295 | # - -upstream-url=http://127.0.0.1:3000
296 | # ports:
297 | # - name: proxy-web
298 | # containerPort: 4181
299 |
300 | ## Volumes that can be used in init containers that will not be mounted to deployment pods
301 | extraContainerVolumes: []
302 | # - name: volume-from-secret
303 | # secret:
304 | # secretName: secret-to-mount
305 | # - name: empty-dir-volume
306 | # emptyDir: {}
307 |
308 | ## Enable persistence using Persistent Volume Claims
309 | ## ref: http://kubernetes.io/docs/user-guide/persistent-volumes/
310 | ##
311 | persistence:
312 | type: pvc
313 | enabled: false
314 | # storageClassName: default
315 | accessModes:
316 | - ReadWriteOnce
317 | size: 10Gi
318 | # annotations: {}
319 | finalizers:
320 | - kubernetes.io/pvc-protection
321 | # selectorLabels: {}
322 | ## Sub-directory of the PV to mount. Can be templated.
323 | # subPath: ""
324 | ## Name of an existing PVC. Can be templated.
325 | # existingClaim:
326 | ## Extra labels to apply to a PVC.
327 | extraPvcLabels: {}
328 |
329 | ## If persistence is not enabled, this allows to mount the
330 | ## local storage in-memory to improve performance
331 | ##
332 | inMemory:
333 | enabled: false
334 | ## The maximum usage on memory medium EmptyDir would be
335 | ## the minimum value between the SizeLimit specified
336 | ## here and the sum of memory limits of all containers in a pod
337 | ##
338 | # sizeLimit: 300Mi
339 |
340 | initChownData:
341 | ## If false, data ownership will not be reset at startup
342 | ## This allows the grafana-server to be run with an arbitrary user
343 | ##
344 | enabled: true
345 |
346 | ## initChownData container image
347 | ##
348 | image:
349 | repository: busybox
350 | tag: "1.31.1"
351 | sha: ""
352 | pullPolicy: IfNotPresent
353 |
354 | ## initChownData resource requests and limits
355 | ## Ref: http://kubernetes.io/docs/user-guide/compute-resources/
356 | ##
357 | resources: {}
358 | # limits:
359 | # cpu: 100m
360 | # memory: 128Mi
361 | # requests:
362 | # cpu: 100m
363 | # memory: 128Mi
364 | securityContext:
365 | runAsNonRoot: false
366 | runAsUser: 0
367 |
368 |
369 | # Administrator credentials when not using an existing secret (see below)
370 | adminUser: admin
371 | # adminPassword: strongpassword
372 |
373 | # Use an existing secret for the admin user.
374 | admin:
375 | ## Name of the secret. Can be templated.
376 | existingSecret: ""
377 | userKey: admin-user
378 | passwordKey: admin-password
379 |
380 | ## Define command to be executed at startup by grafana container
381 | ## Needed if using `vault-env` to manage secrets (ref: https://banzaicloud.com/blog/inject-secrets-into-pods-vault/)
382 | ## Default is "run.sh" as defined in grafana's Dockerfile
383 | # command:
384 | # - "sh"
385 | # - "/run.sh"
386 |
387 | ## Extra environment variables that will be pass onto deployment pods
388 | ##
389 | ## to provide grafana with access to CloudWatch on AWS EKS:
390 | ## 1. create an iam role of type "Web identity" with provider oidc.eks.* (note the provider for later)
391 | ## 2. edit the "Trust relationships" of the role, add a line inside the StringEquals clause using the
392 | ## same oidc eks provider as noted before (same as the existing line)
393 | ## also, replace NAMESPACE and prometheus-operator-grafana with the service account namespace and name
394 | ##
395 | ## "oidc.eks.us-east-1.amazonaws.com/id/XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX:sub": "system:serviceaccount:NAMESPACE:prometheus-operator-grafana",
396 | ##
397 | ## 3. attach a policy to the role, you can use a built in policy called CloudWatchReadOnlyAccess
398 | ## 4. use the following env: (replace 123456789000 and iam-role-name-here with your aws account number and role name)
399 | ##
400 | ## env:
401 | ## AWS_ROLE_ARN: arn:aws:iam::123456789000:role/iam-role-name-here
402 | ## AWS_WEB_IDENTITY_TOKEN_FILE: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
403 | ## AWS_REGION: us-east-1
404 | ##
405 | ## 5. uncomment the EKS section in extraSecretMounts: below
406 | ## 6. uncomment the annotation section in the serviceAccount: above
407 | ## make sure to replace arn:aws:iam::123456789000:role/iam-role-name-here with your role arn
408 |
409 | env: {}
410 |
411 | ## "valueFrom" environment variable references that will be added to deployment pods. Name is templated.
412 | ## ref: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#envvarsource-v1-core
413 | ## Renders in container spec as:
414 | ## env:
415 | ## ...
416 | ## - name:
417 | ## valueFrom:
418 | ##
419 | envValueFrom: {}
420 | # ENV_NAME:
421 | # configMapKeyRef:
422 | # name: configmap-name
423 | # key: value_key
424 |
425 | ## The name of a secret in the same kubernetes namespace which contain values to be added to the environment
426 | ## This can be useful for auth tokens, etc. Value is templated.
427 | envFromSecret: ""
428 |
429 | ## Sensible environment variables that will be rendered as new secret object
430 | ## This can be useful for auth tokens, etc
431 | envRenderSecret: {}
432 |
433 | ## The names of secrets in the same kubernetes namespace which contain values to be added to the environment
434 | ## Each entry should contain a name key, and can optionally specify whether the secret must be defined with an optional key.
435 | ## Name is templated.
436 | envFromSecrets: []
437 | ## - name: secret-name
438 | ## optional: true
439 |
440 | ## The names of configmaps in the same kubernetes namespace which contain values to be added to the environment
441 | ## Each entry should contain a name key, and can optionally specify whether the configmap must be defined with an optional key.
442 | ## Name is templated.
443 | ## ref: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.23/#configmapenvsource-v1-core
444 | envFromConfigMaps: []
445 | ## - name: configmap-name
446 | ## optional: true
447 |
448 | # Inject Kubernetes services as environment variables.
449 | # See https://kubernetes.io/docs/concepts/services-networking/connect-applications-service/#environment-variables
450 | enableServiceLinks: true
451 |
452 | ## Additional grafana server secret mounts
453 | # Defines additional mounts with secrets. Secrets must be manually created in the namespace.
454 | extraSecretMounts: []
455 | # - name: secret-files
456 | # mountPath: /etc/secrets
457 | # secretName: grafana-secret-files
458 | # readOnly: true
459 | # subPath: ""
460 | #
461 | # for AWS EKS (cloudwatch) use the following (see also instruction in env: above)
462 | # - name: aws-iam-token
463 | # mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
464 | # readOnly: true
465 | # projected:
466 | # defaultMode: 420
467 | # sources:
468 | # - serviceAccountToken:
469 | # audience: sts.amazonaws.com
470 | # expirationSeconds: 86400
471 | # path: token
472 | #
473 | # for CSI e.g. Azure Key Vault use the following
474 | # - name: secrets-store-inline
475 | # mountPath: /run/secrets
476 | # readOnly: true
477 | # csi:
478 | # driver: secrets-store.csi.k8s.io
479 | # readOnly: true
480 | # volumeAttributes:
481 | # secretProviderClass: "akv-grafana-spc"
482 | # nodePublishSecretRef: # Only required when using service principal mode
483 | # name: grafana-akv-creds # Only required when using service principal mode
484 |
485 | ## Additional grafana server volume mounts
486 | # Defines additional volume mounts.
487 | extraVolumeMounts: []
488 | # - name: extra-volume-0
489 | # mountPath: /mnt/volume0
490 | # readOnly: true
491 | # existingClaim: volume-claim
492 | # - name: extra-volume-1
493 | # mountPath: /mnt/volume1
494 | # readOnly: true
495 | # hostPath: /usr/shared/
496 | # - name: grafana-secrets
497 | # csi: true
498 | # data:
499 | # driver: secrets-store.csi.k8s.io
500 | # readOnly: true
501 | # volumeAttributes:
502 | # secretProviderClass: "grafana-env-spc"
503 |
504 | ## Container Lifecycle Hooks. Execute a specific bash command or make an HTTP request
505 | lifecycleHooks: {}
506 | # postStart:
507 | # exec:
508 | # command: []
509 |
510 | ## Pass the plugins you want installed as a list.
511 | ##
512 | plugins: []
513 | # - digrich-bubblechart-panel
514 | # - grafana-clock-panel
515 |
516 | ## Configure grafana datasources
517 | ## ref: http://docs.grafana.org/administration/provisioning/#datasources
518 | ##
519 | datasources: {}
520 | # datasources.yaml:
521 | # apiVersion: 1
522 | # datasources:
523 | # - name: Prometheus
524 | # type: prometheus
525 | # url: http://prometheus-prometheus-server
526 | # access: proxy
527 | # isDefault: true
528 | # - name: CloudWatch
529 | # type: cloudwatch
530 | # access: proxy
531 | # uid: cloudwatch
532 | # editable: false
533 | # jsonData:
534 | # authType: default
535 | # defaultRegion: us-east-1
536 |
537 | ## Configure grafana alerting (can be templated)
538 | ## ref: http://docs.grafana.org/administration/provisioning/#alerting
539 | ##
540 | alerting: {}
541 | # rules.yaml:
542 | # apiVersion: 1
543 | # groups:
544 | # - orgId: 1
545 | # name: '{{ .Chart.Name }}_my_rule_group'
546 | # folder: my_first_folder
547 | # interval: 60s
548 | # rules:
549 | # - uid: my_id_1
550 | # title: my_first_rule
551 | # condition: A
552 | # data:
553 | # - refId: A
554 | # datasourceUid: '-100'
555 | # model:
556 | # conditions:
557 | # - evaluator:
558 | # params:
559 | # - 3
560 | # type: gt
561 | # operator:
562 | # type: and
563 | # query:
564 | # params:
565 | # - A
566 | # reducer:
567 | # type: last
568 | # type: query
569 | # datasource:
570 | # type: __expr__
571 | # uid: '-100'
572 | # expression: 1==0
573 | # intervalMs: 1000
574 | # maxDataPoints: 43200
575 | # refId: A
576 | # type: math
577 | # dashboardUid: my_dashboard
578 | # panelId: 123
579 | # noDataState: Alerting
580 | # for: 60s
581 | # annotations:
582 | # some_key: some_value
583 | # labels:
584 | # team: sre_team_1
585 | # contactpoints.yaml:
586 | # apiVersion: 1
587 | # contactPoints:
588 | # - orgId: 1
589 | # name: cp_1
590 | # receivers:
591 | # - uid: first_uid
592 | # type: pagerduty
593 | # settings:
594 | # integrationKey: XXX
595 | # severity: critical
596 | # class: ping failure
597 | # component: Grafana
598 | # group: app-stack
599 | # summary: |
600 | # {{ `{{ include "default.message" . }}` }}
601 |
602 | ## Configure notifiers
603 | ## ref: http://docs.grafana.org/administration/provisioning/#alert-notification-channels
604 | ##
605 | notifiers: {}
606 | # notifiers.yaml:
607 | # notifiers:
608 | # - name: email-notifier
609 | # type: email
610 | # uid: email1
611 | # # either:
612 | # org_id: 1
613 | # # or
614 | # org_name: Main Org.
615 | # is_default: true
616 | # settings:
617 | # addresses: an_email_address@example.com
618 | # delete_notifiers:
619 |
620 | ## Configure grafana dashboard providers
621 | ## ref: http://docs.grafana.org/administration/provisioning/#dashboards
622 | ##
623 | ## `path` must be /var/lib/grafana/dashboards/
624 | ##
625 | dashboardProviders: {}
626 | # dashboardproviders.yaml:
627 | # apiVersion: 1
628 | # providers:
629 | # - name: 'default'
630 | # orgId: 1
631 | # folder: ''
632 | # type: file
633 | # disableDeletion: false
634 | # editable: true
635 | # options:
636 | # path: /var/lib/grafana/dashboards/default
637 |
638 | ## Configure grafana dashboard to import
639 | ## NOTE: To use dashboards you must also enable/configure dashboardProviders
640 | ## ref: https://grafana.com/dashboards
641 | ##
642 | ## dashboards per provider, use provider name as key.
643 | ##
644 | dashboards: {}
645 | # default:
646 | # some-dashboard:
647 | # json: |
648 | # $RAW_JSON
649 | # custom-dashboard:
650 | # file: dashboards/custom-dashboard.json
651 | # prometheus-stats:
652 | # gnetId: 2
653 | # revision: 2
654 | # datasource: Prometheus
655 | # local-dashboard:
656 | # url: https://example.com/repository/test.json
657 | # token: ''
658 | # local-dashboard-base64:
659 | # url: https://example.com/repository/test-b64.json
660 | # token: ''
661 | # b64content: true
662 | # local-dashboard-gitlab:
663 | # url: https://example.com/repository/test-gitlab.json
664 | # gitlabToken: ''
665 | # local-dashboard-bitbucket:
666 | # url: https://example.com/repository/test-bitbucket.json
667 | # bearerToken: ''
668 |
669 | ## Reference to external ConfigMap per provider. Use provider name as key and ConfigMap name as value.
670 | ## A provider dashboards must be defined either by external ConfigMaps or in values.yaml, not in both.
671 | ## ConfigMap data example:
672 | ##
673 | ## data:
674 | ## example-dashboard.json: |
675 | ## RAW_JSON
676 | ##
677 | dashboardsConfigMaps: {}
678 | # default: ""
679 |
680 | ## Grafana's primary configuration
681 | ## NOTE: values in map will be converted to ini format
682 | ## ref: http://docs.grafana.org/installation/configuration/
683 | ##
684 | grafana.ini:
685 | paths:
686 | data: /var/lib/grafana/
687 | logs: /var/log/grafana
688 | plugins: /var/lib/grafana/plugins
689 | provisioning: /etc/grafana/provisioning
690 | analytics:
691 | check_for_updates: true
692 | log:
693 | mode: console
694 | grafana_net:
695 | url: https://grafana.net
696 | server:
697 | domain: "{{ if (and .Values.ingress.enabled .Values.ingress.hosts) }}{{ .Values.ingress.hosts | first }}{{ else }}''{{ end }}"
698 | ## grafana Authentication can be enabled with the following values on grafana.ini
699 | # server:
700 | # The full public facing url you use in browser, used for redirects and emails
701 | # root_url:
702 | # https://grafana.com/docs/grafana/latest/auth/github/#enable-github-in-grafana
703 | # auth.github:
704 | # enabled: false
705 | # allow_sign_up: false
706 | # scopes: user:email,read:org
707 | # auth_url: https://github.com/login/oauth/authorize
708 | # token_url: https://github.com/login/oauth/access_token
709 | # api_url: https://api.github.com/user
710 | # team_ids:
711 | # allowed_organizations:
712 | # client_id:
713 | # client_secret:
714 | ## LDAP Authentication can be enabled with the following values on grafana.ini
715 | ## NOTE: Grafana will fail to start if the value for ldap.toml is invalid
716 | # auth.ldap:
717 | # enabled: true
718 | # allow_sign_up: true
719 | # config_file: /etc/grafana/ldap.toml
720 |
721 | ## Grafana's LDAP configuration
722 | ## Templated by the template in _helpers.tpl
723 | ## NOTE: To enable the grafana.ini must be configured with auth.ldap.enabled
724 | ## ref: http://docs.grafana.org/installation/configuration/#auth-ldap
725 | ## ref: http://docs.grafana.org/installation/ldap/#configuration
726 | ldap:
727 | enabled: false
728 | # `existingSecret` is a reference to an existing secret containing the ldap configuration
729 | # for Grafana in a key `ldap-toml`.
730 | existingSecret: ""
731 | # `config` is the content of `ldap.toml` that will be stored in the created secret
732 | config: ""
733 | # config: |-
734 | # verbose_logging = true
735 |
736 | # [[servers]]
737 | # host = "my-ldap-server"
738 | # port = 636
739 | # use_ssl = true
740 | # start_tls = false
741 | # ssl_skip_verify = false
742 | # bind_dn = "uid=%s,ou=users,dc=myorg,dc=com"
743 |
744 | ## Grafana's SMTP configuration
745 | ## NOTE: To enable, grafana.ini must be configured with smtp.enabled
746 | ## ref: http://docs.grafana.org/installation/configuration/#smtp
747 | smtp:
748 | # `existingSecret` is a reference to an existing secret containing the smtp configuration
749 | # for Grafana.
750 | existingSecret: ""
751 | userKey: "user"
752 | passwordKey: "password"
753 |
754 | ## Sidecars that collect the configmaps with specified label and store the included files into the respective folders
755 | ## Requires at least Grafana 5 to work and can't be used together with parameters dashboardProviders, datasources and dashboards
756 | sidecar:
757 | image:
758 | repository: quay.io/kiwigrid/k8s-sidecar
759 | tag: 1.22.0
760 | sha: ""
761 | imagePullPolicy: IfNotPresent
762 | resources: {}
763 | # limits:
764 | # cpu: 100m
765 | # memory: 100Mi
766 | # requests:
767 | # cpu: 50m
768 | # memory: 50Mi
769 | securityContext: {}
770 | # skipTlsVerify Set to true to skip tls verification for kube api calls
771 | # skipTlsVerify: true
772 | enableUniqueFilenames: false
773 | readinessProbe: {}
774 | livenessProbe: {}
775 | # Log level default for all sidecars. Can be one of: DEBUG, INFO, WARN, ERROR, CRITICAL. Defaults to INFO
776 | # logLevel: INFO
777 | alerts:
778 | enabled: false
779 | # Additional environment variables for the alerts sidecar
780 | env: {}
781 | # Do not reprocess already processed unchanged resources on k8s API reconnect.
782 | # ignoreAlreadyProcessed: true
783 | # label that the configmaps with alert are marked with
784 | label: grafana_alert
785 | # value of label that the configmaps with alert are set to
786 | labelValue: ""
787 | # Log level. Can be one of: DEBUG, INFO, WARN, ERROR, CRITICAL.
788 | # logLevel: INFO
789 | # If specified, the sidecar will search for alert config-maps inside this namespace.
790 | # Otherwise the namespace in which the sidecar is running will be used.
791 | # It's also possible to specify ALL to search in all namespaces
792 | searchNamespace: null
793 | # Method to use to detect ConfigMap changes. With WATCH the sidecar will do a WATCH requests, with SLEEP it will list all ConfigMaps, then sleep for 60 seconds.
794 | watchMethod: WATCH
795 | # search in configmap, secret or both
796 | resource: both
797 | # watchServerTimeout: request to the server, asking it to cleanly close the connection after that.
798 | # defaults to 60sec; much higher values like 3600 seconds (1h) are feasible for non-Azure K8S
799 | # watchServerTimeout: 3600
800 | #
801 | # watchClientTimeout: is a client-side timeout, configuring your local socket.
802 | # If you have a network outage dropping all packets with no RST/FIN,
803 | # this is how long your client waits before realizing & dropping the connection.
804 | # defaults to 66sec (sic!)
805 | # watchClientTimeout: 60
806 | #
807 | # Endpoint to send request to reload alerts
808 | reloadURL: "http://localhost:3000/api/admin/provisioning/alerting/reload"
809 | # Absolute path to shell script to execute after an alert got reloaded
810 | script: null
811 | skipReload: false
812 | # Deploy the alert sidecar as an initContainer in addition to a container.
813 | # Sets the size limit of the alert sidecar emptyDir volume
814 | sizeLimit: {}
815 | dashboards:
816 | enabled: false
817 | # Additional environment variables for the dashboards sidecar
818 | env: {}
819 | # Do not reprocess already processed unchanged resources on k8s API reconnect.
820 | # ignoreAlreadyProcessed: true
821 | SCProvider: true
822 | # label that the configmaps with dashboards are marked with
823 | label: grafana_dashboard
824 | # value of label that the configmaps with dashboards are set to
825 | labelValue: ""
826 | # Log level. Can be one of: DEBUG, INFO, WARN, ERROR, CRITICAL.
827 | # logLevel: INFO
828 | # folder in the pod that should hold the collected dashboards (unless `defaultFolderName` is set)
829 | folder: /tmp/dashboards
830 | # The default folder name, it will create a subfolder under the `folder` and put dashboards in there instead
831 | defaultFolderName: null
832 | # Namespaces list. If specified, the sidecar will search for config-maps/secrets inside these namespaces.
833 | # Otherwise the namespace in which the sidecar is running will be used.
834 | # It's also possible to specify ALL to search in all namespaces.
835 | searchNamespace: null
836 | # Method to use to detect ConfigMap changes. With WATCH the sidecar will do a WATCH requests, with SLEEP it will list all ConfigMaps, then sleep for 60 seconds.
837 | watchMethod: WATCH
838 | # search in configmap, secret or both
839 | resource: both
840 | # If specified, the sidecar will look for annotation with this name to create folder and put graph here.
841 | # You can use this parameter together with `provider.foldersFromFilesStructure` to annotate configmaps and create folder structure.
842 | folderAnnotation: null
843 | # Endpoint to send request to reload alerts
844 | reloadURL: "http://localhost:3000/api/admin/provisioning/dashboards/reload"
845 | # Absolute path to shell script to execute after a configmap got reloaded
846 | script: null
847 | skipReload: false
848 | # watchServerTimeout: request to the server, asking it to cleanly close the connection after that.
849 | # defaults to 60sec; much higher values like 3600 seconds (1h) are feasible for non-Azure K8S
850 | # watchServerTimeout: 3600
851 | #
852 | # watchClientTimeout: is a client-side timeout, configuring your local socket.
853 | # If you have a network outage dropping all packets with no RST/FIN,
854 | # this is how long your client waits before realizing & dropping the connection.
855 | # defaults to 66sec (sic!)
856 | # watchClientTimeout: 60
857 | #
858 | # provider configuration that lets grafana manage the dashboards
859 | provider:
860 | # name of the provider, should be unique
861 | name: sidecarProvider
862 | # orgid as configured in grafana
863 | orgid: 1
864 | # folder in which the dashboards should be imported in grafana
865 | folder: ''
866 | # type of the provider
867 | type: file
868 | # disableDelete to activate a import-only behaviour
869 | disableDelete: false
870 | # allow updating provisioned dashboards from the UI
871 | allowUiUpdates: false
872 | # allow Grafana to replicate dashboard structure from filesystem
873 | foldersFromFilesStructure: false
874 | # Additional dashboard sidecar volume mounts
875 | extraMounts: []
876 | # Sets the size limit of the dashboard sidecar emptyDir volume
877 | sizeLimit: {}
878 | datasources:
879 | enabled: false
880 | # Additional environment variables for the datasources sidecar
881 | env: {}
882 | # Do not reprocess already processed unchanged resources on k8s API reconnect.
883 | # ignoreAlreadyProcessed: true
884 | # label that the configmaps with datasources are marked with
885 | label: grafana_datasource
886 | # value of label that the configmaps with datasources are set to
887 | labelValue: ""
888 | # Log level. Can be one of: DEBUG, INFO, WARN, ERROR, CRITICAL.
889 | # logLevel: INFO
890 | # If specified, the sidecar will search for datasource config-maps inside this namespace.
891 | # Otherwise the namespace in which the sidecar is running will be used.
892 | # It's also possible to specify ALL to search in all namespaces
893 | searchNamespace: null
894 | # Method to use to detect ConfigMap changes. With WATCH the sidecar will do a WATCH requests, with SLEEP it will list all ConfigMaps, then sleep for 60 seconds.
895 | watchMethod: WATCH
896 | # search in configmap, secret or both
897 | resource: both
898 | # watchServerTimeout: request to the server, asking it to cleanly close the connection after that.
899 | # defaults to 60sec; much higher values like 3600 seconds (1h) are feasible for non-Azure K8S
900 | # watchServerTimeout: 3600
901 | #
902 | # watchClientTimeout: is a client-side timeout, configuring your local socket.
903 | # If you have a network outage dropping all packets with no RST/FIN,
904 | # this is how long your client waits before realizing & dropping the connection.
905 | # defaults to 66sec (sic!)
906 | # watchClientTimeout: 60
907 | #
908 | # Endpoint to send request to reload datasources
909 | reloadURL: "http://localhost:3000/api/admin/provisioning/datasources/reload"
910 | # Absolute path to shell script to execute after a datasource got reloaded
911 | script: null
912 | skipReload: false
913 | # Deploy the datasource sidecar as an initContainer in addition to a container.
914 | # This is needed if skipReload is true, to load any datasources defined at startup time.
915 | initDatasources: false
916 | # Sets the size limit of the datasource sidecar emptyDir volume
917 | sizeLimit: {}
918 | plugins:
919 | enabled: false
920 | # Additional environment variables for the plugins sidecar
921 | env: {}
922 | # Do not reprocess already processed unchanged resources on k8s API reconnect.
923 | # ignoreAlreadyProcessed: true
924 | # label that the configmaps with plugins are marked with
925 | label: grafana_plugin
926 | # value of label that the configmaps with plugins are set to
927 | labelValue: ""
928 | # Log level. Can be one of: DEBUG, INFO, WARN, ERROR, CRITICAL.
929 | # logLevel: INFO
930 | # If specified, the sidecar will search for plugin config-maps inside this namespace.
931 | # Otherwise the namespace in which the sidecar is running will be used.
932 | # It's also possible to specify ALL to search in all namespaces
933 | searchNamespace: null
934 | # Method to use to detect ConfigMap changes. With WATCH the sidecar will do a WATCH requests, with SLEEP it will list all ConfigMaps, then sleep for 60 seconds.
935 | watchMethod: WATCH
936 | # search in configmap, secret or both
937 | resource: both
938 | # watchServerTimeout: request to the server, asking it to cleanly close the connection after that.
939 | # defaults to 60sec; much higher values like 3600 seconds (1h) are feasible for non-Azure K8S
940 | # watchServerTimeout: 3600
941 | #
942 | # watchClientTimeout: is a client-side timeout, configuring your local socket.
943 | # If you have a network outage dropping all packets with no RST/FIN,
944 | # this is how long your client waits before realizing & dropping the connection.
945 | # defaults to 66sec (sic!)
946 | # watchClientTimeout: 60
947 | #
948 | # Endpoint to send request to reload plugins
949 | reloadURL: "http://localhost:3000/api/admin/provisioning/plugins/reload"
950 | # Absolute path to shell script to execute after a plugin got reloaded
951 | script: null
952 | skipReload: false
953 | # Deploy the datasource sidecar as an initContainer in addition to a container.
954 | # This is needed if skipReload is true, to load any plugins defined at startup time.
955 | initPlugins: false
956 | # Sets the size limit of the plugin sidecar emptyDir volume
957 | sizeLimit: {}
958 | notifiers:
959 | enabled: false
960 | # Additional environment variables for the notifiers sidecar
961 | env: {}
962 | # Do not reprocess already processed unchanged resources on k8s API reconnect.
963 | # ignoreAlreadyProcessed: true
964 | # label that the configmaps with notifiers are marked with
965 | label: grafana_notifier
966 | # value of label that the configmaps with notifiers are set to
967 | labelValue: ""
968 | # Log level. Can be one of: DEBUG, INFO, WARN, ERROR, CRITICAL.
969 | # logLevel: INFO
970 | # If specified, the sidecar will search for notifier config-maps inside this namespace.
971 | # Otherwise the namespace in which the sidecar is running will be used.
972 | # It's also possible to specify ALL to search in all namespaces
973 | searchNamespace: null
974 | # Method to use to detect ConfigMap changes. With WATCH the sidecar will do a WATCH requests, with SLEEP it will list all ConfigMaps, then sleep for 60 seconds.
975 | watchMethod: WATCH
976 | # search in configmap, secret or both
977 | resource: both
978 | # watchServerTimeout: request to the server, asking it to cleanly close the connection after that.
979 | # defaults to 60sec; much higher values like 3600 seconds (1h) are feasible for non-Azure K8S
980 | # watchServerTimeout: 3600
981 | #
982 | # watchClientTimeout: is a client-side timeout, configuring your local socket.
983 | # If you have a network outage dropping all packets with no RST/FIN,
984 | # this is how long your client waits before realizing & dropping the connection.
985 | # defaults to 66sec (sic!)
986 | # watchClientTimeout: 60
987 | #
988 | # Endpoint to send request to reload notifiers
989 | reloadURL: "http://localhost:3000/api/admin/provisioning/notifications/reload"
990 | # Absolute path to shell script to execute after a notifier got reloaded
991 | script: null
992 | skipReload: false
993 | # Deploy the notifier sidecar as an initContainer in addition to a container.
994 | # This is needed if skipReload is true, to load any notifiers defined at startup time.
995 | initNotifiers: false
996 | # Sets the size limit of the notifier sidecar emptyDir volume
997 | sizeLimit: {}
998 |
999 | ## Override the deployment namespace
1000 | ##
1001 | namespaceOverride: ""
1002 |
1003 | ## Number of old ReplicaSets to retain
1004 | ##
1005 | revisionHistoryLimit: 10
1006 |
1007 | ## Add a separate remote image renderer deployment/service
1008 | imageRenderer:
1009 | deploymentStrategy: {}
1010 | # Enable the image-renderer deployment & service
1011 | enabled: false
1012 | replicas: 1
1013 | autoscaling:
1014 | enabled: false
1015 | minReplicas: 1
1016 | maxReplicas: 5
1017 | targetCPU: "60"
1018 | targetMemory: ""
1019 | behavior: {}
1020 | image:
1021 | # image-renderer Image repository
1022 | repository: grafana/grafana-image-renderer
1023 | # image-renderer Image tag
1024 | tag: latest
1025 | # image-renderer Image sha (optional)
1026 | sha: ""
1027 | # image-renderer ImagePullPolicy
1028 | pullPolicy: Always
1029 | # extra environment variables
1030 | env:
1031 | HTTP_HOST: "0.0.0.0"
1032 | # RENDERING_ARGS: --no-sandbox,--disable-gpu,--window-size=1280x758
1033 | # RENDERING_MODE: clustered
1034 | # IGNORE_HTTPS_ERRORS: true
1035 | # image-renderer deployment serviceAccount
1036 | serviceAccountName: ""
1037 | # image-renderer deployment securityContext
1038 | securityContext: {}
1039 | # image-renderer deployment container securityContext
1040 | containerSecurityContext:
1041 | capabilities:
1042 | drop: ['ALL']
1043 | allowPrivilegeEscalation: false
1044 | readOnlyRootFilesystem: true
1045 | # image-renderer deployment Host Aliases
1046 | hostAliases: []
1047 | # image-renderer deployment priority class
1048 | priorityClassName: ''
1049 | service:
1050 | # Enable the image-renderer service
1051 | enabled: true
1052 | # image-renderer service port name
1053 | portName: 'http'
1054 | # image-renderer service port used by both service and deployment
1055 | port: 8081
1056 | targetPort: 8081
1057 | # Adds the appProtocol field to the image-renderer service. This allows to work with istio protocol selection. Ex: "http" or "tcp"
1058 | appProtocol: ""
1059 | serviceMonitor:
1060 | ## If true, a ServiceMonitor CRD is created for a prometheus operator
1061 | ## https://github.com/coreos/prometheus-operator
1062 | ##
1063 | enabled: false
1064 | path: /metrics
1065 | # namespace: monitoring (defaults to use the namespace this chart is deployed to)
1066 | labels: {}
1067 | interval: 1m
1068 | scheme: http
1069 | tlsConfig: {}
1070 | scrapeTimeout: 30s
1071 | relabelings: []
1072 | # See: https://doc.crds.dev/github.com/prometheus-operator/kube-prometheus/monitoring.coreos.com/ServiceMonitor/v1@v0.11.0#spec-targetLabels
1073 | targetLabels: []
1074 | # - targetLabel1
1075 | # - targetLabel2
1076 | # If https is enabled in Grafana, this needs to be set as 'https' to correctly configure the callback used in Grafana
1077 | grafanaProtocol: http
1078 | # In case a sub_path is used this needs to be added to the image renderer callback
1079 | grafanaSubPath: ""
1080 | # name of the image-renderer port on the pod
1081 | podPortName: http
1082 | # number of image-renderer replica sets to keep
1083 | revisionHistoryLimit: 10
1084 | networkPolicy:
1085 | # Enable a NetworkPolicy to limit inbound traffic to only the created grafana pods
1086 | limitIngress: true
1087 | # Enable a NetworkPolicy to limit outbound traffic to only the created grafana pods
1088 | limitEgress: false
1089 | # Allow additional services to access image-renderer (eg. Prometheus operator when ServiceMonitor is enabled)
1090 | extraIngressSelectors: []
1091 | resources: {}
1092 | # limits:
1093 | # cpu: 100m
1094 | # memory: 100Mi
1095 | # requests:
1096 | # cpu: 50m
1097 | # memory: 50Mi
1098 | ## Node labels for pod assignment
1099 | ## ref: https://kubernetes.io/docs/user-guide/node-selection/
1100 | #
1101 | nodeSelector: {}
1102 |
1103 | ## Tolerations for pod assignment
1104 | ## ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
1105 | ##
1106 | tolerations: []
1107 |
1108 | ## Affinity for pod assignment (evaluated as template)
1109 | ## ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
1110 | ##
1111 | affinity: {}
1112 |
1113 | ## Use an alternate scheduler, e.g. "stork".
1114 | ## ref: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
1115 | ##
1116 | # schedulerName: "default-scheduler"
1117 |
1118 | networkPolicy:
1119 | ## @param networkPolicy.enabled Enable creation of NetworkPolicy resources. Only Ingress traffic is filtered for now.
1120 | ##
1121 | enabled: false
1122 | ## @param networkPolicy.allowExternal Don't require client label for connections
1123 | ## The Policy model to apply. When set to false, only pods with the correct
1124 | ## client label will have network access to grafana port defined.
1125 | ## When true, grafana will accept connections from any source
1126 | ## (with the correct destination port).
1127 | ##
1128 | ingress: true
1129 | ## @param networkPolicy.ingress When true enables the creation
1130 | ## an ingress network policy
1131 | ##
1132 | allowExternal: true
1133 | ## @param networkPolicy.explicitNamespacesSelector A Kubernetes LabelSelector to explicitly select namespaces from which traffic could be allowed
1134 | ## If explicitNamespacesSelector is missing or set to {}, only client Pods that are in the networkPolicy's namespace
1135 | ## and that match other criteria, the ones that have the good label, can reach the grafana.
1136 | ## But sometimes, we want the grafana to be accessible to clients from other namespaces, in this case, we can use this
1137 | ## LabelSelector to select these namespaces, note that the networkPolicy's namespace should also be explicitly added.
1138 | ##
1139 | ## Example:
1140 | ## explicitNamespacesSelector:
1141 | ## matchLabels:
1142 | ## role: frontend
1143 | ## matchExpressions:
1144 | ## - {key: role, operator: In, values: [frontend]}
1145 | ##
1146 | explicitNamespacesSelector: {}
1147 | ##
1148 | ##
1149 | ##
1150 | ##
1151 | ##
1152 | ##
1153 | egress:
1154 | ## @param networkPolicy.egress.enabled When enabled, an egress network policy will be
1155 | ## created allowing grafana to connect to external data sources from kubernetes cluster.
1156 | enabled: false
1157 | ##
1158 | ## @param networkPolicy.egress.ports Add individual ports to be allowed by the egress
1159 | ports: []
1160 | ## Add ports to the egress by specifying - port:
1161 | ## E.X.
1162 | ## ports:
1163 | ## - port: 80
1164 | ## - port: 443
1165 | ##
1166 | ##
1167 | ##
1168 | ##
1169 | ##
1170 | ##
1171 |
1172 | # Enable backward compatibility of kubernetes where version below 1.13 doesn't have the enableServiceLinks option
1173 | enableKubeBackwardCompatibility: false
1174 | useStatefulSet: false
1175 | # Create a dynamic manifests via values:
1176 | extraObjects: []
1177 | # - apiVersion: "kubernetes-client.io/v1"
1178 | # kind: ExternalSecret
1179 | # metadata:
1180 | # name: grafana-secrets
1181 | # spec:
1182 | # backendType: gcpSecretsManager
1183 | # data:
1184 | # - key: grafana-admin-password
1185 | # name: adminPassword
--------------------------------------------------------------------------------
/04-deploy-argocd/README.md:
--------------------------------------------------------------------------------
1 | # Argo CD
2 |
3 | [Argo CD](https://argo-cd.readthedocs.io/en/stable/) is a declarative, GitOps continuous delivery tool for Kubernetes. This repository contains the instructions to deploy Argo CD in an EKS cluster and the configurations to manage applications in Argo CD.
4 |
5 | ## Prerequisites
6 |
7 | - Kubernetes 1.22+
8 | - kubectl
9 |
10 |
11 | ## Install Argo CD
12 |
13 | Follow the [official documentation](https://argo-cd.readthedocs.io/en/stable/getting_started/) to deploy Argo CD to your cluster.
14 |
15 | Use the [manifests](https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml) file to quickly install Argo CD or use the [helm chart](https://github.com/argoproj/argo-helm/tree/main/charts/argo-cd) to customize the configurations.
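For reference, the quick install from the official getting started guide looks like the following (check the Argo CD documentation for the currently recommended manifest):

```console
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
```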
16 |
17 | Run the command `kubectl get pods -n argocd` to verify the installation and ensure all the pods are in `Running` state.
18 |
19 | ```console
20 | ❯❯ kubectl get pods -n argocd
21 | NAME READY STATUS RESTARTS AGE
22 | argocd-application-controller-0 1/1 Running 0 136m
23 | argocd-applicationset-controller-bdbc5976d-rsz4p 1/1 Running 0 136m
24 | argocd-dex-server-7c8974cfc9-zq894 1/1 Running 0 136m
25 | argocd-notifications-controller-56dbd4976-4kdjn 1/1 Running 0 136m
26 | argocd-redis-6bdcf5f74-wdx5v 1/1 Running 0 136m
27 | argocd-repo-server-5bcc9567f8-5rjfc 1/1 Running 0 136m
28 | argocd-server-5ccfbc6db6-dz8c5 1/1 Running 0 136m
29 |
30 | ```
31 |
32 | Kubectl port-forwarding can be used to connect to the Argo CD API server without exposing the service.
33 |
34 | ```console
35 | kubectl port-forward svc/argocd-server -n argocd 8080:443
36 | ```
37 |
38 | The API server can then be accessed using https://localhost:8080
39 |
40 | You can use the loadbalancer address if the service is exposed outside the cluster.
41 |
42 | Log in to the Argo CD dashboard. The default username is `admin`; run the command below to retrieve the initial admin password.
43 |
44 | ```console
45 | kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath="{.data.password}" | base64 -d; echo
46 | ```
47 |
48 | ## Deploy Applications
49 |
50 | In Argo CD, you can deploy applications using Helm charts or manifest files stored in a git repository. For this example, we will deploy Prometheus and Grafana to the EKS cluster using their official helm charts.
51 |
52 | You can install Helm charts through the UI, or in the [declarative GitOps way](https://argo-cd.readthedocs.io/en/stable/operator-manual/declarative-setup/). We recommend you follow the declarative GitOps way to deploy an Argo CD application.
53 |
54 | Run the commands below to deploy the Grafana and Prometheus applications via Argo CD, then explore the deployed configurations in the Argo CD UI.
55 |
56 |
57 | ```console
58 | kubectl apply -f ./applications/grafana/grafana.yaml
59 | kubectl apply -f ./applications/prometheus/prometheus.yaml
60 | ```
61 |
62 | Change the `targetRevision` in the `grafana.yaml` file to `6.50.5` and apply the changes, as sketched below. Argo CD will automatically detect the configuration change and roll out the new version.
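A sketch of that change in `applications/grafana/grafana.yaml` (only `targetRevision` changes; `6.50.5` is the example version mentioned above):

```yaml
spec:
  source:
    chart: grafana
    repoURL: https://grafana.github.io/helm-charts
    targetRevision: 6.50.5   # previously 6.50.0
```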
63 |
64 | ## Demo
65 |
66 | https://user-images.githubusercontent.com/112865563/215147384-92f62a74-b411-42e9-859a-5896ad870707.mp4
67 |
--------------------------------------------------------------------------------
/04-deploy-argocd/applications/grafana/grafana.yaml:
--------------------------------------------------------------------------------
1 | ---
2 | apiVersion: argoproj.io/v1alpha1
3 | kind: Application
4 | metadata:
5 | name: grafana
6 | namespace: argocd
7 | spec:
8 | project: default
9 | source:
10 | chart: grafana
11 | repoURL: https://grafana.github.io/helm-charts
12 | targetRevision: 6.50.0
13 | helm:
14 | releaseName: grafana
15 | destination:
16 | server: "https://kubernetes.default.svc"
17 | namespace: grafana
18 | syncPolicy:
19 | syncOptions:
20 | - CreateNamespace=true
21 | automated:
22 | selfHeal: true
23 | prune: true
24 |
--------------------------------------------------------------------------------
/04-deploy-argocd/applications/prometheus/prometheus.yaml:
--------------------------------------------------------------------------------
1 | ---
2 | apiVersion: argoproj.io/v1alpha1
3 | kind: Application
4 | metadata:
5 | name: prometheus
6 | namespace: argocd
7 | spec:
8 | project: default
9 | source:
10 | chart: prometheus
11 | repoURL: https://prometheus-community.github.io/helm-charts
12 | targetRevision: 19.3.3
13 | helm:
14 | releaseName: prometheus
15 | destination:
16 | server: "https://kubernetes.default.svc"
17 | namespace: prometheus
18 | syncPolicy:
19 | syncOptions:
20 | - CreateNamespace=true
21 | automated:
22 | selfHeal: true
23 | prune: true
--------------------------------------------------------------------------------
/05-deploy-EFK/README.md:
--------------------------------------------------------------------------------
1 | # deploy-Elasticsearch-Filebeat-Kibana-stack
2 |
3 | This repository contains sample code to deploy a self-managed Elasticsearch cluster in EKS with Kibana and Filebeat.
4 |
5 | The required resources are created and managed via [Elastic Cloud on Kubernetes (ECK)](https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-overview.html) which is a Kubernetes Operator to orchestrate Elastic applications (Elasticsearch, Kibana, APM Server, Enterprise Search, Beats, Elastic Agent, and Elastic Maps Server) on Kubernetes.
6 |
7 | The ECK operator relies on a set of Custom Resource Definitions (CRD) to declaratively define how each application is deployed. ECK simplifies deploying the whole Elastic stack on Kubernetes, giving us tools to automate and streamline critical operations.
8 |
9 | It focuses on streamlining critical operations such as managing and monitoring multiple clusters, upgrading to new stack versions with ease, scaling cluster capacity up and down, changing cluster configuration, dynamically scaling local storage (including Elastic Local Volume, a local storage driver), and scheduling backups.
10 |
11 | The sample configurations are based on Elasticsearch version 8.6.1. Refer to the official documentation for version details.
12 |
13 | ## Prerequisites
14 |
15 | - [EKS cluster with Kubernetes 1.22+](https://github.com/doitintl/aws-eks-devops-best-practices/tree/main/00-create-eks-cluster)
16 | - [EKS cluster with Amazon EBS CSI Driver](https://docs.aws.amazon.com/eks/latest/userguide/managing-ebs-csi.html) (Note: the add-on is deployed as part of cluster creation if you have used [create-eks-cluster](https://github.com/doitintl/aws-eks-devops-best-practices/tree/main/00-create-eks-cluster))
17 | - Attach `AmazonEBSCSIDriverPolicy` to the worker node role (see the example command after this list)
18 | - [kubectl](https://kubernetes.io/docs/tasks/tools/)
19 |
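The `AmazonEBSCSIDriverPolicy` mentioned above can be attached with the AWS CLI, for example (a sketch; `<node-instance-role-name>` is a placeholder for your managed node group's IAM role):

```
aws iam attach-role-policy \
  --role-name <node-instance-role-name> \
  --policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy
```
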
20 | ## Install ECK Operator
21 |
22 | To deploy Elasticsearch on Kubernetes, we first need to install the ECK operator in the Kubernetes cluster.
23 |
24 | There are two main ways to install ECK in a Kubernetes cluster: 1) using the YAML manifests, or 2) using the Helm chart. This installation is based on the YAML manifests.
25 |
26 | Run the following command to install the custom resource definitions for ECK operator version 2.6.1:
27 |
28 | kubectl create -f https://download.elastic.co/downloads/eck/2.6.1/crds.yaml
29 |
30 | Install the operator with its RBAC rules:
31 |
32 | kubectl apply -f https://download.elastic.co/downloads/eck/2.6.1/operator.yaml
33 |
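Alternatively, ECK can be installed with Helm (a sketch assuming the official `eck-operator` chart from the Elastic Helm repository):

```
helm repo add elastic https://helm.elastic.co
helm repo update
helm install elastic-operator elastic/eck-operator -n elastic-system --create-namespace
```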
34 |
35 | Verify the ECK operator installation and ensure the workload is running as expected:
36 |
37 | ```
38 | ❯❯ kubectl get crd
39 | NAME CREATED AT
40 | agents.agent.k8s.elastic.co 2023-02-10T16:19:25Z
41 | apmservers.apm.k8s.elastic.co 2023-02-10T16:19:26Z
42 | beats.beat.k8s.elastic.co 2023-02-10T16:19:27Z
43 | elasticmapsservers.maps.k8s.elastic.co 2023-02-10T16:19:28Z
44 | elasticsearchautoscalers.autoscaling.k8s.elastic.co 2023-02-10T16:19:29Z
45 | elasticsearches.elasticsearch.k8s.elastic.co 2023-02-10T16:19:30Z
46 | enterprisesearches.enterprisesearch.k8s.elastic.co 2023-02-10T16:19:31Z
47 | kibanas.kibana.k8s.elastic.co 2023-02-10T16:19:32Z
48 |
49 | ❯❯ kubectl get all -n elastic-system
50 | NAME READY STATUS RESTARTS AGE
51 | pod/elastic-operator-0 1/1 Running 0 74s
52 |
53 | NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
54 | service/elastic-webhook-server ClusterIP 10.100.206.147 443/TCP 76s
55 |
56 | NAME READY AGE
57 | statefulset.apps/elastic-operator 1/1 78s
58 |
59 | #Monitor the operator pod logs
60 | ❯❯ kubectl logs -f -n elastic-system pod/elastic-operator-0
61 | ```
62 |
63 | ## Deploy Elasticsearch Cluster
64 |
65 | Now that ECK is running in the Kubernetes cluster, let's deploy an Elasticsearch cluster with one master node and two data node pods in the `default` namespace.
66 |
67 | Refer to `elasticsearch.yaml` for sample configurations and to the [ECK documentation](https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-orchestrating-elastic-stack-applications.html) for all the available configuration options. Customize the configurations based on your requirements.
68 |
69 |
70 | kubectl apply -f elasticsearch.yaml
71 |
72 |
73 | Verify the installation
74 |
75 | ```
76 | ❯❯ kubectl get statefulset,pods,sc,pv,pvc
77 |
78 | NAME READY AGE
79 | statefulset.apps/demo-es-data 2/2 94s
80 | statefulset.apps/demo-es-masters 1/1 94s
81 |
82 | NAME READY STATUS RESTARTS AGE
83 | pod/demo-es-data-0 1/1 Running 0 95s
84 | pod/demo-es-data-1 1/1 Running 0 95s
85 | pod/demo-es-masters-0 1/1 Running 0 95s
86 |
87 | NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
88 | storageclass.storage.k8s.io/gp2 (default) kubernetes.io/aws-ebs Delete WaitForFirstConsumer false 2d16h
89 |
90 | NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
91 | persistentvolume/pvc-7f3961d0-e227-4e63-a4bb-48178b33d37e 10Gi RWO Delete Bound default/elasticsearch-data-demo-es-data-1 gp2 93s
92 | persistentvolume/pvc-b553429d-fe34-4c7b-bcf0-eb8262144408 10Gi RWO Delete Bound default/elasticsearch-data-demo-es-data-0 gp2 93s
93 | persistentvolume/pvc-d37144db-fba3-48a8-a866-b882be011d3a 10Gi RWO Delete Bound default/elasticsearch-data-demo-es-masters-0 gp2 93s
94 |
95 | NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
96 | persistentvolumeclaim/elasticsearch-data-demo-es-data-0 Bound pvc-b553429d-fe34-4c7b-bcf0-eb8262144408 10Gi RWO gp2 98s
97 | persistentvolumeclaim/elasticsearch-data-demo-es-data-1 Bound pvc-7f3961d0-e227-4e63-a4bb-48178b33d37e 10Gi RWO gp2 98s
98 | persistentvolumeclaim/elasticsearch-data-demo-es-masters-0 Bound pvc-d37144db-fba3-48a8-a866-b882be011d3a 10Gi RWO gp2 98s
99 |
100 | ```
101 |
102 | The Elasticsearch service `demo-es-http` is created as a ClusterIP service and is accessible only within the cluster. Change the service type to [LoadBalancer](https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-services.html#k8s-allow-public-access) if required.
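
For example, the service type can be changed in the Elasticsearch spec under `http.service` (a sketch mirroring the approach used in `kibana.yaml`; add load-balancer annotations as needed for your environment):

```
# excerpt to merge under spec: in elasticsearch.yaml
http:
  service:
    spec:
      type: LoadBalancer # default is ClusterIP
```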
103 |
104 | A default user named `elastic` is automatically created, with its password stored in a Kubernetes secret:
105 |
106 |
107 | ELASTIC_USER_PASSWORD=$(kubectl get secret demo-es-elastic-user -o go-template='{{.data.elastic | base64decode}}')
108 |
109 |
110 | Test the Elasticsearch connection from your local workstation:
111 |
112 | kubectl port-forward service/demo-es-http 9200
113 |
114 |
115 | Disabling certificate verification with the `-k` flag is not recommended; use it for testing purposes only.
116 |
117 | curl -u "elastic:$ELASTIC_USER_PASSWORD" -k "https://localhost:9200"
118 |
119 | Sample output:
120 |
121 | ```
122 | {
123 | "name" : "demo-es-masters-0",
124 | "cluster_name" : "demo",
125 | "cluster_uuid" : "9js8hJuAQhmdXv7p1C-YJw",
126 | "version" : {
127 | "number" : "8.6.1",
128 | "build_flavor" : "default",
129 | "build_type" : "docker",
130 | "build_hash" : "180c9830da956993e59e2cd70eb32b5e383ea42c",
131 | "build_date" : "2023-01-24T21:35:11.506992272Z",
132 | "build_snapshot" : false,
133 | "lucene_version" : "9.4.2",
134 | "minimum_wire_compatibility_version" : "7.17.0",
135 | "minimum_index_compatibility_version" : "7.0.0"
136 | },
137 | "tagline" : "You Know, for Search"
138 | }
139 | ```
140 |
141 | ## Deploy Kibana Cluster
142 |
143 | The next step is to deploy Kibana, a free and open user interface that lets you visualize your Elasticsearch data.
144 |
145 | Refer to `kibana.yaml` for sample configurations and to the [ECK documentation](https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-kibana-es.html) for all available configuration options.
146 |
147 | kubectl apply -f kibana.yaml
148 |
149 | Verify the installation
150 |
151 | ```
152 | ❯❯ kubectl get all -l "common.k8s.elastic.co/type=kibana"
153 |
154 | NAME READY STATUS RESTARTS AGE
155 | pod/demo-kb-799d67ffff-7tftp 1/1 Running 0 2m49s
156 |
157 | NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
158 | service/demo-kb-http LoadBalancer 10.100.235.170 a8cad8cba32374f788b1744cb79c46d5-2001568967.us-west-2.elb.amazonaws.com 5601:30645/TCP 2m54s
159 |
160 | NAME READY UP-TO-DATE AVAILABLE AGE
161 | deployment.apps/demo-kb 1/1 1 1 2m54s
162 |
163 | NAME DESIRED CURRENT READY AGE
164 | replicaset.apps/demo-kb-799d67ffff 1 1 1 2m53s
165 | ```
166 |
167 | Access the Kibana application via the Network Load Balancer and log in with the default Elasticsearch username and password. (Ex: https://a8cad8cba32374f788b1744cb79c46d5-2001568967.us-west-2.elb.amazonaws.com:5601)
168 |
169 | 
170 |
171 | 
172 |
173 | ## Deploy Filebeat
174 |
175 | Filebeat is a lightweight shipper for forwarding and centralizing log data. It is installed as a DaemonSet in the Kubernetes cluster and collects the logs generated by the containers. The collected logs are shipped to Elasticsearch and indexed. You can then query the logs via Kibana or the Elasticsearch API.
176 |
177 | Refer to `filebeat.yaml` for sample configurations and to the [ECK documentation](https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-beat-configuration.html) for all available configuration options.
178 |
179 | kubectl apply -f filebeat.yaml
180 |
181 | Verify the installation
182 |
183 | ```
184 | ❯❯ kubectl get pods -l "beat.k8s.elastic.co/name=demo"
185 |
186 | NAME READY STATUS RESTARTS AGE
187 | demo-beat-filebeat-4xvs2 1/1 Running 0 2m53s
188 | demo-beat-filebeat-7s8f5 1/1 Running 0 2m53s
189 | ❯❯ kubectl get all -l "beat.k8s.elastic.co/name=demo"
190 | NAME READY STATUS RESTARTS AGE
191 | pod/demo-beat-filebeat-4xvs2 1/1 Running 0 3m2s
192 | pod/demo-beat-filebeat-7s8f5 1/1 Running 0 3m2s
193 |
194 | NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
195 | daemonset.apps/demo-beat-filebeat 2 2 2 2 2 3m5s
196 |
197 | ```
198 |
199 | Log in to Kibana and create a data view to explore the collected data. You can also deploy a sample application and explore its log data.
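
For example, a hypothetical log-generating pod (the `log-generator` name and message format are arbitrary) can be used to produce data that Filebeat picks up and ships to Elasticsearch:

```
kubectl run log-generator --image=busybox --restart=Never -- /bin/sh -c 'while true; do echo "sample log entry $(date)"; sleep 5; done'
```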
200 |
201 | 
202 |
203 | 
204 |
205 | 
206 |
207 | 
208 |
--------------------------------------------------------------------------------
/05-deploy-EFK/elasticsearch.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: elasticsearch.k8s.elastic.co/v1
2 | kind: Elasticsearch
3 | metadata:
4 | name: demo
5 | spec:
6 | version: 8.6.1
7 | nodeSets:
8 | - name: masters
9 | count: 1
10 | config:
11 | node.roles: ["master"]
12 | node.store.allow_mmap: false
13 | volumeClaimTemplates:
14 | - metadata:
15 | name: elasticsearch-data # Do not change this name unless you set up a volume mount for the data path.
16 | spec:
17 | accessModes:
18 | - ReadWriteOnce
19 | resources:
20 | requests:
21 | storage: 10Gi
22 | storageClassName: gp2 #default storage class in eks
23 | - name: data
24 | count: 2
25 | config:
26 | node.roles: ["data", "ingest", "ml", "transform"]
27 | node.store.allow_mmap: false
28 | volumeClaimTemplates:
29 | - metadata:
30 | name: elasticsearch-data # Do not change this name unless you set up a volume mount for the data path.
31 | spec:
32 | accessModes:
33 | - ReadWriteOnce
34 | resources:
35 | requests:
36 | storage: 10Gi
37 | storageClassName: gp2 #default storage class in eks
38 |
--------------------------------------------------------------------------------
/05-deploy-EFK/filebeat.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: beat.k8s.elastic.co/v1beta1
2 | kind: Beat
3 | metadata:
4 | name: demo
5 | spec:
6 | type: filebeat
7 | version: 8.6.1
8 | elasticsearchRef:
9 | name: demo
10 | config:
11 | filebeat.autodiscover.providers:
12 | - node: ${NODE_NAME}
13 | type: kubernetes
14 | hints:
15 | enabled: true
16 | default_config:
17 | type: container
18 | paths:
19 | - /var/log/containers/*${data.kubernetes.container.id}.log
20 | processors:
21 | - add_cloud_metadata: {}
22 | - add_host_metadata: {}
23 | daemonSet:
24 | podTemplate:
25 | spec:
26 | serviceAccountName: filebeat
27 | automountServiceAccountToken: true
28 | terminationGracePeriodSeconds: 30
29 | dnsPolicy: ClusterFirstWithHostNet
30 | hostNetwork: true # Allows to provide richer host metadata
31 | containers:
32 | - name: filebeat
33 | securityContext:
34 | runAsUser: 0
35 | # If using Red Hat OpenShift uncomment this:
36 | #privileged: true
37 | volumeMounts:
38 | - name: varlogcontainers
39 | mountPath: /var/log/containers
40 | - name: varlogpods
41 | mountPath: /var/log/pods
42 | - name: varlibdockercontainers
43 | mountPath: /var/lib/docker/containers
44 | env:
45 | - name: NODE_NAME
46 | valueFrom:
47 | fieldRef:
48 | fieldPath: spec.nodeName
49 | volumes:
50 | - name: varlogcontainers
51 | hostPath:
52 | path: /var/log/containers
53 | - name: varlogpods
54 | hostPath:
55 | path: /var/log/pods
56 | - name: varlibdockercontainers
57 | hostPath:
58 | path: /var/lib/docker/containers
59 | ---
60 | apiVersion: rbac.authorization.k8s.io/v1
61 | kind: ClusterRole
62 | metadata:
63 | name: filebeat
64 | rules:
65 | - apiGroups: [""] # "" indicates the core API group
66 | resources:
67 | - namespaces
68 | - pods
69 | - nodes
70 | verbs:
71 | - get
72 | - watch
73 | - list
74 | ---
75 | apiVersion: v1
76 | kind: ServiceAccount
77 | metadata:
78 | name: filebeat
79 | namespace: default
80 | ---
81 | apiVersion: rbac.authorization.k8s.io/v1
82 | kind: ClusterRoleBinding
83 | metadata:
84 | name: filebeat
85 | subjects:
86 | - kind: ServiceAccount
87 | name: filebeat
88 | namespace: default
89 | roleRef:
90 | kind: ClusterRole
91 | name: filebeat
92 | apiGroup: rbac.authorization.k8s.io
--------------------------------------------------------------------------------
/05-deploy-EFK/kibana.yaml:
--------------------------------------------------------------------------------
1 |
2 | ---
3 | apiVersion: kibana.k8s.elastic.co/v1
4 | kind: Kibana
5 | metadata:
6 | name: demo
7 | spec:
8 | version: 8.6.1
9 | count: 1
10 | elasticsearchRef:
11 | name: demo #elasticsearch deployment name
12 | namespace: default
13 | http:
14 | service:
15 | metadata:
16 | annotations:
17 | service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
18 | service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
19 | service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
20 | service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
21 | service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
22 | spec:
23 | type: LoadBalancer # default is ClusterIP
24 | podTemplate:
25 | spec:
26 | containers:
27 | - name: kibana
28 | env:
29 | - name: NODE_OPTIONS
30 | value: "--max-old-space-size=2048"
31 | resources:
32 | requests:
33 | memory: 1Gi
34 | cpu: 0.5
35 | limits:
36 | memory: 2.5Gi
37 | cpu: 2
38 |
--------------------------------------------------------------------------------
/06-deploy-keda/README.md:
--------------------------------------------------------------------------------
1 | Please refer to [keda-eks-event-driven-autoscaling-demo](https://github.com/ChimbuChinnadurai/keda-eks-event-driven-autoscaling-demo/tree/main) for EKS KEDA example.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # aws-eks-sample-templates
2 |
3 | This repository contains sample templates to get started with AWS EKS quickly.
4 |
5 | Each directory contains the instructions to deploy the required resources in EKS.
6 |
7 | ## Create a new eks cluster and deploy a sample application
8 |
9 | https://user-images.githubusercontent.com/112865563/213638891-8c4e03c0-4ef4-4e1e-a2fb-9e0679317f89.mp4
10 |
11 | ## Deploy Prometheus and Grafana
12 |
13 | https://user-images.githubusercontent.com/112865563/215078845-5fcabb5f-3bd8-4769-b735-b9c5ce808111.mp4
14 |
--------------------------------------------------------------------------------