├── .gitignore
├── README.md
├── day-1
│   ├── images
│   │   ├── Introduction-to-Observability.png
│   │   └── why-monitoring-why-observability.png
│   └── readme.md
├── day-2
│   ├── custom_kube_prometheus_stack.yml
│   ├── images
│   │   └── prometheus-architecture.gif
│   └── readme.md
├── day-3
│   ├── alb_controller.md
│   ├── ingress_kube_prom_stack.yaml
│   └── readme.md
├── day-4
│   ├── alerts-alertmanager-servicemonitor-manifest
│   │   ├── alertmangerconfig.yml
│   │   ├── alerts.yml
│   │   ├── email-secrets.yml
│   │   ├── kustomization.yml
│   │   └── serviceMonitor.yml
│   ├── application
│   │   ├── service-a
│   │   │   ├── Dockerfile
│   │   │   ├── index.js
│   │   │   ├── package-lock.json
│   │   │   ├── package.json
│   │   │   └── tracing.js
│   │   └── service-b
│   │       ├── Dockerfile
│   │       ├── index.js
│   │       ├── package-lock.json
│   │       ├── package.json
│   │       └── tracing.js
│   ├── images
│   │   └── architecture.gif
│   ├── kubernetes-manifest
│   │   ├── deployment-svc-a.yml
│   │   ├── deployment-svc-b.yml
│   │   ├── kustomization.yml
│   │   ├── service-svc-a.yml
│   │   └── service-svc-b.yml
│   ├── readme.md
│   └── test.sh
├── day-5
│   ├── fluentbit-values.yaml
│   ├── images
│   │   └── architecture.gif
│   └── readme.md
├── day-6
│   ├── images
│   │   └── architecture.gif
│   ├── jaeger-values.yaml
│   └── readme.md
├── day-7
│   ├── README.md
│   ├── jaeger-values.yaml
│   ├── k8s-manifests
│   │   ├── deployment-a.yml
│   │   ├── deployment-b.yml
│   │   ├── kustomization.yml
│   │   ├── namespace.yml
│   │   ├── svc-a.yml
│   │   └── svc-b.yml
│   ├── microservice-a
│   │   ├── .dockerignore
│   │   ├── .env
│   │   ├── docker-compose.yml
│   │   ├── dockerfile
│   │   ├── go.mod
│   │   ├── go.sum
│   │   ├── main.go
│   │   ├── otel-collector-config.yaml
│   │   ├── prometheus.yaml
│   │   └── test.sh
│   ├── microservice-b
│   │   ├── .dockerignore
│   │   ├── .env
│   │   ├── docker-compose.yml
│   │   ├── dockerfile
│   │   ├── go.mod
│   │   ├── go.sum
│   │   ├── main.go
│   │   ├── otel-collector-config.yaml
│   │   ├── prometheus.yaml
│   │   └── test.sh
│   ├── otel-collector-values.yaml
│   ├── prometheus-values.yaml
│   └── test.sh
└── opensearch-stack
    ├── fluent-bit-config.yaml
    ├── fluent-bit-daemonset.yaml
    ├── log-generator.yaml
    └── prerequisites.md
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | **/*.pptx
2 |
3 | **/**/node_modules
4 |
5 | **/**/*.pem
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 |
2 | # 📚 7-Day Observability Tutorial Series
3 |
4 | Welcome to the 7-Day Observability Tutorial Series! This repository contains the code and detailed explanations for setting up and understanding observability in Kubernetes using Prometheus, Grafana, the EFK stack (Elasticsearch, FluentBit, Kibana), Jaeger, groundcover (eBPF), OpenTelemetry, and more.
5 |
6 | ## 📅 Overview of Each Day
7 |
8 | ### Day 1: Introduction to Observability
9 | - **Concepts Covered**:
10 | - Introduction to Observability, Monitoring, Logging, and Tracing.
11 | - The difference between Monitoring and Observability.
12 | - Tools available for Monitoring and Observability.
13 | - Comparison between monitoring and observing in Bare-Metal Servers vs. Kubernetes.
14 | - **Key Learning**:
15 | - Understand the fundamental concepts of observability.
16 | - Learn why monitoring and observability are crucial in modern IT environments.
17 |
18 | ### Day 2: Prometheus - Setting Up Monitoring
19 | - **Concepts Covered**:
20 | - Introduction to Prometheus and its architecture.
21 | - Setup and configuration of Prometheus in an EKS cluster.
22 | - Installation of kube-prometheus-stack with Helm and integration with Grafana.
23 | - Basic queries and setup for monitoring with Prometheus and Grafana.
24 | - **Key Learning**:
25 | - Get hands-on experience with Prometheus and Grafana.
26 | - Learn to install and configure Prometheus on Kubernetes.
27 |
28 | ### Day 3: Metrics and PromQL in Prometheus
29 | - **Concepts Covered**:
30 | - Introduction to PromQL and basic querying techniques.
31 | - Aggregation and functions in PromQL to analyze metrics data.
32 | - **Key Learning**:
33 | - Master the Prometheus Query Language (PromQL) for querying and analyzing metrics.
34 |
35 | ### Day 4: Instrumentation and Custom Metrics
36 | - **Concepts Covered**:
37 | - Instrumentation for adding monitoring capabilities to applications.
38 | - Understanding different types of metrics in Prometheus: Counter, Gauge, Histogram, and Summary.
39 | - Writing custom metrics in a Node.js application using the `prom-client` library.
40 | - Dockerizing the application and deploying it on Kubernetes.
41 | - Setting up Alertmanager for alerting based on custom metrics.
42 | - **Key Learning**:
43 | - Learn how to instrument applications to expose custom metrics.
44 | - Configure alerts in Alertmanager to monitor application performance.
45 | - Understand how to work with different types of metrics in Prometheus.
46 |
47 | ### Day 5: Logging with EFK Stack
48 | - **Concepts Covered**:
49 | - Introduction to logging in distributed systems and Kubernetes.
50 | - Setting up the EFK stack (Elasticsearch, Fluentbit, Kibana) on Kubernetes.
51 | - Detailed setup and configuration for collecting and visualizing logs.
52 | - Cleaning up the Kubernetes cluster and resources.
53 | - **Key Learning**:
54 | - Understand the importance of logging and how to set up the EFK stack to collect and visualize logs.
55 |
56 | ### Day 6: Distributed Tracing with Jaeger
57 | - **Concepts Covered**:
58 | - Introduction to Jaeger and its architecture for distributed tracing.
59 | - Setting up Jaeger in a Kubernetes cluster using Helm.
60 | - Instrumenting services using OpenTelemetry to enable tracing.
61 | - Viewing and analyzing traces in the Jaeger UI.
62 | - Cleaning up the environment after setting up Jaeger.
63 | - **Key Learning**:
64 | - Gain insights into distributed tracing and how it helps in debugging and performance optimization.
65 | - Learn how to set up and configure Jaeger for tracing in a microservices architecture.
66 |
67 | ### Day 7: OpenTelemetry – Setting Up Unified Observability
68 | - **Concepts Covered**:
69 | - Introduction to OpenTelemetry, a unified framework for observability.
70 | - Understanding how OpenTelemetry integrates tracing, metrics, and logging.
71 | - Comparison of OpenTelemetry with other observability tools like Jaeger and Prometheus.
72 | - Supported programming languages and multi-language support in OpenTelemetry.
73 | - Step-by-step setup of OpenTelemetry in Kubernetes.
74 | - **Key Learning**:
75 | - Learn how OpenTelemetry simplifies the process of collecting and exporting telemetry data.
76 | - Understand the benefits of a unified observability approach using OpenTelemetry.
77 | - Gain hands-on experience with setting up OpenTelemetry Collector, Prometheus, Jaeger, and Elasticsearch to monitor a Golang microservice application.
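
## 🚀 Getting Started
To follow along, clone the repository and work through the days in order. A minimal sketch (the repository URL is inferred from the image links used throughout this repo):

```bash
git clone https://github.com/iam-veeramalla/observability-zero-to-hero.git
cd observability-zero-to-hero/day-1
```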
78 |
79 |
--------------------------------------------------------------------------------
/day-1/images/Introduction-to-Observability.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/iam-veeramalla/observability-zero-to-hero/9445b2364672b23b72f029e65471ed485a0c8950/day-1/images/Introduction-to-Observability.png
--------------------------------------------------------------------------------
/day-1/images/why-monitoring-why-observability.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/iam-veeramalla/observability-zero-to-hero/9445b2364672b23b72f029e65471ed485a0c8950/day-1/images/why-monitoring-why-observability.png
--------------------------------------------------------------------------------
/day-1/readme.md:
--------------------------------------------------------------------------------
1 | # 💡 Introduction to Observability
2 | - Observability is the ability to understand the internal state of a system by analyzing the data it produces, including logs, metrics, and traces.
3 |
4 | - Monitoring (Metrics): involves tracking system metrics like CPU usage, memory usage, and network performance, and provides alerts based on predefined thresholds and conditions.
5 | - `Monitoring tells us what is happening.`
6 | - Logging (Logs): involves the collection of log data from various components of a system.
7 | - `Logging explains why it is happening.`
8 | - Tracing (Traces): involves tracking the flow of a request or transaction as it moves through different services and components within a system.
9 | - `Tracing shows how it is happening.`
10 |
11 | ![Introduction to Observability](images/Introduction-to-Observability.png)
12 |
13 | ## 🤔 Why Monitoring?
14 | - Monitoring helps us keep an eye on our systems to ensure they are working properly.
15 | - Purpose: maintaining the **health, performance, and security** of IT environments.
16 | - It enables early detection of issues, ensuring that they can be addressed before causing significant downtime or data loss.
17 |
18 | - We use monitoring to:
19 | - Detect Problems Early
20 | - Measure Performance
21 | - Ensure Availability
22 |
23 | ## 🤔 Why Observability?
24 | - Observability helps us understand why our systems are behaving the way they are.
25 | - It’s like having a detailed map and tools to explore and diagnose issues.
26 |
27 | - We use observability to:
28 | - Diagnose Issues
29 | - Understand Behavior
30 | - Improve Systems
31 |
32 | ![why-monitoring-why-observability](images/why-monitoring-why-observability.png)
33 |
34 |
35 | ## 🆚 What is the Exact Difference Between Monitoring and Observability?
36 | - 🔥 Monitoring is the *`when and what`* of a system error, and observability is the *`why and how`*.
37 |
38 | | Category | Monitoring | Observability |
39 | |----------------|----------------------------------------------|------------------------------------------------------|
40 | | Focus | Checking if everything is working as expected| Understanding why things are happening in the system |
41 | | Data | Collects metrics like CPU usage, memory usage, and error rates | Collects logs, metrics, and traces to provide a full picture |
42 | | Alerts | Sends notifications when something goes wrong| Correlates events and anomalies to identify root causes |
43 | | Example | If a server's CPU usage goes above 90%, monitoring will alert us | If a website is slow, observability helps us trace the user's request through different services to find the bottleneck |
44 | | Insight | Identifies potential issues before they become critical | Helps diagnose issues and understand system behavior |
45 |
46 |
47 | ## 🔭 Does Observability Cover Monitoring?
48 | - Yes! Monitoring is a subset of Observability.
49 | - Observability is a broader concept that includes monitoring as one of its components.
50 | - Monitoring focuses on tracking specific metrics and alerting on predefined conditions.
51 | - Observability provides a comprehensive understanding of the system by collecting and analyzing a wider range of data, including **logs, metrics, and traces**.
52 |
53 | ## 🖥️ What Can Be Monitored?
54 | - Infrastructure: CPU usage, memory usage, disk I/O, network traffic.
55 | - Applications: Response times, error rates, throughput.
56 | - Databases: Query performance, connection pool usage, transaction rates.
57 | - Network: Latency, packet loss, bandwidth usage.
58 | - Security: Unauthorized access attempts, vulnerability scans, firewall logs.
59 |
60 | ## 👀 What Can Be Observed?
61 | - Logs: Detailed records of events and transactions within the system.
62 | - Metrics: Quantitative data points like CPU load, memory consumption, and request counts.
63 | - Traces: Data that shows the flow of requests through various services and components.
64 |
65 | ## 🆚 Monitoring on Bare-Metal Servers vs. Monitoring Kubernetes
66 | - Bare-Metal Servers:
67 | - Direct Access: Easier access to hardware metrics and logs.
68 | - Fewer Layers: Simpler environment with fewer abstraction layers.
69 |
70 | - Kubernetes:
71 | - Dynamic Environment: Challenges with monitoring ephemeral containers and dynamic scaling.
72 | - Distributed Nature: Requires tools that can handle distributed systems and correlate data from multiple sources.
73 |
74 | ## 🆚 Observing on Bare-Metal Servers vs. Observing Kubernetes
75 | - Bare-Metal Servers:
76 | - Simpler Observability: Easier to collect and correlate logs, metrics, and traces due to fewer components and layers.
77 |
78 | - Kubernetes:
79 | - Complex Observability: Requires sophisticated tools to handle the dynamic and distributed nature of containers and microservices.
80 | - Integration: Necessitates the integration of multiple observability tools to get a complete picture of the system.
81 |
82 | ## ⚒️ What are the Tools Available?
83 | - **Monitoring Tools**: Prometheus, Grafana, Nagios, Zabbix, PRTG.
84 | - **Observability Tools**: ELK Stack (Elasticsearch, Logstash, Kibana), EFK Stack (Elasticsearch, FluentBit, Kibana), Splunk, Jaeger, Zipkin, New Relic, Dynatrace, Datadog.
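
To make the three signals concrete, here is roughly what one slow HTTP request might produce in each form (the samples below are illustrative, not the output of any specific tool in this series):

```bash
# Metric: a numeric sample with labels (Prometheus exposition format)
http_request_duration_seconds_sum{method="GET", path="/checkout"} 2.7

# Log: a structured record explaining what happened
{"time":"2024-05-01T12:00:03Z","level":"error","msg":"payment gateway timeout","path":"/checkout"}

# Trace: the request's path through services, with per-span timings
frontend (2.7s) -> checkout-service (2.6s) -> payment-gateway (2.5s, timed out)
```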
85 |
86 |
--------------------------------------------------------------------------------
/day-2/custom_kube_prometheus_stack.yml:
--------------------------------------------------------------------------------
1 | alertmanager:
2 |   alertmanagerSpec:
3 |     # Selects Alertmanager configuration based on these labels. Ensure that the Alertmanager configuration has matching labels.
4 |     # ✅ Solves error: Misconfigured Alertmanager selectors can lead to missing alert configurations.
5 |     # ✅ Solves error: Alertmanager wasn't able to find the applied CRD (kind: AlertmanagerConfig)
6 |     alertmanagerConfigSelector:
7 |       matchLabels:
8 |         release: monitoring
9 |
10 |     # Sets the number of Alertmanager replicas to 2 for high availability.
11 |     # ✅ Solves error: A single replica can cause alerting issues during pod failures.
12 |     # ✅ Solves error: Alertmanager Cluster Status is Disabled (GitHub issue)
13 |     replicas: 2
14 |
15 |     # Sets the strategy for matching Alertmanager configurations. 'None' means no specific matching strategy.
16 |     # ✅ Solves error: Incorrect matcher strategy can lead to unhandled alert configurations.
17 |     # ✅ Solves error: Get rid of namespace matchers when creating AlertManagerConfig (GitHub issue)
18 |     alertmanagerConfigMatcherStrategy:
19 |       type: None
--------------------------------------------------------------------------------
/day-2/images/prometheus-architecture.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/iam-veeramalla/observability-zero-to-hero/9445b2364672b23b72f029e65471ed485a0c8950/day-2/images/prometheus-architecture.gif
--------------------------------------------------------------------------------
/day-2/readme.md:
--------------------------------------------------------------------------------
1 | # Monitoring
2 |
3 | ## Metrics vs Monitoring
4 |
5 | Metrics are measurements or data points that tell you what is happening. For example, the number of steps you walk each day, your heart rate, or the temperature outside—these are all metrics.
6 |
7 | Monitoring is the process of keeping an eye on these metrics over time to understand what’s normal, identify changes, and detect problems. It's like watching your step count daily to see if you're meeting your fitness goal or checking your heart rate to make sure it's in a healthy range.
8 |
9 | ## 🚀 Prometheus
10 | - Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud.
11 | - It is known for its robust data model, powerful query language (PromQL), and the ability to generate alerts based on the collected time-series data.
12 | - It can be configured and set up on both bare-metal servers and container environments like Kubernetes.
13 |
14 | ## 🏠 Prometheus Architecture
15 | - The architecture of Prometheus is designed to be highly flexible, scalable, and modular.
16 | - It consists of several core components, each responsible for a specific aspect of the monitoring process.
17 |
18 | ![Prometheus Architecture](images/prometheus-architecture.gif)
19 |
20 | ### 🔥 Prometheus Server
21 | - Prometheus server is the core of the monitoring system. It is responsible for scraping metrics from various configured targets, storing them in its time-series database (TSDB), and serving queries through its HTTP API.
22 | - Components:
23 | - **Retrieval**: This module handles the scraping of metrics from endpoints, which are discovered either through static configurations or dynamic service discovery methods.
24 | - **TSDB (Time Series Database)**: The data scraped from targets is stored in the TSDB, which is designed to handle high volumes of time-series data efficiently.
25 | - **HTTP Server**: This provides an API for querying data using PromQL, retrieving metadata, and interacting with other components of the Prometheus ecosystem.
26 | - **Storage**: The scraped data is stored on local disk (HDD/SSD) in a format optimized for time-series data.
27 |
28 | ### 🌐 Service Discovery
29 | - Service discovery automatically identifies and manages the list of scrape targets (i.e., services or applications) that Prometheus monitors.
30 | - This is crucial in dynamic environments like Kubernetes where services are constantly being created and destroyed.
31 | - Components:
32 | - **Kubernetes**: In Kubernetes environments, Prometheus can automatically discover services, pods, and nodes using the Kubernetes API, ensuring it monitors the most up-to-date list of targets.
33 | - **File SD (Service Discovery)**: Prometheus can also read static target configurations from files, allowing for flexibility in environments where dynamic service discovery is not used.
34 |
35 | ### 📤 Pushgateway
36 | - The Pushgateway is used to expose metrics from short-lived jobs or applications that cannot be scraped directly by Prometheus.
37 | - These jobs push their metrics to the Pushgateway, which then makes them available for Prometheus to scrape (pull).
38 | - Use Case:
39 | - It's particularly useful for batch jobs or tasks that have a limited lifespan and would otherwise not have their metrics collected.
40 |
41 | ### 🚨 Alertmanager
42 | - The Alertmanager is responsible for managing alerts generated by the Prometheus server.
43 | - It takes care of deduplicating, grouping, and routing alerts to the appropriate notification channels such as PagerDuty, email, or Slack.
44 |
45 | ### 🧲 Exporters
46 | - Exporters are small applications that collect metrics from various third-party systems and expose them in a format Prometheus can scrape. They are essential for monitoring systems that do not natively support Prometheus.
47 | - Types of Exporters:
48 | - Common exporters include the Node Exporter (for hardware metrics), the MySQL Exporter (for database metrics), and various other application-specific exporters.
49 |
50 | ### 🖥️ Prometheus Web UI
51 | - The Prometheus Web UI allows users to explore the collected metrics data, run ad-hoc PromQL queries, and visualize the results directly within Prometheus.
52 |
53 | ### 📊 Grafana
54 | - Grafana is a powerful dashboard and visualization tool that integrates with Prometheus to provide rich, customizable visualizations of the metrics data.
55 |
56 | ### 🔌 API Clients
57 | - API clients interact with Prometheus through its HTTP API to fetch data, query metrics, and integrate Prometheus with other systems or custom applications.
58 |
59 | # 🛠️ Installation & Configurations
60 | ## 📦 Step 1: Create EKS Cluster
61 |
62 | ### Prerequisites
63 | - Download and install the AWS CLI - please refer to [this](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) link.
64 | - Setup and configure the AWS CLI using the `aws configure` command.
65 | - Install and configure eksctl using the steps mentioned [here](https://eksctl.io/installation/).
66 | - Install and configure kubectl as mentioned [here](https://kubernetes.io/docs/tasks/tools/).
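
Before creating the cluster, it is worth confirming that each CLI is installed and that your AWS credentials work (standard verification commands for these tools):

```bash
aws sts get-caller-identity   # confirms the AWS CLI is configured with valid credentials
eksctl version                # confirms eksctl is on the PATH
kubectl version --client     # confirms kubectl is installed
```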
67 |
68 |
69 | ```bash
70 | eksctl create cluster --name=observability \
71 |                       --region=us-east-1 \
72 |                       --zones=us-east-1a,us-east-1b \
73 |                       --without-nodegroup
74 | ```
75 | ```bash
76 | eksctl utils associate-iam-oidc-provider \
77 |     --region us-east-1 \
78 |     --cluster observability \
79 |     --approve
80 | ```
81 | ```bash
82 | eksctl create nodegroup --cluster=observability \
83 |                         --region=us-east-1 \
84 |                         --name=observability-ng-private \
85 |                         --node-type=t3.medium \
86 |                         --nodes-min=2 \
87 |                         --nodes-max=3 \
88 |                         --node-volume-size=20 \
89 |                         --managed \
90 |                         --asg-access \
91 |                         --external-dns-access \
92 |                         --full-ecr-access \
93 |                         --appmesh-access \
94 |                         --alb-ingress-access \
95 |                         --node-private-networking
96 |
97 | # Update the ~/.kube/config file
98 | aws eks update-kubeconfig --name observability
99 | ```
100 |
101 | ## 🧰 Step 2: Add the kube-prometheus-stack Helm Repository
102 | ```bash
103 | helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
104 | helm repo update
105 | ```
106 |
107 | ## 🚀 Step 3: Deploy the chart into a new namespace "monitoring"
108 | ```bash
109 | kubectl create ns monitoring
110 | ```
111 | ```bash
112 | cd day-2
113 |
114 | helm install monitoring prometheus-community/kube-prometheus-stack \
115 |     -n monitoring \
116 |     -f ./custom_kube_prometheus_stack.yml
117 | ```
118 |
119 | ## ✅ Step 4: Verify the Installation
120 | ```bash
121 | kubectl get all -n monitoring
122 | ```
123 | - **Prometheus UI**:
124 | ```bash
125 | kubectl port-forward service/prometheus-operated -n monitoring 9090:9090
126 | ```
127 |
128 | **NOTE:** If you are using an EC2 instance or cloud VM, you need to pass `--address 0.0.0.0` to the above command. Then you can access the UI on port 9090 of the VM's public IP.
129 |
130 | - **Grafana UI**: the password is `prom-operator`
131 | ```bash
132 | kubectl port-forward service/monitoring-grafana -n monitoring 8080:80
133 | ```
134 | - **Alertmanager UI**:
135 | ```bash
136 | kubectl port-forward service/alertmanager-operated -n monitoring 9093:9093
137 | ```
138 |
139 | ## 🧼 Step 5: Clean Up
140 | - **Uninstall the Helm chart**:
141 | ```bash
142 | helm uninstall monitoring --namespace monitoring
143 | ```
144 | - **Delete the namespace**:
145 | ```bash
146 | kubectl delete ns monitoring
147 | ```
148 | - **Delete the cluster & everything else**:
149 | ```bash
150 | eksctl delete cluster --name observability
151 | ```
152 |
--------------------------------------------------------------------------------
/day-3/alb_controller.md:
--------------------------------------------------------------------------------
1 | # ALB Controller Installation
2 |
3 | Follow the steps mentioned [here](https://github.com/iam-veeramalla/aws-devops-zero-to-hero/blob/main/day-22/alb-controller-add-on.md)
--------------------------------------------------------------------------------
/day-3/ingress_kube_prom_stack.yaml:
--------------------------------------------------------------------------------
1 | apiVersion: networking.k8s.io/v1
2 | kind: Ingress
3 | metadata:
4 |   name: kubernetes-prometheus-stack
5 |   annotations:
6 |     alb.ingress.kubernetes.io/scheme: internet-facing
7 |     alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}]'
8 |     alb.ingress.kubernetes.io/target-type: ip
9 | spec:
10 |   ingressClassName: alb
11 |   rules:
12 |     - http:
13 |         paths:
14 |           - path: /prometheus
15 |             pathType: Prefix
16 |             backend:
17 |               service:
18 |                 name: prometheus-service # Change this to your Prometheus service name
19 |                 port:
20 |                   number: 9090
21 |           - path: /grafana
22 |             pathType: Prefix
23 |             backend:
24 |
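# NOTE (assumption, not part of the original manifest): serving Grafana under a
# path prefix like /grafana typically also requires setting Grafana's root_url
# (e.g. via the GF_SERVER_ROOT_URL env var); Prometheus similarly supports
# --web.external-url / --web.route-prefix for path-prefixed access.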
service: 25 | name: grafana-service # Change this to your Grafana service name 26 | port: 27 | number: 3000 28 | - path: /alertmanager 29 | pathType: Prefix 30 | backend: 31 | service: 32 | name: alertmanager-service # Change this to your Alertmanager service name 33 | port: 34 | number: 9093 35 | -------------------------------------------------------------------------------- /day-3/readme.md: -------------------------------------------------------------------------------- 1 | 2 | ## 📊 Metrics in Prometheus: 3 | - Metrics in Prometheus are the core data objects that represent measurements collected from monitored systems. 4 | - These metrics provide insights into various aspects of **system performance, health, and behavior**. 5 | 6 | ## 🏷️ Labels: 7 | - Metrics are paired with Labels. 8 | - Labels are key-value pairs that allow you to differentiate between dimensions of a metric, such as different services, instances, or endpoints. 9 | 10 | 11 | ## 🔍 Example: 12 | ```bash 13 | container_cpu_usage_seconds_total{namespace="kube-system", endpoint="https-metrics"} 14 | ``` 15 | - `container_cpu_usage_seconds_total` is the metric. 16 | - `{namespace="kube-system", endpoint="https-metrics"}` are the labels. 17 | 18 | 19 | ## 🛠️ What is PromQL? 20 | - PromQL (Prometheus Query Language) is a powerful and flexible query language used to query data from Prometheus. 21 | - It allows you to retrieve and manipulate time series data, perform mathematical operations, aggregate data, and much more. 22 | 23 | - 🔑 Key Features of PromQL: 24 | - Selecting Time Series: You can select specific metrics with filters and retrieve their data. 25 | - Mathematical Operations: PromQL allows for mathematical operations on metrics. 26 | - Aggregation: You can aggregate data across multiple time series. 27 | - Functionality: PromQL includes a wide range of functions to analyze and manipulate data. 28 | 29 | ## 💡 Basic Examples of PromQL 30 | - `container_cpu_usage_seconds_total` 31 | - Return all time series with the metric container_cpu_usage_seconds_total 32 | - `container_cpu_usage_seconds_total{namespace="kube-system",pod=~"kube-proxy.*"}` 33 | - Return all time series with the metric `container_cpu_usage_seconds_total` and the given `namespace` and `pod` labels. 34 | - `container_cpu_usage_seconds_total{namespace="kube-system",pod=~"kube-proxy.*"}[5m]` 35 | - Return a whole range of time (in this case 5 minutes up to the query time) for the same vector, making it a range vector. 36 | 37 | ## ⚙️ Aggregation & Functions in PromQL 38 | - Aggregation in PromQL allows you to combine multiple time series into a single one, based on certain labels. 39 | - **Sum Up All CPU Usage**: 40 | ```bash 41 | sum(rate(node_cpu_seconds_total[5m])) 42 | ``` 43 | - This query aggregates the CPU usage across all nodes. 44 | 45 | - **Average Memory Usage per Namespace:** 46 | ```bash 47 | avg(container_memory_usage_bytes) by (namespace) 48 | ``` 49 | - This query provides the average memory usage grouped by namespace. 50 | 51 | - **rate() Function:** 52 | - The rate() function calculates the per-second average rate of increase of the time series in a specified range. 53 | ```bash 54 | rate(container_cpu_usage_seconds_total[5m]) 55 | ``` 56 | - This calculates the rate of CPU usage over 5 minutes. 57 | - **increase() Function:** 58 | - The increase() function returns the increase in a counter over a specified time range. 
59 | ```bash
60 | increase(kube_pod_container_status_restarts_total[1h])
61 | ```
62 | - This gives the total increase in container restarts over the last hour.
63 |
64 | - **histogram_quantile() Function:**
65 | - The histogram_quantile() function calculates quantiles (e.g., 95th percentile) from histogram data.
66 | ```bash
67 | histogram_quantile(0.95, sum(rate(apiserver_request_duration_seconds_bucket[5m])) by (le))
68 | ```
69 | - This calculates the 95th percentile of Kubernetes API request durations.
70 |
--------------------------------------------------------------------------------
/day-4/alerts-alertmanager-servicemonitor-manifest/alertmangerconfig.yml:
--------------------------------------------------------------------------------
1 | apiVersion: monitoring.coreos.com/v1alpha1
2 | kind: AlertmanagerConfig
3 | metadata:
4 |   name: main-rules-alert-config
5 |   namespace: monitoring
6 |   labels:
7 |     release: monitoring
8 | spec:
9 |   route:
10 |     repeatInterval: 30m
11 |     receiver: 'null'
12 |     routes:
13 |       - matchers:
14 |           - name: alertname
15 |             value: HighCpuUsage
16 |         receiver: 'send-email'
17 |       - matchers:
18 |           - name: alertname
19 |             value: PodRestart
20 |         receiver: 'send-email'
21 |         repeatInterval: 5m
22 |   receivers:
23 |     - name: 'send-email'
24 |       emailConfigs:
25 |         - to: YOUR_EMAIL_ID
26 |           from: YOUR_EMAIL_ID
27 |           sendResolved: false
28 |           smarthost: smtp.gmail.com:587
29 |           authUsername: YOUR_EMAIL_ID
30 |           authIdentity: YOUR_EMAIL_ID
31 |           authPassword:
32 |             name: mail-pass
33 |             key: gmail-pass
34 |     - name: 'null'
--------------------------------------------------------------------------------
/day-4/alerts-alertmanager-servicemonitor-manifest/alerts.yml:
--------------------------------------------------------------------------------
1 | apiVersion: monitoring.coreos.com/v1
2 | kind: PrometheusRule
3 | metadata:
4 |   name: custom-alert-rules
5 |   namespace: monitoring
6 |   labels:
7 |     release: monitoring # if you installed via Helm, this must match the Helm release name, otherwise Prometheus will not recognize the rule
8 | spec:
9 |   groups:
10 |     - name: custom.rules
11 |       rules:
12 |         - alert: HighCpuUsage
13 |           expr: 100 - (avg by(instance) (rate(node_cpu_seconds_total{mode="idle"}[2m])) * 100) > 50
14 |           for: 5m
15 |           labels:
16 |             severity: warning
17 |           annotations:
18 |             summary: "High CPU usage on instance {{ $labels.instance }}"
19 |             description: "CPU usage is above 50% (current value: {{ $value }}%)"
20 |         - alert: PodRestart
21 |           expr: kube_pod_container_status_restarts_total > 2
22 |           for: 0m
23 |           labels:
24 |             severity: critical
25 |           annotations:
26 |             summary: "Pod restart detected in namespace {{ $labels.namespace }}"
27 |             description: "Pod {{ $labels.pod }} in namespace {{ $labels.namespace }} has restarted {{ $value }} times"
--------------------------------------------------------------------------------
/day-4/alerts-alertmanager-servicemonitor-manifest/email-secrets.yml:
--------------------------------------------------------------------------------
1 | apiVersion: v1
2 | kind: Secret
3 | type: Opaque
4 | metadata:
5 |   name: mail-pass
6 |   namespace: monitoring
7 |   labels:
8 |     release: monitoring
9 | data:
10 |   gmail-pass: <> # base64-encoded app password (Secret "data" values must be base64)
11 |
12 |
--------------------------------------------------------------------------------
/day-4/alerts-alertmanager-servicemonitor-manifest/kustomization.yml:
--------------------------------------------------------------------------------
1 | apiVersion: kustomize.config.k8s.io/v1beta1
2 | kind: Kustomization
3 | namespace: monitoring
4
| resources: 5 | - alerts.yml 6 | - email-secrets.yml 7 | - alertmangerconfig.yml 8 | - serviceMonitor.yml 9 | -------------------------------------------------------------------------------- /day-4/alerts-alertmanager-servicemonitor-manifest/serviceMonitor.yml: -------------------------------------------------------------------------------- 1 | apiVersion: monitoring.coreos.com/v1 2 | kind: ServiceMonitor 3 | metadata: 4 | labels: 5 | app: a-service-service-monitor 6 | release: monitoring 7 | name: a-service-service-monitor 8 | namespace: monitoring 9 | spec: 10 | jobLabel: job 11 | endpoints: 12 | - interval: 2s 13 | port: a-service-port 14 | path: /metrics 15 | selector: 16 | matchLabels: 17 | app: a-service 18 | namespaceSelector: 19 | matchNames: 20 | - dev 21 | -------------------------------------------------------------------------------- /day-4/application/service-a/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM node:18-alpine 2 | 3 | COPY package*.json /usr/app/ 4 | 5 | COPY index.js /usr/app/ 6 | 7 | COPY tracing.js /usr/app/ 8 | 9 | WORKDIR /usr/app 10 | 11 | RUN npm install 12 | 13 | CMD ["node", "index.js"] 14 | -------------------------------------------------------------------------------- /day-4/application/service-a/index.js: -------------------------------------------------------------------------------- 1 | // service-a/index.js 2 | require('dotenv').config(); 3 | require('./tracing'); // Add this line to initialize tracing 4 | const express = require('express'); 5 | const morgan = require('morgan'); 6 | const pino = require('pino'); 7 | const axios = require('axios'); 8 | const promClient = require('prom-client'); 9 | 10 | const app = express(); 11 | 12 | const logger = pino(); 13 | 14 | const logging = () => { 15 | logger.info("Here are the logs") 16 | logger.info("Please have a look ") 17 | logger.info("This is just for testing") 18 | } 19 | 20 | app.use(morgan('common')) 21 | 22 | const PORT = 3001; 23 | 24 | 25 | 26 | 27 | // Prometheus metrics 28 | const httpRequestCounter = new promClient.Counter({ 29 | name: 'http_requests_total', 30 | help: 'Total number of HTTP requests', 31 | labelNames: ['method', 'path', 'status_code'], 32 | }); 33 | 34 | const requestDurationHistogram = new promClient.Histogram({ 35 | name: 'http_request_duration_seconds', 36 | help: 'Duration of HTTP requests in seconds', 37 | labelNames: ['method', 'path', 'status_code'], 38 | buckets: [0.1, 0.5, 1, 5, 10], // Buckets for the histogram in seconds 39 | }); 40 | 41 | const requestDurationSummary = new promClient.Summary({ 42 | name: 'http_request_duration_summary_seconds', 43 | help: 'Summary of the duration of HTTP requests in seconds', 44 | labelNames: ['method', 'path', 'status_code'], 45 | percentiles: [0.5, 0.9, 0.99], // Define your percentiles here 46 | }); 47 | 48 | 49 | 50 | // Gauge metric 51 | const gauge = new promClient.Gauge({ 52 | name: 'node_gauge_example', 53 | help: 'Example of a gauge tracking async task duration', 54 | labelNames: ['method', 'status'] 55 | }); 56 | 57 | // Define an async function that simulates a task taking random time 58 | const simulateAsyncTask = async () => { 59 | const randomTime = Math.random() * 5; // Random time between 0 and 5 seconds 60 | return new Promise((resolve) => setTimeout(resolve, randomTime * 1000)); 61 | }; 62 | 63 | app.disable('etag'); 64 | 65 | // Middleware to track metrics 66 | app.use((req, res, next) => { 67 | const start = Date.now(); 68 | res.on('finish', () => { 69 | 
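// Each completed response is recorded once in all three metric types below
// (counter, histogram, summary), labeled by method, path, and status code.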
const duration = (Date.now() - start) / 1000; // Duration in seconds 70 | const { method, url } = req; 71 | const statusCode = res.statusCode; // Get the actual HTTP status code 72 | httpRequestCounter.labels({ method, path: url, status_code: statusCode }).inc(); 73 | requestDurationHistogram.labels({ method, path: url, status_code: statusCode }).observe(duration); 74 | requestDurationSummary.labels({ method, path: url, status_code: statusCode }).observe(duration); 75 | }); 76 | next(); 77 | }); 78 | 79 | app.get('/', (req, res) => { 80 | res.status(200).json({ 81 | status: "🏃- Running" 82 | }); 83 | }); 84 | 85 | app.get('/healthy', (req, res) => { 86 | res.status(200).json({ 87 | name: "👀 - Obserability 🔥- Abhishek Veeramalla", 88 | status: "healthy" 89 | }) 90 | }); 91 | 92 | app.get('/serverError', (req, res) => { 93 | res.status(500).json({ 94 | error: " Internal server error", 95 | statusCode: 500 96 | }) 97 | }); 98 | 99 | app.get('/notFound', (req, res) => { 100 | res.status(404).json({ 101 | error: "Not Found", 102 | statusCode: "404" 103 | }) 104 | }); 105 | 106 | app.get('/logs', (req, res) => { 107 | logging(); 108 | res.status(200).json({ 109 | objective: "To generate logs" 110 | }) 111 | }); 112 | 113 | 114 | // Simulate a crash by throwing an error 115 | app.get('/crash', (req, res) => { 116 | console.log('Intentionally crashing the server...'); 117 | process.exit(1); 118 | }); 119 | 120 | 121 | // Define the /example route 122 | app.get('/example', async (req, res) => { 123 | const endGauge = gauge.startTimer({ method: req.method, status: res.statusCode }); 124 | await simulateAsyncTask(); 125 | endGauge(); 126 | res.send('Async task completed'); 127 | }); 128 | 129 | // Expose metrics for Prometheus to scrape 130 | app.get('/metrics', async (req, res) => { 131 | res.set('Content-Type', promClient.register.contentType); 132 | res.end(await promClient.register.metrics()); 133 | }); 134 | 135 | // Calling to service-b 136 | app.get('/call-service-b', async (req, res) => { 137 | try { 138 | const response = await axios.get(`${process.env.SERVICE_B_URI}/hello`); 139 | res.send(`

Service B says: ${response.data}

`); 140 | } catch (error) { 141 | res.status(500).send('Error communicating with Service B'); 142 | } 143 | }); 144 | 145 | app.listen(PORT, () => { 146 | console.log(`Service A is running on port ${PORT}`); 147 | }); -------------------------------------------------------------------------------- /day-4/application/service-a/package.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "app-code", 3 | "version": "1.0.0", 4 | "description": "", 5 | "main": "index.js", 6 | "scripts": { 7 | "test": "echo \"Error: no test specified\" && exit 1", 8 | "start": "node index.js" 9 | }, 10 | "author": "", 11 | "license": "ISC", 12 | "dependencies": { 13 | "@opentelemetry/api": "^1.9.0", 14 | "@opentelemetry/auto-instrumentations-node": "^0.49.2", 15 | "@opentelemetry/exporter-jaeger": "^1.26.0", 16 | "@opentelemetry/exporter-trace-otlp-grpc": "^0.53.0", 17 | "@opentelemetry/instrumentation": "^0.53.0", 18 | "@opentelemetry/instrumentation-express": "^0.41.1", 19 | "@opentelemetry/instrumentation-http": "^0.53.0", 20 | "@opentelemetry/resources": "^1.26.0", 21 | "@opentelemetry/sdk-node": "^0.53.0", 22 | "@opentelemetry/sdk-trace-base": "^1.26.0", 23 | "@opentelemetry/sdk-trace-node": "^1.26.0", 24 | "@opentelemetry/semantic-conventions": "^1.27.0", 25 | "axios": "^1.7.6", 26 | "dotenv": "^16.4.5", 27 | "express": "^4.19.2", 28 | "morgan": "^1.10.0", 29 | "pino": "^9.2.0", 30 | "prom-client": "^15.1.2" 31 | } 32 | } 33 | -------------------------------------------------------------------------------- /day-4/application/service-a/tracing.js: -------------------------------------------------------------------------------- 1 | 'use strict'; 2 | 3 | const { NodeTracerProvider } = require('@opentelemetry/sdk-trace-node'); // Updated import 4 | const { JaegerExporter } = require('@opentelemetry/exporter-jaeger'); 5 | const { registerInstrumentations } = require('@opentelemetry/instrumentation'); 6 | const { Resource } = require('@opentelemetry/resources'); 7 | const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions'); 8 | const { SimpleSpanProcessor } = require('@opentelemetry/sdk-trace-base'); 9 | const { HttpInstrumentation } = require('@opentelemetry/instrumentation-http'); 10 | const { ExpressInstrumentation } = require('@opentelemetry/instrumentation-express'); 11 | 12 | // Initialize the provider 13 | const provider = new NodeTracerProvider({ 14 | resource: new Resource({ 15 | [SemanticResourceAttributes.SERVICE_NAME]: 'service-a', 16 | }), 17 | }); 18 | 19 | const JAEGER_ENDPOINT = process.env.OTEL_EXPORTER_JAEGER_ENDPOINT 20 | 21 | // Setup the exporter 22 | const exporter = new JaegerExporter({ 23 | endpoint: JAEGER_ENDPOINT, // Replace with the appropriate Jaeger collector endpoint 24 | }); 25 | 26 | // Add the exporter to the provider 27 | provider.addSpanProcessor(new SimpleSpanProcessor(exporter)); 28 | 29 | // Initialize the provider and instrumentations 30 | provider.register(); 31 | 32 | registerInstrumentations({ 33 | instrumentations: [ 34 | new HttpInstrumentation({ 35 | applyCustomAttributesOnSpan: (span, request, response) => { 36 | span.setAttribute('custom-attribute', 'custom-value'); 37 | }, 38 | }), 39 | new ExpressInstrumentation(), // Add this for Express.js instrumentation 40 | ], 41 | }); 42 | 43 | console.log('Tracing initialized'); 44 | -------------------------------------------------------------------------------- /day-4/application/service-b/Dockerfile: 
-------------------------------------------------------------------------------- 1 | FROM node:18-alpine 2 | 3 | COPY package*.json /usr/app/ 4 | 5 | COPY index.js /usr/app/ 6 | 7 | COPY tracing.js /usr/app/ 8 | 9 | WORKDIR /usr/app 10 | 11 | RUN npm install 12 | 13 | CMD ["node", "index.js"] 14 | -------------------------------------------------------------------------------- /day-4/application/service-b/index.js: -------------------------------------------------------------------------------- 1 | // service-b/index.js 2 | require('dotenv').config(); 3 | require('./tracing'); // Add this line to initialize tracing 4 | const express = require('express'); 5 | const morgan = require('morgan'); 6 | 7 | const app = express(); 8 | const PORT = 3002; 9 | app.use(morgan('common')) 10 | 11 | app.get('/hello', (req, res) => { 12 | res.send('Hello from Service B!'); 13 | }); 14 | 15 | app.listen(PORT, () => { 16 | console.log(`Service B is running on port ${PORT}`); 17 | }); -------------------------------------------------------------------------------- /day-4/application/service-b/package.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "app-code", 3 | "version": "1.0.0", 4 | "description": "", 5 | "main": "index.js", 6 | "scripts": { 7 | "test": "echo \"Error: no test specified\" && exit 1", 8 | "start": "node index.js" 9 | }, 10 | "author": "", 11 | "license": "ISC", 12 | "dependencies": { 13 | "@opentelemetry/api": "^1.9.0", 14 | "@opentelemetry/auto-instrumentations-node": "^0.49.2", 15 | "@opentelemetry/exporter-jaeger": "^1.26.0", 16 | "@opentelemetry/exporter-trace-otlp-grpc": "^0.53.0", 17 | "@opentelemetry/instrumentation": "^0.53.0", 18 | "@opentelemetry/instrumentation-express": "^0.41.1", 19 | "@opentelemetry/instrumentation-http": "^0.53.0", 20 | "@opentelemetry/resources": "^1.26.0", 21 | "@opentelemetry/sdk-node": "^0.53.0", 22 | "@opentelemetry/sdk-trace-base": "^1.26.0", 23 | "@opentelemetry/sdk-trace-node": "^1.26.0", 24 | "@opentelemetry/semantic-conventions": "^1.27.0", 25 | "axios": "^1.7.6", 26 | "dotenv": "^16.4.5", 27 | "express": "^4.19.2", 28 | "morgan": "^1.10.0", 29 | "pino": "^9.2.0", 30 | "prom-client": "^15.1.2" 31 | } 32 | } 33 | -------------------------------------------------------------------------------- /day-4/application/service-b/tracing.js: -------------------------------------------------------------------------------- 1 | 'use strict'; 2 | 3 | const { NodeTracerProvider } = require('@opentelemetry/sdk-trace-node'); // Updated import 4 | const { JaegerExporter } = require('@opentelemetry/exporter-jaeger'); 5 | const { registerInstrumentations } = require('@opentelemetry/instrumentation'); 6 | const { Resource } = require('@opentelemetry/resources'); 7 | const { SemanticResourceAttributes } = require('@opentelemetry/semantic-conventions'); 8 | const { SimpleSpanProcessor } = require('@opentelemetry/sdk-trace-base'); 9 | const { HttpInstrumentation } = require('@opentelemetry/instrumentation-http'); 10 | const { ExpressInstrumentation } = require('@opentelemetry/instrumentation-express'); 11 | 12 | // Initialize the provider 13 | const provider = new NodeTracerProvider({ 14 | resource: new Resource({ 15 | [SemanticResourceAttributes.SERVICE_NAME]: 'service-b', 16 | }), 17 | }); 18 | 19 | const JAEGER_ENDPOINT = process.env.OTEL_EXPORTER_JAEGER_ENDPOINT 20 | 21 | // Setup the exporter 22 | const exporter = new JaegerExporter({ 23 | endpoint: JAEGER_ENDPOINT, // Replace with the appropriate Jaeger 
collector endpoint 24 | }); 25 | 26 | // Add the exporter to the provider 27 | provider.addSpanProcessor(new SimpleSpanProcessor(exporter)); 28 | 29 | // Initialize the provider and instrumentations 30 | provider.register(); 31 | 32 | registerInstrumentations({ 33 | instrumentations: [ 34 | new HttpInstrumentation({ 35 | applyCustomAttributesOnSpan: (span, request, response) => { 36 | span.setAttribute('custom-attribute', 'custom-value'); 37 | }, 38 | }), 39 | new ExpressInstrumentation(), // Add this for Express.js instrumentation 40 | ], 41 | }); 42 | 43 | console.log('Tracing initialized'); 44 | -------------------------------------------------------------------------------- /day-4/images/architecture.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/iam-veeramalla/observability-zero-to-hero/9445b2364672b23b72f029e65471ed485a0c8950/day-4/images/architecture.gif -------------------------------------------------------------------------------- /day-4/kubernetes-manifest/deployment-svc-a.yml: -------------------------------------------------------------------------------- 1 | apiVersion: apps/v1 2 | kind: Deployment 3 | metadata: 4 | labels: 5 | app: service-a-deployment 6 | # run: service-a-deployment 7 | name: service-a-deployment 8 | spec: 9 | replicas: 1 10 | selector: 11 | matchLabels: 12 | app: service-a-deployment 13 | template: 14 | metadata: 15 | labels: 16 | app: service-a-deployment 17 | spec: 18 | containers: 19 | - image: abhishekf5/demoservice-a:v 20 | name: service-a 21 | imagePullPolicy: Always 22 | ports: 23 | - containerPort: 3001 24 | env: 25 | - name: OTEL_EXPORTER_JAEGER_ENDPOINT 26 | value: "http://jaeger-collector.tracing:14268/api/traces" 27 | - name: SERVICE_B_URI 28 | value: "http://b-service.dev" 29 | -------------------------------------------------------------------------------- /day-4/kubernetes-manifest/deployment-svc-b.yml: -------------------------------------------------------------------------------- 1 | apiVersion: apps/v1 2 | kind: Deployment 3 | metadata: 4 | labels: 5 | app: service-b-deployment 6 | # run: service-b-deployment 7 | name: service-b-deployment 8 | spec: 9 | replicas: 1 10 | selector: 11 | matchLabels: 12 | app: service-b-deployment 13 | template: 14 | metadata: 15 | labels: 16 | app: service-b-deployment 17 | spec: 18 | containers: 19 | - image: abhishekf5/demoservice-a:v 20 | name: service-b 21 | imagePullPolicy: Always 22 | ports: 23 | - containerPort: 3002 24 | env: 25 | - name: OTEL_EXPORTER_JAEGER_ENDPOINT 26 | value: "http://jaeger-collector.tracing:14268/api/traces" -------------------------------------------------------------------------------- /day-4/kubernetes-manifest/kustomization.yml: -------------------------------------------------------------------------------- 1 | apiVersion: kustomize.config.k8s.io/v1beta1 2 | kind: Kustomization 3 | namespace: dev 4 | resources: 5 | - deployment-svc-a.yml 6 | - service-svc-a.yml 7 | - deployment-svc-b.yml 8 | - service-svc-b.yml 9 | -------------------------------------------------------------------------------- /day-4/kubernetes-manifest/service-svc-a.yml: -------------------------------------------------------------------------------- 1 | apiVersion: v1 2 | kind: Service 3 | metadata: 4 | labels: 5 | job: node-api 6 | app: a-service 7 | name: a-service 8 | spec: 9 | ports: 10 | - name: a-service-port 11 | port: 80 12 | protocol: TCP 13 | targetPort: 3001 14 | selector: 15 | app: service-a-deployment 16 | type: 
LoadBalancer
17 |
18 |
--------------------------------------------------------------------------------
/day-4/kubernetes-manifest/service-svc-b.yml:
--------------------------------------------------------------------------------
1 | apiVersion: v1
2 | kind: Service
3 | metadata:
4 |   labels:
5 |     job: node-api
6 |     app: b-service
7 |   name: b-service
8 | spec:
9 |   ports:
10 |     - name: b-service-port
11 |       port: 80
12 |       protocol: TCP
13 |       targetPort: 3002
14 |   selector:
15 |     app: service-b-deployment
16 |
17 |
--------------------------------------------------------------------------------
/day-4/readme.md:
--------------------------------------------------------------------------------
1 | ## 🎛️ Instrumentation
2 | - Instrumentation refers to the process of adding monitoring capabilities to your applications, systems, or services.
3 | - This involves embedding or writing code, or using tools, to collect metrics, logs, or traces that provide insights into how the system is performing.
4 |
5 | ## 🎯 Purpose of Instrumentation:
6 | - **Visibility**: It helps you gain visibility into the internal state of your applications and infrastructure.
7 | - **Metrics Collection**: By collecting key metrics like CPU usage, memory consumption, request rates, error rates, etc., you can understand the health and performance of your system.
8 | - **Troubleshooting**: When something goes wrong, instrumentation allows you to diagnose the issue quickly by providing detailed insights.
9 |
10 | ## ⚙️ How it Works:
11 | - **Code-Level Instrumentation**: You can add instrumentation directly in your application code to expose metrics. For example, in a `Node.js` application, you might use a library like prom-client to expose custom metrics.
12 |
13 | ## 📈 Instrumentation in Prometheus:
14 | - 📤 **Exporters**: Prometheus uses exporters to collect metrics from different systems. These exporters expose metrics in a format that Prometheus can scrape and store.
15 | - **Node Exporter**: Collects system-level metrics from Linux/Unix systems.
16 | - **MySQL Exporter (For MySQL Database)**: Collects metrics from a MySQL database.
17 | - **PostgreSQL Exporter (For PostgreSQL Database)**: Collects metrics from a PostgreSQL database.
18 | - 📊 **Custom Metrics**: You can instrument your application to expose custom metrics that are relevant to your specific use case. For example, you might track the number of user logins per minute.
19 |
20 | ## 📈 Types of Metrics in Prometheus
21 | - 🔄️ **Counter**:
22 | - A Counter is a cumulative metric that represents a single numerical value that only ever goes up. It is used for counting events like the number of HTTP requests, errors, or tasks completed.
23 | - **Example**: Counting the number of times a container restarts in your Kubernetes cluster.
24 | - **Metric Example**: `kube_pod_container_status_restarts_total`
25 |
26 | - 📏 **Gauge**:
27 | - A Gauge is a metric that represents a single numerical value that can go up and down. It is typically used for things like memory usage, CPU usage, or the current number of active users.
28 | - **Example**: Monitoring the memory usage of a container in your Kubernetes cluster.
29 | - **Metric Example**: `container_memory_usage_bytes`
30 |
31 | - 📊 **Histogram**:
32 | - A Histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets.
33 | - It also provides a sum of all observed values and a count of observations.
34 | - **Example**: Measuring the response time of Kubernetes API requests in various time buckets.
35 | - **Metric Example**: `apiserver_request_duration_seconds_bucket`
36 |
37 | - 📝 **Summary**:
38 | - Similar to a Histogram, a Summary samples observations and provides a total count of observations, their sum, and configurable quantiles (percentiles).
39 | - **Example**: Monitoring the 95th percentile of request durations to understand high latency in your Kubernetes API.
40 | - **Metric Example**: `apiserver_request_duration_seconds_sum`
41 |
42 |
43 | # 🎯 Project Objectives
44 | - 🛠️ **Implement Custom Metrics in Node.js Application**: Use the prom-client library to write and expose custom metrics in the Node.js application.
45 | - 🚨 **Set Up Alerts in Alertmanager**: Configure Alertmanager to send email notifications if a container crashes more than two times.
46 | - 📝 **Set Up Logging**: Implement logging on both application and cluster (node) logs for better observability using the EFK stack (Elasticsearch, FluentBit, Kibana).
47 | - 📸 **Implement Distributed Tracing for Node.js Application**: Enhance observability by instrumenting the Node.js application for distributed tracing using Jaeger, enabling better performance monitoring and troubleshooting of complex, multi-service architectures.
48 |
49 | # 🏠 Architecture
50 | ![Project Architecture](images/architecture.gif)
51 |
52 | ## 1) Write Custom Metrics
53 | - Please take a look at the `day-4/application/service-a/index.js` file to learn more about custom metrics; below is a brief overview:
54 | - **Express Setup**: Initializes an Express application and sets up logging with Morgan.
55 | - **Logging with Pino**: Defines a custom logging function using Pino for structured logging.
56 | - **Prometheus Metrics with prom-client**: Integrates Prometheus for monitoring HTTP requests using the prom-client library:
57 | - `http_requests_total`: counter
58 | - `http_request_duration_seconds`: histogram
59 | - `http_request_duration_summary_seconds`: summary
60 | - `node_gauge_example`: gauge for tracking async task duration
61 | ### Basic Routes:
62 | - `/` : Returns a "Running" status.
63 | - `/healthy`: Returns the health status of the server.
64 | - `/serverError`: Simulates a 500 Internal Server Error.
65 | - `/notFound`: Simulates a 404 Not Found error.
66 | - `/logs`: Generates logs using the custom logging function.
67 | - `/crash`: Simulates a server crash by exiting the process.
68 | - `/example`: Tracks async task duration with a gauge.
69 | - `/metrics`: Exposes the Prometheus metrics endpoint.
70 | - `/call-service-b`: Calls service-b and returns the data it receives.
71 |
72 | ## 2) Dockerize & Push to the Registry
73 | - To containerize the applications and push them to your Docker registry, run the following commands:
74 | ```bash
75 | cd day-4
76 |
77 | # Dockerize microservice - a
78 | docker build -t <your-image>:<tag> application/service-a/
79 |
80 | # Dockerize microservice - b
81 | docker build -t <your-image>:<tag> application/service-b/
82 |
83 | # or use the pre-built images instead:
84 | # - abhishekf5/demoservice-a:v
85 | # - abhishekf5/demoservice-b:v
86 |
87 | ```
88 |
89 | ## 3) Kubernetes manifest
90 | - Review the Kubernetes manifest files located in `day-4/kubernetes-manifest`.
92 | - Apply the Kubernetes manifest files to your cluster by running:
93 | ```bash
94 | kubectl create ns dev
95 |
96 | kubectl apply -k kubernetes-manifest/
97 | ```
98 |
99 | ## 4) Test all the endpoints
100 | - Open a browser, get the LoadBalancer DNS name, and hit the DNS name with the following routes to test the application:
101 | - `/`
102 | - `/healthy`
103 | - `/serverError`
104 | - `/notFound`
105 | - `/logs`
106 | - `/example`
107 | - `/metrics`
108 | - `/call-service-b`
109 | - Alternatively, you can run the automated script `test.sh`, which will automatically send random requests to the LoadBalancer and generate metrics:
110 | ```bash
111 | ./test.sh <loadbalancer-dns-name>
112 | ```
113 |
114 | ## 5) Configure Alertmanager
115 | - Review the Alertmanager configuration files located in `day-4/alerts-alertmanager-servicemonitor-manifest`; below is a brief overview.
116 | - Before configuring Alertmanager, we need credentials to send emails. For this project, we are using Gmail, but any SMTP provider like AWS SES can be used, so grab the credentials for your provider.
117 | - Open your Google account settings, search for "App passwords", create a new app password, and put it (base64-encoded, since it goes under the Secret's `data:` field) in `day-4/alerts-alertmanager-servicemonitor-manifest/email-secrets.yml`.
118 | - One last thing: add your email address in `day-4/alerts-alertmanager-servicemonitor-manifest/alertmangerconfig.yml`.
119 | - **HighCpuUsage**: Triggers a warning alert if the average CPU usage across instances exceeds 50% for more than 5 minutes.
120 | - **PodRestart**: Triggers a critical alert immediately if any pod restarts more than 2 times.
121 | - Apply the manifest files to your cluster by running:
122 | ```bash
123 | kubectl apply -k alerts-alertmanager-servicemonitor-manifest/
124 | ```
125 | - Wait for 4-5 minutes and then check the Prometheus UI to confirm that the custom metrics implemented in the Node.js application are available:
126 | - `http_requests_total`: counter
127 | - `http_request_duration_seconds`: histogram
128 | - `http_request_duration_summary_seconds`: summary
129 | - `node_gauge_example`: gauge for tracking async task duration
130 |
131 | ## 6) Testing Alerts
132 | - To test the alerting system, manually crash the container more than 2 times to trigger an alert (email notification).
133 | - To crash the application container, hit the following endpoint:
134 | - `<loadbalancer-dns-name>/crash`
135 | - You should receive an email once the application container has restarted at least 3 times.
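- If clicking the endpoint manually is tedious, a small loop can trigger the restarts; a sketch (replace the placeholder with your LoadBalancer DNS name, and note that the pod needs time to come back up between hits):
```bash
LB=<loadbalancer-dns-name>
for i in 1 2 3; do
  curl -s "http://$LB/crash" || true   # the request may error out as the process exits
  sleep 60                             # give Kubernetes time to restart the container
done
```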
-------------------------------------------------------------------------------- /day-4/test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Set the base URL of your Node.js application 4 | BASE_URL="http://$1" 5 | 6 | echo $BASE_URL 7 | 8 | # Define an array of endpoints 9 | ENDPOINTS=( 10 | "/" 11 | "/healthy" 12 | "/serverError" 13 | "/notFound" 14 | "/logs" 15 | "/example" 16 | "/metrics" 17 | "/call-service-b" 18 | "/call-service-b" 19 | "/call-service-b" 20 | ) 21 | 22 | # Function to make a random request to one of the endpoints 23 | make_random_request() { 24 | local endpoint=${ENDPOINTS[$RANDOM % ${#ENDPOINTS[@]}]} 25 | curl -s -o /dev/null -w "%{http_code}" "$BASE_URL$endpoint" 26 | } 27 | 28 | # Make 1000 random requests 29 | for ((i=1; i<=1000; i++)); do 30 | make_random_request 31 | echo "Request $i completed" 32 | sleep 0.1 # Optional: Sleep for a short duration between requests to simulate real traffic 33 | done 34 | 35 | echo "Completed 1000 requests" 36 | -------------------------------------------------------------------------------- /day-5/fluentbit-values.yaml: -------------------------------------------------------------------------------- 1 | # Default values for fluent-bit. 2 | 3 | # kind -- DaemonSet or Deployment 4 | kind: DaemonSet 5 | 6 | # replicaCount -- Only applicable if kind=Deployment 7 | replicaCount: 1 8 | 9 | image: 10 | repository: cr.fluentbit.io/fluent/fluent-bit 11 | # Overrides the image tag whose default is {{ .Chart.AppVersion }} 12 | # Set to "-" to not use the default value 13 | tag: 14 | digest: 15 | pullPolicy: IfNotPresent 16 | 17 | testFramework: 18 | enabled: true 19 | namespace: 20 | image: 21 | repository: busybox 22 | pullPolicy: Always 23 | tag: latest 24 | digest: 25 | 26 | imagePullSecrets: [] 27 | nameOverride: "" 28 | fullnameOverride: "" 29 | 30 | serviceAccount: 31 | create: true 32 | annotations: {} 33 | name: 34 | 35 | rbac: 36 | create: true 37 | nodeAccess: false 38 | eventsAccess: false 39 | 40 | # Configure podsecuritypolicy 41 | # Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/ 42 | # from Kubernetes 1.25, PSP is deprecated 43 | # See: https://kubernetes.io/blog/2022/08/23/kubernetes-v1-25-release/#pod-security-changes 44 | # We automatically disable PSP if Kubernetes version is 1.25 or higher 45 | podSecurityPolicy: 46 | create: false 47 | annotations: {} 48 | 49 | # OpenShift-specific configuration 50 | openShift: 51 | enabled: false 52 | securityContextConstraints: 53 | # Create SCC for Fluent-bit and allow use it 54 | create: true 55 | name: "" 56 | annotations: {} 57 | # Use existing SCC in cluster, rather then create new one 58 | existingName: "" 59 | 60 | podSecurityContext: {} 61 | # fsGroup: 2000 62 | 63 | hostNetwork: false 64 | dnsPolicy: ClusterFirst 65 | 66 | dnsConfig: {} 67 | # nameservers: 68 | # - 1.2.3.4 69 | # searches: 70 | # - ns1.svc.cluster-domain.example 71 | # - my.dns.search.suffix 72 | # options: 73 | # - name: ndots 74 | # value: "2" 75 | # - name: edns0 76 | 77 | hostAliases: [] 78 | # - ip: "1.2.3.4" 79 | # hostnames: 80 | # - "foo.local" 81 | # - "bar.local" 82 | 83 | securityContext: {} 84 | # capabilities: 85 | # drop: 86 | # - ALL 87 | # readOnlyRootFilesystem: true 88 | # runAsNonRoot: true 89 | # runAsUser: 1000 90 | 91 | service: 92 | type: ClusterIP 93 | port: 2020 94 | internalTrafficPolicy: 95 | loadBalancerClass: 96 | loadBalancerSourceRanges: [] 97 | labels: {} 98 | # nodePort: 30020 99 | # 
clusterIP: 172.16.10.1 100 | annotations: {} 101 | # prometheus.io/path: "/api/v1/metrics/prometheus" 102 | # prometheus.io/port: "2020" 103 | # prometheus.io/scrape: "true" 104 | externalIPs: [] 105 | # externalIPs: 106 | # - 2.2.2.2 107 | 108 | 109 | serviceMonitor: 110 | enabled: false 111 | # namespace: monitoring 112 | # interval: 10s 113 | # scrapeTimeout: 10s 114 | # selector: 115 | # prometheus: my-prometheus 116 | # ## metric relabel configs to apply to samples before ingestion. 117 | # ## 118 | # metricRelabelings: 119 | # - sourceLabels: [__meta_kubernetes_service_label_cluster] 120 | # targetLabel: cluster 121 | # regex: (.*) 122 | # replacement: ${1} 123 | # action: replace 124 | # ## relabel configs to apply to samples after ingestion. 125 | # ## 126 | # relabelings: 127 | # - sourceLabels: [__meta_kubernetes_pod_node_name] 128 | # separator: ; 129 | # regex: ^(.*)$ 130 | # targetLabel: nodename 131 | # replacement: $1 132 | # action: replace 133 | # scheme: "" 134 | # tlsConfig: {} 135 | 136 | ## Bear in mind if you want to collect metrics from a different port 137 | ## you will need to configure the new ports on the extraPorts property. 138 | additionalEndpoints: [] 139 | # - port: metrics 140 | # path: /metrics 141 | # interval: 10s 142 | # scrapeTimeout: 10s 143 | # scheme: "" 144 | # tlsConfig: {} 145 | # # metric relabel configs to apply to samples before ingestion. 146 | # # 147 | # metricRelabelings: 148 | # - sourceLabels: [__meta_kubernetes_service_label_cluster] 149 | # targetLabel: cluster 150 | # regex: (.*) 151 | # replacement: ${1} 152 | # action: replace 153 | # # relabel configs to apply to samples after ingestion. 154 | # # 155 | # relabelings: 156 | # - sourceLabels: [__meta_kubernetes_pod_node_name] 157 | # separator: ; 158 | # regex: ^(.*)$ 159 | # targetLabel: nodename 160 | # replacement: $1 161 | # action: replace 162 | 163 | prometheusRule: 164 | enabled: false 165 | # namespace: "" 166 | # additionalLabels: {} 167 | # rules: 168 | # - alert: NoOutputBytesProcessed 169 | # expr: rate(fluentbit_output_proc_bytes_total[5m]) == 0 170 | # annotations: 171 | # message: | 172 | # Fluent Bit instance {{ $labels.instance }}'s output plugin {{ $labels.name }} has not processed any 173 | # bytes for at least 15 minutes. 
174 | # summary: No Output Bytes Processed 175 | # for: 15m 176 | # labels: 177 | # severity: critical 178 | 179 | dashboards: 180 | enabled: false 181 | labelKey: grafana_dashboard 182 | labelValue: 1 183 | annotations: {} 184 | namespace: "" 185 | 186 | lifecycle: {} 187 | # preStop: 188 | # exec: 189 | # command: ["/bin/sh", "-c", "sleep 20"] 190 | 191 | livenessProbe: 192 | httpGet: 193 | path: / 194 | port: http 195 | 196 | readinessProbe: 197 | httpGet: 198 | path: /api/v1/health 199 | port: http 200 | 201 | resources: {} 202 | # limits: 203 | # cpu: 100m 204 | # memory: 128Mi 205 | # requests: 206 | # cpu: 100m 207 | # memory: 128Mi 208 | 209 | ## only available if kind is Deployment 210 | ingress: 211 | enabled: false 212 | ingressClassName: "" 213 | annotations: {} 214 | # kubernetes.io/ingress.class: nginx 215 | # kubernetes.io/tls-acme: "true" 216 | hosts: [] 217 | # - host: fluent-bit.example.tld 218 | extraHosts: [] 219 | # - host: fluent-bit-extra.example.tld 220 | ## specify extraPort number 221 | # port: 5170 222 | tls: [] 223 | # - secretName: fluent-bit-example-tld 224 | # hosts: 225 | # - fluent-bit.example.tld 226 | 227 | ## only available if kind is Deployment 228 | autoscaling: 229 | vpa: 230 | enabled: false 231 | 232 | annotations: {} 233 | 234 | # List of resources that the vertical pod autoscaler can control. Defaults to cpu and memory 235 | controlledResources: [] 236 | 237 | # Define the max allowed resources for the pod 238 | maxAllowed: {} 239 | # cpu: 200m 240 | # memory: 100Mi 241 | # Define the min allowed resources for the pod 242 | minAllowed: {} 243 | # cpu: 200m 244 | # memory: 100Mi 245 | 246 | updatePolicy: 247 | # Specifies whether recommended updates are applied when a Pod is started and whether recommended updates 248 | # are applied during the life of a Pod. Possible values are "Off", "Initial", "Recreate", and "Auto". 
249 | updateMode: Auto 250 | 251 | enabled: false 252 | minReplicas: 1 253 | maxReplicas: 3 254 | targetCPUUtilizationPercentage: 75 255 | # targetMemoryUtilizationPercentage: 75 256 | ## see https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/#autoscaling-on-multiple-metrics-and-custom-metrics 257 | customRules: [] 258 | # - type: Pods 259 | # pods: 260 | # metric: 261 | # name: packets-per-second 262 | # target: 263 | # type: AverageValue 264 | # averageValue: 1k 265 | ## see https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-configurable-scaling-behavior 266 | behavior: {} 267 | # scaleDown: 268 | # policies: 269 | # - type: Pods 270 | # value: 4 271 | # periodSeconds: 60 272 | # - type: Percent 273 | # value: 10 274 | # periodSeconds: 60 275 | 276 | ## only available if kind is Deployment 277 | podDisruptionBudget: 278 | enabled: false 279 | annotations: {} 280 | maxUnavailable: "30%" 281 | 282 | nodeSelector: {} 283 | 284 | tolerations: [] 285 | 286 | affinity: {} 287 | 288 | labels: {} 289 | 290 | annotations: {} 291 | 292 | podAnnotations: {} 293 | 294 | podLabels: {} 295 | 296 | ## How long (in seconds) a pods needs to be stable before progressing the deployment 297 | ## 298 | minReadySeconds: 299 | 300 | ## How long (in seconds) a pod may take to exit (useful with lifecycle hooks to ensure lb deregistration is done) 301 | ## 302 | terminationGracePeriodSeconds: 303 | 304 | priorityClassName: "" 305 | 306 | env: [] 307 | # - name: FOO 308 | # value: "bar" 309 | 310 | # The envWithTpl array below has the same usage as "env", but is using the tpl function to support templatable string. 311 | # This can be useful when you want to pass dynamic values to the Chart using the helm argument "--set =" 312 | # https://helm.sh/docs/howto/charts_tips_and_tricks/#using-the-tpl-function 313 | envWithTpl: [] 314 | # - name: FOO_2 315 | # value: "{{ .Values.foo2 }}" 316 | # 317 | # foo2: bar2 318 | 319 | envFrom: [] 320 | 321 | # This supports either a structured array or a templatable string 322 | extraContainers: [] 323 | 324 | # Array mode 325 | # extraContainers: 326 | # - name: do-something 327 | # image: busybox 328 | # command: ['do', 'something'] 329 | 330 | # String mode 331 | # extraContainers: |- 332 | # - name: do-something 333 | # image: bitnami/kubectl:{{ .Capabilities.KubeVersion.Major }}.{{ .Capabilities.KubeVersion.Minor }} 334 | # command: ['kubectl', 'version'] 335 | 336 | flush: 1 337 | 338 | metricsPort: 2020 339 | 340 | extraPorts: [] 341 | # - port: 5170 342 | # containerPort: 5170 343 | # protocol: TCP 344 | # name: tcp 345 | # nodePort: 30517 346 | 347 | extraVolumes: [] 348 | 349 | extraVolumeMounts: [] 350 | 351 | updateStrategy: {} 352 | # type: RollingUpdate 353 | # rollingUpdate: 354 | # maxUnavailable: 1 355 | 356 | # Make use of a pre-defined configmap instead of the one templated here 357 | existingConfigMap: "" 358 | 359 | networkPolicy: 360 | enabled: false 361 | # ingress: 362 | # from: [] 363 | 364 | luaScripts: 365 | setIndex.lua: | 366 | function set_index(tag, timestamp, record) 367 | index = "abhishek-" 368 | if record["kubernetes"] ~= nil then 369 | if record["kubernetes"]["namespace_name"] == "logging" then 370 | return -1, timestamp, record -- Skip logs from the logging namespace 371 | end 372 | if record["kubernetes"]["namespace_name"] ~= nil then 373 | if record["kubernetes"]["container_name"] ~= nil then 374 | record["es_index"] = index 375 | .. 
record["kubernetes"]["namespace_name"] 376 | .. "-" 377 | .. record["kubernetes"]["container_name"] 378 | return 1, timestamp, record 379 | end 380 | record["es_index"] = index 381 | .. record["kubernetes"]["namespace_name"] 382 | return 1, timestamp, record 383 | end 384 | end 385 | return 1, timestamp, record 386 | end 387 | 388 | ## https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/configuration-file 389 | config: 390 | service: | 391 | [SERVICE] 392 | Daemon Off 393 | Flush {{ .Values.flush }} 394 | Log_Level {{ .Values.logLevel }} 395 | Parsers_File /fluent-bit/etc/parsers.conf 396 | Parsers_File /fluent-bit/etc/conf/custom_parsers.conf 397 | HTTP_Server On 398 | HTTP_Listen 0.0.0.0 399 | HTTP_Port {{ .Values.metricsPort }} 400 | Health_Check On 401 | 402 | ## https://docs.fluentbit.io/manual/pipeline/inputs 403 | inputs: | 404 | [INPUT] 405 | Name tail 406 | Path /var/log/containers/*.log 407 | multiline.parser docker, cri 408 | Tag kube.* 409 | Mem_Buf_Limit 5MB 410 | Skip_Long_Lines On 411 | 412 | [INPUT] 413 | Name systemd 414 | Tag host.* 415 | Systemd_Filter _SYSTEMD_UNIT=kubelet.service 416 | Read_From_Tail On 417 | 418 | ## https://docs.fluentbit.io/manual/pipeline/filters 419 | filters: | 420 | [FILTER] 421 | Name kubernetes 422 | Match kube.* 423 | Merge_Log On 424 | Keep_Log Off 425 | K8S-Logging.Parser On 426 | K8S-Logging.Exclude On 427 | 428 | [FILTER] 429 | Name lua 430 | Match kube.* 431 | script /fluent-bit/scripts/setIndex.lua 432 | call set_index 433 | 434 | ## https://docs.fluentbit.io/manual/pipeline/outputs 435 | outputs: | 436 | [OUTPUT] 437 | Name es 438 | Match kube.* 439 | Type _doc 440 | Host elasticsearch-master 441 | Port 9200 442 | HTTP_User elastic 443 | HTTP_Passwd cbTQj1qxRIPNF5uc 444 | tls On 445 | tls.verify Off 446 | Logstash_Format On 447 | Logstash_Prefix logstash 448 | Retry_Limit False 449 | Suppress_Type_Name On 450 | 451 | [OUTPUT] 452 | Name es 453 | Match host.* 454 | Type _doc 455 | Host elasticsearch-master 456 | Port 9200 457 | HTTP_User elastic 458 | HTTP_Passwd cbTQj1qxRIPNF5uc 459 | tls On 460 | tls.verify Off 461 | Logstash_Format On 462 | Logstash_Prefix node 463 | Retry_Limit False 464 | Suppress_Type_Name On 465 | 466 | ## https://docs.fluentbit.io/manual/administration/configuring-fluent-bit/classic-mode/upstream-servers 467 | ## This configuration is deprecated, please use `extraFiles` instead. 468 | upstream: {} 469 | 470 | ## https://docs.fluentbit.io/manual/pipeline/parsers 471 | customParsers: | 472 | [PARSER] 473 | Name docker_no_time 474 | Format json 475 | Time_Keep Off 476 | Time_Key time 477 | Time_Format %Y-%m-%dT%H:%M:%S.%L 478 | 479 | # This allows adding more files with arbitrary filenames to /fluent-bit/etc/conf by providing key/value pairs. 480 | # The key becomes the filename, the value becomes the file content. 
481 | extraFiles: {} 482 | # upstream.conf: | 483 | # [UPSTREAM] 484 | # upstream1 485 | # 486 | # [NODE] 487 | # name node-1 488 | # host 127.0.0.1 489 | # port 43000 490 | # example.conf: | 491 | # [OUTPUT] 492 | # Name example 493 | # Match foo.* 494 | # Host bar 495 | 496 | # The config volume is mounted by default, either to the existingConfigMap value, or the default of "fluent-bit.fullname" 497 | volumeMounts: 498 | - name: config 499 | mountPath: /fluent-bit/etc/conf 500 | 501 | daemonSetVolumes: 502 | - name: varlog 503 | hostPath: 504 | path: /var/log 505 | - name: varlibdockercontainers 506 | hostPath: 507 | path: /var/lib/docker/containers 508 | - name: etcmachineid 509 | hostPath: 510 | path: /etc/machine-id 511 | type: File 512 | 513 | daemonSetVolumeMounts: 514 | - name: varlog 515 | mountPath: /var/log 516 | - name: varlibdockercontainers 517 | mountPath: /var/lib/docker/containers 518 | readOnly: true 519 | - name: etcmachineid 520 | mountPath: /etc/machine-id 521 | readOnly: true 522 | 523 | command: 524 | - /fluent-bit/bin/fluent-bit 525 | 526 | args: 527 | - --workdir=/fluent-bit/etc 528 | - --config=/fluent-bit/etc/conf/fluent-bit.conf 529 | 530 | # This supports either a structured array or a templatable string 531 | initContainers: [] 532 | 533 | # Array mode 534 | # initContainers: 535 | # - name: do-something 536 | # image: bitnami/kubectl:1.22 537 | # command: ['kubectl', 'version'] 538 | 539 | # String mode 540 | # initContainers: |- 541 | # - name: do-something 542 | # image: bitnami/kubectl:{{ .Capabilities.KubeVersion.Major }}.{{ .Capabilities.KubeVersion.Minor }} 543 | # command: ['kubectl', 'version'] 544 | 545 | logLevel: info 546 | 547 | hotReload: 548 | enabled: false 549 | image: 550 | repository: ghcr.io/jimmidyson/configmap-reload 551 | tag: v0.11.1 552 | digest: 553 | pullPolicy: IfNotPresent 554 | resources: {} 555 | 556 | -------------------------------------------------------------------------------- /day-5/images/architecture.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/iam-veeramalla/observability-zero-to-hero/9445b2364672b23b72f029e65471ed485a0c8950/day-5/images/architecture.gif -------------------------------------------------------------------------------- /day-5/readme.md: -------------------------------------------------------------------------------- 1 | # 🔍 Logging overview 2 | - Logging is crucial in any distributed system, especially in Kubernetes, to monitor application behavior, detect issues, and ensure the smooth functioning of microservices. 3 | 4 | 5 | ## 🚀 Importance: 6 | - **Debugging**: Logs provide critical information when debugging issues in applications. 7 | - **Auditing**: Logs serve as an audit trail, showing what actions were taken and by whom. 8 | - **Performance** Monitoring: Analyzing logs can help identify performance bottlenecks. 9 | - **Security**: Logs help in detecting unauthorized access or malicious activities. 10 | 11 | ## 🛠️ Tools Available for Logging in Kubernetes 12 | - 🗂️ EFK Stack (Elasticsearch, Fluentbit, Kibana) 13 | - 🗂️ EFK Stack (Elasticsearch, FluentD, Kibana) 14 | - 🗂️ ELK Stack (Elasticsearch, Logstash, Kibana) 15 | - 📊 Promtail + Loki + Grafana 16 | 17 | ## 📦 EFK Stack (Elasticsearch, Fluentbit, Kibana) 18 | - EFK is a popular logging stack used to collect, store, and analyze logs in Kubernetes. 19 | - **Elasticsearch**: Stores and indexes log data for easy retrieval. 
- **Fluentbit**: A lightweight log forwarder that collects logs from different sources and sends them to Elasticsearch.
- **Kibana**: A visualization tool that allows users to explore and analyze logs stored in Elasticsearch.

# 🏠 Architecture
![Project Architecture](images/architecture.gif)


## 📝 Step-by-Step Setup

### 1) Create IAM Role for Service Account
```bash
eksctl create iamserviceaccount \
  --name ebs-csi-controller-sa \
  --namespace kube-system \
  --cluster observability \
  --role-name AmazonEKS_EBS_CSI_DriverRole \
  --role-only \
  --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
  --approve
```
- This command creates an IAM role for the EBS CSI controller.
- The IAM role allows the EBS CSI controller to interact with AWS resources, specifically for managing EBS volumes in the Kubernetes cluster.
- We will attach this role to the service account.

### 2) Retrieve IAM Role ARN
```bash
ARN=$(aws iam get-role --role-name AmazonEKS_EBS_CSI_DriverRole --query 'Role.Arn' --output text)
```
- This command retrieves the ARN of the IAM role created for the EBS CSI controller service account.

### 3) Deploy EBS CSI Driver
```bash
eksctl create addon --cluster observability --name aws-ebs-csi-driver --version latest \
  --service-account-role-arn $ARN --force
```
- The above command deploys the AWS EBS CSI driver as an addon to your Kubernetes cluster.
- It uses the previously created IAM service account role to allow the driver to manage EBS volumes securely.

### 4) Create Namespace for Logging
```bash
kubectl create namespace logging
```

### 5) Install Elasticsearch on K8s

```bash
helm repo add elastic https://helm.elastic.co

helm install elasticsearch \
 --set replicas=1 \
 --set volumeClaimTemplate.storageClassName=gp2 \
 --set persistence.labels.enabled=true elastic/elasticsearch -n logging
```
- Installs Elasticsearch in the `logging` namespace.
- It sets the number of replicas, specifies the storage class, and enables persistence labels to ensure
data is stored on persistent volumes.

### 6) Retrieve Elasticsearch Username & Password
```bash
# for username
kubectl get secrets --namespace=logging elasticsearch-master-credentials -ojsonpath='{.data.username}' | base64 -d
# for password
kubectl get secrets --namespace=logging elasticsearch-master-credentials -ojsonpath='{.data.password}' | base64 -d
```
- Retrieves the username and password for the Elasticsearch cluster's master credentials from the Kubernetes secret.
- Both values are base64 encoded, so they need to be decoded before use (the `base64 -d` above does this).
- 👉 **Note**: Please write down the password for future reference.

### 7) Install Kibana
```bash
helm install kibana --set service.type=LoadBalancer elastic/kibana -n logging
```
- Kibana provides a user-friendly interface for exploring and visualizing data stored in Elasticsearch.
- It is exposed as a LoadBalancer service, making it accessible from outside the cluster; a lookup sketch follows below.
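To open Kibana you will need the DNS name of its LoadBalancer. A quick lookup sketch, assuming the chart created a service named `kibana-kibana` (verify the exact name in the `kubectl get svc` output):

```bash
# List services in the logging namespace and pull the Kibana LoadBalancer hostname
kubectl get svc -n logging
kubectl get svc kibana-kibana -n logging \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
```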
### 8) Install Fluentbit with Custom Values/Configurations
- 👉 **Note**: Please update the `HTTP_Passwd` field in the `fluentbit-values.yaml` file with the password retrieved earlier in step 6 (e.g. `NJyO47UqeYBsoaEU`).
```bash
helm repo add fluent https://fluent.github.io/helm-charts
helm install fluent-bit fluent/fluent-bit -f fluentbit-values.yaml -n logging
```

## ✅ Conclusion
- We have successfully installed the EFK stack in our Kubernetes cluster, which includes Elasticsearch for storing logs, Fluentbit for collecting and forwarding logs, and Kibana for visualizing logs.
- To verify the setup, access the Kibana dashboard by entering the LoadBalancer DNS name followed by `:5601` in your browser:
  - `http://LOAD_BALANCER_DNS_NAME:5601`
- Use the username and password retrieved in step 6 to log in.
- Once logged in, create a new data view in Kibana and explore the logs collected from your Kubernetes cluster.



## 🧼 Clean Up
```bash

helm uninstall monitoring -n monitoring

helm uninstall fluent-bit -n logging

helm uninstall elasticsearch -n logging

helm uninstall kibana -n logging

cd day-4

kubectl delete -k kubernetes-manifest/

kubectl delete -k alerts-alertmanager-servicemonitor-manifest/


eksctl delete cluster --name observability

```
--------------------------------------------------------------------------------
/day-6/images/architecture.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/iam-veeramalla/observability-zero-to-hero/9445b2364672b23b72f029e65471ed485a0c8950/day-6/images/architecture.gif
--------------------------------------------------------------------------------
/day-6/jaeger-values.yaml:
--------------------------------------------------------------------------------
storage:
  type: elasticsearch
  elasticsearch:
    host: elasticsearch-master.logging.svc # Replace with your Elasticsearch service DNS
    port: 9200
    scheme: https
    user: elastic # Replace with the actual username if necessary
    password: cbTQj1qxRIPNF5uc # Replace with the actual password
    tls:
      enabled: true
      ca: /tls/ca-cert.pem # Path where the CA cert is mounted

provisionDataStore:
  cassandra: false
  elasticsearch: false

query:
  cmdlineParams:
    es.tls.ca: "/tls/ca-cert.pem"
  extraConfigmapMounts:
    - name: jaeger-tls
      mountPath: /tls
      subPath: ""
      configMap: jaeger-tls
      readOnly: true

collector:
  cmdlineParams:
    es.tls.ca: "/tls/ca-cert.pem"
  extraConfigmapMounts:
    - name: jaeger-tls
      mountPath: /tls
      subPath: ""
      configMap: jaeger-tls
      readOnly: true
--------------------------------------------------------------------------------
/day-6/readme.md:
--------------------------------------------------------------------------------
## 🕵️‍♂️ What is Jaeger?
- Jaeger is an open-source, end-to-end distributed tracing system used for monitoring and troubleshooting microservices-based architectures. It helps developers understand how requests flow through a complex system, by tracing the path a request takes and measuring how long each step in that path takes.

## ❓ Why Use Jaeger?
5 | - In modern applications, especially microservices architectures, a single user request can touch multiple services. When something goes wrong, it’s challenging to pinpoint the source of the problem. Jaeger helps by: 6 | 7 | - 🐢 **Identifying bottlenecks**: See where your application spends most of its time. 8 | - 🔍 **Finding root causes of errors**: Trace errors back to their source. 9 | - ⚡ **Optimizing performance**: Understand and improve the latency of services. 10 | 11 | 12 | ## 📚 Core Concepts of Jaeger 13 | 14 | - 🛤️ **Trace**: A trace represents the journey of a request as it travels through various services. Think of it as a detailed map that shows every stop a request makes in your system. 15 | - 📏 **Span**: Each trace is made up of multiple spans. A span is a single operation within a trace, such as an API call or a database query. It has a start time and a duration. 16 | - 🏷️ **Tags**: Tags are key-value pairs that provide additional context about a span. For example, a tag might indicate the HTTP method used (GET, POST) or the status code returned. 17 | - 📝 **Logs**: Logs in a span provide details about what’s happening during that operation. They can capture events like errors or important checkpoints. 18 | - 🔗 **Context Propagation**: For Jaeger to trace requests across services, it needs to propagate context. This means each service in the call chain passes along the trace information to the next service. 19 | 20 | # 🏠 Architecture 21 | ![Project Architecture](images/architecture.gif) 22 | 23 | 24 | 25 | ## ⚙️ Setting Up Jaeger 26 | 27 | ### Step 1: Instrumenting Your Code 28 | - To start tracing, you need to instrument your services. This means adding tracing capabilities to your code. Most popular programming languages and frameworks have libraries or middleware that make this easy. 29 | - We have already instrumented our code using OpenTelemetry libraries/packages. For more details, refer to `day-4/application/service-a/tracing.js` or `day-4/application/service-b/tracing.js`. 30 | 31 | 32 | ### Step 2: Components of Jaeger 33 | - Jaeger consists of several components: 34 | - Agent: Collects traces from your application. 35 | - Collector: Receives traces from the agent and processes them. 36 | - Query: Provides a UI to view traces. 37 | - Storage: Stores traces for later retrieval (often a database like *Elasticsearch*). 38 | 39 | 40 | ### Step 3: Export Elasticsearch CA Certificate 41 | - This command retrieves the CA certificate from the Elasticsearch master certificate secret and decodes it, saving it to a ca-cert.pem file. 42 | ```bash 43 | kubectl get secret elasticsearch-master-certs -n logging -o jsonpath='{.data.ca\.crt}' | base64 --decode > ca-cert.pem 44 | ``` 45 | 46 | ### Step 4: Create Tracing Namespace 47 | - Creates a new Kubernetes namespace called tracing if it doesn't already exist, where Jaeger components will be installed. 48 | ```bash 49 | kubectl create ns tracing 50 | ``` 51 | 52 | ### Step 5: Create ConfigMap for Jaeger's TLS Certificate 53 | - Creates a ConfigMap in the tracing namespace, containing the CA certificate to be used by Jaeger for TLS. 54 | ```bash 55 | kubectl create configmap jaeger-tls --from-file=ca-cert.pem -n tracing 56 | ``` 57 | ### Step 6: Create Secret for Elasticsearch TLS 58 | - Creates a Kubernetes Secret in the tracing namespace, containing the CA certificate for Elasticsearch TLS communication. 
```bash
kubectl create secret generic es-tls-secret --from-file=ca-cert.pem -n tracing
```
### Step 7: Add Jaeger Helm Repository
- Adds the official Jaeger Helm chart repository to your Helm setup, making it available for installations.
```bash
helm repo add jaegertracing https://jaegertracing.github.io/helm-charts

helm repo update
```

### Step 8: Install Jaeger with Custom Values
- 👉 **Note**: Please update the `password` field and other related fields in the `jaeger-values.yaml` file with the password retrieved earlier in day-5, step 6 (e.g. `NJyO47UqeYBsoaEU`).
- This command installs Jaeger into the tracing namespace using a custom jaeger-values.yaml configuration file. Ensure the password is updated in the file before installation.
```bash
helm install jaeger jaegertracing/jaeger -n tracing --values jaeger-values.yaml
```
### Step 9: Port Forward Jaeger Query Service
- This command forwards port 8080 on your local machine to the Jaeger Query service, allowing you to access the Jaeger UI locally.
```bash
kubectl port-forward svc/jaeger-query 8080:80 -n tracing

```

## 🧼 Clean Up
```bash

helm uninstall jaeger -n tracing

helm uninstall elasticsearch -n logging

# Also delete the PVC created for Elasticsearch

helm uninstall monitoring -n monitoring

cd day-4

kubectl delete -k kubernetes-manifest/

kubectl delete -k alerts-alertmanager-servicemonitor-manifest/

# Delete cluster
eksctl delete cluster --name observability

```

--------------------------------------------------------------------------------
/day-7/README.md:
--------------------------------------------------------------------------------
## 📊 What is OpenTelemetry?
- OpenTelemetry is an open-source observability framework for generating, collecting, and exporting telemetry data (traces, metrics, logs) to help monitor applications.

## 🛠️ How is it Different from Other Libraries?
- OpenTelemetry offers a unified standard for observability across multiple tools and vendors, unlike other libraries that may focus only on a specific aspect like tracing or metrics.

## ⏳ What Existed Before OpenTelemetry?
- Before OpenTelemetry, observability was typically managed using a combination of specialized tools for different aspects:
  - `Tracing`: Tools like Jaeger and Zipkin were used to track requests.
  - `Metrics`: Solutions like Prometheus and StatsD were popular for collecting metrics.
  - `Logging`: Tools like the ELK Stack (Elasticsearch, Logstash, Kibana) or Fluentd were used to aggregate and analyze logs.
- OpenTelemetry unified these by standardizing how telemetry data is collected and exported.
- Prior to OpenTelemetry, there were OpenTracing and OpenCensus, which OpenTelemetry merged to provide a more comprehensive and standardized observability solution.

## 🌐 Supported Programming Languages

OpenTelemetry supports several languages, including:

- **Go**
- **Java**
- **JavaScript**
- **Python**
- **C#**
- **C++**
- **Ruby**
- **PHP**
- **Swift**
- ...and others.
## Architecture

### 🖥️ Step 1: Create EKS Cluster

```bash
eksctl create cluster --name=observability \
  --region=us-east-1 \
  --zones=us-east-1a,us-east-1b \
  --without-nodegroup
```
```bash
eksctl utils associate-iam-oidc-provider \
  --region us-east-1 \
  --cluster observability \
  --approve
```
```bash
eksctl create nodegroup --cluster=observability \
  --region=us-east-1 \
  --name=observability-ng-private \
  --node-type=t3.medium \
  --nodes-min=2 \
  --nodes-max=3 \
  --node-volume-size=20 \
  --managed \
  --asg-access \
  --external-dns-access \
  --full-ecr-access \
  --appmesh-access \
  --alb-ingress-access \
  --node-private-networking

# Update the ~/.kube/config file
aws eks update-kubeconfig --name observability
```

### 🔐 Step 2: Create IAM Role for Service Account
```bash
eksctl create iamserviceaccount \
  --name ebs-csi-controller-sa \
  --namespace kube-system \
  --cluster observability \
  --role-name AmazonEKS_EBS_CSI_DriverRole \
  --role-only \
  --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
  --approve
```
- This command creates an IAM role for the EBS CSI controller.
- The IAM role allows the EBS CSI controller to interact with AWS resources, specifically for managing EBS volumes in the Kubernetes cluster.
- We will attach this role to the service account.

### 📝 Step 3: Retrieve IAM Role ARN
```bash
ARN=$(aws iam get-role --role-name AmazonEKS_EBS_CSI_DriverRole --query 'Role.Arn' --output text)
```
- This command retrieves the ARN of the IAM role created for the EBS CSI controller service account.

### 📦 Step 4: Deploy EBS CSI Driver
```bash
eksctl create addon --cluster observability --name aws-ebs-csi-driver --version latest \
  --service-account-role-arn $ARN --force
```
- The above command deploys the AWS EBS CSI driver as an addon to your Kubernetes cluster.
- It uses the previously created IAM service account role to allow the driver to manage EBS volumes securely.


### 🧩 Step 5: Understand the Application
- We have two very simple microservices, A (`microservice-a`) and B (`microservice-b`), built with Golang using the Gin web framework for handling HTTP requests (a local smoke-test sketch follows this list).
- **Microservice A** API Endpoints:
  - `GET /hello-a` – Returns a greeting message
  - `GET /call-b` – Calls another service (Service B) and returns its response
  - `GET /getme-coffee` – Fetches and returns data from an external coffee API
- **Microservice B** API Endpoints:
  - `GET /hello-b` – Returns a greeting message
  - `GET /call-a` – Calls another service (Service A) and returns its response
  - `GET /getme-coffee` – Fetches and returns data from an external coffee API
- Observability:
  - OpenTelemetry SDK integrated for tracing and metrics.
  - Metrics and traces are exported to the OpenTelemetry Collector via OTLP over HTTP.
- Instrumentation:
  - Uses OpenTelemetry middleware (otelgin) for automatic request tracing.
  - Instruments HTTP clients with otelhttp for distributed tracing of outbound requests.
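Before containerizing anything, you can smoke-test these endpoints locally. A minimal sketch, assuming Docker and Go 1.23 are installed and you run it from `day-7/microservice-a`; overriding `PORT` to `8080` is a hypothetical choice, since binding to the default `80` outside a container usually needs elevated privileges:

```bash
# Start the local OTel Collector, Prometheus, and Jaeger defined in docker-compose.yml
docker compose up -d

# Run the service; an explicit PORT wins over the PORT=80 in .env
# (godotenv does not override variables already set in the environment)
PORT=8080 go run . &
sleep 3

# Exercise the endpoints that don't depend on microservice-b
curl -s http://localhost:8080/hello-a
curl -s http://localhost:8080/getme-coffee
```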
### 🐳 Step 6: Dockerize & push it to the registry
```bash
# Dockerize microservice - a
docker build -t <IMAGE_NAME>:<TAG> microservice-a/

# Dockerize microservice - b
docker build -t <IMAGE_NAME>:<TAG> microservice-b/

# push both images
docker push <IMAGE_NAME>:<TAG>
docker push <IMAGE_NAME>:<TAG>
```


### 🗂️ Step 7: Create Namespace for observability components
```bash
kubectl create namespace olly
```

### 📚 Step 8: Install Elasticsearch on K8s
```bash
helm repo add elastic https://helm.elastic.co

helm install elasticsearch \
  --set replicas=1 \
  --set volumeClaimTemplate.storageClassName=gp2 \
  --set persistence.labels.enabled=true elastic/elasticsearch -n olly
```


### 📜 Step 9: Export Elasticsearch CA Certificate
- This command retrieves the CA certificate from the Elasticsearch master certificate secret and decodes it, saving it to a ca-cert.pem file.
```bash
kubectl get secret elasticsearch-master-certs -n olly -o jsonpath='{.data.ca\.crt}' | base64 --decode > ca-cert.pem
```

### 🔑 Step 10: Create ConfigMap for Jaeger's TLS Certificate
- Creates a ConfigMap in the olly namespace, containing the CA certificate to be used by Jaeger for TLS.
```bash
kubectl create configmap jaeger-tls --from-file=ca-cert.pem -n olly
```

### 🛡️ Step 11: Create Secret for Elasticsearch TLS
- Creates a Kubernetes Secret in the olly namespace, containing the CA certificate for Elasticsearch TLS communication.
```bash
kubectl create secret generic es-tls-secret --from-file=ca-cert.pem -n olly
```

### 🔍 Step 12: Retrieve Elasticsearch Username & Password
```bash
# for username
kubectl get secrets --namespace=olly elasticsearch-master-credentials -ojsonpath='{.data.username}' | base64 -d
# for password
kubectl get secrets --namespace=olly elasticsearch-master-credentials -ojsonpath='{.data.password}' | base64 -d
```
- Retrieves the username and password for the Elasticsearch cluster's master credentials from the Kubernetes secret.
- 👉 **Note**: Please write down the password for future reference.


### 🕵️‍♂️ Step 13: Install Jaeger with Custom Values
- 👉 **Note**: Please update the `password` field and other related fields in the `jaeger-values.yaml` file with the password retrieved in the previous step, step 12 (e.g. `NJyO47UqeYBsoaEU`).
- This command installs Jaeger into the olly namespace using a custom jaeger-values.yaml configuration file. Ensure the password is updated in the file before installation.
```bash
helm repo add jaegertracing https://jaegertracing.github.io/helm-charts
helm repo update

helm install jaeger jaegertracing/jaeger -n olly --values jaeger-values.yaml
```

### 🌐 Step 14: Access UI - Port Forward Jaeger Query Service
```bash
kubectl port-forward svc/jaeger-query 8080:80 -n olly
```


### 📈 Step 15: Install OpenTelemetry Collector
```bash
# The chart lives in the OpenTelemetry community Helm repository; add it first
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update

helm install otel-collector open-telemetry/opentelemetry-collector -n olly --values otel-collector-values.yaml
```

### 📊 Step 16: Install Prometheus
```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

helm install prometheus prometheus-community/prometheus -n olly --values prometheus-values.yaml
```

### 🚀 Step 17: Deploy the application
- ***Note:*** Review the Kubernetes manifest files located in `./k8s-manifests`, and change the image name & tag to your own image.
```bash
kubectl apply -k k8s-manifests/
```
- 👉 ***Note***: Wait ~5 minutes until your load balancers are in a running state.

### 🔄 Step 18: Generate Load
- Script: `test.sh` takes two load balancer DNS addresses as input arguments and alternates requests between them using curl.
- `test.sh` continuously sends random HTTP requests every second to predefined routes on the two provided load balancer DNS names.
- ***Note:*** Keep the script running in another terminal to quickly gather metrics & traces.

```bash
./test.sh http://Microservice_A_LOAD_BALANCER_DNS http://Microservice_B_LOAD_BALANCER_DNS
```

### 📊 Step 19: Access the UI of Prometheus
```bash
kubectl port-forward svc/prometheus-server 9090:80 -n olly
```
- Look for your application's metrics like `request_count`, `request_duration_ms`, `active_requests`, and others to monitor request rates & performance.


### 🕵️‍♂️ Step 20: Access the UI of Jaeger
```bash
kubectl port-forward svc/jaeger-query 8080:80 -n olly
```
- Look for traces from the service names microservice-a and microservice-b and operations such as `[/hello-a, /call-b, and /getme-coffee]` or `[/hello-b, /call-a, and /getme-coffee]` to monitor request flows and dependencies.

## ✅ Conclusion
- By following the above steps, you have successfully set up an observability stack using OpenTelemetry on an EKS cluster. This setup allows you to monitor your microservices effectively through integrated tracing, metrics, and logging; a final verification sketch follows below.
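As a quick end-to-end check, you can confirm from the command line that metrics and traces actually reached Prometheus and Jaeger. A minimal sketch, assuming the port-forwards from steps 19 and 20 are still running in other terminals:

```bash
# Metrics: the custom counter should return series once test.sh has generated load
curl -s 'http://localhost:9090/api/v1/query?query=request_count' | head -c 300; echo

# Traces: the Jaeger query API should list both instrumented services,
# i.e. the output should include "microservice-a" and "microservice-b"
curl -s 'http://localhost:8080/api/services'
```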
229 | 230 | ## 🧼 Clean Up 231 | ```bash 232 | helm uninstall prometheus -n olly 233 | helm uninstall otel-collector -n olly 234 | helm uninstall jaeger -n olly 235 | helm uninstall elasticsearch -n olly 236 | 237 | 238 | 239 | kubectl delete -k k8s-manifests/ 240 | 241 | 242 | kubectl delete ns olly 243 | 244 | eksctl delete cluster --name observability 245 | ``` -------------------------------------------------------------------------------- /day-7/jaeger-values.yaml: -------------------------------------------------------------------------------- 1 | storage: 2 | type: elasticsearch 3 | elasticsearch: 4 | host: elasticsearch-master.olly.svc # Replace with your Elasticsearch service DNS 5 | port: 9200 6 | scheme: https 7 | user: elastic # Replace with the actual username if necessary 8 | password: F2Dm1tKzDQDYnNXR # Replace with the actual password 9 | tls: 10 | enabled: true 11 | ca: /tls/ca-cert.pem # Path where the CA cert is mounted 12 | 13 | provisionDataStore: 14 | cassandra: false 15 | elasticsearch: false 16 | 17 | query: 18 | cmdlineParams: 19 | es.tls.ca: "/tls/ca-cert.pem" 20 | extraConfigmapMounts: 21 | - name: jaeger-tls 22 | mountPath: /tls 23 | subPath: "" 24 | configMap: jaeger-tls 25 | readOnly: true 26 | 27 | collector: 28 | image: 29 | repository: jaegertracing/jaeger-collector 30 | tag: latest 31 | 32 | # Configure the Collector service to expose OTLP ports 33 | service: 34 | type: ClusterIP 35 | otlp: 36 | grpc: 37 | name: otlp-grpc 38 | # enabled: true 39 | port: 4317 # gRPC OTLP port 40 | http: 41 | name: otlp-http 42 | # enabled: true 43 | port: 4318 # HTTP OTLP port 44 | 45 | 46 | 47 | cmdlineParams: 48 | es.tls.ca: "/tls/ca-cert.pem" 49 | collector.otlp.grpc.host-port: "0.0.0.0:4317" # Enable OTLP gRPC receiver on port 4317 50 | collector.otlp.http.host-port: "0.0.0.0:4318" # Enable OTLP HTTP receiver on port 4318 51 | 52 | extraConfigmapMounts: 53 | - name: jaeger-tls 54 | mountPath: /tls 55 | subPath: "" 56 | configMap: jaeger-tls 57 | readOnly: true 58 | 59 | 60 | # Define the service ports for OTLP receivers 61 | ports: 62 | otlp-grpc: 63 | enabled: true 64 | containerPort: 4317 65 | servicePort: 4317 66 | protocol: TCP 67 | otlp-http: 68 | enabled: true 69 | containerPort: 4318 70 | servicePort: 4318 71 | protocol: TCP 72 | -------------------------------------------------------------------------------- /day-7/k8s-manifests/deployment-a.yml: -------------------------------------------------------------------------------- 1 | apiVersion: apps/v1 2 | kind: Deployment 3 | metadata: 4 | labels: 5 | app: go-service-a-deployment 6 | # run: go-service-a-deployment 7 | name: go-service-a-deployment 8 | spec: 9 | replicas: 1 10 | selector: 11 | matchLabels: 12 | app: go-service-a-deployment 13 | template: 14 | metadata: 15 | labels: 16 | app: go-service-a-deployment 17 | spec: 18 | containers: 19 | # - image: ankitjodhani/golang-svc-a:latest 20 | - image: <>:<> 21 | name: service-a 22 | imagePullPolicy: Always 23 | ports: 24 | - containerPort: 80 25 | env: 26 | - name: OTEL_COLLECTOR_ENDPOINT 27 | value: "otel-collector-opentelemetry-collector.olly:4318" 28 | - name: SVC_B_URI 29 | value: "http://b-service.dev" 30 | - name: PORT 31 | value: "80" 32 | -------------------------------------------------------------------------------- /day-7/k8s-manifests/deployment-b.yml: -------------------------------------------------------------------------------- 1 | apiVersion: apps/v1 2 | kind: Deployment 3 | metadata: 4 | labels: 5 | app: go-service-b-deployment 6 | # run: 
go-service-b-deployment 7 | name: go-service-b-deployment 8 | spec: 9 | replicas: 1 10 | selector: 11 | matchLabels: 12 | app: go-service-b-deployment 13 | template: 14 | metadata: 15 | labels: 16 | app: go-service-b-deployment 17 | spec: 18 | containers: 19 | # - image: ankitjodhani/golang-svc-b:latest 20 | - image: <>:<> 21 | name: service-a 22 | imagePullPolicy: Always 23 | ports: 24 | - containerPort: 80 25 | env: 26 | - name: OTEL_COLLECTOR_ENDPOINT 27 | value: "otel-collector-opentelemetry-collector.olly:4318" 28 | - name: SVC_A_URI 29 | value: "http://a-service.dev" 30 | - name: PORT 31 | value: "80" 32 | -------------------------------------------------------------------------------- /day-7/k8s-manifests/kustomization.yml: -------------------------------------------------------------------------------- 1 | apiVersion: kustomize.config.k8s.io/v1beta1 2 | kind: Kustomization 3 | namespace: dev 4 | resources: 5 | - namespace.yml 6 | - deployment-a.yml 7 | - deployment-b.yml 8 | - svc-a.yml 9 | - svc-b.yml 10 | -------------------------------------------------------------------------------- /day-7/k8s-manifests/namespace.yml: -------------------------------------------------------------------------------- 1 | apiVersion: v1 2 | kind: Namespace 3 | metadata: 4 | name: dev -------------------------------------------------------------------------------- /day-7/k8s-manifests/svc-a.yml: -------------------------------------------------------------------------------- 1 | apiVersion: v1 2 | kind: Service 3 | metadata: 4 | labels: 5 | job: go-api 6 | app: a-service 7 | name: a-service 8 | annotations: 9 | prometheus.io/scrape: "true" 10 | prometheus.io/port: "80" 11 | prometheus.io/path: "/metrics" 12 | 13 | spec: 14 | ports: 15 | - name: a-service-port 16 | port: 80 17 | protocol: TCP 18 | targetPort: 80 19 | selector: 20 | app: go-service-a-deployment 21 | type: LoadBalancer 22 | 23 | -------------------------------------------------------------------------------- /day-7/k8s-manifests/svc-b.yml: -------------------------------------------------------------------------------- 1 | apiVersion: v1 2 | kind: Service 3 | metadata: 4 | labels: 5 | job: go-api 6 | app: b-service 7 | name: b-service 8 | annotations: 9 | prometheus.io/scrape: "true" 10 | prometheus.io/port: "80" 11 | prometheus.io/path: "/metrics" 12 | 13 | spec: 14 | ports: 15 | - name: b-service-port 16 | port: 80 17 | protocol: TCP 18 | targetPort: 80 19 | selector: 20 | app: go-service-b-deployment 21 | type: LoadBalancer 22 | 23 | -------------------------------------------------------------------------------- /day-7/microservice-a/.dockerignore: -------------------------------------------------------------------------------- 1 | .env -------------------------------------------------------------------------------- /day-7/microservice-a/.env: -------------------------------------------------------------------------------- 1 | SVC_B_URI=http://localhost:8081 2 | OTEL_COLLECTOR_ENDPOINT=localhost:4318 3 | PORT=80 -------------------------------------------------------------------------------- /day-7/microservice-a/docker-compose.yml: -------------------------------------------------------------------------------- 1 | version: '3' 2 | 3 | services: 4 | otel-collector: 5 | image: otel/opentelemetry-collector-contrib:latest 6 | command: ["--config=/etc/otel-collector-config.yaml"] 7 | volumes: 8 | - ./otel-collector-config.yaml:/etc/otel-collector-config.yaml 9 | ports: 10 | - "4317:4317" # OTLP gRPC receiver 11 | - "4318:4318" # OTLP 
HTTP receiver 12 | - "8889:8889" # Prometheus metrics exporter 13 | depends_on: 14 | - jaeger 15 | 16 | prometheus: 17 | image: prom/prometheus:latest 18 | volumes: 19 | - ./prometheus.yaml:/etc/prometheus/prometheus.yml 20 | command: 21 | - "--config.file=/etc/prometheus/prometheus.yml" 22 | ports: 23 | - "9090:9090" 24 | depends_on: 25 | - otel-collector 26 | 27 | jaeger: 28 | image: jaegertracing/all-in-one:latest 29 | ports: 30 | - "16686:16686" # Jaeger UI 31 | - "14250:14250" # Jaeger gRPC receiver 32 | 33 | -------------------------------------------------------------------------------- /day-7/microservice-a/dockerfile: -------------------------------------------------------------------------------- 1 | # Use official Golang image as the build image 2 | FROM golang:1.23-alpine AS builder 3 | 4 | WORKDIR /app 5 | 6 | COPY go.mod ./ 7 | COPY go.sum ./ 8 | RUN go mod download 9 | 10 | COPY . ./ 11 | 12 | RUN go build -o service-a . 13 | 14 | # Use a minimal image for the runtime 15 | FROM alpine:latest 16 | 17 | WORKDIR /app 18 | 19 | COPY --from=builder /app/service-a . 20 | 21 | EXPOSE 80 22 | 23 | CMD ["./service-a"] 24 | -------------------------------------------------------------------------------- /day-7/microservice-a/go.mod: -------------------------------------------------------------------------------- 1 | module microservice-a 2 | 3 | go 1.23.0 4 | 5 | require ( 6 | github.com/gin-gonic/gin v1.10.0 7 | github.com/joho/godotenv v1.5.1 8 | go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin v0.55.0 9 | go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.55.0 10 | go.opentelemetry.io/otel v1.30.0 11 | go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.30.0 12 | go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.30.0 13 | go.opentelemetry.io/otel/metric v1.30.0 14 | go.opentelemetry.io/otel/sdk v1.30.0 15 | go.opentelemetry.io/otel/sdk/metric v1.30.0 16 | ) 17 | 18 | require ( 19 | github.com/bytedance/sonic v1.12.2 // indirect 20 | github.com/bytedance/sonic/loader v0.2.0 // indirect 21 | github.com/cenkalti/backoff/v4 v4.3.0 // indirect 22 | github.com/cloudwego/base64x v0.1.4 // indirect 23 | github.com/cloudwego/iasm v0.2.0 // indirect 24 | github.com/felixge/httpsnoop v1.0.4 // indirect 25 | github.com/gabriel-vasile/mimetype v1.4.5 // indirect 26 | github.com/gin-contrib/sse v0.1.0 // indirect 27 | github.com/go-logr/logr v1.4.2 // indirect 28 | github.com/go-logr/stdr v1.2.2 // indirect 29 | github.com/go-playground/locales v0.14.1 // indirect 30 | github.com/go-playground/universal-translator v0.18.1 // indirect 31 | github.com/go-playground/validator/v10 v10.22.1 // indirect 32 | github.com/goccy/go-json v0.10.3 // indirect 33 | github.com/google/uuid v1.6.0 // indirect 34 | github.com/grpc-ecosystem/grpc-gateway/v2 v2.22.0 // indirect 35 | github.com/json-iterator/go v1.1.12 // indirect 36 | github.com/klauspost/cpuid/v2 v2.2.8 // indirect 37 | github.com/leodido/go-urn v1.4.0 // indirect 38 | github.com/mattn/go-isatty v0.0.20 // indirect 39 | github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect 40 | github.com/modern-go/reflect2 v1.0.2 // indirect 41 | github.com/pelletier/go-toml/v2 v2.2.3 // indirect 42 | github.com/twitchyliquid64/golang-asm v0.15.1 // indirect 43 | github.com/ugorji/go/codec v1.2.12 // indirect 44 | go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.30.0 // indirect 45 | go.opentelemetry.io/otel/trace v1.30.0 // indirect 46 | 
go.opentelemetry.io/proto/otlp v1.3.1 // indirect 47 | golang.org/x/arch v0.10.0 // indirect 48 | golang.org/x/crypto v0.27.0 // indirect 49 | golang.org/x/net v0.29.0 // indirect 50 | golang.org/x/sys v0.25.0 // indirect 51 | golang.org/x/text v0.18.0 // indirect 52 | google.golang.org/genproto/googleapis/api v0.0.0-20240903143218-8af14fe29dc1 // indirect 53 | google.golang.org/genproto/googleapis/rpc v0.0.0-20240903143218-8af14fe29dc1 // indirect 54 | google.golang.org/grpc v1.66.1 // indirect 55 | google.golang.org/protobuf v1.34.2 // indirect 56 | gopkg.in/yaml.v3 v3.0.1 // indirect 57 | ) 58 | -------------------------------------------------------------------------------- /day-7/microservice-a/go.sum: -------------------------------------------------------------------------------- 1 | github.com/bytedance/sonic v1.12.2 h1:oaMFuRTpMHYLpCntGca65YWt5ny+wAceDERTkT2L9lg= 2 | github.com/bytedance/sonic v1.12.2/go.mod h1:B8Gt/XvtZ3Fqj+iSKMypzymZxw/FVwgIGKzMzT9r/rk= 3 | github.com/bytedance/sonic/loader v0.1.1/go.mod h1:ncP89zfokxS5LZrJxl5z0UJcsk4M4yY2JpfqGeCtNLU= 4 | github.com/bytedance/sonic/loader v0.2.0 h1:zNprn+lsIP06C/IqCHs3gPQIvnvpKbbxyXQP1iU4kWM= 5 | github.com/bytedance/sonic/loader v0.2.0/go.mod h1:ncP89zfokxS5LZrJxl5z0UJcsk4M4yY2JpfqGeCtNLU= 6 | github.com/cenkalti/backoff/v4 v4.3.0 h1:MyRJ/UdXutAwSAT+s3wNd7MfTIcy71VQueUuFK343L8= 7 | github.com/cenkalti/backoff/v4 v4.3.0/go.mod h1:Y3VNntkOUPxTVeUxJ/G5vcM//AlwfmyYozVcomhLiZE= 8 | github.com/cloudwego/base64x v0.1.4 h1:jwCgWpFanWmN8xoIUHa2rtzmkd5J2plF/dnLS6Xd/0Y= 9 | github.com/cloudwego/base64x v0.1.4/go.mod h1:0zlkT4Wn5C6NdauXdJRhSKRlJvmclQ1hhJgA0rcu/8w= 10 | github.com/cloudwego/iasm v0.2.0 h1:1KNIy1I1H9hNNFEEH3DVnI4UujN+1zjpuk6gwHLTssg= 11 | github.com/cloudwego/iasm v0.2.0/go.mod h1:8rXZaNYT2n95jn+zTI1sDr+IgcD2GVs0nlbbQPiEFhY= 12 | github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 13 | github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= 14 | github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 15 | github.com/felixge/httpsnoop v1.0.4 h1:NFTV2Zj1bL4mc9sqWACXbQFVBBg2W3GPvqp8/ESS2Wg= 16 | github.com/felixge/httpsnoop v1.0.4/go.mod h1:m8KPJKqk1gH5J9DgRY2ASl2lWCfGKXixSwevea8zH2U= 17 | github.com/gabriel-vasile/mimetype v1.4.5 h1:J7wGKdGu33ocBOhGy0z653k/lFKLFDPJMG8Gql0kxn4= 18 | github.com/gabriel-vasile/mimetype v1.4.5/go.mod h1:ibHel+/kbxn9x2407k1izTA1S81ku1z/DlgOW2QE0M4= 19 | github.com/gin-contrib/sse v0.1.0 h1:Y/yl/+YNO8GZSjAhjMsSuLt29uWRFHdHYUb5lYOV9qE= 20 | github.com/gin-contrib/sse v0.1.0/go.mod h1:RHrZQHXnP2xjPF+u1gW/2HnVO7nvIa9PG3Gm+fLHvGI= 21 | github.com/gin-gonic/gin v1.10.0 h1:nTuyha1TYqgedzytsKYqna+DfLos46nTv2ygFy86HFU= 22 | github.com/gin-gonic/gin v1.10.0/go.mod h1:4PMNQiOhvDRa013RKVbsiNwoyezlm2rm0uX/T7kzp5Y= 23 | github.com/go-logr/logr v1.2.2/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbVz/1A= 24 | github.com/go-logr/logr v1.4.2 h1:6pFjapn8bFcIbiKo3XT4j/BhANplGihG6tvd+8rYgrY= 25 | github.com/go-logr/logr v1.4.2/go.mod h1:9T104GzyrTigFIr8wt5mBrctHMim0Nb2HLGrmQ40KvY= 26 | github.com/go-logr/stdr v1.2.2 h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag= 27 | github.com/go-logr/stdr v1.2.2/go.mod h1:mMo/vtBO5dYbehREoey6XUKy/eSumjCCveDpRre4VKE= 28 | github.com/go-playground/assert/v2 v2.2.0 h1:JvknZsQTYeFEAhQwI4qEt9cyV5ONwRHC+lYKSsYSR8s= 29 | github.com/go-playground/assert/v2 v2.2.0/go.mod h1:VDjEfimB/XKnb+ZQfWdccd7VUvScMdVu0Titje2rxJ4= 30 | github.com/go-playground/locales v0.14.1 
h1:EWaQ/wswjilfKLTECiXz7Rh+3BjFhfDFKv/oXslEjJA= 31 | github.com/go-playground/locales v0.14.1/go.mod h1:hxrqLVvrK65+Rwrd5Fc6F2O76J/NuW9t0sjnWqG1slY= 32 | github.com/go-playground/universal-translator v0.18.1 h1:Bcnm0ZwsGyWbCzImXv+pAJnYK9S473LQFuzCbDbfSFY= 33 | github.com/go-playground/universal-translator v0.18.1/go.mod h1:xekY+UJKNuX9WP91TpwSH2VMlDf28Uj24BCp08ZFTUY= 34 | github.com/go-playground/validator/v10 v10.22.1 h1:40JcKH+bBNGFczGuoBYgX4I6m/i27HYW8P9FDk5PbgA= 35 | github.com/go-playground/validator/v10 v10.22.1/go.mod h1:dbuPbCMFw/DrkbEynArYaCwl3amGuJotoKCe95atGMM= 36 | github.com/goccy/go-json v0.10.3 h1:KZ5WoDbxAIgm2HNbYckL0se1fHD6rz5j4ywS6ebzDqA= 37 | github.com/goccy/go-json v0.10.3/go.mod h1:oq7eo15ShAhp70Anwd5lgX2pLfOS3QCiwU/PULtXL6M= 38 | github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI= 39 | github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY= 40 | github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg= 41 | github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0= 42 | github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo= 43 | github.com/grpc-ecosystem/grpc-gateway/v2 v2.22.0 h1:asbCHRVmodnJTuQ3qamDwqVOIjwqUPTYmYuemVOx+Ys= 44 | github.com/grpc-ecosystem/grpc-gateway/v2 v2.22.0/go.mod h1:ggCgvZ2r7uOoQjOyu2Y1NhHmEPPzzuhWgcza5M1Ji1I= 45 | github.com/joho/godotenv v1.5.1 h1:7eLL/+HRGLY0ldzfGMeQkb7vMd0as4CfYvUVzLqw0N0= 46 | github.com/joho/godotenv v1.5.1/go.mod h1:f4LDr5Voq0i2e/R5DDNOoa2zzDfwtkZa6DnEwAbqwq4= 47 | github.com/json-iterator/go v1.1.12 h1:PV8peI4a0ysnczrg+LtxykD8LfKY9ML6u2jnxaEnrnM= 48 | github.com/json-iterator/go v1.1.12/go.mod h1:e30LSqwooZae/UwlEbR2852Gd8hjQvJoHmT4TnhNGBo= 49 | github.com/klauspost/cpuid/v2 v2.0.9/go.mod h1:FInQzS24/EEf25PyTYn52gqo7WaD8xa0213Md/qVLRg= 50 | github.com/klauspost/cpuid/v2 v2.2.8 h1:+StwCXwm9PdpiEkPyzBXIy+M9KUb4ODm0Zarf1kS5BM= 51 | github.com/klauspost/cpuid/v2 v2.2.8/go.mod h1:Lcz8mBdAVJIBVzewtcLocK12l3Y+JytZYpaMropDUws= 52 | github.com/knz/go-libedit v1.10.1/go.mod h1:MZTVkCWyz0oBc7JOWP3wNAzd002ZbM/5hgShxwh4x8M= 53 | github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE= 54 | github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk= 55 | github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY= 56 | github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE= 57 | github.com/leodido/go-urn v1.4.0 h1:WT9HwE9SGECu3lg4d/dIA+jxlljEa1/ffXKmRjqdmIQ= 58 | github.com/leodido/go-urn v1.4.0/go.mod h1:bvxc+MVxLKB4z00jd1z+Dvzr47oO32F/QSNjSBOlFxI= 59 | github.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY= 60 | github.com/mattn/go-isatty v0.0.20/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y= 61 | github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= 62 | github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd h1:TRLaZ9cD/w8PVh93nsPXa1VrQ6jlwL5oN8l14QlcNfg= 63 | github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= 64 | github.com/modern-go/reflect2 v1.0.2 h1:xBagoLtFs94CBntxluKeaWgTMpvLxC4ur3nMaC9Gz0M= 65 | github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk= 66 | github.com/pelletier/go-toml/v2 v2.2.3 h1:YmeHyLY8mFWbdkNWwpr+qIL2bEqT0o95WSdkNHvL12M= 67 | github.com/pelletier/go-toml/v2 v2.2.3/go.mod 
h1:MfCQTFTvCcUyyvvwm1+G6H/jORL20Xlb6rzQu9GuUkc= 68 | github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= 69 | github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= 70 | github.com/rogpeppe/go-internal v1.12.0 h1:exVL4IDcn6na9z1rAb56Vxr+CgyK3nn3O+epU5NdKM8= 71 | github.com/rogpeppe/go-internal v1.12.0/go.mod h1:E+RYuTGaKKdloAfM02xzb0FW3Paa99yedzYV+kq4uf4= 72 | github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME= 73 | github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSSt89Yw= 74 | github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo= 75 | github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI= 76 | github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg= 77 | github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg= 78 | github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU= 79 | github.com/stretchr/testify v1.8.1/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4= 80 | github.com/stretchr/testify v1.9.0 h1:HtqpIVDClZ4nwg75+f6Lvsy/wHu+3BoSGCbBAcpTsTg= 81 | github.com/stretchr/testify v1.9.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY= 82 | github.com/twitchyliquid64/golang-asm v0.15.1 h1:SU5vSMR7hnwNxj24w34ZyCi/FmDZTkS4MhqMhdFk5YI= 83 | github.com/twitchyliquid64/golang-asm v0.15.1/go.mod h1:a1lVb/DtPvCB8fslRZhAngC2+aY1QWCk3Cedj/Gdt08= 84 | github.com/ugorji/go/codec v1.2.12 h1:9LC83zGrHhuUA9l16C9AHXAqEV/2wBQ4nkvumAE65EE= 85 | github.com/ugorji/go/codec v1.2.12/go.mod h1:UNopzCgEMSXjBc6AOMqYvWC1ktqTAfzJZUZgYf6w6lg= 86 | go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin v0.55.0 h1:n4Dd8YaDFeTd2uw+uCHJzOKeqfLgAOlePZpQ5f9cAoE= 87 | go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin v0.55.0/go.mod h1:8aCCTMjP225r98yevEMM5NYDb3ianWLoeIzZ1rPyxHU= 88 | go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.55.0 h1:ZIg3ZT/aQ7AfKqdwp7ECpOK6vHqquXXuyTjIO8ZdmPs= 89 | go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.55.0/go.mod h1:DQAwmETtZV00skUwgD6+0U89g80NKsJE3DCKeLLPQMI= 90 | go.opentelemetry.io/contrib/propagators/b3 v1.30.0 h1:vumy4r1KMyaoQRltX7cJ37p3nluzALX9nugCjNNefuY= 91 | go.opentelemetry.io/contrib/propagators/b3 v1.30.0/go.mod h1:fRbvRsaeVZ82LIl3u0rIvusIel2UUf+JcaaIpy5taho= 92 | go.opentelemetry.io/otel v1.30.0 h1:F2t8sK4qf1fAmY9ua4ohFS/K+FUuOPemHUIXHtktrts= 93 | go.opentelemetry.io/otel v1.30.0/go.mod h1:tFw4Br9b7fOS+uEao81PJjVMjW/5fvNCbpsDIXqP0pc= 94 | go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.30.0 h1:VrMAbeJz4gnVDg2zEzjHG4dEH86j4jO6VYB+NgtGD8s= 95 | go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.30.0/go.mod h1:qqN/uFdpeitTvm+JDqqnjm517pmQRYxTORbETHq5tOc= 96 | go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.30.0 h1:lsInsfvhVIfOI6qHVyysXMNDnjO9Npvl7tlDPJFBVd4= 97 | go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.30.0/go.mod h1:KQsVNh4OjgjTG0G6EiNi1jVpnaeeKsKMRwbLN+f1+8M= 98 | go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.30.0 h1:umZgi92IyxfXd/l4kaDhnKgY8rnN/cZcF1LKc6I8OQ8= 99 | go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.30.0/go.mod h1:4lVs6obhSVRb1EW5FhOuBTyiQhtRtAnnva9vD3yRfq8= 100 | go.opentelemetry.io/otel/metric v1.30.0 h1:4xNulvn9gjzo4hjg+wzIKG7iNFEaBMX00Qd4QIZs7+w= 101 | 
go.opentelemetry.io/otel/metric v1.30.0/go.mod h1:aXTfST94tswhWEb+5QjlSqG+cZlmyXy/u8jFpor3WqQ= 102 | go.opentelemetry.io/otel/sdk v1.30.0 h1:cHdik6irO49R5IysVhdn8oaiR9m8XluDaJAs4DfOrYE= 103 | go.opentelemetry.io/otel/sdk v1.30.0/go.mod h1:p14X4Ok8S+sygzblytT1nqG98QG2KYKv++HE0LY/mhg= 104 | go.opentelemetry.io/otel/sdk/metric v1.30.0 h1:QJLT8Pe11jyHBHfSAgYH7kEmT24eX792jZO1bo4BXkM= 105 | go.opentelemetry.io/otel/sdk/metric v1.30.0/go.mod h1:waS6P3YqFNzeP01kuo/MBBYqaoBJl7efRQHOaydhy1Y= 106 | go.opentelemetry.io/otel/trace v1.30.0 h1:7UBkkYzeg3C7kQX8VAidWh2biiQbtAKjyIML8dQ9wmc= 107 | go.opentelemetry.io/otel/trace v1.30.0/go.mod h1:5EyKqTzzmyqB9bwtCCq6pDLktPK6fmGf/Dph+8VI02o= 108 | go.opentelemetry.io/proto/otlp v1.3.1 h1:TrMUixzpM0yuc/znrFTP9MMRh8trP93mkCiDVeXrui0= 109 | go.opentelemetry.io/proto/otlp v1.3.1/go.mod h1:0X1WI4de4ZsLrrJNLAQbFeLCm3T7yBkR0XqQ7niQU+8= 110 | golang.org/x/arch v0.10.0 h1:S3huipmSclq3PJMNe76NGwkBR504WFkQ5dhzWzP8ZW8= 111 | golang.org/x/arch v0.10.0/go.mod h1:FEVrYAQjsQXMVJ1nsMoVVXPZg6p2JE2mx8psSWTDQys= 112 | golang.org/x/crypto v0.27.0 h1:GXm2NjJrPaiv/h1tb2UH8QfgC/hOf/+z0p6PT8o1w7A= 113 | golang.org/x/crypto v0.27.0/go.mod h1:1Xngt8kV6Dvbssa53Ziq6Eqn0HqbZi5Z6R0ZpwQzt70= 114 | golang.org/x/net v0.29.0 h1:5ORfpBpCs4HzDYoodCDBbwHzdR5UrLBZ3sOnUJmFoHo= 115 | golang.org/x/net v0.29.0/go.mod h1:gLkgy8jTGERgjzMic6DS9+SP0ajcu6Xu3Orq/SpETg0= 116 | golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= 117 | golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= 118 | golang.org/x/sys v0.25.0 h1:r+8e+loiHxRqhXVl6ML1nO3l1+oFoWbnlu2Ehimmi34= 119 | golang.org/x/sys v0.25.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA= 120 | golang.org/x/text v0.18.0 h1:XvMDiNzPAl0jr17s6W9lcaIhGUfUORdGCNsuLmPG224= 121 | golang.org/x/text v0.18.0/go.mod h1:BuEKDfySbSR4drPmRPG/7iBdf8hvFMuRexcpahXilzY= 122 | google.golang.org/genproto/googleapis/api v0.0.0-20240903143218-8af14fe29dc1 h1:hjSy6tcFQZ171igDaN5QHOw2n6vx40juYbC/x67CEhc= 123 | google.golang.org/genproto/googleapis/api v0.0.0-20240903143218-8af14fe29dc1/go.mod h1:qpvKtACPCQhAdu3PyQgV4l3LMXZEtft7y8QcarRsp9I= 124 | google.golang.org/genproto/googleapis/rpc v0.0.0-20240903143218-8af14fe29dc1 h1:pPJltXNxVzT4pK9yD8vR9X75DaWYYmLGMsEvBfFQZzQ= 125 | google.golang.org/genproto/googleapis/rpc v0.0.0-20240903143218-8af14fe29dc1/go.mod h1:UqMtugtsSgubUsoxbuAoiCXvqvErP7Gf0so0mK9tHxU= 126 | google.golang.org/grpc v1.66.1 h1:hO5qAXR19+/Z44hmvIM4dQFMSYX9XcWsByfoxutBpAM= 127 | google.golang.org/grpc v1.66.1/go.mod h1:s3/l6xSSCURdVfAnL+TqCNMyTDAGN6+lZeVxnZR128Y= 128 | google.golang.org/protobuf v1.34.2 h1:6xV6lTsCfpGD21XK49h7MhtcApnLqkfYgPcdHftf6hg= 129 | google.golang.org/protobuf v1.34.2/go.mod h1:qYOHts0dSfpeUzUFpOMr/WGzszTmLH+DiWniOlNbLDw= 130 | gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 131 | gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk= 132 | gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q= 133 | gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= 134 | gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA= 135 | gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= 136 | nullprogram.com/x/optparse v1.0.0/go.mod h1:KdyPE+Igbe0jQUrVfMqDMeJQIJZEuyV7pjYmp6pbG50= 137 | 
-------------------------------------------------------------------------------- /day-7/microservice-a/main.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "context" 5 | "fmt" 6 | "io/ioutil" 7 | "log" 8 | "net/http" 9 | "os" 10 | "time" 11 | 12 | "github.com/gin-gonic/gin" 13 | "github.com/joho/godotenv" 14 | 15 | "go.opentelemetry.io/otel" 16 | "go.opentelemetry.io/otel/attribute" 17 | "go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp" 18 | "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp" 19 | "go.opentelemetry.io/otel/metric" 20 | sdkmetric "go.opentelemetry.io/otel/sdk/metric" 21 | "go.opentelemetry.io/otel/sdk/resource" 22 | "go.opentelemetry.io/otel/sdk/trace" 23 | semconv "go.opentelemetry.io/otel/semconv/v1.21.0" 24 | 25 | "go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin" 26 | "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp" 27 | ) 28 | 29 | var ( 30 | requestCounter metric.Int64Counter 31 | requestDuration metric.Float64Histogram 32 | activeRequestsCounter metric.Int64UpDownCounter 33 | ) 34 | 35 | func initProvider() (func(context.Context) error, error) { 36 | ctx := context.Background() 37 | 38 | // Load environment variables 39 | err := godotenv.Load() 40 | if err != nil { 41 | log.Println("No .env file found. Using environment variables 👌") 42 | } 43 | 44 | // Read the OTEL collector endpoint from environment variable 45 | otelEndpoint := os.Getenv("OTEL_COLLECTOR_ENDPOINT") 46 | if otelEndpoint == "" { 47 | otelEndpoint = "localhost:4318" // Default endpoint 48 | } 49 | 50 | // Create a resource with the service name 51 | res, err := resource.New(ctx, 52 | resource.WithAttributes( 53 | semconv.ServiceNameKey.String("microservice-a"), 54 | ), 55 | ) 56 | if err != nil { 57 | return nil, fmt.Errorf("failed to create resource: %w", err) 58 | } 59 | 60 | // Create OTLP trace exporter over HTTP with custom endpoint 61 | traceExporter, err := otlptracehttp.New(ctx, 62 | otlptracehttp.WithEndpoint(otelEndpoint), 63 | otlptracehttp.WithInsecure(), 64 | ) 65 | if err != nil { 66 | return nil, fmt.Errorf("failed to create trace exporter: %w", err) 67 | } 68 | 69 | // Create OTLP metric exporter over HTTP with custom endpoint 70 | metricExporter, err := otlpmetrichttp.New(ctx, 71 | otlpmetrichttp.WithEndpoint(otelEndpoint), 72 | otlpmetrichttp.WithInsecure(), 73 | ) 74 | if err != nil { 75 | return nil, fmt.Errorf("failed to create metric exporter: %w", err) 76 | } 77 | 78 | // Create trace provider with the exporter and resource 79 | tracerProvider := trace.NewTracerProvider( 80 | trace.WithBatcher(traceExporter), 81 | trace.WithResource(res), 82 | ) 83 | 84 | // Create metric reader and meter provider with the resource 85 | metricReader := sdkmetric.NewPeriodicReader(metricExporter) 86 | meterProvider := sdkmetric.NewMeterProvider( 87 | sdkmetric.WithReader(metricReader), 88 | sdkmetric.WithResource(res), 89 | ) 90 | 91 | // Set global providers 92 | otel.SetTracerProvider(tracerProvider) 93 | otel.SetMeterProvider(meterProvider) 94 | 95 | return func(ctx context.Context) error { 96 | err := tracerProvider.Shutdown(ctx) 97 | if err != nil { 98 | return err 99 | } 100 | err = meterProvider.Shutdown(ctx) 101 | if err != nil { 102 | return err 103 | } 104 | return nil 105 | }, nil 106 | } 107 | 108 | // Basic Hello Handler 109 | func hello(c *gin.Context) { 110 | startTime := time.Now() 111 | ctx := c.Request.Context() 112 | 113 | // 
Increment active requests 114 | 	activeRequestsCounter.Add(ctx, 1) 115 | 	defer activeRequestsCounter.Add(ctx, -1) 116 | 117 | 	c.JSON(http.StatusOK, gin.H{ 118 | 		"message": "👋 Hello from microservice-a", 119 | 	}) 120 | 121 | 	duration := time.Since(startTime).Milliseconds() 122 | 123 | 	requestCounter.Add(ctx, 1, metric.WithAttributes(attribute.String("endpoint", "/hello-a"))) 124 | 	requestDuration.Record(ctx, float64(duration), metric.WithAttributes(attribute.String("endpoint", "/hello-a"))) 125 | } 126 | 127 | // Call Service B Handler 128 | func callB(c *gin.Context) { 129 | 	startTime := time.Now() 130 | 	ctx := c.Request.Context() 131 | 132 | 	activeRequestsCounter.Add(ctx, 1) 133 | 	defer activeRequestsCounter.Add(ctx, -1) 134 | 135 | 	// Load environment variables 136 | 	err := godotenv.Load() 137 | 	if err != nil { 138 | 		log.Println("No .env file found. Using environment variables 👌") 139 | 	} 140 | 141 | 	SVC_B_URI := os.Getenv("SVC_B_URI") 142 | 	if SVC_B_URI == "" { 143 | 		SVC_B_URI = "http://localhost:8081" // Default URI for service-B 144 | 	} 145 | 146 | 	// Create a new HTTP client with OpenTelemetry instrumentation 147 | 	client := http.Client{ 148 | 		Transport: otelhttp.NewTransport(http.DefaultTransport), 149 | 	} 150 | 151 | 	// Create a new request 152 | 	req, err := http.NewRequest("GET", fmt.Sprintf("%s/hello-b", SVC_B_URI), nil) 153 | 	if err != nil { 154 | 		c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to create request to service-B"}) 155 | 		return 156 | 	} 157 | 158 | 	// Use the context from Gin 159 | 	req = req.WithContext(ctx) 160 | 161 | 	// Make the request 162 | 	resp, err := client.Do(req) 163 | 	if err != nil { 164 | 		c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to reach service-B"}) 165 | 		return 166 | 	} 167 | 	defer resp.Body.Close() 168 | 169 | 	resBody, _ := ioutil.ReadAll(resp.Body) 170 | 171 | 	c.JSON(http.StatusOK, gin.H{ 172 | 		"message":  "🥳 Response from service-B", 173 | 		"response": string(resBody), 174 | 	}) 175 | 176 | 	duration := time.Since(startTime).Milliseconds() 177 | 178 | 	requestCounter.Add(ctx, 1, metric.WithAttributes(attribute.String("endpoint", "/call-b"))) 179 | 	requestDuration.Record(ctx, float64(duration), metric.WithAttributes(attribute.String("endpoint", "/call-b"))) 180 | } 181 | 182 | // Get Coffee Handler 183 | func getMeCoffee(c *gin.Context) { 184 | 	startTime := time.Now() 185 | 	ctx := c.Request.Context() 186 | 187 | 	activeRequestsCounter.Add(ctx, 1) 188 | 	defer activeRequestsCounter.Add(ctx, -1) 189 | 190 | 	// Create a new HTTP client with OpenTelemetry instrumentation 191 | 	client := http.Client{ 192 | 		Transport: otelhttp.NewTransport(http.DefaultTransport), 193 | 	} 194 | 195 | 	// Create a new request 196 | 	req, err := http.NewRequest("GET", "https://api.sampleapis.com/coffee/iced", nil) 197 | 	if err != nil { 198 | 		c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to create request to coffee API"}) 199 | 		return 200 | 	} 201 | 202 | 	// Use the context from Gin 203 | 	req = req.WithContext(ctx) 204 | 205 | 	// Make the request 206 | 	resp, err := client.Do(req) 207 | 	if err != nil { 208 | 		c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to fetch coffee"}) 209 | 		return 210 | 	} 211 | 	defer resp.Body.Close() 212 | 213 | 	resBody, _ := ioutil.ReadAll(resp.Body) 214 | 215 | 	c.JSON(http.StatusOK, gin.H{ 216 | 		"message":  "🍵 Here is your coffee", 217 | 		"response": string(resBody), 218 | 	}) 219 | 220 | 	duration := time.Since(startTime).Milliseconds() 221 | 222 | 	requestCounter.Add(ctx, 1, 
metric.WithAttributes(attribute.String("endpoint", "/getme-coffee"))) 223 | requestDuration.Record(ctx, float64(duration), metric.WithAttributes(attribute.String("endpoint", "/getme-coffee"))) 224 | } 225 | 226 | func main() { 227 | ctx := context.Background() 228 | shutdown, err := initProvider() 229 | if err != nil { 230 | log.Fatalf("Failed to initialize OpenTelemetry: %v", err) 231 | } 232 | defer func() { 233 | if err := shutdown(ctx); err != nil { 234 | log.Fatalf("Error shutting down provider: %v", err) 235 | } 236 | }() 237 | 238 | router := gin.Default() 239 | 240 | // Use OpenTelemetry middleware for Gin 241 | router.Use(otelgin.Middleware("microservice-a")) 242 | 243 | // Initialize the Meter 244 | meter := otel.GetMeterProvider().Meter("microservice-a") 245 | 246 | // Initialize instruments using the Meter interface methods 247 | requestCounter, err = meter.Int64Counter( 248 | "request_count", 249 | metric.WithDescription("Counts the number of requests received"), 250 | ) 251 | if err != nil { 252 | log.Fatalf("Failed to create counter: %v", err) 253 | } 254 | 255 | requestDuration, err = meter.Float64Histogram( 256 | "request_duration_ms", 257 | metric.WithDescription("Records the duration of requests in milliseconds"), 258 | ) 259 | if err != nil { 260 | log.Fatalf("Failed to create histogram: %v", err) 261 | } 262 | 263 | activeRequestsCounter, err = meter.Int64UpDownCounter( 264 | "active_requests", 265 | metric.WithDescription("Counts the number of active requests"), 266 | ) 267 | if err != nil { 268 | log.Fatalf("Failed to create up-down counter: %v", err) 269 | } 270 | 271 | router.GET("/hello-a", hello) 272 | router.GET("/call-b", callB) 273 | router.GET("/getme-coffee", getMeCoffee) 274 | 275 | err = godotenv.Load() 276 | if err != nil { 277 | log.Println("No .env file found. 
Using environment variables 👌") 278 | 	} 279 | 280 | 	PORT := os.Getenv("PORT") 281 | 	if PORT == "" { 282 | 		PORT = "80" // Default port 283 | 	} 284 | 285 | 	// Start the server 286 | 	router.Run(fmt.Sprintf(":%s", PORT)) 287 | } 288 | -------------------------------------------------------------------------------- /day-7/microservice-a/otel-collector-config.yaml: -------------------------------------------------------------------------------- 1 | # 👉 Note: this file is for testing in a local environment - nothing to do with k8s 2 | 3 | 4 | # receivers: 5 | #   otlp: 6 | #     protocols: 7 | #       http: 8 | #       grpc: 9 | 10 | receivers: 11 |   otlp: 12 |     protocols: 13 |       http: 14 |         endpoint: "0.0.0.0:4318" 15 |       grpc: 16 |         endpoint: "0.0.0.0:4317" 17 | 18 | processors: 19 |   batch: 20 | 21 | exporters: 22 |   prometheus: 23 |     endpoint: "0.0.0.0:8889" 24 |   otlp: 25 |     endpoint: "jaeger:4317" # Send data to Jaeger over gRPC 26 |     tls: 27 |       insecure: true 28 | service: 29 |   pipelines: 30 |     metrics: 31 |       receivers: [otlp] 32 |       processors: [batch] 33 |       exporters: [prometheus] 34 |     traces: 35 |       receivers: [otlp] 36 |       processors: [batch] 37 |       exporters: [otlp] 38 | -------------------------------------------------------------------------------- /day-7/microservice-a/prometheus.yaml: -------------------------------------------------------------------------------- 1 | # 👉 Note: this file is for testing in a local environment - nothing to do with k8s 2 | 3 | 4 | global: 5 |   scrape_interval: 2s 6 | 7 | scrape_configs: 8 |   - job_name: 'otel-collector' 9 |     scrape_interval: 2s 10 |     static_configs: 11 |       - targets: ['otel-collector:8889'] 12 | 13 | 14 | -------------------------------------------------------------------------------- /day-7/microservice-a/test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Set the base URL of your microservice-a application 4 | BASE_URL="http://localhost:8080" 5 | 6 | echo $BASE_URL 7 | 8 | # Define an array of endpoints 9 | ENDPOINTS=( 10 |   "/hello-a" 11 |   "/call-b" 12 |   "/getme-coffee" 13 | ) 14 | 15 | # Function to make a random request to one of the endpoints 16 | make_random_request() { 17 |   local endpoint=${ENDPOINTS[$RANDOM % ${#ENDPOINTS[@]}]} 18 |   curl -s -o /dev/null -w "%{http_code}" "$BASE_URL$endpoint" 19 | } 20 | 21 | # Make 1000 random requests 22 | for ((i=1; i<=1000; i++)); do 23 |   make_random_request 24 |   echo "Request $i completed" 25 |   sleep 0.1 # Optional: Sleep for a short duration between requests to simulate real traffic 26 | done 27 | 28 | echo "Completed 1000 requests" 29 | -------------------------------------------------------------------------------- /day-7/microservice-b/.dockerignore: -------------------------------------------------------------------------------- 1 | .env -------------------------------------------------------------------------------- /day-7/microservice-b/.env: -------------------------------------------------------------------------------- 1 | SVC_A_URI=http://localhost:8080 2 | OTEL_EXPORTER_OTLP_ENDPOINT="localhost:4317" 3 | OTEL_COLLECTOR_ENDPOINT=localhost:4318 4 | PORT=80 -------------------------------------------------------------------------------- /day-7/microservice-b/docker-compose.yml: -------------------------------------------------------------------------------- 1 | version: '3' 2 | 3 | services: 4 |   otel-collector: 5 |     image: otel/opentelemetry-collector-contrib:latest 6 |     command: ["--config=/etc/otel-collector-config.yaml"] 7 |     volumes: 8 |       - 
./otel-collector-config.yaml:/etc/otel-collector-config.yaml 9 | ports: 10 | - "4317:4317" # OTLP gRPC receiver 11 | - "4318:4318" # OTLP HTTP receiver 12 | - "8889:8889" # Prometheus metrics exporter 13 | depends_on: 14 | - jaeger 15 | 16 | prometheus: 17 | image: prom/prometheus:latest 18 | volumes: 19 | - ./prometheus.yaml:/etc/prometheus/prometheus.yml 20 | command: 21 | - "--config.file=/etc/prometheus/prometheus.yml" 22 | ports: 23 | - "9090:9090" 24 | depends_on: 25 | - otel-collector 26 | 27 | jaeger: 28 | image: jaegertracing/all-in-one:latest 29 | ports: 30 | - "16686:16686" # Jaeger UI 31 | - "14250:14250" # Jaeger gRPC receiver 32 | 33 | -------------------------------------------------------------------------------- /day-7/microservice-b/dockerfile: -------------------------------------------------------------------------------- 1 | # Use official Golang image as the build image 2 | FROM golang:1.23-alpine AS builder 3 | 4 | WORKDIR /app 5 | 6 | COPY go.mod ./ 7 | COPY go.sum ./ 8 | RUN go mod download 9 | 10 | COPY . ./ 11 | 12 | RUN go build -o service-b . 13 | 14 | # Use a minimal image for the runtime 15 | FROM alpine:latest 16 | 17 | WORKDIR /app 18 | 19 | COPY --from=builder /app/service-b . 20 | 21 | EXPOSE 80 22 | 23 | CMD ["./service-b"] 24 | -------------------------------------------------------------------------------- /day-7/microservice-b/go.mod: -------------------------------------------------------------------------------- 1 | module microservice-b 2 | 3 | go 1.23.0 4 | 5 | require ( 6 | github.com/gin-gonic/gin v1.10.0 7 | github.com/joho/godotenv v1.5.1 8 | go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin v0.55.0 9 | go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.55.0 10 | go.opentelemetry.io/otel v1.30.0 11 | go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.30.0 12 | go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.30.0 13 | go.opentelemetry.io/otel/metric v1.30.0 14 | go.opentelemetry.io/otel/sdk v1.30.0 15 | go.opentelemetry.io/otel/sdk/metric v1.30.0 16 | ) 17 | 18 | require ( 19 | github.com/bytedance/sonic v1.12.2 // indirect 20 | github.com/bytedance/sonic/loader v0.2.0 // indirect 21 | github.com/cenkalti/backoff/v4 v4.3.0 // indirect 22 | github.com/cloudwego/base64x v0.1.4 // indirect 23 | github.com/cloudwego/iasm v0.2.0 // indirect 24 | github.com/felixge/httpsnoop v1.0.4 // indirect 25 | github.com/gabriel-vasile/mimetype v1.4.5 // indirect 26 | github.com/gin-contrib/sse v0.1.0 // indirect 27 | github.com/go-logr/logr v1.4.2 // indirect 28 | github.com/go-logr/stdr v1.2.2 // indirect 29 | github.com/go-playground/locales v0.14.1 // indirect 30 | github.com/go-playground/universal-translator v0.18.1 // indirect 31 | github.com/go-playground/validator/v10 v10.22.1 // indirect 32 | github.com/goccy/go-json v0.10.3 // indirect 33 | github.com/google/uuid v1.6.0 // indirect 34 | github.com/grpc-ecosystem/grpc-gateway/v2 v2.22.0 // indirect 35 | github.com/json-iterator/go v1.1.12 // indirect 36 | github.com/klauspost/cpuid/v2 v2.2.8 // indirect 37 | github.com/leodido/go-urn v1.4.0 // indirect 38 | github.com/mattn/go-isatty v0.0.20 // indirect 39 | github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect 40 | github.com/modern-go/reflect2 v1.0.2 // indirect 41 | github.com/pelletier/go-toml/v2 v2.2.3 // indirect 42 | github.com/twitchyliquid64/golang-asm v0.15.1 // indirect 43 | github.com/ugorji/go/codec v1.2.12 // indirect 44 | 
go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.30.0 // indirect 45 | go.opentelemetry.io/otel/trace v1.30.0 // indirect 46 | go.opentelemetry.io/proto/otlp v1.3.1 // indirect 47 | golang.org/x/arch v0.10.0 // indirect 48 | golang.org/x/crypto v0.27.0 // indirect 49 | golang.org/x/net v0.29.0 // indirect 50 | golang.org/x/sys v0.25.0 // indirect 51 | golang.org/x/text v0.18.0 // indirect 52 | google.golang.org/genproto/googleapis/api v0.0.0-20240903143218-8af14fe29dc1 // indirect 53 | google.golang.org/genproto/googleapis/rpc v0.0.0-20240903143218-8af14fe29dc1 // indirect 54 | google.golang.org/grpc v1.66.1 // indirect 55 | google.golang.org/protobuf v1.34.2 // indirect 56 | gopkg.in/yaml.v3 v3.0.1 // indirect 57 | ) 58 | -------------------------------------------------------------------------------- /day-7/microservice-b/go.sum: -------------------------------------------------------------------------------- 1 | github.com/bytedance/sonic v1.12.2 h1:oaMFuRTpMHYLpCntGca65YWt5ny+wAceDERTkT2L9lg= 2 | github.com/bytedance/sonic v1.12.2/go.mod h1:B8Gt/XvtZ3Fqj+iSKMypzymZxw/FVwgIGKzMzT9r/rk= 3 | github.com/bytedance/sonic/loader v0.1.1/go.mod h1:ncP89zfokxS5LZrJxl5z0UJcsk4M4yY2JpfqGeCtNLU= 4 | github.com/bytedance/sonic/loader v0.2.0 h1:zNprn+lsIP06C/IqCHs3gPQIvnvpKbbxyXQP1iU4kWM= 5 | github.com/bytedance/sonic/loader v0.2.0/go.mod h1:ncP89zfokxS5LZrJxl5z0UJcsk4M4yY2JpfqGeCtNLU= 6 | github.com/cenkalti/backoff/v4 v4.3.0 h1:MyRJ/UdXutAwSAT+s3wNd7MfTIcy71VQueUuFK343L8= 7 | github.com/cenkalti/backoff/v4 v4.3.0/go.mod h1:Y3VNntkOUPxTVeUxJ/G5vcM//AlwfmyYozVcomhLiZE= 8 | github.com/cloudwego/base64x v0.1.4 h1:jwCgWpFanWmN8xoIUHa2rtzmkd5J2plF/dnLS6Xd/0Y= 9 | github.com/cloudwego/base64x v0.1.4/go.mod h1:0zlkT4Wn5C6NdauXdJRhSKRlJvmclQ1hhJgA0rcu/8w= 10 | github.com/cloudwego/iasm v0.2.0 h1:1KNIy1I1H9hNNFEEH3DVnI4UujN+1zjpuk6gwHLTssg= 11 | github.com/cloudwego/iasm v0.2.0/go.mod h1:8rXZaNYT2n95jn+zTI1sDr+IgcD2GVs0nlbbQPiEFhY= 12 | github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 13 | github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c= 14 | github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38= 15 | github.com/felixge/httpsnoop v1.0.4 h1:NFTV2Zj1bL4mc9sqWACXbQFVBBg2W3GPvqp8/ESS2Wg= 16 | github.com/felixge/httpsnoop v1.0.4/go.mod h1:m8KPJKqk1gH5J9DgRY2ASl2lWCfGKXixSwevea8zH2U= 17 | github.com/gabriel-vasile/mimetype v1.4.5 h1:J7wGKdGu33ocBOhGy0z653k/lFKLFDPJMG8Gql0kxn4= 18 | github.com/gabriel-vasile/mimetype v1.4.5/go.mod h1:ibHel+/kbxn9x2407k1izTA1S81ku1z/DlgOW2QE0M4= 19 | github.com/gin-contrib/sse v0.1.0 h1:Y/yl/+YNO8GZSjAhjMsSuLt29uWRFHdHYUb5lYOV9qE= 20 | github.com/gin-contrib/sse v0.1.0/go.mod h1:RHrZQHXnP2xjPF+u1gW/2HnVO7nvIa9PG3Gm+fLHvGI= 21 | github.com/gin-gonic/gin v1.10.0 h1:nTuyha1TYqgedzytsKYqna+DfLos46nTv2ygFy86HFU= 22 | github.com/gin-gonic/gin v1.10.0/go.mod h1:4PMNQiOhvDRa013RKVbsiNwoyezlm2rm0uX/T7kzp5Y= 23 | github.com/go-logr/logr v1.2.2/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbVz/1A= 24 | github.com/go-logr/logr v1.4.2 h1:6pFjapn8bFcIbiKo3XT4j/BhANplGihG6tvd+8rYgrY= 25 | github.com/go-logr/logr v1.4.2/go.mod h1:9T104GzyrTigFIr8wt5mBrctHMim0Nb2HLGrmQ40KvY= 26 | github.com/go-logr/stdr v1.2.2 h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag= 27 | github.com/go-logr/stdr v1.2.2/go.mod h1:mMo/vtBO5dYbehREoey6XUKy/eSumjCCveDpRre4VKE= 28 | github.com/go-playground/assert/v2 v2.2.0 h1:JvknZsQTYeFEAhQwI4qEt9cyV5ONwRHC+lYKSsYSR8s= 29 | github.com/go-playground/assert/v2 
v2.2.0/go.mod h1:VDjEfimB/XKnb+ZQfWdccd7VUvScMdVu0Titje2rxJ4= 30 | github.com/go-playground/locales v0.14.1 h1:EWaQ/wswjilfKLTECiXz7Rh+3BjFhfDFKv/oXslEjJA= 31 | github.com/go-playground/locales v0.14.1/go.mod h1:hxrqLVvrK65+Rwrd5Fc6F2O76J/NuW9t0sjnWqG1slY= 32 | github.com/go-playground/universal-translator v0.18.1 h1:Bcnm0ZwsGyWbCzImXv+pAJnYK9S473LQFuzCbDbfSFY= 33 | github.com/go-playground/universal-translator v0.18.1/go.mod h1:xekY+UJKNuX9WP91TpwSH2VMlDf28Uj24BCp08ZFTUY= 34 | github.com/go-playground/validator/v10 v10.22.1 h1:40JcKH+bBNGFczGuoBYgX4I6m/i27HYW8P9FDk5PbgA= 35 | github.com/go-playground/validator/v10 v10.22.1/go.mod h1:dbuPbCMFw/DrkbEynArYaCwl3amGuJotoKCe95atGMM= 36 | github.com/goccy/go-json v0.10.3 h1:KZ5WoDbxAIgm2HNbYckL0se1fHD6rz5j4ywS6ebzDqA= 37 | github.com/goccy/go-json v0.10.3/go.mod h1:oq7eo15ShAhp70Anwd5lgX2pLfOS3QCiwU/PULtXL6M= 38 | github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI= 39 | github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY= 40 | github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg= 41 | github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0= 42 | github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo= 43 | github.com/grpc-ecosystem/grpc-gateway/v2 v2.22.0 h1:asbCHRVmodnJTuQ3qamDwqVOIjwqUPTYmYuemVOx+Ys= 44 | github.com/grpc-ecosystem/grpc-gateway/v2 v2.22.0/go.mod h1:ggCgvZ2r7uOoQjOyu2Y1NhHmEPPzzuhWgcza5M1Ji1I= 45 | github.com/joho/godotenv v1.5.1 h1:7eLL/+HRGLY0ldzfGMeQkb7vMd0as4CfYvUVzLqw0N0= 46 | github.com/joho/godotenv v1.5.1/go.mod h1:f4LDr5Voq0i2e/R5DDNOoa2zzDfwtkZa6DnEwAbqwq4= 47 | github.com/json-iterator/go v1.1.12 h1:PV8peI4a0ysnczrg+LtxykD8LfKY9ML6u2jnxaEnrnM= 48 | github.com/json-iterator/go v1.1.12/go.mod h1:e30LSqwooZae/UwlEbR2852Gd8hjQvJoHmT4TnhNGBo= 49 | github.com/klauspost/cpuid/v2 v2.0.9/go.mod h1:FInQzS24/EEf25PyTYn52gqo7WaD8xa0213Md/qVLRg= 50 | github.com/klauspost/cpuid/v2 v2.2.8 h1:+StwCXwm9PdpiEkPyzBXIy+M9KUb4ODm0Zarf1kS5BM= 51 | github.com/klauspost/cpuid/v2 v2.2.8/go.mod h1:Lcz8mBdAVJIBVzewtcLocK12l3Y+JytZYpaMropDUws= 52 | github.com/knz/go-libedit v1.10.1/go.mod h1:MZTVkCWyz0oBc7JOWP3wNAzd002ZbM/5hgShxwh4x8M= 53 | github.com/kr/pretty v0.3.1 h1:flRD4NNwYAUpkphVc1HcthR4KEIFJ65n8Mw5qdRn3LE= 54 | github.com/kr/pretty v0.3.1/go.mod h1:hoEshYVHaxMs3cyo3Yncou5ZscifuDolrwPKZanG3xk= 55 | github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY= 56 | github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE= 57 | github.com/leodido/go-urn v1.4.0 h1:WT9HwE9SGECu3lg4d/dIA+jxlljEa1/ffXKmRjqdmIQ= 58 | github.com/leodido/go-urn v1.4.0/go.mod h1:bvxc+MVxLKB4z00jd1z+Dvzr47oO32F/QSNjSBOlFxI= 59 | github.com/mattn/go-isatty v0.0.20 h1:xfD0iDuEKnDkl03q4limB+vH+GxLEtL/jb4xVJSWWEY= 60 | github.com/mattn/go-isatty v0.0.20/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y= 61 | github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= 62 | github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd h1:TRLaZ9cD/w8PVh93nsPXa1VrQ6jlwL5oN8l14QlcNfg= 63 | github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q= 64 | github.com/modern-go/reflect2 v1.0.2 h1:xBagoLtFs94CBntxluKeaWgTMpvLxC4ur3nMaC9Gz0M= 65 | github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk= 66 | github.com/pelletier/go-toml/v2 
v2.2.3 h1:YmeHyLY8mFWbdkNWwpr+qIL2bEqT0o95WSdkNHvL12M= 67 | github.com/pelletier/go-toml/v2 v2.2.3/go.mod h1:MfCQTFTvCcUyyvvwm1+G6H/jORL20Xlb6rzQu9GuUkc= 68 | github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM= 69 | github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4= 70 | github.com/rogpeppe/go-internal v1.12.0 h1:exVL4IDcn6na9z1rAb56Vxr+CgyK3nn3O+epU5NdKM8= 71 | github.com/rogpeppe/go-internal v1.12.0/go.mod h1:E+RYuTGaKKdloAfM02xzb0FW3Paa99yedzYV+kq4uf4= 72 | github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME= 73 | github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSSt89Yw= 74 | github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo= 75 | github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI= 76 | github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg= 77 | github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg= 78 | github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU= 79 | github.com/stretchr/testify v1.8.1/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4= 80 | github.com/stretchr/testify v1.9.0 h1:HtqpIVDClZ4nwg75+f6Lvsy/wHu+3BoSGCbBAcpTsTg= 81 | github.com/stretchr/testify v1.9.0/go.mod h1:r2ic/lqez/lEtzL7wO/rwa5dbSLXVDPFyf8C91i36aY= 82 | github.com/twitchyliquid64/golang-asm v0.15.1 h1:SU5vSMR7hnwNxj24w34ZyCi/FmDZTkS4MhqMhdFk5YI= 83 | github.com/twitchyliquid64/golang-asm v0.15.1/go.mod h1:a1lVb/DtPvCB8fslRZhAngC2+aY1QWCk3Cedj/Gdt08= 84 | github.com/ugorji/go/codec v1.2.12 h1:9LC83zGrHhuUA9l16C9AHXAqEV/2wBQ4nkvumAE65EE= 85 | github.com/ugorji/go/codec v1.2.12/go.mod h1:UNopzCgEMSXjBc6AOMqYvWC1ktqTAfzJZUZgYf6w6lg= 86 | go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin v0.55.0 h1:n4Dd8YaDFeTd2uw+uCHJzOKeqfLgAOlePZpQ5f9cAoE= 87 | go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin v0.55.0/go.mod h1:8aCCTMjP225r98yevEMM5NYDb3ianWLoeIzZ1rPyxHU= 88 | go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.55.0 h1:ZIg3ZT/aQ7AfKqdwp7ECpOK6vHqquXXuyTjIO8ZdmPs= 89 | go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.55.0/go.mod h1:DQAwmETtZV00skUwgD6+0U89g80NKsJE3DCKeLLPQMI= 90 | go.opentelemetry.io/contrib/propagators/b3 v1.30.0 h1:vumy4r1KMyaoQRltX7cJ37p3nluzALX9nugCjNNefuY= 91 | go.opentelemetry.io/contrib/propagators/b3 v1.30.0/go.mod h1:fRbvRsaeVZ82LIl3u0rIvusIel2UUf+JcaaIpy5taho= 92 | go.opentelemetry.io/otel v1.30.0 h1:F2t8sK4qf1fAmY9ua4ohFS/K+FUuOPemHUIXHtktrts= 93 | go.opentelemetry.io/otel v1.30.0/go.mod h1:tFw4Br9b7fOS+uEao81PJjVMjW/5fvNCbpsDIXqP0pc= 94 | go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.30.0 h1:VrMAbeJz4gnVDg2zEzjHG4dEH86j4jO6VYB+NgtGD8s= 95 | go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp v1.30.0/go.mod h1:qqN/uFdpeitTvm+JDqqnjm517pmQRYxTORbETHq5tOc= 96 | go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.30.0 h1:lsInsfvhVIfOI6qHVyysXMNDnjO9Npvl7tlDPJFBVd4= 97 | go.opentelemetry.io/otel/exporters/otlp/otlptrace v1.30.0/go.mod h1:KQsVNh4OjgjTG0G6EiNi1jVpnaeeKsKMRwbLN+f1+8M= 98 | go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.30.0 h1:umZgi92IyxfXd/l4kaDhnKgY8rnN/cZcF1LKc6I8OQ8= 99 | go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp v1.30.0/go.mod h1:4lVs6obhSVRb1EW5FhOuBTyiQhtRtAnnva9vD3yRfq8= 100 | 
go.opentelemetry.io/otel/metric v1.30.0 h1:4xNulvn9gjzo4hjg+wzIKG7iNFEaBMX00Qd4QIZs7+w= 101 | go.opentelemetry.io/otel/metric v1.30.0/go.mod h1:aXTfST94tswhWEb+5QjlSqG+cZlmyXy/u8jFpor3WqQ= 102 | go.opentelemetry.io/otel/sdk v1.30.0 h1:cHdik6irO49R5IysVhdn8oaiR9m8XluDaJAs4DfOrYE= 103 | go.opentelemetry.io/otel/sdk v1.30.0/go.mod h1:p14X4Ok8S+sygzblytT1nqG98QG2KYKv++HE0LY/mhg= 104 | go.opentelemetry.io/otel/sdk/metric v1.30.0 h1:QJLT8Pe11jyHBHfSAgYH7kEmT24eX792jZO1bo4BXkM= 105 | go.opentelemetry.io/otel/sdk/metric v1.30.0/go.mod h1:waS6P3YqFNzeP01kuo/MBBYqaoBJl7efRQHOaydhy1Y= 106 | go.opentelemetry.io/otel/trace v1.30.0 h1:7UBkkYzeg3C7kQX8VAidWh2biiQbtAKjyIML8dQ9wmc= 107 | go.opentelemetry.io/otel/trace v1.30.0/go.mod h1:5EyKqTzzmyqB9bwtCCq6pDLktPK6fmGf/Dph+8VI02o= 108 | go.opentelemetry.io/proto/otlp v1.3.1 h1:TrMUixzpM0yuc/znrFTP9MMRh8trP93mkCiDVeXrui0= 109 | go.opentelemetry.io/proto/otlp v1.3.1/go.mod h1:0X1WI4de4ZsLrrJNLAQbFeLCm3T7yBkR0XqQ7niQU+8= 110 | golang.org/x/arch v0.10.0 h1:S3huipmSclq3PJMNe76NGwkBR504WFkQ5dhzWzP8ZW8= 111 | golang.org/x/arch v0.10.0/go.mod h1:FEVrYAQjsQXMVJ1nsMoVVXPZg6p2JE2mx8psSWTDQys= 112 | golang.org/x/crypto v0.27.0 h1:GXm2NjJrPaiv/h1tb2UH8QfgC/hOf/+z0p6PT8o1w7A= 113 | golang.org/x/crypto v0.27.0/go.mod h1:1Xngt8kV6Dvbssa53Ziq6Eqn0HqbZi5Z6R0ZpwQzt70= 114 | golang.org/x/net v0.29.0 h1:5ORfpBpCs4HzDYoodCDBbwHzdR5UrLBZ3sOnUJmFoHo= 115 | golang.org/x/net v0.29.0/go.mod h1:gLkgy8jTGERgjzMic6DS9+SP0ajcu6Xu3Orq/SpETg0= 116 | golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= 117 | golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg= 118 | golang.org/x/sys v0.25.0 h1:r+8e+loiHxRqhXVl6ML1nO3l1+oFoWbnlu2Ehimmi34= 119 | golang.org/x/sys v0.25.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA= 120 | golang.org/x/text v0.18.0 h1:XvMDiNzPAl0jr17s6W9lcaIhGUfUORdGCNsuLmPG224= 121 | golang.org/x/text v0.18.0/go.mod h1:BuEKDfySbSR4drPmRPG/7iBdf8hvFMuRexcpahXilzY= 122 | google.golang.org/genproto/googleapis/api v0.0.0-20240903143218-8af14fe29dc1 h1:hjSy6tcFQZ171igDaN5QHOw2n6vx40juYbC/x67CEhc= 123 | google.golang.org/genproto/googleapis/api v0.0.0-20240903143218-8af14fe29dc1/go.mod h1:qpvKtACPCQhAdu3PyQgV4l3LMXZEtft7y8QcarRsp9I= 124 | google.golang.org/genproto/googleapis/rpc v0.0.0-20240903143218-8af14fe29dc1 h1:pPJltXNxVzT4pK9yD8vR9X75DaWYYmLGMsEvBfFQZzQ= 125 | google.golang.org/genproto/googleapis/rpc v0.0.0-20240903143218-8af14fe29dc1/go.mod h1:UqMtugtsSgubUsoxbuAoiCXvqvErP7Gf0so0mK9tHxU= 126 | google.golang.org/grpc v1.66.1 h1:hO5qAXR19+/Z44hmvIM4dQFMSYX9XcWsByfoxutBpAM= 127 | google.golang.org/grpc v1.66.1/go.mod h1:s3/l6xSSCURdVfAnL+TqCNMyTDAGN6+lZeVxnZR128Y= 128 | google.golang.org/protobuf v1.34.2 h1:6xV6lTsCfpGD21XK49h7MhtcApnLqkfYgPcdHftf6hg= 129 | google.golang.org/protobuf v1.34.2/go.mod h1:qYOHts0dSfpeUzUFpOMr/WGzszTmLH+DiWniOlNbLDw= 130 | gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0= 131 | gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk= 132 | gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q= 133 | gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= 134 | gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA= 135 | gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM= 136 | nullprogram.com/x/optparse v1.0.0/go.mod 
h1:KdyPE+Igbe0jQUrVfMqDMeJQIJZEuyV7pjYmp6pbG50= 137 | -------------------------------------------------------------------------------- /day-7/microservice-b/main.go: -------------------------------------------------------------------------------- 1 | package main 2 | 3 | import ( 4 | "context" 5 | "fmt" 6 | "io/ioutil" 7 | "log" 8 | "net/http" 9 | "os" 10 | "time" 11 | 12 | "github.com/gin-gonic/gin" 13 | "github.com/joho/godotenv" 14 | 15 | "go.opentelemetry.io/otel" 16 | "go.opentelemetry.io/otel/attribute" 17 | "go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp" 18 | "go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp" 19 | "go.opentelemetry.io/otel/metric" 20 | sdkmetric "go.opentelemetry.io/otel/sdk/metric" 21 | "go.opentelemetry.io/otel/sdk/resource" 22 | "go.opentelemetry.io/otel/sdk/trace" 23 | semconv "go.opentelemetry.io/otel/semconv/v1.21.0" 24 | 25 | "go.opentelemetry.io/contrib/instrumentation/github.com/gin-gonic/gin/otelgin" 26 | "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp" 27 | ) 28 | 29 | var ( 30 | requestCounter metric.Int64Counter 31 | requestDuration metric.Float64Histogram 32 | activeRequestsCounter metric.Int64UpDownCounter 33 | ) 34 | 35 | func initProvider() (func(context.Context) error, error) { 36 | ctx := context.Background() 37 | 38 | // Load environment variables 39 | err := godotenv.Load() 40 | if err != nil { 41 | log.Println("No .env file found. Using environment variables 👌") 42 | } 43 | 44 | // Read the OTEL collector endpoint from environment variable 45 | otelEndpoint := os.Getenv("OTEL_COLLECTOR_ENDPOINT") 46 | if otelEndpoint == "" { 47 | otelEndpoint = "localhost:4318" // Default endpoint 48 | } 49 | 50 | // Create a resource with the service name 51 | res, err := resource.New(ctx, 52 | resource.WithAttributes( 53 | semconv.ServiceNameKey.String("microservice-b"), 54 | ), 55 | ) 56 | if err != nil { 57 | return nil, fmt.Errorf("failed to create resource: %w", err) 58 | } 59 | 60 | // Create OTLP trace exporter over HTTP with custom endpoint 61 | traceExporter, err := otlptracehttp.New(ctx, 62 | otlptracehttp.WithEndpoint(otelEndpoint), 63 | otlptracehttp.WithInsecure(), 64 | ) 65 | if err != nil { 66 | return nil, fmt.Errorf("failed to create trace exporter: %w", err) 67 | } 68 | 69 | // Create OTLP metric exporter over HTTP with custom endpoint 70 | metricExporter, err := otlpmetrichttp.New(ctx, 71 | otlpmetrichttp.WithEndpoint(otelEndpoint), 72 | otlpmetrichttp.WithInsecure(), 73 | ) 74 | if err != nil { 75 | return nil, fmt.Errorf("failed to create metric exporter: %w", err) 76 | } 77 | 78 | // Create trace provider with the exporter and resource 79 | tracerProvider := trace.NewTracerProvider( 80 | trace.WithBatcher(traceExporter), 81 | trace.WithResource(res), 82 | ) 83 | 84 | // Create metric reader and meter provider with the resource 85 | metricReader := sdkmetric.NewPeriodicReader(metricExporter) 86 | meterProvider := sdkmetric.NewMeterProvider( 87 | sdkmetric.WithReader(metricReader), 88 | sdkmetric.WithResource(res), 89 | ) 90 | 91 | // Set global providers 92 | otel.SetTracerProvider(tracerProvider) 93 | otel.SetMeterProvider(meterProvider) 94 | 95 | return func(ctx context.Context) error { 96 | err := tracerProvider.Shutdown(ctx) 97 | if err != nil { 98 | return err 99 | } 100 | err = meterProvider.Shutdown(ctx) 101 | if err != nil { 102 | return err 103 | } 104 | return nil 105 | }, nil 106 | } 107 | 108 | // Basic Hello Handler 109 | func hello(c *gin.Context) { 110 | startTime := 
time.Now() 111 | ctx := c.Request.Context() 112 | 113 | // Increment active requests 114 | activeRequestsCounter.Add(ctx, 1) 115 | defer activeRequestsCounter.Add(ctx, -1) 116 | 117 | c.JSON(http.StatusOK, gin.H{ 118 | "message": "👋 Hello from microservice-b", 119 | }) 120 | 121 | duration := time.Since(startTime).Milliseconds() 122 | 123 | requestCounter.Add(ctx, 1, metric.WithAttributes(attribute.String("endpoint", "/hello-b"))) 124 | requestDuration.Record(ctx, float64(duration), metric.WithAttributes(attribute.String("endpoint", "/hello-b"))) 125 | } 126 | 127 | // Call Service A Handler 128 | func callA(c *gin.Context) { 129 | startTime := time.Now() 130 | ctx := c.Request.Context() 131 | 132 | activeRequestsCounter.Add(ctx, 1) 133 | defer activeRequestsCounter.Add(ctx, -1) 134 | 135 | // Load environment variables 136 | err := godotenv.Load() 137 | if err != nil { 138 | log.Println("No .env file found. Using environment variables 👌") 139 | } 140 | 141 | SVC_A_URI := os.Getenv("SVC_A_URI") 142 | if SVC_A_URI == "" { 143 | SVC_A_URI = "http://localhost:8080" // Default URI for service-A 144 | } 145 | 146 | // Create a new HTTP client with OpenTelemetry instrumentation 147 | client := http.Client{ 148 | Transport: otelhttp.NewTransport(http.DefaultTransport), 149 | } 150 | 151 | // Create a new request 152 | req, err := http.NewRequest("GET", fmt.Sprintf("%s/hello-a", SVC_A_URI), nil) 153 | if err != nil { 154 | c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to create request to service-A"}) 155 | return 156 | } 157 | 158 | // Use the context from Gin 159 | req = req.WithContext(ctx) 160 | 161 | // Make the request 162 | resp, err := client.Do(req) 163 | if err != nil { 164 | c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to reach service-A"}) 165 | return 166 | } 167 | defer resp.Body.Close() 168 | 169 | resBody, _ := ioutil.ReadAll(resp.Body) 170 | 171 | c.JSON(http.StatusOK, gin.H{ 172 | "message": "🥳 Response from service-A", 173 | "response": string(resBody), 174 | }) 175 | 176 | duration := time.Since(startTime).Milliseconds() 177 | 178 | requestCounter.Add(ctx, 1, metric.WithAttributes(attribute.String("endpoint", "/call-a"))) 179 | requestDuration.Record(ctx, float64(duration), metric.WithAttributes(attribute.String("endpoint", "/call-a"))) 180 | } 181 | 182 | // Get Coffee Handler 183 | func getMeCoffee(c *gin.Context) { 184 | startTime := time.Now() 185 | ctx := c.Request.Context() 186 | 187 | activeRequestsCounter.Add(ctx, 1) 188 | defer activeRequestsCounter.Add(ctx, -1) 189 | 190 | // Create a new HTTP client with OpenTelemetry instrumentation 191 | client := http.Client{ 192 | Transport: otelhttp.NewTransport(http.DefaultTransport), 193 | } 194 | 195 | // Create a new request 196 | req, err := http.NewRequest("GET", "https://api.sampleapis.com/coffee/iced", nil) 197 | if err != nil { 198 | c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to create request to coffee API"}) 199 | return 200 | } 201 | 202 | // Use the context from Gin 203 | req = req.WithContext(ctx) 204 | 205 | // Make the request 206 | resp, err := client.Do(req) 207 | if err != nil { 208 | c.JSON(http.StatusInternalServerError, gin.H{"error": "Failed to fetch coffee"}) 209 | return 210 | } 211 | defer resp.Body.Close() 212 | 213 | resBody, _ := ioutil.ReadAll(resp.Body) 214 | 215 | c.JSON(http.StatusOK, gin.H{ 216 | "message": "🍵 Here is your coffee", 217 | "response": string(resBody), 218 | }) 219 | 220 | duration := time.Since(startTime).Milliseconds() 221 | 222 
| 	requestCounter.Add(ctx, 1, metric.WithAttributes(attribute.String("endpoint", "/getme-coffee"))) 223 | 	requestDuration.Record(ctx, float64(duration), metric.WithAttributes(attribute.String("endpoint", "/getme-coffee"))) 224 | } 225 | 226 | func main() { 227 | 	ctx := context.Background() 228 | 	shutdown, err := initProvider() 229 | 	if err != nil { 230 | 		log.Fatalf("Failed to initialize OpenTelemetry: %v", err) 231 | 	} 232 | 	defer func() { 233 | 		if err := shutdown(ctx); err != nil { 234 | 			log.Fatalf("Error shutting down provider: %v", err) 235 | 		} 236 | 	}() 237 | 238 | 	router := gin.Default() 239 | 240 | 	// Use OpenTelemetry middleware for Gin 241 | 	router.Use(otelgin.Middleware("microservice-b")) 242 | 243 | 	// Initialize the Meter 244 | 	meter := otel.GetMeterProvider().Meter("microservice-b") 245 | 246 | 	// Initialize instruments using the Meter interface methods 247 | 	requestCounter, err = meter.Int64Counter( 248 | 		"request_count", 249 | 		metric.WithDescription("Counts the number of requests received"), 250 | 	) 251 | 	if err != nil { 252 | 		log.Fatalf("Failed to create counter: %v", err) 253 | 	} 254 | 255 | 	requestDuration, err = meter.Float64Histogram( 256 | 		"request_duration_ms", 257 | 		metric.WithDescription("Records the duration of requests in milliseconds"), 258 | 	) 259 | 	if err != nil { 260 | 		log.Fatalf("Failed to create histogram: %v", err) 261 | 	} 262 | 263 | 	activeRequestsCounter, err = meter.Int64UpDownCounter( 264 | 		"active_requests", 265 | 		metric.WithDescription("Counts the number of active requests"), 266 | 	) 267 | 	if err != nil { 268 | 		log.Fatalf("Failed to create up-down counter: %v", err) 269 | 	} 270 | 271 | 	router.GET("/hello-b", hello) 272 | 	router.GET("/call-a", callA) 273 | 	router.GET("/getme-coffee", getMeCoffee) 274 | 275 | 	PORT := os.Getenv("PORT") 276 | 	if PORT == "" { 277 | 		PORT = "80" // Default port 278 | 	} 279 | 280 | 	// Start the server 281 | 	router.Run(fmt.Sprintf(":%s", PORT)) 282 | } 283 | -------------------------------------------------------------------------------- /day-7/microservice-b/otel-collector-config.yaml: -------------------------------------------------------------------------------- 1 | # 👉 Note: this file is for testing in a local environment - nothing to do with k8s 2 | 3 | 4 | # receivers: 5 | #   otlp: 6 | #     protocols: 7 | #       http: 8 | #       grpc: 9 | 10 | receivers: 11 |   otlp: 12 |     protocols: 13 |       http: 14 |         endpoint: "0.0.0.0:4318" 15 |       grpc: 16 |         endpoint: "0.0.0.0:4317" 17 | 18 | processors: 19 |   batch: 20 | 21 | exporters: 22 |   prometheus: 23 |     endpoint: "0.0.0.0:8889" 24 |   otlp: 25 |     endpoint: "jaeger:4317" # Send data to Jaeger over gRPC 26 |     tls: 27 |       insecure: true 28 | service: 29 |   pipelines: 30 |     metrics: 31 |       receivers: [otlp] 32 |       processors: [batch] 33 |       exporters: [prometheus] 34 |     traces: 35 |       receivers: [otlp] 36 |       processors: [batch] 37 |       exporters: [otlp] 38 | -------------------------------------------------------------------------------- /day-7/microservice-b/prometheus.yaml: -------------------------------------------------------------------------------- 1 | # 👉 Note: this file is for testing in a local environment - nothing to do with k8s 2 | 3 | global: 4 |   scrape_interval: 2s 5 | 6 | scrape_configs: 7 |   - job_name: 'otel-collector' 8 |     scrape_interval: 2s 9 |     static_configs: 10 |       - targets: ['otel-collector:8889'] 11 | -------------------------------------------------------------------------------- /day-7/microservice-b/test.sh: -------------------------------------------------------------------------------- 1 | 
#!/bin/bash 2 | 3 | # Set the base URL of your microservice-b application 4 | BASE_URL="http://localhost:8081" 5 | 6 | echo $BASE_URL 7 | 8 | # Define an array of endpoints 9 | ENDPOINTS=( 10 |   "/hello-b" 11 |   "/call-a" 12 |   "/getme-coffee" 13 | ) 14 | 15 | # Function to make a random request to one of the endpoints 16 | make_random_request() { 17 |   local endpoint=${ENDPOINTS[$RANDOM % ${#ENDPOINTS[@]}]} 18 |   curl -s -o /dev/null -w "%{http_code}" "$BASE_URL$endpoint" 19 | } 20 | 21 | # Make 1000 random requests 22 | for ((i=1; i<=1000; i++)); do 23 |   make_random_request 24 |   echo "Request $i completed" 25 |   sleep 0.1 # Optional: Sleep for a short duration between requests to simulate real traffic 26 | done 27 | 28 | echo "Completed 1000 requests" 29 | -------------------------------------------------------------------------------- /day-7/otel-collector-values.yaml: -------------------------------------------------------------------------------- 1 | # otel-collector-values.yaml 2 | 3 | mode: "deployment" 4 | 5 | config: 6 |   receivers: 7 |     otlp: 8 |       protocols: 9 |         http: 10 |           endpoint: "0.0.0.0:4318" 11 |         grpc: 12 |           endpoint: "0.0.0.0:4317" 13 | 14 |   processors: 15 |     batch: {} 16 | 17 |   exporters: 18 |     prometheus: 19 |       endpoint: "0.0.0.0:8889" 20 | 21 |     # jaeger: 22 |     #   endpoint: "jaeger-collector.olly:14250" # Jaeger gRPC endpoint 23 |     #   insecure: true 24 | 25 |     otlp: 26 |       endpoint: "jaeger-collector.olly:4317" # Update as per your Jaeger service 27 |       tls: 28 |         insecure: true 29 |     debug: 30 |       verbosity: detailed 31 |   service: 32 |     pipelines: 33 |       metrics: 34 |         receivers: [otlp] 35 |         processors: [batch] 36 |         exporters: [prometheus] 37 |       traces: 38 |         receivers: [otlp] 39 |         processors: [batch] 40 |         exporters: [otlp] 41 | 42 | image: 43 |   repository: "otel/opentelemetry-collector-contrib" # Use contrib image 44 |   tag: "latest" # Specify the desired tag 45 |   pullPolicy: "IfNotPresent" 46 | 47 | command: 48 |   name: "otelcol-contrib" # Optional: Update command name if necessary 49 | 50 | service: 51 |   type: ClusterIP 52 | 53 | # Uncomment and configure if using ServiceMonitor 54 | # serviceMonitor: 55 | #   enabled: true 56 | #   namespace: olly 57 | #   selector: 58 | #     matchLabels: 59 | #       app: otel-collector 60 | #   endpoints: 61 | #     - port: prometheus 62 | #       interval: 2s 63 | 64 | resources: 65 |   requests: 66 |     memory: "256Mi" 67 |     cpu: "250m" 68 |   limits: 69 |     memory: "512Mi" 70 |     cpu: "500m" 71 | 72 | ports: 73 |   prometheus: 74 |     enabled: true 75 |     containerPort: 8889 76 |     servicePort: 8889 77 |     hostPort: 8889 78 |     protocol: TCP 79 |     appProtocol: TCP 80 |   otlp: 81 |     enabled: true 82 |     containerPort: 4317 83 |     servicePort: 4317 84 |     hostPort: 4317 85 |     protocol: TCP 86 |     # nodePort: 30317 87 |     appProtocol: grpc 88 |   otlp-http: 89 |     enabled: true 90 |     containerPort: 4318 91 |     servicePort: 4318 92 |     hostPort: 4318 93 |     protocol: TCP 94 |   jaeger-compact: 95 |     enabled: true 96 |     containerPort: 6831 97 |     servicePort: 6831 98 |     hostPort: 6831 99 |     protocol: UDP 100 |   jaeger-thrift: 101 |     enabled: true 102 |     containerPort: 14268 103 |     servicePort: 14268 104 |     hostPort: 14268 105 |     protocol: TCP 106 |   jaeger-grpc: 107 |     enabled: true 108 |     containerPort: 14250 109 |     servicePort: 14250 110 |     hostPort: 14250 111 |     protocol: TCP 112 |   zipkin: 113 |     enabled: true 114 |     containerPort: 9411 115 |     servicePort: 9411 116 |     hostPort: 9411 117 |     protocol: TCP -------------------------------------------------------------------------------- /day-7/prometheus-values.yaml: 
-------------------------------------------------------------------------------- 1 | serverFiles: 2 |   prometheus.yml: 3 |     scrape_configs: 4 |       - job_name: otel-collector 5 |         static_configs: 6 |           - targets: 7 |             - otel-collector-opentelemetry-collector.olly:8889 8 | 9 | alertmanager: 10 |   enabled: false 11 | 12 | prometheus-pushgateway: 13 |   enabled: false 14 | 15 | kube-state-metrics: 16 |   enabled: false 17 | 18 | prometheus-node-exporter: 19 |   enabled: false -------------------------------------------------------------------------------- /day-7/test.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | # Check if both load balancer URLs are provided as input 4 | if [ $# -ne 2 ]; then 5 |   echo "Usage: $0 <LB1-URL> <LB2-URL>" 6 |   exit 1 7 | fi 8 | 9 | # Assign input arguments to variables 10 | LB1=$1 11 | LB2=$2 12 | 13 | # Define available routes for LB1 and LB2 14 | LB1_ROUTES=("/call-b" "/hello-a" "/getme-coffee") 15 | LB2_ROUTES=("/call-a" "/hello-b" "/getme-coffee") 16 | 17 | # Function to generate random index and request from LB1 18 | request_lb1() { 19 |   RANDOM_INDEX=$((RANDOM % ${#LB1_ROUTES[@]})) 20 |   URL="$LB1${LB1_ROUTES[$RANDOM_INDEX]}" 21 |   echo "Sending request to LB1: $URL" 22 |   curl -s -o /dev/null -w "%{http_code}" $URL 23 | } 24 | 25 | # Function to generate random index and request from LB2 26 | request_lb2() { 27 |   RANDOM_INDEX=$((RANDOM % ${#LB2_ROUTES[@]})) 28 |   URL="$LB2${LB2_ROUTES[$RANDOM_INDEX]}" 29 |   echo "Sending request to LB2: $URL" 30 |   curl -s -o /dev/null -w "%{http_code}" $URL 31 | } 32 | 33 | # Loop for sending requests to both LBs randomly 34 | while true; do 35 |   # Randomly choose between LB1 and LB2 36 |   if (( RANDOM % 2 == 0 )); then 37 |     request_lb1 38 |   else 39 |     request_lb2 40 |   fi 41 | 42 |   # Sleep for 1 second between requests (adjust if needed) 43 |   sleep 1 44 | done 45 | -------------------------------------------------------------------------------- /opensearch-stack/fluent-bit-config.yaml: -------------------------------------------------------------------------------- 1 | # fluent-bit-config.yaml 2 | apiVersion: v1 3 | kind: ConfigMap 4 | metadata: 5 |   name: fluent-bit-config 6 |   namespace: logging 7 | data: 8 |   fluent-bit.conf: | 9 |     [SERVICE] 10 |         Flush             5 11 |         Daemon            Off 12 |         Log_Level         info 13 |         Parsers_File      parsers.conf 14 | 15 |     [INPUT] 16 |         Name              tail 17 |         Path              /var/log/containers/*.log 18 |         Parser            docker 19 |         Tag               kube.* 20 |         Refresh_Interval  5 21 |         Mem_Buf_Limit     5MB 22 |         Skip_Long_Lines   On 23 | 24 |     [FILTER] 25 |         Name              kubernetes 26 |         Match             kube.* 27 |         Kube_URL          https://kubernetes.default.svc:443 28 |         Merge_Log         On 29 |         K8S-Logging.Parser On 30 |         K8S-Logging.Exclude Off 31 | 32 |     [OUTPUT] 33 |         Name              opensearch 34 |         Match             * 35 |         Host              <OPENSEARCH_HOST> 36 |         Port              <OPENSEARCH_PORT> 37 |         Index             fluentbit 38 |         HTTP_User         <OPENSEARCH_USER> 39 |         HTTP_Passwd       <OPENSEARCH_PASSWORD> 40 |         TLS               On 41 |         TLS.verify        Off 42 |         Suppress_Type_Name On 43 |         Include_Tag_Key   On 44 |         Logstash_Format   On 45 |         Logstash_Prefix   kubernetes 46 |         Replace_Dots      On 47 |         Retry_Limit       False 48 |         # Add these parameters for OpenSearch compatibility 49 |         Write_Operation   create 50 | 51 | 52 |   parsers.conf: | 53 |     [PARSER] 54 |         Name              docker 55 |         Format            json 56 |         Time_Key          time 57 |         Time_Format       %Y-%m-%dT%H:%M:%S.%L 58 |         Time_Keep         On 59 |         Decode_Field_As   escaped_utf8 log do_next 60 |         Decode_Field_As   json log 61 | -------------------------------------------------------------------------------- /opensearch-stack/fluent-bit-daemonset.yaml: -------------------------------------------------------------------------------- 1 | #
fluent-bit-daemonset.yaml 2 | apiVersion: apps/v1 3 | kind: DaemonSet 4 | metadata: 5 |   name: fluent-bit 6 |   namespace: logging 7 |   labels: 8 |     app: fluent-bit 9 | spec: 10 |   selector: 11 |     matchLabels: 12 |       app: fluent-bit 13 |   template: 14 |     metadata: 15 |       labels: 16 |         app: fluent-bit 17 |     spec: 18 |       serviceAccountName: fluent-bit 19 |       containers: 20 |         - name: fluent-bit 21 |           image: fluent/fluent-bit:3.0.4 22 |           imagePullPolicy: Always 23 |           volumeMounts: 24 |             - name: varlog 25 |               mountPath: /var/log 26 |             - name: varlibdockercontainers 27 |               mountPath: /var/lib/docker/containers 28 |               readOnly: true 29 |             - name: config 30 |               mountPath: /fluent-bit/etc/ 31 |       volumes: 32 |         - name: varlog 33 |           hostPath: 34 |             path: /var/log 35 |         - name: varlibdockercontainers 36 |           hostPath: 37 |             path: /var/lib/docker/containers 38 |         - name: config 39 |           configMap: 40 |             name: fluent-bit-config 41 | -------------------------------------------------------------------------------- /opensearch-stack/log-generator.yaml: -------------------------------------------------------------------------------- 1 | # log-generator-deployment.yaml 2 | apiVersion: apps/v1 3 | kind: Deployment 4 | metadata: 5 |   name: log-generator 6 |   labels: 7 |     app: log-generator 8 | spec: 9 |   replicas: 1 10 |   selector: 11 |     matchLabels: 12 |       app: log-generator 13 |   template: 14 |     metadata: 15 |       labels: 16 |         app: log-generator 17 |     spec: 18 |       containers: 19 |         - name: log-generator 20 |           image: busybox 21 |           command: ["/bin/sh", "-c"] 22 |           args: 23 |             - | 24 |               while true; do 25 |                 echo "$(date) INFO: Application started"; 26 |                 echo "$(date) DEBUG: Debugging app logic"; 27 |                 echo "$(date) ERROR: An error occurred!"; 28 |                 sleep 5; 29 |               done 30 | -------------------------------------------------------------------------------- /opensearch-stack/prerequisites.md: -------------------------------------------------------------------------------- 1 | # OpenSearch Stack 2 | 3 | ### Prerequisites 4 | 5 | - Set up a Kubernetes cluster 6 | - Create the logging namespace - `kubectl create ns logging` 7 | - Create a service account in the namespace - `kubectl create sa fluent-bit -n logging` (grant it API read access as shown in the RBAC sketch below) 8 | 
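9 | ### RBAC for the Fluent Bit service account
10 | 
11 | The `kubernetes` filter in `fluent-bit-config.yaml` calls the Kubernetes API to enrich each log record with pod metadata, so the `fluent-bit` service account needs read access to pods and namespaces. Below is a minimal sketch; the resource names and the file name `fluent-bit-rbac.yaml` are assumptions, so adjust them to your own conventions:
12 | 
13 | ```yaml
14 | # fluent-bit-rbac.yaml (hypothetical file name)
15 | # Grants the fluent-bit service account read-only access to pod and namespace metadata.
16 | apiVersion: rbac.authorization.k8s.io/v1
17 | kind: ClusterRole
18 | metadata:
19 |   name: fluent-bit-read
20 | rules:
21 |   - apiGroups: [""]
22 |     resources: ["pods", "namespaces"]
23 |     verbs: ["get", "list", "watch"]
24 | ---
25 | apiVersion: rbac.authorization.k8s.io/v1
26 | kind: ClusterRoleBinding
27 | metadata:
28 |   name: fluent-bit-read
29 | roleRef:
30 |   apiGroup: rbac.authorization.k8s.io
31 |   kind: ClusterRole
32 |   name: fluent-bit-read
33 | subjects:
34 |   - kind: ServiceAccount
35 |     name: fluent-bit
36 |     namespace: logging
37 | ```
38 | 
39 | Apply it with `kubectl apply -f fluent-bit-rbac.yaml` before deploying the DaemonSet.
--------------------------------------------------------------------------------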