├── generic_conf ├── setup_logging.conf ├── basic_vts_location.conf ├── basic_vts_setup.conf ├── lua_path_setup.conf ├── define_cache.conf ├── setup_cache.conf └── backend_definition.conf ├── load_test.sh ├── img ├── .DS_Store ├── 2.2.0_wrk.webp ├── add_source.webp ├── cache_hit.webp ├── cache_lock.webp ├── set_source.webp ├── 2.2.0_metrics.webp ├── 2.2.1_wrk_1s.webp ├── 2.2.1_wrk_60s.webp ├── 3.0.0_metrics.webp ├── 3.1.0_metrics.webp ├── 3.1.1_metrics.webp ├── 4.0.0_metrics.webp ├── 4.0.1_metrics.webp ├── edge_backend.webp ├── metrics_status.webp ├── 2.2.1_metrics_1s.webp ├── 2.2.1_metrics_60s.webp ├── initial_architecture.webp ├── metrics_architecture.webp ├── nginx_directive_restriction.webp └── simplified_workers_nginx_architecture.webp ├── data └── grafana │ ├── grafana.db │ └── alerting │ └── 1 │ └── __default__.tmpl ├── .gitignore ├── src ├── edge.lua ├── backend.lua ├── load_tests.lua ├── loadbalancer.lua └── simulations.lua ├── Dockerfile ├── config └── prometheus.yml ├── nginx_backend.conf ├── nginx_edge.conf ├── nginx_loadbalancer.conf ├── LICENSE ├── docker-compose.yaml └── README.md /generic_conf/setup_logging.conf: -------------------------------------------------------------------------------- 1 | #access_log /dev/stdout; 2 | access_log off; 3 | -------------------------------------------------------------------------------- /load_test.sh: -------------------------------------------------------------------------------- 1 | wrk -c10 -t2 -d600s -s ./src/load_tests.lua --latency http://localhost:18080 2 | -------------------------------------------------------------------------------- /img/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/.DS_Store -------------------------------------------------------------------------------- /img/2.2.0_wrk.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/2.2.0_wrk.webp -------------------------------------------------------------------------------- /img/add_source.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/add_source.webp -------------------------------------------------------------------------------- /img/cache_hit.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/cache_hit.webp -------------------------------------------------------------------------------- /img/cache_lock.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/cache_lock.webp -------------------------------------------------------------------------------- /img/set_source.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/set_source.webp -------------------------------------------------------------------------------- /img/2.2.0_metrics.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/2.2.0_metrics.webp -------------------------------------------------------------------------------- 
/img/2.2.1_wrk_1s.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/2.2.1_wrk_1s.webp -------------------------------------------------------------------------------- /img/2.2.1_wrk_60s.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/2.2.1_wrk_60s.webp -------------------------------------------------------------------------------- /img/3.0.0_metrics.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/3.0.0_metrics.webp -------------------------------------------------------------------------------- /img/3.1.0_metrics.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/3.1.0_metrics.webp -------------------------------------------------------------------------------- /img/3.1.1_metrics.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/3.1.1_metrics.webp -------------------------------------------------------------------------------- /img/4.0.0_metrics.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/4.0.0_metrics.webp -------------------------------------------------------------------------------- /img/4.0.1_metrics.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/4.0.1_metrics.webp -------------------------------------------------------------------------------- /img/edge_backend.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/edge_backend.webp -------------------------------------------------------------------------------- /data/grafana/grafana.db: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/data/grafana/grafana.db -------------------------------------------------------------------------------- /img/metrics_status.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/metrics_status.webp -------------------------------------------------------------------------------- /img/2.2.1_metrics_1s.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/2.2.1_metrics_1s.webp -------------------------------------------------------------------------------- /img/2.2.1_metrics_60s.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/2.2.1_metrics_60s.webp -------------------------------------------------------------------------------- /img/initial_architecture.webp: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/initial_architecture.webp -------------------------------------------------------------------------------- /img/metrics_architecture.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/metrics_architecture.webp -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | !edge/.gitkeep 2 | !pop/.gitkeep 3 | !data/grafana/grafana.db 4 | edge/* 5 | pop/* 6 | data/prometheus/* 7 | .DS_Store/* 8 | -------------------------------------------------------------------------------- /img/nginx_directive_restriction.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/nginx_directive_restriction.webp -------------------------------------------------------------------------------- /generic_conf/basic_vts_location.conf: -------------------------------------------------------------------------------- 1 | location /status { 2 | vhost_traffic_status_display; 3 | vhost_traffic_status_display_format html; 4 | } 5 | -------------------------------------------------------------------------------- /img/simplified_workers_nginx_architecture.webp: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/leandromoreira/cdn-up-and-running/HEAD/img/simplified_workers_nginx_architecture.webp -------------------------------------------------------------------------------- /src/edge.lua: -------------------------------------------------------------------------------- 1 | local simulations = require "simulations" 2 | local edge = {} 3 | 4 | edge.simulate_load = function() 5 | simulations.for_work_longtail(simulations.profiles.edge) 6 | end 7 | 8 | return edge 9 | -------------------------------------------------------------------------------- /generic_conf/basic_vts_setup.conf: -------------------------------------------------------------------------------- 1 | vhost_traffic_status_zone shared:vhost_traffic_status:12m; 2 | vhost_traffic_status_filter_by_set_key $status status::*; 3 | vhost_traffic_status_histogram_buckets 0.005 0.01 0.05 0.1 0.5 1 5 10; # buckets are in seconds 4 | -------------------------------------------------------------------------------- /generic_conf/lua_path_setup.conf: -------------------------------------------------------------------------------- 1 | lua_package_path "/usr/local/openresty/lualib/?.lua;/usr/local/openresty/luajit/share/lua/5.1/?.lua;/lua/src/?.lua"; 2 | lua_package_cpath "/usr/local/openresty/lualib/?.so;/usr/local/openresty/luajit/lib/lua/5.1/?.so;"; 3 | -------------------------------------------------------------------------------- /generic_conf/define_cache.conf: -------------------------------------------------------------------------------- 1 | proxy_cache zone_1; 2 | proxy_cache_key $cache_key; 3 | proxy_cache_lock on; 4 | proxy_http_version 1.1; 5 | proxy_set_header Connection ""; 6 | proxy_buffering on; 7 | proxy_buffers 16 16k; 8 | add_header X-Cache-Status $upstream_cache_status; 9 | -------------------------------------------------------------------------------- /generic_conf/setup_cache.conf: -------------------------------------------------------------------------------- 1 | proxy_cache_path 
/cache/ levels=2:2 keys_zone=zone_1:10m max_size=10m inactive=10m use_temp_path=off; 2 | proxy_cache_lock_timeout 2s; 3 | proxy_cache_use_stale error timeout updating; 4 | proxy_read_timeout 2s; 5 | proxy_send_timeout 2s; 6 | proxy_ignore_client_abort on; 7 | -------------------------------------------------------------------------------- /Dockerfile: -------------------------------------------------------------------------------- 1 | FROM igorbarinov/openresty-nginx-module-vts 2 | 3 | RUN apk --no-cache --virtual .build-deps add build-base git \ 4 | && git clone https://github.com/openresty/lua-resty-balancer.git \ 5 | && cd lua-resty-balancer/ \ 6 | && make \ 7 | && cp -r lib/resty/* /usr/local/openresty/lualib/resty/ \ 8 | && cp librestychash.so /usr/local/openresty/lualib/ \ 9 | && apk del .build-deps 10 | -------------------------------------------------------------------------------- /src/backend.lua: -------------------------------------------------------------------------------- 1 | local simulations = require "simulations" 2 | local backend = {} 3 | 4 | backend.generate_content = function() 5 | simulations.for_work_longtail(simulations.profiles.backend) 6 | 7 | ngx.header['Content-Type'] = 'application/json' 8 | ngx.header['Cache-Control'] = 'public, max-age=' .. (ngx.var.arg_max_age or 10) 9 | 10 | ngx.say('{"service": "api", "value": 42, "request": "' .. ngx.var.uri .. '"}') 11 | end 12 | 13 | return backend 14 | -------------------------------------------------------------------------------- /config/prometheus.yml: -------------------------------------------------------------------------------- 1 | global: 2 | scrape_interval: 10s # By default, scrape targets every 15 seconds. 3 | evaluation_interval: 10s # By default, scrape targets every 15 seconds. 4 | scrape_timeout: 2s # the global default (10s). 5 | 6 | external_labels: 7 | monitor: 'CDN' 8 | 9 | scrape_configs: 10 | - job_name: 'prometheus' 11 | metrics_path: '/status/format/prometheus' 12 | static_configs: 13 | - targets: ['loadbalancer:8080', 'edge:8080', 'edge1:8080', 'edge2:8080', 'backend:8080', 'backend1:8080'] 14 | -------------------------------------------------------------------------------- /nginx_backend.conf: -------------------------------------------------------------------------------- 1 | # vi:syntax=nginx 2 | events { 3 | worker_connections 1024; 4 | } 5 | 6 | error_log stderr; 7 | 8 | http { 9 | include generic_conf/setup_logging.conf; 10 | 11 | include generic_conf/lua_path_setup.conf; 12 | include generic_conf/basic_vts_setup.conf; 13 | 14 | server { 15 | listen 8080; 16 | 17 | location / { 18 | content_by_lua_block { 19 | local backend = require "backend" 20 | backend.generate_content() 21 | } 22 | } 23 | 24 | include generic_conf/basic_vts_location.conf; 25 | } 26 | } 27 | 28 | 29 | -------------------------------------------------------------------------------- /src/load_tests.lua: -------------------------------------------------------------------------------- 1 | math.randomseed(os.time()) 2 | local random = math.random 3 | 4 | local popular_percentage = 96 5 | local popular_items_quantity = 5 6 | local max_total_items = 200 7 | 8 | -- trying to model the long tail 9 | request = function() 10 | local is_popular = random(1, 100) <= popular_percentage 11 | local item = "" 12 | 13 | if is_popular then 14 | item = "item-" .. random(1, popular_items_quantity) 15 | else 16 | item = "item-" .. 
random(popular_items_quantity + 1, popular_items_quantity + max_total_items) 17 | end 18 | 19 | return wrk.format(nil, "/path/" .. item .. ".ext") 20 | end 21 | 22 | -------------------------------------------------------------------------------- /generic_conf/backend_definition.conf: -------------------------------------------------------------------------------- 1 | env BACKEND_HOST; # allow list for os.getenv 2 | env BACKEND_PORT; # allow list for os.getenv 3 | 4 | upstream backend { 5 | server 0.0.0.1; # just an invalid address as a place holder 6 | 7 | balancer_by_lua_block { 8 | local balancer = require "ngx.balancer" 9 | local host = os.getenv("BACKEND_HOST") 10 | local port = number(os.getenv("BACKEND_PORT")) 11 | 12 | local ok, err = balancer.set_current_peer(host, port) 13 | if not ok then 14 | ngx.log(ngx.ERR, "failed to set the current peer: ", err) 15 | return ngx.exit(500) 16 | end 17 | } 18 | 19 | keepalive 10; # connection pool 20 | } 21 | 22 | -------------------------------------------------------------------------------- /nginx_edge.conf: -------------------------------------------------------------------------------- 1 | # vi:syntax=nginx 2 | events { 3 | worker_connections 1024; 4 | } 5 | 6 | error_log stderr; 7 | 8 | http { 9 | resolver 127.0.0.11 ipv6=off; 10 | include generic_conf/setup_logging.conf; 11 | 12 | include generic_conf/lua_path_setup.conf; 13 | include generic_conf/basic_vts_setup.conf; 14 | include generic_conf/setup_cache.conf; 15 | 16 | upstream backend { 17 | server backend:8080; 18 | server backend1:8080; 19 | keepalive 10; # connection pool 20 | } 21 | 22 | server { 23 | listen 8080; 24 | 25 | location / { 26 | set_by_lua_block $cache_key { 27 | return ngx.var.uri 28 | } 29 | 30 | access_by_lua_block { 31 | local edge = require "edge" 32 | edge.simulate_load() 33 | } 34 | 35 | proxy_pass http://backend; 36 | include generic_conf/define_cache.conf; 37 | add_header X-Edge Server; 38 | } 39 | 40 | include generic_conf/basic_vts_location.conf; 41 | } 42 | 43 | } 44 | 45 | -------------------------------------------------------------------------------- /nginx_loadbalancer.conf: -------------------------------------------------------------------------------- 1 | # vi:syntax=nginx 2 | events { 3 | worker_connections 1024; 4 | } 5 | 6 | error_log stderr; 7 | 8 | http { 9 | resolver 127.0.0.11 ipv6=off; 10 | include generic_conf/setup_logging.conf; 11 | 12 | include generic_conf/lua_path_setup.conf; 13 | include generic_conf/basic_vts_setup.conf; 14 | include generic_conf/setup_cache.conf; 15 | 16 | init_by_lua_block { 17 | loadbalancer = require "loadbalancer" 18 | loadbalancer.setup_server_list() 19 | } 20 | 21 | upstream backend { 22 | server 0.0.0.1; 23 | balancer_by_lua_block { 24 | loadbalancer.set_proper_server() 25 | } 26 | keepalive 60; 27 | } 28 | 29 | server { 30 | listen 8080; 31 | 32 | location / { 33 | access_by_lua_block { 34 | loadbalancer.resolve_name_for_upstream() 35 | } 36 | 37 | proxy_pass http://backend; 38 | add_header X-Edge LoadBalancer; 39 | } 40 | 41 | include generic_conf/basic_vts_location.conf; 42 | } 43 | 44 | } 45 | 46 | -------------------------------------------------------------------------------- /src/loadbalancer.lua: -------------------------------------------------------------------------------- 1 | local resty_chash = require "resty.chash" 2 | 3 | local loadbalancer = {} 4 | 5 | loadbalancer.setup_server_list = function() 6 | local server_list = { 7 | ["edge"] = 1, 8 | ["edge1"] = 1, 9 | ["edge2"] = 1, 10 | } 11 | 
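-- resty.chash (from lua-resty-balancer) builds a consistent-hashing ring over the weighted server list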
local chash_up = resty_chash:new(server_list) 12 | 13 | package.loaded.my_chash_up = chash_up 14 | package.loaded.my_servers = server_list 15 | end 16 | 17 | loadbalancer.set_proper_server = function() 18 | local b = require "ngx.balancer" 19 | local chash_up = package.loaded.my_chash_up 20 | local servers = package.loaded.my_ip_servers 21 | local id = chash_up:find(ngx.var.uri) -- hashing based on uri 22 | 23 | assert(b.set_current_peer(servers[id] .. ":8080")) 24 | end 25 | 26 | loadbalancer.resolve_name_for_upstream = function() 27 | local resolver = require "resty.dns.resolver" 28 | local r, err = resolver:new{ 29 | nameservers = {"127.0.0.11", {"127.0.0.11", 53} }, 30 | retrans = 5, 31 | timeout = 1000, 32 | no_random = true, 33 | } 34 | -- quick hack, we could use ips already 35 | -- or resolve names on background 36 | if package.loaded.my_ip_servers ~= nil then 37 | return 38 | end 39 | 40 | local servers = package.loaded.my_servers 41 | local ip_servers = {} 42 | 43 | for host, weight in pairs(servers) do 44 | local answers, err, tries = r:query(host, nil, {}) 45 | ip_servers[host] = answers[1].address 46 | end 47 | 48 | package.loaded.my_ip_servers = ip_servers 49 | end 50 | 51 | return loadbalancer 52 | -------------------------------------------------------------------------------- /src/simulations.lua: -------------------------------------------------------------------------------- 1 | local simulations = {} 2 | local random = math.random 3 | local sleep = ngx.sleep 4 | local second = 0.001 -- a millisecond in second 5 | 6 | -- setup entropy 7 | math.randomseed(ngx.time() + ngx.worker.pid()) 8 | 9 | -- a percentile distribution based on a percentiles map 10 | -- { 11 | -- { 12 | -- p=50, min=1, max=400, 13 | -- } 14 | -- } 15 | -- for instance, for 50% we'll wait min 1ms and max 400ms 16 | simulations.for_work_longtail = function(percentiles) 17 | -- sort by percentile 18 | table.sort(percentiles, function(a,b) return a.p < b.p end) 19 | 20 | local current_percentage = random(1, 100) 21 | local min_wait_ms = 1 22 | local max_wait_ms = 1000 23 | 24 | for _, percentile in pairs(percentiles) do 25 | if current_percentage <= percentile.p then 26 | min_wait_ms = percentile.min 27 | max_wait_ms = percentile.max 28 | break 29 | end 30 | end 31 | 32 | local sleep_seconds = random(min_wait_ms, max_wait_ms) * second -- sleep expects seconds 33 | ngx.header["X-Latency"] = "simulated=" .. sleep_seconds .. "s, min=" .. min_wait_ms .. ", max=" .. max_wait_ms .. ", profile=" .. (ngx.var.arg_profile or "empty") 34 | 35 | sleep(sleep_seconds) 36 | end 37 | 38 | -- the percentile latency configuation in ms 39 | simulations.profiles = { 40 | edge={ 41 | {p=50, min=1, max=20,}, {p=90, min=21, max=50,}, {p=95, min=51, max=150,}, {p=99, min=151, max=500,}, 42 | }, 43 | backend={ 44 | {p=50, min=100, max=400,}, {p=90, min=401, max=500,}, {p=95, min=501, max=1500,}, {p=99, min=1501, max=3000,}, 45 | }, 46 | } 47 | 48 | return simulations 49 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | BSD 3-Clause License 2 | 3 | Copyright (c) 2021, Leandro Moreira 4 | All rights reserved. 5 | 6 | Redistribution and use in source and binary forms, with or without 7 | modification, are permitted provided that the following conditions are met: 8 | 9 | 1. Redistributions of source code must retain the above copyright notice, this 10 | list of conditions and the following disclaimer. 
11 | 12 | 2. Redistributions in binary form must reproduce the above copyright notice, 13 | this list of conditions and the following disclaimer in the documentation 14 | and/or other materials provided with the distribution. 15 | 16 | 3. Neither the name of the copyright holder nor the names of its 17 | contributors may be used to endorse or promote products derived from 18 | this software without specific prior written permission. 19 | 20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 23 | DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 30 | -------------------------------------------------------------------------------- /data/grafana/alerting/1/__default__.tmpl: -------------------------------------------------------------------------------- 1 | 2 | {{ define "__subject" }}[{{ .Status | toUpper }}{{ if eq .Status "firing" }}:{{ .Alerts.Firing | len }}{{ end }}] {{ .GroupLabels.SortedPairs.Values | join " " }} {{ if gt (len .CommonLabels) (len .GroupLabels) }}({{ with .CommonLabels.Remove .GroupLabels.Names }}{{ .Values | join " " }}{{ end }}){{ end }}{{ end }} 3 | 4 | {{ define "__text_alert_list" }}{{ range . }} 5 | Value: {{ or .ValueString "" }} 6 | Labels: 7 | {{ range .Labels.SortedPairs }} - {{ .Name }} = {{ .Value }} 8 | {{ end }}Annotations: 9 | {{ range .Annotations.SortedPairs }} - {{ .Name }} = {{ .Value }} 10 | {{ end }}{{ if gt (len .GeneratorURL) 0 }}Source: {{ .GeneratorURL }} 11 | {{ end }}{{ if gt (len .SilenceURL) 0 }}Silence: {{ .SilenceURL }} 12 | {{ end }}{{ if gt (len .DashboardURL) 0 }}Dashboard: {{ .DashboardURL }} 13 | {{ end }}{{ if gt (len .PanelURL) 0 }}Panel: {{ .PanelURL }} 14 | {{ end }}{{ end }}{{ end }} 15 | 16 | {{ define "default.title" }}{{ template "__subject" . }}{{ end }} 17 | 18 | {{ define "default.message" }}{{ if gt (len .Alerts.Firing) 0 }}**Firing** 19 | {{ template "__text_alert_list" .Alerts.Firing }}{{ if gt (len .Alerts.Resolved) 0 }} 20 | 21 | {{ end }}{{ end }}{{ if gt (len .Alerts.Resolved) 0 }}**Resolved** 22 | {{ template "__text_alert_list" .Alerts.Resolved }}{{ end }}{{ end }} 23 | 24 | 25 | {{ define "__teams_text_alert_list" }}{{ range . 
}} 26 | Value: {{ or .ValueString "" }} 27 | Labels: 28 | {{ range .Labels.SortedPairs }} - {{ .Name }} = {{ .Value }} 29 | {{ end }} 30 | Annotations: 31 | {{ range .Annotations.SortedPairs }} - {{ .Name }} = {{ .Value }} 32 | {{ end }} 33 | {{ if gt (len .GeneratorURL) 0 }}Source: {{ .GeneratorURL }} 34 | 35 | {{ end }}{{ if gt (len .SilenceURL) 0 }}Silence: {{ .SilenceURL }} 36 | 37 | {{ end }}{{ if gt (len .DashboardURL) 0 }}Dashboard: {{ .DashboardURL }} 38 | 39 | {{ end }}{{ if gt (len .PanelURL) 0 }}Panel: {{ .PanelURL }} 40 | 41 | {{ end }} 42 | {{ end }}{{ end }} 43 | 44 | 45 | {{ define "teams.default.message" }}{{ if gt (len .Alerts.Firing) 0 }}**Firing** 46 | {{ template "__teams_text_alert_list" .Alerts.Firing }}{{ if gt (len .Alerts.Resolved) 0 }} 47 | 48 | {{ end }}{{ end }}{{ if gt (len .Alerts.Resolved) 0 }}**Resolved** 49 | {{ template "__teams_text_alert_list" .Alerts.Resolved }}{{ end }}{{ end }} 50 | -------------------------------------------------------------------------------- /docker-compose.yaml: -------------------------------------------------------------------------------- 1 | version: "3.9" 2 | services: 3 | nginx_base: 4 | build: 5 | context: . 6 | volumes: 7 | - "./generic_conf/:/usr/local/openresty/nginx/conf/generic_conf/" 8 | - "./src/:/lua/src/" 9 | loadbalancer: 10 | extends: 11 | service: nginx_base 12 | volumes: 13 | - "./nginx_loadbalancer.conf:/usr/local/openresty/nginx/conf/nginx.conf" 14 | ports: 15 | - "18080:8080" 16 | depends_on: 17 | - edge 18 | - edge1 19 | - edge2 20 | 21 | backend: 22 | extends: 23 | service: nginx_base 24 | volumes: 25 | - "./nginx_backend.conf:/usr/local/openresty/nginx/conf/nginx.conf" 26 | ports: 27 | - "8080:8080" 28 | 29 | backend1: 30 | extends: 31 | service: nginx_base 32 | volumes: 33 | - "./nginx_backend.conf:/usr/local/openresty/nginx/conf/nginx.conf" 34 | ports: 35 | - "8180:8080" 36 | 37 | edge: 38 | extends: 39 | service: nginx_base 40 | volumes: 41 | - "./nginx_edge.conf:/usr/local/openresty/nginx/conf/nginx.conf" 42 | depends_on: 43 | - backend 44 | - backend1 45 | ports: 46 | - "8081:8080" 47 | 48 | edge1: 49 | extends: 50 | service: nginx_base 51 | volumes: 52 | - "./nginx_edge.conf:/usr/local/openresty/nginx/conf/nginx.conf" 53 | depends_on: 54 | - backend 55 | - backend1 56 | ports: 57 | - "8082:8080" 58 | 59 | edge2: 60 | extends: 61 | service: nginx_base 62 | volumes: 63 | - "./nginx_edge.conf:/usr/local/openresty/nginx/conf/nginx.conf" 64 | depends_on: 65 | - backend 66 | - backend1 67 | ports: 68 | - "8083:8080" 69 | 70 | prometheus: 71 | image: prom/prometheus:v2.17.1 72 | container_name: prometheus 73 | volumes: 74 | - ./config:/etc/prometheus 75 | - ./data/prometheus:/prometheus 76 | command: 77 | - '--config.file=/etc/prometheus/prometheus.yml' 78 | - '--storage.tsdb.path=/prometheus' 79 | - '--web.console.libraries=/etc/prometheus/console_libraries' 80 | - '--web.console.templates=/etc/prometheus/consoles' 81 | - '--storage.tsdb.retention.time=24h' 82 | - '--web.enable-lifecycle' 83 | restart: unless-stopped 84 | ports: 85 | - "9090:9090" 86 | labels: 87 | org.label-schema.group: "monitoring" 88 | depends_on: 89 | - edge 90 | - edge1 91 | - edge2 92 | - backend 93 | - backend1 94 | - loadbalancer 95 | 96 | grafana: 97 | image: grafana/grafana:latest 98 | container_name: monitoring_grafana 99 | restart: unless-stopped 100 | links: 101 | - prometheus 102 | volumes: 103 | - ./data/grafana:/var/lib/grafana 104 | environment: 105 | - GF_SECURITY_ADMIN_USER=admin 106 | - 
GF_SECURITY_ADMIN_PASSWORD=admin 107 | - GF_USERS_ALLOW_SIGN_UP=false 108 | ports: 109 | - "9091:3000" 110 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # CDN Up and Running 2 | 3 | The objective of this repo is to build a body of knowledge on how CDNs work by coding one from "scratch". The CDN we're going to design uses: nginx, lua, docker, docker-compose, Prometheus, grafana, and wrk. 4 | 5 | We'll start creating a single backend service and expand from there to a multi-node, latency simulated, observable, and testable CDN. In each section, there are discussions regarding the challenges and trade-offs of building/managing/operating a CDN. 6 | 7 | ![grafana screenshot](/img/4.0.1_metrics.webp "grafana screenshot") 8 | 9 | ## What is a CDN? 10 | 11 | A Content Delivery Network is a set of computers, spatially distributed in order to provide high availability and **better performance** for systems that have their **work cached** on this network. 12 | 13 | ## Why do you need a CDN? 14 | 15 | A CDN helps to improve: 16 | * loading times (smoother streaming, instant page to buy, quick friends feed, etc) 17 | * accommodate traffic spikes (black friday, popular streaming release, breaking news, etc) 18 | * decrease costs (traffic offloading) 19 | * scalability for millions 20 | 21 | ## How does a CDN work? 22 | 23 | CDNs are able to make services faster by placing the content (media files, pages, games, javascript, a json response, etc) closer to the users. 24 | 25 | When a user wants to consume a service, the CDN routing system will deliver the "best" node where the content is likely **already cached and closer to the client**. Don't worry about the loose use of the word best in here. I hope that throughout the reading, the understanding of what is the best node will be elucidated. 26 | 27 | ## The CDN stack 28 | 29 | The CDN we'll build relies on: 30 | * [`Linux/GNU/Kernel`](https://www.linux.org/) - a kernel / operating system with outstanding networking capabilities as well as IO excellence. 31 | * [`Nginx`](http://nginx.org/) - an excellent web server that can be used as a reverse proxy providing caching capability. 32 | * [`Lua(jit)`](https://luajit.org/) - a simple powerful language to add features into nginx. 33 | * [`Prometheus`](https://prometheus.io/) - A system with a dimensional data model, flexible query language, efficient time series database. 34 | * [`Grafana`](https://github.com/grafana/grafana) - An open source analytics & monitoring tool that plugs with many sources, including prometheus. 35 | * [`Containers`](https://www.docker.com/) - technology to package, deploy, and isolate applications, we'll use docker and docker compose. 36 | 37 | # Origin - the backend service 38 | 39 | Origin is the system where the content is created - or at least it's the source to the CDN. The sample service we're going to build will be a straightforward JSON API. The backend service could be returning an image, video, javascript, HTML page, game, or anything you want to deliver to your clients. 40 | 41 | We'll use Nginx and Lua to design the backend service. It's a great excuse to introduce Nginx and Lua since we're going to use them a lot here. 
42 | 43 | > **Heads up: the backend service could be written in any language you like.** 44 | 45 | ## Nginx - quick introduction 46 | 47 | Nginx is a web server that will follow its [configuration](http://nginx.org/en/docs/beginners_guide.html#conf_structure). The config file uses [directives](http://nginx.org/en/docs/dirindex.html) as the dominant factor. A directive is a simple construction to set properties in nginx. There are two types of directives: **simple and block (context)**. 48 | 49 | A **simple directive** is formed by its name followed by parameters ending with a semicolon. 50 | 51 | ```nginx 52 | # Syntax: ; 53 | # Example 54 | add_header X-Header AnyValue; 55 | ``` 56 | 57 | The **block directive** follows the same pattern, but instead of a semicolon, it ends surrounded by curly braces. A block directive can also have directives within it. This block is also known as context. 58 | 59 | ```nginx 60 | # Syntax: 61 | location / { 62 | add_header X-Header AnyValue; 63 | } 64 | ``` 65 | 66 | Nginx uses workers (processes) to handle the requests. The [nginx architecture](https://www.aosabook.org/en/nginx.html) plays a crucial role in its performance. 67 | 68 | ![simplified workers nginx architecture](/img/simplified_workers_nginx_architecture.webp "simplified workers nginx architecture") 69 | 70 | > **Heads up: Although a single accept queue serving multiple workers is common, there are other models to [load balance the incoming requests](https://blog.cloudflare.com/the-sad-state-of-linux-socket-balancing/).** 71 | 72 | ## Backend service conf 73 | 74 | Let's walk through the backend JSON API nginx configuration. I think it'll be much easier if we see it in action. 75 | 76 | ```nginx 77 | events { 78 | worker_connections 1024; 79 | } 80 | error_log stderr; 81 | 82 | http { 83 | access_log /dev/stdout; 84 | 85 | server { 86 | listen 8080; 87 | 88 | location / { 89 | content_by_lua_block { 90 | ngx.header['Content-Type'] = 'application/json' 91 | ngx.say('{"service": "api", "value": 42}') 92 | } 93 | } 94 | } 95 | } 96 | ``` 97 | 98 | Were you able to understand what this config is doing? In any case, let's break it down by making comments on each directive. 99 | 100 | The [`events`](http://nginx.org/en/docs/ngx_core_module.html#events) provides context for [connection processing configurations](http://nginx.org/en/docs/events.html), and the [`worker_connections`](http://nginx.org/en/docs/ngx_core_module.html#worker_connections) defines the maximum number of simultaneous connections that can be opened by a worker process. 101 | ```nginx 102 | events { 103 | worker_connections 1024; 104 | } 105 | ``` 106 | 107 | The [`error_log`](http://nginx.org/en/docs/ngx_core_module.html#error_log) configures logging for error. Here we just send all the errors to the stdout (error) 108 | 109 | ```nginx 110 | error_log stderr; 111 | ``` 112 | 113 | The [`http`](http://nginx.org/en/docs/http/ngx_http_core_module.html#http) provides a root context to set up all the http/s servers. 114 | 115 | ```nginx 116 | http {} 117 | ``` 118 | 119 | The [`access_log`](http://nginx.org/en/docs/http/ngx_http_log_module.html#access_log) configures the path (and optionally format, etc) for the access logging. 120 | 121 | ```nginx 122 | access_log /dev/stdout; 123 | ``` 124 | 125 | The [`server`](http://nginx.org/en/docs/http/ngx_http_core_module.html#server) sets the root configuration for a server, aka where we're going to setup specific behavior to the server. 
You can have multiple `server` blocks per `http` context. 126 | 127 | ```nginx 128 | server {} 129 | ``` 130 | 131 | Within the `server` we can set the [`listen`](http://nginx.org/en/docs/http/ngx_http_core_module.html#listen) directive controlling the address and/or the port on which the [server will accept requests](http://nginx.org/en/docs/http/request_processing.html). 132 | 133 | ```nginx 134 | listen 8080; 135 | ```` 136 | 137 | In the server configuration, we can specify a route by using the [`location`](http://nginx.org/en/docs/http/ngx_http_core_module.html#location) directive. This will be used to provide specific configuration for that matching request path. 138 | 139 | ```nginx 140 | location / {} 141 | ``` 142 | 143 | Within this location (by the way, `/` will handle all the requests) we'll use Lua to create the response. There's a directive called [`content_by_lua_block`](https://github.com/openresty/lua-nginx-module#content_by_lua_block) which provides a context where the Lua code will run. 144 | 145 | ```nginx 146 | content_by_lua_block {} 147 | ``` 148 | 149 | Finally, we'll use Lua and the basic [Nginx Lua API](https://github.com/openresty/lua-nginx-module#nginx-api-for-lua) to set the desired behavior. 150 | 151 | ```lua 152 | -- ngx.header sets the current response header that is to be sent. 153 | ngx.header['Content-Type'] = 'application/json' 154 | -- ngx.say will write the response body 155 | ngx.say('{"service": "api", "value": 42}') 156 | ``` 157 | 158 | Notice that most of the directives contain their scope. For instance, the `location` is only applicable within the `location` (recursively) and `server` context. 159 | 160 | ![directive restriction](/img/nginx_directive_restriction.webp "directive restriction") 161 | 162 | > **Heads up: we won't comment on each directive we add from now on, we'll only describe the most relevant for the section.** 163 | 164 | ## CDN 1.0.0 Demo time 165 | 166 | Let's see what we did. 167 | 168 | ```bash 169 | git checkout 1.0.0 # going back to specific configuration 170 | docker-compose run --rm --service-ports backend # run the containers exposing the service 171 | http http://localhost:8080/path/to/my/content.ext # consuming the service, I used httpie but you can use curl or anything you like 172 | 173 | # you should see the json response :) 174 | ``` 175 | 176 | ## Adding caching capabilities 177 | 178 | For the backend service to be cacheable we need to set up the caching policy. We'll use the HTTP header [Cache-Control](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control) to setup what caching behavior we want. 179 | 180 | ```Lua 181 | -- we want the content to be cached by 10 seconds OR the provided max_age (ex: /path/to/service?max_age=40 for 40 seconds) 182 | ngx.header['Cache-Control'] = 'public, max-age=' .. (ngx.var.arg_max_age or 10) 183 | ``` 184 | 185 | And, if you want, make sure to check the returned response header `Cache-Control`. 186 | 187 | ```bash 188 | git checkout 1.0.1 # going back to specific configuration 189 | docker-compose run --rm --service-ports backend 190 | http "http://localhost:8080/path/to/my/content.ext?max_age=30" 191 | ``` 192 | 193 | ## Adding metrics 194 | 195 | Checking the logging is fine for debugging. But once we're reaching more traffic, it'll be nearly impossible to understand how the service is operating. To tackle this case, we're going to use [VTS](https://github.com/vozlt/nginx-module-vts), an nginx module which adds metrics measurements. 
196 | 197 | ```nginx 198 | vhost_traffic_status_zone shared:vhost_traffic_status:12m; 199 | vhost_traffic_status_filter_by_set_key $status status::*; 200 | vhost_traffic_status_histogram_buckets 0.005 0.01 0.05 0.1 0.5 1 5 10; # buckets are in seconds 201 | ``` 202 | 203 | The [`vhost_traffic_status_zone`](https://github.com/vozlt/nginx-module-vts#vhost_traffic_status_zone) sets a memory space required for the metrics. The [`vhost_traffic_status_filter_by_set_key`](https://github.com/vozlt/nginx-module-vts#vhost_traffic_status_filter_by_set_key) groups metrics by a given variable (for instance, we decided to group metrics by `status`) and finally, the [`vhost_traffic_status_histogram_buckets`](https://github.com/vozlt/nginx-module-vts#vhost_traffic_status_histogram_buckets) provides a way to bucketize the metrics in seconds. We decided to create buckets varying from `0.005` to `10` seconds, because they will help us to create percentiles (`p99`, `p50`, etc). 204 | 205 | ```nginx 206 | location /status { 207 | vhost_traffic_status_display; 208 | vhost_traffic_status_display_format html; 209 | } 210 | ``` 211 | 212 | We also must expose the metrics in a location. We will use the `/status` to do it. 213 | 214 | ```bash 215 | git checkout 1.1.0 216 | docker-compose run --rm --service-ports backend 217 | # if you go to http://localhost:8080/status/format/html you'll see information about the server 8080 218 | # notice that VTS also provides other formats such as status/format/prometheus, which will be pretty helpful for us in near future 219 | ``` 220 | 221 | ![nginx vts status page](/img/metrics_status.webp "nginx vts status page") 222 | 223 | With metrics, we can run (load) tests and see if the configuration changes we made are resulting in a better performance or not. 224 | 225 | > **Heads up**: You can [group the metrics under a custom namespace](https://github.com/leandromoreira/cdn-up-and-running/commit/105f54a27d1b58b88659789ae024d70c89d4a478). This is useful when you have a single location that behaves differently depending on the context. 226 | 227 | ## Refactoring the nginx conf 228 | 229 | As the configuration becomes bigger, it also gets harder to comprehend. Nginx offers a neat directive called [`include`](http://nginx.org/en/docs/ngx_core_module.html#include) which allows us to create partial config files and include them into the root configuration file. 230 | 231 | ```diff 232 | - location /status { 233 | - vhost_traffic_status_display; 234 | - vhost_traffic_status_display_format html; 235 | - } 236 | + include basic_vts_location.conf; 237 | 238 | ``` 239 | 240 | We can extract location, group configurations per similarities, or anything that makes sense to a file. We can do [a similar thing for the Lua code](https://github.com/openresty/lua-nginx-module#lua_package_path) as well. 241 | 242 | ```diff 243 | content_by_lua_block { 244 | - ngx.header['Content-Type'] = 'application/json' 245 | - ngx.header['Cache-Control'] = 'public, max-age=' .. (ngx.var.arg_max_age or 10) 246 | - 247 | - ngx.say('{"service": "api", "value": 42, "request": "' .. ngx.var.uri .. '"}') 248 | + local backend = require "backend" 249 | + backend.generate_content() 250 | } 251 | ``` 252 | 253 | All these modifications were made to improve readability, but it also promotes reuse. 254 | 255 | 256 | # The CDN - siting in front of the backend 257 | 258 | ## Proxying 259 | 260 | What we did so far has nothing to do with the CDN. Now it's time to start building the CDN. 
For that, we'll create another node with nginx, just adding a few new directives to connect the `edge` (CDN) node with the `backend` node. 261 | 262 | ![backend edge architecture](/img/edge_backend.webp "backend edge architecture") 263 | 264 | There's really nothing fancy here, it's just an [`upstream`](http://nginx.org/en/docs/http/ngx_http_upstream_module.html#upstream) block with a server pointing to our `backend` endpoint. In the location, we do not provide the content, but instead we point to the upstream, using the [`proxy_pass`](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_pass), we just created. 265 | 266 | ```nginx 267 | upstream backend { 268 | server backend:8080; 269 | keepalive 10; # connection pool for reuse 270 | } 271 | 272 | server { 273 | listen 8080; 274 | 275 | location / { 276 | proxy_pass http://backend; 277 | add_header X-Cache-Status $upstream_cache_status; 278 | } 279 | } 280 | ```` 281 | 282 | We also added a new header (X-Cache-Status) to indicate whether the [cache was used or not](http://nginx.org/en/docs/http/ngx_http_upstream_module.html#variables). 283 | * **HIT**: when the content is in the CDN, the `X-Cache-Status` should return a hit. 284 | * **MISS**: when the content isn't in the CDN, the `X-Cache-Status` should return a miss. 285 | 286 | ```bash 287 | git checkout 2.0.0 288 | docker-compose up 289 | # we still can fetch the content from the backend 290 | http "http://localhost:8080/path/to/my/content.ext" 291 | # but we really want to access the content through the edge (CDN) 292 | http "http://localhost:8081/path/to/my/content.ext" 293 | ``` 294 | 295 | ## Caching 296 | 297 | When we try to fetch content, the `X-Cache-Status` header is absent. It seems that the edge node is always invariably requesting the backend. This is not the way a CDN should work, right? 298 | 299 | ```log 300 | backend_1 | 172.22.0.4 - - [05/Jan/2022:17:24:48 +0000] "GET /path/to/my/content.ext HTTP/1.0" 200 70 "-" "HTTPie/2.6.0" 301 | edge_1 | 172.22.0.1 - - [05/Jan/2022:17:24:48 +0000] "GET /path/to/my/content.ext HTTP/1.1" 200 70 "-" "HTTPie/2.6.0" 302 | ```` 303 | 304 | The edge is just proxying the clients to the backend. What are we missing? Is there any reason to use a "simple" proxy at all? Well, it does, maybe you want to provide throttling, authentication, authorization, tls termination, or a gateway for multiple services, but that's not what we want. 305 | 306 | We need to create a cache area on nginx through the directive [`proxy_cache_path`](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_path). It's setting up the path where the cached content will reside, the shared memory `key_zone`, and policies such as `inactive`, `max_size`, among others, to control how we want the cache to behave. 307 | 308 | ```nginx 309 | proxy_cache_path /cache/ levels=2:2 keys_zone=zone_1:10m max_size=10m inactive=10m use_temp_path=off; 310 | ``` 311 | 312 | Once we've configured a proper cache, we must also set up the [`proxy_cache`](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache) pointing to the right zone (via `proxy_cache_path keys_zone=:size`), and the [`proxy_pass`](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_pass) linking to the upstream we've created. 313 | 314 | ```nginx 315 | location / { 316 | # ... 
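# zone_1 below refers to the keys_zone name declared in proxy_cache_path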
317 | proxy_pass http://backend; 318 | proxy_cache zone_1; 319 | } 320 | ``` 321 | 322 | There is another important aspect of caching which is managed by the directive [`proxy_cache_key`](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_key). 323 | When a client requests content from nginx, it will (highly simplified): 324 | 325 | * Receive the request (let's say: `GET /path/to/something.txt`) 326 | * Apply a hash md5 function over the cache key value (let's assume that the cache key is the `uri`) 327 | * md5("/path/to/something.txt") => `b3c4c5e7dc10b13dc2e3f852e52afcf3` 328 | * you can check that on your terminarl `echo -n "/path/to/something.txt" | md5` 329 | * It checks whether the content (hash `b3c4..`) is cached or not 330 | * If it's cached, it just returns the object otherwise it fetches the content from the backend 331 | * It also saves locally (in memory and on disk) to avoid future requests 332 | 333 | Let's create a variable called `cache_key` using the lua directive [`set_by_lua_block`](https://github.com/openresty/lua-nginx-module#set_by_lua_block). It will, for each incoming request, fill the `cache_key` with the `uri` **value**. Beyond that, we also need to update the [`proxy_cache_key`](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_key). 334 | 335 | ```nginx 336 | location / { 337 | set_by_lua_block $cache_key { 338 | return ngx.var.uri 339 | } 340 | # ... 341 | proxy_cache_key $cache_key; 342 | } 343 | ``` 344 | 345 | > **Heads up**: Using `uri` as cache key will make the following two requests http://example.com/path/to/content.ext and http://example.edu/path/to/content.ext (if they're using the same cache proxy) as if they were a single object. If you do not provide a cache key, nginx will use a reasonable **default value** `$scheme$proxy_host$request_uri`. 346 | 347 | Now we can see the caching properly working. 348 | 349 | ```bash 350 | git checkout 2.1.0 351 | docker-compose up 352 | http "http://localhost:8081/path/to/my/content.ext" 353 | # the second request must get the content from the CDN without leaving to the backend 354 | http "http://localhost:8081/path/to/my/content.ext" 355 | ``` 356 | 357 | ![cache hit header](/img/cache_hit.webp "cache hit header") 358 | 359 | ## Monitoring Tools 360 | 361 | Checking the cache effectiveness by looking at the command line isn't efficient. It's better if we use a tool for that. **Prometheus** will be used to scrape metrics on all servers, and **Grafana** will show graphics based on the metrics collected by the prometheus. 362 | 363 | ![instrumentalization architecture](/img/metrics_architecture.webp "instrumentalization architecture") 364 | 365 | Prometheus configuration will look like this. 366 | 367 | ```yaml 368 | global: 369 | scrape_interval: 10s # each 10s prometheus will scrape targets 370 | evaluation_interval: 10s 371 | scrape_timeout: 2s 372 | 373 | external_labels: 374 | monitor: 'CDN' 375 | 376 | scrape_configs: 377 | - job_name: 'prometheus' 378 | metrics_path: '/status/format/prometheus' 379 | static_configs: 380 | - targets: ['edge:8080', 'backend:8080'] # the server list to be scrapped by the scrap_path 381 | ``` 382 | 383 | Now, we need to add a prometheus source for Grafana. 384 | 385 | ![grafana source](/img/add_source.webp "grafana source") 386 | 387 | And set the proper prometheus server. 
388 | 389 | ![grafana source set](/img/set_source.webp "grafana source set") 390 | 391 | ## Simulated Work (latency) 392 | 393 | The backend server is artificially creating responses. We'll add simulated latency using Lua; the idea is to make it closer to real-world situations. We're going to model the latency using [percentiles](https://www.mathsisfun.com/data/percentiles.html). 394 | 395 | ```lua 396 | percentile_config={ 397 | {p=50, min=1, max=20,}, {p=90, min=21, max=50,}, {p=95, min=51, max=150,}, {p=99, min=151, max=500,}, 398 | } 399 | ``` 400 | 401 | We randomly pick a number from 1 to 100, and then we apply another random pick using the respective `percentile profile`, ranging from the min to the max. Finally, we [`sleep`](https://github.com/openresty/lua-nginx-module#ngxsleep) for that duration. 402 | 403 | ```lua 404 | local current_percentage = random(1, 100) -- decide which percentile this request will fall into 405 | -- let's assume we picked 94 406 | -- therefore we'll use the percentile_config entry for p90 407 | local sleep_duration = random(p90.min, p90.max) 408 | sleep(sleep_duration * 0.001) -- the profile is in ms; ngx.sleep expects seconds 409 | ``` 410 | 411 | This model lets us emulate something closer to [real-world observed latencies](https://research.google/pubs/pub40801/). 412 | 413 | ## Load Testing 414 | 415 | We'll run some load testing to learn more about the solution we're building. Wrk is an HTTP benchmarking tool that you can dynamically configure using Lua. We pick a random number from 1 to 100 and request that item. 416 | 417 | ```lua 418 | request = function() 419 | local item = "item_" .. random(1, 100) 420 | 421 | return wrk.format(nil, "/" .. item .. ".ext") 422 | end 423 | ``` 424 | 425 | The command line below runs the tests for 10 minutes (600s), using two threads and 10 connections. 426 | 427 | ```bash 428 | wrk -c10 -t2 -d600s -s ./src/load_tests.lua --latency http://localhost:8081 429 | ``` 430 | 431 | Of course, you can run this on your machine: 432 | 433 | ```bash 434 | git checkout 2.2.0 435 | docker-compose up 436 | 437 | # run the tests 438 | ./load_test.sh 439 | 440 | # go check on grafana, how the system is behaving 441 | http://localhost:9091 442 | ``` 443 | 444 | The `wrk` output is shown below. There were **37k** requests with **674** failing requests in total. 445 | 446 | ```bash 447 | Running 10m test @ http://localhost:8081 448 | 2 threads and 10 connections 449 | Thread Stats Avg Stdev Max +/- Stdev 450 | Latency 218.31ms 236.55ms 1.99s 84.32% 451 | Req/Sec 35.14 29.02 202.00 79.15% 452 | Latency Distribution 453 | 50% 162.73ms 454 | 75% 350.33ms 455 | 90% 519.56ms 456 | 99% 1.02s 457 | 37689 requests in 10.00m, 15.50MB read 458 | Non-2xx or 3xx responses: 674 459 | Requests/sec: 62.80 460 | Transfer/sec: 26.44KB 461 | ``` 462 | 463 | Grafana showed that in a given instant, **68** requests were served by the `edge`. Of these, **16** went through to the `backend`. The [cache efficiency](https://www.cloudflare.com/learning/cdn/what-is-a-cache-hit-ratio/) was **76%**, 1% of the requests had a latency longer than **3.6s**, 5% observed more than **786ms**, and the median was around **73ms**. 464 | 465 | 466 | > Be aware! Latency measurements depend heavily on the bucket sizes, and [using histograms to measure performance](https://medium.com/mercari-engineering/have-you-been-using-histogram-metrics-correctly-730c9547a7a9) might not be adequate.
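As a quick sanity check of the cache efficiency figure above: (68 - 16) / 68 ≈ 0.76, i.e. roughly 76% of the edge responses were served without hitting the backend.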
467 | 468 | ![grafana result for 2.2.0](/img/2.2.0_metrics.webp "grafana result for 2.2.0") 469 | 470 | ## Learning by testing - let's change the cache ttl (max age) 471 | 472 | This project should encourage you to experiment, change parameter values, run load tests, and check the results. I think this loop can be a great way to learn. Let's try to see what happens when we change the cache behavior. 473 | 474 | ### 1s 475 | 476 | Using 1s for the cache validity. 477 | 478 | ```lua 479 | request = function() 480 | local item = "item_" .. random(1, 100) 481 | 482 | return wrk.format(nil, "/" .. item .. ".ext?max_age=1") 483 | end 484 | ``` 485 | 486 | Run the tests, and the result is: only 16k requests with 773 errors. 487 | 488 | ``` 489 | Running 10m test @ http://localhost:8081 490 | 2 threads and 10 connections 491 | Thread Stats Avg Stdev Max +/- Stdev 492 | Latency 378.72ms 254.21ms 1.46s 68.40% 493 | Req/Sec 15.11 9.98 90.00 74.18% 494 | Latency Distribution 495 | 50% 396.15ms 496 | 75% 507.22ms 497 | 90% 664.18ms 498 | 99% 1.05s 499 | 16643 requests in 10.00m, 6.83MB read 500 | Non-2xx or 3xx responses: 773 501 | Requests/sec: 27.74 502 | Transfer/sec: 11.66KB 503 | ``` 504 | 505 | We also noticed that the cache hit ratio went down significantly `(23%)`, and many more requests leaked to the backend. 506 | 507 | ![grafana result for 2.2.1 1 second](/img/2.2.1_metrics_1s.webp "grafana result for 2.2.1 1 second") 508 | 509 | ### 60s 510 | 511 | What if, instead, we increase the cache expiry to a whole minute?! 512 | 513 | ```lua 514 | request = function() 515 | local item = "item_" .. random(1, 100) 516 | 517 | return wrk.format(nil, "/" .. item .. ".ext?max_age=60") 518 | end 519 | ``` 520 | 521 | Run the tests, and the result now is: 45k requests with 551 errors. 522 | 523 | ```bash 524 | Running 10m test @ http://localhost:8081 525 | 2 threads and 10 connections 526 | Thread Stats Avg Stdev Max +/- Stdev 527 | Latency 196.27ms 223.43ms 1.79s 84.74% 528 | Req/Sec 42.31 34.80 242.00 78.01% 529 | Latency Distribution 530 | 50% 79.67ms 531 | 75% 321.06ms 532 | 90% 494.41ms 533 | 99% 1.01s 534 | 45695 requests in 10.00m, 18.79MB read 535 | Non-2xx or 3xx responses: 551 536 | Requests/sec: 76.15 537 | Transfer/sec: 32.06KB 538 | ``` 539 | 540 | We see a much better **cache efficiency (80% vs 23%)** and **throughput (45k vs 16k requests)**. 541 | 542 | ![grafana result for 2.2.1 60 seconds](/img/2.2.1_metrics_60s.webp "grafana result for 2.2.1 60 seconds") 543 | 544 | > **Heads up**: caching for longer helps improve performance, but at the cost of serving stale content. 545 | 546 | ## Fine-tuning - cache lock, stale, timeout, network 547 | 548 | Using the default configurations for Nginx, Linux, and the rest will be sufficient for many small workloads. But when your goal is more ambitious, you will inevitably need to fine-tune the CDN for your needs. 549 | 550 | The process of fine-tuning a web server is gigantic. It goes from managing how [`nginx/Linux process sockets`](https://blog.cloudflare.com/the-sad-state-of-linux-socket-balancing/), to [`linux network queuing`](https://github.com/leandromoreira/linux-network-performance-parameters), how [`io`](https://serverfault.com/questions/796665/what-are-the-performance-implications-for-millions-of-files-in-a-modern-file-sys) affects performance, among other aspects.
There is a lot of symbiosis between the [application and OS](https://nginx.org/en/docs/http/ngx_http_core_module.html#sendfile) with direct implications for performance, for instance [saving user-land context switches with kTLS](https://docs.kernel.org/networking/tls-offload.html). 551 | 552 | You'll be reading a lot of man pages, mostly tweaking timeouts and buffers. The test loop can help you build confidence in your ideas; let's see. 553 | 554 | * You have a hypothesis or have observed something weird and want to test a parameter value 555 | * Stick to a single set of related parameters each time 556 | * Set the new value 557 | * Run the tests 558 | * Check results against the same server with the old parameter 559 | 560 | > **Heads up**: doing tests locally is fine for learning, but most of the time you'll only trust your production results. Be prepared to do a partial deployment, comparing the old system/config to the new test parameters. 561 | 562 | Did you notice that the errors were all related to timeout? It seems that the `backend` is taking longer to respond than the `edge` is willing to wait. 563 | 564 | ```log 565 | edge_1 | 2021/12/29 11:52:45 [error] 8#8: *3 upstream timed out (110: Operation timed out) while reading response header from upstream, client: 172.25.0.1, server: , request: "GET /item_34.ext HTTP/1.1", upstream: "http://172.25.0.3:8080/item_34.ext", host: "localhost:8081" 566 | ``` 567 | 568 | To solve this problem we can try to increase the proxy timeouts. We're also using a neat directive, [`proxy_cache_use_stale`](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_use_stale), that serves `stale content` when nginx is dealing with `errors, timeout, or even updating the cache`. 569 | 570 | ```nginx 571 | proxy_cache_lock_timeout 2s; 572 | proxy_read_timeout 2s; 573 | proxy_send_timeout 2s; 574 | proxy_cache_use_stale error timeout updating; 575 | ``` 576 | 577 | While we were reading about proxy caching, something caught our attention. There's a directive called [`proxy_cache_lock`](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_lock) that collapses multiple user requests for the same content into a single upstream request at a time. This is very often known as [coalescing](https://cloud.google.com/cdn/docs/caching#request-coalescing). 578 | 579 | ```nginx 580 | proxy_cache_lock on; 581 | ``` 582 | 583 | ![caching lock](/img/cache_lock.webp "caching lock") 584 | 585 | Running the tests, we observed that we decreased the timeout errors but we also got less throughput. Why? Maybe it's because of lock contention. The big benefit of this feature is to avoid the [thundering herd](https://alexpareto.com/2020/06/15/thundering-herds.html) in the backend. Traffic went down from **6k to 3k** and requests from **16 to 8**. 586 | 587 | ![grafana result for test 3.0.0](/img/3.0.0_metrics.webp "grafana result for test 3.0.0") 588 | 589 | ## From normal to long tail distribution 590 | 591 | We've been running load tests assuming a [normal distribution](https://en.wikipedia.org/wiki/Normal_distribution) but that's far from reality. What we might see in production is that [most of the requests will be towards a few items](https://en.wikipedia.org/wiki/Long_tail). To simulate that more closely, we'll tweak our code to randomly pick a number from 1 to 100 and then decide if it's a popular item or not.
589 | ## From normal to long tail distribution
590 | 
591 | We've been running the load tests assuming a [normal distribution](https://en.wikipedia.org/wiki/Normal_distribution), but that's far from reality. What we are likely to see in production is that [most of the requests target a few items](https://en.wikipedia.org/wiki/Long_tail). To simulate that more closely, we'll tweak our code to randomly pick a number from 1 to 100 and then decide whether it's a popular item or not.
592 | 
593 | ```lua
594 | local popular_percentage = 96 -- 96% of users are requesting the top 5 content
595 | local popular_items_quantity = 5 -- top content quantity
596 | local max_total_items = 200 -- total items clients are requesting
597 | 
598 | request = function()
599 |   local is_popular = random(1, 100) <= popular_percentage
600 |   local item = ""
601 | 
602 |   if is_popular then -- if it's popular, let's pick one of the top content items
603 |     item = "item-" .. random(1, popular_items_quantity)
604 |   else -- otherwise, let's pick one of the remaining items
605 |     item = "item-" .. random(popular_items_quantity + 1, popular_items_quantity + max_total_items)
606 |   end
607 | 
608 |   return wrk.format(nil, "/path/" .. item .. ".ext")
609 | end
610 | ```
611 | 
612 | > **Heads up**: we could model the long tail using [a formula](https://firstmonday.org/ojs/index.php/fm/article/view/1832/1716), but for the purposes of this repo, this extrapolation might be good enough.
613 | 
614 | Now, let's test again with `proxy_cache_lock` `off` and `on`.
615 | 
616 | ### Long tail `proxy_cache_lock` off
617 | ![grafana result for test 3.1.0](/img/3.1.0_metrics.webp "grafana result for test 3.1.0")
618 | ### Long tail `proxy_cache_lock` on
619 | ![grafana result for test 3.1.1](/img/3.1.1_metrics.webp "grafana result for test 3.1.1")
620 | 
621 | The results are pretty close, even though `lock off` is still marginally better. This feature might have to go to production to prove whether it's worth it or not.
622 | 
623 | > **Heads up**: the `proxy_cache_lock_timeout` is dangerous but necessary: once the configured time has passed, all the waiting requests will go to the backend.
624 | 
625 | ## Routing challenges
626 | 
627 | We've been testing a single edge, but in reality there will be hundreds of nodes. Having more edge nodes is necessary for scalability, resilience, and also to provide responses closer to the users. Introducing multiple nodes brings another challenge: clients somehow need to figure out which node to fetch the content from.
628 | 
629 | There are many ways to overcome this complication, and we'll try to explore some of them.
630 | 
631 | ### Load balancing
632 | 
633 | A load balancer spreads the clients' requests among all the edges.
634 | 
635 | #### Round-robin
636 | 
637 | Round-robin is a balancing policy that takes an ordered list of edges, picks the next server for each request, and wraps around when the list ends.
638 | 
639 | ```nginx
640 | # on nginx, if we do not specify anything, the default policy is weighted round-robin
641 | # http://nginx.org/en/docs/http/ngx_http_upstream_module.html#upstream
642 | upstream backend {
643 |   server edge:8080;
644 |   server edge1:8080;
645 |   server edge2:8080;
646 | }
647 | 
648 | server {
649 |   listen 8080;
650 | 
651 |   location / {
652 |     proxy_pass http://backend;
653 |     add_header X-Edge LoadBalancer;
654 |   }
655 | }
656 | ```
657 | 
658 | What's good about `round-robin`? The requests are shared almost equally among all servers. Slower servers or slower responses, though, may end up queueing lots of requests; for that there is [`least_conn`](http://nginx.org/en/docs/http/ngx_http_upstream_module.html#least_conn), which also takes the number of active connections into account (sketched below, after the next paragraph).
659 | 
660 | What's not good about it? It's not caching-aware, meaning multiple clients will face higher latencies because they're asking servers that haven't cached the content yet.
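For reference, switching from the default policy to `least_conn` is a one-line change in the upstream block; a sketch reusing the same edge hosts as above:

```nginx
upstream backend {
  # pick the server with the fewest active connections instead of plain round-robin
  least_conn;
  server edge:8080;
  server edge1:8080;
  server edge2:8080;
}
```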
661 | 
662 | > [See more about when to use and when to avoid the `rr` policy.](https://github.com/leandromoreira/cdn-up-and-running/issues/10)
663 | 
664 | ```bash
665 | # demo time
666 | git checkout 4.0.0
667 | docker-compose up
668 | ./load_test.sh
669 | ```
670 | 
671 | ![round-robin grafana](/img/4.0.0_metrics.webp "round-robin grafana")
672 | 
673 | > **Heads up**: the load balancer itself plays the role of a single point of failure here. [Facebook has a great talk explaining](https://www.youtube.com/watch?v=bxhYNfFeVF4) how they created a load balancer that is resilient, maintainable, and scalable.
674 | 
675 | #### Consistent Hashing
676 | 
677 | Knowing that caching awareness is important for a CDN, it's hard to use round-robin as it is. There is a balancing method known as [`consistent hashing`](https://en.wikipedia.org/wiki/Consistent_hashing) which tries to solve this problem by choosing a signal (the `uri`, for instance) and hashing it, so that all requests for the same content are consistently sent to the same server.
678 | 
679 | There is a directive for that in nginx as well; it's called [`hash`](http://nginx.org/en/docs/http/ngx_http_upstream_module.html#hash).
680 | 
681 | ```nginx
682 | upstream backend {
683 |   hash $request_uri consistent;
684 |   server edge:8080;
685 |   server edge1:8080;
686 |   server edge2:8080;
687 | }
688 | 
689 | server {
690 |   listen 8080;
691 | 
692 |   location / {
693 |     proxy_pass http://backend;
694 |     add_header X-Edge LoadBalancer;
695 |   }
696 | }
697 | ```
698 | 
699 | What's good about `consistent hashing`? It enforces a policy that will increase the chances of a cache hit.
700 | 
701 | What's not good about it? Imagine a single piece of content (a video, a game) is peaking: now a small number of servers has to respond to most of the clients.
702 | 
703 | > **Heads up**: [Consistent Hashing With Bounded Load](https://medium.com/vimeo-engineering-blog/improving-load-balancing-with-a-new-consistent-hashing-algorithm-9f1bd75709ed) was born to solve this problem.
704 | 
705 | ```bash
706 | # demo time
707 | git checkout 4.0.1
708 | docker-compose up
709 | ./load_test.sh
710 | ```
711 | 
712 | ![consistent hashing grafana](/img/4.0.1_metrics.webp "consistent hashing grafana")
713 | 
714 | > **Heads up**: initially I used a Lua library because I thought consistent hashing was only available in the commercial version of nginx.
715 | 
716 | #### Load balancer bottleneck
717 | 
718 | There are at least two problems with a load balancer (beyond it being a [SPoF](https://en.wikipedia.org/wiki/Single_point_of_failure)):
719 | 
720 | * Network egress - the input/output bandwidth capacity of the load balancer must be at least the sum of all its servers'.
721 |   * one could use [DSR](https://www.loadbalancer.org/blog/yahoos-l3-direct-server-return-an-alternative-to-lvs-tun-explored/) or a [307](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/307) redirect to work around it.
722 | * Distributed edges - nodes might be geographically spread out, which makes fronting them with a single load balancer hard.
723 | 
724 | ### Network reachability
725 | 
726 | Many of the problems we saw in the load balancer section are about network reachability. Here we're going to discuss some of the ways to tackle that, each with its ups and downs.
727 | 
728 | #### API
729 | 
730 | We could introduce an `API (cdn routing)`: clients only learn where to find a given content (`a specific edge node`) after asking this API. Clients might also need to deal with failover themselves.
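To make the idea concrete, here is a minimal, hypothetical sketch of such a routing endpoint in nginx; the port and the hard-coded edge choice are made up for illustration, and a real router would pick the node based on the client's location, node health, and load:

```nginx
# hypothetical "cdn routing" API: the client asks this endpoint first,
# then follows the 307 redirect straight to the chosen edge node
server {
  listen 8090;

  location / {
    # hard-coded choice for illustration only
    return 307 http://edge:8080$request_uri;
  }
}
```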
731 | 
732 | > **Heads up**: solving it on the software side, one could mix the best of both worlds: start balancing using `consistent hashing` and then, when a given piece of content becomes popular, switch to [a better natural distribution](https://brooker.co.za/blog/2012/01/17/two-random.html).
733 | 
734 | #### DNS
735 | 
736 | We could use DNS for that. It looks pretty similar to the API approach, but here we rely on the DNS caching TTL. Failover in this case is even harder.
737 | 
738 | #### Anycast
739 | 
740 | We could also use a single [domain/IP, announcing that IP](https://en.wikipedia.org/wiki/Anycast) from all the places where we have nodes, and leave it to the [network routing protocols](https://www.youtube.com/watch?v=O6tCoD5c_U0) to find the closest node for a given user.
741 | 
742 | ## Miscellaneous
743 | 
744 | We didn't talk about lots of other important aspects of a CDN, such as:
745 | 
746 | * [Peering](https://www.peeringdb.com/) - CDNs will host their nodes/content at ISPs, public peering points, and private facilities.
747 | * Security - CDNs suffer a lot of attacks: DDoS, [cache poisoning](https://youst.in/posts/cache-poisoning-at-scale/), and others.
748 | * [Caching strategies](https://netflixtechblog.com/netflix-and-fill-c43a32b490c0) - in some cases, instead of the edge pulling the content from the backend, the backend pushes the content to the edge.
749 | * [Tenants](https://en.wikipedia.org/wiki/Multitenancy)/Isolation - CDNs will host multiple clients on the same nodes, so isolation is a must:
750 |   * metrics, caching area, configurations (caching policies, backend), etc.
751 | * Tokens - CDNs offer some form of [token protection](https://en.wikipedia.org/wiki/JSON_Web_Token) to keep content away from unauthorized clients.
752 | * [Health check (fault detection)](https://youtu.be/1TIzPL4878Q?t=782) - determining whether a node is functional or not.
753 | * HTTP headers - very often (e.g. [CORS](https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS)) a client wants to add some headers (sometimes dynamically).
754 | * [Geoblocking](https://github.com/leev/ngx_http_geoip2_module#example-usage) - to save money or enforce contractual restrictions, your CDN will employ some policy regarding the locality of users.
755 | * Purging - the ability to [purge content from the cache](https://docs.nginx.com/nginx/admin-guide/content-cache/content-caching/#purging-content-from-the-cache).
756 | * [Throttling](https://github.com/leandromoreira/nginx-lua-redis-rate-measuring#use-case-distributed-throttling) - limiting the number of concurrent requests.
757 | * [Edge computing](https://leandromoreira.com/2020/04/19/building-an-edge-computing-platform/) - the ability to run code as a filter over the hosted content.
758 | * and so on...
759 | 
760 | ## Conclusion
761 | 
762 | I hope you learned a little bit about how a CDN works. It's a complex endeavor, highly dependent on how close your nodes are to the clients and how well you can distribute the load, taking caching into account, to accommodate spikes and lulls in traffic alike.
763 | 
--------------------------------------------------------------------------------