├── .gitignore ├── LICENSE ├── README.md └── docs ├── apps ├── 00-index.md ├── 01-canary-flagger.md └── 02-ab-testing-helm.md ├── istio ├── 00-index.md ├── 01-prerequisites.md ├── 02-gke-setup.md ├── 03-clouddns-setup.md ├── 04-istio-setup.md ├── 05-letsencrypt-setup.md └── 06-grafana-config.md ├── openfaas ├── 00-index.md ├── 01-mtls-config.md ├── 02-mixer-rules.md ├── 03-openfaas-setup.md ├── 04-gateway-config.md └── 05-canary.md └── screens ├── grafana-403-errors.png ├── istio-cert-manager-gcp.png ├── istio-gcp-overview.png ├── jaeger-trace-list.png ├── openfaas-istio-canary-prom.png ├── openfaas-istio-canary-trace.png ├── openfaas-istio-canary.png ├── openfaas-istio-diagram.png ├── openfaas-istio-ga-trace.png ├── routing-desired-state.png └── routing-initial-state.png /.gitignore: -------------------------------------------------------------------------------- 1 | # Binaries for programs and plugins 2 | *.exe 3 | *.exe~ 4 | *.dll 5 | *.so 6 | *.dylib 7 | 8 | # Test binary, build with `go test -c` 9 | *.test 10 | 11 | # Output of the go coverage tool, specifically when used with LiteIDE 12 | *.out 13 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2018 Stefan Prodan 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Istio service mesh guides 2 | 3 | ![istio](https://github.com/stefanprodan/istio-gke/blob/master/docs/screens/istio-gcp-overview.png) 4 | 5 | [Istio GKE setup](/docs/istio/00-index.md) 6 | 7 | * [Prerequisites - client tools](/docs/istio/01-prerequisites.md) 8 | * [GKE cluster setup](/docs/istio/02-gke-setup.md) 9 | * [Cloud DNS setup](/docs/istio/03-clouddns-setup.md) 10 | * [Install Istio with Helm](/docs/istio/04-istio-setup.md) 11 | * [Configure Istio Gateway with Let's Encrypt wildcard certificate](/docs/istio/05-letsencrypt-setup.md) 12 | * [Expose services outside the service mesh](/docs/istio/06-grafana-config.md) 13 | 14 | [Progressive delivery walkthrough](docs/apps/00-index.md) 15 | 16 | * [Automated canary deployments with Flagger](/docs/apps/01-canary-flagger.md) 17 | * [A/B testing for a micro-service stack with Helm](/docs/apps/02-ab-testing-helm.md) 18 | 19 | [OpenFaaS service mesh walkthrough](docs/openfaas/00-index.md) 20 | 21 | * [Configure OpenFaaS mutual TLS](/docs/openfaas/01-mtls-config.md) 22 | * [Configure OpenFaaS access policies](/docs/openfaas/02-mixer-rules.md) 23 | * [Install OpenFaaS with Helm](/docs/openfaas/03-openfaas-setup.md) 24 | * [Configure OpenFaaS Gateway to receive external traffic](/docs/openfaas/04-gateway-config.md) 25 | * [Canary deployments for OpenFaaS functions](/docs/openfaas/05-canary.md) 26 | -------------------------------------------------------------------------------- /docs/apps/00-index.md: -------------------------------------------------------------------------------- 1 | # Progressive delivery walkthrough 2 | 3 | This guide shows you how to route traffic between different versions of a service and how to automate canary deployments. 4 | 5 | ![flagger-overview](https://raw.githubusercontent.com/stefanprodan/flagger/master/docs/diagrams/flagger-overview.png) 6 | 7 | At the end of this guide you will be deploying a series of micro-services with the following characteristics: 8 | 9 | * A/B testing for frontend services 10 | * Source/Destination based routing for backend services 11 | * Progressive deployments gated by Prometheus 12 | 13 | ### Labs 14 | 15 | * [Automated canary deployments with Flagger](01-canary-flagger.md) 16 | * [A/B testing for a micro-service stack with Helm](02-ab-testing-helm.md) 17 | -------------------------------------------------------------------------------- /docs/apps/01-canary-flagger.md: -------------------------------------------------------------------------------- 1 | # Automated canary deployments with Flagger 2 | 3 | [Flagger](https://github.com/stefanprodan/flagger) is a Kubernetes operator that automates the promotion of 4 | canary deployments using Istio routing for traffic shifting and Prometheus metrics for canary analysis. 5 | 6 | ### Install Flagger 7 | 8 | Deploy Flagger in the `istio-system` namespace using Helm: 9 | 10 | ```bash 11 | # add the Helm repository 12 | helm repo add flagger https://flagger.app 13 | 14 | # install or upgrade 15 | helm upgrade -i flagger flagger/flagger \ 16 | --namespace=istio-system \ 17 | --set metricsServer=http://prometheus.istio-system:9090 18 | ``` 19 | 20 | Flagger is compatible with Kubernetes >1.11.0 and Istio >1.0.0. 
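Before moving on, you can verify that the Flagger deployment is ready (a quick sanity check; the deployment name follows from the Helm release installed above):

```bash
# wait for the Flagger rollout to complete
kubectl -n istio-system rollout status deployment/flagger
```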
21 |
22 | ![flagger-overview](https://raw.githubusercontent.com/stefanprodan/flagger/master/docs/diagrams/flagger-canary-overview.png)
23 |
24 | Flagger takes a Kubernetes deployment and optionally a horizontal pod autoscaler (HPA) and creates a series of objects
25 | (Kubernetes deployments, ClusterIP services and Istio virtual services) to drive the canary analysis and promotion.
26 |
27 | A canary deployment is triggered by changes in any of the following objects:
28 |
29 | * Deployment PodSpec (container image, command, ports, env, resources, etc)
30 | * ConfigMaps mounted as volumes or mapped to environment variables
31 | * Secrets mounted as volumes or mapped to environment variables
32 |
33 | Gated canary promotion stages:
34 |
35 | * scan for canary deployments
36 | * check Istio virtual service routes are mapped to primary and canary ClusterIP services
37 | * check primary and canary deployment status
38 | * halt advancement if a rolling update is underway
39 | * halt advancement if pods are unhealthy
40 | * increase canary traffic weight percentage from 0% to 5% (step weight)
41 | * call webhooks and check results
42 | * check canary HTTP request success rate and latency
43 | * halt advancement if any metric is under the specified threshold
44 | * increment the failed checks counter
45 | * check if the number of failed checks reached the threshold
46 | * route all traffic to primary
47 | * scale to zero the canary deployment and mark it as failed
48 | * wait for the canary deployment to be updated and start over
49 | * increase canary traffic weight by 5% (step weight) till it reaches 50% (max weight)
50 | * halt advancement while canary request success rate is under the threshold
51 | * halt advancement while canary request duration P99 is over the threshold
52 | * halt advancement if the primary or canary deployment becomes unhealthy
53 | * halt advancement while canary deployment is being scaled up/down by HPA
54 | * promote canary to primary
55 | * copy ConfigMaps and Secrets from canary to primary
56 | * copy canary deployment spec template over primary
57 | * wait for primary rolling update to finish
58 | * halt advancement if pods are unhealthy
59 | * route all traffic to primary
60 | * scale to zero the canary deployment
61 | * mark rollout as finished
62 | * wait for the canary deployment to be updated and start over
63 |
64 | You can change the canary analysis _max weight_ and the _step weight_ percentage in Flagger's custom resource.
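For example, to promote in 10% increments up to a 30% maximum instead of the defaults, you would set the `canaryAnalysis` fields of the custom resource (a minimal sketch; the same fields appear in the full example below):

```yaml
  canaryAnalysis:
    # max traffic percentage routed to canary (0-100)
    maxWeight: 30
    # canary increment step percentage (0-100)
    stepWeight: 10
```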
65 |
66 | ### Automated canary analysis and promotion
67 |
68 | Create a test namespace with Istio sidecar injection enabled:
69 |
70 | ```bash
71 | export REPO=https://raw.githubusercontent.com/weaveworks/flagger/master
72 |
73 | kubectl apply -f ${REPO}/artifacts/namespaces/test.yaml
74 | ```
75 |
76 | Create a deployment and a horizontal pod autoscaler:
77 |
78 | ```bash
79 | kubectl apply -f ${REPO}/artifacts/canaries/deployment.yaml
80 | kubectl apply -f ${REPO}/artifacts/canaries/hpa.yaml
81 | ```
82 |
83 | Deploy the load testing service to generate traffic during the canary analysis:
84 |
85 | ```bash
86 | kubectl -n test apply -f ${REPO}/artifacts/loadtester/deployment.yaml
87 | kubectl -n test apply -f ${REPO}/artifacts/loadtester/service.yaml
88 | ```
89 |
90 | Create a canary custom resource (replace example.com with your own domain):
91 |
92 | ```yaml
93 | apiVersion: flagger.app/v1alpha3
94 | kind: Canary
95 | metadata:
96 |   name: podinfo
97 |   namespace: test
98 | spec:
99 |   # deployment reference
100 |   targetRef:
101 |     apiVersion: apps/v1
102 |     kind: Deployment
103 |     name: podinfo
104 |   # the maximum time in seconds for the canary deployment
105 |   # to make progress before it is rolled back (default 600s)
106 |   progressDeadlineSeconds: 60
107 |   # HPA reference (optional)
108 |   autoscalerRef:
109 |     apiVersion: autoscaling/v2beta1
110 |     kind: HorizontalPodAutoscaler
111 |     name: podinfo
112 |   service:
113 |     # container port
114 |     port: 9898
115 |     trafficPolicy:
116 |       tls:
117 |         # use ISTIO_MUTUAL when mTLS is enabled
118 |         mode: DISABLE
119 |     # Istio gateways (optional)
120 |     gateways:
121 |     - public-gateway.istio-system.svc.cluster.local
122 |     - mesh
123 |     # Istio virtual service host names (optional)
124 |     hosts:
125 |     - app.example.com
126 |   canaryAnalysis:
127 |     # schedule interval (default 60s)
128 |     interval: 1m
129 |     # max number of failed metric checks before rollback
130 |     threshold: 5
131 |     # max traffic percentage routed to canary
132 |     # percentage (0-100)
133 |     maxWeight: 50
134 |     # canary increment step
135 |     # percentage (0-100)
136 |     stepWeight: 10
137 |     metrics:
138 |     - name: request-success-rate
139 |       # minimum req success rate (non 5xx responses)
140 |       # percentage (0-100)
141 |       threshold: 99
142 |       interval: 1m
143 |     - name: request-duration
144 |       # maximum req duration P99
145 |       # milliseconds
146 |       threshold: 500
147 |       interval: 30s
148 |     # generate traffic during analysis
149 |     webhooks:
150 |       - name: load-test
151 |         url: http://flagger-loadtester.test/
152 |         timeout: 5s
153 |         metadata:
154 |           cmd: "hey -z 1m -q 10 -c 2 http://podinfo-canary.test:9898/"
155 | ```
156 |
157 | Save the above resource as podinfo-canary.yaml and then apply it:
158 |
159 | ```bash
160 | kubectl apply -f ./podinfo-canary.yaml
161 | ```
162 |
163 | After a couple of seconds Flagger will create the canary objects:
164 |
165 | ```bash
166 | # applied
167 | deployment.apps/podinfo
168 | horizontalpodautoscaler.autoscaling/podinfo
169 | canary.flagger.app/podinfo
170 |
171 | # generated
172 | deployment.apps/podinfo-primary
173 | horizontalpodautoscaler.autoscaling/podinfo-primary
174 | service/podinfo
175 | service/podinfo-canary
176 | service/podinfo-primary
177 | virtualservice.networking.istio.io/podinfo
178 | ```
179 |
180 | ![flagger-canary-steps](https://raw.githubusercontent.com/stefanprodan/flagger/master/docs/diagrams/flagger-canary-steps.png)
181 |
182 | Trigger a canary deployment by updating the container image:
183 |
184 | ```bash
185 | kubectl -n test set image
deployment/podinfo \ 186 | podinfod=quay.io/stefanprodan/podinfo:1.4.1 187 | ``` 188 | 189 | Flagger detects that the deployment revision changed and starts a new rollout: 190 | 191 | ``` 192 | kubectl -n test describe canary/podinfo 193 | 194 | Status: 195 | Canary Revision: 19871136 196 | Failed Checks: 0 197 | State: finished 198 | Events: 199 | Type Reason Age From Message 200 | ---- ------ ---- ---- ------- 201 | Normal Synced 3m flagger New revision detected podinfo.test 202 | Normal Synced 3m flagger Scaling up podinfo.test 203 | Warning Synced 3m flagger Waiting for podinfo.test rollout to finish: 0 of 1 updated replicas are available 204 | Normal Synced 3m flagger Advance podinfo.test canary weight 5 205 | Normal Synced 3m flagger Advance podinfo.test canary weight 10 206 | Normal Synced 3m flagger Advance podinfo.test canary weight 15 207 | Normal Synced 2m flagger Advance podinfo.test canary weight 20 208 | Normal Synced 2m flagger Advance podinfo.test canary weight 25 209 | Normal Synced 1m flagger Advance podinfo.test canary weight 30 210 | Normal Synced 1m flagger Advance podinfo.test canary weight 35 211 | Normal Synced 55s flagger Advance podinfo.test canary weight 40 212 | Normal Synced 45s flagger Advance podinfo.test canary weight 45 213 | Normal Synced 35s flagger Advance podinfo.test canary weight 50 214 | Normal Synced 25s flagger Copying podinfo.test template spec to podinfo-primary.test 215 | Warning Synced 15s flagger Waiting for podinfo-primary.test rollout to finish: 1 of 2 updated replicas are available 216 | Normal Synced 5s flagger Promotion completed! Scaling down podinfo.test 217 | ``` 218 | 219 | **Note** that if you apply new changes to the deployment during the canary analysis, Flagger will restart the analysis. 220 | 221 | You can monitor all canaries with: 222 | 223 | ```bash 224 | watch kubectl get canaries --all-namespaces 225 | 226 | NAMESPACE NAME STATUS WEIGHT LASTTRANSITIONTIME 227 | test podinfo Progressing 15 2019-01-16T14:05:07Z 228 | prod frontend Succeeded 0 2019-01-15T16:15:07Z 229 | prod backend Failed 0 2019-01-14T17:05:07Z 230 | ``` 231 | 232 | ### Automated rollback 233 | 234 | During the canary analysis you can generate HTTP 500 errors and high latency to test if Flagger pauses the rollout. 235 | 236 | Create a tester pod and exec into it: 237 | 238 | ```bash 239 | kubectl -n test run tester --image=quay.io/stefanprodan/podinfo:1.2.1 -- ./podinfo --port=9898 240 | kubectl -n test exec -it tester-xx-xx sh 241 | ``` 242 | 243 | Generate HTTP 500 errors: 244 | 245 | ```bash 246 | watch curl http://podinfo-canary:9898/status/500 247 | ``` 248 | 249 | Generate latency: 250 | 251 | ```bash 252 | watch curl http://podinfo-canary:9898/delay/1 253 | ``` 254 | 255 | When the number of failed checks reaches the canary analysis threshold, the traffic is routed back to the primary, 256 | the canary is scaled to zero and the rollout is marked as failed. 
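While the rollout degrades you can also poll the failed checks counter (a quick check; the field name assumes the status layout shown in the describe output below):

```bash
# prints the current number of failed metric checks
kubectl -n test get canary/podinfo -o jsonpath='{.status.failedChecks}'
```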
257 | 258 | ``` 259 | kubectl -n test describe canary/podinfo 260 | 261 | Status: 262 | Canary Revision: 16695041 263 | Failed Checks: 10 264 | State: failed 265 | Events: 266 | Type Reason Age From Message 267 | ---- ------ ---- ---- ------- 268 | Normal Synced 3m flagger Starting canary deployment for podinfo.test 269 | Normal Synced 3m flagger Advance podinfo.test canary weight 5 270 | Normal Synced 3m flagger Advance podinfo.test canary weight 10 271 | Normal Synced 3m flagger Advance podinfo.test canary weight 15 272 | Normal Synced 3m flagger Halt podinfo.test advancement success rate 69.17% < 99% 273 | Normal Synced 2m flagger Halt podinfo.test advancement success rate 61.39% < 99% 274 | Normal Synced 2m flagger Halt podinfo.test advancement success rate 55.06% < 99% 275 | Normal Synced 2m flagger Halt podinfo.test advancement success rate 47.00% < 99% 276 | Normal Synced 2m flagger (combined from similar events): Halt podinfo.test advancement success rate 38.08% < 99% 277 | Warning Synced 1m flagger Rolling back podinfo.test failed checks threshold reached 10 278 | Warning Synced 1m flagger Canary failed! Scaling down podinfo.test 279 | ``` 280 | 281 | ### Monitoring 282 | 283 | Flagger comes with a Grafana dashboard made for canary analysis. 284 | 285 | Install Grafana with Helm: 286 | 287 | ```bash 288 | helm upgrade -i flagger-grafana flagger/grafana \ 289 | --namespace=istio-system \ 290 | --set url=http://prometheus.istio-system:9090 291 | ``` 292 | 293 | The dashboard shows the RED and USE metrics for the primary and canary workloads: 294 | 295 | ![flagger-grafana](https://raw.githubusercontent.com/stefanprodan/flagger/master/docs/screens/grafana-canary-analysis.png) 296 | 297 | The canary errors and latency spikes have been recorded as Kubernetes events and logged by Flagger in json format: 298 | 299 | ``` 300 | kubectl -n istio-system logs deployment/flagger --tail=100 | jq .msg 301 | 302 | Starting canary deployment for podinfo.test 303 | Advance podinfo.test canary weight 5 304 | Advance podinfo.test canary weight 10 305 | Advance podinfo.test canary weight 15 306 | Advance podinfo.test canary weight 20 307 | Advance podinfo.test canary weight 25 308 | Advance podinfo.test canary weight 30 309 | Advance podinfo.test canary weight 35 310 | Halt podinfo.test advancement success rate 98.69% < 99% 311 | Advance podinfo.test canary weight 40 312 | Halt podinfo.test advancement request duration 1.515s > 500ms 313 | Advance podinfo.test canary weight 45 314 | Advance podinfo.test canary weight 50 315 | Copying podinfo.test template spec to podinfo-primary.test 316 | Halt podinfo-primary.test advancement waiting for rollout to finish: 1 old replicas are pending termination 317 | Scaling down podinfo.test 318 | Promotion completed! podinfo.test 319 | ``` 320 | ### Alerting 321 | 322 | Flagger can be configured to send Slack notifications: 323 | 324 | ```bash 325 | helm upgrade -i flagger flagger/flagger \ 326 | --namespace=istio-system \ 327 | --set slack.url=https://hooks.slack.com/services/YOUR/SLACK/WEBHOOK \ 328 | --set slack.channel=general \ 329 | --set slack.user=flagger 330 | ``` 331 | 332 | Once configured with a Slack incoming webhook, Flagger will post messages when a canary deployment has been initialized, 333 | when a new revision has been detected and if the canary analysis failed or succeeded. 
334 |
335 | ![flagger-slack](https://raw.githubusercontent.com/stefanprodan/flagger/master/docs/screens/slack-canary-notifications.png)
336 |
337 | A canary deployment will be rolled back if the progress deadline is exceeded or if the analysis
338 | reaches the maximum number of failed checks:
339 |
340 | ![flagger-slack-errors](https://raw.githubusercontent.com/stefanprodan/flagger/master/docs/screens/slack-canary-failed.png)
341 |
342 | Besides Slack, you can use Alertmanager to trigger alerts when a canary deployment fails:
343 |
344 | ```yaml
345 | - alert: canary_rollback
346 |   expr: flagger_canary_status > 1
347 |   for: 1m
348 |   labels:
349 |     severity: warning
350 |   annotations:
351 |     summary: "Canary failed"
352 |     description: "Workload {{ $labels.name }} namespace {{ $labels.namespace }}"
353 | ```
354 |
355 | Next: [A/B Testing with Helm](02-ab-testing-helm.md)
356 |
357 |
--------------------------------------------------------------------------------
/docs/apps/02-ab-testing-helm.md:
--------------------------------------------------------------------------------
1 | # A/B testing with Istio and Helm
2 |
3 | To experiment with different traffic routing techniques
4 | I've created a Helm chart for [podinfo](https://github.com/stefanprodan/k8s-podinfo) that lets you chain multiple
5 | services and wraps all the Istio objects needed for A/B testing and canary deployments.
6 |
7 | Using the podinfo chart you will be installing three microservices: frontend, backend and data store.
8 | Each of these services can have two versions running in parallel; the versions are called blue and green.
9 | The assumption is that for the frontend you'll be running A/B testing based on the user agent HTTP header.
10 | The green frontend is not backwards compatible with the blue backend, so you'll route all requests from the green
11 | frontend to the green backend. For the data store you'll be running performance testing. Both backend versions are
12 | compatible with the blue and green data store, so you'll be splitting the traffic between blue and green data stores
13 | and comparing the request latency and error rate to determine if the green store performs
14 | better than the blue one.
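Under the hood the chart renders plain Istio routing objects. Conceptually, the frontend A/B split comes down to a VirtualService that matches Safari user agents and falls through to blue (a hand-written sketch to illustrate the idea, not the chart's exact output; the names and subsets are hypothetical):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: frontend
  namespace: demo
spec:
  hosts:
  - frontend
  http:
  # Safari users are routed to the green version
  - match:
    - headers:
        user-agent:
          regex: "^(?!.*Chrome).*Safari.*"
    route:
    - destination:
        host: frontend
        subset: green
  # everyone else stays on blue
  - route:
    - destination:
        host: frontend
        subset: blue
```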
15 |
16 | ### Deploy the blue version
17 |
18 | Add the podinfo Helm repository:
19 |
20 | ```bash
21 | helm repo add sp https://stefanprodan.github.io/k8s-podinfo
22 | ```
23 |
24 | Create a namespace with Istio sidecar injection enabled:
25 |
26 | ```yaml
27 | apiVersion: v1
28 | kind: Namespace
29 | metadata:
30 |   labels:
31 |     istio-injection: enabled
32 |   name: demo
33 | ```
34 |
35 | Save the above resource as demo.yaml and then apply it:
36 |
37 | ```bash
38 | kubectl apply -f ./demo.yaml
39 | ```
40 |
41 | ![initial-state](https://github.com/stefanprodan/istio-gke/blob/master/docs/screens/routing-initial-state.png)
42 |
43 | Create a frontend release exposed outside the service mesh on the podinfo sub-domain (replace `example.com` with your domain):
44 |
45 | ```yaml
46 | host: podinfo.example.com
47 | exposeHost: true
48 |
49 | blue:
50 |   replicas: 2
51 |   tag: "1.2.0"
52 |   message: "Greetings from the blue frontend"
53 |   backend: http://backend:9898/api/echo
54 |
55 | green:
56 |   # disabled (all traffic goes to blue)
57 |   replicas: 0
58 | ```
59 |
60 | Save the above resource as frontend.yaml and then install it:
61 |
62 | ```bash
63 | helm install --name frontend sp/podinfo-istio \
64 |   --namespace demo \
65 |   -f ./frontend.yaml
66 | ```
67 |
68 | Create a backend release:
69 |
70 | ```yaml
71 | host: backend
72 |
73 | blue:
74 |   replicas: 2
75 |   tag: "1.2.0"
76 |   backend: http://store:9898/api/echo
77 |
78 | green:
79 |   # disabled (all traffic goes to blue)
80 |   replicas: 0
81 | ```
82 |
83 | Save the above resource as backend.yaml and then install it:
84 |
85 | ```bash
86 | helm install --name backend sp/podinfo-istio \
87 |   --namespace demo \
88 |   -f ./backend.yaml
89 | ```
90 |
91 | Create a store release:
92 |
93 | ```yaml
94 | host: store
95 |
96 | blue:
97 |   replicas: 2
98 |   tag: "1.2.0"
99 |   weight: 100
100 |
101 | green:
102 |   # disabled (all traffic goes to blue)
103 |   replicas: 0
104 | ```
105 |
106 | Save the above resource as store.yaml and then install it:
107 |
108 | ```bash
109 | helm install --name store sp/podinfo-istio \
110 |   --namespace demo \
111 |   -f ./store.yaml
112 | ```
113 |
114 | Open `https://podinfo.example.com` in your browser; you should see a greeting message from the blue version.
115 | Clicking on the ping button will make a call that spans across all microservices.
116 |
117 | Access the Jaeger dashboard using port forwarding:
118 |
119 | ```bash
120 | kubectl -n istio-system port-forward deployment/istio-tracing 16686:16686
121 | ```
122 |
123 | Navigate to `http://localhost:16686` and select `store` from the service dropdown. You should see a trace for each ping.
124 |
125 | ![jaeger-trace](https://github.com/stefanprodan/istio-gke/blob/master/docs/screens/jaeger-trace-list.png)
126 |
127 | Istio tracing is able to capture the ping call spanning across all microservices because podinfo forwards the Zipkin HTTP
128 | headers. When an HTTP request reaches the Istio Gateway, Envoy will inject a series of headers used for tracing.
When podinfo
129 | calls a backend service, it will copy the headers from the incoming HTTP request:
130 |
131 | ```go
132 | func copyTracingHeaders(from *http.Request, to *http.Request) {
133 |     headers := []string{
134 |         "x-request-id",
135 |         "x-b3-traceid",
136 |         "x-b3-spanid",
137 |         "x-b3-parentspanid",
138 |         "x-b3-sampled",
139 |         "x-b3-flags",
140 |         "x-ot-span-context",
141 |     }
142 |
143 |     for i := range headers {
144 |         headerValue := from.Header.Get(headers[i])
145 |         if len(headerValue) > 0 {
146 |             to.Header.Set(headers[i], headerValue)
147 |         }
148 |     }
149 | }
150 | ```
151 |
152 | ### Deploy the green version
153 |
154 | ![desired-state](https://github.com/stefanprodan/istio-gke/blob/master/docs/screens/routing-desired-state.png)
155 |
156 | Change the frontend definition to route traffic coming from Safari users to the green deployment:
157 |
158 | ```yaml
159 | host: podinfo.example.com
160 | exposeHost: true
161 |
162 | blue:
163 |   replicas: 2
164 |   tag: "1.2.0"
165 |   message: "Greetings from the blue frontend"
166 |   backend: http://backend:9898/api/echo
167 |
168 | green:
169 |   replicas: 2
170 |   tag: "1.2.1"
171 |   routing:
172 |     # target Safari
173 |     - match:
174 |       - headers:
175 |           user-agent:
176 |             regex: "^(?!.*Chrome).*Safari.*"
177 |     # target API clients by version
178 |     - match:
179 |       - headers:
180 |           x-api-version:
181 |             regex: "^(v{0,1})1\\.2\\.([1-9]).*"
182 |   message: "Greetings from the green frontend"
183 |   backend: http://backend:9898/api/echo
184 | ```
185 |
186 | Save the above resource and apply it:
187 |
188 | ```bash
189 | helm upgrade --install frontend sp/podinfo-istio \
190 |   --namespace demo \
191 |   -f ./frontend.yaml
192 | ```
193 |
194 | Change the backend definition to receive traffic based on source labels. The blue frontend will be routed to the blue
195 | backend and the green frontend to the green backend:
196 |
197 | ```yaml
198 | host: backend
199 |
200 | blue:
201 |   replicas: 2
202 |   tag: "1.2.0"
203 |   backend: http://store:9898/api/echo
204 |
205 | green:
206 |   replicas: 2
207 |   tag: "1.2.1"
208 |   routing:
209 |     # target green callers
210 |     - match:
211 |       - sourceLabels:
212 |           color: green
213 |   backend: http://store:9898/api/echo
214 | ```
215 |
216 | Save the above resource and apply it:
217 |
218 | ```bash
219 | helm upgrade --install backend sp/podinfo-istio \
220 |   --namespace demo \
221 |   -f ./backend.yaml
222 | ```
223 |
224 | Change the store definition to route 80% of the traffic to the blue deployment and 20% to the green one:
225 |
226 | ```yaml
227 | host: store
228 |
229 | # load balance 80/20 between blue and green
230 | blue:
231 |   replicas: 2
232 |   tag: "1.2.0"
233 |   weight: 80
234 |
235 | green:
236 |   replicas: 1
237 |   tag: "1.2.1"
238 | ```
239 |
240 | Save the above resource and apply it:
241 |
242 | ```bash
243 | helm upgrade --install store sp/podinfo-istio \
244 |   --namespace demo \
245 |   -f ./store.yaml
246 | ```
247 |
248 | ### Restrict access with Mixer rules
249 |
250 | Let's assume the frontend service has a vulnerability and a bad actor can execute arbitrary commands in the frontend container.
251 | If someone gains access to the frontend service, from there they can issue API calls to the backend and data store services.
252 |
253 | In order to simulate this, you can exec into the frontend container and curl the data store API:
254 |
255 | ```bash
256 | kubectl -n demo exec -it frontend-blue-675b4dff4b-xhg9d -c podinfod sh
257 |
258 | ~ $ curl -v http://store:9898
259 | * Connected to store (10.31.250.154) port 9898 (#0)
260 | ```
261 |
262 | There is no reason why the frontend service should have access to the data store; only the backend service should be
263 | able to issue API calls to the store service. With Istio you can define access rules and restrict access based on source and
264 | destination.
265 |
266 | Let's create an Istio config that denies access to the data store unless the caller is the backend service:
267 |
268 | ```yaml
269 | apiVersion: config.istio.io/v1alpha2
270 | kind: denier
271 | metadata:
272 |   name: denyhandler
273 |   namespace: demo
274 | spec:
275 |   status:
276 |     code: 7
277 |     message: Not allowed
278 | ---
279 | apiVersion: config.istio.io/v1alpha2
280 | kind: checknothing
281 | metadata:
282 |   name: denyrequest
283 |   namespace: demo
284 | spec:
285 | ---
286 | apiVersion: config.istio.io/v1alpha2
287 | kind: rule
288 | metadata:
289 |   name: denystore
290 |   namespace: demo
291 | spec:
292 |   match: destination.labels["app"] == "store" && source.labels["app"] != "backend"
293 |   actions:
294 |   - handler: denyhandler.denier
295 |     instances: [ denyrequest.checknothing ]
296 | ```
297 |
298 | Save the above resource as demo-rules.yaml and then apply it:
299 |
300 | ```bash
301 | kubectl apply -f ./demo-rules.yaml
302 | ```
303 |
304 | Now if you try to call the data store from the frontend container, Istio Mixer will deny access:
305 |
306 | ```bash
307 | kubectl -n demo exec -it frontend-blue-675b4dff4b-xhg9d -c podinfod sh
308 |
309 | ~ $ watch curl -s http://store:9898
310 | PERMISSION_DENIED:denyhandler.denier.demo:Not allowed
311 | ```
312 |
313 | The permission denied error can be observed in Grafana. Open the Istio Workload dashboard, select the demo namespace and
314 | the frontend-blue workload from the dropdown, scroll to outbound services and you'll see the HTTP 403 errors:
315 |
316 | ![grafana-403](https://github.com/stefanprodan/istio-gke/blob/master/docs/screens/grafana-403-errors.png)
317 |
318 | Once you have the Mixer rules in place, you can create an alert for HTTP 403 errors with Prometheus and Alertmanager
319 | to be notified about suspicious activities inside the service mesh.
320 |
321 |
--------------------------------------------------------------------------------
/docs/istio/00-index.md:
--------------------------------------------------------------------------------
1 | # Istio GKE setup
2 |
3 | This guide walks you through setting up Istio on Google Kubernetes Engine.
4 |
5 | ![istio](https://github.com/stefanprodan/istio-gke/blob/master/docs/screens/istio-gcp-overview.png)
6 |
7 | At the end of this guide you will be running Istio with the following characteristics:
8 |
9 | * secure Istio ingress gateway with Let’s Encrypt TLS
10 | * encrypted communication between Kubernetes workloads with Istio mutual TLS
11 | * Jaeger tracing
12 | * Prometheus and Grafana monitoring
13 | * canary deployments, A/B testing and traffic mirroring capabilities
14 |
15 | ### Labs
16 |
17 | * [Prerequisites - client tools](01-prerequisites.md)
18 | * [GKE cluster setup](02-gke-setup.md)
19 | * [Cloud DNS setup](03-clouddns-setup.md)
20 | * [Install Istio with Helm](04-istio-setup.md)
21 | * [Configure Istio Gateway with Let's Encrypt wildcard certificate](05-letsencrypt-setup.md)
22 | * [Expose services outside the service mesh](06-grafana-config.md)
23 |
--------------------------------------------------------------------------------
/docs/istio/01-prerequisites.md:
--------------------------------------------------------------------------------
1 | # Prerequisites
2 |
3 | You will be creating a cluster on Google’s Kubernetes Engine (GKE).
4 | If you don’t have an account, you can sign up [here](https://cloud.google.com/free/) for free credits.
5 |
6 | Log in to GCP, create a project and enable billing for it.
7 |
8 | Install the [gcloud](https://cloud.google.com/sdk/) command line utility and configure your project with `gcloud init`.
9 |
10 | Set the default project (replace `PROJECT_ID` with your own project):
11 |
12 | ```bash
13 | gcloud config set project PROJECT_ID
14 | ```
15 |
16 | Set the default compute region and zone:
17 |
18 | ```bash
19 | gcloud config set compute/region europe-west3
20 | gcloud config set compute/zone europe-west3-a
21 | ```
22 |
23 | Enable the Kubernetes and Cloud DNS services for your project:
24 |
25 | ```bash
26 | gcloud services enable container.googleapis.com
27 | gcloud services enable dns.googleapis.com
28 | ```
29 |
30 | Install the `kubectl` command-line tool:
31 |
32 | ```bash
33 | gcloud components install kubectl
34 | ```
35 |
36 | Install the `helm` command-line tool:
37 |
38 | ```bash
39 | brew install kubernetes-helm
40 | ```
41 |
42 | Create the Tiller service account:
43 |
44 | ```bash
45 | kubectl --namespace kube-system create sa tiller
46 | kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
47 | ```
48 |
49 | Install Tiller:
50 |
51 | ```bash
52 | helm init --service-account tiller --upgrade --wait
53 | ```
54 |
55 | Next: [GKE cluster setup](02-gke-setup.md)
56 |
--------------------------------------------------------------------------------
/docs/istio/02-gke-setup.md:
--------------------------------------------------------------------------------
1 | # GKE cluster setup
2 |
3 | Create a cluster with three nodes using the latest Kubernetes version:
4 |
5 | ```bash
6 | k8s_version=$(gcloud container get-server-config --format=json \
7 |   | jq -r '.validMasterVersions[0]')
8 |
9 | gcloud container clusters create istio \
10 |   --cluster-version=${k8s_version} \
11 |   --zone=europe-west3-a \
12 |   --num-nodes=3 \
13 |   --machine-type=n1-highcpu-4 \
14 |   --preemptible \
15 |   --no-enable-cloud-logging \
16 |   --disk-size=50 \
17 |   --enable-autorepair \
18 |   --scopes=gke-default
19 | ```
20 |
21 | The above command will create a default node pool consisting of `n1-highcpu-4` (vCPU: 4, RAM 3.60GB, DISK: 50GB) preemptible VMs.
22 | Preemptible VMs are up to 80% cheaper than regular instances and are terminated and replaced after a maximum of 24 hours.
23 |
24 | Set up credentials for `kubectl`:
25 |
26 | ```bash
27 | gcloud container clusters get-credentials istio -z=europe-west3-a
28 | ```
29 |
30 | Create a cluster admin role binding:
31 |
32 | ```bash
33 | kubectl create clusterrolebinding "cluster-admin-$(whoami)" \
34 |   --clusterrole=cluster-admin \
35 |   --user="$(gcloud config get-value core/account)"
36 | ```
37 |
38 | Validate your setup with:
39 |
40 | ```bash
41 | kubectl get nodes -o wide
42 | ```
43 |
44 | Next: [Cloud DNS setup](03-clouddns-setup.md)
45 |
--------------------------------------------------------------------------------
/docs/istio/03-clouddns-setup.md:
--------------------------------------------------------------------------------
1 | # Cloud DNS setup
2 |
3 | You will need an internet domain and access to the registrar to change the name servers to Google Cloud DNS.
4 |
5 | Create a managed zone named `istio` in Cloud DNS (replace `example.com` with your domain):
6 |
7 | ```bash
8 | gcloud dns managed-zones create \
9 |   --dns-name="example.com." \
10 |   --description="Istio zone" "istio"
11 | ```
12 |
13 | Look up your zone's name servers:
14 |
15 | ```bash
16 | gcloud dns managed-zones describe istio
17 | ```
18 |
19 | Update your registrar's name server records with the records returned by the above command.
20 |
21 | Wait for the name servers to change (replace `example.com` with your domain):
22 |
23 | ```bash
24 | watch dig +short NS example.com
25 | ```
26 |
27 | Create a static IP address named `istio-gateway-ip` in the same region as your GKE cluster:
28 |
29 | ```bash
30 | gcloud compute addresses create istio-gateway-ip --region europe-west3
31 | ```
32 |
33 | Find the static IP address:
34 |
35 | ```bash
36 | gcloud compute addresses describe istio-gateway-ip --region europe-west3
37 | ```
38 |
39 | Create the following DNS records (replace `example.com` with your domain and set your Istio Gateway IP):
40 |
41 | ```bash
42 | DOMAIN="example.com"
43 | GATEWAYIP="35.198.98.90"
44 |
45 | gcloud dns record-sets transaction start --zone=istio
46 |
47 | gcloud dns record-sets transaction add --zone=istio \
48 |   --name="${DOMAIN}" --ttl=300 --type=A ${GATEWAYIP}
49 |
50 | gcloud dns record-sets transaction add --zone=istio \
51 |   --name="www.${DOMAIN}" --ttl=300 --type=A ${GATEWAYIP}
52 |
53 | gcloud dns record-sets transaction add --zone=istio \
54 |   --name="*.${DOMAIN}" --ttl=300 --type=A ${GATEWAYIP}
55 |
56 | gcloud dns record-sets transaction execute --zone istio
57 | ```
58 |
59 | Verify that the wildcard DNS is working (replace `example.com` with your domain):
60 |
61 | ```bash
62 | watch host test.example.com
63 | ```
64 |
65 | Next: [Install Istio with Helm](04-istio-setup.md)
66 |
--------------------------------------------------------------------------------
/docs/istio/04-istio-setup.md:
--------------------------------------------------------------------------------
1 | # Install Istio with Helm
2 |
3 | Add the Istio Helm repository:
4 |
5 | ```bash
6 | export ISTIO_VER="1.2.3"
7 |
8 | helm repo add istio.io https://storage.googleapis.com/istio-release/releases/${ISTIO_VER}/charts
9 | ```
10 |
11 | Install the Istio custom resource definitions:
12 |
13 | ```bash
14 | helm upgrade -i istio-init istio.io/istio-init --wait --namespace istio-system
15 | ```
16 |
17 | Wait for Istio CRDs to be deployed:
18 |
19 | ```bash
20 | kubectl -n istio-system wait
--for=condition=complete job/istio-init-crd-10
21 | kubectl -n istio-system wait --for=condition=complete job/istio-init-crd-11
22 | kubectl -n istio-system wait --for=condition=complete job/istio-init-crd-12
23 | ```
24 |
25 | Create a secret for Grafana credentials:
26 |
27 | ```bash
28 | # generate a random password
29 | PASSWORD=$(head -c 12 /dev/urandom | shasum | cut -d' ' -f1)
30 |
31 | kubectl -n istio-system create secret generic grafana \
32 |   --from-literal=username=admin \
33 |   --from-literal=passphrase="$PASSWORD"
34 | ```
35 |
36 | Configure Istio with Prometheus, Jaeger and Grafana, and set your load balancer IP:
37 |
38 | ```yaml
39 | # ingress configuration
40 | gateways:
41 |   enabled: true
42 |   istio-ingressgateway:
43 |     type: LoadBalancer
44 |     loadBalancerIP: "35.198.98.90"
45 |     autoscaleEnabled: true
46 |     autoscaleMax: 2
47 |
48 | # common settings
49 | global:
50 |   # sidecar settings
51 |   proxy:
52 |     resources:
53 |       requests:
54 |         cpu: 10m
55 |         memory: 64Mi
56 |       limits:
57 |         cpu: 2000m
58 |         memory: 256Mi
59 |   controlPlaneSecurityEnabled: false
60 |   mtls:
61 |     enabled: false
62 |   useMCP: true
63 |
64 | # pilot configuration
65 | pilot:
66 |   enabled: true
67 |   autoscaleEnabled: true
68 |   sidecar: true
69 |   resources:
70 |     requests:
71 |       cpu: 10m
72 |       memory: 128Mi
73 |
74 | # sidecar-injector webhook configuration
75 | sidecarInjectorWebhook:
76 |   enabled: true
77 |
78 | # security configuration
79 | security:
80 |   enabled: true
81 |
82 | # galley configuration
83 | galley:
84 |   enabled: true
85 |
86 | # mixer configuration
87 | mixer:
88 |   policy:
89 |     enabled: false
90 |     replicaCount: 1
91 |     autoscaleEnabled: true
92 |   telemetry:
93 |     enabled: true
94 |     replicaCount: 1
95 |     autoscaleEnabled: true
96 |     resources:
97 |       requests:
98 |         cpu: 10m
99 |         memory: 128Mi
100 |
101 | # addon prometheus configuration
102 | prometheus:
103 |   enabled: true
104 |   scrapeInterval: 5s
105 |
106 | # addon jaeger tracing configuration
107 | tracing:
108 |   enabled: true
109 |
110 | # addon grafana configuration
111 | grafana:
112 |   enabled: true
113 |   security:
114 |     enabled: true
115 | ```
116 |
117 | Save the above file as `my-istio.yaml` and install Istio with Helm:
118 |
119 | ```bash
120 | helm upgrade --install istio istio.io/istio \
121 |   --namespace=istio-system \
122 |   -f ./my-istio.yaml
123 | ```
124 |
125 | Verify that Istio workloads are running:
126 |
127 | ```bash
128 | watch kubectl -n istio-system get pods
129 | ```
130 |
131 | Next: [Configure Istio Gateway with Let's Encrypt wildcard certificate](05-letsencrypt-setup.md)
132 |
--------------------------------------------------------------------------------
/docs/istio/05-letsencrypt-setup.md:
--------------------------------------------------------------------------------
1 | # Configure Istio Gateway with Let's Encrypt wildcard certificate
2 |
3 | ![istio-letsencrypt](https://github.com/stefanprodan/istio-gke/blob/master/docs/screens/istio-cert-manager-gcp.png)
4 |
5 | Create an Istio Gateway in the istio-system namespace with HTTPS redirect:
6 |
7 | ```yaml
8 | apiVersion: networking.istio.io/v1alpha3
9 | kind: Gateway
10 | metadata:
11 |   name: public-gateway
12 |   namespace: istio-system
13 | spec:
14 |   selector:
15 |     istio: ingressgateway
16 |   servers:
17 |     - port:
18 |         number: 80
19 |         name: http
20 |         protocol: HTTP
21 |       hosts:
22 |         - "*"
23 |       tls:
24 |         httpsRedirect: true
25 |     - port:
26 |         number: 443
27 |         name: https
28 |         protocol: HTTPS
29 |       hosts:
30 |         - "*"
31 |       tls:
32 |         mode: SIMPLE
33 |         privateKey:
/etc/istio/ingressgateway-certs/tls.key 34 | serverCertificate: /etc/istio/ingressgateway-certs/tls.crt 35 | ``` 36 | 37 | Save the above resource as istio-gateway.yaml and then apply it: 38 | 39 | ```bash 40 | kubectl apply -f ./istio-gateway.yaml 41 | ``` 42 | 43 | Create a service account with Cloud DNS admin role (replace `my-gcp-project` with your project ID): 44 | 45 | ```bash 46 | GCP_PROJECT=my-gcp-project 47 | 48 | gcloud iam service-accounts create dns-admin \ 49 | --display-name=dns-admin \ 50 | --project=${GCP_PROJECT} 51 | 52 | gcloud iam service-accounts keys create ./gcp-dns-admin.json \ 53 | --iam-account=dns-admin@${GCP_PROJECT}.iam.gserviceaccount.com \ 54 | --project=${GCP_PROJECT} 55 | 56 | gcloud projects add-iam-policy-binding ${GCP_PROJECT} \ 57 | --member=serviceAccount:dns-admin@${GCP_PROJECT}.iam.gserviceaccount.com \ 58 | --role=roles/dns.admin 59 | ``` 60 | 61 | Create a Kubernetes secret with the GCP Cloud DNS admin key: 62 | 63 | ```bash 64 | kubectl create secret generic cert-manager-credentials \ 65 | --from-file=./gcp-dns-admin.json \ 66 | --namespace=istio-system 67 | ``` 68 | 69 | Install cert-manager's CRDs: 70 | 71 | ```bash 72 | CERT_REPO=https://raw.githubusercontent.com/jetstack/cert-manager 73 | 74 | kubectl apply -f ${CERT_REPO}/release-0.7/deploy/manifests/00-crds.yaml 75 | ``` 76 | 77 | Create the cert-manager namespace and disable resource validation: 78 | 79 | ```bash 80 | kubectl create namespace cert-manager 81 | 82 | kubectl label namespace cert-manager certmanager.k8s.io/disable-validation=true 83 | ``` 84 | 85 | Install cert-manager with Helm: 86 | 87 | ```bash 88 | helm repo add jetstack https://charts.jetstack.io && \ 89 | helm repo update && \ 90 | helm upgrade -i cert-manager \ 91 | --namespace cert-manager \ 92 | --version v0.7.0 \ 93 | jetstack/cert-manager 94 | ``` 95 | 96 | Create a letsencrypt issuer for CloudDNS (replace `email@example.com` with a valid email address and `my-gcp-project` with your project ID): 97 | 98 | ```yaml 99 | apiVersion: certmanager.k8s.io/v1alpha1 100 | kind: Issuer 101 | metadata: 102 | name: letsencrypt-prod 103 | namespace: istio-system 104 | spec: 105 | acme: 106 | server: https://acme-v02.api.letsencrypt.org/directory 107 | email: email@example.com 108 | privateKeySecretRef: 109 | name: letsencrypt-prod 110 | dns01: 111 | providers: 112 | - name: cloud-dns 113 | clouddns: 114 | serviceAccountSecretRef: 115 | name: cert-manager-credentials 116 | key: gcp-dns-admin.json 117 | project: my-gcp-project 118 | ``` 119 | 120 | Save the above resource as letsencrypt-issuer.yaml and then apply it: 121 | 122 | ```bash 123 | kubectl apply -f ./letsencrypt-issuer.yaml 124 | ``` 125 | 126 | Create a wildcard certificate (replace `example.com` with your domain): 127 | 128 | ```yaml 129 | apiVersion: certmanager.k8s.io/v1alpha1 130 | kind: Certificate 131 | metadata: 132 | name: istio-gateway 133 | namespace: istio-system 134 | spec: 135 | secretName: istio-ingressgateway-certs 136 | issuerRef: 137 | name: letsencrypt-prod 138 | commonName: "*.example.com" 139 | acme: 140 | config: 141 | - dns01: 142 | provider: cloud-dns 143 | domains: 144 | - "*.example.com" 145 | - "example.com" 146 | ``` 147 | 148 | Save the above resource as of-cert.yaml and then apply it: 149 | 150 | ```bash 151 | kubectl apply -f ./of-cert.yaml 152 | ``` 153 | 154 | In a couple of minutes cert-manager should fetch a wildcard certificate from letsencrypt.org: 155 | 156 | ```text 157 | kubectl -n istio-system describe certificate istio-gateway 
158 |
159 | Events:
160 |   Type    Reason      Age    From          Message
161 |   ----    ------      ----   ----          -------
162 |   Normal  CertIssued  1m52s  cert-manager  Certificate issued successfully
163 | ```
164 |
165 | Recreate Istio ingress gateway pods:
166 |
167 | ```bash
168 | kubectl -n istio-system delete pods -l istio=ingressgateway
169 | ```
170 |
171 | Note that the Istio gateway doesn't reload the certificates from the TLS secret when cert-manager renews them.
172 | Since the GKE cluster is made of preemptible VMs, the gateway pods will be replaced once every 24 hours. If you're not using
173 | preemptible nodes, you need to manually delete the gateway pods every two months, before the certificate expires.
174 |
175 | Next: [Expose services outside the service mesh](06-grafana-config.md)
176 |
--------------------------------------------------------------------------------
/docs/istio/06-grafana-config.md:
--------------------------------------------------------------------------------
1 | # Expose Grafana outside the service mesh
2 |
3 | In order to expose services via the Istio Gateway, you have to create a virtual service attached to the gateway.
4 |
5 | Create a virtual service in the `istio-system` namespace for Grafana (replace `example.com` with your domain):
6 |
7 | ```yaml
8 | apiVersion: networking.istio.io/v1alpha3
9 | kind: VirtualService
10 | metadata:
11 |   name: grafana
12 |   namespace: istio-system
13 | spec:
14 |   hosts:
15 |   - "grafana.example.com"
16 |   gateways:
17 |   - public-gateway.istio-system.svc.cluster.local
18 |   http:
19 |   - route:
20 |     - destination:
21 |         host: grafana
22 |     timeout: 30s
23 | ```
24 |
25 | Save the above resource as grafana-virtual-service.yaml and then apply it:
26 |
27 | ```bash
28 | kubectl apply -f ./grafana-virtual-service.yaml
29 | ```
30 |
31 | Navigate to `http://grafana.example.com` in your browser and you should be redirected to the HTTPS version.
32 |
33 | Check that HTTP/2 is enabled:
34 |
35 | ```bash
36 | curl -I --http2 https://grafana.example.com
37 |
38 | HTTP/2 200
39 | content-type: text/html; charset=UTF-8
40 | x-envoy-upstream-service-time: 3
41 | server: envoy
42 | ```
43 |
44 | Next: [A/B testing and canary deployments demo](/docs/apps/00-index.md)
45 |
--------------------------------------------------------------------------------
/docs/openfaas/00-index.md:
--------------------------------------------------------------------------------
1 | # OpenFaaS service mesh walkthrough
2 |
3 | This guide walks you through setting up OpenFaaS with Istio on Google Kubernetes Engine.
4 |
5 | ![openfaas-istio](https://github.com/stefanprodan/istio-gke/blob/master/docs/screens/openfaas-istio-diagram.png)
6 |
7 | At the end of this guide you will be running OpenFaaS with the following characteristics:
8 |
9 | * secure OpenFaaS ingress with Let’s Encrypt TLS and authentication
10 | * encrypted communication between OpenFaaS core services and functions with Istio mutual TLS
11 | * isolated functions with Istio Mixer rules
12 | * Jaeger tracing and Prometheus monitoring for function calls
13 | * canary deployments for OpenFaaS functions
14 |
15 | ### Labs
16 |
17 | * [Configure OpenFaaS mutual TLS](01-mtls-config.md)
18 | * [Configure OpenFaaS access policies](02-mixer-rules.md)
19 | * [Install OpenFaaS with Helm](03-openfaas-setup.md)
20 | * [Configure OpenFaaS Gateway to receive external traffic](04-gateway-config.md)
21 | * [Canary deployments for OpenFaaS functions](05-canary.md)
22 |
--------------------------------------------------------------------------------
/docs/openfaas/01-mtls-config.md:
--------------------------------------------------------------------------------
1 | # Configure OpenFaaS mutual TLS
2 |
3 | An OpenFaaS instance is composed of two namespaces: one for the core services and one for functions.
4 | In order to secure the communication between core services and functions, we need to enable mutual TLS on both namespaces.
5 |
6 | Create the OpenFaaS namespaces with Istio sidecar injection enabled:
7 |
8 | ```bash
9 | kubectl apply -f https://raw.githubusercontent.com/openfaas/faas-netes/master/namespaces.yml
10 | ```
11 |
12 | Enable mTLS on the `openfaas` namespace:
13 |
14 | ```yaml
15 | apiVersion: authentication.istio.io/v1alpha1
16 | kind: Policy
17 | metadata:
18 |   name: default
19 |   namespace: openfaas
20 | spec:
21 |   peers:
22 |   - mtls: {}
23 | ---
24 | apiVersion: networking.istio.io/v1alpha3
25 | kind: DestinationRule
26 | metadata:
27 |   name: default
28 |   namespace: openfaas
29 | spec:
30 |   host: "*.openfaas.svc.cluster.local"
31 |   trafficPolicy:
32 |     tls:
33 |       mode: ISTIO_MUTUAL
34 | ```
35 |
36 | Save the above resource as of-mtls.yaml and then apply it:
37 |
38 | ```bash
39 | kubectl apply -f ./of-mtls.yaml
40 | ```
41 |
42 | Allow plaintext traffic to NATS:
43 |
44 | ```yaml
45 | apiVersion: networking.istio.io/v1alpha3
46 | kind: DestinationRule
47 | metadata:
48 |   name: "nats-no-mtls"
49 |   namespace: openfaas
50 | spec:
51 |   host: "nats.openfaas.svc.cluster.local"
52 |   trafficPolicy:
53 |     tls:
54 |       mode: DISABLE
55 | ```
56 |
57 | Save the above resource as of-nats-no-mtls.yaml and then apply it:
58 |
59 | ```bash
60 | kubectl apply -f ./of-nats-no-mtls.yaml
61 | ```
62 |
63 | Enable mTLS on the `openfaas-fn` namespace:
64 |
65 | ```yaml
66 | apiVersion: authentication.istio.io/v1alpha1
67 | kind: Policy
68 | metadata:
69 |   name: default
70 |   namespace: openfaas-fn
71 | spec:
72 |   peers:
73 |   - mtls: {}
74 | ---
75 | apiVersion: networking.istio.io/v1alpha3
76 | kind: DestinationRule
77 | metadata:
78 |   name: default
79 |   namespace: openfaas-fn
80 | spec:
81 |   host: "*.openfaas-fn.svc.cluster.local"
82 |   trafficPolicy:
83 |     tls:
84 |       mode: ISTIO_MUTUAL
85 | ```
86 |
87 | Save the above resource as of-functions-mtls.yaml and then apply it:
88 |
89 | ```bash
90 | kubectl apply -f ./of-functions-mtls.yaml
91 | ```
92 |
93 | Next: [Configure OpenFaaS access policies](02-mixer-rules.md)
94 |
--------------------------------------------------------------------------------
/docs/openfaas/02-mixer-rules.md:
-------------------------------------------------------------------------------- 1 | # Configure OpenFaaS access policies 2 | 3 | Kubernetes namespaces alone offer only a logical separation between workloads. 4 | To prohibit functions from calling each other or from reaching 5 | the OpenFaaS core services we need to create Istio Mixer rules. 6 | 7 | Deny access to OpenFaaS core services from the `openfaas-fn` namespace except for system functions: 8 | 9 | ```yaml 10 | apiVersion: config.istio.io/v1alpha2 11 | kind: denier 12 | metadata: 13 | name: denyhandler 14 | namespace: openfaas 15 | spec: 16 | status: 17 | code: 7 18 | message: Not allowed 19 | --- 20 | apiVersion: config.istio.io/v1alpha2 21 | kind: checknothing 22 | metadata: 23 | name: denyrequest 24 | namespace: openfaas 25 | spec: 26 | --- 27 | apiVersion: config.istio.io/v1alpha2 28 | kind: rule 29 | metadata: 30 | name: denyopenfaasfn 31 | namespace: openfaas 32 | spec: 33 | match: destination.namespace == "openfaas" && source.namespace == "openfaas-fn" && source.labels["role"] != "openfaas-system" 34 | actions: 35 | - handler: denyhandler.denier 36 | instances: [ denyrequest.checknothing ] 37 | ``` 38 | 39 | Save the above resources as of-rules.yaml and then apply it: 40 | 41 | ```bash 42 | kubectl apply -f ./of-rules.yaml 43 | ``` 44 | 45 | Deny access to functions except for OpenFaaS core services: 46 | 47 | ```yaml 48 | apiVersion: config.istio.io/v1alpha2 49 | kind: denier 50 | metadata: 51 | name: denyhandler 52 | namespace: openfaas-fn 53 | spec: 54 | status: 55 | code: 7 56 | message: Not allowed 57 | --- 58 | apiVersion: config.istio.io/v1alpha2 59 | kind: checknothing 60 | metadata: 61 | name: denyrequest 62 | namespace: openfaas-fn 63 | spec: 64 | --- 65 | apiVersion: config.istio.io/v1alpha2 66 | kind: rule 67 | metadata: 68 | name: denyopenfaasfn 69 | namespace: openfaas-fn 70 | spec: 71 | match: destination.namespace == "openfaas-fn" && source.namespace != "openfaas" && source.labels["role"] != "openfaas-system" 72 | actions: 73 | - handler: denyhandler.denier 74 | instances: [ denyrequest.checknothing ] 75 | ``` 76 | 77 | Save the above resources as of-functions-rules.yaml and then apply it: 78 | 79 | ```bash 80 | kubectl apply -f ./of-functions-rules.yaml 81 | ``` 82 | 83 | Next: [Install OpenFaaS with Helm](03-openfaas-setup.md) 84 | -------------------------------------------------------------------------------- /docs/openfaas/03-openfaas-setup.md: -------------------------------------------------------------------------------- 1 | # Install OpenFaaS with Helm 2 | 3 | Before installing OpenFaaS you need to provide the basic authentication credential for the OpenFaaS gateway. 
4 |
5 | Create a secret named `basic-auth` in the `openfaas` namespace:
6 |
7 | ```bash
8 | # generate a random password
9 | password=$(head -c 12 /dev/urandom | shasum | cut -d' ' -f1)
10 |
11 | kubectl -n openfaas create secret generic basic-auth \
12 |   --from-literal=basic-auth-user=admin \
13 |   --from-literal=basic-auth-password=$password
14 | ```
15 |
16 | Add the OpenFaaS `helm` repository:
17 |
18 | ```bash
19 | helm repo add openfaas https://openfaas.github.io/faas-netes/
20 | ```
21 |
22 | Install OpenFaaS with Helm:
23 |
24 | ```bash
25 | helm upgrade --install openfaas openfaas/openfaas \
26 |   --namespace openfaas \
27 |   --set functionNamespace=openfaas-fn \
28 |   --set operator.create=true \
29 |   --set securityContext=true \
30 |   --set basic_auth=true \
31 |   --set exposeServices=false \
32 |   --set operator.createCRD=true
33 | ```
34 |
35 | Verify that OpenFaaS workloads are running:
36 |
37 | ```bash
38 | kubectl -n openfaas get pods
39 | ```
40 |
41 | Next: [Configure OpenFaaS Gateway to receive external traffic](04-gateway-config.md)
42 |
--------------------------------------------------------------------------------
/docs/openfaas/04-gateway-config.md:
--------------------------------------------------------------------------------
1 | # Configure OpenFaaS Gateway to receive external traffic
2 |
3 | Create an Istio virtual service for the OpenFaaS Gateway (replace `example.com` with your domain):
4 |
5 | ```yaml
6 | apiVersion: networking.istio.io/v1alpha3
7 | kind: VirtualService
8 | metadata:
9 |   name: gateway
10 |   namespace: openfaas
11 | spec:
12 |   hosts:
13 |   - "openfaas.example.com"
14 |   gateways:
15 |   - public-gateway.istio-system.svc.cluster.local
16 |   http:
17 |   - route:
18 |     - destination:
19 |         host: gateway
20 |     timeout: 30s
21 | ```
22 |
23 | Save the above resource as of-virtual-service.yaml and then apply it:
24 |
25 | ```bash
26 | kubectl apply -f ./of-virtual-service.yaml
27 | ```
28 |
29 | Wait for the OpenFaaS Gateway to come online:
30 |
31 | ```bash
32 | watch curl -v https://openfaas.example.com/healthz
33 | ```
34 |
35 | Save your credentials in the faas-cli store:
36 |
37 | ```bash
38 | echo $password | faas-cli login -g https://openfaas.example.com -u admin --password-stdin
39 | ```
40 |
41 | Next: [Canary deployments for OpenFaaS functions](05-canary.md)
42 |
--------------------------------------------------------------------------------
/docs/openfaas/05-canary.md:
--------------------------------------------------------------------------------
1 | # Canary deployments for OpenFaaS functions
2 |
3 | ![openfaas-canary](https://github.com/stefanprodan/istio-gke/blob/master/docs/screens/openfaas-istio-canary.png)
4 |
5 | Create a generally available (GA) release for the `env` function, version 1.0.0:
6 |
7 | ```yaml
8 | apiVersion: openfaas.com/v1alpha2
9 | kind: Function
10 | metadata:
11 |   name: env
12 |   namespace: openfaas-fn
13 | spec:
14 |   name: env
15 |   image: stefanprodan/of-env:1.0.0
16 |   resources:
17 |     requests:
18 |       memory: "32Mi"
19 |       cpu: "10m"
20 |     limits:
21 |       memory: "64Mi"
22 |       cpu: "100m"
23 | ```
24 |
25 | Save the above resource as env-ga.yaml and then apply it:
26 |
27 | ```bash
28 | kubectl apply -f ./env-ga.yaml
29 | ```
30 |
31 | Create a canary release for version 1.1.0:
32 |
33 | ```yaml
34 | apiVersion: openfaas.com/v1alpha2
35 | kind: Function
36 | metadata:
37 |   name: env-canary
38 |   namespace: openfaas-fn
39 | spec:
40 |   name: env-canary
41 |   image: stefanprodan/of-env:1.1.0
42 |   resources:
43 |     requests:
44 |       memory: "32Mi"
45 |       cpu: "10m"
46 |     limits:
47 |       memory: "64Mi"
48 |       cpu: "100m"
49 | ```
50 |
51 | Save the above resource as env-canary.yaml and then apply it:
52 |
53 | ```bash
54 | kubectl apply -f ./env-canary.yaml
55 | ```
56 |
57 | Create an Istio virtual service with 10% of the traffic going to the canary:
58 |
59 | ```yaml
60 | apiVersion: networking.istio.io/v1alpha3
61 | kind: VirtualService
62 | metadata:
63 |   name: env
64 |   namespace: openfaas-fn
65 | spec:
66 |   hosts:
67 |   - env
68 |   http:
69 |   - route:
70 |     - destination:
71 |         host: env
72 |       weight: 90
73 |     - destination:
74 |         host: env-canary
75 |       weight: 10
76 |     timeout: 30s
77 | ```
78 |
79 | Save the above resource as env-virtual-service.yaml and then apply it:
80 |
81 | ```bash
82 | kubectl apply -f ./env-virtual-service.yaml
83 | ```
84 |
85 | Test traffic routing (one in ten calls should hit the canary release):
86 |
87 | ```bash
88 | while true; do sleep 1; curl -sS https://openfaas.example.com/function/env | grep HOSTNAME; done
89 |
90 | HOSTNAME=env-59bf48fb9d-cjsjw
91 | HOSTNAME=env-59bf48fb9d-cjsjw
92 | HOSTNAME=env-59bf48fb9d-cjsjw
93 | HOSTNAME=env-59bf48fb9d-cjsjw
94 | HOSTNAME=env-59bf48fb9d-cjsjw
95 | HOSTNAME=env-59bf48fb9d-cjsjw
96 | HOSTNAME=env-59bf48fb9d-cjsjw
97 | HOSTNAME=env-59bf48fb9d-cjsjw
98 | HOSTNAME=env-59bf48fb9d-cjsjw
99 | HOSTNAME=env-canary-5dffdf4458-4vnn2
100 | ```
101 |
102 | Access the Jaeger dashboard using port forwarding:
103 |
104 | ```bash
105 | kubectl -n istio-system port-forward deployment/istio-tracing 16686:16686
106 | ```
107 |
108 | Tracing the GA release:
109 |
110 | ![ga-trace](https://github.com/stefanprodan/istio-gke/blob/master/docs/screens/openfaas-istio-ga-trace.png)
111 |
112 | Tracing the canary release:
113 |
114 | ![canary-trace](https://github.com/stefanprodan/istio-gke/blob/master/docs/screens/openfaas-istio-canary-trace.png)
115 |
116 | Monitor GA vs canary success rate and latency with Grafana:
117 |
118 | ![canary-prom](https://github.com/stefanprodan/istio-gke/blob/master/docs/screens/openfaas-istio-canary-prom.png)
119 |
--------------------------------------------------------------------------------
/docs/screens/grafana-403-errors.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stefanprodan/istio-gke/2ec4720cbd3569d68397ab85b81c43a29620f455/docs/screens/grafana-403-errors.png
--------------------------------------------------------------------------------
/docs/screens/istio-cert-manager-gcp.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stefanprodan/istio-gke/2ec4720cbd3569d68397ab85b81c43a29620f455/docs/screens/istio-cert-manager-gcp.png
--------------------------------------------------------------------------------
/docs/screens/istio-gcp-overview.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stefanprodan/istio-gke/2ec4720cbd3569d68397ab85b81c43a29620f455/docs/screens/istio-gcp-overview.png
--------------------------------------------------------------------------------
/docs/screens/jaeger-trace-list.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stefanprodan/istio-gke/2ec4720cbd3569d68397ab85b81c43a29620f455/docs/screens/jaeger-trace-list.png
--------------------------------------------------------------------------------
/docs/screens/openfaas-istio-canary-prom.png:
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/stefanprodan/istio-gke/2ec4720cbd3569d68397ab85b81c43a29620f455/docs/screens/openfaas-istio-canary-prom.png -------------------------------------------------------------------------------- /docs/screens/openfaas-istio-canary-trace.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stefanprodan/istio-gke/2ec4720cbd3569d68397ab85b81c43a29620f455/docs/screens/openfaas-istio-canary-trace.png -------------------------------------------------------------------------------- /docs/screens/openfaas-istio-canary.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stefanprodan/istio-gke/2ec4720cbd3569d68397ab85b81c43a29620f455/docs/screens/openfaas-istio-canary.png -------------------------------------------------------------------------------- /docs/screens/openfaas-istio-diagram.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stefanprodan/istio-gke/2ec4720cbd3569d68397ab85b81c43a29620f455/docs/screens/openfaas-istio-diagram.png -------------------------------------------------------------------------------- /docs/screens/openfaas-istio-ga-trace.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stefanprodan/istio-gke/2ec4720cbd3569d68397ab85b81c43a29620f455/docs/screens/openfaas-istio-ga-trace.png -------------------------------------------------------------------------------- /docs/screens/routing-desired-state.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stefanprodan/istio-gke/2ec4720cbd3569d68397ab85b81c43a29620f455/docs/screens/routing-desired-state.png -------------------------------------------------------------------------------- /docs/screens/routing-initial-state.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/stefanprodan/istio-gke/2ec4720cbd3569d68397ab85b81c43a29620f455/docs/screens/routing-initial-state.png --------------------------------------------------------------------------------