├── .gitignore
├── LICENSE
├── README.md
├── TWITCH.md
├── code
│   └── tf-cluster-asg
│       ├── README.md
│       ├── main.tf
│       ├── provider.tf
│       ├── user_data.sh
│       ├── variable.tf
│       └── versions.tf
├── img
│   └── video.png
└── objectives
├── objective1.md
├── objective2.md
├── objective3.md
├── objective4.md
└── objective5.md
/.gitignore:
--------------------------------------------------------------------------------
1 | # Local Terraform plugins
2 | **/.terraform/plugins/**
3 |
4 | # Terraform secrets files
5 | #*.tfvars
6 | #*.tfvars.*
7 |
8 | # Terraform state files
9 | *.tfstate
10 | *.tfstate.*
11 |
12 | # Crash log files
13 | crash.log
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2020 Wahl Network
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | 
2 |
3 | # Certified Kubernetes Administrator (CKA) Exam Study Guide
4 |
5 | 👋🙂 Welcome!
6 |
7 | This repository contains a study guide created in preparation for passing the Certified Kubernetes Administrator (CKA) exam. All of the content found here was livestreamed on [Twitch](https://www.twitch.tv/wahlnetwork) in collaboration with viewers.
8 |
9 | Each [Exam Objective](#exam-objectives) is broken down into helpful links, commands, videos, scripts, code samples, and more so that you can refer back to this guide during your studies. Everything here is open source and made by a community of inclusive and friendly folks. If you found this project helpful, why not give us a 🌟star🌟 to help increase awareness!
10 |
11 | - [Certified Kubernetes Administrator (CKA) Exam Study Guide](#certified-kubernetes-administrator-cka-exam-study-guide)
12 | - [Project Overview](#project-overview)
13 | - [Exam Objectives](#exam-objectives)
14 | - [Resources](#resources)
15 | - [📝 Official References](#-official-references)
16 | - [🎓 Online Training](#-online-training)
17 | - [🛠 Tools](#-tools)
18 | - [Managed Kubernetes Clusters](#managed-kubernetes-clusters)
19 | - [🤗 Community](#-community)
20 | - [The Fine Print](#the-fine-print)
21 | - [Disclaimer](#disclaimer)
22 | - [Contributing](#contributing)
23 | - [Code of Conduct](#code-of-conduct)
24 | - [License](#license)
25 |
26 | ## Project Overview
27 |
28 | Key things to know:
29 |
30 | - Task tracking is contained on [this Trello board](https://bit.ly/2SzlFRr).
31 | - The `main` branch contains all of the finished work.
32 | - The `draft` branch contains work-in-progress that needs to be polished, verified, and formatted.
33 |
34 | Additionally, you can watch this brief introduction video below:
35 |
36 | [![Introduction Video](img/video.png)](https://youtu.be/dkYCw88mWow)
37 |
38 | ## Exam Objectives
39 |
40 | The CNCF curriculum is posted [here](https://github.com/cncf/curriculum). The percentage after each objective is the relative score weight on the exam.
41 |
42 | - [Objective 1: Cluster Architecture, Installation & Configuration](objectives/objective1.md) (25%) ✔
43 | - [Objective 2: Workloads & Scheduling](objectives/objective2.md) (15%) ✔
44 | - [Objective 3: Services & Networking](objectives/objective3.md) (20%) ✔
45 | - [Objective 4: Storage](objectives/objective4.md) (10%) ✔
46 | - [Objective 5: Troubleshooting](objectives/objective5.md) (30%) ✔
47 |
48 | ## Resources
49 |
50 | Fantastic resources from around the world, sorted alphabetically.
51 |
52 | ### 📝 Official References
53 |
54 | - [Certified Kubernetes Administrator (CKA) Exam 1.19 Curriculum](https://github.com/cncf/curriculum/blob/master/CKA_Curriculum_v1.19.pdf)
55 | - [Certified Kubernetes Administrator Exam Registration](https://training.linuxfoundation.org/certification/certified-kubernetes-administrator-cka/)
56 | - [Enable kubectl autocompletion](https://kubernetes.io/docs/tasks/tools/install-kubectl/#enable-kubectl-autocompletion)
57 | - [kubectl Cheat Sheet](https://kubernetes.io/docs/reference/kubectl/cheatsheet/)
58 | - [kubectl Reference Docs](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands)
59 | - [Linux Foundation's Important Instructions: CKA and CKAD](https://docs.linuxfoundation.org/tc-docs/certification/tips-cka-and-ckad)
60 |
61 | ### 🎓 Online Training
62 |
63 | - [A Cloud Guru's Cloud Native Certified Kubernetes Administrator (CKA) Course](https://acloud.guru/learn/7f5137aa-2d26-4b19-8d8c-025b22667e76)
64 | - [Katacoda - Learn Kubernetes using Interactive Browser-Based Scenarios](https://www.katacoda.com/courses/kubernetes)
65 | - [Pluralsight CKA Learning Path](https://app.pluralsight.com/paths/certificate/certified-kubernetes-administrator) by author [Anthony Nocentino](https://app.pluralsight.com/profile/author/anthony-nocentino)
66 |
67 | ### 🛠 Tools
68 |
69 | - [Kubectl-fzf Autocomplete](https://github.com/bonnefoa/kubectl-fzf)
70 | - [Power tools for kubectl](https://github.com/ahmetb/kubectx)
71 |
72 | ### Managed Kubernetes Clusters
73 |
74 | - Google Kubernetes Engine (GKE)
75 | - [Creating a GKE Zonal Cluster](https://cloud.google.com/kubernetes-engine/docs/how-to/creating-a-zonal-cluster)
76 | - [Generating a kubeconfig entry](https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-access-for-kubectl#generate_kubeconfig_entry)
77 |
78 | > Read [Configure Access to Multiple Clusters](https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/) to switch between different clusters while studying.
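
A typical workflow for the steps above looks like the following sketch (assuming the `gcloud` CLI is installed and a GKE cluster already exists; the cluster, zone, and context names are placeholders):

```bash
# Add a kubeconfig entry for an existing GKE cluster (name and zone are placeholders)
gcloud container clusters get-credentials my-cka-lab --zone us-central1-a

# List the available contexts and switch between clusters while studying
kubectl config get-contexts
kubectl config use-context CONTEXT_NAME
```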
79 |
80 | ### 🤗 Community
81 |
82 | - [Best Practices for CKA Exam](https://medium.com/@emreodabas_20110/best-practices-for-cka-exam-9c1e51ea9b29)
83 | - [CKA-Study Guide](https://github.com/David-VTUK/CKA-StudyGuide) by [David-VTUK](https://github.com/David-VTUK)
84 | - [How I passed the CKA (Certified Kubernetes Administrator) Exam](https://medium.com/platformer-blog/how-i-passed-the-cka-certified-kubernetes-administrator-exam-8943aa24d71d)
85 | - [How to pass the New CKA exam released at September 2020](https://medium.com/@krishna.sharma1408/how-to-pass-the-new-cka-exam-released-at-september-2020-e0e014d67f78)
86 | - [Interesting Kubernetes application demos](https://www.virtuallyghetto.com/2020/06/interesting-kubernetes-application-demos.html)
87 | - [Kubernetes the Hard Way - Kelsey Hightower](https://github.com/kelseyhightower/kubernetes-the-hard-way)
88 | - [Kubernetes tools and resources from learnk8s](https://learnk8s.io/kubernetes-resources)
89 | - [Practice Enough With These 150 Questions for the CKAD Exam](https://medium.com/bb-tutorials-and-thoughts/practice-enough-with-these-questions-for-the-ckad-exam-2f42d1228552)
90 | - [Stack Overflow - Questions tagged kubernetes](https://stackoverflow.com/questions/tagged/kubernetes)
91 | - [Walidshaari's Kubernetes-Certified-Administrator Repo](https://github.com/walidshaari/Kubernetes-Certified-Administrator)
92 |
93 | ## The Fine Print
94 |
95 | ### Disclaimer
96 |
97 | Absolutely nothing in this organization is officially supported; use everything at your own risk.
98 |
99 | ### Contributing
100 |
101 | Contributions via GitHub pull requests are gladly accepted from their original author. Along with any pull requests, please state that the contribution is your original work and that you license the work to the project under the project's open source license. Whether or not you state this explicitly, by submitting any copyrighted material via pull request, email, or other means you agree to license the material under the project's open source license and warrant that you have the legal authority to do so.
102 |
103 | ### Code of Conduct
104 |
105 | All contributors are expected to abide by the [Code of Conduct](https://github.com/WahlNetwork/welcome/blob/master/COC.md).
106 |
107 | ### License
108 |
109 | Every repository in this organization has a license so that you can freely consume, distribute, and modify the content for non-commercial purposes. By default, the [MIT License](https://opensource.org/licenses/MIT) is used.
110 |
--------------------------------------------------------------------------------
/TWITCH.md:
--------------------------------------------------------------------------------
1 | # Become a Twitch Contributor
2 |
3 | Watching one of my Twitch live streams and want to earn a `contributor` badge for this repository? Make a contribution to this file!
4 |
5 | 1. Fork the repository to your user account.
6 | 2. Edit this file and add your username to the list below.
7 | 3. Commit the change to your forked copy.
8 | 4. Submit a pull request with the changes.
9 | 5. Once reviewed and merged, you will be listed as a `contributor`!
10 |
11 | ## Snazzy Folks
12 |
13 | - [Chris Wahl (Example)](https://github.com/chriswahl)
--------------------------------------------------------------------------------
/code/tf-cluster-asg/README.md:
--------------------------------------------------------------------------------
1 | # Terraform Plan to Create Auto Scaling Group
2 |
3 | This plan will create the following resources:
4 |
5 | - Launch Template with EC2 instances prepared to install Kubernetes with `kubeadm`
6 | - Auto Scaling group to deploy as many instances as your heart desires
7 |
8 | ## Instructions
9 |
10 | - Provide values for the variables defined in `variable.tf`, for example via a `terraform.tfvars` file (a sample is sketched after this list).
11 | - Optionally, edit `user_data.sh` if you wish to alter the startup script.
12 | - Run `terraform init` and `terraform validate` to ensure the code is loaded properly.
13 | - Run `terraform plan` to preview the changes against your environment.
14 | - When satisfied, run `terraform apply` to apply the plan and construct the Launch Template and Auto Scaling group.
15 | - If more or fewer nodes are needed:
16 |   - Modify the `node-count` variable value to the desired amount.
17 |   - Re-run `terraform apply` and the Auto Scaling group will create or destroy nodes to reach the new value.
18 | - When done, run `terraform destroy` to remove all resources and stop incurring charges.
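
Since none of the variables except `node-count` have defaults, a `terraform.tfvars` file is one way to supply them. A minimal sketch — every value below is a placeholder for your own AWS environment, and the security group and subnet values must match the `Name` tags of existing resources:

```bash
# Write placeholder variable values to terraform.tfvars, then run the plan
cat > terraform.tfvars <<'EOF'
node-count          = 2
instance-name       = "cka-node"
asg-name            = "cka-cluster"
keypair-name        = "my-keypair"
tag-environment     = "lab"
security-group-name = "cka-sg"
subnet-name         = "cka-subnet"
instance-type       = "t3a.small"
EOF

terraform init && terraform validate
terraform plan
terraform apply
```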
19 |
--------------------------------------------------------------------------------
/code/tf-cluster-asg/main.tf:
--------------------------------------------------------------------------------
1 | # Provides the security group id value
2 | data "aws_security_group" "sg" {
3 | tags = {
4 | Name = var.security-group-name
5 | }
6 | }
7 |
8 | # Provides the subnet id value
9 | data "aws_subnet" "subnet" {
10 | tags = {
11 | Name = var.subnet-name
12 | }
13 | }
14 |
15 | # Provides an AWS Launch Template for constructing EC2 instances
16 | resource "aws_launch_template" "cka-node" {
17 | name = var.instance-name
18 | image_id = "ami-07a29e5e945228fa1"
19 | instance_type = var.instance-type
20 | key_name = var.keypair-name
21 | vpc_security_group_ids = [data.aws_security_group.sg.id]
22 | block_device_mappings {
23 | device_name = "/dev/sda1"
24 | ebs {
25 | volume_size = 8
26 | encrypted = "true"
27 | }
28 | }
29 | tags = {
30 | environment = var.tag-environment
31 | source = "Terraform"
32 | }
33 | tag_specifications {
34 | resource_type = "instance"
35 | tags = {
36 | Name = var.instance-name
37 | environment = var.tag-environment
38 | source = "Terraform"
39 | }
40 | }
41 | tag_specifications {
42 | resource_type = "volume"
43 | tags = {
44 | Name = var.instance-name
45 | environment = var.tag-environment
46 | source = "Terraform"
47 | }
48 | }
49 | user_data = filebase64("user_data.sh")
50 | }
51 |
52 | # Provides an Auto Scaling group using instances described in the Launch Template
53 | resource "aws_autoscaling_group" "cka-cluster" {
54 | desired_capacity = var.node-count
55 | max_size = var.node-count
56 | min_size = var.node-count
57 | name = var.asg-name
58 | vpc_zone_identifier = [data.aws_subnet.subnet.id]
59 | launch_template {
60 | id = aws_launch_template.cka-node.id
61 | version = "$Latest"
62 | }
63 | }
64 |
--------------------------------------------------------------------------------
/code/tf-cluster-asg/provider.tf:
--------------------------------------------------------------------------------
1 | terraform {
2 | required_providers {
3 | aws = {
4 | source = "hashicorp/aws"
5 | version = "~>3.3.0"
6 | }
7 | }
8 | }
9 |
10 | provider "aws" {
11 | region = "us-west-2"
12 | }
13 |
--------------------------------------------------------------------------------
/code/tf-cluster-asg/user_data.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # Disable Swap
4 | sudo swapoff -a
5 |
6 | # Bridge Network
7 | sudo modprobe br_netfilter
8 | sudo cat <<'EOF' | sudo tee /etc/sysctl.d/k8s.conf
9 | net.bridge.bridge-nf-call-ip6tables = 1
10 | net.bridge.bridge-nf-call-iptables = 1
11 | EOF
12 | sudo sysctl --system
13 |
14 | # Install Docker
15 | sudo curl -fsSL https://get.docker.com -o /home/ubuntu/get-docker.sh
16 | sudo sh /home/ubuntu/get-docker.sh
17 |
18 | # Install Kube tools
19 | sudo apt-get update && sudo apt-get install -y apt-transport-https curl
20 | curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
21 | cat <<'EOF' | sudo tee /etc/apt/sources.list.d/kubernetes.list
22 | deb https://apt.kubernetes.io/ kubernetes-xenial main
23 | EOF
24 | sudo apt-get update
25 | sudo apt-get install -y kubelet kubeadm kubectl
26 | sudo apt-mark hold kubelet kubeadm kubectl
27 |
28 | # Setup aliases for the ubuntu user (user data runs as root, so target the home directory explicitly)
29 | printf "alias k=kubectl\ncomplete -F __start_kubectl k\n" | sudo tee /home/ubuntu/.bash_aliases
--------------------------------------------------------------------------------
/code/tf-cluster-asg/variable.tf:
--------------------------------------------------------------------------------
1 | variable "node-count" {
2 | default = 1
3 | description = "The quantity of EC2 instances to launch in the Auto Scaling group"
4 | type = number
5 | }
6 |
7 | variable "instance-name" {
8 | description = "The name of the EC2 instance"
9 | type = string
10 | }
11 |
12 | variable "asg-name" {
13 | description = "The name of the Auto Scaling group"
14 | type = string
15 | }
16 |
17 | variable "keypair-name" {
18 | description = "The name of the EC2 key pair"
19 | type = string
20 | }
21 |
22 | variable "tag-environment" {
23 |   description = "Assigns an AWS environment tag to resources"
24 | type = string
25 | }
26 |
27 | variable "security-group-name" {
28 | description = "The name of the VPC security group"
29 | type = string
30 | }
31 |
32 | variable "subnet-name" {
33 |   description = "The name of the VPC subnet"
34 | type = string
35 | }
36 |
37 | variable "instance-type" {
38 | description = "The type of EC2 instance to deploy"
39 | type = string
40 | }
41 |
--------------------------------------------------------------------------------
/code/tf-cluster-asg/versions.tf:
--------------------------------------------------------------------------------
1 | terraform {
2 | required_version = ">= 0.13"
3 | }
4 |
--------------------------------------------------------------------------------
/img/video.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/WahlNetwork/certified-kubernetes-administrator-cka-exam/09aea886364f16b1c91a8a2088ca408acce3fc2d/img/video.png
--------------------------------------------------------------------------------
/objectives/objective1.md:
--------------------------------------------------------------------------------
1 | # Objective 1: Cluster Architecture, Installation & Configuration
2 |
3 | - [Objective 1: Cluster Architecture, Installation & Configuration](#objective-1-cluster-architecture-installation--configuration)
4 | - [1.1 Manage Role Based Access Control (RBAC)](#11-manage-role-based-access-control-rbac)
5 | - [Lab Environment](#lab-environment)
6 | - [Lab Practice](#lab-practice)
7 | - [1.2 Use Kubeadm to Install a Basic Cluster](#12-use-kubeadm-to-install-a-basic-cluster)
8 | - [Kubeadm Tasks for All Nodes](#kubeadm-tasks-for-all-nodes)
9 | - [Kubeadm Tasks for Single Control Node](#kubeadm-tasks-for-single-control-node)
10 | - [Kubeadm Tasks for Worker Node(s)](#kubeadm-tasks-for-worker-nodes)
11 | - [Kubeadm Troubleshooting](#kubeadm-troubleshooting)
12 | - [Kubeadm Optional Tasks](#kubeadm-optional-tasks)
13 | - [1.3 Manage A Highly-Available Kubernetes Cluster](#13-manage-a-highly-available-kubernetes-cluster)
14 | - [HA Deployment Types](#ha-deployment-types)
15 | - [Upgrading from Single Control-Plane to High Availability](#upgrading-from-single-control-plane-to-high-availability)
16 | - [1.4 Provision Underlying Infrastructure to Deploy a Kubernetes Cluster](#14-provision-underlying-infrastructure-to-deploy-a-kubernetes-cluster)
17 | - [1.5 Perform a Version Upgrade on a Kubernetes Cluster using Kubeadm](#15-perform-a-version-upgrade-on-a-kubernetes-cluster-using-kubeadm)
18 | - [First Control Plane Node](#first-control-plane-node)
19 | - [Additional Control Plane Nodes](#additional-control-plane-nodes)
20 | - [Upgrade Control Plane Node Kubectl And Kubelet Tools](#upgrade-control-plane-node-kubectl-and-kubelet-tools)
21 | - [Upgrade Worker Nodes](#upgrade-worker-nodes)
22 | - [1.6 Implement Etcd Backup And Restore](#16-implement-etcd-backup-and-restore)
23 | - [Snapshot The Keyspace](#snapshot-the-keyspace)
24 | - [Restore From Snapshot](#restore-from-snapshot)
25 |
26 | ## 1.1 Manage Role Based Access Control (RBAC)
27 |
28 | Documentation and Resources:
29 |
30 | - [Kubectl Cheat Sheet](https://kubernetes.io/docs/reference/kubectl/cheatsheet/)
31 | - [Using RBAC Authorization](https://kubernetes.io/docs/reference/access-authn-authz/rbac/)
32 | - [A Practical Approach to Understanding Kubernetes Authorization](https://thenewstack.io/a-practical-approach-to-understanding-kubernetes-authorization/)
33 |
34 | RBAC is handled by roles (permissions) and bindings (assignment of permissions to subjects):
35 |
36 | | Object | Description |
37 | | -------------------- | -------------------------------------------------------------------------------------------- |
38 | | `Role` | Permissions within a particular namespace |
39 | | `ClusterRole` | Permissions to non-namespaced resources; can be used to grant the same permissions as a Role |
40 | | `RoleBinding` | Grants the permissions defined in a role to a user or set of users |
41 | | `ClusterRoleBinding` | Grant permissions across a whole cluster |
42 |
43 | ### Lab Environment
44 |
45 | If desired, use a managed Kubernetes cluster, such as Amazon EKS, to immediately begin working with RBAC. The command `aws --region REGION eks update-kubeconfig --name CLUSTERNAME` adds a cluster entry to the kubeconfig file on your workstation so that kubectl commands can reach the cluster.
46 |
47 | ### Lab Practice
48 |
49 | Create the `wahlnetwork1` namespace.
50 |
51 | `kubectl create namespace wahlnetwork1`
52 |
53 | ---
54 |
55 | Create a deployment in the `wahlnetwork1` namespace using the image of your choice:
56 |
57 | 1. `kubectl create deployment hello-node --image=k8s.gcr.io/echoserver:1.4 -n wahlnetwork1`
58 | 1. `kubectl create deployment busybox --image=busybox -n wahlnetwork1 -- sleep 2000`
59 |
60 | You can view the generated yaml by adding `--dry-run=client -o yaml` to the end of either command.
61 |
62 | ```yaml
63 | apiVersion: apps/v1
64 | kind: Deployment
65 | metadata:
66 | creationTimestamp: null
67 | labels:
68 | app: hello-node
69 | name: hello-node
70 | namespace: wahlnetwork1
71 | spec:
72 | replicas: 1
73 | selector:
74 | matchLabels:
75 | app: hello-node
76 | strategy: {}
77 | template:
78 | metadata:
79 | creationTimestamp: null
80 | labels:
81 | app: hello-node
82 | spec:
83 | containers:
84 | - image: k8s.gcr.io/echoserver:1.4
85 | name: echoserver
86 | resources: {}
87 | ```
88 |
89 | ---
90 |
91 | Create the `pod-reader` role in the `wahlnetwork1` namespace.
92 |
93 | `kubectl create role pod-reader --verb=get --verb=list --verb=watch --resource=pods -n wahlnetwork1`
94 |
95 | > Alternatively, use `kubectl create role pod-reader --verb=get --verb=list --verb=watch --resource=pods -n wahlnetwork1 --dry-run=client -o yaml` to output a proper yaml configuration.
96 |
97 | ```yaml
98 | apiVersion: rbac.authorization.k8s.io/v1
99 | kind: Role
100 | metadata:
101 | creationTimestamp: null
102 | name: pod-reader
103 | namespace: wahlnetwork1
104 | rules:
105 | - apiGroups:
106 | - ""
107 | resources:
108 | - pods
109 | verbs:
110 | - get
111 | - list
112 | - watch
113 | ```
114 |
115 | ---
116 |
117 | Create the `read-pods` rolebinding between the role named `pod-reader` and the user `spongebob` in the `wahlnetwork1` namespace.
118 |
119 | `kubectl create rolebinding --role=pod-reader --user=spongebob read-pods -n wahlnetwork1`
120 |
121 | > Alternatively, use `kubectl create rolebinding --role=pod-reader --user=spongebob read-pods -n wahlnetwork1 --dry-run=client -o yaml` to output a proper yaml configuration.
122 |
123 | ```yaml
124 | apiVersion: rbac.authorization.k8s.io/v1
125 | kind: RoleBinding
126 | metadata:
127 | creationTimestamp: null
128 | name: read-pods
129 | roleRef:
130 | apiGroup: rbac.authorization.k8s.io
131 | kind: Role
132 | name: pod-reader
133 | subjects:
134 | - apiGroup: rbac.authorization.k8s.io
135 | kind: User
136 | name: spongebob
137 | ```
138 |
139 | ---
140 |
141 | Create the `cluster-secrets-reader` clusterrole.
142 |
143 | `kubectl create clusterrole cluster-secrets-reader --verb=get --verb=list --verb=watch --resource=secrets`
144 |
145 | > Alternatively, use `kubectl create clusterrole cluster-secrets-reader --verb=get --verb=list --verb=watch --resource=secrets --dry-run=client -o yaml` to output a proper yaml configuration.
146 |
147 | ```yaml
148 | apiVersion: rbac.authorization.k8s.io/v1
149 | kind: ClusterRole
150 | metadata:
151 | creationTimestamp: null
152 | name: cluster-secrets-reader
153 | rules:
154 | - apiGroups:
155 | - ""
156 | resources:
157 | - secrets
158 | verbs:
159 | - get
160 | - list
161 | - watch
162 | ```
163 |
164 | ---
165 |
166 | Create the `cluster-read-secrets` clusterrolebinding between the clusterrole named `cluster-secrets-reader` and the user `gizmo`.
167 |
168 | `kubectl create clusterrolebinding --clusterrole=cluster-secrets-reader --user=gizmo cluster-read-secrets`
169 |
170 | > Alternatively, use `kubectl create clusterrolebinding --clusterrole=cluster-secrets-reader --user=gizmo cluster-read-secrets --dry-run=client -o yaml` to output a proper yaml configuration.
171 |
172 | ```yaml
173 | apiVersion: rbac.authorization.k8s.io/v1
174 | kind: ClusterRoleBinding
175 | metadata:
176 | creationTimestamp: null
177 | name: cluster-read-secrets
178 | roleRef:
179 | apiGroup: rbac.authorization.k8s.io
180 | kind: ClusterRole
181 | name: cluster-secrets-reader
182 | subjects:
183 | - apiGroup: rbac.authorization.k8s.io
184 | kind: User
185 | name: gizmo
186 | ```
187 |
188 | Test to see if this works by running the `auth` command.
189 |
190 | `kubectl auth can-i get secrets --as=gizmo`
191 |
192 | Attempt to get secrets as the `gizmo` user.
193 |
194 | `kubectl get secrets --as=gizmo`
195 |
196 | ```bash
197 | NAME TYPE DATA AGE
198 | default-token-lz87v kubernetes.io/service-account-token 3 7d1h
199 | ```
200 |
201 | ## 1.2 Use Kubeadm to Install a Basic Cluster
202 |
203 | Official documentation: [Creating a cluster with kubeadm](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/)
204 |
205 | > Terraform code is available [here](../code/tf-cluster-asg/) to create the resources necessary to experiment with `kubeadm`
206 |
207 | ### Kubeadm Tasks for All Nodes
208 |
209 | - Create Amazon EC2 Instances
210 | - Create an AWS Launch Template using an Ubuntu 18.04 LTS image (or newer) of size `t3a.small` (2 CPU, 2 GiB Memory).
211 | - Disable the [swap](https://askubuntu.com/questions/214805/how-do-i-disable-swap) file.
212 | - Note: This can be validated by using the console command `free` when SSH'd to the instance. The swap space total should be 0.
213 | - Consume this template as part of an Auto Scaling Group of 1 or more instances. This makes deployment of new instances and removal of old instances trivial.
214 | - [Configure iptables](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#letting-iptables-see-bridged-traffic)
215 | - This allows iptables to see bridged traffic.
216 | - [Install the Docker container runtime](https://kubernetes.io/docs/setup/production-environment/container-runtimes/#docker)
217 | - The [docker-install](https://github.com/docker/docker-install) script is handy for this.
218 | - [Install kubeadm, kubelet, and kubectl](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#installing-kubeadm-kubelet-and-kubectl)
219 |
220 | Alternatively, use a `user-data` bash script attached to the Launch Template:
221 |
222 | ```bash
223 | #!/bin/bash
224 |
225 | # Disable Swap
226 | sudo swapoff -a
227 |
228 | # Bridge Network
229 | sudo modprobe br_netfilter
230 | sudo cat <<'EOF' | sudo tee /etc/sysctl.d/k8s.conf
231 | net.bridge.bridge-nf-call-ip6tables = 1
232 | net.bridge.bridge-nf-call-iptables = 1
233 | EOF
234 | sudo sysctl --system
235 |
236 | # Install Docker
237 | sudo curl -fsSL https://get.docker.com -o /home/ubuntu/get-docker.sh
238 | sudo sh /home/ubuntu/get-docker.sh
239 |
240 | # Install Kube tools
241 | sudo apt-get update && sudo apt-get install -y apt-transport-https curl
242 | curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
243 | cat <<'EOF' | sudo tee /etc/apt/sources.list.d/kubernetes.list
244 | deb https://apt.kubernetes.io/ kubernetes-xenial main
245 | EOF
246 | sudo apt-get update
247 | sudo apt-get install -y kubelet kubeadm kubectl
248 | sudo apt-mark hold kubelet kubeadm kubectl
249 | ```
250 |
251 | Optionally, add `sudo kubeadm config images pull` to the end of the script to pre-pull images required for setting up a Kubernetes cluster.
252 |
253 | ```bash
254 | $ sudo kubeadm config images pull
255 |
256 | [config/images] Pulled k8s.gcr.io/kube-apiserver:v1.19.2
257 | [config/images] Pulled k8s.gcr.io/kube-controller-manager:v1.19.2
258 | [config/images] Pulled k8s.gcr.io/kube-scheduler:v1.19.2
259 | [config/images] Pulled k8s.gcr.io/kube-proxy:v1.19.2
260 | [config/images] Pulled k8s.gcr.io/pause:3.2
261 | [config/images] Pulled k8s.gcr.io/etcd:3.4.13-0
262 | [config/images] Pulled k8s.gcr.io/coredns:1.7.0
263 | ```
264 |
265 | ### Kubeadm Tasks for Single Control Node
266 |
267 | - Initialize the cluster
268 | - Choose your Container Network Interface (CNI) plugin. This guide uses [Calico's CNI](https://docs.projectcalico.org/about/about-calico).
269 | - Run `sudo kubeadm init --pod-network-cidr=192.168.0.0/16` to initialize the cluster and provide a pod network aligned to [Calico's default configuration](https://docs.projectcalico.org/getting-started/kubernetes/quickstart#create-a-single-host-kubernetes-cluster).
270 | - Write down the `kubeadm join` output to [join worker nodes](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#join-nodes) later in this guide.
271 | - Example `kubeadm join 10.0.0.100:6443 --token 12345678901234567890 --discovery-token-ca-cert-hash sha256:123456789012345678901234567890123456789012345678901234567890`
272 | - [Install Calico](https://docs.projectcalico.org/getting-started/kubernetes/quickstart)
273 | - [Configure local kubectl access](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#optional-controlling-your-cluster-from-machines-other-than-the-control-plane-node)
274 | - This step simply copies the `admin.conf` file into a location accessible for a regular user.
275 |
276 | Alternatively, use the [Flannel CNI](https://coreos.com/flannel/docs/latest/kubernetes.html); a combined command sketch for the control node follows this list.
277 |
278 | - Run `sudo kubeadm init --pod-network-cidr=10.244.0.0/16` to initialize the cluster and provide a pod network aligned to [Flannel's default configuration](https://github.com/coreos/flannel/blob/master/Documentation/kubernetes.md).
279 | - Note: The [`kube-flannel.yml`](https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml) file is hosted in the same location.
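
Putting the control node steps together, a minimal sketch looks like the following (run on the control plane instance; the Calico manifest URL changes over time, so confirm it against the quickstart linked above):

```bash
# Initialize the control plane with the Calico-friendly pod network CIDR
sudo kubeadm init --pod-network-cidr=192.168.0.0/16

# Make kubectl usable for the regular user by copying admin.conf (from the official kubeadm docs)
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Install the Calico CNI (apply kube-flannel.yml here instead if you chose Flannel)
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

# Confirm the node, CNI, and CoreDNS pods come up
kubectl get nodes
kubectl get pods -n kube-system
```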
280 |
281 | ### Kubeadm Tasks for Worker Node(s)
282 |
283 | - [Join the cluster](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#join-nodes)
284 | - Note: You can view the cluster config with `kubectl config view`. This includes the cluster server address (e.g. `server: https://10.0.0.100:6443`)
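
A hedged sketch of the join step (the endpoint, token, and hash are placeholders; use the values printed by `kubeadm init`):

```bash
# On each worker node, run the join command captured from kubeadm init
sudo kubeadm join 10.0.0.100:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>

# Back on the control plane node, confirm the workers register and become Ready
kubectl get nodes -o wide
```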
285 |
286 | ### Kubeadm Troubleshooting
287 |
288 | - If using `kubeadm init` without a pod network CIDR, the CoreDNS pods will remain [stuck in the Pending state](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/troubleshooting-kubeadm/#coredns-or-kube-dns-is-stuck-in-the-pending-state).
289 | - Broke the cluster and want to start over? Use `kubeadm reset` and `rm -rf .kube` in the user home directory to remove the old config and avoid [TLS certificate errors](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/troubleshooting-kubeadm/#tls-certificate-errors).
290 | - If you see `error: error loading config file "/etc/kubernetes/admin.conf": open /etc/kubernetes/admin.conf: permission denied`, it likely means the `KUBECONFIG` variable is set to that path; try `unset KUBECONFIG` to fall back to the `$HOME/.kube/config` file.
291 |
292 | ### Kubeadm Optional Tasks
293 |
294 | - [Install kubectl client locally on Windows](https://kubernetes.io/docs/tasks/tools/install-kubectl/#install-kubectl-on-windows) for those using this OS.
295 | - Single node cluster? [Taint the control node](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) to accept pods without dedicated worker nodes.
296 | - Deploy the "hello-node" app from the [minikube tutorial](https://kubernetes.io/docs/tutorials/hello-minikube/) to test basic functionality.
297 |
298 | ## 1.3 Manage A Highly-Available Kubernetes Cluster
299 |
300 | [High Availability Production Environment](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/)
301 |
302 | Kubernetes Components for HA:
303 |
304 | - Load Balancer / VIP
305 | - DNS records
306 | - etcd Endpoint
307 | - Certificates
308 | - Any other HA-specific configuration or settings
309 |
310 | ### HA Deployment Types
311 |
312 | - With stacked control plane nodes. This approach requires less infrastructure. The etcd members and control plane nodes are co-located.
313 | - With an external etcd cluster. This approach requires more infrastructure. The control plane nodes and etcd members are separated. ([source](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/high-availability/))
314 |
315 | ### Upgrading from Single Control-Plane to High Availability
316 |
317 | If you have plans to upgrade this single control-plane kubeadm cluster to high availability, you should specify the `--control-plane-endpoint` flag to set the shared endpoint for all control-plane nodes. Such an endpoint can be either a DNS name or an IP address of a load-balancer. ([source](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#initializing-your-control-plane-node))
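
A hedged sketch of that init command (the endpoint below is a placeholder for your load balancer's DNS name or VIP):

```bash
# Initialize the first control plane node against a shared endpoint so that
# additional control plane nodes can join later (--upload-certs shares the certificates)
sudo kubeadm init \
  --control-plane-endpoint "k8s-api.example.com:6443" \
  --upload-certs \
  --pod-network-cidr=192.168.0.0/16
```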
318 |
319 | ## 1.4 Provision Underlying Infrastructure to Deploy a Kubernetes Cluster
320 |
321 | See Objective [1.2 Use Kubeadm to Install a Basic Cluster](#12-use-kubeadm-to-install-a-basic-cluster).
322 |
323 | > Note: Make sure that swap is disabled on all nodes.
324 |
325 | ## 1.5 Perform a Version Upgrade on a Kubernetes Cluster using Kubeadm
326 |
327 | - [Upgrading kubeadm clusters](https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/)
328 | - [Safely Drain a Node while Respecting the PodDisruptionBudget](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/)
329 | - [Cluster Management: Maintenance on a Node](https://kubernetes.io/docs/tasks/administer-cluster/cluster-management/#maintenance-on-a-node)
330 |
331 | > Note: All containers are restarted after upgrade, because the container spec hash value is changed. Upgrades are constrained from one minor version to the next minor version.
332 |
333 | ### First Control Plane Node
334 |
335 | Update the kubeadm tool and verify the new version
336 |
337 | > Note: The `--allow-change-held-packages` flag is needed because the kubeadm package is held to prevent unintended automatic upgrades.
338 |
339 | ```bash
340 | apt-get update && \
341 | apt-get install -y --allow-change-held-packages kubeadm=1.19.x-00
342 | kubeadm version
343 | ```
344 |
345 | ---
346 |
347 | [Drain](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#drain) the node to mark it as unschedulable
348 |
349 | `kubectl drain $NODENAME --ignore-daemonsets`
350 |
351 | Drain Diagram
352 |
357 | ---
358 |
359 | Perform an upgrade plan to validate that your cluster can be upgraded
360 |
361 | > Note: This also fetches the versions you can upgrade to and shows a table with the component config version states.
362 |
363 | `sudo kubeadm upgrade plan`
364 |
365 | ---
366 |
367 | Upgrade the cluster
368 |
369 | `sudo kubeadm upgrade apply v1.19.x`
370 |
371 | ---
372 |
373 | [Uncordon](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#uncordon) the node to mark it as schedulable
374 |
375 | `kubectl uncordon $NODENAME`
376 |
377 | ### Additional Control Plane Nodes
378 |
379 | Repeat the first control plane node steps, replacing the "upgrade the cluster" step with the command below:
380 |
381 | `sudo kubeadm upgrade node`
382 |
383 | ### Upgrade Control Plane Node Kubectl And Kubelet Tools
384 |
385 | Upgrade the kubelet and kubectl on all control plane nodes
386 |
387 | ```bash
388 | apt-get update && \
389 | apt-get install -y --allow-change-held-packages kubelet=1.19.x-00 kubectl=1.19.x-00
390 | ```
391 |
392 | ---
393 |
394 | Restart the kubelet
395 |
396 | ```bash
397 | sudo systemctl daemon-reload
398 | sudo systemctl restart kubelet
399 | ```
400 |
401 | ### Upgrade Worker Nodes
402 |
403 | Upgrade kubeadm
404 |
405 | ```bash
406 | apt-get update && \
407 | apt-get install -y --allow-change-held-packages kubeadm=1.19.x-00
408 | ```
409 |
410 | ---
411 |
412 | Drain the node
413 |
414 | `kubectl drain $NODENAME --ignore-daemonsets`
415 |
416 | ---
417 |
418 | Upgrade the kubelet configuration
419 |
420 | `sudo kubeadm upgrade node`
421 |
422 | ---
423 |
424 | Upgrade kubelet and kubectl
425 |
426 | ```bash
427 | apt-get update && \
428 | apt-get install -y --allow-change-held-packages kubelet=1.19.x-00 kubectl=1.19.x-00
429 |
430 | sudo systemctl daemon-reload
431 | sudo systemctl restart kubelet
432 | ```
433 |
434 | ---
435 |
436 | Uncordon the node
437 |
438 | `kubectl uncordon $NODENAME`
439 |
440 | ## 1.6 Implement Etcd Backup And Restore
441 |
442 | - [Operating etcd clusters for Kubernetes: Backing up an etcd cluster](https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/#backing-up-an-etcd-cluster)
443 | - [Etcd Documentation: Disaster Recovery](https://etcd.io/docs/v3.4.0/op-guide/recovery/)
444 | - [Kubernetes Tips: Backup and Restore Etcd](https://medium.com/better-programming/kubernetes-tips-backup-and-restore-etcd-97fe12e56c57)
445 |
446 | ### Snapshot The Keyspace
447 |
448 | Use `etcdctl snapshot save`.
449 |
450 | Snapshot the keyspace served by `$ENDPOINT` to the file `snapshot.db`:
451 |
452 | `ETCDCTL_API=3 etcdctl --endpoints $ENDPOINT snapshot save snapshot.db`
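
On a kubeadm-built cluster, the etcd endpoint is protected by TLS, so the command usually needs certificate flags as well. A sketch using the typical kubeadm certificate paths (verify the paths on your own control plane node):

```bash
# Snapshot etcd on a kubeadm control plane node; certificate paths are the kubeadm defaults
sudo ETCDCTL_API=3 etcdctl snapshot save /tmp/snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Confirm the snapshot is readable
sudo ETCDCTL_API=3 etcdctl snapshot status /tmp/snapshot.db --write-out=table
```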
453 |
454 | ### Restore From Snapshot
455 |
456 | Use `etcdctl snapshot restore`.
457 |
458 | > Note: Restoring overwrites some snapshot metadata (specifically, the member ID and cluster ID); the member loses its former identity.
459 | >
460 | > Note: Snapshot integrity is verified when restoring from a snapshot using an integrity hash created by `etcdctl snapshot save`, but not when restoring from a file copy.
461 |
462 | Create new etcd data directories (m1.etcd, m2.etcd, m3.etcd) for a three-member cluster:
463 |
464 | ```bash
465 | $ ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
466 | --name m1 \
467 | --initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
468 | --initial-cluster-token etcd-cluster-1 \
469 | --initial-advertise-peer-urls http://host1:2380
470 | $ ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
471 | --name m2 \
472 | --initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
473 | --initial-cluster-token etcd-cluster-1 \
474 | --initial-advertise-peer-urls http://host2:2380
475 | $ ETCDCTL_API=3 etcdctl snapshot restore snapshot.db \
476 | --name m3 \
477 | --initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
478 | --initial-cluster-token etcd-cluster-1 \
479 | --initial-advertise-peer-urls http://host3:2380
480 | ```
481 |
--------------------------------------------------------------------------------
/objectives/objective2.md:
--------------------------------------------------------------------------------
1 | # Objective 2: Workloads & Scheduling
2 |
3 | - [Objective 2: Workloads & Scheduling](#objective-2-workloads--scheduling)
4 | - [2.1 Understand Deployments And How To Perform Rolling Update And Rollbacks](#21-understand-deployments-and-how-to-perform-rolling-update-and-rollbacks)
5 | - [Create Deployment](#create-deployment)
6 | - [Perform Rolling Update](#perform-rolling-update)
7 | - [Perform Rollbacks](#perform-rollbacks)
8 | - [2.2 Use Configmaps And Secrets To Configure Applications](#22-use-configmaps-and-secrets-to-configure-applications)
9 | - [Configmaps](#configmaps)
10 | - [Secrets](#secrets)
11 | - [Other Concepts](#other-concepts)
12 | - [2.3 Know How To Scale Applications](#23-know-how-to-scale-applications)
13 | - [2.4 Understand The Primitives Used To Create Robust, Self-Healing, Application Deployments](#24-understand-the-primitives-used-to-create-robust-self-healing-application-deployments)
14 | - [2.5 Understand How Resource Limits Can Affect Pod Scheduling](#25-understand-how-resource-limits-can-affect-pod-scheduling)
15 | - [2.6 Awareness Of Manifest Management And Common Templating Tools](#26-awareness-of-manifest-management-and-common-templating-tools)
16 |
17 | ## 2.1 Understand Deployments And How To Perform Rolling Update And Rollbacks
18 |
19 | [Official Documentation](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#use-case)
20 |
21 | Deployments are used to manage Pods and ReplicaSets in a declarative manner.
22 |
23 | ### Create Deployment
24 |
25 | Using the [nginx](https://hub.docker.com/_/nginx) image on Docker Hub, we can use a Deployment to push any number of replicas of that image to the cluster.
26 |
27 | Create the `nginx` deployment in the `wahlnetwork1` namespace.
28 |
29 | `kubectl create deployment nginx --image=nginx --replicas=3 -n wahlnetwork1`
30 |
31 | > Alternatively, use `kubectl create deployment nginx --image=nginx --replicas=3 -n wahlnetwork1 --dry-run=client -o yaml` to output a proper yaml configuration.
32 |
33 | ```yaml
34 | apiVersion: apps/v1
35 | kind: Deployment
36 | metadata:
37 | creationTimestamp: null
38 | labels:
39 | app: nginx
40 | name: nginx
41 | namespace: wahlnetwork1
42 | spec:
43 | replicas: 3
44 | selector:
45 | matchLabels:
46 | app: nginx
47 | strategy: {}
48 | template:
49 | metadata:
50 | creationTimestamp: null
51 | labels:
52 | app: nginx
53 | spec:
54 | containers:
55 | - image: nginx
56 | name: nginx
57 | resources: {}
58 | ```
59 |
60 | ### Perform Rolling Update
61 |
62 | [Official Documentation](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#updating-a-deployment)
63 |
64 | A rolling update is used to make changes to the pod's template and roll them out to the cluster. It is triggered when data within `.spec.template` is changed.
65 |
66 | Update the `nginx` deployment in the `wahlnetwork1` namespace to use version `1.16.1`
67 |
68 | `kubectl set image deployment/nginx nginx=nginx:1.16.1 -n wahlnetwork1 --record`
69 |
70 | Track the rollout status.
71 |
72 | `kubectl rollout status deployment.v1.apps/nginx -n wahlnetwork1`
73 |
74 | ```bash
75 | Waiting for deployment "nginx" rollout to finish: 1 out of 2 new replicas have been updated...
76 | Waiting for deployment "nginx" rollout to finish: 1 out of 2 new replicas have been updated...
77 | Waiting for deployment "nginx" rollout to finish: 1 out of 2 new replicas have been updated...
78 | Waiting for deployment "nginx" rollout to finish: 1 old replicas are pending termination...
79 | Waiting for deployment "nginx" rollout to finish: 1 old replicas are pending termination...
80 | deployment "nginx" successfully rolled out
81 | ```
82 |
83 | ### Perform Rollbacks
84 |
85 | [Official Documentation](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#rolling-back-a-deployment)
86 |
87 | Rollbacks offer a method for reverting the changes to a pod's `.spec.template` data to a previous version. By default, executing the `rollout undo` command will revert to the previous version. The desired version can also be declared.
88 |
89 | Review the version history for the `nginx` deployment in the `wahlnetwork1` namespace. In this scenario, revisions 1-4 have already been made to simulate a deployment lifecycle. The 4th revision specifies a fake image version of `1.222222222222` to force a rolling update failure.
90 |
91 | `kubectl rollout history deployment.v1.apps/nginx -n wahlnetwork1`
92 |
93 | ```bash
94 | deployment.apps/nginx
95 | REVISION CHANGE-CAUSE
96 | 1
97 | 2 kubectl.exe set image deployment/nginx nginx=nginx:1.16.1 --record=true --namespace=wahlnetwork1
98 | 3 kubectl.exe set image deployment/nginx nginx=nginx:1.14.1 --record=true --namespace=wahlnetwork1
99 | 4 kubectl.exe set image deployment/nginx nginx=nginx:1.222222222222 --record=true --namespace=wahlnetwork1
100 | ```
101 |
102 | Revert to the previous version of the `nginx` deployment to use image version `1.14.1`. This forces revision 3 to become revision 5. Note that revision 3 no longer exists.
103 |
104 | `kubectl rollout undo deployment.v1.apps/nginx -n wahlnetwork1`
105 |
106 | ```bash
107 | deployment.apps/nginx rolled back
108 |
109 | ~ kubectl rollout history deployment.v1.apps/nginx -n wahlnetwork1
110 |
111 | deployment.apps/nginx
112 | REVISION CHANGE-CAUSE
113 | 1
114 | 2 kubectl.exe set image deployment/nginx nginx=nginx:1.16.1 --record=true --namespace=wahlnetwork1
115 | 4 kubectl.exe set image deployment/nginx nginx=nginx:1.222222222222 --record=true --namespace=wahlnetwork1
116 | 5 kubectl.exe set image deployment/nginx nginx=nginx:1.14.1 --record=true --namespace=wahlnetwork1
117 | ```
118 |
119 | Revert to revision 2 of the `nginx` deployment, which becomes revision 6 (the next available revision number). Note that revision 2 no longer exists.
120 |
121 | `kubectl rollout undo deployment.v1.apps/nginx -n wahlnetwork1 --to-revision=2`
122 |
123 | ```bash
124 | ~ kubectl rollout history deployment.v1.apps/nginx -n wahlnetwork1
125 |
126 | deployment.apps/nginx
127 | REVISION CHANGE-CAUSE
128 | 1
129 | 4 kubectl.exe set image deployment/nginx nginx=nginx:1.222222222222 --record=true --namespace=wahlnetwork1
130 | 5 kubectl.exe set image deployment/nginx nginx=nginx:1.14.1 --record=true --namespace=wahlnetwork1
131 | 6 kubectl.exe set image deployment/nginx nginx=nginx:1.16.1 --record=true --namespace=wahlnetwork1
132 | ```
133 |
134 | ## 2.2 Use Configmaps And Secrets To Configure Applications
135 |
136 | ### Configmaps
137 |
138 | API object used to store non-confidential data in key-value pairs
139 |
140 | - [Official Documentation](https://kubernetes.io/docs/concepts/configuration/configmap/)
141 | - [Configure a Pod to Use a ConfigMap](https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/)
142 |
143 | Create a configmap named `game-config` using a directory.
144 |
145 | `kubectl create configmap game-config --from-file=/code/configmap/`
146 |
147 | ```bash
148 | ~ k describe configmap game-config
149 |
150 | Name: game-config
151 | Namespace: default
152 | Labels:       <none>
153 | Annotations:  <none>
154 |
155 | Data
156 | ====
157 | game.properties:
158 | ----
159 | enemies=aliens
160 | lives=3
161 | enemies.cheat=true
162 | enemies.cheat.level=noGoodRotten
163 | secret.code.passphrase=UUDDLRLRBABAS
164 | secret.code.allowed=true
165 | secret.code.lives=30
166 |
167 | ui.properties:
168 | ----
169 | color.good=purple
170 | color.bad=yellow
171 | allow.textmode=true
172 | how.nice.to.look=fairlyNice
173 |
174 | Events:  <none>
175 | ```
176 |
177 | Create a configmap named `game-config-2` using a file.
178 |
179 | `kubectl create configmap game-config-2 --from-file=/code/configmap/game.properties`
180 |
181 | Create a configmap named `game-config-env-file` using an env-file.
182 |
183 | `kubectl create configmap game-config-env-file --from-env-file=/code/configmap/game-env-file.properties`
184 |
185 | Create a configmap named `special-config` using a literal key/value pair.
186 |
187 | `kubectl create configmap special-config --from-literal=special.how=very`
188 |
189 | Edit a configmap named `game-config`.
190 |
191 | `kubectl edit configmap game-config`
192 |
193 | Get a configmap named `game-config` and output the response into yaml.
194 |
195 | `kubectl get configmaps game-config -o yaml`
196 |
197 | Use a configmap with a pod by declaring a value for `.spec.containers.env.name.valueFrom.configMapKeyRef`.
198 |
199 | ```yaml
200 | apiVersion: v1
201 | kind: Pod
202 | metadata:
203 | name: dapi-test-pod
204 | spec:
205 | containers:
206 | - name: test-container
207 | image: k8s.gcr.io/busybox
208 | command: ["/bin/sh", "-c", "env"]
209 | env:
210 | # Define the environment variable
211 | - name: SPECIAL_LEVEL_KEY
212 | valueFrom:
213 | configMapKeyRef:
214 | # The ConfigMap containing the value you want to assign to SPECIAL_LEVEL_KEY
215 | name: special-config
216 | # Specify the key associated with the value
217 | key: special.how
218 | restartPolicy: Never
219 | ```
220 |
221 | Investigate the configmap value `very` from the key `SPECIAL_LEVEL_KEY` by reviewing the logs for the pod or by connecting to the pod directly.
222 |
223 | `kubectl exec -n wahlnetwork1 --stdin nginx-6889dfccd5-msmn8 --tty -- /bin/bash`
224 |
225 | ```bash
226 | ~ kubectl logs dapi-test-pod
227 |
228 | KUBERNETES_SERVICE_PORT=443
229 | KUBERNETES_PORT=tcp://10.96.0.1:443
230 | HOSTNAME=dapi-test-pod
231 | SHLVL=1
232 | HOME=/root
233 | KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
234 | PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
235 | KUBERNETES_PORT_443_TCP_PORT=443
236 | KUBERNETES_PORT_443_TCP_PROTO=tcp
237 | SPECIAL_LEVEL_KEY=very
238 | KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
239 | KUBERNETES_SERVICE_PORT_HTTPS=443
240 | PWD=/
241 | KUBERNETES_SERVICE_HOST=10.96.0.1
242 | ```
243 |
244 | ### Secrets
245 |
246 | - [Managing Secret using kubectl](https://kubernetes.io/docs/tasks/configmap-secret/managing-secret-using-kubectl/)
247 | - [Using Secrets](https://kubernetes.io/docs/concepts/configuration/secret/#using-secrets)
248 |
249 | Create a secret named `db-user-pass` using files.
250 |
251 | ```bash
252 | kubectl create secret generic db-user-pass \
253 |   --from-file=./username.txt \
254 |   --from-file=./password.txt
255 | ```
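
The two input files need to exist first. If you just want throwaway values for practice, something like this works (the contents are arbitrary placeholders):

```bash
# Create sample credential files for the secret (values are placeholders)
echo -n 'admin' > ./username.txt
echo -n 'Sup3rSecretPw' > ./password.txt
```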
256 |
257 | A custom key name can be set by prefixing the file path with the desired key name. For example, setting the key names to `funusername` and `funpassword` can be done as shown below:
258 |
259 | ```bash
260 | kubectl create secret generic fundb-user-pass \
261 |   --from-file=funusername=./username.txt \
262 |   --from-file=funpassword=./password.txt
263 | ```
264 |
265 | Check to make sure the key names match the defined names.
266 |
267 | `kubectl describe secret fundb-user-pass`
268 |
269 | ```bash
270 | Name: fundb-user-pass
271 | Namespace: default
272 | Labels:       <none>
273 | Annotations:  <none>
274 |
275 | Type: Opaque
276 |
277 | Data
278 | ====
279 | funpassword: 14 bytes
280 | funusername: 7 bytes
281 | ```
282 |
283 | Get secret values from `db-user-pass`.
284 |
285 | `kubectl get secret db-user-pass -o jsonpath='{.data}'`
286 |
287 | Edit secret values using the `edit` command.
288 |
289 | `kubectl edit secrets db-user-pass`
290 |
291 | ```yaml
292 | apiVersion: v1
293 | data:
294 | password.txt: PASSWORD
295 | username.txt: USERNAME
296 | kind: Secret
297 | metadata:
298 | creationTimestamp: "2020-10-13T22:48:27Z"
299 | name: db-user-pass
300 | namespace: default
301 | resourceVersion: "1022459"
302 | selfLink: /api/v1/namespaces/default/secrets/db-user-pass
303 | uid: 6bb24810-dd33-4b92-9a37-424f3c7553b6
304 | type: Opaque
305 | ```
306 |
307 | Use a secret with a pod by declaring a value for `.spec.containers.env.name.valueFrom.secretKeyRef`.
308 |
309 | ```yaml
310 | apiVersion: v1
311 | kind: Pod
312 | metadata:
313 | name: secret-env-pod
314 | spec:
315 | containers:
316 | - name: mycontainer
317 | image: redis
318 | env:
319 | - name: SECRET_USERNAME
320 | valueFrom:
321 | secretKeyRef:
322 | name: mysecret
323 | key: username
324 | - name: SECRET_PASSWORD
325 | valueFrom:
326 | secretKeyRef:
327 | name: mysecret
328 | key: password
329 | restartPolicy: Never
330 | ```
331 |
332 | ### Other Concepts
333 |
334 | - [Using imagePullSecrets](https://kubernetes.io/docs/concepts/configuration/secret/#using-imagepullsecrets)
335 |
336 | ## 2.3 Know How To Scale Applications
337 |
338 | Scaling is accomplished by changing the number of replicas in a Deployment.
339 |
340 | - [Running Multiple Instances of Your App](https://kubernetes.io/docs/tutorials/kubernetes-basics/scale/scale-intro/)
341 |
342 | Scale a deployment named `nginx` from 3 to 4 replicas.
343 |
344 | `kubectl scale deployments/nginx --replicas=4`
345 |
346 | ## 2.4 Understand The Primitives Used To Create Robust, Self-Healing, Application Deployments
347 |
348 | - Don't use naked Pods (that is, Pods not bound to a ReplicaSet or Deployment) if you can avoid it. Naked Pods will not be rescheduled in the event of a node failure. ([source](https://kubernetes.io/docs/concepts/configuration/overview/#naked-pods-vs-replicasets-deployments-and-jobs))
349 | - A Deployment, which both creates a ReplicaSet to ensure that the desired number of Pods is always available, and specifies a strategy to replace Pods (such as RollingUpdate), is almost always preferable to creating Pods directly, except for some explicit `restartPolicy: Never` scenarios. A Job may also be appropriate. ([source](https://kubernetes.io/docs/concepts/configuration/overview/#naked-pods-vs-replicasets-deployments-and-jobs))
350 | - Define and use labels that identify semantic attributes of your application or Deployment, such as `{ app: myapp, tier: frontend, phase: test, deployment: v3 }`. ([source](https://kubernetes.io/docs/concepts/configuration/overview/#using-labels))
351 |
352 | ## 2.5 Understand How Resource Limits Can Affect Pod Scheduling
353 |
354 | Resource requests and limits are the mechanism for controlling the amount of compute resources, most commonly CPU and memory, that a container needs and is allowed to consume.
355 |
356 | - Limits set an upper boundary on the amount of resources a container is allowed to consume from the host.
357 | - Requests set the amount of resources guaranteed to a container; the scheduler only places a pod on a node with enough unreserved capacity to satisfy its requests.
358 | - If a limit is set without a request, the request value is set to equal the limit value.
359 | - [Managing Resources for Containers](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/)
360 | - [Resource Quotas](https://kubernetes.io/docs/concepts/policy/resource-quotas/)
361 |
362 | Here is an example of a pod configured with resource requests and limits.
363 |
364 | ```yaml
365 | apiVersion: v1
366 | kind: Pod
367 | metadata:
368 | name: frontend
369 | spec:
370 | containers:
371 | - name: app
372 | image: images.my-company.example/app:v4
373 | resources:
374 | requests:
375 | memory: "64Mi"
376 | cpu: "250m"
377 | limits:
378 | memory: "128Mi"
379 | cpu: "500m"
380 | - name: log-aggregator
381 | image: images.my-company.example/log-aggregator:v6
382 | resources:
383 | requests:
384 | memory: "64Mi"
385 | cpu: "250m"
386 | limits:
387 | memory: "128Mi"
388 | cpu: "500m"
389 | ```
390 |
391 | ## 2.6 Awareness Of Manifest Management And Common Templating Tools
392 |
393 | - [Templating YAML in Kubernetes with real code](https://learnk8s.io/templating-yaml-with-code)
394 | - [yq](https://github.com/kislyuk/yq): Command-line YAML/XML processor
395 | - [kustomize](https://github.com/kubernetes-sigs/kustomize): lets you customize raw, template-free YAML files for multiple purposes, leaving the original YAML untouched and usable as is.
396 | - [Helm](https://github.com/helm/helm): A tool for managing Charts. Charts are packages of pre-configured Kubernetes resources.
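
As a quick taste of the two most common tools, a hedged sketch (the overlay directory, repository, and release names are placeholders; the Bitnami repository is only an example):

```bash
# kustomize is built into kubectl: render or apply an overlay directory containing a kustomization.yaml
kubectl kustomize ./overlays/dev
kubectl apply -k ./overlays/dev

# Helm installs pre-packaged charts: add a chart repository, then install a release from it
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-nginx bitnami/nginx
```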
397 |
--------------------------------------------------------------------------------
/objectives/objective3.md:
--------------------------------------------------------------------------------
1 | # Objective 3: Services & Networking
2 |
3 | - [Objective 3: Services & Networking](#objective-3-services--networking)
4 | - [3.1 Understand Host Networking Configuration On The Cluster Nodes](#31-understand-host-networking-configuration-on-the-cluster-nodes)
5 | - [3.2 Understand Connectivity Between Pods](#32-understand-connectivity-between-pods)
6 | - [3.3 Understand ClusterIP, NodePort, LoadBalancer Service Types And Endpoints](#33-understand-clusterip-nodeport-loadbalancer-service-types-and-endpoints)
7 | - [ClusterIP](#clusterip)
8 | - [NodePort](#nodeport)
9 | - [LoadBalancer](#loadbalancer)
10 | - [ExternalIP](#externalip)
11 | - [ExternalName](#externalname)
12 | - [Networking Cleanup for Objective 3.3](#networking-cleanup-for-objective-33)
13 | - [3.4 Know How To Use Ingress Controllers And Ingress Resources](#34-know-how-to-use-ingress-controllers-and-ingress-resources)
14 | - [3.5 Know How To Configure And Use CoreDNS](#35-know-how-to-configure-and-use-coredns)
15 | - [3.6 Choose An Appropriate Container Network Interface Plugin](#36-choose-an-appropriate-container-network-interface-plugin)
16 |
17 | > Note: If you need access to the pod network while working through the networking examples, use the [Get a Shell to a Running Container](https://kubernetes.io/docs/tasks/debug-application-cluster/get-shell-running-container/) guide to deploy a shell container. I often like to have a tab open to the shell container to run arbitrary network commands without the need to `exec` in and out of it repeatedly.
18 |
19 | ## 3.1 Understand Host Networking Configuration On The Cluster Nodes
20 |
21 | - Design
22 |
23 | - All nodes can talk
24 | - All pods can talk (without NAT)
25 | - Every pod gets a unique IP address
26 |
27 | - Network Types
28 |
29 | - Pod Network
30 | - Node Network
31 | - Services Network
32 | - Rewrites egress traffic destined to a service network endpoint with a pod network IP address
33 |
34 | - Proxy Modes
35 | - IPTables Mode
36 | - The standard mode
37 | - `kube-proxy` watches the Kubernetes control plane for the addition and removal of Service and Endpoint objects
38 | - For each Service, it installs iptables rules, which capture traffic to the Service's clusterIP and port, and redirect that traffic to one of the Service's backend sets.
39 | - For each Endpoint object, it installs iptables rules which select a backend Pod.
40 | - [Official Documentation](https://kubernetes.io/docs/concepts/services-networking/service/#proxy-mode-iptables)
41 | - [Kubernetes Networking Demystified: A Brief Guide](https://www.stackrox.com/post/2020/01/kubernetes-networking-demystified/)
42 | - IPVS Mode
43 |     - Generally available since Kubernetes 1.11
44 |     - Built on Linux IP Virtual Server (IPVS), an in-kernel L4 load balancer
45 |     - Offers more load-balancing algorithms than iptables mode
46 |
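To confirm which proxy mode a cluster is actually using, a couple of quick checks can help. This is a minimal sketch assuming a kubeadm-style cluster where `kube-proxy` runs as a DaemonSet in `kube-system` and stores its configuration in a ConfigMap; managed offerings may differ.

```bash
# Look for the "mode" field in the kube-proxy configuration
# (an empty value means the default, iptables, is in use).
kubectl -n kube-system get configmap kube-proxy -o yaml | grep mode

# Or check the kube-proxy logs for the proxier it started with.
kubectl -n kube-system logs -l k8s-app=kube-proxy --tail=100 | grep -i proxier
```
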
47 | ## 3.2 Understand Connectivity Between Pods
48 |
49 | [Official Documentation](https://kubernetes.io/docs/concepts/cluster-administration/networking/)
50 |
51 | Read [The Kubernetes network model](https://kubernetes.io/docs/concepts/cluster-administration/networking/#the-kubernetes-network-model):
52 |
53 | - Every pod gets its own address
54 | - Fundamental requirements on any networking implementation
55 | - Pods on a node can communicate with all pods on all nodes without NAT
56 | - Agents on a node (e.g. system daemons, kubelet) can communicate with all pods on that node
57 | - Pods in the host network of a node can communicate with all pods on all nodes without NAT
58 | - Kubernetes IP addresses exist at the Pod scope
59 | - Containers within a pod can communicate with one another over `localhost`
60 | - "IP-per-pod" model
61 |
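A simple way to verify this model is to hit one pod directly from another. The sketch below assumes the `shell-demo` pod from the note above is running and has `curl` available (as in the service examples later in this objective); substitute a real pod IP from your own cluster.

```bash
# List pod IPs and the nodes they landed on.
kubectl get pods -o wide

# From the shell pod, reach another pod's IP directly -- no service involved.
kubectl exec shell-demo -- curl -s 10.104.2.7:8080
```
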
62 | ## 3.3 Understand ClusterIP, NodePort, LoadBalancer Service Types And Endpoints
63 |
64 | Services are all about abstracting away the details of which pods are running behind a particular network endpoint. For many applications, work must be processed by some other service. Using a service allows the application to "toss over" the work to Kubernetes, which then uses a selector to determine which pods are healthy and available to receive the work. The service abstracts numerous replica pods that are available to do work.
65 |
66 | - [Official Documentation](https://kubernetes.io/docs/concepts/services-networking/service/)
67 | - [Katakoda Networking Introduction](https://www.katacoda.com/courses/kubernetes/networking-introduction)
68 |
69 | > Note: This section was completed using a GKE cluster and may differ from what your cluster looks like.
70 |
71 | ### ClusterIP
72 |
73 | - Exposes the Service on a cluster-internal IP.
74 | - Choosing this value makes the Service only reachable from within the cluster.
75 | - This is the default ServiceType.
76 | - [Using Source IP](https://kubernetes.io/docs/tutorials/services/source-ip/)
77 | - [Kubectl Expose Command Reference](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#expose)
78 |
79 | The imperative option is to create a deployment and then expose the deployment. In this example, the deployment is exposed using a ClusterIP service that accepts traffic on port 80 and translates it to the pod using port 8080.
80 |
81 | `kubectl create deployment funkyapp1 --image=k8s.gcr.io/echoserver:1.4`
82 |
83 | `kubectl expose deployment funkyapp1 --name=funkyip --port=80 --target-port=8080 --type=ClusterIP`
84 |
85 | > Note: The `--type=ClusterIP` parameter is optional when deploying a `ClusterIP` service since this is the default type.
86 |
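The equivalent declarative manifest is shown below; it can be generated by appending `--dry-run=client -o yaml` to the `kubectl expose` command above (older kubectl releases use `--dry-run` instead of `--dry-run=client`).
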
87 | ```yaml
88 | apiVersion: v1
89 | kind: Service
90 | metadata:
91 | creationTimestamp: null
92 | labels:
93 | app: funkyapp1 #Selector
94 | name: funkyip
95 | spec:
96 | ports:
97 | - port: 80
98 | protocol: TCP
99 | targetPort: 8080
100 | selector:
101 | app: funkyapp1
102 | type: ClusterIP #Note this!
103 | ```
104 |
105 | Using `kubectl describe svc funkyip` shows more details:
106 |
107 | ```bash
108 | Name: funkyip
109 | Namespace: default
110 | Labels: app=funkyapp1
111 | Annotations: cloud.google.com/neg: {"ingress":true}
112 | Selector: app=funkyapp1
113 | Type: ClusterIP
114 | IP: 10.108.3.156
115 | Port: 80/TCP
116 | TargetPort: 8080/TCP
117 | Endpoints: 10.104.2.7:8080
118 | Session Affinity: None
119 | Events:
120 | ```
121 |
122 | ---
123 |
124 | Check to make sure the `funkyip` service exists. This also shows the assigned service (cluster IP) address.
125 |
126 | `kubectl get svc funkyip`
127 |
128 | ```bash
129 | NAME      TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
130 | funkyip   ClusterIP   10.108.3.156   <none>        80/TCP    21m
131 | ```
132 |
133 | ---
134 |
135 | From there, you can see the endpoint created to match any pod discovered using the `app: funkyapp1` label.
136 |
137 | `kubectl get endpoints funkyip`
138 |
139 | ```bash
140 | NAME ENDPOINTS AGE
141 | funkyip 10.104.2.7:8080 21m
142 | ```
143 |
144 | ---
145 |
146 | The endpoint matches the IP address of the matching pod.
147 |
148 | `kubectl get pods -o wide`
149 |
150 | ```bash
151 | NAME                         READY   STATUS    RESTARTS   AGE     IP            NODE                                                NOMINATED NODE   READINESS GATES
152 | funkyapp1-7b478ccf9b-2vlc2   1/1     Running   0          21m     10.104.2.7    gke-my-first-cluster-1-default-pool-504c1e77-zg6v   <none>           <none>
153 | shell-demo                   1/1     Running   0          3m12s   10.128.0.14   gke-my-first-cluster-1-default-pool-504c1e77-m9lk   <none>           <none>
154 | ```
155 |
156 | ---
157 |
158 | The `.spec.ports.port` value defines the port used to access the service. The `.spec.ports.targetPort` value defines the port used to access the container's application.
159 |
160 | `User -> Port -> Kubernetes Service -> Target Port -> Application`
161 |
162 | This can be tested using `curl`:
163 |
164 | ```bash
165 | export CLUSTER_IP=$(kubectl get services/funkyip -o go-template='{{(index .spec.clusterIP)}}')
166 | echo CLUSTER_IP=$CLUSTER_IP
167 | ```
168 |
169 | From there, use `curl $CLUSTER_IP:80` from the shell pod (a ClusterIP is only reachable from inside the cluster) to hit the service `port`, which redirects to the `targetPort` of 8080.
170 |
171 | `curl 10.108.3.156:80`
172 |
173 | ```bash
174 | CLIENT VALUES:
175 | client_address=10.128.0.14
176 | command=GET
177 | real path=/
178 | query=nil
179 | request_version=1.1
180 | request_uri=http://10.108.3.156:8080/
181 |
182 | SERVER VALUES:
183 | server_version=nginx: 1.10.0 - lua: 10001
184 |
185 | HEADERS RECEIVED:
186 | accept=*/*
187 | host=10.108.3.156
188 | user-agent=curl/7.64.0
189 | BODY:
190 | -no body in request-root
191 | ```
192 |
193 | ### NodePort
194 |
195 | - Exposes the Service on each Node's IP at a static port (the NodePort).
196 | - [Official Documentation](https://kubernetes.io/docs/concepts/services-networking/service/#nodeport)
197 |
198 | `kubectl expose deployment funkyapp1 --name=funkynode --port=80 --target-port=8080 --type=NodePort`
199 |
200 | ```yaml
201 | apiVersion: v1
202 | kind: Service
203 | metadata:
204 | creationTimestamp: null
205 | labels:
206 | app: funkyapp1 #Selector
207 | name: funkynode
208 | spec:
209 | ports:
210 | - port: 80
211 | protocol: TCP
212 | targetPort: 8080
213 | selector:
214 | app: funkyapp1
215 | type: NodePort #Note this!
216 | ```
217 |
218 | ---
219 |
220 | This service is available on each node at a specific port.
221 |
222 | `kubectl describe svc funkynode`
223 |
224 | ```bash
225 | Name: funkynode
226 | Namespace: default
227 | Labels: app=funkyapp1
228 | Annotations: cloud.google.com/neg: {"ingress":true}
229 | Selector: app=funkyapp1
230 | Type: NodePort
231 | IP: 10.108.5.37
232 | Port: 80/TCP
233 | TargetPort: 8080/TCP
234 | NodePort: 30182/TCP
235 | Endpoints: 10.104.2.7:8080
236 | Session Affinity: None
237 | External Traffic Policy: Cluster
238 | Events:
239 | ```
240 |
241 | ---
242 |
243 | By using the node IP address with the `nodePort` value, we can see the desired payload. Scale the deployment so there are enough replicas for each node to run one; for this three-node cluster, use `kubectl scale deploy funkyapp1 --replicas=3` (the scheduler will typically spread the replicas across nodes).
244 |
245 | From there, it is possible to `curl` a node IP address on the `nodePort` from the shell pod. From outside the cluster, use a node's externally reachable IP address with the same `nodePort`, provided firewall rules allow the traffic.
246 |
247 | `curl 10.128.0.14:30182`
248 |
249 | ```bash
250 | CLIENT VALUES:
251 | client_address=10.128.0.14
252 | command=GET
253 | real path=/
254 | query=nil
255 | request_version=1.1
256 | request_uri=http://10.128.0.14:8080/
257 |
258 | SERVER VALUES:
259 | server_version=nginx: 1.10.0 - lua: 10001
260 |
261 | HEADERS RECEIVED:
262 | accept=*/*
263 | host=10.128.0.14:30182
264 | user-agent=curl/7.64.0
265 | BODY:
266 | -no body in request-root
267 | ```
268 |
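The `nodePort` and node addresses used above can also be pulled out programmatically, which saves time during the exam; a quick sketch:

```bash
# Print just the allocated nodePort for the funkynode service.
kubectl get svc funkynode -o jsonpath='{.spec.ports[0].nodePort}'

# List node addresses (internal and, on cloud providers, external) to pair with it.
kubectl get nodes -o wide
```
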
269 | ### LoadBalancer
270 |
271 | - Exposes the Service externally using a cloud provider's load balancer.
272 | - NodePort and ClusterIP Services, to which the external load balancer routes, are automatically created.
273 | - [Source IP for Services with Type LoadBalancer](https://kubernetes.io/docs/tutorials/services/source-ip/#source-ip-for-services-with-type-loadbalancer)
274 |
275 | `kubectl expose deployment funkyapp1 --name=funkylb --port=80 --target-port=8080 --type=LoadBalancer`
276 |
277 | ```yaml
278 | apiVersion: v1
279 | kind: Service
280 | metadata:
281 | creationTimestamp: null
282 | labels:
283 | app: funkyapp1
284 | name: funkylb
285 | spec:
286 | ports:
287 | - port: 80
288 | protocol: TCP
289 | targetPort: 8080
290 | selector:
291 | app: funkyapp1
292 | type: LoadBalancer #Note this!
293 | ```
294 |
295 | ---
296 |
297 | Get information on the `funkylb` service to determine the External IP address.
298 |
299 | `kubectl get svc funkylb`
300 |
301 | ```bash
302 | NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
303 | funkylb LoadBalancer 10.108.11.148 35.232.149.96 80:31679/TCP 64s
304 | ```
305 |
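Provisioning the cloud load balancer can take a minute or two; until it finishes, the `EXTERNAL-IP` column shows `<pending>`. A couple of handy commands while waiting (a sketch):

```bash
# Watch the service until an external IP is assigned.
kubectl get svc funkylb -w

# Print just the external IP once it exists.
kubectl get svc funkylb -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```
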
306 | It is then possible to retrieve the payload using the External IP address and port value from anywhere on the Internet; no need to use the pod shell demo!
307 |
308 | `curl 35.232.149.96:80`
309 |
310 | ```bash
311 | CLIENT VALUES:
312 | client_address=10.104.2.1
313 | command=GET
314 | real path=/
315 | query=nil
316 | request_version=1.1
317 | request_uri=http://35.232.149.96:8080/
318 |
319 | SERVER VALUES:
320 | server_version=nginx: 1.10.0 - lua: 10001
321 |
322 | HEADERS RECEIVED:
323 | accept=*/*
324 | host=35.232.149.96
325 | user-agent=curl/7.55.1
326 | BODY:
327 | -no body in request-
328 | ```
329 |
330 | ### ExternalIP
331 |
332 | [Official Documentation](https://kubernetes.io/docs/concepts/services-networking/service/#external-ips)
333 |
334 | - Exposes a Kubernetes service on an external IP address.
335 | - Kubernetes has no control over this external IP address.
336 |
337 | Here is an example spec:
338 |
339 | ```yaml
340 | apiVersion: v1
341 | kind: Service
342 | metadata:
343 | name: my-service
344 | spec:
345 | selector:
346 | app: MyApp
347 | ports:
348 | - name: http
349 | protocol: TCP
350 | port: 80
351 | targetPort: 9376
352 | externalIPs:
353 | - 80.11.12.10 #Take note!
354 | ```
355 |
356 | ### ExternalName
357 |
358 | - Maps the Service to the contents of the externalName field (e.g. foo.bar.example.com), by returning a CNAME record with its value.
359 | - No proxy of any kind is set up.
360 |
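A minimal sketch of an ExternalName service (the name `my-db` and the target `db.example.com` are made up for illustration): resolving `my-db.default.svc.cluster.local` inside the cluster returns a CNAME for `db.example.com`.

```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: my-db
spec:
  type: ExternalName
  externalName: db.example.com
EOF
```
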
361 | ### Networking Cleanup for Objective 3.3
362 |
363 | Run these commands to cleanup the resources, if desired.
364 |
365 | ```bash
366 | kubectl delete svc funkyip
367 | kubectl delete svc funkynode
368 | kubectl delete svc funkylb
369 | kubectl delete deploy funkyapp1
370 | ```
371 |
372 | ## 3.4 Know How To Use Ingress Controllers And Ingress Resources
373 |
374 | Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster.
375 |
376 | - Traffic routing is controlled by rules defined on the **Ingress resource**.
377 | - An **Ingress controller** is responsible for fulfilling the Ingress, usually with a load balancer, though it may also configure your edge router or additional frontends to help handle the traffic.
378 | - For example, the [NGINX Ingress Controller for Kubernetes](https://www.nginx.com/products/nginx/kubernetes-ingress-controller)
379 | - The name of an Ingress object must be a valid DNS subdomain name.
380 | - [Ingress Documentation](https://kubernetes.io/docs/concepts/services-networking/ingress/)
381 | - A list of [Ingress Controllers](https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/)
382 | - [Katacoda - Create Ingress Routing](https://www.katacoda.com/courses/kubernetes/create-kubernetes-ingress-routes) lab
383 | - [Katacoda - Nginx on Kubernetes](https://www.katacoda.com/javajon/courses/kubernetes-applications/nginx) lab
384 |
385 | Example of an ingress resource:
386 |
387 | ```yaml
388 | apiVersion: networking.k8s.io/v1
389 | kind: Ingress
390 | metadata:
391 | name: minimal-ingress
392 | annotations:
393 | nginx.ingress.kubernetes.io/rewrite-target: /
394 | spec:
395 | rules:
396 | - http:
397 | paths:
398 | - path: /testpath
399 | pathType: Prefix
400 | backend:
401 | service:
402 | name: test
403 | port:
404 | number: 80
405 | ```
406 |
407 | Information on some of the objects within this resource:
408 |
409 | - [Ingress Rules](https://kubernetes.io/docs/concepts/services-networking/ingress/#ingress-rules)
410 | - [Path Types](https://kubernetes.io/docs/concepts/services-networking/ingress/#path-types)
411 |
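Assuming an ingress controller is already installed in the cluster and the manifest above is saved as `minimal-ingress.yaml` (a hypothetical filename), it can be applied and inspected like any other resource:

```bash
kubectl apply -f minimal-ingress.yaml

# Shows the rules, backends, assigned address, and any events from the controller.
kubectl describe ingress minimal-ingress
```
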
412 | And, in the case of Nginx, [a custom resource definition (CRD) is often used](https://octopus.com/blog/nginx-ingress-crds) to extend the usefulness of an ingress. An example is shown below:
413 |
414 | ```yaml
415 | apiVersion: k8s.nginx.org/v1
416 | kind: VirtualServer
417 | metadata:
418 | name: cafe
419 | spec:
420 | host: cafe.example.com
421 | tls:
422 | secret: cafe-secret
423 | upstreams:
424 | - name: tea
425 | service: tea-svc
426 | port: 80
427 | - name: coffee
428 | service: coffee-svc
429 | port: 80
430 | routes:
431 | - path: /tea
432 | action:
433 | pass: tea
434 | - path: /coffee
435 | action:
436 | pass: coffee
437 | ```
438 |
439 | ## 3.5 Know How To Configure And Use CoreDNS
440 |
441 | CoreDNS is a general-purpose authoritative DNS server that can serve as cluster DNS.
442 |
443 | - A bit of history:
444 | - As of Kubernetes v1.12, CoreDNS is the recommended DNS Server, replacing `kube-dns`.
445 | - In Kubernetes version 1.13 and later the CoreDNS feature gate is removed and CoreDNS is used by default.
446 | - In Kubernetes 1.18, `kube-dns` usage with kubeadm has been deprecated and will be removed in a future version.
447 | - [Using CoreDNS for Service Discovery](https://kubernetes.io/docs/tasks/administer-cluster/coredns/)
448 | - [Customizing DNS Service](https://kubernetes.io/docs/tasks/administer-cluster/dns-custom-nameservers/)
449 |
450 | CoreDNS is installed with the following default [Corefile](https://coredns.io/2017/07/23/corefile-explained/) configuration:
451 |
452 | ```yaml
453 | apiVersion: v1
454 | kind: ConfigMap
455 | metadata:
456 | name: coredns
457 | namespace: kube-system
458 | data:
459 | Corefile: |
460 | .:53 {
461 | errors
462 | health {
463 | lameduck 5s
464 | }
465 | ready
466 | kubernetes cluster.local in-addr.arpa ip6.arpa {
467 | pods insecure
468 | fallthrough in-addr.arpa ip6.arpa
469 | ttl 30
470 | }
471 | prometheus :9153
472 | forward . /etc/resolv.conf
473 | cache 30
474 | loop
475 | reload
476 | loadbalance
477 | }
478 | ```
479 |
480 | If you need to customize CoreDNS behavior, you create and apply your own ConfigMap to override settings in the Corefile. The [Configuring DNS Servers for Kubernetes Clusters](https://docs.cloud.oracle.com/en-us/iaas/Content/ContEng/Tasks/contengconfiguringdnsserver.htm) document describes this in detail.
481 |
482 | ---
483 |
484 | Review your configmaps for the `kube-system` namespace to determine if there is a `coredns-custom` configmap.
485 |
486 | `kubectl get configmaps --namespace=kube-system`
487 |
488 | ```bash
489 | NAME DATA AGE
490 | cluster-kubestore 0 23h
491 | clustermetrics 0 23h
492 | extension-apiserver-authentication 6 24h
493 | gke-common-webhook-lock 0 23h
494 | ingress-gce-lock 0 23h
495 | ingress-uid 2 23h
496 | kube-dns 0 23h
497 | kube-dns-autoscaler 1 23h
498 | metrics-server-config 1 23h
499 | ```
500 |
501 | ---
502 |
503 | Create a file named `coredns.yml` containing a configmap with the desired DNS entries in the `data` field such as the example below:
504 |
505 | ```yaml
506 | apiVersion: v1
507 | kind: ConfigMap
508 | metadata:
509 | name: coredns-custom
510 | namespace: kube-system
511 | data:
512 |   # All custom server files must have a ".server" file extension.
513 |   example.server: |
514 |     # Change example.com to the domain you wish to forward.
515 |     example.com {
516 |       # Change 1.1.1.1 to your customer DNS resolver.
517 |       forward . 1.1.1.1
518 |     }
519 | ```
520 |
521 | ---
522 |
523 | Apply the configmap.
524 |
525 | `kubectl apply -f coredns.yml`
526 |
527 | ---
528 |
529 | Validate the existence of the `coredns-custom` configmap.
530 |
531 | `kubectl get configmaps --namespace=kube-system`
532 |
533 | ```bash
534 | NAME DATA AGE
535 | cluster-kubestore 0 24h
536 | clustermetrics 0 24h
537 | coredns-custom 1 6s
538 | extension-apiserver-authentication 6 24h
539 | gke-common-webhook-lock 0 24h
540 | ingress-gce-lock 0 24h
541 | ingress-uid 2 24h
542 | kube-dns 0 24h
543 | kube-dns-autoscaler 1 24h
544 | metrics-server-config 1 24h
545 | ```
546 |
547 | ---
548 |
549 | Get the configmap and output the value in yaml format.
550 |
551 | `kubectl get configmaps --namespace=kube-system coredns-custom -o yaml`
552 |
553 | ```yaml
554 | apiVersion: v1
555 | data:
556 | example.server: |
557 | # Change example.com to the domain you wish to forward.
558 | example.com {
559 | # Change 1.1.1.1 to your customer DNS resolver.
560 | forward . 1.1.1.1
561 | }
562 | kind: ConfigMap
563 | metadata:
564 | annotations:
565 | kubectl.kubernetes.io/last-applied-configuration: |
566 | {"apiVersion":"v1","data":{"example.server":"# Change example.com to the domain you wish to forward.\nexample.com {\n # Change 1.1.1.1 to your customer DNS resolver.\n forward . 1.1.1.1\n}\n"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"coredns-custom","namespace":"kube-system"}}
567 | creationTimestamp: "2020-10-27T19:49:24Z"
568 | managedFields:
569 | - apiVersion: v1
570 | fieldsType: FieldsV1
571 | fieldsV1:
572 | f:data:
573 | .: {}
574 | f:example.server: {}
575 | f:metadata:
576 | f:annotations:
577 | .: {}
578 | f:kubectl.kubernetes.io/last-applied-configuration: {}
579 | manager: kubectl-client-side-apply
580 | operation: Update
581 | time: "2020-10-27T19:49:24Z"
582 | name: coredns-custom
583 | namespace: kube-system
584 | resourceVersion: "519480"
585 | selfLink: /api/v1/namespaces/kube-system/configmaps/coredns-custom
586 | uid: 8d3250a5-cbb4-4f01-aae3-4e83bd158ebe
587 | ```
588 |
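Whether the custom forward actually takes effect depends on the cluster's DNS deployment importing the `coredns-custom` ConfigMap (this is provider-specific; the GKE cluster above runs `kube-dns`). As a sanity check, DNS resolution can be tested from a pod, assuming the `shell-demo` pod is still running and its image includes `nslookup`:

```bash
# Cluster DNS should resolve service names...
kubectl exec shell-demo -- nslookup kubernetes.default.svc.cluster.local

# ...and the forwarded domain should resolve via the custom resolver.
kubectl exec shell-demo -- nslookup example.com
```
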
589 | ## 3.6 Choose An Appropriate Container Network Interface Plugin
590 |
591 | Generally, Flannel is a good fit for getting started in a simple environment, while Calico (and others) build on that basic functionality to meet more specific design requirements.
592 |
593 | - [Network Plugins](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/)
594 | - [Choosing a CNI Network Provider for Kubernetes](https://chrislovecnm.com/kubernetes/cni/choosing-a-cni-provider/)
595 | - [Comparing Kubernetes CNI Providers: Flannel, Calico, Canal, and Weave](https://rancher.com/blog/2019/2019-03-21-comparing-kubernetes-cni-providers-flannel-calico-canal-and-weave/)
596 |
597 | Common decision points include:
598 |
599 | - Network Model: Layer 2, Layer 3, VXLAN, etc.
600 | - Routing: Routing and route distribution for pod traffic between nodes
601 | - Network Policy: Essentially the firewall between network / pod segments
602 | - IP Address Management (IPAM)
603 | - Datastore:
604 | - `etcd` - for direct connection to an etcd cluster
605 | - Kubernetes - for connection to a Kubernetes API server
606 |
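One quick way to see which CNI plugin an existing cluster is running (assuming SSH access to a node; the paths are the conventional defaults and managed clusters may differ):

```bash
# CNI configuration files name the plugin in use (e.g. calico, flannel).
ls /etc/cni/net.d/

# The plugin binaries themselves.
ls /opt/cni/bin/
```
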
--------------------------------------------------------------------------------
/objectives/objective4.md:
--------------------------------------------------------------------------------
1 | # Objective 4: Storage
2 |
3 | - [Objective 4: Storage](#objective-4-storage)
4 | - [4.1 Understand Storage Classes, Persistent Volumes](#41-understand-storage-classes-persistent-volumes)
5 | - [Storage Classes](#storage-classes)
6 | - [Persistent Volumes](#persistent-volumes)
7 | - [4.2 Understand Volume Mode, Access Modes And Reclaim Policies For Volumes](#42-understand-volume-mode-access-modes-and-reclaim-policies-for-volumes)
8 | - [Volume Mode](#volume-mode)
9 | - [Access Modes](#access-modes)
10 | - [Reclaim Policies](#reclaim-policies)
11 | - [4.3 Understand Persistent Volume Claims Primitive](#43-understand-persistent-volume-claims-primitive)
12 | - [4.4 Know How To Configure Applications With Persistent Storage](#44-know-how-to-configure-applications-with-persistent-storage)
13 |
14 | ## 4.1 Understand Storage Classes, Persistent Volumes
15 |
16 | - [Storage Classes](https://kubernetes.io/docs/concepts/storage/storage-classes/)
17 | - [Persistent Volumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/)
18 |
19 | ### Storage Classes
20 |
21 | - [Reclaim Policy](https://kubernetes.io/docs/concepts/storage/storage-classes/#reclaim-policy): PersistentVolumes that are dynamically created by a StorageClass will have the reclaim policy specified in the reclaimPolicy field of the class
22 | - Delete: When PersistentVolumeClaim is deleted, also deletes PersistentVolume and underlying storage object
23 | - Retain: When PersistentVolumeClaim is deleted, PersistentVolume remains and volume is "released"
24 | - [Volume Binding Mode](https://kubernetes.io/docs/concepts/storage/storage-classes/#volume-binding-mode):
25 | - `Immediate`: By default, the `Immediate` mode indicates that volume binding and dynamic provisioning occurs once the PersistentVolumeClaim is created
26 | - `WaitForFirstConsumer`: Delay the binding and provisioning of a PersistentVolume until a Pod using the PersistentVolumeClaim is created
27 | - Supported by `AWSElasticBlockStore`, `GCEPersistentDisk`, and `AzureDisk`
28 | - [Allow Volume Expansion](https://kubernetes.io/docs/concepts/storage/storage-classes/#allow-volume-expansion): Allow volumes to be expanded
29 | - Note: It is not possible to reduce the size of a PersistentVolume
30 | - Default Storage Class: A default storage class is used when a PersistentVolumeClaim does not specify the storage class
31 | - Can be handy when a single default services all pod volumes
32 | - [Provisioner](https://kubernetes.io/docs/concepts/storage/storage-classes/#provisioner)
33 | - Determines the volume plugin to use for provisioning PVs.
34 | - Example: `gke-pd`, `azure-disk`
35 |
36 | ---
37 |
38 | View all storage classes
39 |
40 | `kubectl get storageclass` or `kubectl get sc`
41 |
42 | ```bash
43 | NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
44 | standard (default) kubernetes.io/gce-pd Delete Immediate true 25h
45 | ```
46 |
47 | ---
48 |
49 | View the storage class in yaml format
50 |
51 | `kubectl get sc standard -o yaml`
52 |
53 | ```yaml
54 | allowVolumeExpansion: true
55 | apiVersion: storage.k8s.io/v1
56 | kind: StorageClass
57 | metadata:
58 | name: standard
59 | parameters:
60 | type: pd-standard
61 | provisioner: kubernetes.io/gce-pd
62 | reclaimPolicy: Delete
63 | volumeBindingMode: Immediate
64 | ```
65 |
66 | ---
67 |
68 | Make a custom storage class using the yaml configuration below and save it as `speedyssdclass.yaml`
69 |
70 | ```yaml
71 | allowVolumeExpansion: true
72 | apiVersion: storage.k8s.io/v1
73 | kind: StorageClass
74 | metadata:
75 | name: speedyssdclass
76 | parameters:
77 | type: pd-ssd # Note: This will use SSD backed disks
78 | fstype: ext4
79 | replication-type: none
80 | provisioner: kubernetes.io/gce-pd
81 | reclaimPolicy: Retain
82 | volumeBindingMode: WaitForFirstConsumer
83 | ```
84 |
85 | ---
86 |
87 | Apply the storage class configuration to the cluster
88 |
89 | `kubectl apply -f speedyssdclass.yaml`
90 |
91 | ```bash
92 | storageclass.storage.k8s.io/speedyssdclass created
93 | ```
94 |
95 | ---
96 |
97 | Get the storage classes
98 |
99 | `kubectl get sc`
100 |
101 | ```bash
102 | NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
103 | speedyssdclass kubernetes.io/gce-pd Retain WaitForFirstConsumer true 5m19s
104 | standard (default) kubernetes.io/gce-pd Delete Immediate true 8d
105 | ```
106 |
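If the new class should become the cluster default, the documented approach is to flip the `storageclass.kubernetes.io/is-default-class` annotation (shown here against `speedyssdclass` as an example; unset it on the old default so only one class carries it):

```bash
kubectl patch storageclass speedyssdclass \
  -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

kubectl patch storageclass standard \
  -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
```
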
107 | ### Persistent Volumes
108 |
109 | View a persistent volume in yaml format
110 |
111 | `kubectl get pv pvc-d2f6e37e-277f-4b7b-8725-542609f1dea4 -o yaml`
112 |
113 | ```yaml
114 | apiVersion: v1
115 | kind: PersistentVolume
116 | metadata:
117 | name: pvc-d2f6e37e-277f-4b7b-8725-542609f1dea4
118 | spec:
119 | accessModes:
120 | - ReadWriteOnce
121 | capacity:
122 | storage: 1Gi
123 | persistentVolumeReclaimPolicy: Delete
124 | storageClassName: standard
125 | volumeMode: Filesystem
126 | ```
127 |
128 | ---
129 |
130 | Create a new disk named `pv100` in Google Cloud to be used as a persistent volume
131 |
132 | > Note: Use the zone of your GKE cluster
133 |
134 | `gcloud compute disks create pv100 --size 10GiB --zone=us-central1-c`
135 |
136 | ---
137 |
138 | Make a custom persistent volume using the yaml configuration below and save it as `pv100.yaml`
139 |
140 | ```yaml
141 | apiVersion: v1
142 | kind: PersistentVolume
143 | metadata:
144 | name: pv100
145 | spec:
146 | accessModes:
147 | - ReadWriteOnce
148 | capacity:
149 | storage: 1Gi
150 | persistentVolumeReclaimPolicy: Delete
151 | storageClassName: standard
152 | volumeMode: Filesystem
153 |   gcePersistentDisk: # Statically provisions the volume from the pre-created disk (no dynamic provisioning)
154 | fsType: ext4
155 | pdName: pv100
156 | ```
157 |
158 | ---
159 |
160 | Apply the persistent volume to the cluster
161 |
162 | `kubectl apply -f pv100.yaml`
163 |
164 | ```bash
165 | persistentvolume/pv100 created
166 | ```
167 |
168 | ---
169 |
170 | Get the persistent volume and notice that it has a status of `Available` since there is no `PersistentVolumeClaim` to bind against
171 |
172 | `kubectl get pv pv100`
173 |
174 | ```bash
175 | NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
176 | pv100 1Gi RWO Delete Available standard 2m51s
177 | ```
178 |
179 | ## 4.2 Understand Volume Mode, Access Modes And Reclaim Policies For Volumes
180 |
181 | - [Volume Mode](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#volume-mode)
182 | - [Access Modes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes)
183 | - [Reclaim Policy](https://kubernetes.io/docs/concepts/storage/storage-classes/#reclaim-policy)
184 |
185 | ### Volume Mode
186 |
187 | - Filesystem: Kubernetes formats the volume and presents it at a specified mount point.
188 |   - If the volume is backed by a block device and the device is empty, Kubernetes creates a filesystem on the device before mounting it for the first time.
189 | - Block: Kubernetes exposes a raw block device to the container.
190 |   - Faster time to first use and potentially better performance, since no filesystem layer is involved.
191 | - The container must know what to do with the device; there is no filesystem.
192 | - Defined in `.spec.volumeMode` for a `PersistentVolumeClaim`.
193 |
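For contrast, a minimal sketch of a claim that asks for a raw block device (the name `block-pvc` is made up; whether it binds depends on the storage class and driver supporting Block mode). The consuming pod would reference it under `volumeDevices` rather than `volumeMounts`.

```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block
  resources:
    requests:
      storage: 1Gi
EOF
```
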
194 | ---
195 |
196 | View the volume mode for persistent volume claims using the `-o wide` to see the `VOLUMEMODE` column
197 |
198 | `kubectl get pvc -o wide`
199 |
200 | ```bash
201 | NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE VOLUMEMODE
202 | www-web-0 Bound pvc-f3e92637-7e0d-46a3-ad87-ef1275bb5a72 1Gi RWO standard 19m Filesystem
203 | www-web-1 Bound pvc-d2f6e37e-277f-4b7b-8725-542609f1dea4 1Gi RWO standard 19m Filesystem
204 | ```
205 |
206 | ### Access Modes
207 |
208 | - ReadWriteOnce (RWO): can be mounted as read-write by a single node
209 | - ReadOnlyMany (ROX): can be mounted as read-only by many nodes
210 | - ReadWriteMany (RWX): can be mounted as read-write by many nodes
211 | - Defined in `.spec.accessModes` for a `PersistentVolumeClaim` and `PersistentVolume`
212 |
213 | View the access mode for persistent volume claims
214 |
215 | `kubectl get pvc`
216 |
217 | ```bash
218 | NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
219 | www-web-0 Bound pvc-f3e92637-7e0d-46a3-ad87-ef1275bb5a72 1Gi RWO standard 28m
220 | www-web-1 Bound pvc-d2f6e37e-277f-4b7b-8725-542609f1dea4 1Gi RWO standard 27m
221 | ```
222 |
223 | ### Reclaim Policies
224 |
225 | - [Reclaim Policy](https://kubernetes.io/docs/concepts/storage/storage-classes/#reclaim-policy): PersistentVolumes that are dynamically created by a StorageClass will have the reclaim policy specified in the reclaimPolicy field of the class
226 | - Delete: When PersistentVolumeClaim is deleted, also deletes PersistentVolume and underlying storage object
227 | - Retain: When PersistentVolumeClaim is deleted, PersistentVolume remains and volume is "released"
228 | - [Change the Reclaim Policy of a PersistentVolume](https://kubernetes.io/docs/tasks/administer-cluster/change-pv-reclaim-policy/)
229 | - Defined in `.spec.persistentVolumeReclaimPolicy` for `PersistentVolume`.
230 |
231 | ---
232 |
233 | View the reclaim policy set on persistent volumes
234 |
235 | `kubectl get pv`
236 |
237 | ```bash
238 | NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
239 | pvc-d2f6e37e-277f-4b7b-8725-542609f1dea4 1Gi RWO Delete Bound default/www-web-1 standard 45m
240 | pvc-f3e92637-7e0d-46a3-ad87-ef1275bb5a72 1Gi RWO Delete Bound default/www-web-0 standard 45m
241 | ```
242 |
243 | ## 4.3 Understand Persistent Volume Claims Primitive
244 |
245 | Make a custom persistent volume claim using the yaml configuration below and save it as `pvc01.yaml`
246 |
247 | ```yaml
248 | apiVersion: v1
249 | kind: PersistentVolumeClaim
250 | metadata:
251 | name: pvc01
252 | spec:
253 | storageClassName: standard
254 | accessModes:
255 | - ReadWriteOnce
256 | resources:
257 | requests:
258 | storage: 3Gi
259 | ```
260 |
261 | ---
262 |
263 | Apply the persistent volume claim
264 |
265 | `kubectl apply -f pvc01.yaml`
266 |
267 | ```bash
268 | persistentvolumeclaim/pvc01 created
269 | ```
270 |
271 | ---
272 |
273 | Get the persistent volume claim
274 |
275 | `kubectl get pvc pvc01`
276 |
277 | ```bash
278 | NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
279 | pvc01 Bound pvc-9f2e7c5d-b64c-467e-bba6-86ccb333d981 3Gi RWO standard 5m19s
280 | ```
281 |
282 | ## 4.4 Know How To Configure Applications With Persistent Storage
283 |
284 | - [Configure a Pod to Use a PersistentVolume for Storage](https://kubernetes.io/docs/tasks/configure-pod-container/configure-persistent-volume-storage/)
285 |
286 | ---
287 |
288 | Create a new yaml file using the configuration below and save it as `pv-pod.yaml`
289 |
290 | > Note: Make sure to create `pvc01` in [this earlier step](#43-understand-persistent-volume-claims-primitive)
291 |
292 | ```yaml
293 | apiVersion: v1
294 | kind: Pod
295 | metadata:
296 | name: pv-pod
297 | spec:
298 | volumes:
299 | - name: pv-pod-storage # The name of the volume, used by .spec.containers.volumeMounts.name
300 | persistentVolumeClaim:
301 | claimName: pvc01 # This pvc was created in an earlier step
302 | containers:
303 | - name: pv-pod-container
304 | image: nginx
305 | ports:
306 | - containerPort: 80
307 | name: "http-server"
308 | volumeMounts:
309 | - mountPath: "/usr/share/nginx/html"
310 | name: pv-pod-storage # This refers back to .spec.volumes.name
311 | ```
312 |
313 | ---
314 |
315 | Apply the pod
316 |
317 | `kubectl apply -f pv-pod.yaml`
318 |
319 | ```bash
320 | pod/pv-pod created
321 | ```
322 |
323 | ---
324 |
325 | Watch the pod provisioning process
326 |
327 | `kubectl get pod -w pv-pod`
328 |
329 | ```bash
330 | NAME READY STATUS RESTARTS AGE
331 | pv-pod 1/1 Running 0 30s
332 | ```
333 |
334 | ---
335 |
336 | View the binding on `pvc01`
337 |
338 | `kubectl describe pvc pvc01`
339 |
340 | ```bash
341 | Name: pvc01
342 | Namespace: default
343 | StorageClass: standard
344 | Status: Bound
345 | Volume: pvc-9f2e7c5d-b64c-467e-bba6-86ccb333d981
346 | Labels:
347 | Annotations: pv.kubernetes.io/bind-completed: yes
348 | pv.kubernetes.io/bound-by-controller: yes
349 | volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/gce-pd
350 | Finalizers: [kubernetes.io/pvc-protection]
351 | Capacity: 3Gi
352 | Access Modes: RWO
353 | VolumeMode: Filesystem
354 | Mounted By: pv-pod # Here it is!
355 | Events:
356 | Type Reason Age From Message
357 | ---- ------ ---- ---- -------
358 | Normal ProvisioningSucceeded 36m persistentvolume-controller Successfully provisioned volume pvc-9f2e7c5d-b64c-467e-bba6-86ccb333d981 using kubernetes.io/gce-pd
359 | ```
360 |
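As an optional check, write a file into the mounted path and read it back; `cat` is used here because the stock `nginx` image does not ship `curl`:

```bash
kubectl exec -it pv-pod -- sh -c 'echo "Persistent storage works" > /usr/share/nginx/html/index.html'
kubectl exec -it pv-pod -- cat /usr/share/nginx/html/index.html
```
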
--------------------------------------------------------------------------------
/objectives/objective5.md:
--------------------------------------------------------------------------------
1 | # Objective 5: Troubleshooting
2 |
3 | - [Troubleshooting Kubernetes deployments](https://learnk8s.io/troubleshooting-deployments)
4 |
5 | - [Objective 5: Troubleshooting](#objective-5-troubleshooting)
6 | - [5.1 Evaluate Cluster And Node Logging](#51-evaluate-cluster-and-node-logging)
7 | - [Cluster Logging](#cluster-logging)
8 | - [Node Logging](#node-logging)
9 | - [5.2 Understand How To Monitor Applications](#52-understand-how-to-monitor-applications)
10 | - [5.3 Manage Container Stdout & Stderr Logs](#53-manage-container-stdout--stderr-logs)
11 | - [5.4 Troubleshoot Application Failure](#54-troubleshoot-application-failure)
12 | - [5.5 Troubleshoot Cluster Component Failure](#55-troubleshoot-cluster-component-failure)
13 | - [5.6 Troubleshoot Networking](#56-troubleshoot-networking)
14 |
15 | ## 5.1 Evaluate Cluster And Node Logging
16 |
17 | ### Cluster Logging
18 |
19 | Having a separate backend to store and query logs from cluster components, nodes, pods, and applications, independent of the lifecycle of any node, pod, or container.
20 |
21 | - [Cluster-level logging architectures](https://kubernetes.io/docs/concepts/cluster-administration/logging/#cluster-level-logging-architectures)
22 | - [Kubernetes Logging Best Practices](https://platform9.com/blog/kubernetes-logging-best-practices/)
23 |
24 | Commonly deployed in one of three ways:
25 |
26 | 1. [Logging agent on each node](https://kubernetes.io/docs/concepts/cluster-administration/logging/#using-a-node-logging-agent) that sends log data to a backend storage repository
27 |    1. These agents are typically deployed as a DaemonSet so that every node runs a copy of the agent
28 | 2. Note: This approach only works for applications' standard output (_stdout_) and standard error (_stderr_)
29 | 2. [Logging agent as a sidecar](https://kubernetes.io/docs/concepts/cluster-administration/logging/#using-a-sidecar-container-with-the-logging-agent) to specific deployments that sends log data to a backend storage repository
30 | 1. Note: Writing logs to a file and then streaming them to stdout can double disk usage
31 | 3. [Configure the containerized application](https://kubernetes.io/docs/concepts/cluster-administration/logging/#exposing-logs-directly-from-the-application) to send log data to a backend storage repository
32 |
33 | ### Node Logging
34 |
35 | Having a log file on each node that captures the standard output (_stdout_) and standard error (_stderr_) of the containers running on that node.
36 |
37 | - [Logging at the node level](https://kubernetes.io/docs/concepts/cluster-administration/logging/#logging-at-the-node-level)
38 |
39 | ## 5.2 Understand How To Monitor Applications
40 |
41 | - [Using kubectl describe pod to fetch details about pods](https://kubernetes.io/docs/tasks/debug-application-cluster/debug-application-introspection/#using-kubectl-describe-pod-to-fetch-details-about-pods)
42 | - [Interacting with running Pods](https://kubernetes.io/docs/reference/kubectl/cheatsheet/#interacting-with-running-pods)
43 |
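Beyond `describe`, resource usage is worth a quick look when an application misbehaves. These commands assume the metrics-server add-on is installed (it shows up in the `kube-system` output later in this objective):

```bash
# Node- and pod-level CPU/memory usage.
kubectl top nodes
kubectl top pods --all-namespaces

# Drill into a specific pod's events, conditions, and restarts.
kubectl describe pod <pod-name>
```
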
44 | ## 5.3 Manage Container Stdout & Stderr Logs
45 |
46 | - [Kubectl Commands - Logs](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#logs)
47 | - [How to find—and use—your GKE logs with Cloud Logging](https://cloud.google.com/blog/products/management-tools/finding-your-gke-logs)
48 | - [Enable Log Rotation in Kubernetes Cluster](https://vividcode.io/enable-log-rotation-in-kubernetes-cluster/)
49 |
50 | `kubectl logs [-f] [-p] (POD | TYPE/NAME) [-c CONTAINER]`
51 |
52 | - `-f` will follow the logs
53 | - `-p` will pull up the previous instance of the container
54 | - `-c` will select a specific container for pods that have more than one
55 |
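A few concrete variations, using the multi-container `counter` pod that appears in the next section as an example:

```bash
# Logs from a single container in a multi-container pod.
kubectl logs counter -c count-log-1

# Logs from all containers in the pod.
kubectl logs counter --all-containers=true

# Follow logs from the "count" container.
kubectl logs -f counter -c count

# Logs from the previous instance of a container, after a restart.
kubectl logs -p counter -c count
```
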
56 | ## 5.4 Troubleshoot Application Failure
57 |
58 | - [Troubleshoot Applications](https://kubernetes.io/docs/tasks/debug-application-cluster/debug-application/)
59 | - Status: Pending
60 | - The Pod has been accepted by the Kubernetes cluster, but one or more of the containers has not been set up and made ready to run.
61 |     - If no resources are available on the cluster, the Cluster Autoscaler will increase the node count (if enabled)
62 |     - Once the new nodes are ready, the Pending pods will be scheduled
63 | - Status: Waiting
64 | - A container in the Waiting state is still running the operations it requires in order to complete start up
65 |
66 | ---
67 |
68 | Describe the pod to get details on the configuration, containers, events, conditions, volumes, etc.
69 |
70 | - Is the status equal to RUNNING?
71 | - Are there enough resources to schedule the pod?
72 | - Are there enough `hostPorts` remaining to schedule the pod?
73 |
74 | `kubectl describe pod counter`
75 |
76 | ```yaml
77 | Name: counter
78 | Namespace: default
79 | Priority: 0
80 | Node: gke-my-first-cluster-1-default-pool-504c1e77-xcvj/10.128.0.15
81 | Start Time: Tue, 10 Nov 2020 16:33:10 -0600
82 | Labels:
83 | Annotations:
84 | Status: Running
85 | IP: 10.104.1.7
86 | IPs:
87 | IP: 10.104.1.7
88 | Containers:
89 | count:
90 | Container ID: docker://430313804a529153c1dc5badd1394164906a7dead8708a4b850a0466997e1c34
91 | Image: busybox
92 | Image ID: docker-pullable://busybox@sha256:a9286defaba7b3a519d585ba0e37d0b2cbee74ebfe590960b0b1d6a5e97d1e1d
93 | Port:
94 | Host Port:
95 | Args:
96 | /bin/sh
97 | -c
98 | i=0; while true; do
99 | echo "$i: $(date)" >> /var/log/1.log;
100 | echo "$(date) INFO $i" >> /var/log/2.log;
101 | i=$((i+1));
102 | sleep 1;
103 | done
104 |
105 | State: Running
106 | Started: Tue, 10 Nov 2020 16:33:12 -0600
107 | Ready: True
108 | Restart Count: 0
109 | Environment:
110 | Mounts:
111 | /var/log from varlog (rw)
112 | /var/run/secrets/kubernetes.io/serviceaccount from default-token-2qnnp (ro)
113 | count-log-1:
114 | Container ID: docker://d5e95aa4aec3a55435d610298f94e7b8b2cfdf2fb88968f00ca4719a567a6e37
115 | Image: busybox
116 | Image ID: docker-pullable://busybox@sha256:a9286defaba7b3a519d585ba0e37d0b2cbee74ebfe590960b0b1d6a5e97d1e1d
117 | Port:
118 | Host Port:
119 | Args:
120 | /bin/sh
121 | -c
122 | tail -n+1 -f /var/log/1.log
123 | State: Running
124 | Started: Tue, 10 Nov 2020 16:33:13 -0600
125 | Ready: True
126 | Restart Count: 0
127 | Environment:
128 | Mounts:
129 | /var/log from varlog (rw)
130 | /var/run/secrets/kubernetes.io/serviceaccount from default-token-2qnnp (ro)
131 | count-log-2:
132 | Container ID: docker://eaa9983cbd55288a139b63c30cfe3811031dedfae0842b9233ac48db65387d4d
133 | Image: busybox
134 | Image ID: docker-pullable://busybox@sha256:a9286defaba7b3a519d585ba0e37d0b2cbee74ebfe590960b0b1d6a5e97d1e1d
135 | Port:
136 | Host Port:
137 | Args:
138 | /bin/sh
139 | -c
140 | tail -n+1 -f /var/log/2.log
141 | State: Running
142 | Started: Tue, 10 Nov 2020 16:33:13 -0600
143 | Ready: True
144 | Restart Count: 0
145 | Environment:
146 | Mounts:
147 | /var/log from varlog (rw)
148 | /var/run/secrets/kubernetes.io/serviceaccount from default-token-2qnnp (ro)
149 | Conditions:
150 | Type Status
151 | Initialized True
152 | Ready True
153 | ContainersReady True
154 | PodScheduled True
155 | Volumes:
156 | varlog:
157 | Type: EmptyDir (a temporary directory that shares a pod's lifetime)
158 | Medium:
159 | SizeLimit:
160 | default-token-2qnnp:
161 | Type: Secret (a volume populated by a Secret)
162 | SecretName: default-token-2qnnp
163 | Optional: false
164 | QoS Class: BestEffort
165 | Node-Selectors:
166 | Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
167 | node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
168 | Events:
169 | Type Reason Age From Message
170 | ---- ------ ---- ---- -------
171 | Normal Scheduled 30m default-scheduler Successfully assigned default/counter to gke-my-first-cluster-1-default-pool-504c1e77-xcvj
172 | Normal Pulling 30m kubelet, gke-my-first-cluster-1-default-pool-504c1e77-xcvj Pulling image "busybox"
173 | Normal Pulled 30m kubelet, gke-my-first-cluster-1-default-pool-504c1e77-xcvj Successfully pulled image "busybox"
174 | Normal Created 30m kubelet, gke-my-first-cluster-1-default-pool-504c1e77-xcvj Created container count
175 | Normal Started 30m kubelet, gke-my-first-cluster-1-default-pool-504c1e77-xcvj Started container count
176 | Normal Pulling 30m kubelet, gke-my-first-cluster-1-default-pool-504c1e77-xcvj Pulling image "busybox"
177 | Normal Created 30m kubelet, gke-my-first-cluster-1-default-pool-504c1e77-xcvj Created container count-log-1
178 | Normal Pulled 30m kubelet, gke-my-first-cluster-1-default-pool-504c1e77-xcvj Successfully pulled image "busybox"
179 | Normal Started 30m kubelet, gke-my-first-cluster-1-default-pool-504c1e77-xcvj Started container count-log-1
180 | Normal Pulling 30m kubelet, gke-my-first-cluster-1-default-pool-504c1e77-xcvj Pulling image "busybox"
181 | Normal Pulled 30m kubelet, gke-my-first-cluster-1-default-pool-504c1e77-xcvj Successfully pulled image "busybox"
182 | Normal Created 30m kubelet, gke-my-first-cluster-1-default-pool-504c1e77-xcvj Created container count-log-2
183 | Normal Started 30m kubelet, gke-my-first-cluster-1-default-pool-504c1e77-xcvj Started container count-log-2
184 | ```
185 |
186 | ---
187 |
188 | Validate the manifest against the API schema to catch misconfigurations, such as misspelled field names, before (re)creating the pod.
189 |
190 | `kubectl apply --validate -f mypod.yaml`
191 |
192 | ## 5.5 Troubleshoot Cluster Component Failure
193 |
194 | - [Troubleshoot Clusters](https://kubernetes.io/docs/tasks/debug-application-cluster/debug-cluster/)
195 | - [A general overview of cluster failure modes](https://kubernetes.io/docs/tasks/debug-application-cluster/debug-cluster/#a-general-overview-of-cluster-failure-modes)
196 | - [Control Plane Components](https://kubernetes.io/docs/concepts/overview/components/#control-plane-components)
197 | - [Node Components](https://kubernetes.io/docs/concepts/overview/components/#node-components)
198 |
199 | ---
200 |
201 | Components to investigate:
202 |
203 | - Control Plane Components
204 | - `kube-apiserver`
205 | - `etcd`
206 | - `kube-scheduler`
207 | - `kube-controller-manager`
208 | - `cloud-controller-manager`
209 | - Node Components
210 | - `kubelet`
211 | - `kube-proxy`
212 | - Container runtime (e.g. Docker)
213 |
214 | ---
215 |
216 | View the components with:
217 |
218 | `kubectl get all -n kube-system`
219 |
220 | ```bash
221 | NAME READY STATUS RESTARTS AGE
222 | pod/konnectivity-agent-56nck 1/1 Running 0 15d
223 | pod/konnectivity-agent-gmklx 1/1 Running 0 15d
224 | pod/konnectivity-agent-wg92c 1/1 Running 0 15d
225 | pod/kube-dns-576766df6b-cz4ln 3/3 Running 0 15d
226 | pod/kube-dns-576766df6b-rcsk7 3/3 Running 0 15d
227 | pod/kube-dns-autoscaler-7f89fb6b79-pq66d 1/1 Running 0 15d
228 | pod/kube-proxy-gke-my-first-cluster-1-default-pool-504c1e77-m9lk 1/1 Running 0 15d
229 | pod/kube-proxy-gke-my-first-cluster-1-default-pool-504c1e77-xcvj 1/1 Running 0 15d
230 | pod/kube-proxy-gke-my-first-cluster-1-default-pool-504c1e77-zg6v 1/1 Running 0 15d
231 | pod/l7-default-backend-7fd66b8b88-ng57f 1/1 Running 0 15d
232 | pod/metrics-server-v0.3.6-7c5cb99b6f-2d8bx 2/2 Running 0 15d
233 |
234 | NAME                           TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)         AGE
235 | service/default-http-backend   NodePort    10.108.1.184   <none>        80:32084/TCP    15d
236 | service/kube-dns               ClusterIP   10.108.0.10    <none>        53/UDP,53/TCP   15d
237 | service/metrics-server         ClusterIP   10.108.1.154   <none>        443/TCP         15d
238 |
239 | NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
240 | daemonset.apps/konnectivity-agent 3 3 3 3 3 15d
241 | daemonset.apps/kube-proxy 0 0 0 0 0 kubernetes.io/os=linux,node.kubernetes.io/kube-proxy-ds-ready=true 15d
242 | daemonset.apps/metadata-proxy-v0.1 0 0 0 0 0 cloud.google.com/metadata-proxy-ready=true,kubernetes.io/os=linux 15d
243 | daemonset.apps/nvidia-gpu-device-plugin 0 0 0 0 0 15d
244 |
245 | NAME READY UP-TO-DATE AVAILABLE AGE
246 | deployment.apps/kube-dns 2/2 2 2 15d
247 | deployment.apps/kube-dns-autoscaler 1/1 1 1 15d
248 | deployment.apps/l7-default-backend 1/1 1 1 15d
249 | deployment.apps/metrics-server-v0.3.6 1/1 1 1 15d
250 |
251 | NAME DESIRED CURRENT READY AGE
252 | replicaset.apps/kube-dns-576766df6b 2 2 2 15d
253 | replicaset.apps/kube-dns-autoscaler-7f89fb6b79 1 1 1 15d
254 | replicaset.apps/l7-default-backend-7fd66b8b88 1 1 1 15d
255 | replicaset.apps/metrics-server-v0.3.6-7c5cb99b6f 1 1 1 15d
256 | replicaset.apps/metrics-server-v0.3.6-7ff8cdbc49 0 0 0 15d
257 | ```
258 |
259 | ---
260 |
261 | Retrieve detailed information about the cluster
262 |
263 | `kubectl cluster-info` or `kubectl cluster-info dump`
264 |
265 | ---
266 |
267 | Retrieve a list of known API resources to aid with describing or troubleshooting
268 |
269 | `kubectl api-resources`
270 |
271 | ```bash
272 | NAME SHORTNAMES APIGROUP NAMESPACED KIND
273 | bindings true Binding
274 | componentstatuses cs false ComponentStatus
275 | configmaps cm true ConfigMap
276 | endpoints ep true Endpoints
277 | events ev true Event
278 | limitranges limits true LimitRange
279 | namespaces ns false Namespace
280 | nodes no false Node
281 | persistentvolumeclaims pvc true PersistentVolumeClaim
282 | persistentvolumes pv false PersistentVolume
283 |
284 |
285 | ```
286 |
287 | ---
288 |
289 | [Check the logs](https://kubernetes.io/docs/tasks/debug-application-cluster/debug-cluster/#looking-at-logs) in `/var/log` on the master and worker nodes:
290 |
291 | - Master
292 | - `/var/log/kube-apiserver.log` - API Server, responsible for serving the API
293 | - `/var/log/kube-scheduler.log` - Scheduler, responsible for making scheduling decisions
294 | - `/var/log/kube-controller-manager.log` - Controller that manages replication controllers
295 | - Worker Nodes
296 | - `/var/log/kubelet.log` - Kubelet, responsible for running containers on the node
297 | - `/var/log/kube-proxy.log` - Kube Proxy, responsible for service load balancing
298 |
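On clusters where the control plane runs as static pods (kubeadm) or where nodes use systemd, the same information is reached slightly differently; a sketch assuming SSH access to the node:

```bash
# Node services managed by systemd.
systemctl status kubelet
journalctl -u kubelet --no-pager | tail -n 50

# Control plane components that run as pods can be read with kubectl.
kubectl -n kube-system logs kube-apiserver-<control-plane-node-name>
```
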
299 | ## 5.6 Troubleshoot Networking
300 |
301 | - [Flannel Troubleshooting](https://github.com/coreos/flannel/blob/master/Documentation/troubleshooting.md#kubernetes-specific)
302 |   - The flannel kube subnet manager relies on the fact that each node already has a podCIDR defined (a quick check for this is shown at the end of this section)
303 | - [Calico Troubleshooting](https://docs.projectcalico.org/maintenance/troubleshoot/)
304 | - [Containers do not have network connectivity](https://docs.projectcalico.org/maintenance/troubleshoot/troubleshooting#containers-do-not-have-network-connectivity)
305 |
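To confirm that each node has a podCIDR assigned (the flannel prerequisite noted above), the value can be read straight from the node objects:

```bash
# One CIDR per node; an empty result means the controller-manager is not
# allocating node CIDRs (e.g. missing --allocate-node-cidrs/--cluster-cidr).
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
```
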
--------------------------------------------------------------------------------