├── LICENSE ├── README.md └── slides.pptx /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 
39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. 
Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. 
You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. 
In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. 
We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright [yyyy] [name of copyright owner] 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Kubernetes-Master-Class-Upgrades 2 | Kubernetes Master Class: A Seamless Approach to Rancher & Kubernetes Upgrades 3 | 4 | ## [YouTube video](https://www.youtube.com/watch?v=d8kS8y8cLq4) 5 | 6 | ## Terms 7 | - **Rancher Server** is a set of pods that run the main orchestration engine and UI for Rancher. 8 | - **RKE** (Rancher Kubernetes Engine) is the tool Rancher uses to create and manage Kubernetes clusters. 9 | - **Local/upstream cluster** is the cluster where the Rancher server is installed; this is usually an RKE-built cluster. 10 | - **Downstream cluster(s)** are the Kubernetes clusters that Rancher is managing. 11 | 12 | ## High-Level rules 13 | The following are the high-level rules for planning a Rancher/Kubernetes/Docker upgrade. 14 | - Do not rush an upgrade. 
15 | - Do not stack upgrades (we recommend at least 24 hours between upgrades). 16 | - Make sure you have a good backup 17 | - The recommended order of upgrades is Rancher, Kubernetes, and then Docker. 18 | - All upgrades should be tested in a lab or non-prod environment before being deployed to Production. 19 | - Review all release notes [link](https://github.com/rancher/rancher/releases/tag/v2.5.5) 20 | - Review the support matrix [link](https://rancher.com/support-maintenance-terms/all-supported-versions/rancher-v2.5.5/) 21 | - It is not required, but we recommend pausing any CI/CD pipelines using the Rancher API during an upgrade. 22 | 23 | ## Picking a version 24 | Please see the following recommendations when planning version upgrades. 25 | - **Rancher**: perform one minor version jump at a time 26 | For example, when upgrading from v2.1.x -> v2.3.x, we encourage upgrading v2.1.x -> v2.2.x -> v2.3.x, but this is not required. 27 | - **Kubernetes**: jump no more than two minor versions at a time; ideally, avoid skipping minor versions entirely, as this can increase the chances of an issue due to accumulated changes 28 | For example, when upgrading from v1.13.x -> v1.19.x, we encourage upgrading v1.13.x -> v1.15.x -> v1.17.x -> v1.19.x 29 | - **RKE**: perform one major RKE version jump at a time 30 | For example, when upgrading from v0.1.x -> v1.1.0, instead do v0.1.x -> v0.2.x -> v0.3.x -> v1.0.x -> v1.1.x 31 | 32 | 33 | ## Creating your change control 34 | 35 | ### Scheduled change window 36 | - **Rancher upgrade** - 30 minutes for install, with 30 minutes for rollback 37 | - **Kubernetes upgrade** - 60 minutes for install (which may be longer for larger clusters), with 60 minutes for troubleshooting/rollback 38 | 39 | ### Effect / Impact during the change window 40 | - **Rancher upgrade** - Only the management of Rancher and downstream clusters is impacted; applications shouldn't notice that anything is being done. However, any CI/CD pipelines should be paused. 
41 | - **Kubernetes upgrade of the local cluster** - The Rancher UI should disconnect and reconnect after a few minutes because the ingress-controllers are restarted. 42 | - **Kubernetes upgrade of downstream clusters** - Applications might see a short network blip as ingress-controllers and networking are restarted. See [link](https://rancher.com/blog/2020/zero-downtime/) for more details. 43 | 44 | ### Maintenance window 45 | - **Rancher upgrade** - A maintenance window is not required, but CI/CD pipelines should be paused. 46 | - **Kubernetes upgrade of the local cluster** - A maintenance window is not needed, but CI/CD pipelines should be paused. 47 | - **Kubernetes upgrade of downstream clusters** - This should be done during a maintenance window or a quiet time 48 | 49 | ## Rancher Upgrade – Prep work 50 | - Check if the Rancher UI is accessible 51 | - Check if all clusters in UI are in an Active state 52 | - Check if all pods in kube-system and cattle-system namespaces are running in both the local and downstream clusters. 53 | ``` 54 | kubectl get pods -n kube-system 55 | kubectl get pods -n cattle-system 56 | ``` 57 | - Verify that etcd has scheduled snapshots configured and that they are working. 58 | - **RKE**: if Rancher is deployed on a Kubernetes cluster built with RKE, verify etcd snapshots are enabled and working. On the etcd nodes, you can confirm this with the following: 59 | ``` 60 | ls -l /opt/rke/etcd-snapshots 61 | docker logs etcd-rolling-snapshots 62 | ``` 63 | - **k3s**: if Rancher is deployed on a k3s Kubernetes cluster, ensure scheduled backups are configured and working. Please see the k3s [Documentation](https://rancher.com/docs/k3s/latest/en/) pages for further information on this. 
64 | - Create a one-time datastore snapshot; please see the Rancher documentation for the RKE, k3s, and single-node Docker install options for more information 65 | - **RKE**: check for expired/expiring Kubernetes certs 66 | ``` 67 | for i in $(ls /etc/kubernetes/ssl/*.pem|grep -v key); do echo -n $i" "; openssl x509 -startdate -enddate -noout -in $i | grep 'notAfter='; done 68 | ``` 69 | 70 | ## Rancher Upgrade - Change 71 | - Update helm repo cache 72 | ``` 73 | helm repo update 74 | helm fetch rancher-stable/rancher 75 | ``` 76 | - Verify you’re connected to the correct cluster 77 | ``` 78 | kubectl get nodes -o wide 79 | ``` 80 | - Take an etcd snapshot 81 | ``` 82 | rke etcd snapshot-save --config cluster.yaml --name pre-rancher-upgrade-`date '+%Y%m%d%H%M%S'` 83 | ``` 84 | - Grab the current helm values using `helm get values rancher -n cattle-system` 85 | Example output: 86 | ``` 87 | USER-SUPPLIED VALUES: 88 | antiAffinity: required 89 | auditLog: 90 | level: 2 91 | hostname: rancher.example.com 92 | ingress: 93 | tls: 94 | source: secret 95 | ``` 96 | - Use the values to build your upgrade command 97 | **NOTE**: The only thing you should change is the version flag. 
98 | ``` 99 | helm upgrade --install rancher rancher-stable/rancher \ 100 | --namespace cattle-system \ 101 | --set hostname=rancher.example.com \ 102 | --set ingress.tls.source=secret \ 103 | --set auditLog.level=2 \ 104 | --set antiAffinity=required \ 105 | --version 2.5.5 106 | ``` 107 | - Wait for the upgrade to finish 108 | ``` 109 | kubectl -n cattle-system rollout status deploy/rancher 110 | ``` 111 | - Official Rancher upgrade [Documentation](https://rancher.com/docs/rancher/v2.x/en/installation/install-rancher-on-k8s/upgrades/) 112 | 113 | ## Rancher Upgrade – Verify 114 | - Check if the Rancher UI is accessible 115 | - Check if all clusters in UI are in an Active state 116 | - Check if all pods in kube-system and cattle-system namespaces are running in both the local and downstream clusters. 117 | ``` 118 | kubectl get pods -n kube-system 119 | kubectl get pods -n cattle-system 120 | ``` 121 | - Verify the new Rancher version (bottom-left corner of the UI) 122 | - Verify all Rancher, cattle-cluster-agent, and cattle-node-agent pods are running the new version on the local cluster 123 | ``` 124 | kubectl get pods -n cattle-system -o wide 125 | ``` 126 | - Verify all downstream clusters are Active 127 | - Verify all Rancher, cattle-cluster-agent, and cattle-node-agent pods are running the new version on all downstream clusters. 128 | ``` 129 | kubectl get pods -n cattle-system -o wide 130 | ``` 131 | - Take a post-upgrade etcd snapshot 132 | ``` 133 | rke etcd snapshot-save --config cluster.yaml --name post-rancher-upgrade-`date '+%Y%m%d%H%M%S'` 134 | ``` 135 | 136 | ## Rancher Upgrade – Backout 137 | - You cannot downgrade Rancher; you must do an etcd restore [Documentation](https://rancher.com/docs/rke/latest/en/etcd-snapshots/restoring-from-backup/) 138 | ``` 139 | rke etcd snapshot-restore --name pre-rancher-upgrade-..... 
--config ./cluster.yaml 140 | ``` 141 | 142 | ## RKE Upgrade – Prep work 143 | - Verify you have the correct `cluster.yaml` and `cluster.rkestate` files 144 | - Verify SSH access to all nodes in the cluster 145 | - Verify all nodes are Ready 146 | ``` 147 | kubectl get nodes -o wide 148 | ``` 149 | - Verify all pods are Healthy 150 | ``` 151 | kubectl get pods --all-namespaces -o wide | grep -v 'Running\|Completed' 152 | ``` 153 | - We're looking for Pods crashing or stuck. 154 | - Verify Kubernetes version is available in RKE 155 | ``` 156 | rke config --list-version --all --print 157 | ``` 158 | - You might need to upgrade to a newer RKE version if the recommended k8s version isn't available. 159 | 160 | ## RKE Upgrade – Change 161 | - Take an etcd snapshot 162 | ``` 163 | rke etcd snapshot-save --config cluster.yaml --name pre-k8s-upgrade-`date '+%Y%m%d%H%M%S'` 164 | ``` 165 | - Change `kubernetes_version` in the `cluster.yaml`, then run `rke up --config cluster.yaml` to apply it 166 | ``` 167 | kubernetes_version: "v1.19.7-rancher1-1" 168 | ``` 169 | - If you have an air-gapped setup, please see [Documentation](https://rancher.com/docs/rke/latest/en/config-options/system-images/) 170 | 171 | ## RKE Upgrade - Verify 172 | - Verify all nodes are Ready and at the new version 173 | `kubectl get nodes -o wide` 174 | 175 | - Verify all pods are Healthy 176 | ``` 177 | kubectl get pods --all-namespaces -o wide | grep -v 'Running\|Completed' 178 | ``` 179 | - All pods should be healthy; we're looking for Pods crashing or stuck. 180 | 181 | ## RKE Upgrade – Backout 182 | - You cannot downgrade Kubernetes; **you must do an etcd restore** 183 | ``` 184 | rke etcd snapshot-restore --name pre-k8s-upgrade-..... 
--config ./cluster.yaml 185 | ``` 186 | - [Documentation](https://rancher.com/docs/rke/latest/en/etcd-snapshots/restoring-from-backup/) 187 | 188 | ## Common issues 189 | 190 | ### Missing `cluster.yaml` and `cluster.rkestate` 191 | 192 | #### Setting up a lab environment 193 | - Build a standard RKE cluster [Documentation](https://rancher.com/docs/rke/latest/en/installation/#deploying-kubernetes-with-rke) 194 | - Delete `cluster.rkestate` 195 | - Delete `kube_config_cluster.yml` 196 | 197 | #### Reproducing the issue 198 | - Run `rke up` 199 | - You should see RKE generating new certificates (see the example output below) 200 | ``` 201 | INFO[0004] [certificates] Generating CA kubernetes certificates 202 | INFO[0005] [certificates] Generating Kubernetes API server aggregation layer requestheader client CA certificates 203 | INFO[0005] [certificates] GenerateServingCertificate is disabled, checking if there are unused kubelet certificates 204 | INFO[0005] [certificates] Generating Kubernetes API server certificates 205 | INFO[0006] [certificates] Generating Service account token key 206 | INFO[0006] [certificates] Generating Kube Controller certificates 207 | INFO[0006] [certificates] Generating Kube Scheduler certificates 208 | INFO[0006] [certificates] Generating Kube Proxy certificates 209 | INFO[0007] [certificates] Generating Node certificate 210 | INFO[0007] [certificates] Generating admin certificates and kubeconfig 211 | ``` 212 | 213 | #### Resolution 214 | 215 | - SSH to one of the controlplane nodes 216 | - Run the [script](https://raw.githubusercontent.com/rancherlabs/support-tools/master/how-to-retrieve-kubeconfig-from-custom-cluster/rke-node-kubeconfig.sh) and follow the instructions given to get a kubeconfig file for the cluster. 
217 | - Run the [script](https://raw.githubusercontent.com/rancherlabs/support-tools/master/how-to-retrieve-cluster-yaml-from-custom-cluster/cluster-yaml-recovery.sh) and follow the instructions given to get a `cluster.yaml` and `cluster.rkestate` file for the cluster. 218 | - Copy the files `cluster.yml`, `cluster.rkestate`, and `kube_config_cluster.yml` to a safe location. 219 | 220 | ### Upgrading from an old Helm version 221 | 222 | #### Setting up a lab environment 223 | - Build a standard RKE cluster [Documentation](https://rancher.com/docs/rke/latest/en/installation/#deploying-kubernetes-with-rke) 224 | - Set up [helm2](https://github.com/helm/helm/releases/tag/v2.17.0) 225 | ``` 226 | kubectl -n kube-system create serviceaccount tiller 227 | kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller 228 | helm init --service-account tiller --wait 229 | helm repo add rancher-latest https://releases.rancher.com/server-charts/latest 230 | 231 | ``` 232 | - Install Rancher using helm2 233 | ``` 234 | helm install rancher-latest/rancher --name rancher \ 235 | --namespace cattle-system \ 236 | --set hostname=rancher.example.com \ 237 | --set ingress.tls.source=secret \ 238 | --version 2.3.10 239 | ``` 240 | 241 | #### Reproducing the issue 242 | - Set up [helm3](https://github.com/helm/helm/releases/tag/v3.5.0) 243 | ``` 244 | helm repo add rancher-latest https://releases.rancher.com/server-charts/latest 245 | helm repo update 246 | ``` 247 | - Try to upgrade Rancher 248 | ``` 249 | helm upgrade --install rancher rancher-latest/rancher \ 250 | --namespace cattle-system \ 251 | --set hostname=rancher.example.com \ 252 | --set ingress.tls.source=secret \ 253 | --version 2.5.5 254 | ``` 255 | - Error message 256 | ``` 257 | Release "rancher" does not exist. Installing it now. 258 | Error: rendered manifests contain a resource that already exists. 
Unable to continue with install: ServiceAccount "rancher" in namespace "cattle-system" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "rancher"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "cattle-system" 259 | ``` 260 | 261 | #### Resolution 262 | - Take an etcd snapshot 263 | ``` 264 | rke etcd snapshot-save --config cluster.yaml --name helm2-helm3-`date '+%Y%m%d%H%M%S'` 265 | ``` 266 | - Update annotations and labels for Rancher objects 267 | ``` 268 | kubectl annotate --overwrite namespace cattle-system app.kubernetes.io/managed-by=helm 269 | kubectl annotate --overwrite namespace cattle-system meta.helm.sh/release-name=rancher 270 | kubectl annotate --overwrite namespace cattle-system meta.helm.sh/release-namespace=cattle-system 271 | kubectl label --overwrite namespace cattle-system app.kubernetes.io/managed-by=Helm 272 | kubectl -n cattle-system annotate --overwrite sa rancher app.kubernetes.io/managed-by=helm 273 | kubectl -n cattle-system annotate --overwrite sa rancher meta.helm.sh/release-name=rancher 274 | kubectl -n cattle-system annotate --overwrite sa rancher meta.helm.sh/release-namespace=cattle-system 275 | kubectl -n cattle-system label --overwrite sa rancher app.kubernetes.io/managed-by=Helm 276 | kubectl -n cattle-system annotate --overwrite ClusterRoleBinding rancher app.kubernetes.io/managed-by=helm 277 | kubectl -n cattle-system annotate --overwrite ClusterRoleBinding rancher meta.helm.sh/release-name=rancher 278 | kubectl -n cattle-system annotate --overwrite ClusterRoleBinding rancher meta.helm.sh/release-namespace=cattle-system 279 | kubectl -n cattle-system label --overwrite ClusterRoleBinding rancher app.kubernetes.io/managed-by=Helm 280 | kubectl -n cattle-system annotate --overwrite 
service rancher app.kubernetes.io/managed-by=helm 281 | kubectl -n cattle-system annotate --overwrite service rancher meta.helm.sh/release-name=rancher 282 | kubectl -n cattle-system annotate --overwrite service rancher meta.helm.sh/release-namespace=cattle-system 283 | kubectl -n cattle-system label --overwrite service rancher app.kubernetes.io/managed-by=Helm 284 | kubectl -n cattle-system annotate --overwrite Deployment rancher app.kubernetes.io/managed-by=helm 285 | kubectl -n cattle-system annotate --overwrite Deployment rancher meta.helm.sh/release-name=rancher 286 | kubectl -n cattle-system annotate --overwrite Deployment rancher meta.helm.sh/release-namespace=cattle-system 287 | kubectl -n cattle-system label --overwrite Deployment rancher app.kubernetes.io/managed-by=Helm 288 | kubectl -n cattle-system annotate --overwrite Ingress rancher app.kubernetes.io/managed-by=helm 289 | kubectl -n cattle-system annotate --overwrite Ingress rancher meta.helm.sh/release-name=rancher 290 | kubectl -n cattle-system annotate --overwrite Ingress rancher meta.helm.sh/release-namespace=cattle-system 291 | kubectl -n cattle-system label --overwrite Ingress rancher app.kubernetes.io/managed-by=Helm 292 | ``` 293 | - Upgrade Rancher 294 | ``` 295 | helm upgrade --install rancher rancher-latest/rancher \ 296 | --namespace cattle-system \ 297 | --set hostname=rancher.example.com \ 298 | --set ingress.tls.source=secret \ 299 | --version 2.5.5 300 | ``` 301 | 302 | ### Upgrading with a broken node 303 | 304 | #### Setting up a lab environment 305 | - Build a standard RKE cluster [Documentation](https://rancher.com/docs/rke/latest/en/installation/#deploying-kubernetes-with-rke) 306 | - Install Rancher using helm3 307 | ``` 308 | kubectl create namespace cattle-system 309 | helm upgrade --install rancher rancher-latest/rancher \ 310 | --namespace cattle-system \ 311 | --set hostname=rancher.example.com \ 312 | --set ingress.tls.source=secret \ 313 | --set antiAffinity=required \ 314 | 
--version 2.5.5 315 | ``` 316 | 317 | #### Reproducing the issue 318 | - Run `systemctl stop docker` on one of the nodes in the cluster 319 | - Error message 320 | ``` 321 | NAME STATUS ROLES AGE VERSION 322 | mmattox-lab-c-01 Ready controlplane,etcd,worker 9m22s v1.19.7 323 | mmattox-lab-c-02 Ready controlplane,etcd,worker 9m22s v1.19.7 324 | mmattox-lab-c-03 NotReady controlplane,etcd,worker 9m22s v1.19.7 325 | ``` 326 | - `kubectl get pods -n cattle-system -o wide | grep -ve 'Running\|Completed'` 327 | ``` 328 | NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES 329 | rancher-7df6ff577b-nqjfv 0/1 Pending 0 2m11s 330 | ``` 331 | 332 | #### Resolution 333 | **NOTE** This should only be done if the node is unrecoverable, and a replacement should be added to the cluster ASAP. 334 | - Upgrading Rancher 335 | - Change replicas to 2 for the Rancher deployment 336 | ``` 337 | helm upgrade --install rancher rancher-latest/rancher \ 338 | --namespace cattle-system \ 339 | --set hostname=rancher.example.com \ 340 | --set ingress.tls.source=secret \ 341 | --set replicas=2 \ 342 | --version 2.5.5 343 | ``` 344 | - Upgrading Kubernetes 345 | - Edit cluster.yml 346 | - Comment out the bad node. **NOTE** You should only remove one node at a time. 347 | - Run `rke up` 348 | - Delete the node from the cluster if RKE left it behind 349 | `kubectl delete node mmattox-lab-c-03` 350 | -------------------------------------------------------------------------------- /slides.pptx: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mattmattox/Kubernetes-Master-Class-Upgrades/d6896246fc3084ffcfe13dab0b62cced8ac9cd70/slides.pptx --------------------------------------------------------------------------------