├── resources ├── membomb │ ├── requirements.txt │ ├── Dockerfile │ ├── pod_oom.yaml │ ├── pod_eviction.yaml │ └── app.py ├── 11_ops-techlab.png ├── images │ ├── prometheus_cmo.png │ ├── prometheus_use-cases.png │ └── prometheus_architecture.png ├── troubleshooting_cheat_sheet.md └── 11_ops-techlab.xml ├── .gitignore ├── slides.md ├── .travis.yml ├── package.json ├── labs ├── 50_backup_restore.md ├── 10_warmup.md ├── 70_upgrade.md ├── 20_installation.md ├── 40_configuration_best_practices.md ├── 60_monitoring_troubleshooting.md ├── 30_daily_business.md ├── 21_ansible_inventory.md ├── 72_upgrade_verification.md ├── 12_access_environment.md ├── 11_overview.md ├── 23_verification.md ├── 22_installation.md ├── 31_user_management.md ├── 32_update_hosts.md ├── 51_backup.md ├── 62_logs.md ├── 42_outgoing_http_proxies.md ├── 71_upgrade_openshift3.11.104.md ├── 52_restore.md ├── 34_renew_certificates.md ├── 33_persistent_storage.md ├── 61_monitoring.md ├── 41_out_of_resource_handling.md └── 35_add_new_node_and_master.md ├── theme ├── 02_corner_top_left.svg ├── 03_corner_top_left.svg ├── 04_corner_top_left.svg ├── 05_corner_top_left.svg ├── 01_corner_bottom_right.svg ├── 01_corner_top_left.svg ├── 02_corner_top_right.svg ├── 03_corner_top_right.svg ├── 04_corner_top_right.svg ├── 05_corner_top_right.svg ├── puzzle.css └── puzzle_tagline_bg_rgb.svg ├── appendices ├── 02_internet_resources.md ├── 03_aws_storage.md └── 01_prometheus.md ├── CONTRIBUTING.md ├── README.md ├── deploy.sh └── LICENSE /resources/membomb/requirements.txt: -------------------------------------------------------------------------------- 1 | psutil 2 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | node_modules/ 2 | out/ 3 | *.bak 4 | notes.md 5 | -------------------------------------------------------------------------------- /resources/11_ops-techlab.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/appuio/ops-techlab/HEAD/resources/11_ops-techlab.png -------------------------------------------------------------------------------- /resources/images/prometheus_cmo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/appuio/ops-techlab/HEAD/resources/images/prometheus_cmo.png -------------------------------------------------------------------------------- /resources/images/prometheus_use-cases.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/appuio/ops-techlab/HEAD/resources/images/prometheus_use-cases.png -------------------------------------------------------------------------------- /resources/images/prometheus_architecture.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/appuio/ops-techlab/HEAD/resources/images/prometheus_architecture.png -------------------------------------------------------------------------------- /slides.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: OpenShift Operations Techlab 3 | --- 4 | 5 | 6 | # OpenShift Operations Techlab 7 | 8 | 9 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | notifications: 2 | email: 
false 3 | branches: 4 | only: 5 | - master 6 | language: node_js 7 | node_js: 8 | - stable 9 | script: bash ./deploy.sh 10 | env: 11 | global: 12 | - COMMIT_AUTHOR_EMAIL: hello@appuio.ch 13 | -------------------------------------------------------------------------------- /resources/membomb/Dockerfile: -------------------------------------------------------------------------------- 1 | FROM centos/python-35-centos7 2 | 3 | USER root 4 | RUN yum -y update && yum clean all 5 | USER 1001 6 | 7 | ENV PYTHONUNBUFFERED=1 8 | 9 | COPY . /opt/apt-root/src 10 | 11 | RUN . /opt/app-root/etc/scl_enable && pip install -r /opt/apt-root/src/requirements.txt 12 | 13 | CMD ["/opt/apt-root/src/app.py"] 14 | -------------------------------------------------------------------------------- /package.json: -------------------------------------------------------------------------------- 1 | { 2 | "name": "openshift-developer-workshop", 3 | "version": "1.0.0", 4 | "main": "index.js", 5 | "license": "Apache-2.0", 6 | "scripts": { 7 | "start": "reveal-md slides.md --theme puzzle", 8 | "static": "reveal-md slides.md --theme puzzle --static out" 9 | }, 10 | "dependencies": { 11 | "reveal-md": "^0.1.3" 12 | } 13 | } 14 | -------------------------------------------------------------------------------- /labs/50_backup_restore.md: -------------------------------------------------------------------------------- 1 | # Lab 5: Backup and Restore 2 | 3 | Backup and restore are the main topics of this fifth chapter. 4 | 5 | 6 | ## Chapter Content 7 | 8 | * [5.1: Backup](51_backup.md) 9 | * [5.2: Restore](52_restore.md) 10 | 11 | --- 12 | 13 |

[5.1 Backup →](51_backup.md)

14 | 15 | [← back to the Labs Overview](../README.md) 16 | -------------------------------------------------------------------------------- /resources/membomb/pod_oom.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: v1 2 | kind: Pod 3 | metadata: 4 | generateName: membomb-1- 5 | labels: 6 | app: membomb 7 | spec: 8 | containers: 9 | - image: quay.io/appuio/membomb:evict 10 | imagePullPolicy: Always 11 | name: membomb 12 | env: 13 | - name: START_SIZE 14 | value: "0" 15 | - name: SIZE_INCR 16 | value: "9" 17 | restartPolicy: Never 18 | -------------------------------------------------------------------------------- /resources/membomb/pod_eviction.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: v1 2 | kind: Pod 3 | metadata: 4 | generateName: membomb-1- 5 | labels: 6 | app: membomb 7 | spec: 8 | containers: 9 | - image: quay.io/appuio/membomb:latest 10 | imagePullPolicy: Always 11 | name: membomb 12 | env: 13 | - name: START_SIZE 14 | value: "-1" 15 | - name: SIZE_INCR 16 | value: "9" 17 | restartPolicy: Never 18 | -------------------------------------------------------------------------------- /labs/10_warmup.md: -------------------------------------------------------------------------------- 1 | # Lab 1: Warmup 2 | 3 | This first chapter gives an overview of the lab environment and explains how to access it. 4 | 5 | 6 | ## Chapter Content 7 | 8 | * [1.1: Architectural Overview](11_overview.md) 9 | * [1.2: Access the Lab Environment](12_access_environment.md) 10 | 11 | --- 12 | 13 |

[1.1 Architectural Overview →](11_overview.md)

14 | 15 | [← back to the Labs Overview](../README.md) 16 | -------------------------------------------------------------------------------- /labs/70_upgrade.md: -------------------------------------------------------------------------------- 1 | # Lab 7: Upgrade OpenShift from 3.11.88 to 3.11.104 2 | 3 | In this chapter, we will do a minor upgrade from OpenShift 3.11.88 to 3.11.104. 4 | 5 | 6 | ## Chapter Content 7 | 8 | * [7.1: Upgrade OpenShift 3.11.88 to 3.11.104](71_upgrade_openshift3.11.104.md) 9 | * [7.2: Verify the Upgrade](72_upgrade_verification.md) 10 | 11 | --- 12 | 13 |

[7.1 Upgrade to OpenShift 3.11.104 →](71_upgrade_openshift3.11.104.md)

14 | 15 | [← back to the Labs Overview](../README.md) 16 | 17 | -------------------------------------------------------------------------------- /labs/20_installation.md: -------------------------------------------------------------------------------- 1 | # Lab 2: OpenShift Installation 2 | 3 | The second chapter shows how to create the Ansible inventory, install OpenShift and verify the installation. 4 | 5 | 6 | ## Chapter content 7 | 8 | * [2.1: Create the Ansible Inventory](21_ansible_inventory.md) 9 | * [2.2: Install OpenShift](22_installation.md) 10 | * [2.3: Verify the Installation](23_verification.md) 11 | 12 | --- 13 | 14 |

[2.1 Create the Ansible Inventory →](21_ansible_inventory.md)

15 | 16 | [← back to the Labs Overview](../README.md) 17 | -------------------------------------------------------------------------------- /labs/40_configuration_best_practices.md: -------------------------------------------------------------------------------- 1 | # Lab 4: Configuration Best Practices 2 | 3 | The fourth chapter gives some configuration tips and tricks based on our experience with operating OpenShift Container Platform clusters. 4 | 5 | ## Chapter Content 6 | 7 | * [4.1: Out of Resource Handling](41_out_of_resource_handling.md) 8 | * [4.2: Outgoing HTTP Proxies](42_outgoing_http_proxies.md) 9 | 10 | --- 11 | 12 |

[4.1 Out of Resource Handling →](41_out_of_resource_handling.md)

13 | 14 | [← back to the Labs Overview](../README.md) 15 | -------------------------------------------------------------------------------- /theme/02_corner_top_left.svg: -------------------------------------------------------------------------------- 1 | 02_corner_top_leftCreated with Sketch. -------------------------------------------------------------------------------- /theme/03_corner_top_left.svg: -------------------------------------------------------------------------------- 1 | 03_corner_top_leftCreated with Sketch. -------------------------------------------------------------------------------- /theme/04_corner_top_left.svg: -------------------------------------------------------------------------------- 1 | 05_corner_top_leftCreated with Sketch. -------------------------------------------------------------------------------- /theme/05_corner_top_left.svg: -------------------------------------------------------------------------------- 1 | 07_corner_top_leftCreated with Sketch. -------------------------------------------------------------------------------- /theme/01_corner_bottom_right.svg: -------------------------------------------------------------------------------- 1 | 01_corner_bottom_rightCreated with Sketch. -------------------------------------------------------------------------------- /theme/01_corner_top_left.svg: -------------------------------------------------------------------------------- 1 | 01_corner_top_leftCreated with Sketch. -------------------------------------------------------------------------------- /theme/02_corner_top_right.svg: -------------------------------------------------------------------------------- 1 | 02_corner_top_rightCreated with Sketch. -------------------------------------------------------------------------------- /theme/03_corner_top_right.svg: -------------------------------------------------------------------------------- 1 | 03_corner_top_rightCreated with Sketch. -------------------------------------------------------------------------------- /theme/04_corner_top_right.svg: -------------------------------------------------------------------------------- 1 | 05_corner_top_rightCreated with Sketch. -------------------------------------------------------------------------------- /theme/05_corner_top_right.svg: -------------------------------------------------------------------------------- 1 | 07_corner_top_rightCreated with Sketch. -------------------------------------------------------------------------------- /labs/60_monitoring_troubleshooting.md: -------------------------------------------------------------------------------- 1 | # Lab 6: Monitoring and Troubleshooting 2 | 3 | In this chapter, we are going to look at how to do monitoring and troubleshooting of OpenShift. The former gives us the knowledge if our platform is operational and if it will remain so. The latter helps us in case those health checks fail but do not provide sufficient details as to why. 4 | 5 | 6 | ## Chapter Content 7 | 8 | * [6.1: Monitoring](61_monitoring.md) 9 | * [6.2: Logs](62_logs.md) 10 | 11 | --- 12 | 13 |

[6.1 Monitoring →](61_monitoring.md)

14 | 15 | [← back to the Labs Overview](../README.md) 16 | -------------------------------------------------------------------------------- /labs/30_daily_business.md: -------------------------------------------------------------------------------- 1 | # Lab 3: Daily Business 2 | 3 | The third chapter gets into daily business tasks such as managing users, updating hosts or renewing certificates. 4 | 5 | 6 | ## Chapter Content 7 | 8 | * [3.1: Manage Users](31_user_management.md) 9 | * [3.2: Update Hosts](32_update_hosts.md) 10 | * [3.3: Persistent Storage](33_persistent_storage.md) 11 | * [3.4: Renew Certificates](34_renew_certificates.md) 12 | * [3.5: Add New OpenShift Node and Master](35_add_new_node_and_master.md) 13 | 14 | --- 15 | 16 |

[3.1 Manage Users →](31_user_management.md)

17 | 18 | [← back to the Labs Overview](../README.md) 19 | -------------------------------------------------------------------------------- /appendices/02_internet_resources.md: -------------------------------------------------------------------------------- 1 | # Appendix 2: Useful Internet Resources 2 | 3 | This appendix is a small collection of rather useful online resources containing scripts and documentation as well as Ansible roles and playbooks and more. 4 | 5 | - Additional Ansible roles and playbooks provided by the community: https://github.com/openshift/openshift-ansible-contrib 6 | - Scripts used by OpenShift Online Operations: https://github.com/openshift/openshift-tools 7 | - Red Hat Communities of Practice: https://github.com/redhat-cop 8 | - Red Hat Consulting DevOps and OpenShift Playbooks: http://v1.uncontained.io/ 9 | - APPUiO OpenShift resources: https://github.com/appuio/ 10 | - Knowledge Base: https://kb.novaordis.com/index.php/OpenShift 11 | 12 | --- 13 | 14 | [← back to the labs overview](../README.md) 15 | 16 | 17 | -------------------------------------------------------------------------------- /labs/21_ansible_inventory.md: -------------------------------------------------------------------------------- 1 | ## Lab 2.1: Create the Ansible Inventory 2 | 3 | In this lab, we will verify the Ansible inventory file, so it fits our lab cluster. The Inventory file describes, how the cluster will be built. 4 | 5 | Take a look at the prepared inventory file: 6 | ``` 7 | [ec2-user@master0 ~]$ less /etc/ansible/hosts 8 | ``` 9 | 10 | Download the default example hosts file from the OpenShift GitHub repository and compare it to the prepared inventory for the lab. 11 | ``` 12 | [ec2-user@master0 ~]$ wget https://raw.githubusercontent.com/openshift/openshift-ansible/release-3.11/inventory/hosts.example 13 | [ec2-user@master0 ~]$ vimdiff hosts.example /etc/ansible/hosts 14 | ``` 15 | --- 16 | 17 | **End of Lab 2.1** 18 | 19 |

[2.2 Install OpenShift →](22_installation.md)

20 | 21 | [← back to the Chapter Overview](20_installation.md) 22 | -------------------------------------------------------------------------------- /labs/72_upgrade_verification.md: -------------------------------------------------------------------------------- 1 | ## Lab 7.3: Verify the Upgrade 2 | 3 | Check the version of the `docker` and `atomic-openshift` packages on all nodes and master. 4 | ``` 5 | [ec2-user@master0 ~]$ ansible all -m shell -a "rpm -qi atomic-openshift | grep -i name -A1" 6 | [ec2-user@master0 ~]$ ansible masters -m shell -a "rpm -qi atomic-openshift-master | grep -i name -A1" 7 | [ec2-user@master0 ~]$ ansible all -m shell -a "rpm -qi atomic-openshift-node | grep -i name -A1" 8 | [ec2-user@master0 ~]$ ansible all -m shell -a "rpm -qi docker | grep -i name -A3" 9 | ``` 10 | 11 | Check the image version of the registry, router, metrics and logging 12 | ``` 13 | [ec2-user@master0 ~]$ oc get pod -o yaml --all-namespaces | grep -i "image:.*.openshift3" 14 | ``` 15 | 16 | Now we need to verify our installation according to chapter "[2.3 Verify OpenShift Installation](23_verification.md)". 17 | 18 | 19 | --- 20 | 21 | **End of Lab 7** 22 | 23 | [← back to the Chapter Overview](70_upgrade.md) 24 | -------------------------------------------------------------------------------- /resources/troubleshooting_cheat_sheet.md: -------------------------------------------------------------------------------- 1 | # APPUiO OpenShift Troubleshooting Cheat Sheet 2 | 3 | ## `oc` troubleshooting commands 4 | 5 | **Get status of current project and its resources:** 6 | ``` 7 | oc status -v 8 | ``` 9 | 10 | **Show events:** 11 | ``` 12 | oc get events 13 | ``` 14 | 15 | **Show a pod's logs:** 16 | ``` 17 | oc logs [-f] [-c CONTAINER] 18 | ``` 19 | 20 | **Show assembled information for a resource:** 21 | ``` 22 | oc describe RESOURCE NAME 23 | ``` 24 | 25 | **Show a resource's definition:** 26 | ``` 27 | oc get RESOURCE NAME -o yaml|json 28 | ``` 29 | 30 | **Start a debug pod:** 31 | ``` 32 | oc debug RESOURCE/NAME 33 | ``` 34 | 35 | **Collect diagnostics:** 36 | ``` 37 | oc adm diagnostics 38 | ``` 39 | 40 | 41 | ## Service logs 42 | 43 | **Show logs for a specific service:** 44 | ``` 45 | journalctl -u UNIT [-f] 46 | ``` 47 | 48 | **Important services**: 49 | - `atomic-openshift-master[-api|-controllers]` 50 | - `atomic-openshift-node` 51 | - `etcd` 52 | - `docker` 53 | - `openvswitch` 54 | - `dnsmasq` 55 | - `NetworkManager` 56 | - `iptables` 57 | 58 | -------------------------------------------------------------------------------- /labs/12_access_environment.md: -------------------------------------------------------------------------------- 1 | ## Lab 1.2: How to Access the Lab Environment 2 | 3 | In the following labs, we are going to use `user[X]` as a placeholder for the user id that was assigned to you. 4 | 5 | If you e.g. had user id 1 you would change 6 | ``` 7 | https://console.user[X].lab.openshift.ch 8 | ``` 9 | to 10 | ``` 11 | https://console.user[X].lab.openshift.ch 12 | ``` 13 | 14 | 15 | There are three main ways we will access our environment. The mentioned ports need to be open from our own location to Amazon AWS. 
16 | 17 | - **API** / **Console** 18 | - Address: https://console.user[X].lab.openshift.ch 19 | - Port: 443/tcp 20 | 21 | - **Applications** 22 | - Address: https://console.user[X].lab.openshift.ch 23 | - Ports: 80/tcp and 443/tcp 24 | 25 | - **Administration** 26 | - Address: jump.lab.openshift.ch 27 | - User: ec2-user 28 | - Port: 22/tcp 29 | - Command: 30 | ``` 31 | ssh ec2-user@jump.lab.openshift.ch 32 | ssh ec2-user@master0.user[X].lab.openshift.ch 33 | ``` 34 | --- 35 | 36 | **End of Lab 1.2** 37 | 38 |
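Optional: if your local OpenSSH is version 7.3 or newer, the two SSH hops shown above can be combined into a single command by using the jump host as a jump proxy:
```
ssh -J ec2-user@jump.lab.openshift.ch ec2-user@master0.user[X].lab.openshift.ch
```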

[2. OpenShift Installation →](20_installation.md)

39 | 40 | [← back to the Chapter Overview](10_warmup.md) 41 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # How to contribute to the APPUiO OpenShift OPStechlab 2 | 3 | :+1::tada: First off, thanks for taking the time to contribute! :tada::+1: 4 | 5 | ## Did you find a bug? 6 | 7 | * **Ensure the bug was not already reported** by searching on GitHub under [Issues](https://github.com/appuio/ops-techlab/issues). 8 | 9 | * If you're unable to find an open issue addressing the problem, [open a new one](https://github.com/appuio/ops-techlab/issues/new). Be sure to include a **title and clear description**, as much relevant information as possible, and a **code sample** or an **executable test case** demonstrating the expected behavior that is not occurring. 10 | 11 | ## Did you write a patch that fixes a bug? 12 | 13 | * Open a new GitHub pull request with the patch. 14 | 15 | * Ensure the PR description clearly describes the problem and solution. Include the relevant issue number if applicable. 16 | 17 | ## Do you intend to add a new feature or change an existing one? 18 | 19 | * **Feature Request**: open an issue on GitHub and describe your feature. 20 | 21 | * **New Feature**: Implement your Feature on a fork and create a pull request. The core team will gladly check and eventually merge your pull request. 22 | 23 | ## Do you have questions about the techlab? 24 | 25 | * Ask your question as an issue on GitHub 26 | 27 | Thanks! 28 | -------------------------------------------------------------------------------- /labs/11_overview.md: -------------------------------------------------------------------------------- 1 | ## Lab 1.1: Architectural Overview 2 | 3 | This is the environment we will build and work on. It is deployed on Amazon AWS. 4 | 5 | ![Lab OpenShift Cluster Overview](../resources/11_ops-techlab.png) 6 | 7 | Our lab installation consists of the following components: 8 | 1. Three Load Balancers 9 | 1. Application Load Balancer app[X]: Used for load balancing requests to the routers (*.app[X].lab.openshift.ch) 10 | 1. Application Load Balancer console[X]: Used for load balancing reqeusts to the master APIs (console.user[X].lab.openshift.ch) 11 | 1. Classic Load Balancer console[X]-internal: Used for internal load balancing reqeusts to the master APIs (internalconsole.user[X].lab.openshift.ch) 12 | 1. Two OpenShift masters, one will be added later 13 | 1. Two etcd, one will be added later 14 | 1. Three infra nodes, where the following components are running: 15 | 1. Container Native Storage (Gluster) 16 | 1. Routers 17 | 1. Metrics 18 | 1. Logging 19 | 1. Monitoring (Prometheus) 20 | 1. One app node, one will be added later 21 | 1. We are going to use the jump host as a bastion host (jump.lab.openshift.ch) 22 | 23 | 24 | --- 25 | 26 | **End of Lab 1.1** 27 | 28 |

[1.2 Access the Lab Environment →](12_access_environment.md)

29 | 30 | [← back to the Chapter Overview](10_warmup.md) 31 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # APPUiO OpenShift OPStechlab 2 | 3 | ## Introduction 4 | 5 | In the following guided and hands-on techlab we show the participants how to operate [OpenShift Container Platform](https://www.openshift.com/). The labs cover automated installation and upgrades of OpenShift, troubleshooting and maintenance procedures as well as backup and restore of cluster configuration and resources. 6 | 7 | 8 | ## Contribution 9 | 10 | If you find errors, bugs or missing information please help us improve our techlab and have a look at the [Contribution Guide](CONTRIBUTING.md). 11 | 12 | 13 | ## Before You Begin 14 | 15 | There's a [Troubleshooting Cheat Sheet](resources/troubleshooting_cheat_sheet.md) that might be of help if you run into errors. 16 | 17 | 18 | ## Labs 19 | 20 | 1. [Warmup](labs/10_warmup.md) 21 | 2. [OpenShift Installation](labs/20_installation.md) 22 | 3. [Daily Business](labs/30_daily_business.md) 23 | 4. [Configuration Best Practices](labs/40_configuration_best_practices.md) 24 | 5. [Backup and Restore](labs/50_backup_restore.md) 25 | 6. [Monitoring and Troubleshooting](labs/60_monitoring_troubleshooting.md) 26 | 7. [OpenShift Upgrade](labs/70_upgrade.md) 27 | 28 | 29 | ## Appendices 30 | 31 | 1. [Monitoring with Prometheus](appendices/01_prometheus.md) 32 | 2. [Useful Internet Resources](appendices/02_internet_resources.md) 33 | 3. [Using AWS EFS Storage](appendices/03_aws_storage.md) 34 | 35 | ## License 36 | 37 | This techlab is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)](LICENSE) license. 38 | -------------------------------------------------------------------------------- /resources/membomb/app.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | from __future__ import print_function 4 | import os 5 | import sys 6 | import time 7 | import signal 8 | import psutil 9 | 10 | run = True 11 | 12 | def handler_stop_signals(signum, frame): 13 | global run 14 | run = False 15 | 16 | if signum == 15: 17 | print('Received SIGTERM, shutting down!') 18 | else: 19 | print('Received SIGINT, shutting down!') 20 | 21 | signal.signal(signal.SIGINT, handler_stop_signals) 22 | signal.signal(signal.SIGTERM, handler_stop_signals) 23 | 24 | size = int(os.getenv('START_SIZE', '0')) * 1024 ** 2 25 | size_incr = int(os.getenv('SIZE_INCR', '100')) * 1024 ** 2 26 | 27 | memory_stat = {} 28 | with open('/sys/fs/cgroup/memory/memory.stat') as file: 29 | for line in file: 30 | key, value = line.split(' ') 31 | memory_stat[key] = int(value) 32 | 33 | reserved = max(psutil.virtual_memory().total - memory_stat['hierarchical_memory_limit'], 0) 34 | if reserved == 0: 35 | print("No reserved memory detected. 
Assuming 2 GiB!") 36 | reserved = 2 * 1024 ** 3 37 | 38 | if size == 0: 39 | size = psutil.virtual_memory().available 40 | elif size < 0: 41 | size = psutil.virtual_memory().available - reserved 42 | 43 | buffers = [bytearray(size)] 44 | 45 | while run: 46 | print("Allocated %d MiB, available memory: %d MiB, reserved memory: %d MiB" % (size / 1024 ** 2, psutil.virtual_memory().available / 1024 ** 2, reserved / 1024 ** 2)) 47 | time.sleep(1) 48 | size += size_incr 49 | buffers.append(bytearray(size_incr)) 50 | 51 | 52 | #size = psutil.virtual_memory().available - 2048 * 1024 ** 2 53 | #print("Allocating %d MiB" % (size / 1024 ** 2)) 54 | #buffer = bytearray(size) 55 | 56 | #while run: 57 | # time.sleep(1) 58 | -------------------------------------------------------------------------------- /deploy.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | set -e # Exit with nonzero exit code if anything fails 3 | 4 | SOURCE_BRANCH="master" 5 | TARGET_BRANCH="gh-pages" 6 | 7 | function doCompile { 8 | npm run-script static 9 | rsync -av theme out 10 | } 11 | 12 | # Pull requests and commits to other branches shouldn't try to deploy, just build to verify 13 | if [ "$TRAVIS_PULL_REQUEST" != "false" -o "$TRAVIS_BRANCH" != "$SOURCE_BRANCH" ]; then 14 | echo "Skipping deploy; just doing a build." 15 | doCompile 16 | exit 0 17 | fi 18 | 19 | # Save some useful information 20 | REPO=`git config remote.origin.url` 21 | SSH_REPO=${REPO/https:\/\/github.com\//git@github.com:} 22 | SHA=`git rev-parse --verify HEAD` 23 | 24 | # Clone the existing gh-pages for this repo into out/ 25 | # Create a new empty branch if gh-pages doesn't exist yet (should only happen on first deply) 26 | git clone $REPO out 27 | cd out 28 | git checkout $TARGET_BRANCH || git checkout --orphan $TARGET_BRANCH 29 | cd .. 30 | 31 | # Clean out existing contents 32 | rm -rf out/**/* out/css/* || exit 0 33 | 34 | # Run our compile script 35 | doCompile 36 | 37 | # Now let's go have some fun with the cloned repo 38 | cd out 39 | git config user.name "Travis CI" 40 | git config user.email "$COMMIT_AUTHOR_EMAIL" 41 | 42 | # If there are no changes to the compiled out (e.g. this is a README update) then just bail. 43 | if [ -z "$(git status --porcelain)" ]; then 44 | echo "No changes to the output on this push; exiting." 45 | exit 0 46 | fi 47 | 48 | # Commit the "changes", i.e. the new version. 49 | # The delta will show diffs between new and old versions. 50 | git add -A . 51 | git commit -m "Deploy to GitHub Pages: ${SHA}" 52 | 53 | eval `ssh-agent -s` 54 | echo "${DEPLOY_KEY}" | base64 -d | ssh-add - 55 | 56 | # Now that we're all set up, we can push. 57 | git push $SSH_REPO $TARGET_BRANCH 58 | -------------------------------------------------------------------------------- /labs/23_verification.md: -------------------------------------------------------------------------------- 1 | ## 2.3 Verify the Installation 2 | 3 | After the completion of the installation, we can verify, if everything is running as expected. Most of the checks have already been done by the playbooks. 4 | First check if the API reachable and all nodes are ready with the right tags. 
5 | ``` 6 | [ec2-user@master0 ~]$ oc get nodes --show-labels 7 | ``` 8 | 9 | Check if all pods are running and if OpenShift could deploy all needed components 10 | ``` 11 | [ec2-user@master0 ~]$ oc get pods --all-namespaces 12 | ``` 13 | 14 | Check if all pvc are bound and glusterfs runs fine 15 | ``` 16 | [ec2-user@master0 ~]$ oc get pvc --all-namespaces 17 | ``` 18 | 19 | Check the etcd health status. 20 | ``` 21 | [ec2-user@master0 ~]$ sudo -i 22 | [root@master0 ~]# source /etc/etcd/etcd.conf 23 | [root@master0 ~]# etcdctl2 cluster-health 24 | member 16682006866446bb is healthy: got healthy result from https://172.31.45.211:2379 25 | member 5c619e4b51953519 is healthy: got healthy result from https://172.31.44.160:2379 26 | cluster is healthy 27 | 28 | [root@master0 ~]# etcdctl2 member list 29 | 16682006866446bb: name=master1.user7.lab.openshift.ch peerURLs=https://172.31.45.211:2380 clientURLs=https://172.31.45.211:2379 isLeader=false 30 | 5c619e4b51953519: name=master0.user7.lab.openshift.ch peerURLs=https://172.31.44.160:2380 clientURLs=https://172.31.44.160:2379 isLeader=true 31 | ``` 32 | 33 | Create a project, run a build, push/pull from the internal registry and deploy a test application. 34 | ``` 35 | [ec2-user@master0 ~]$ oc new-project dakota 36 | [ec2-user@master0 ~]$ oc new-app centos/ruby-22-centos7~https://github.com/openshift/ruby-ex.git 37 | [ec2-user@master0 ~]$ oc get pods -w 38 | ``` 39 | We keep this project so we have at least one pod running on OpenShift. If you decide to create other projects/pods you may delete this project with `oc delete project dakota`. 40 | 41 | **End of Lab 2.3** 42 | 43 |

[3. Daily Business →](30_daily_business.md)

44 | 45 | [← back to the Chapter Overview](20_installation.md) 46 | -------------------------------------------------------------------------------- /labs/22_installation.md: -------------------------------------------------------------------------------- 1 | ## Lab 2.2: Install OpenShift 2 | 3 | In the previous lab we prepared the Ansible inventory to fit our test lab environment. Now we can prepare and run the installation. 4 | 5 | To make sure the playbook keeps on running even if there are network issues or something similar, we strongly encourage you to e.g. use `screen` or `tmux`. 6 | 7 | Now we run the prepare_hosts_for_ose.yml playbook. This will do the following: 8 | - Install the prerequisite packages: wget git net-tools bind-utils iptables-services bridge-utils bash-completion kexec-tools sos psacct 9 | - Enable Ansible ssh pipelining (performance improvements for Ansible) 10 | - Set timezone 11 | - Ensure hostname is preserved in cloud-init 12 | - Set default passwords 13 | - Install oc clients for various platforms on all master 14 | 15 | ``` 16 | [ec2-user@master0 ~]$ ansible-playbook /home/ec2-user/resource/prepare_hosts_for_ose.yml 17 | ``` 18 | 19 | Run the installation 20 | 1. Install OpenShift. This takes a while, get a coffee. 21 | ``` 22 | [ec2-user@master0 ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/prerequisites.yml 23 | [ec2-user@master0 ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml 24 | ``` 25 | 26 | 2. Add the cluster-admin role to the "sheriff" user. 27 | ``` 28 | [ec2-user@master0 ~]$ oc adm policy --as system:admin add-cluster-role-to-user cluster-admin sheriff 29 | ``` 30 | 31 | 3. Now open your browser and access the master API with the user "sheriff": 32 | ``` 33 | https://console.user[X].lab.openshift.ch/console/ 34 | ``` 35 | Password is documented in the Ansible inventory: 36 | ``` 37 | [ec2-user@master0 ~]$ grep keepass /etc/ansible/hosts 38 | ``` 39 | 40 | 4. Deploy the APPUiO openshift-client-distributor. This provides the correct oc client in a Pod and can then be obtained via the OpenShift GUI. For this to work, the Masters must have the package `atomic-openshift-clients-redistributable` installed. In addition the variable `openshift_web_console_extension_script_urls` must be defined in the inventory. 41 | ``` 42 | [ec2-user@master0 ~]$ grep openshift_web_console_extension_script_urls /etc/ansible/hosts 43 | openshift_web_console_extension_script_urls=["https://client.app1.lab.openshift.ch/cli-download-customization.js"] 44 | [ec2-user@master0 ~]$ ansible masters -m shell -a "rpm -qi atomic-openshift-clients-redistributable" 45 | ``` 46 | 47 | Deploy the openshift-client-distributor. 48 | ``` 49 | [ec2-user@master0 ~]$ sudo yum install python-openshift 50 | [ec2-user@master0 ~]$ git clone https://github.com/appuio/openshift-client-distributor 51 | [ec2-user@master0 ~]$ cd openshift-client-distributor 52 | [ec2-user@master0 ~]$ ansible-playbook playbook.yml -e 'openshift_client_distributor_hostname=client.app[X].lab.openshift.ch' 53 | ``` 54 | 55 | 5. You can now download the client binary and use it from your local workstation. The binary is available for Linux, macOS and Windows. (optional) 56 | ``` 57 | https://console.user[X].lab.openshift.ch/console/command-line 58 | ``` 59 | 60 | --- 61 | 62 | **End of Lab 2.2** 63 | 64 |
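Since the installation playbooks run for a long time, here is a minimal example of the `tmux` approach recommended at the beginning of this lab (`screen` works just as well; the session name is arbitrary):
```
[ec2-user@master0 ~]$ tmux new -s install
# inside the tmux session:
[ec2-user@master0 ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml
# detach with Ctrl-b d, reattach later (e.g. after a dropped connection) with:
[ec2-user@master0 ~]$ tmux attach -t install
```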

[2.3 Verify the Installation →](23_verification.md)

65 | 66 | [← back to the Chapter Overview](20_installation.md) 67 | -------------------------------------------------------------------------------- /resources/11_ops-techlab.xml: -------------------------------------------------------------------------------- 1 | 7V1tc9o6Fv41zOzuTBhLsiz7Y0jabmebTudmX7L7zWABvjWYa0yb3F+/kt/AOjIQkB0TmkwbLNsC9JzznKNzjuQBuVs8f0r81fwhDng0wFbwPCD3A4yxRWzxR7a85C0IOUXLLAmDom3b8Bj+yYtGq2jdhAFf1y5M4zhKw1W9cRIvl3yS1tr8JIl/1i+bxlH9XVf+jIOGx4kfwdb/hEE6L1odam9P/J2Hs3nx1pgQJz8z9iffZ0m8WRZvOMBkmv3kpxd+2VnxTddzP4h/7jSRDwNyl8Rxmr9aPN/xSI5uOW75fR8bzlYfPOHL9JgbXJzf8cOPNsWX36x5UvSzTl/KIcm+E5d3WQMy+jkPU/648ify7E8hBaJtni4icYTEy3WaxN+roRNfajQNo+gujuIk641MvSmTnY2m8TIt8EdUHPtROFuKg4hP06qjnRuJQzwib/zBkzQUiN0WN6Sx/Azw6xffRF7On3eaiuH4xOMFT5MXcUkpvpjkt5TCi2l+/HMrCa5btM13hICiotEvpG9W9b0FQLwoMGjAgwA8BtiJ5FgE4Q/ZefF9ResfGyklo2yoqiPxaib/ShSfylvFu2Z356daxpZnP73EFiHkHgQXM9wSuDZUNvQsjv/97a5EapyUKBF55nEzXvIUnoQtWUf3Xx/l547XqUATW/+Ll1xzbyY8/kKOadGZtd6Mg3jhh0vJ2jziMz8N46VJQUGKoCzlZxMfIPGDUACqNB8QndYExLbq4kEsIB6IMAvKh+2458uH5wD5+BpLQ3gWDoG/nmfXqhBIHqby9wRddbKfdnWV1HXVdqCuIlejq5VSn4UFBVg8+EKvkhbRCCh3A/sENFw8Jh2jgW1yJBoEG0DDBmh8Xk4T/6Zl/Qh87k4np+jHxOXjaaeIEHakfniWAT+lWTtUe/PAF7HonNyK6+1PGtN19+1fxVnx6jrVC5O64cEltrtOZ2li6trFzgeznHY1adc+RN29iOIcUfX8fbj+Xlxgyfutv9x9ffzrVaqxirxOi/XIWwbUGCGA/GnTjbYkRTq1DcLS5BObnOxcqO+kCpXOddILlQnXCc5yFpltsIYGQgtNnK5OJMQ5jwUWY3sAmojrebKX8eFs1wA8xFPYntpDje3GmnmGY0DpoS+V44N+4VMMMsElHm+DEJx7hJk9XgqWbUWLSvuoQYlx3+HWSShV1rMTLSJupxjBubq/WrWG0EFjsweF0txowB1nv10hZDPUJUJsjxa1wnWHvMwjNGWfArbEdbbCdR3rEYJk91u8keNkVH/cCZ80MJwY33GmIK915MYutallBgdqIRUFgAHVQGDCYUOQy37js3BddnO1INjY0qiC0xYMkLAeRB/h5MyYSKcotEJSzLbruCCvS1w8gAsAZD33V/LlNOLPtzIDLb44XwbFy/tJ5K/X4WQ/LnmeGI49fw7TpwJq+fq/8vUQa2BpHGoe1BLecKAPsEzZlvDIT8MfvNa5bmyLd/gWh+KTbNMrrupYU1ppWNnNOt4kE17cuUUJdOY4oDPbGzKlt9RPZjwFvWWYV9//uGktjJJlJQOblWiU8Q8gFEK60zrkCV+Hf/rj7AIJ50p+ruyT0tGA3ut8BTVIsAiDQN4/ivwxj0ZVvcExkqSRj1K2VV2siiuKTzvYrU/Q6ag1RK7Lanjc2Eak5garQCOqCk08na752QjDyMWdzKheN/lSV+MhHke+9vnkq/Hif5HvSWrE1EizjU+lXpepXSG1K4O8C3NNW97Ns06iC55Ogksk4Eq++0zAxKvTr6LXRsiX6JJQYRr6WW0hX0Xxy4IXwmSMjZtwOjxjt7KftjhXTMtrI+4w3OWcnMC00Dc/kRIRT2XX4p8QKKl34u/4jSCp5WY6xod4b4sPDOF/eF5F/lJftgVpsD5cZdEVLM86ngx1KNfl4Ahn9NVq4iow2DoYbJ1rYgIF6JqUIeA2wovdh4BrGmYALphX6TgiTKArkae+8LtIfZnGy3Xqbl7HWTAC4y6/pmOC+48J17c1HSu5dGf85fTmsTiMk3Qez+KlH33Ytip2YAeNanKVT6iq+VU2fxOfbPecPC5P/s7T9KWAxN+ksWjavvOXWJZR5GjvR64RjXweNKgFBfIJTb3t4MzuaMf76PGHlv8asyVqQLjLbAmDZl/M1NZxxHMjMhQu8TBe8eV6Hk7T4WR+HjSH19JMp3hySnFbIICkjqEgkcJKZYBgd10F0gBCDNgJBq3634bCE7tOJFxW94kZYjqjrTMQRsDoLFly1FDvmJGaFUH5W5ZgkuJsuRbRGRLpEpcDhrPTzbBt7Zhqxo4yMaUA75oYhI61Md1EDz1U9wUZOTV26KqxQ6+92CEisLpAvv9HWy5f+Ii8ct3mZQUNKyU7O2h4I6OGFNcQuUHnSYzRsKCrq1DuH59YdT5BKqHYZ7PJ3jldjU2Yjk3sXrEJspSVdk5ZT/daOkEWVpbBYGfoUoZsnP/vtcYtFEas7XJV1GXxidvkZZzCJ5RRZalYn+gE+oq7zvvTVbqMwtdQfMYOvXcXOoyZ936lUGBan9k2uO9Us3jDBBoe5LS3NbeVSXVVizo830NHBy26OLh/rh29DF7h2btaz572yxYjptpipY+jbbFN6z1VPbdhfaFjeKHWt9I5E9bXsZhSSNwj6+v1zZlvdtjp0HW6iwCURrDOE06/eIJYSrbVwSfyhBD5XR+d1CXWduzh1n93qbKzhsnoAAzlvoPogGfSm8cW6xWDQMRuv30GKJ2T/OBOYAX8BD/Q8xzkygs1WZOAMuIFZnxEolhZUq5a6CD54cH5050YpiSOIsM5qN7DYFvuQRh0a6yNwAAnTVenBo4SkRaDopkptbZmyoIzpWvVBMehSl2PHou2tAFZ0K+8LHUwXrmj1np3aCKQhd+HZhgHxaUHQWlPReDGktqS+f7CYcJku+ggAu2pBXRdf98sVlcZ5qx2OWoOOCMdECZCnFVmfbd6+sso32JIlo1cAwBqOt+xdGFm12kLAljGdltCUORjxOeXU3Hx/92X0U7zTSgn0Uu5LOT9w0Qdde0dcjwAk7ba0wxOsHxCxelJj9PTVeEEMmgdwwQ3cNhltKergKDaJPSNKM0Fo2ykMprtSVgxUNPWZWV0xQ6nZbeM10aj7jZqaLVWiA6FLBlLbZ6Tvqzs9C7C/aoksm06JDs/dWvp4dpJdbPU47eaAGYY20N3T88mkxRQqi800YnM7TZxYw2Jo0wi+pSnqBZF9Y+L2D4y8gAXWaeVLuLXkJG8bntKHpxPU7hXNKVhELqfQY7lJmbhWj9KIN
bpjKdsKPGXylOV8hrhKaosFj+zvFqp0jZMW7rnz/SCtvYt31BZyx563tke1CvoB0P6Yb2iH8/2ho7NLOq6mFCsCCSz8VBcghArrjmNiTzhE+32ArdIKN49u0ap9zbpMMFs0Duo6tiqpglSsjxSny33atEH0uxRfQE0BEtUrQ5XkWlpqF+lpgcYwgwNMWvfm9jC7apoUF7VHg3h90lDpWoaqS5zSpT76Q3BuGq+X8hTtlpEYCAwucUSCBXKy9o5pJFgXhMCV5aAY1ezbqEkJOO7SGv2CZQb8bQM1CXumw+AsjVZ8BaBghWEAChkGqiL3D4fAuXBp0K1CFTfoumXvRRIny3BPavxx8qWT6hcG/76Gn918yirxTp+yCkXG3oyGSLHrEx39mNap9nj8x1TSm0bK1QjlDwIfyRvODre6Fd8CVUzTfUBhq/nDavOG8RqcXcQGO18D1M0Yix8LSjEo8qewr0KFNmwQKknjGIsk9atl2Jpdg/5sfj6Nf3+4Z9/jO4eVp8+/2OFRvYNcvrluri4HtCsnmrzWgYCi5ip4k2bzJnBAINGgC+BcypFvLqUWekhX3VlPVP28+l4MRa9+NUNphebeI7qRnW41IHC9M320Xut7LV8iY/eU7dXJi7V6EyLD9779eS9TFPUDTO0Gyq3piswdPElns3C5eyCUGhFPTxlI9NuH8ZnwyKCX5uPCydbpazungSFKHSWAR79KwroYW1SZZ97MnVDFmZKdaRzxsOpEHjKRKtbzNJ3uYnMVtuuoNyI9i3VlTY/+6CVlSP6RUYdVmuX4eKeMJKLlGDSqeFspjzIC3vtEZEDsyuXmgWjJrNgvV4o0hBbhVCWy2nfxbMmNLgC1miONTmHnzVBNVURJpZ5NsDVt30NT1t4aCg/0WgeDmcnFPo/aEv2a49mB/S3MiiOp6z+QYowHu3dWl69I7e17ETD4OJLtjJHG5S66O1T+32mp9d7oTd8KRhK15mei92e+yzTg4hdDxb2wPb0balE+2V6+w3MyalxQ8aH6IwPfUvjgzBR0qmnWh8khHtIPMZw8atYtXKxRWfGCKYwrtYYNfg3W2PU663B99eYKHumnWdqerLa4ixDRLx6IL7KgHeROWzACsZBe2KI2ivF6rcl0hSO4re0Q7ZaXH6qGQIdtfeciYahhVlbjbRfhdlhh8zOJRVzNXzHI9ID51ggdit/Dc1Qqu13q23QNIZBYxfUOMIRhkEcJnGc7o62+Krzhzjg8or/Aw== -------------------------------------------------------------------------------- /labs/31_user_management.md: -------------------------------------------------------------------------------- 1 | ## Lab 3.1: Manage Users 2 | 3 | ### OpenShift Authorization 4 | 5 | Before you begin with this lab, make sure you roughly understand the authorization concept of OpenShift. 6 | [Authorization](https://docs.openshift.com/container-platform/3.6/architecture/additional_concepts/authorization.html) 7 | 8 | ### Add User to Project 9 | 10 | First we create a user and give him the admin role in the openshift-infra project. 11 | Login to the master and create the local user with ansible on all masters (replace ```[password]```): 12 | ``` 13 | [ec2-user@master0 ~]$ ansible masters -a "htpasswd -b /etc/origin/master/htpasswd cowboy [password]" 14 | ``` 15 | 16 | Add the admin role to the newly created user, but only for the project `openshift-infra`: 17 | ``` 18 | [ec2-user@master0 ~]$ oc adm policy add-role-to-user admin cowboy -n openshift-infra 19 | ``` 20 | 21 | Now login with the new user from your client and check if you see the `openshift-infra` project: 22 | ``` 23 | [localuser@localhost ~]$ oc login https://console.user[X].lab.openshift.ch 24 | Username: cowboy 25 | Password: 26 | Login successful. 27 | 28 | You have one project on this server: "openshift-infra" 29 | 30 | Using project "openshift-infra". 31 | ``` 32 | 33 | ### Add Cluster Role to User 34 | 35 | In order to keep things clean, we delete the created rolebinding for the `openshift-infra` project again and give the user "cowboy" the global "cluster-admin" role. 
36 | 37 | Login as "sheriff": 38 | ``` 39 | [ec2-user@master0 ~]$ oc login -u sheriff 40 | ``` 41 | 42 | Add the cluster-admin role to the created user: 43 | ``` 44 | [ec2-user@master0 ~]$ oc adm policy remove-role-from-user admin cowboy -n openshift-infra 45 | role "admin" removed: "cowboy" 46 | [ec2-user@master0 ~]$ oc adm policy add-cluster-role-to-user cluster-admin cowboy 47 | cluster role "cluster-admin" added: "cowboy" 48 | ``` 49 | 50 | Now you can try to login from your client with user "cowboy" and check if you see all projects: 51 | ``` 52 | [localuser@localhost ~]$ oc login https://console.user[X].lab.openshift.ch 53 | Authentication required for https://console.user[X].lab.openshift.ch (openshift) 54 | Username: cowboy 55 | Password: 56 | Login successful. 57 | 58 | You have access to the following projects and can switch between them with 'oc project ': 59 | 60 | appuio-infra 61 | default 62 | kube-public 63 | kube-system 64 | logging 65 | management-infra 66 | openshift 67 | * openshift-infra 68 | 69 | Using project "openshift-infra". 70 | ``` 71 | 72 | 73 | ### Create Group and Add User 74 | 75 | Instead of giving privileges to single users, we can also create a group and assign a role to that group. 76 | 77 | Groups can be created manually or synchronized from an LDAP directory. So let's first create a local group manually and add the user "cowboy" to it: 78 | ``` 79 | [ec2-user@master0 ~]$ oc login -u sheriff 80 | 81 | [ec2-user@master0 ~]$ oc adm groups new deputy-sheriffs cowboy 82 | group.user.openshift.io/deputy-sheriffs created 83 | [ec2-user@master0 ~]$ oc get groups 84 | NAME USERS 85 | deputy-sheriffs cowboy 86 | ``` 87 | 88 | Add the cluster-role to the group "deputy-sheriffs": 89 | ``` 90 | [ec2-user@master0 ~]$ oc adm policy add-cluster-role-to-group cluster-admin deputy-sheriffs 91 | cluster role "cluster-admin" added: "deputy-sheriffs" 92 | ``` 93 | 94 | Verify that the group has been added to the cluster-admins: 95 | ``` 96 | [ec2-user@master0 ~]$ oc get clusterrolebindings | grep cluster-admin 97 | cluster-admin /cluster-admin sheriff, cowboy system:masters, deputy-sheriffs 98 | ``` 99 | 100 | 101 | ### Evaluate Authorizations 102 | 103 | It's possible to evaluate authorizations. 
This can be done with the following pattern: 104 | ``` 105 | oc policy who-can VERB RESOURCE_NAME 106 | ``` 107 | 108 | Examples: 109 | Who can delete the `openshift-infra` project: 110 | ``` 111 | oc policy who-can delete project -n openshift-infra 112 | ``` 113 | 114 | Who can create configmaps in the `default` project: 115 | ``` 116 | oc policy who-can create configmaps -n default 117 | ``` 118 | 119 | You can also get a description of all available clusterroles and clusterrolebinding with the following oc command: 120 | ``` 121 | [ec2-user@master0 ~]$ oc describe clusterrole.rbac 122 | ``` 123 | 124 | ``` 125 | [ec2-user@master0 ~]$ oc describe clusterrolebinding.rbac 126 | ``` 127 | https://docs.openshift.com/container-platform/3.11/admin_guide/manage_rbac.html 128 | 129 | ### Cleanup 130 | 131 | Delete the group, entity and user: 132 | ``` 133 | [ec2-user@master0 ~]$ oc get group 134 | [ec2-user@master0 ~]$ oc delete group deputy-sheriffs 135 | 136 | [ec2-user@master0 ~]$ oc get user 137 | [ec2-user@master0 ~]$ oc delete user cowboy 138 | 139 | [ec2-user@master0 ~]$ oc get identity 140 | [ec2-user@master0 ~]$ oc delete identity htpasswd_auth:cowboy 141 | 142 | [ec2-user@master0 ~]$ ansible masters -a "htpasswd -D /etc/origin/master/htpasswd cowboy" 143 | ``` 144 | 145 | --- 146 | 147 | **End of Lab 3.1** 148 | 149 |
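Besides `oc policy who-can`, a single permission can also be checked from the perspective of the currently logged in user. A small optional example; the command simply answers with `yes` or `no`:
```
[ec2-user@master0 ~]$ oc auth can-i delete project -n openshift-infra
```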

[3.2 Update Hosts →](32_update_hosts.md)

150 | 151 | [← back to the Chapter Overview](30_daily_business.md) 152 | -------------------------------------------------------------------------------- /labs/32_update_hosts.md: -------------------------------------------------------------------------------- 1 | ## Lab 3.2: Update Hosts 2 | 3 | ### OpenShift Excluder 4 | In this lab we take a look at the OpenShift excluders, apply OS updates to all nodes, drain, reboot and schedule them again. 5 | 6 | The config playbook we use to install and configure OpenShift removes yum excludes for specific packages at its beginning. Likewise it adds them back at the end of the playbook run. This makes it possible to update OpenShift-specific packages during a playbook run but freeze these package versions during e.g. a `yum update`. 7 | 8 | First, let's check if the excludes have been set on all nodes. Connect to the first master and run: 9 | ``` 10 | [ec2-user@master0 ~]$ ansible nodes -m shell -a "atomic-openshift-excluder status && atomic-openshift-docker-excluder status" 11 | ... 12 | app-node0.user[X].lab.openshift.ch | SUCCESS | rc=0 >> 13 | exclude -- All packages excluded 14 | exclude -- All packages excluded 15 | ... 16 | ``` 17 | 18 | These excludes are set by using the OpenShift Ansible playbooks or when using the command `atomic-openshift-excluder` or `atomic-openshift-docker-excluder`. For demonstration purposes, we will now remove and set these excludes again. 19 | 20 | ``` 21 | [ec2-user@master0 ~]$ ansible nodes -m shell -a "atomic-openshift-excluder unexclude && atomic-openshift-docker-excluder unexclude" 22 | [ec2-user@master0 ~]$ ansible nodes -m shell -a "atomic-openshift-excluder status && atomic-openshift-docker-excluder status" 23 | [ec2-user@master0 ~]$ ansible nodes -m shell -a "atomic-openshift-excluder exclude && atomic-openshift-docker-excluder exclude" 24 | [ec2-user@master0 ~]$ ansible nodes -m shell -a "atomic-openshift-excluder status && atomic-openshift-docker-excluder status" 25 | ``` 26 | 27 | 28 | ### Apply OS Patches to Masters and Nodes 29 | 30 | If you don't know if you're cluster-admin or not. 31 | Query all users with rolebindings=cluster-admin: 32 | ``` 33 | oc get clusterrolebinding -o json | jq '.items[] | select(.metadata.name | startswith("cluster-admin")) | .userNames' 34 | ``` 35 | 36 | Hint: root on master-node always is system:admin (don't use it for ansible-tasks). But you're able to grant permissions to other users. 37 | 38 | First, login as cluster-admin and drain the first app-node (this deletes all pods so the OpenShift scheduler creates them on other nodes and also disables scheduling of new pods on the node). 39 | ``` 40 | [ec2-user@master0 ~]$ oc get nodes 41 | [ec2-user@master0 ~]$ oc adm drain app-node0.user[X].lab.openshift.ch --ignore-daemonsets --delete-local-data 42 | ``` 43 | 44 | After draining a node, only pods from DaemonSets should remain on the node: 45 | ``` 46 | [ec2-user@master0 ~]$ oc adm manage-node app-node0.user[X].lab.openshift.ch --list-pods 47 | 48 | Listing matched pods on node: app-node0.user[X].lab.openshift.ch 49 | 50 | NAMESPACE NAME READY STATUS RESTARTS AGE 51 | openshift-logging logging-fluentd-lfjnc 1/1 Running 0 33m 52 | openshift-monitoring node-exporter-czhr2 2/2 Running 0 36m 53 | openshift-node sync-rhh8z 1/1 Running 0 46m 54 | openshift-sdn ovs-hz9wj 1/1 Running 0 46m 55 | openshift-sdn sdn-49tpr 1/1 Running 0 46m 56 | ``` 57 | 58 | Scheduling should now be disabled for this node: 59 | ``` 60 | [ec2-user@master0 ~]$ oc get nodes 61 | ... 
62 | app-node0.user[X].lab.openshift.ch Ready,SchedulingDisabled compute 2d v1.11.0+d4cacc0 63 | ... 64 | 65 | ``` 66 | 67 | If everything looks good, you can update the node and reboot it. The first command can take a while and doesn't output anything until it's done: 68 | ``` 69 | [ec2-user@master0 ~]$ ansible app-node0.user[X].lab.openshift.ch -m yum -a "name='*' state=latest exclude='atomic-openshift-* openshift-* docker-*'" 70 | [ec2-user@master0 ~]$ ansible app-node0.user[X].lab.openshift.ch --poll=0 --background=1 -m shell -a 'sleep 2 && reboot' 71 | ``` 72 | 73 | After the node becomes ready again, enable schedulable anew. Do not do this before the node has rebooted (it takes a while for the node's status to change to `Not Ready`): 74 | ``` 75 | [ec2-user@master0 ~]$ oc get nodes -w 76 | [ec2-user@master0 ~]$ oc adm manage-node app-node0.user[X].lab.openshift.ch --schedulable 77 | ``` 78 | 79 | Check that pods are correctly starting: 80 | ``` 81 | [ec2-user@master0 ~]$ oc adm manage-node app-node0.user[X].lab.openshift.ch --list-pods 82 | 83 | Listing matched pods on node: app-node0.user[X].lab.openshift.ch 84 | 85 | NAMESPACE NAME READY STATUS RESTARTS AGE 86 | dakota ruby-ex-1-6lc87 1/1 Running 0 12m 87 | openshift-logging logging-fluentd-lfjnc 1/1 Running 1 43m 88 | openshift-monitoring node-exporter-czhr2 2/2 Running 2 47m 89 | openshift-node sync-rhh8z 1/1 Running 1 56m 90 | openshift-sdn ovs-hz9wj 1/1 Running 1 56m 91 | openshift-sdn sdn-49tpr 1/1 Running 1 56m 92 | ``` 93 | 94 | Since we want to update the whole cluster, **you will need to repeat these steps on all servers**. Masters do not need to be drained because they do not run any pods (unschedulable by default). 95 | 96 | --- 97 | 98 | **End of Lab 3.2** 99 | 100 |

[3.3 Persistent Storage →](33_persistent_storage.md)

101 | 102 | [← back to the Chapter Overview](30_daily_business.md) 103 | -------------------------------------------------------------------------------- /labs/51_backup.md: -------------------------------------------------------------------------------- 1 | ## Lab 5.1: Backup 2 | 3 | In this techlab you will learn how to create a new backup and which files are important. The following items should be backuped: 4 | 5 | - Cluster data files 6 | - etcd data on each master 7 | - API objects (stored in etcd, but it's a good idea to regularly export all objects) 8 | - Docker registry storage 9 | - PV storage 10 | - Certificates 11 | - Ansible hosts file 12 | 13 | 14 | ### Lab 5.1.1: Master Backup Files 15 | 16 | The following files should be backuped on all masters: 17 | 18 | - Ansible inventory file (contains information about the cluster): `/etc/ansible/hosts` 19 | - Configuration files (for the master), certificates and htpasswd: `/etc/origin/master/` 20 | - Docker configurations: `/etc/sysconfig/docker` `/etc/sysconfig/docker-network` `/etc/sysconfig/docker-storage` 21 | 22 | ### Lab 5.1.2: Node Backup Files 23 | 24 | Backup the following folders on all nodes: 25 | 26 | - Node Configuration files: `/etc/origin/node/` 27 | - Certificates for the docker-registry: `/etc/docker/certs.d/` 28 | - Docker configurations: `/etc/sysconfig/docker` `/etc/sysconfig/docker-network` `/etc/sysconfig/docker-storage` 29 | 30 | ### Lab 5.1.3: Application Backup 31 | 32 | To backup the data in persistent volumes, you should mount them somewhere. If you mount a Glusterfs volume, it is guaranteed to be consistent. The bricks directly on the Glusterfs servers can contain small inconsistencies that Glusterfs hasn't synced to the other instances yet. 33 | 34 | 35 | ### Lab 5.1.4: Project Backup 36 | 37 | It is advisable to regularly backup all project data. 38 | We will set up a cronjob in a project called "project-backup" which hourly writes all resources on OpenShift to a PV. 
39 | Let's gather the backup-script: 40 | ``` 41 | [ec2-user@master0 ~]$ sudo yum install git python-openshift -y 42 | [ec2-user@master0 ~]$ git clone https://github.com/mabegglen/openshift-project-backup 43 | ``` 44 | Now we create the cronjob on the first master: 45 | ``` 46 | [ec2-user@master0 ~]$ cd openshift-project-backup 47 | [ec2-user@master0 ~]$ ansible-playbook playbook.yml \ 48 | -e openshift_project_backup_job_name="cronjob-project-backup" \ 49 | -e "openshift_project_backup_schedule=\"0 6,18 * * *\"" \ 50 | -e openshift_project_backup_job_service_account="project-backup" \ 51 | -e openshift_project_backup_namespace="project-backup" \ 52 | -e openshift_project_backup_image="registry.access.redhat.com/openshift3/jenkins-slave-base-rhel7" \ 53 | -e openshift_project_backup_image_tag="v3.11" \ 54 | -e openshift_project_backup_storage_size="1G" \ 55 | -e openshift_project_backup_deadline="3600" \ 56 | -e openshift_project_backup_cronjob_api="batch/v1beta1" 57 | ``` 58 | Details https://github.com/mabegglen/openshift-project-backup 59 | 60 | If you want to reschedule your backup-job to check it's functionality to every 1minute: 61 | 62 | Change the value of schedule: to "*/1 * * * *" 63 | ``` 64 | [ec2-user@master0 ~]$ oc project project-backup 65 | [ec2-user@master0 ~]$ oc get cronjob 66 | [ec2-user@master0 ~]$ oc edit cronjob cronjob-project-backup 67 | ``` 68 | 69 | Show if cronjob is active: 70 | ``` 71 | [ec2-user@master0 openshift-project-backup]$ oc get cronjob 72 | NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE 73 | cronjob-project-backup */1 * * * * False 1 1m 48m 74 | ``` 75 | 76 | Show if backup-pod was launched: 77 | ``` 78 | [ec2-user@master0 openshift-project-backup]$ oc get pods 79 | NAME READY STATUS RESTARTS AGE 80 | cronjob-project-backup-1561384620-kjm6v 1/1 Running 0 47s 81 | 82 | ``` 83 | 84 | Check the logfiles while backup-job is running: 85 | ``` 86 | [ec2-user@master0 openshift-project-backup]$ oc logs -f 87 | ``` 88 | When your Backupjob runs as expected, don't forget to set up the cronjob back to "0 22 * * *" for example. 89 | ``` 90 | [ec2-user@master0 ~]$ oc edit cronjob cronjob-project-backup 91 | ``` 92 | If you wanna Restore a project, proceed to [Lab 5.2.1](52_restore.md#5.2.1) 93 | 94 | 95 | ### Lab 5.1.5: Create etcd Backup 96 | We plan to create a Backup of our etcd. When we've created our backup, we wan't to restore them on master1/master2 and scale out from 1 to 3 nodes. 97 | 98 | First we create a snapshot of our etcd cluster: 99 | ``` 100 | [root@master0 ~]# export ETCD_POD_MANIFEST="/etc/origin/node/pods/etcd.yaml" 101 | [root@master0 ~]# export ETCD_EP=$(grep https ${ETCD_POD_MANIFEST} | cut -d '/' -f3) 102 | [root@master0 ~]# export ETCD_POD=$(oc get pods -n kube-system | grep -o -m 1 '\S*etcd\S*') 103 | [root@master0 ~]# oc project kube-system 104 | Now using project "kube-system" on server "https://internalconsole.user[x].lab.openshift.ch:443". 105 | [root@master0 ~]# oc exec ${ETCD_POD} -c etcd -- /bin/bash -c "ETCDCTL_API=3 etcdctl \ 106 | --cert /etc/etcd/peer.crt \ 107 | --key /etc/etcd/peer.key \ 108 | --cacert /etc/etcd/ca.crt \ 109 | --endpoints $ETCD_EP \ 110 | snapshot save /var/lib/etcd/snapshot.db" 111 | 112 | Snapshot saved at /var/lib/etcd/snapshot.db 113 | ``` 114 | Check Filesize of the snapshot created: 115 | ``` 116 | [root@master0 ~]# ls -hl /var/lib/etcd/snapshot.db 117 | -rw-r--r--. 
1 root root 21M Jun 24 16:44 /var/lib/etcd/snapshot.db 118 | ``` 119 | 120 | copy them to the tmp directory for further use: 121 | ``` 122 | [root@master0 ~]# cp /var/lib/etcd/snapshot.db /tmp/snapshot.db 123 | [root@master0 ~]# cp /var/lib/etcd/member/snap/db /tmp/db 124 | ``` 125 | If you wanna Restore an etcd, proceed to [Lab 5.2.2](52_restore.md#5.2.2) 126 | 127 | --- 128 | 129 | **End of Lab 5.1** 130 | 131 |
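The configuration files listed in Lab 5.1.1 and 5.1.2 can be collected with simple ad-hoc archive commands. The following is only a sketch: it writes the archives to `/tmp` (use a persistent, off-host location for real backups) and assumes that ad-hoc Ansible commands run with the necessary privileges, as in the previous labs:
```
[ec2-user@master0 ~]$ ansible masters -m shell -a 'tar czf /tmp/master-config-$(date +%F).tar.gz /etc/origin/master /etc/sysconfig/docker /etc/sysconfig/docker-network /etc/sysconfig/docker-storage'
[ec2-user@master0 ~]$ ansible nodes -m shell -a 'tar czf /tmp/node-config-$(date +%F).tar.gz /etc/origin/node /etc/docker/certs.d /etc/sysconfig/docker /etc/sysconfig/docker-network /etc/sysconfig/docker-storage'
[ec2-user@master0 ~]$ cp /etc/ansible/hosts /tmp/ansible-hosts-$(date +%F)
```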

[5.2 Restore →](52_restore.md)

132 | 133 | [← back to the Chapter Overview](50_backup_restore.md) 134 | -------------------------------------------------------------------------------- /labs/62_logs.md: -------------------------------------------------------------------------------- 1 | ## Lab 6.2: Troubleshooting Using Logs 2 | 3 | As soon as basic functionality of OpenShift itself is reduced or not working at all, we have to have a closer look at the underlying components' log messages. We find these logs either in the journal on the different servers or in Elasticsearch. 4 | 5 | **Note:** If the logging component is not part of the installation, Elasticsearch is not available and therefore the only log location is the journal. Also be aware that Fluentd is responsible for aggregating log messages, but it is possible that Fluentd was not deployed on all OpenShift nodes even though it is a DaemonSet. Check Fluentd's node selector and the node's labels to make sure all logs are aggregated as expected. 6 | 7 | **Note:** While it is convenient to use the EFK stack to analyze log messages in a central place, be aware that depending on the problem, relevant log messages might not be received by Elasticsearch (e.g. SDN problems). 8 | 9 | 10 | ### OpenShift Components Overview 11 | 12 | The master usually houses three master-specific containers: 13 | * `master-api` in OpenShift project `kube-system` 14 | * `master-controllers` in OpenShift project `kube-system` 15 | * `master-etcd` in OpenShift project `kube-system` (usually installed on all masters, also possible externally) 16 | 17 | The node-specific containers can also be found on a master: 18 | * `sync` in OpenShift project `openshift-node` 19 | * `sdn` and `ovs` in OpenShift project `openshift-sdn` 20 | 21 | The node-specific services can also be found on a master: 22 | * `atomic-openshift-node` (in order for the master to be part of the SDN) 23 | * `docker` 24 | 25 | General services include the following: 26 | * `dnsmasq` 27 | * `NetworkManager` 28 | * `firewalld` 29 | 30 | 31 | ### Service States 32 | 33 | Check etcd and master states from the first master using ansible. Check the OpenShift master container first: 34 | ``` 35 | [ec2-user@master0 ~]$ oc get pods -n kube-system -o wide 36 | NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE 37 | master-api-master0.user7.lab.openshift.ch 1/1 Running 9 1d 172.31.44.160 master0.user7.lab.openshift.ch 38 | master-api-master1.user7.lab.openshift.ch 1/1 Running 7 1d 172.31.45.211 master1.user7.lab.openshift.ch 39 | master-api-master2.user7.lab.openshift.ch 1/1 Running 0 4m 172.31.35.148 master2.user7.lab.openshift.ch 40 | master-controllers-master0.user7.lab.openshift.ch 1/1 Running 7 1d 172.31.44.160 master0.user7.lab.openshift.ch 41 | master-controllers-master1.user7.lab.openshift.ch 1/1 Running 6 1d 172.31.45.211 master1.user7.lab.openshift.ch 42 | master-controllers-master2.user7.lab.openshift.ch 1/1 Running 0 4m 172.31.35.148 master2.user7.lab.openshift.ch 43 | master-etcd-master0.user7.lab.openshift.ch 1/1 Running 6 1d 172.31.44.160 master0.user7.lab.openshift.ch 44 | master-etcd-master1.user7.lab.openshift.ch 1/1 Running 4 1d 172.31.45.211 master1.user7.lab.openshift.ch 45 | ``` 46 | 47 | Depending on the outcome of the above commands we have to get a closer look at specific container. This can either be done the conventional way, e.g. 
the 30 most recent messages for etcd on the first master: 48 | 49 | ``` 50 | [ec2-user@master0 ~]$ oc logs master-etcd-master0.user7.lab.openshift.ch -n kube-system --tail=30 51 | ``` 52 | 53 | There is also the possibility of checking etcd's health using `etcdctl`: 54 | ``` 55 | [root@master0 ~]# etcdctl2 --cert-file=/etc/etcd/peer.crt \ 56 | --key-file=/etc/etcd/peer.key \ 57 | --ca-file=/etc/etcd/ca.crt \ 58 | --peers="https://master0.user[X].lab.openshift.ch:2379,https://master1.user[X].lab.openshift.ch:2379" \ 59 | cluster-health 60 | ``` 61 | 62 | As an etcd cluster needs a quorum to update its state, `etcdctl` will output that the cluster is healthy even if not every member is. 63 | 64 | Back to checking services with systemd: Master-specific services only need to be executed on master hosts, so note the change of the host group in the following command. 65 | 66 | atomic-openshift-node: 67 | ``` 68 | [ec2-user@master0 ~]$ ansible nodes -a "systemctl is-active atomic-openshift-node" 69 | ``` 70 | 71 | Above command applies to all the other node services (`docker`, `dnsmasq` and `NetworkManager`) with which we get an overall overview of OpenShift-specific service states. 72 | 73 | Depending on the outcome of the above commands we have to get a closer look at specific services. This can either be done the conventional way, e.g. the 30 most recent messages for atomic-openshift-node on the first master: 74 | 75 | ``` 76 | [ec2-user@master0 ~]$ ansible masters[0] -a "journalctl -u atomic-openshift-node -n 30" 77 | ``` 78 | 79 | Or by searching Elasticsearch: After logging in to https://logging.app[X].lab.openshift.ch, make sure you're on Kibana's "Discover" tab. Then choose the `.operations.*` index by clicking on the arrow in the dark-grey box on the left to get a list of all available indices. You can then create search queries such as `systemd.t.SYSTEMD_UNIT:atomic-openshift-node.service` in order to filter for all messages from every running OpenShift node service. 80 | 81 | Or if we wanted to filter for error messages we could simply use "error" in the search bar and then by looking at the available fields (in the menu on the left) limit the search results further. 82 | 83 | --- 84 | 85 | **End of Lab 6.2** 86 | 87 |

Upgrade OpenShift from 3.11.88 to 3.11.104 →

88 | 89 | [← back to the Chapter Overview](60_monitoring_troubleshooting.md) 90 | -------------------------------------------------------------------------------- /labs/42_outgoing_http_proxies.md: -------------------------------------------------------------------------------- 1 | ## Lab 4.2: Outgoing HTTP Proxies 2 | 3 | Large corporations often allow internet access only via outgoing HTTP proxies for security reasons. 4 | To use OpenShift Container Platform in such an environment the various OpenShift components and 5 | the containers that run on the platform need to be configured to use an HTTP proxy. In addition 6 | internal resources must be excluded from access via proxy as outgoing proxies usually only allow 7 | access to external resources. This lab shows how to configure OpenShift Container for outgoing 8 | HTTP proxies using the included Ansible playbooks. 9 | We haven't yet added an outgoing HTTP proxy to our lab environment. Therefore this lab currently doesn't 10 | contain hands-on exercises. 11 | 12 | 13 | ### Configure the Ansible Inventory 14 | 15 | The OpenShift Ansible playbooks support three groups of variables for outgoing HTTP proxy configuration. 16 | 17 | Configuration for OpenShift components, container runtime, e.g. Docker, and containers running on the platform: 18 | ``` 19 | openshift_http_proxy= 20 | openshift_https_proxy= 21 | openshift_no_proxy='' 22 | ``` 23 | 24 | Where `` can take one of the following forms: 25 | ``` 26 | http://proxy.example.org:3128 27 | http://192.0.2.42:3128 28 | http://proxyuser:proxypass@proxy.example.org:3128 29 | http://proxyuser:proxypass@192.0.2.42:3128 30 | ``` 31 | 32 | In all cases https can be used instead of http, provided this is supported by the proxy. 33 | `` consists of a comma separated list of: 34 | * hostnames, e.g. `my.example.org` 35 | * domains, e.g. `.example.org` 36 | * IP addresses, e.g. `192.0.2.42` 37 | 38 | Additionally OpenShift implements support for IP subnets, e.g. `192.0.2.0/24`, in `no_proxy`. However other software, including Docker, does not support such entries and ignores them. 39 | 40 | Docker build containers are created directly by Docker with a clean environment, i.e. without the required proxy environment variables. 41 | The following variables tell OpenShift to add `ENV` instructions with the outgoing HTTP proxy configuration to all Docker builds. 42 | This is needed to allow builds to download dependencies from external sources: 43 | ``` 44 | openshift_builddefaults_http_proxy= 45 | openshift_builddefaults_https_proxy= 46 | openshift_builddefaults_no_proxy='' 47 | ``` 48 | 49 | Finally an outgoing HTTP proxy can be configured to allow OpenShift builds to check out sources from external Git repositories: 50 | ``` 51 | openshift_builddefaults_git_http_proxy= 52 | openshift_builddefaults_git_https_proxy= 53 | openshift_builddefaults_git_no_proxy= 54 | ``` 55 | 56 | 57 | ### Internal Docker Registry 58 | 59 | It's recommended to add the IP address of the internal registry to the `no_proxy` 60 | lists. The IP addressed of the internal registry can be looked up after cluster installation with: 61 | ``` 62 | [ec2-user@master0 ~]$ oc get svc docker-registry -n default -o jsonpath='{.spec.clusterIP}' 63 | ``` 64 | 65 | For OpenShift Container Platform 3.5 and earlier this is required as the registry is always 66 | accessed via IP address and Docker doesn't support IP subnets in its `no_proxy` list! 
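For illustration only: once the registry's cluster IP is known, it would typically be added to the `no_proxy` values in the Ansible inventory. The snippet below is just a sketch with placeholder values (the proxy URL, domains and the registry IP `172.30.53.17` are assumptions, not taken from this lab environment):
```
openshift_http_proxy=http://proxy.example.org:3128
openshift_https_proxy=http://proxy.example.org:3128
openshift_no_proxy='.cluster.local,.example.org,172.30.53.17'
openshift_builddefaults_no_proxy='.cluster.local,.example.org,172.30.53.17'
```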
67 | 68 | 69 | ### Build Tools 70 | 71 | Some build tools use a different mechanism and need additional configuration for accessing outgoing HTTP proxies. 72 | 73 | 74 | #### Maven 75 | 76 | The Java build tool Maven reads the [proxy configuration from its settings.xml](https://maven.apache.org/guides/mini/guide-proxies.html). 77 | Java base images by Red Hat contain [support for configuring Maven's outgoing proxy through environment variables](https://access.redhat.com/documentation/en-us/red_hat_jboss_enterprise_application_platform/7.0/html-single/red_hat_jboss_enterprise_application_platform_for_openshift/#eap_s2i_process). 78 | These environment variables are used by all Red Hat Java base images, not just JBoss ones. They must be added to the BuildConfigs of Maven builds. 79 | To add them to all BuildConfigs on the platform you can use the Ansible inventory variable `openshift_builddefaults_json`, 80 | which must then contain the whole build proxy configuration, i.e. the other `openshift_builddefaults_` variables mentioned earlier are ignored. E.g.: 81 | ``` 82 | openshift_builddefaults_json=' 83 | {"BuildDefaults":{"configuration":{"apiVersion":"v1","env":[ 84 | {"name":"HTTP_PROXY","value":""}, 85 | {"name":"HTTPS_PROXY","value":""}, 86 | {"name":"NO_PROXY","value":""}, 87 | {"name":"http_proxy","value":""}, 88 | {"name":"https_proxy","value":""}, 89 | {"name":"no_proxy","value":""}, 90 | {"name":"HTTP_PROXY_HOST","value":""}, 91 | {"name":"HTTP_PROXY_PORT","value":""}, 92 | {"name":"HTTP_PROXY_USERNAME","value":""}, 93 | {"name":"HTTP_PROXY_PASSWORD","value":""}, 94 | {"name":"HTTP_PROXY_NONPROXYHOSTS","value":""}], 95 | "gitHTTPProxy":"", 96 | "gitHTTPSProxy":"", 97 | "gitNoProxy":"", 98 | "kind":"BuildDefaultsConfig"}}}' 99 | ``` 100 | 101 | Note that the value has to be valid JSON. 102 | Also Ansible inventories in INI format do not support line folding, so this has to be a single line. 103 | 104 | If you use Java base images other than the ones provided by Red Hat you have to implement your own solution to configure an outgoing HTTP proxy for Maven. 105 | 106 | 107 | ### Apply Outgoing HTTP Proxy Configuration to Cluster 108 | 109 | To apply the outgoing HTTP proxy configuration to the cluster you have to run the master and node config playbooks: 110 | ``` 111 | [ec2-user@master0 ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-node/bootstrap.yml 112 | [ec2-user@master0 ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-master/config.yml 113 | [ec2-user@master0 ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-master/additional_config.yml 114 | ``` 115 | 116 | 117 | --- 118 | 119 | **End of Lab 4.2** 120 | 121 |

5. Backup and Restore →

122 | 123 | [← back to the Chapter Overview](40_configuration_best_practices.md) 124 | -------------------------------------------------------------------------------- /labs/71_upgrade_openshift3.11.104.md: -------------------------------------------------------------------------------- 1 | ## Lab 7.1: Upgrade OpenShift 3.11.88 to 3.11.104 2 | 3 | ### Upgrade Preparation 4 | 5 | We first need to make sure our lab environment fulfills the requirements mentioned in the official documentation. We are going to do an "[Automated In-place Cluster Upgrade](https://docs.openshift.com/container-platform/3.11/upgrading/automated_upgrades.html)" which lists part of these requirements and explains how to verify the current installation. Also check the [Prerequisites](https://docs.openshift.com/container-platform/3.11/install/prerequisites.html#install-config-install-prerequisites) of the new release. 6 | 7 | Conveniently, our lab environment already fulfills all the requirements, so we can move on to the next step. 8 | 9 | #### 1. Ensure the openshift_deployment_type=openshift-enterprise #### 10 | ``` 11 | [ec2-user@master0 ~]$ grep -i openshift_deployment_type /etc/ansible/hosts 12 | ``` 13 | 14 | #### 2. disable rolling, full system restarts of the hosts #### 15 | ``` 16 | [ec2-user@master0 ~]$ ansible masters -m shell -a "grep -i openshift_rolling_restart_mode /etc/ansible/hosts" 17 | ``` 18 | in our lab environment this parameter isn't set, so let's do it on all master-nodes: 19 | ``` 20 | [ec2-user@master0 ~]$ ansible masters -m lineinfile -a 'path="/etc/ansible/hosts" regexp="^openshift_rolling_restart_mode" line="openshift_rolling_restart_mode=services" state="present"' 21 | ``` 22 | #### 3. change the value of openshift_pkg_version to 3.11.104 in /etc/ansible/hosts #### 23 | ``` 24 | [ec2-user@master0 ~]$ ansible masters -m lineinfile -a 'path="/etc/ansible/hosts" regexp="^openshift_pkg_version" line="openshift_pkg_version=-3.11.104" state="present"' 25 | ``` 26 | #### 4. 
upgrade the nodes #### 27 | 28 | ##### 4.1 prepare nodes for upgrade ##### 29 | ``` 30 | [ec2-user@master0 ~]$ ansible all -a 'subscription-manager refresh' 31 | [ec2-user@master0 ~]$ ansible all -a 'subscription-manager repos --enable="rhel-7-server-ose-3.11-rpms" --enable="rhel-7-server-rpms" --enable="rhel-7-server-extras-rpms" --enable="rhel-7-server-ansible-2.6-rpms" --enable="rhel-7-fast-datapath-rpms" --disable="rhel-7-server-ose-3.10-rpms" --disable="rhel-7-server-ansible-2.4-rpms"' 32 | [ec2-user@master0 ~]$ ansible all -a 'yum clean all' 33 | [ec2-user@master0 ~]$ ansible masters -m lineinfile -a 'path="/etc/ansible/hosts" regexp="^openshift_certificate_expiry_fail_on_warn" line="openshift_certificate_expiry_fail_on_warn=False" state="present"' 34 | ``` 35 | ##### 4.2 prepare your upgrade-host ##### 36 | ``` 37 | [ec2-user@master0 ~]$ sudo -i 38 | [ec2-user@master0 ~]# yum update -y openshift-ansible 39 | ``` 40 | 41 | ##### 4.3 upgrade the control plane ##### 42 | 43 | Upgrade the so-called control plane, consisting of: 44 | 45 | - etcd 46 | - master components 47 | - node services running on masters 48 | - Docker running on masters 49 | - Docker running on any stand-alone etcd hosts 50 | 51 | ``` 52 | [ec2-user@master0 ~]$ cd /usr/share/ansible/openshift-ansible 53 | [ec2-user@master0 ~]$ ansible-playbook playbooks/byo/openshift-cluster/upgrades/v3_11/upgrade_control_plane.yml 54 | ``` 55 | 56 | ##### 4.4 upgrade the nodes manually (one by one) ##### 57 | 58 | Upgrade node by node manually because we need to make sure, that the nodes running GlusterFS in container have enough time to replicate to the other nodes. 59 | 60 | Upgrade `infra-node0.user[X].lab.openshift.ch`: 61 | ``` 62 | [ec2-user@master0 ~]$ ansible-playbook playbooks/byo/openshift-cluster/upgrades/v3_11/upgrade_nodes.yml \ 63 | --extra-vars openshift_upgrade_nodes_label="kubernetes.io/hostname=infra-node0.user[X].lab.openshift.ch" 64 | ``` 65 | Wait until all GlusterFS Pods are ready again and check if GlusterFS volumes have heal entries. 66 | ``` 67 | [ec2-user@master0 ~]$ oc project glusterfs 68 | [ec2-user@master0 ~]$ oc get pods -o wide | grep glusterfs 69 | [ec2-user@master0 ~]$ oc rsh 70 | sh-4.2# for vol in `gluster volume list`; do gluster volume heal $vol info; done | grep -i "number of entries" 71 | Number of entries: 0 72 | ``` 73 | If all volumes have `Number of entries: 0`, we can proceed with the next node and repeat the check of GlusterFS. 74 | 75 | Upgrade `infra-node1` and `infra-node2` the same way you as you did the first one: 76 | ``` 77 | [ec2-user@master0 ~]$ ansible-playbook playbooks/byo/openshift-cluster/upgrades/v3_11/upgrade_nodes.yml \ 78 | --extra-vars openshift_upgrade_nodes_label="kubernetes.io/hostname=infra-node1.user[X].lab.openshift.ch" 79 | ``` 80 | 81 | Afer upgrading the `infra_nodes`, you need to upgrade the compute nodes: 82 | ``` 83 | [ec2-user@master0 ~]$ ansible-playbook playbooks/byo/openshift-cluster/upgrades/v3_11/upgrade_nodes.yml \ 84 | --extra-vars openshift_upgrade_nodes_label="node-role.kubernetes.io/compute=true" \ 85 | --extra-vars openshift_upgrade_nodes_serial="1" 86 | ``` 87 | 88 | #### 5. Upgrading the EFK Logging Stack #### 89 | 90 | **Note:** Setting openshift_logging_install_logging=true enables you to upgrade the logging stack. 
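If the variable is not present in your inventory yet, it can be set following the same `lineinfile` pattern used for the other inventory changes above (a sketch only; verify the current value with the `grep` command below before changing anything):
```
[ec2-user@master0 ~]$ ansible masters -m lineinfile -a 'path="/etc/ansible/hosts" regexp="^openshift_logging_install_logging" line="openshift_logging_install_logging=true" state="present"'
```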
91 | 92 | ``` 93 | [ec2-user@master0 ~]$ grep openshift_logging_install_logging /etc/ansible/hosts 94 | [ec2-user@master0 ~]$ cd /usr/share/ansible/openshift-ansible/playbooks 95 | [ec2-user@master0 ~]$ ansible-playbook openshift-logging/config.yml 96 | [ec2-user@master0 ~]$ oc delete pod --selector="component=fluentd" -n logging 97 | ``` 98 | 99 | #### 6. Upgrading Cluster Metrics #### 100 | ``` 101 | [ec2-user@master0 ~]$ cd /usr/share/ansible/openshift-ansible/playbooks 102 | [ec2-user@master0 ~]$ ansible-playbook openshift-metrics/config.yml 103 | ``` 104 | 105 | #### 7. Update the oc binary #### 106 | The `atomic-openshift-clients-redistributable` package which provides the `oc` binary for different operating systems needs to be updated separately: 107 | ``` 108 | [ec2-user@master0 ~]$ ansible masters -a "yum install --assumeyes --disableexcludes=all atomic-openshift-clients-redistributable" 109 | ``` 110 | 111 | #### 8. Update oc binary on client #### 112 | Update the `oc` binary on your own client. As before, you can get it from: 113 | ``` 114 | https://client.app[X].lab.openshift.ch 115 | ``` 116 | 117 | **Note:** You should tell all users of your platform to update their client. Client and server version differences can lead to compatibility issues. 118 | 119 | --- 120 | 121 | **End of Lab 7.1** 122 | 123 |

7.2 Verify the Upgrade →

124 | 125 | [← back to the Chapter Overview](70_upgrade.md) 126 | -------------------------------------------------------------------------------- /labs/52_restore.md: -------------------------------------------------------------------------------- 1 | ## Lab 5.2: Restore 2 | 3 | 4 | ### Lab 5.2.1: Restore a Project 5 | 6 | We will now delete the initially created `dakota` project and try to restore it from the backup. 7 | ``` 8 | [ec2-user@master0 ~]$ oc delete project dakota 9 | ``` 10 | 11 | Check if the project is being deleted 12 | ``` 13 | [ec2-user@master0 ~]$ oc get project dakota 14 | ``` 15 | 16 | Restore the dakota project from the backup. 17 | ``` 18 | [ec2-user@master0 ~]$ oc new-project dakota 19 | [ec2-user@master0 ~]$ oc project project-backup 20 | [ec2-user@master0 ~]$ oc debug `oc get pods -o jsonpath='{.items[*].metadata.name}' | awk '{print $1}'` 21 | sh-4.2# tar -xvf /backup/backup-201906131343.tar.gz -C /tmp/ 22 | sh-4.2# oc apply -f /tmp/dakota/ 23 | ``` 24 | 25 | Start build and push image to registry 26 | ``` 27 | [ec2-user@master0 ~]$ oc start-build ruby-ex -n dakota 28 | ``` 29 | 30 | Check whether the pods become ready again. 31 | ``` 32 | [ec2-user@master0 ~]$ oc get pods -w -n dakota 33 | ``` 34 | 35 | 36 | ### Lab 5.2.2: Restore the etcd Cluster ### 37 | 38 | :warning: Before you proceed, make sure you've already added master2 [Lab 3.5.2](35_add_new_node_and_master.md#3.5.2) 39 | 40 | copy the snapshot to the master1.user[x].lab.openshift.ch 41 | ``` 42 | [ec2-user@master0 ~]$ userid=[x] 43 | [ec2-user@master0 ~]$ scp /tmp/snapshot.db master1.user$userid.lab.openshift.ch:/tmp/snapshot.db 44 | [ec2-user@master0 ~]$ ansible etcd -m service -a "name=atomic-openshift-node state=stopped" 45 | [ec2-user@master0 ~]$ ansible etcd -m service -a "name=docker state=stopped" 46 | [ec2-user@master0 ~]$ ansible etcd -a "rm -rf /var/lib/etcd" 47 | [ec2-user@master0 ~]$ ansible etcd -a "mv /etc/etcd/etcd.conf /etc/etcd/etcd.conf.bak" 48 | ``` 49 | 50 | switch to user root and restore the etc-database 51 | 52 | :warning: run this task on ALL Masters (master0,master1) 53 | ``` 54 | [ec2-user@master0 ~]$ sudo -i 55 | [root@master0 ~]# yum install etcd-3.2.22-1.el7.x86_64 56 | [root@master0 ~]# rmdir /var/lib/etcd 57 | [root@master0 ~]# mv /etc/etcd/etcd.conf.bak /etc/etcd/etcd.conf 58 | [root@master0 ~]# source /etc/etcd/etcd.conf 59 | [root@master0 ~]# export ETCDCTL_API=3 60 | [root@master0 ~]# ETCDCTL_API=3 etcdctl snapshot restore /tmp/snapshot.db \ 61 | --name $ETCD_NAME \ 62 | --initial-cluster $ETCD_INITIAL_CLUSTER \ 63 | --initial-cluster-token $ETCD_INITIAL_CLUSTER_TOKEN \ 64 | --initial-advertise-peer-urls $ETCD_INITIAL_ADVERTISE_PEER_URLS \ 65 | --data-dir /var/lib/etcd 66 | [root@master0 ~]# restorecon -Rv /var/lib/etcd 67 | ``` 68 | 69 | As we have restored the etcd on all masters we should be able to start the services: 70 | ``` 71 | [ec2-user@master0 ~]$ ansible etcd -m service -a "name=docker state=started" 72 | [ec2-user@master0 ~]$ ansible etcd -m service -a "name=atomic-openshift-node state=started" 73 | ``` 74 | 75 | #### Check ectd-clusther health #### 76 | ``` 77 | [root@master0 ~]# ETCD_ALL_ENDPOINTS=` etcdctl3 --write-out=fields member list | awk '/ClientURL/{printf "%s%s",sep,$3; sep=","}'` 78 | [root@master0 ~]# etcdctl3 --endpoints=$ETCD_ALL_ENDPOINTS endpoint status --write-out=table 79 | [root@master0 ~]# etcdctl3 --endpoints=$ETCD_ALL_ENDPOINTS endpoint health 80 | ``` 81 | 82 | ### Scale up the etcd Cluster ### 83 | Add the third etcd 
master2.user[X].lab.openshift.ch to the etcd cluster 84 | We add the 3rd Node (master2) by adding it to the [new_etcd] group and activate this group by uncommenting it: 85 | ``` 86 | [OSEv3:children] 87 | ... 88 | new_etcd 89 | 90 | [new_etcd] 91 | master2.user[X].lab.openshift.ch 92 | ``` 93 | 94 | :warning: the scaleup-playbook provided by redhat doesn't restart the masters seamlessly. If you have to scaleup in production, please do this in a maintenance window. 95 | 96 | Run the scaleup-Playbook to scaleup the etcd-cluster: 97 | 98 | ``` 99 | [ec2-user@master0 ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-etcd/scaleup.yml 100 | ``` 101 | 102 | #### Check ectd-clusther health #### 103 | ``` 104 | [root@master0 ~]# ETCD_ALL_ENDPOINTS=` etcdctl3 --write-out=fields member list | awk '/ClientURL/{printf "%s%s",sep,$3; sep=","}'` 105 | [root@master0 ~]# etcdctl3 --endpoints=$ETCD_ALL_ENDPOINTS endpoint status --write-out=table 106 | [root@master0 ~]# etcdctl3 --endpoints=$ETCD_ALL_ENDPOINTS endpoint health 107 | ``` 108 | 109 | :information_source: don't get confused by the 4 entries. Master0 will show up twice with the same id 110 | 111 | You should now get an output like this. 112 | 113 | ``` 114 | +---------------------------------------------+------------------+---------+---------+-----------+-----------+------------+ 115 | | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX | 116 | +---------------------------------------------+------------------+---------+---------+-----------+-----------+------------+ 117 | | https://master0.user1.lab.openshift.ch:2379 | a8e78dd0690640cb | 3.2.22 | 26 MB | false | 2 | 9667 | 118 | | https://172.31.42.95:2379 | 1ab823337d6e84bf | 3.2.22 | 26 MB | false | 2 | 9667 | 119 | | https://172.31.38.22:2379 | 56f5e08139a21df3 | 3.2.22 | 26 MB | true | 2 | 9667 | 120 | | https://172.31.46.194:2379 | a8e78dd0690640cb | 3.2.22 | 26 MB | false | 2 | 9667 | 121 | +---------------------------------------------+------------------+---------+---------+-----------+-----------+------------+ 122 | 123 | https://172.31.46.194:2379 is healthy: successfully committed proposal: took = 2.556091ms 124 | https://172.31.42.95:2379 is healthy: successfully committed proposal: took = 2.018976ms 125 | https://master0.user1.lab.openshift.ch:2379 is healthy: successfully committed proposal: took = 2.639024ms 126 | https://172.31.38.22:2379 is healthy: successfully committed proposal: took = 1.666699ms 127 | ``` 128 | 129 | #### move new etcd-member in /etc/ansible/hosts #### 130 | 131 | Move the now functional etcd members from the group `[new_etcd]` to `[etcd]` in your Ansible inventory at `/etc/ansible/hosts` so the group looks like: 132 | 133 | 134 | ``` 135 | ... 136 | #new_etcd 137 | 138 | #[new_etcd] 139 | 140 | ... 141 | 142 | [etcd] 143 | master0.user[X].lab.openshift.ch 144 | master1.user[X].lab.openshift.ch 145 | master2.user[X].lab.openshift.ch 146 | ``` 147 | 148 | --- 149 | 150 | **End of Lab 5.2** 151 | 152 |

6. Monitoring and Troubleshooting →

153 | 154 | [← back to the Chapter Overview](50_backup_restore.md) 155 | -------------------------------------------------------------------------------- /appendices/03_aws_storage.md: -------------------------------------------------------------------------------- 1 | # Appendix 3: Using AWS EBS and EFS Storage 2 | This appendix is going to show you how to use AWS EBS and EFS Storage on OpenShift 3.11. 3 | 4 | ## Installation 5 | :information_source: To access the efs-storage at aws, you will need an fsid. Please ask your instructor to get one. 6 | 7 | Uncomment the following part in your Ansible inventory and set the fsid: 8 | ``` 9 | [ec2-user@master0 ~]$ sudo vi /etc/ansible/hosts 10 | ``` 11 | 12 | # EFS Configuration 13 | ``` 14 | openshift_provisioners_install_provisioners=True 15 | openshift_provisioners_efs=True 16 | openshift_provisioners_efs_fsid="[provided by instructor]" 17 | openshift_provisioners_efs_region="eu-central-1" 18 | openshift_provisioners_efs_nodeselector={"beta.kubernetes.io/os": "linux"} 19 | openshift_provisioners_efs_aws_access_key_id="[provided by instructor]" 20 | openshift_provisioners_efs_aws_secret_access_key="[provided by instructor]" 21 | openshift_provisioners_efs_supplementalgroup=65534 22 | openshift_provisioners_efs_path=/persistentvolumes 23 | ``` 24 | 25 | For detailed information about provisioners take a look at https://docs.openshift.com/container-platform/3.11/install_config/provisioners.html#provisioners-efs-ansible-variables 26 | 27 | Execute the playbook to install the provisioner: 28 | ``` 29 | [ec2-user@master0 ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-provisioners/config.yml 30 | ``` 31 | 32 | Check if the pv was created: 33 | ``` 34 | [ec2-user@master0 ~]$ oc get pv 35 | 36 | NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE 37 | provisioners-efs 1Mi RWX Retain Bound openshift-infra/provisioners-efs 22h 38 | ``` 39 | 40 | 41 | :warning: The external provisioner for AWS EFS on OpenShift Container Platform 3.11 is still a Technology Preview feature. 42 | https://docs.openshift.com/container-platform/3.11/install_config/provisioners.html#overview 43 | 44 | #### Create StorageClass 45 | 46 | To enable dynamic provisioning, you need to crate a storageclass: 47 | ``` 48 | [ec2-user@master0 ~]$ cat << EOF > aws-efs-storageclass.yaml 49 | kind: StorageClass 50 | apiVersion: storage.k8s.io/v1beta1 51 | metadata: 52 | name: nfs 53 | provisioner: openshift.org/aws-efs 54 | EOF 55 | [ec2-user@master0 ~]$ oc create -f aws-efs-storageclass.yaml 56 | ``` 57 | 58 | Check if the storage class has been created: 59 | ``` 60 | [ec2-user@master0 ~]$ oc get sc 61 | 62 | NAME PROVISIONER AGE 63 | glusterfs-storage kubernetes.io/glusterfs 23h 64 | nfs openshift.org/aws-efs 23h 65 | ``` 66 | 67 | #### Create PVC 68 | 69 | Now we create a little project and claim a volume from EFS. 
70 | 71 | ``` 72 | [ec2-user@master0 ~]$ oc new-project quotatest 73 | [ec2-user@master0 ~]$ oc new-app centos/ruby-25-centos7~https://github.com/sclorg/ruby-ex.git 74 | [ec2-user@master0 ~]$ cat << EOF > test-pvc.yaml 75 | apiVersion: v1 76 | kind: PersistentVolumeClaim 77 | metadata: 78 | name: quotatest 79 | spec: 80 | accessModes: 81 | - ReadWriteOnce 82 | volumeMode: Filesystem 83 | resources: 84 | requests: 85 | storage: 10Mi 86 | storageClassName: nfs 87 | EOF 88 | [ec2-user@master0 ~]$ oc create -f test-pvc.yaml 89 | [ec2-user@master0 ~]$ oc set volume dc/ruby-ex --add --overwrite --name=v1 --type=persistentVolumeClaim --claim-name=quotatest --mount-path=/quotatest 90 | ``` 91 | 92 | Check if we can see our pvc: 93 | ``` 94 | [ec2-user@master0 ~]$ oc get pvc 95 | 96 | NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE 97 | quotatest Bound pvc-2fa78a43-98ee-11e9-94ce-064eab17d15e 10Mi RWX nfs 17m 98 | ``` 99 | 100 | We will now try to write 40Mi in the 10Mi claim to demonstrate, that PVs do not enforce quotas 101 | ``` 102 | [ec2-user@master0 ~]$ oc get pods 103 | NAME READY STATUS RESTARTS AGE 104 | ruby-ex-2-zwnws 1/1 Running 0 1h 105 | [ec2-user@master0 ~]$ oc rsh ruby-ex-2-zwnws 106 | $ df -h /quotatest 107 | Filesystem Size Used Avail Use% Mounted on 108 | fs-4f7f2916.efs.eu-central-1.amazonaws.com:/persistentvolumes/provisioners-efs-pvc-2fa78a43-98ee-11e9-94ce-064eab17d15e 8.0E 0 8.0E 0% /quotatest 109 | $ dd if=/dev/urandom of=/quotatest/quota bs=4096 count=10000 110 | $ $ du -hs /quotatest/ 111 | 40M /quotatest/ 112 | ``` 113 | 114 | #### Delete EFS Volumes 115 | When you delete the PVC, the PV and the corresponding data gets deleted. 116 | The default RECLAIM POLICY is set to 'Delete': 117 | ``` 118 | [ec2-user@master0 ~]$ oc get pv 119 | NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE 120 | provisioners-efs 1Mi RWX Retain Bound openshift-infra/provisioners-efs 23m 121 | pvc-2fa78a43-98ee-11e9-94ce-064eab17d15e 10Mi RWX Delete Bound test/provisioners-efs nfs 17m 122 | registry-volume 5Gi RWX Retain Bound default/registry-claim 13m 123 | ``` 124 | 125 | Rundown the application and delete the pvc: 126 | ``` 127 | [ec2-user@master0 ~]$ oc scale dc/ruby-ex --replicas=0 128 | [ec2-user@master0 ~]$ oc delete pvc quotatest 129 | ``` 130 | 131 | Check if the pv was deleted: 132 | ``` 133 | [ec2-user@master0 ~]$ oc get pv 134 | NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE 135 | provisioners-efs 1Mi RWX Retain Bound openshift-infra/provisioners-efs 23m 136 | registry-volume 5Gi RWX Retain Bound default/registry-claim 13m 137 | ``` 138 | 139 | Check if the efs-provisioner cleans up the NFS Volume: 140 | ``` 141 | [ec2-user@master0 ~]$ oc project openshift-infra 142 | [ec2-user@master0 ~]$ oc get pods 143 | NAME READY STATUS RESTARTS AGE 144 | provisioners-efs-1-l75qr 1/1 Running 0 1h 145 | [ec2-user@master0 ~]$ oc rsh provisioners-efs-1-l75qr 146 | sh-4.2# df /persistentvolumes 147 | Filesystem 1K-blocks Used Available Use% Mounted on 148 | fs-4f7f2916.efs.eu-central-1.amazonaws.com:/persistentvolumes 9007199254739968 0 9007199254739968 0% /persistentvolumes 149 | sh-4.2# ls /persistentvolumes 150 | sh-4.2# 151 | ``` 152 | 153 | --- 154 | 155 | [← back to the labs overview](../README.md) 156 | 157 | -------------------------------------------------------------------------------- /labs/34_renew_certificates.md: -------------------------------------------------------------------------------- 1 | ## Lab 3.4: Renew 
Certificates 2 | 3 | In this lab we take a look at the OpenShift certificates and how to renew them. 4 | 5 | These are the certificates that need to be maintained. For each component there is a playbook provided by Red Hat that will redeploy the certificates: 6 | - masters (API server and controllers) 7 | - etcd 8 | - nodes 9 | - registry 10 | - router 11 | 12 | 13 | ### Check the Expiration of the Certificates 14 | 15 | To check all your certificates, run the playbook `certificate_expiry/easy-mode.yaml`: 16 | ``` 17 | [ec2-user@master0 ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-checks/certificate_expiry/easy-mode.yaml 18 | ``` 19 | The playbook will generate the following reports with the information of each certificate in JSON and HTML format: 20 | ``` 21 | grep -A2 summary $HOME/cert-expiry-report*.json 22 | $HOME/cert-expiry-report*.html 23 | ``` 24 | 25 | 26 | ### Redeploy etcd Certificates 27 | 28 | To get a feeling for the process of redeploying certificates, we will redeploy the etcd certificates. 29 | 30 | **Warning:** This will lead to a restart of etcd and master services and consequently cause an outage for a few seconds of the OpenShift API. 31 | 32 | First, we check the current etcd certificates creation time: 33 | ``` 34 | [ec2-user@master0 ~]$ sudo openssl x509 -in /etc/origin/master/master.etcd-ca.crt -text -noout | grep -i validity -A 2 35 | Validity 36 | Not Before: Jun 4 15:45:00 2019 GMT 37 | Not After : Jun 2 15:45:00 2024 GMT 38 | 39 | [ec2-user@master0 ~]$ sudo openssl x509 -in /etc/origin/master/master.etcd-client.crt -text -noout | grep -i validity -A 2 40 | Validity 41 | Not Before: Jun 4 15:45:00 2019 GMT 42 | Not After : Jun 2 15:45:00 2024 GMT 43 | 44 | ``` 45 | Note the value for "Validity Not Before:". We will later compare this timestamp with the freshly deployed certificates. 46 | 47 | Redeploy the CA certificate of the etcd servers: 48 | ``` 49 | [ec2-user@master0 ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-etcd/redeploy-ca.yml 50 | ``` 51 | 52 | Check the current etcd CA certificate creation time: 53 | ``` 54 | [ec2-user@master0 ~]$ sudo openssl x509 -in /etc/origin/master/master.etcd-ca.crt -text -noout | grep -i validity -A 2 55 | Validity 56 | Not Before: Jun 6 12:58:04 2019 GMT 57 | Not After : Jun 4 12:58:04 2024 GMT 58 | 59 | [ec2-user@master0 ~]$ sudo openssl x509 -in /etc/origin/master/master.etcd-client.crt -text -noout | grep -i validity -A 2 60 | Validity 61 | Not Before: Jun 4 15:45:00 2019 GMT 62 | Not After : Jun 2 15:45:00 2024 GMT 63 | ``` 64 | The etcd CA certificate has been generated, but etcd is still using the old server certificates. We will replace them with the `redeploy-etcd-certificates.yml` playbook. 65 | 66 | **Warning:** This will again lead to a restart of etcd and master services and consequently cause an outage for a few seconds of the OpenShift API. 
67 | ``` 68 | [ec2-user@master0 ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-etcd/redeploy-certificates.yml 69 | ``` 70 | 71 | Check if the server certificate has been replaced: 72 | ``` 73 | [ec2-user@master0 ~]$ sudo openssl x509 -in /etc/origin/master/master.etcd-ca.crt -text -noout | grep -i validity -A 2 74 | Validity 75 | Not Before: Jun 6 12:58:04 2019 GMT 76 | Not After : Jun 4 12:58:04 2024 GMT 77 | 78 | [ec2-user@master0 ~]$ sudo openssl x509 -in /etc/origin/master/master.etcd-client.crt -text -noout | grep -i validity -A 2 79 | Validity 80 | Not Before: Jun 6 13:28:36 2019 GMT 81 | Not After : Jun 4 13:28:36 2024 GMT 82 | ``` 83 | ### Redeploy nodes Certificates 84 | 85 | 1. Create a new bootstrap.kubeconfig for nodes (MASTER nodes will just copy admin.kubeconfig):" 86 | ``` 87 | [ec2-user@master0 ~]$ sudo oc serviceaccounts create-kubeconfig node-bootstrapper -n openshift-infra --config /etc/origin/master/admin.kubeconfig > /tmp/bootstrap.kubeconfig 88 | ``` 89 | 90 | 2. Distribute ~/bootstrap.kubeconfig from step 1 to infra and compute nodes replacing /etc/origin/node/bootstrap.kubeconfig 91 | ``` 92 | [ec2-user@master0 ~]$ ansible nodes -m copy -a 'src=/tmp/bootstrap.kubeconfig dest=/etc/origin/node/bootstrap.kubeconfig' 93 | ``` 94 | 95 | 3. Move node.kubeconfig and client-ca.crt. These will get recreated when the node service is restarted: 96 | ``` 97 | [ec2-user@master0 ~]$ ansible nodes -m shell -a 'mv /etc/origin/node/client-ca.crt{,.old}' 98 | [ec2-user@master0 ~]$ ansible nodes -m shell -a 'mv /etc/origin/node/node.kubeconfig{,.old}' 99 | ``` 100 | 4. Remove contents of /etc/origin/node/certificates/ on app-/infra-nodes: 101 | ``` 102 | [ec2-user@master0 ~]$ ansible nodes -m shell -a 'rm -rf /etc/origin/node/certificates' --limit 'nodes:!master*' 103 | ``` 104 | 5. Restart node service on app-/infra-nodes: 105 | :warning: restart atomic-openshift-node will fail, until CSR's are approved! Approve (Task 6) the CSR's and restart the Services again. 106 | ``` 107 | [ec2-user@master0 ~]$ ansible nodes -m service -a "name=atomic-openshift-node state=restarted" --limit 'nodes:!master*' 108 | ``` 109 | 6. Approve CSRs, 2 should be approved for each node: 110 | ``` 111 | [ec2-user@master0 ~]$ oc get csr -o name | xargs oc adm certificate approve 112 | ``` 113 | 7. Check if the app-/infra-nodes are READY: 114 | ``` 115 | [ec2-user@master0 ~]$ oc get node 116 | [ec2-user@master0 ~]$ for i in `oc get nodes -o jsonpath=$'{range .items[*]}{.metadata.name}\n{end}'`; do oc get --raw /api/v1/nodes/$i/proxy/healthz; echo -e "\t$i"; done 117 | ``` 118 | 8. Remove contents of /etc/origin/node/certificates/ on master-nodes: 119 | ``` 120 | [ec2-user@master0 ~]$ ansible masters -m shell -a 'rm -rf /etc/origin/node/certificates' 121 | ``` 122 | 9. Restart node service on master-nodes: 123 | ``` 124 | [ec2-user@master0 ~]$ ansible masters -m service -a "name=atomic-openshift-node state=restarted" 125 | ``` 126 | 10. Approve CSRs, 2 should be approved for each node: 127 | ``` 128 | [ec2-user@master0 ~]$ oc get csr -o name | xargs oc adm certificate approve 129 | ``` 130 | 11. 
Check if the master-nodes are READY: 131 | ``` 132 | [ec2-user@master0 ~]$ oc get node 133 | [ec2-user@master0 ~]$ for i in `oc get nodes -o jsonpath=$'{range .items[*]}{.metadata.name}\n{end}' | grep master`; do oc get --raw /api/v1/nodes/$i/proxy/healthz; echo -e "\t$i"; done 134 | ``` 135 | 136 | 137 | ### Replace the other main certificates 138 | 139 | Use the following playbooks to replace the certificates of the other main components of OpenShift: 140 | 141 | **Warning:** Do not yet replace the router certificates with the corresponding playbook as it will break your routers running on OpenShift 3.6. If you want to, replace the router certificates after upgrading to OpenShift 3.7. (Reference: https://bugzilla.redhat.com/show_bug.cgi?id=1490186) 142 | 143 | - masters (API server and controllers) 144 | - /usr/share/ansible/openshift-ansible/playbooks/openshift-master/redeploy-certificates.yml 145 | 146 | - etcd 147 | - /usr/share/ansible/openshift-ansible/playbooks/openshift-etcd/redeploy-ca.yml 148 | - /usr/share/ansible/openshift-ansible/playbooks/openshift-etcd/redeploy-certificates.yml 149 | 150 | - registry 151 | - /usr/share/ansible/openshift-ansible/playbooks/openshift-hosted/redeploy-registry-certificates.yml 152 | 153 | - router 154 | - /usr/share/ansible/openshift-ansible/playbooks/openshift-hosted/redeploy-router-certificates.yml 155 | 156 | **Warning:** The documented redeploy-certificates.yml playbook for nodes doesn't exist anymore (since 3.10)! 157 | This is already reported as Red Hat Bugzilla Bug 1635251. 158 | Red Hat provided this KCS: https://access.redhat.com/solutions/3782361 159 | 160 | - nodes (manual steps needed!) 161 | --- 162 | 163 | **End of Lab 3.4** 164 | 165 |

3.5 Add New OpenShift Node and Master →

166 | 167 | [← back to the Chapter Overview](30_daily_business.md) 168 | -------------------------------------------------------------------------------- /labs/33_persistent_storage.md: -------------------------------------------------------------------------------- 1 | ## Lab 3.3: Persistent Storage 2 | 3 | In this lab we take a look at the OpenShift implementation of Container Native Storage using the `heketi-cli` to resize a volume. 4 | 5 | 6 | ### heketi-cli 7 | 8 | The package `heketi-client` has been pre-installed for you on the bastion host. The package includes the `heketi-cli` command. 9 | In order to use `heketi-cli`, we need the server's URL and admin key: 10 | ``` 11 | [ec2-user@master0 ~]$ oc describe pod -n glusterfs | grep HEKETI_ADMIN_KEY 12 | HEKETI_ADMIN_KEY: [HEKETI_ADMIN_KEY] 13 | ``` 14 | 15 | We can then set variables with this information: 16 | ``` 17 | [ec2-user@master0 ~]$ export HEKETI_CLI_USER=admin 18 | [ec2-user@master0 ~]$ export HEKETI_CLI_KEY="[HEKETI_ADMIN_KEY]" 19 | [ec2-user@master0 ~]$ export HEKETI_CLI_SERVER=$(oc get svc/heketi-storage -n glusterfs --template "http://{{.spec.clusterIP}}:{{(index .spec.ports 0).port}}") 20 | ``` 21 | 22 | Verify that everything is set as it should: 23 | ``` 24 | [ec2-user@master0 ~]$ env | grep -i heketi 25 | HEKETI_CLI_KEY=[PASSWORD] 26 | HEKETI_CLI_SERVER=http://172.30.250.14:8080 27 | HEKETI_CLI_USER=admin 28 | ``` 29 | 30 | Now we can run some useful commands for troubleshooting. 31 | 32 | Get all volumes and then show details of a specific volume using its id: 33 | ``` 34 | [ec2-user@master0 ~]$ heketi-cli volume list 35 | Id:255b9535ee460dfa696a7616b57a7035 Cluster:bc64bf1b4a4e7cc0702d28c7c02674cf Name:glusterfs-registry-volume 36 | Id:e5baabb2bca5ba5cdd749d48d47c4e89 Cluster:bc64bf1b4a4e7cc0702d28c7c02674cf Name:heketidbstorage 37 | 38 | [ec2-user@master0 ~]$ heketi-cli volume info 255b9535ee460dfa696a7616b57a7035 39 | ... 40 | ``` 41 | 42 | Get the cluster id and details of the cluster: 43 | ``` 44 | [ec2-user@master0 ~]$ heketi-cli cluster list 45 | Clusters: 46 | Id:bc64bf1b4a4e7cc0702d28c7c02674cf [file][block] 47 | [ec2-user@master0 ~]$ heketi-cli cluster info bc64bf1b4a4e7cc0702d28c7c02674cf 48 | ... 49 | ``` 50 | 51 | Get nodes and details of a specific node using its id: 52 | ``` 53 | [ec2-user@master0 ~]$ heketi-cli node list 54 | Id:3efc4d8267eb3b65c2d3ed9848aa4328 Cluster:bc64bf1b4a4e7cc0702d28c7c02674cf 55 | Id:c0de1021e7577c26721b22003c14427c Cluster:bc64bf1b4a4e7cc0702d28c7c02674cf 56 | Id:c9612d0eee19146642f51dc2f3d484e5 Cluster:bc64bf1b4a4e7cc0702d28c7c02674cf 57 | [ec2-user@master0 ~]$ heketi-cli node info c9612d0eee19146642f51dc2f3d484e5 58 | ... 59 | ``` 60 | 61 | Show the whole topology: 62 | ``` 63 | [ec2-user@master0 ~]$ heketi-cli topology info 64 | ... 65 | ``` 66 | 67 | 68 | ### Set Default Storage Class 69 | 70 | A StorageClass provides a way to describe a certain type of storage. Different classes might map to different storage types (e.g. nfs, gluster, ...), quality-of-service levels, to backup policies or to arbitrary policies determined by the cluster administrators. 
In our case we only have one storage class which is `glusterfs-storage`: 71 | ``` 72 | [ec2-user@master0 ~]$ oc get storageclass 73 | ``` 74 | 75 | By setting the anotation `storageclass.kubernetes.io/is-default-class` on a StorageClass we make it the default storage class on an OpenShift cluster: 76 | ``` 77 | [ec2-user@master0 ~]$ oc patch storageclass glusterfs-storage -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}' 78 | ``` 79 | 80 | If then someone creates a pvc and does not specify the StorageClass, the [DefaultStorageClass admission controller](https://kubernetes.io/docs/admin/admission-controllers/#defaultstorageclass) does automatically set the StorageClass to the DefaultStorageClass. 81 | 82 | **Note:** We could have set the Ansible inventory variable `openshift_storage_glusterfs_storageclass_default` to `true` during installation in order to let the playbooks automatically do what we just did by hand. For demonstration purposes however we set it to `false`. 83 | 84 | 85 | ### Create and Delete a Persistent Volume Claim 86 | 87 | If you create a PersistentVolumeClaim, Heketi will automatically create a PersistentVolume and bind it to your claim. Likewise if you delete a claim, Heketi will delete the PersistentVolume. 88 | 89 | Create a new project and create a pvc: 90 | ``` 91 | [ec2-user@master0 ~]$ oc new-project labelle 92 | Now using project "labelle" on server "https://console.user[X].lab.openshift.ch:8443". 93 | ... 94 | [ec2-user@master0 ~]$ cat <pvc.yaml 95 | apiVersion: "v1" 96 | kind: "PersistentVolumeClaim" 97 | metadata: 98 | name: "testclaim" 99 | spec: 100 | accessModes: 101 | - "ReadWriteOnce" 102 | resources: 103 | requests: 104 | storage: "1Gi" 105 | EOF 106 | 107 | [ec2-user@master0 ~]$ oc create -f pvc.yaml 108 | persistentvolumeclaim "testclaim" created 109 | ``` 110 | 111 | Check if the pvc could be bound to a new volume: 112 | ``` 113 | [ec2-user@master0 ~]$ oc get pvc 114 | NAME STATUS VOLUME CAPACITY ACCESSMODES STORAGECLASS AGE 115 | testclaim Bound pvc-839223fd-30d4-11e8-89f3-067e4f48dfe4 1Gi RWO glusterfs-storage 38s 116 | 117 | [ec2-user@master0 ~]$ oc get pv 118 | NAME CAPACITY ACCESSMODES RECLAIMPOLICY STATUS CLAIM STORAGECLASS REASON AGE 119 | pvc-839223fd-30d4-11e8-89f3-067e4f48dfe4 1Gi RWO Delete Bound labelle/testclaim glusterfs-storage 41s 120 | ... 121 | ``` 122 | 123 | Delete the claim and check if the volume gets deleted: 124 | ``` 125 | [ec2-user@master0 ~]$ oc delete pvc testclaim 126 | persistentvolumeclaim "testclaim" deleted 127 | [ec2-user@master0 ~]$ oc get pv 128 | 129 | [ec2-user@master0 ~]$ oc delete project labelle 130 | ``` 131 | 132 | 133 | ### Resize Existing Volume 134 | 135 | We will resize the registry volume with heketi-cli. 
136 | 137 | First we need to know which volume is in use for the registry: 138 | ``` 139 | [ec2-user@master0 ~]$ oc get pvc registry-claim -n default 140 | NAME STATUS VOLUME CAPACITY ACCESSMODES STORAGECLASS AGE 141 | registry-claim Bound registry-volume 5Gi RWX 2d 142 | 143 | [ec2-user@master0 ~]$ oc describe pv registry-volume | grep Path 144 | Path: glusterfs-registry-volume 145 | 146 | [ec2-user@master0 ~]$ heketi-cli volume list | grep glusterfs-registry-volume 147 | Id:255b9535ee460dfa696a7616b57a7035 Cluster:bc64bf1b4a4e7cc0702d28c7c02674cf Name:glusterfs-registry-volume 148 | ``` 149 | 150 | Now we can extend the volume from 5Gi to 6Gi: 151 | ``` 152 | [ec2-user@master0 ~]$ heketi-cli volume expand --volume=255b9535ee460dfa696a7616b57a7035 --expand-size=1 153 | Name: glusterfs-registry-volume 154 | Size: 6 155 | ... 156 | ``` 157 | 158 | Check if the gluster volume has the new size: 159 | ``` 160 | [ec2-user@master0 ~]$ ansible infra_nodes -m shell -a "df -ah" | grep glusterfs-registry-volume 161 | 172.31.40.96:glusterfs-registry-volume 6.0G 317M 5.7G 3% /var/lib/origin/openshift.local.volumes/pods/d8dc2712-3bcf-11e8-90a6-066961eacc9a/volumes/kubernetes.io~glusterfs/registry-volume 162 | ``` 163 | 164 | In order for the persistent volume's information and the actually available space to be consistent, we're going to edit the pv's specification: 165 | ``` 166 | [ec2-user@master0 ~]$ oc get pv 167 | NAME CAPACITY ACCESSMODES RECLAIMPOLICY STATUS CLAIM STORAGECLASS REASON AGE 168 | registry-volume 5Gi RWX Retain Bound default/registry-claim 1d 169 | [ec2-user@master0 ~]$ oc patch pv registry-volume -p '{"spec":{"capacity":{"storage":"6Gi"}}}' 170 | [ec2-user@master0 ~]$ oc get pv 171 | NAME CAPACITY ACCESSMODES RECLAIMPOLICY STATUS CLAIM STORAGECLASS REASON AGE 172 | registry-volume 6Gi RWX Retain Bound default/registry-claim 1d 173 | ``` 174 | 175 | --- 176 | 177 | **End of Lab 3.3** 178 | 179 |

3.4 Renew Certificates →

180 | 181 | [← back to the Chapter Overview](30_daily_business.md) 182 | -------------------------------------------------------------------------------- /labs/61_monitoring.md: -------------------------------------------------------------------------------- 1 | ## Lab 6.1: Monitoring 2 | 3 | OpenShift monitoring can be categorized into three different categories which each try to answer their own question: 4 | 1. Is our cluster in an operational state right now? 5 | 2. Will our cluster remain in an operational state in the near future? 6 | 3. Does our cluster have enough capacity to run all pods? 7 | 8 | 9 | ### Is Our Cluster in an Operational State at the Moment? 10 | 11 | In order to answer this first question, we check the state of different vital components: 12 | * Masters 13 | * etcd 14 | * Routers 15 | * Apps 16 | 17 | **Masters** expose health information on an HTTP endpoint at https://`openshift_master_cluster_public_hostname`:`openshift_master_api_port`/healthz that can be checked for a 200 status code. On one hand, this endpoint can be used as a health indicator in a loadbalancer configuration, on the other hand we can use it ourselves for monitoring or troubleshooting purposes. 18 | 19 | Check the masters' health state with a HTTP request: 20 | ``` 21 | [ec2-user@master0 ~]$ curl -v https://console.user[X].lab.openshift.ch/healthz 22 | ``` 23 | 24 | As long as the response is a 200 status code at least one of the masters is still working and the API is accessible via Load Balancer (if there is one). 25 | 26 | **etcd** also exposes a similar health endpoint at https://`openshift_master_cluster_public_hostname`:2379/health, though it is only accessible using the client certificate and corresponding key stored on the masters at `/etc/origin/master/master.etcd-client.crt` and `/etc/origin/master/master.etcd-client.key`. 27 | ``` 28 | [ec2-user@master0 ~]$ sudo curl --cacert /etc/origin/master/master.etcd-ca.crt --cert /etc/origin/master/master.etcd-client.crt --key /etc/origin/master/master.etcd-client.key https://master0.user[X].lab.openshift.ch:2379/health 29 | ``` 30 | 31 | The **HAProxy router pods** are responsible for getting application traffic into OpenShift. Similar to the masters, HAProxy also exposes a /healthz endpoint on port 1936 which can be checked with e.g.: 32 | ``` 33 | [ec2-user@master0 ~]$ curl -v http://router.app[X].lab.openshift.ch:1936/healthz 34 | ``` 35 | 36 | Using the wildcard domain to access a router's health page results in a positive answer if at least one router is up and running and that's all we want to know right now. 37 | 38 | **Note:** Port 1936 is not open by default, so it has to be opened at least for those nodes running the router pods. This can be achieved e.g. by setting the ansible variable `openshift_node_open_ports` (at least as of OpenShift version 3.7 or later). 39 | 40 | **Apps** running on OpenShift should of course be (end-to-end) monitored as well, however, we are not interested in a single application per se. We want to know if all applications of a group of monitored applications do not respond. The more applications not responding the more probable a platform-wide problem is the cause. 41 | 42 | 43 | ### Will our Cluster Remain in an Operational State in the Near Future? 44 | 45 | The second category is based on a wider array of checks. It includes checks that take a more "classic" approach such as storage monitoring, but also includes above checks to find out if single cluster members are not healthy. 
46 | 47 | First, let's look at how to use above checks to answer this second question. 48 | 49 | The health endpoint exposed by **masters** was accessed via load balancer in the first category in order to find out if the API is generally available. This time however we want to find out if at least one of the master APIs is unavailable, even if there still are some that are accessible. So we check every single master endpoint directly instead of via load balancer: 50 | ``` 51 | [ec2-user@master0 ~]$ for i in {0..2}; do curl -v https://master${i}.user[X].lab.openshift.ch/healthz; done 52 | ``` 53 | 54 | The **etcd** check above is already run against single members of the cluster and can therefore be applied here in the exact same form. The difference only is that we want to make sure every single member is running, not just the number needed to have quorum. 55 | 56 | The approach used for the masters also applies to the **HAProxy routers**. A router pod is effectively listening on the node's interface it is running on. So instead of connecting via load balancer, we use the nodes' IP addresses the router pods are running on. In our case, these are nodes 0 and 1: 57 | ``` 58 | [ec2-user@master0 ~]$ for i in {0..2}; do curl -v http://infra-node${i}.user[X].lab.openshift.ch:1936/healthz; done 59 | ``` 60 | 61 | As already mentioned, finding out if our cluster will remain in an operational state in the near future also includes some better known checks we could call a more conventional **components monitoring**. 62 | 63 | Next to the usual monitoring of storage per partition/logical volume, there's one logical volume on each node of special interest to us: the **Docker storage**. The Docker storage contains images and container filesystems of running containers. Monitoring the available space of this logical volume is important in order to tune garbage collection. Garbage collection is done by the **kubelets** running on each node. The available garbage collection kubelet arguments can be found in the [official documentation](https://docs.openshift.com/container-platform/3.11/admin_guide/garbage_collection.html). 64 | 65 | Speaking of garbage collection, there's another component that needs frequent garbage collection: the registry. Contrary to the Docker storage on each node, OpenShift only provides a command to prune the registry but does not offer a means to execute it on a regular basis. Until it does, setup the [appuio-pruner](https://github.com/appuio/appuio-pruner) as described in its README. 66 | 67 | 68 | ### Does our Cluster Have Enough Capacity to Run All Pods? 69 | 70 | Besides the obvious components that need monitoring like CPU, memory and storage, this third question is tightly coupled with requests and limits we looked at in [chapter 4](41_out_of_resource_handling.md). 71 | 72 | But let's first get an overview of available resources using tools you might not have heard about before. One such tool is [Cockpit](http://cockpit-project.org/). Cockpit aims to ease administration tasks of Linux servers by making some basic tasks available via web interface. It is installed by default on every master by the OpenShift Ansible playbooks and listens on port 9090. 
We don't want to expose the web interface to the internet though, so we are going to use SSH port forwarding to access it: 73 | ``` 74 | [ec2-user@master0 ~]$ ssh ec2-user@jump.lab.openshift.ch -L 9090:master0.user[X].lab.openshift.ch:9090 75 | ``` 76 | 77 | After the SSH tunnel has been established, open http://localhost:9090 in your browser and log in using user `ec2-user` and the password provided by the instructor. Explore the different tabs and sections of the web interface. 78 | 79 | Another possibility to get a quick overview of used and available resources is the [kube-ops-view](https://github.com/hjacobs/kube-ops-view) project. Install it on your OpenShift cluster: 80 | ``` 81 | oc new-project ocp-ops-view 82 | oc create sa kube-ops-view 83 | oc adm policy add-scc-to-user anyuid -z kube-ops-view 84 | oc adm policy add-cluster-role-to-user cluster-admin system:serviceaccount:ocp-ops-view:kube-ops-view 85 | oc apply -f https://raw.githubusercontent.com/raffaelespazzoli/kube-ops-view/ocp/deploy-openshift/kube-ops-view.yaml 86 | oc create route edge --service kube-ops-view 87 | oc get route | grep kube-ops-view | awk '{print $2}' 88 | ``` 89 | 90 | The design takes some getting used to, but at least the browser zoom can help with the small size. 91 | 92 | The information about kube-ops-view as well as its installation instructions are actually from a [blog post series](https://blog.openshift.com/full-cluster-capacity-management-monitoring-openshift/) from Red Hat that does a very good job at explaining the different relations and possibilities to finding an answer to our question about capacity. 93 | 94 | These two tools provide a quick look at resource availability. Implementing a mature, enterprise-grade monitoring of OpenShift resources depends on what tools are available already in an IT environment and would go beyond the scope and length of this techlab, but the referred blog post series certainly is a good start. 95 | 96 | 97 | --- 98 | 99 | **End of Lab 6.1** 100 | 101 |

6.2 Troubleshooting Using Logs →

102 | 103 | [← back to the Chapter Overview](60_monitoring_troubleshooting.md) 104 | -------------------------------------------------------------------------------- /labs/41_out_of_resource_handling.md: -------------------------------------------------------------------------------- 1 | ## Lab 4.1: Out of Resource Handling 2 | 3 | This lab deals with out of resource handling on OpenShift platforms, most importantly the handling of out-of-memory conditions. Out of resource conditions can occur either on the container level because of resource limits or on the node level because a node runs out of memory as a result of overcommitting. 4 | They are either handled by OpenShift or directly by the kernel. 5 | 6 | 7 | ### Introduction 8 | 9 | The following terms and behaviours are crucial in understanding this lab. 10 | 11 | Killing a pod or a container are fundamentally different: 12 | * Pods and its containers live on the same node for the duration of their lifetime. 13 | * A pod's restart policy determines whether its containers are restarted after being killed. 14 | * Killed containers always restart on the same node. 15 | * If a pod is killed the configuration of its controller, e.g. ReplicationController, ReplicaSet, Job, ..., determines whether a replacement pod is created. 16 | * Pods without controllers are never replaced after being killed. 17 | 18 | An OpenShift node recovers from out of memory conditions by killing containers or pods: 19 | * **Out of Memory (OOM) Killer**: Linux kernel mechanism which kills processes to recover from out of memory conditions. 20 | * **Pod Eviction**: An OpenShift mechanism which kills pods to recover from out of memory conditions. 21 | 22 | The order in which containers and pods are killed is determined by their Quality of Service (QoS) class. 23 | The QoS class in turn is defined by resource requests and limits developers configure on their containers. 24 | For more information see [Quality of Service Tiers](https://docs.openshift.com/container-platform/3.11/dev_guide/compute_resources.html#quality-of-service-tiers). 25 | 26 | 27 | ### Out of Memory Killer in Action 28 | 29 | To observe how the OOM killer in action create a container which allocates all memory available on the node it runs on: 30 | 31 | ``` 32 | [ec2-user@master0 ~]$ oc new-project out-of-memory 33 | [ec2-user@master0 ~]$ oc create -f https://raw.githubusercontent.com/appuio/ops-techlab/release-3.11/resources/membomb/pod_oom.yaml 34 | ``` 35 | 36 | Wait and watch till the container is up and being killed. `oc get pods -o wide -w` will then show: 37 | ``` 38 | NAME READY STATUS RESTARTS AGE IP NODE 39 | membomb-1-z6md2 0/1 OOMKilled 0 7s 10.131.2.24 app-node0.user8.lab.openshift.ch 40 | ``` 41 | 42 | Run `oc describe pod -l app=membomb` to get more information about the container state which should look like this: 43 | ``` 44 | State: Terminated 45 | Reason: OOMKilled 46 | Exit Code: 137 47 | Started: Thu, 17 May 2018 10:51:02 +0200 48 | Finished: Thu, 17 May 2018 10:51:04 +0200 49 | ``` 50 | 51 | Exit code 137 [indicates](http://tldp.org/LDP/abs/html/exitcodes.html) that the container main process was killed by the `SIGKILL` signal. 52 | With the default `restartPolicy` of `Always` the container would now restart on the same node. For this lab the `restartPolicy` 53 | has been set to `Never` to prevent endless out-of-memory conditions and restarts. 54 | 55 | Now log into the OpenShift node the pod ran on and study how the OOM event looks like in the kernel logs. 
56 | You can see on which node the pod ran in the output of either the `oc get` or `oc describe` command you just ran. 57 | In this example this would look like: 58 | 59 | ``` 60 | ssh app-node0.user[X].lab.openshift.ch 61 | journalctl -ke 62 | ``` 63 | 64 | The following lines should be highlighted: 65 | 66 | ``` 67 | May 17 10:51:04 app-node0.user8.lab.openshift.ch kernel: Memory cgroup out of memory: Kill process 5806 (python) score 1990 or sacrifice child 68 | May 17 10:51:04 app-node0.user8.lab.openshift.ch kernel: Killed process 5806 (python) total-vm:7336912kB, anon-rss:5987524kB, file-rss:0kB, shmem-rss:0kB 69 | ``` 70 | 71 | This log messages indicate that the OOM killer has been invoked because a cgroup memory limit has been exceeded 72 | and that it killed a python process which consumed 5987524kB memory. Cgroup is a kernel mechanism which limits 73 | resource usage of processes. 74 | Further up in the log you should see a line like the following, followed by usage and limits of the corresponding cgroup hierarchy: 75 | 76 | ``` 77 | May 17 10:51:04 app-node0.user8.lab.openshift.ch kernel: Task in /kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod6ba0af16_59af_11e8_9a62_0672f11196a0.slice/docker-648ff0b111978161b0ac94fb72a4656ee3f98b8e73f7eb63c5910f5cf8cd9c53.scope killed as a result of limit of /kubepods.slice 78 | ``` 79 | 80 | This message tells you that a limit of the cgroup `kubepods.slice` has been exceeded. That's the cgroup 81 | limiting the resource usage of all container processes on a node, preventing them from using resources 82 | reserved for the kernel and system daemons. 83 | Note that a container can also be killed by the OOM killer because it reached its own memory limit. In that 84 | case a different cgroup will be listed in the `killed as a result of limit of` line. Everything 85 | else will however look the same. 86 | 87 | There are some drawbacks to containers being killed by the out of memory killer: 88 | * Containers are always restarted on the same node, possibly repeating the same out of memory condition over and over again. 89 | * There is no grace period, container processes are immediately killed with SIGKILL. 90 | 91 | Because of this OpenShift provides the "Pod Eviction" mechanism to kill and reschedule pods before they trigger 92 | an out of resource condition. 93 | 94 | 95 | ### Pod Eviction 96 | 97 | OpenShift offers hard and soft evictions. Hard evictions act immediately when the configured threshold is reached. 98 | Soft evictions allow the threshold to be exceeded for a configurable grace period before taking action. 99 | 100 | To observe a pod eviction create a container which allocates memory till it is being evicted: 101 | 102 | ``` 103 | [ec2-user@master0 ~]$ oc create -f https://raw.githubusercontent.com/appuio/ops-techlab/release-3.11/resources/membomb/pod_eviction.yaml 104 | ``` 105 | 106 | Wait till the container gets evicted. Run `oc describe pod -l app=membomb` to see the reason for the eviction: 107 | ``` 108 | Status: Failed 109 | Reason: Evicted 110 | Message: The node was low on resource: memory. 111 | ``` 112 | 113 | After a pod eviction a node is flagged as being under memory pressure for a short time, by default 5 minutes. 114 | Nodes under memory pressure are not considered for scheduling new pods. 
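Whether a node is currently flagged can be seen from its `MemoryPressure` condition, e.g. for the node used in the example above (output will vary):
```
[ec2-user@master0 ~]$ oc describe node app-node0.user[X].lab.openshift.ch | grep MemoryPressure
```
The condition returns to `False` once the memory pressure period mentioned above has passed.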
115 | 116 | ### Recommendations 117 | 118 | Beginning with OCP 3.6, the memory available for pods on a node is determined by this formula: 119 | ``` 120 | [memory available for pods] = [node memory] - [kube-reserved] - [system-reserved] - [hard eviction thresholds] 121 | ``` 122 | 123 | Where 124 | * `[node memory]` is the memory (RAM) of a node. 125 | * `[kube-reserved]` is an option of the OpenShift node service (kubelet), specifying how much memory to reserve for OpenShift node components. 126 | * `[system-reserved]` is an option of the OpenShift node service (kubelet), specifying how much memory to reserve for the kernel and system daemons. 127 | 128 | Also beginning with OCP 3.6, the OOM killer is triggered when the total memory consumed by all pods on a node exceeds the 129 | allocatable memory, even when there's still memory available on the node. You can view the amount of allocatable memory on all 130 | nodes by running `oc describe nodes`. 131 | 132 | For stable operations we recommend reserving about **10%** of the node's memory for the kernel, system daemons and node components 133 | with the `kube-reserved` and `system-reserved` parameters. More memory may need to be reserved if you run additional system 134 | daemons for monitoring, backup, etc. on nodes. 135 | OCP 3.6 has a hard memory eviction threshold of 100 MiB preconfigured. No other eviction thresholds are enabled by default. 136 | This is usually too low to trigger pod eviction before the OOM killer hits. We recommend starting with a hard memory eviction 137 | threshold of **500Mi**. If you keep seeing lots of OOM-killed containers, consider increasing the hard eviction threshold or 138 | adding a soft eviction threshold. But remember that hard eviction thresholds are subtracted from the node's allocatable resources. 139 | 140 | You can configure reserves and eviction thresholds in the node configuration, e.g.: 141 | 142 | ``` 143 | kubeletArguments: 144 | kube-reserved: 145 | - "cpu=200m,memory=512Mi" 146 | system-reserved: 147 | - "cpu=200m,memory=512Mi" 148 | ``` 149 | 150 | See [Allocating Node Resources](https://docs.openshift.com/container-platform/3.11/admin_guide/allocating_node_resources.html) 151 | and [Out of Resource Handling](https://docs.openshift.com/container-platform/3.11/admin_guide/out_of_resource_handling.html) for more information. 152 | 153 | --- 154 | 155 | **End of Lab 4.1** 156 | 157 |

[4.2 Outgoing HTTP Proxies →](42_outgoing_http_proxies.md)

158 | 159 | [← back to the Chapter Overview](40_configuration_best_practices.md) 160 | -------------------------------------------------------------------------------- /appendices/01_prometheus.md: -------------------------------------------------------------------------------- 1 | # Prometheus 2 | Source: https://github.com/prometheus/prometheus 3 | 4 | Visit [prometheus.io](https://prometheus.io) for the full documentation, 5 | examples and guides. 6 | 7 | Prometheus, a [Cloud Native Computing Foundation](https://cncf.io/) project, is a systems and service monitoring system. It collects metrics 8 | from configured targets at given intervals, evaluates rule expressions, 9 | displays the results, and can trigger alerts if some condition is observed 10 | to be true. 11 | 12 | Prometheus' main distinguishing features as compared to other monitoring systems are: 13 | 14 | - a **multi-dimensional** data model (timeseries defined by metric name and set of key/value dimensions) 15 | - a **flexible query language** to leverage this dimensionality 16 | - no dependency on distributed storage; **single server nodes are autonomous** 17 | - timeseries collection happens via a **pull model** over HTTP 18 | - **pushing timeseries** is supported via an intermediary gateway 19 | - targets are discovered via **service discovery** or **static configuration** 20 | - multiple modes of **graphing and dashboarding support** 21 | - support for hierarchical and horizontal **federation** 22 | 23 | ## Prometheus overview 24 | The following diagram shows the general architectural overview of Prometheus: 25 | 26 | ![Prometheus Architecture](../resources/images/prometheus_architecture.png) 27 | 28 | ## Monitoring use cases 29 | Starting with OpenShift 3.11, Prometheus is installed by default to **monitor the OpenShift cluster** (depicted in the diagram below on the left side: *Kubernetes Prometheus deployment*). This installation is managed by the "Cluster Monitoring Operator" and not intended to be customized (we will do it anyway). 30 | 31 | To **monitor applications** or **define custom Prometheus configurations**, the Tech Preview feature [Operator Lifecycle Manager (OLM)](https://docs.openshift.com/container-platform/3.11/install_config/installing-operator-framework.html) can be used to install the Prometheus Operator, which in turn allows you to define Prometheus instances (depicted in the diagram below on the right side: *Service Prometheus deployment*). These instances are fully customizable with the use of *Custom Resource Definitions (CRD)*. 32 | 33 | ![Prometheus Overview](../resources/images/prometheus_use-cases.png) 34 | 35 | (source: https://sysdig.com/blog/kubernetes-monitoring-prometheus-operator-part3/) 36 | 37 | # Cluster Monitoring Operator 38 | 39 | ![Cluster Monitoring Operator components](../resources/images/prometheus_cmo.png) 40 | 41 | 42 | ## Installation 43 | 44 | 45 | 46 | From OpenShift 3.11 onwards, the CMO is installed by default.
To customize the installation, you can set the following variables in the inventory (example sizing for a small cluster): 47 | 48 | ```ini 49 | openshift_cluster_monitoring_operator_install=true # default value 50 | openshift_cluster_monitoring_operator_prometheus_storage_enabled=true 51 | openshift_cluster_monitoring_operator_prometheus_storage_capacity=50Gi 52 | openshift_cluster_monitoring_operator_prometheus_storage_class_name=[tbd] 53 | openshift_cluster_monitoring_operator_alertmanager_storage_enabled=true 54 | openshift_cluster_monitoring_operator_alertmanager_storage_capacity=2Gi 55 | openshift_cluster_monitoring_operator_alertmanager_storage_class_name=[tbd] 56 | openshift_cluster_monitoring_operator_alertmanager_config=[tbd] 57 | ``` 58 | 59 | Run the installer: 60 | 61 | ``` 62 | [ec2-user@master0 ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-monitoring/config.yml 63 | ``` 64 | 65 | ### Access Prometheus 66 | 67 | You can log in with the cluster administrator `sheriff` at: 68 | https://prometheus-k8s-openshift-monitoring.app[X].lab.openshift.ch/ 69 | 70 | - Additional targets: `Status` -> `Targets` 71 | - Scrape configuration: `Status` -> `Configuration` 72 | - Defined rules: `Status` -> `Rules` 73 | - Service Discovery: `Status` -> `Service Discovery` 74 | 75 | 76 | ### Configure Prometheus 77 | Let Prometheus discover and scrape services in other namespaces by granting the `cluster-reader` role to its service account: 78 | 79 | ``` 80 | [ec2-user@master0 ~]$ oc adm policy add-cluster-role-to-user cluster-reader -z prometheus-k8s -n openshift-monitoring 81 | ``` 82 | 83 | To modify the Prometheus configuration, e.g. the retention time, change the ConfigMap `cluster-monitoring-config` as described here: 84 | 85 | 86 | ``` 87 | [ec2-user@master0 ~]$ oc edit cm cluster-monitoring-config -n openshift-monitoring 88 | ``` 89 | 90 | Unfortunately, changing the default scrape config is not supported with the Cluster Monitoring Operator. 91 | 92 | #### etcd monitoring 93 | 94 | To add etcd monitoring, follow this guide: 95 | 96 | 97 | ## Additional services: CRD type ServiceMonitor (unsupported by Red Hat) 98 | 99 | Creating additional ServiceMonitor objects is not supported by Red Hat. See [Supported Configuration](https://docs.openshift.com/container-platform/3.11/install_config/prometheus_cluster_monitoring.html#supported-configuration) for details. 100 | 101 | We will do it anyway :sunglasses:. 102 | 103 | In order for the custom services to be added to the managed Prometheus instance, the label `k8s-app` needs to be present in the "ServiceMonitor" *Custom Resource (CR)*. 104 | 105 | See the following example for the *ServiceMonitor* `router-metrics`: 106 | 107 | ```yaml 108 | apiVersion: monitoring.coreos.com/v1 109 | kind: ServiceMonitor 110 | metadata: 111 | generation: 1 112 | labels: 113 | k8s-app: router-metrics 114 | name: router-metrics 115 | namespace: "" 116 | spec: 117 | endpoints: 118 | - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token 119 | honorLabels: true 120 | interval: 30s 121 | port: 1936-tcp 122 | scheme: https 123 | tlsConfig: 124 | caFile: /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt 125 | insecureSkipVerify: true 126 | namespaceSelector: 127 | matchNames: 128 | - default 129 | selector: 130 | matchLabels: 131 | router: router 132 | ``` 133 | 134 | ### Router Monitoring 135 | 136 | Create the custom cluster role `router-metrics` and add it to the Prometheus service account `prometheus-k8s` so that Prometheus is able to read the router metrics (a sketch of such a role follows below).
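The role definition itself is not shown in this appendix. As an illustration only, here is a minimal sketch, assuming the router authorizes metrics access as a `get` on the `routers/metrics` resource in the `route.openshift.io` API group; treat the definition that ships with your environment as authoritative:

```yaml
# Hypothetical sketch of a router-metrics cluster role.
# The exact resource the router checks may differ in your setup.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: router-metrics
rules:
- apiGroups:
  - route.openshift.io
  resources:
  - routers/metrics
  verbs:
  - get
```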
137 | First you need to check what labels your routers are using. 138 | 139 | ``` 140 | [ec2-user@master0 ~]$ oc get endpoints -n default --show-labels 141 | NAME ENDPOINTS AGE LABELS 142 | router 172.31.43.147:1936,172.31.47.59:1936,172.31.47.64:1936 + 6 more... 1h router=router 143 | ``` 144 | 145 | Grant the `router-metrics` cluster role to the `prometheus-k8s` service account: 146 | ``` 147 | [ec2-user@master0 ~]$ oc adm policy add-cluster-role-to-user router-metrics system:serviceaccount:openshift-monitoring:prometheus-k8s 148 | ``` 149 | 150 | Set the router label as a parameter and create the ServiceMonitor: 151 | ``` 152 | [ec2-user@master0 ~]$ oc project openshift-monitoring 153 | [ec2-user@master0 ~]$ oc process -f resource/templates/template-router.yaml -p ROUTER_LABEL="router" | oc apply -f - 154 | ``` 155 | 156 | ### Logging Monitoring 157 | This only works with a clustered Elasticsearch; due to a lack of resources, the OPS techlab runs a single-node ES. 158 | The Service `logging-es-prometheus` needs to be labeled and the following RoleBinding applied so that Prometheus is able to get the metrics. 159 | 160 | ``` 161 | [ec2-user@master0 ~]$ oc label svc logging-es-prometheus -n openshift-logging scrape=prometheus 162 | [ec2-user@master0 ~]$ oc create -f resource/templates/template-rolebinding.yaml -n openshift-logging 163 | [ec2-user@master0 ~]$ oc process -f resource/templates/template-logging.yaml | oc apply -f - 164 | ``` 165 | 166 | ## Additional rules: CRD type PrometheusRule 167 | 168 | Take a look at the additional ruleset that we suggest using to monitor OpenShift: 169 | ``` 170 | [ec2-user@master0 ~]$ less resource/templates/template-k8s-custom-rules.yaml 171 | ``` 172 | 173 | Add the custom rules from the template folder to Prometheus: 174 | 175 | ``` 176 | [ec2-user@master0 ~]$ oc process -f resource/templates/template-k8s-custom-rules.yaml -p SEVERITY_LABEL="critical" | oc apply -f - 177 | ``` 178 | 179 | ## AlertManager 180 | 181 | You can configure Alertmanager with the Red Hat Ansible playbooks.
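Whichever way you apply it, the configuration itself is a plain Alertmanager configuration file (referenced below as `resource/templates/alertmanager.yaml`). A minimal sketch; the receiver name and webhook URL are placeholders, not part of the techlab:

```yaml
global:
  resolve_timeout: 5m
route:
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 12h
  receiver: default
receivers:
- name: default
  webhook_configs:
  - url: https://example.com/alert-hook   # placeholder receiver endpoint
```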
182 | 183 | 184 | Or by hand: 185 | 186 | ``` 187 | [ec2-user@master0 ~]$ oc delete secret alertmanager-main 188 | [ec2-user@master0 ~]$ oc create secret generic alertmanager-main --from-file=resource/templates/alertmanager.yaml 189 | ``` 190 | 191 | Follow these guides: 192 | 193 | 194 | Check if the new configuration is in place: https://alertmanager-main-openshift-monitoring.app[X].lab.openshift.ch/#/status 195 | 196 | ## Additional configuration 197 | 198 | ### Add view role for developers 199 | 200 | Let non-OpenShift admins access Prometheus: 201 | ``` 202 | [ec2-user@master0 ~]$ oc adm policy add-cluster-role-to-user cluster-monitoring-view [user] 203 | ``` 204 | 205 | ### Add metrics reader service account to access Prometheus metrics 206 | 207 | You can create a service account to access Prometheus through the API: 208 | ``` 209 | [ec2-user@master0 ~]$ oc create sa prometheus-metrics-reader -n openshift-monitoring 210 | [ec2-user@master0 ~]$ oc adm policy add-cluster-role-to-user cluster-monitoring-view -z prometheus-metrics-reader -n openshift-monitoring 211 | ``` 212 | 213 | Access the API with a simple `curl`: 214 | ``` 215 | [ec2-user@master0 ~]$ export TOKEN=`oc sa get-token prometheus-metrics-reader -n openshift-monitoring` 216 | [ec2-user@master0 ~]$ curl https://prometheus-k8s-openshift-monitoring.app[X].lab.openshift.ch/api/v1/query?query=ALERTS -H "Authorization: Bearer $TOKEN" 217 | ``` 218 | 219 | ### Allow Prometheus to scrape your metrics endpoints (if using ovs-networkpolicy plugin) 220 | 221 | Create an additional NetworkPolicy: 222 | 223 | ``` 224 | [ec2-user@master0 ~]$ oc create -f resource/templates/networkpolicy.yaml -n [namespace] 225 | ``` 226 | -------------------------------------------------------------------------------- /theme/puzzle.css: -------------------------------------------------------------------------------- 1 | /** 2 | * A simple theme for reveal.js presentations, similar 3 | * to the default theme. The accent color is darkblue. 4 | * 5 | * This theme is Copyright (C) 2012 Owen Versteeg, https://github.com/StereotypicalApps. It is MIT licensed.
6 | * reveal.js is Copyright (C) 2011-2012 Hakim El Hattab, http://hakim.se 7 | */ 8 | @import url(https://fonts.googleapis.com/css?family=Roboto:400,300,500); 9 | @import url(https://fonts.googleapis.com/css?family=Roboto+Slab); 10 | /********************************************* 11 | * GLOBAL STYLES 12 | *********************************************/ 13 | body { 14 | background: #fff; 15 | background-color: #fff; } 16 | 17 | .reveal { 18 | font-family: "Roboto", sans-serif; 19 | font-size: 2.5rem; 20 | font-weight: 300; 21 | color: #444; } 22 | 23 | .reveal.has-dark-background { 24 | color: #fff; } 25 | 26 | ::selection { 27 | color: #fff; 28 | background: #69B978; 29 | text-shadow: none; } 30 | 31 | .reveal .slides > section, 32 | .reveal .slides > section > section { 33 | line-height: 1.3; 34 | font-weight: inherit; } 35 | 36 | /********************************************* 37 | * HEADERS 38 | *********************************************/ 39 | .reveal h1, .reveal h2, .reveal h3, .reveal h4, .reveal h5, .reveal h6 { 40 | margin: 0 0 2.5rem 0; 41 | color: #444; 42 | font-family: "Roboto Slab", serif; 43 | font-weight: normal; 44 | line-height: 1.5em; 45 | letter-spacing: normal; 46 | text-transform: none; 47 | text-shadow: none; 48 | word-wrap: break-word; } 49 | 50 | .reveal.has-dark-background h1, .reveal.has-dark-background h2, .reveal.has-dark-background h3, .reveal.has-dark-background h4, .reveal.has-dark-background h5, .reveal.has-dark-background h6 { 51 | color: #fff; } 52 | 53 | .reveal h1 { 54 | font-size: 6rem; } 55 | 56 | .reveal h2 { 57 | font-size: 4rem; } 58 | 59 | .reveal h3 { 60 | font-size: 3rem; } 61 | 62 | .reveal h4 { 63 | font-size: 2rem; } 64 | 65 | .reveal h1 { 66 | text-shadow: none; } 67 | 68 | /********************************************* 69 | * OTHER 70 | *********************************************/ 71 | .reveal p { 72 | margin: 2.5rem 0; 73 | line-height: 1.3; } 74 | 75 | /* Ensure certain elements are never larger than the slide itself */ 76 | .reveal img, 77 | .reveal video, 78 | .reveal iframe { 79 | max-width: 90%; 80 | max-height: 90%; } 81 | 82 | .reveal strong, 83 | .reveal b { 84 | font-weight: bold; } 85 | 86 | .reveal em { 87 | font-style: italic; } 88 | 89 | .reveal ol, 90 | .reveal dl, 91 | .reveal ul { 92 | display: inline-block; 93 | text-align: left; 94 | margin: 0 0 0 1em; } 95 | 96 | .reveal ol { 97 | list-style-type: decimal; } 98 | 99 | .reveal ul { 100 | list-style-type: disc; } 101 | 102 | .reveal ul ul { 103 | list-style-type: square; } 104 | 105 | .reveal ul ul ul { 106 | list-style-type: circle; } 107 | 108 | .reveal ul ul, 109 | .reveal ul ol, 110 | .reveal ol ol, 111 | .reveal ol ul { 112 | display: block; 113 | margin-left: 40px; } 114 | 115 | .reveal dt { 116 | font-weight: bold; } 117 | 118 | .reveal dd { 119 | margin-left: 40px; } 120 | 121 | .reveal q, 122 | .reveal blockquote { 123 | quotes: none; } 124 | 125 | .reveal blockquote { 126 | display: block; 127 | position: relative; 128 | width: 70%; 129 | margin: 2.5rem auto; 130 | padding: 5px; 131 | font-style: italic; 132 | background: rgba(255, 255, 255, 0.05); 133 | box-shadow: 0px 0px 2px rgba(0, 0, 0, 0.2); } 134 | 135 | .reveal blockquote p:first-child, 136 | .reveal blockquote p:last-child { 137 | display: inline-block; } 138 | 139 | .reveal q { 140 | font-style: italic; } 141 | 142 | .reveal pre { 143 | display: block; 144 | position: relative; 145 | width: 90%; 146 | margin: 2.5rem auto; 147 | text-align: left; 148 | font-size: 0.5em; 149 | font-family: monospace; 
150 | line-height: 1.2em; 151 | word-wrap: break-word; 152 | box-shadow: 0px 0px 6px rgba(0, 0, 0, 0.3); } 153 | 154 | .reveal code { 155 | font-family: monospace; } 156 | 157 | .reveal pre code { 158 | display: block; 159 | padding: 5px; 160 | overflow: auto; 161 | max-height: 400px; 162 | word-wrap: normal; 163 | background: #3F3F3F; 164 | color: #DCDCDC; } 165 | 166 | .reveal table { 167 | font-size: 0.5em; 168 | margin: auto; 169 | border-collapse: collapse; 170 | border-spacing: 0; } 171 | 172 | .reveal table th { 173 | font-weight: bold; } 174 | 175 | .reveal table th, 176 | .reveal table td { 177 | text-align: left; 178 | padding: 0.2em 0.5em 0.2em 0.5em; 179 | border-bottom: 1px solid; } 180 | 181 | .reveal table tr:last-child td { 182 | border-bottom: none; } 183 | 184 | .reveal sup { 185 | vertical-align: super; } 186 | 187 | .reveal sub { 188 | vertical-align: sub; } 189 | 190 | .reveal small { 191 | display: inline-block; 192 | font-size: 0.6em; 193 | line-height: 1.2em; 194 | vertical-align: top; } 195 | 196 | .reveal small * { 197 | vertical-align: top; } 198 | 199 | .reveal .text-left { 200 | text-align: left; } 201 | 202 | /********************************************* 203 | * LINKS 204 | *********************************************/ 205 | .reveal a { 206 | color: #333; 207 | text-decoration: none; 208 | background-image: linear-gradient(to bottom, transparent 50%, rgba(0, 0, 0, 0.3) 50%); 209 | background-repeat: repeat-x; 210 | background-size: 2px 2px; 211 | background-position: 0 1em; 212 | -webkit-transition: color .15s ease; 213 | -moz-transition: color .15s ease; 214 | transition: color .15s ease; } 215 | 216 | .reveal a:hover { 217 | color: #111; 218 | text-shadow: none; 219 | border: none; } 220 | 221 | .reveal.has-dark-background a { 222 | color: #eee; 223 | background-image: linear-gradient(to bottom, rgba(255, 255, 255, 0) 50%, rgba(255, 255, 255, 0.3) 50%); } 224 | 225 | .reveal.has-dark-background a:hover { 226 | color: #fff; } 227 | 228 | .reveal .roll span:after { 229 | color: #fff; 230 | background: #0d0d0d; } 231 | 232 | /********************************************* 233 | * IMAGES 234 | *********************************************/ 235 | .reveal section img.thumbnail { 236 | margin: 15px 0px; 237 | background: rgba(255, 255, 255, 0.12); 238 | border: 4px solid #444; 239 | box-shadow: 0 0 10px rgba(0, 0, 0, 0.15); } 240 | 241 | .reveal a img { 242 | -webkit-transition: all .15s linear; 243 | -moz-transition: all .15s linear; 244 | transition: all .15s linear; } 245 | 246 | .reveal a:hover img { 247 | background: rgba(255, 255, 255, 0.2); 248 | border-color: #333; 249 | box-shadow: 0 0 20px rgba(0, 0, 0, 0.55); } 250 | 251 | /********************************************* 252 | * NAVIGATION CONTROLS 253 | *********************************************/ 254 | .reveal .controls div.navigate-left, 255 | .reveal .controls div.navigate-left.enabled { 256 | border-right-color: #333; } 257 | 258 | .reveal .controls div.navigate-right, 259 | .reveal .controls div.navigate-right.enabled { 260 | border-left-color: #333; } 261 | 262 | .reveal .controls div.navigate-up, 263 | .reveal .controls div.navigate-up.enabled { 264 | border-bottom-color: #333; } 265 | 266 | .reveal .controls div.navigate-down, 267 | .reveal .controls div.navigate-down.enabled { 268 | border-top-color: #333; } 269 | 270 | .reveal .controls div.navigate-left.enabled:hover { 271 | border-right-color: #111; } 272 | 273 | .reveal .controls div.navigate-right.enabled:hover { 274 | border-left-color: 
#111; } 275 | 276 | .reveal .controls div.navigate-up.enabled:hover { 277 | border-bottom-color: #111; } 278 | 279 | .reveal .controls div.navigate-down.enabled:hover { 280 | border-top-color: #111; } 281 | 282 | /********************************************* 283 | * PROGRESS BAR 284 | *********************************************/ 285 | .reveal .progress { 286 | background: rgba(0, 0, 0, 0.2); } 287 | 288 | .reveal .progress span { 289 | background: #333; 290 | -webkit-transition: width 800ms cubic-bezier(0.26, 0.86, 0.44, 0.985); 291 | -moz-transition: width 800ms cubic-bezier(0.26, 0.86, 0.44, 0.985); 292 | transition: width 800ms cubic-bezier(0.26, 0.86, 0.44, 0.985); } 293 | 294 | /********************************************* 295 | * SLIDE NUMBER 296 | *********************************************/ 297 | .reveal .slide-number { 298 | color: #333; } 299 | 300 | /********************************************* 301 | * SLIDE MASTER 302 | *********************************************/ 303 | .reveal .slide-background.master-white { 304 | background-color: #fff; 305 | background-repeat: no-repeat; } 306 | 307 | .reveal .slide-background.master01 { 308 | background: url("01_corner_bottom_right.svg") bottom right, url("01_corner_top_left.svg") top left; 309 | background-color: #1E5A96; 310 | background-repeat: no-repeat; } 311 | 312 | .master01 ~ section h1, .master01 ~ section h2, .master01 ~ section h3, .master01 ~ section h4, .master01 ~ section h5, .master01 ~ section h6 { 313 | color: #1E5A96; } 314 | 315 | .reveal .slide-background.master02 { 316 | background: url("02_corner_top_right.svg") top right, url("02_corner_top_left.svg") top left; 317 | background-color: #3B7BBE; 318 | background-repeat: no-repeat; } 319 | 320 | .master02 ~ section h1, .master02 ~ section h2, .master02 ~ section h3, .master02 ~ section h4, .master02 ~ section h5, .master02 ~ section h6 { 321 | color: #3B7BBE; } 322 | 323 | .reveal .slide-background.master03 { 324 | background: url("03_corner_top_right.svg") top right, url("03_corner_top_left.svg") top left; 325 | background-color: #238BCA; 326 | background-repeat: no-repeat; } 327 | 328 | .master03 ~ section h1, .master03 ~ section h2, .master03 ~ section h3, .master03 ~ section h4, .master03 ~ section h5, .master03 ~ section h6 { 329 | color: #238BCA; } 330 | 331 | .reveal .slide-background.master04 { 332 | background: url("04_corner_top_right.svg") top right, url("04_corner_top_left.svg") top left; 333 | background-color: #2C97A6; 334 | background-repeat: no-repeat; } 335 | 336 | .master04 ~ section h1, .master04 ~ section h2, .master04 ~ section h3, .master04 ~ section h4, .master04 ~ section h5, .master04 ~ section h6 { 337 | color: #2C97A6; } 338 | 339 | .reveal .slide-background.master05 { 340 | background: url("05_corner_top_right.svg") top right, url("05_corner_top_left.svg") top left; 341 | background-color: #69B978; 342 | background-repeat: no-repeat; } 343 | 344 | .master05 ~ section h1, .master05 ~ section h2, .master05 ~ section h3, .master05 ~ section h4, .master05 ~ section h5, .master05 ~ section h6 { 345 | color: #69B978; } 346 | 347 | .master01 { 348 | color: white !important; } 349 | .master01 h1, .master01 h2, .master01 h3, .master01 h4, .master01 h5, .master01 h6 { 350 | color: white !important; } 351 | 352 | .master02 { 353 | color: white !important; } 354 | .master02 h1, .master02 h2, .master02 h3, .master02 h4, .master02 h5, .master02 h6 { 355 | color: white !important; } 356 | 357 | .master03 { 358 | color: white !important; } 359 | 
.master03 h1, .master03 h2, .master03 h3, .master03 h4, .master03 h5, .master03 h6 { 360 | color: white !important; } 361 | 362 | .master04 { 363 | color: white !important; } 364 | .master04 h1, .master04 h2, .master04 h3, .master04 h4, .master04 h5, .master04 h6 { 365 | color: white !important; } 366 | 367 | .master05 { 368 | color: white !important; } 369 | .master05 h1, .master05 h2, .master05 h3, .master05 h4, .master05 h5, .master05 h6 { 370 | color: white !important; } 371 | -------------------------------------------------------------------------------- /labs/35_add_new_node_and_master.md: -------------------------------------------------------------------------------- 1 | ## Lab 3.5: Add New OpenShift Node and Master 2 | 3 | In this lab we will add a new node and a new master to our OpenShift cluster. 4 | 5 | 6 | ### Lab 3.5.1: Add a New Node 7 | 8 | Uncomment the new node (`app-node1.user...`) in the Ansible inventory and also uncomment the `new_nodes` group in the "[OSEv3:children]" section. 9 | ``` 10 | [ec2-user@master0 ~]$ sudo vim /etc/ansible/hosts 11 | ... 12 | glusterfs 13 | bastion 14 | #new_masters 15 | new_nodes 16 | ... 17 | 18 | [new_nodes] 19 | app-node1.user7.lab.openshift.ch openshift_node_group_name='node-config-compute' 20 | ... 21 | 22 | ``` 23 | 24 | As in lab 2.2 we need to run an Ansible playbook to prepare the new node for the OpenShift installation. The playbook enables required repositories, installs packages and sets up storage according to the [documented prerequisites](https://docs.openshift.com/container-platform/3.6/install_config/install/host_preparation.html). 25 | 26 | Test the ssh connection and run the pre-install playbook: 27 | ``` 28 | [ec2-user@master0 ~]$ ansible new_nodes[0] -m ping 29 | [ec2-user@master0 ~]$ ansible-playbook resource/prepare_hosts_for_ose.yml --limit=new_nodes[0] 30 | [ec2-user@master0 ~]$ ansible-playbook resource/prepare_docker_storage.yml --limit=new_nodes[0] 31 | ``` 32 | 33 | Now add the new node with the scaleup playbook: 34 | ``` 35 | [ec2-user@master0 ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-node/scaleup.yml 36 | ``` 37 | 38 | Check if the node is ready: 39 | ``` 40 | [ec2-user@master0 ~]$ oc get nodes 41 | NAME STATUS AGE VERSION 42 | app-node0.user2.lab.openshift.ch Ready 3h v1.6.1+5115d708d7 43 | app-node1.user2.lab.openshift.ch Ready,SchedulingDisabled 4m v1.6.1+5115d708d7 44 | infra-node0.user2.lab.openshift.ch Ready 4h v1.6.1+5115d708d7 45 | infra-node1.user2.lab.openshift.ch Ready 4h v1.6.1+5115d708d7 46 | infra-node2.user2.lab.openshift.ch Ready 4h v1.6.1+5115d708d7 47 | master0.user2.lab.openshift.ch Ready,SchedulingDisabled 4h v1.6.1+5115d708d7 48 | master1.user2.lab.openshift.ch Ready,SchedulingDisabled 4h v1.6.1+5115d708d7 49 | ``` 50 | 51 | Enable scheduling for the new node app-node1, drain another one (e.g. app-node0) and check if pods are running correctly on the new node. If you don't see any pods on it make sure there is at least one "non-infra-pod" running on your OpenShift cluster. 
52 | ``` 53 | [ec2-user@master0 ~]$ oc adm manage-node app-node1.user[X].lab.openshift.ch --schedulable 54 | [ec2-user@master0 ~]$ oc adm drain app-node0.user[X].lab.openshift.ch --ignore-daemonsets --delete-local-data 55 | [ec2-user@master0 ~]$ watch "oc adm manage-node app-node1.user[X].lab.openshift.ch --list-pods" 56 | ``` 57 | 58 | If everything works as expected, we schedule app-node0 again: 59 | ``` 60 | [ec2-user@master0 ~]$ oc adm manage-node app-node0.user[X].lab.openshift.ch --schedulable 61 | ``` 62 | 63 | Inside the Ansible inventory, we move the new node from the `[new_nodes]` to the `[app_nodes]` group: 64 | ``` 65 | [ec2-user@master0 ~]$ cat /etc/ansible/hosts 66 | ... 67 | [app_nodes] 68 | app-node0.user[X].lab.openshift.ch openshift_hostname=app-node0.user[X].lab.openshift.ch openshift_public_hostname=app-node0.user[X].lab.openshift.ch openshift_node_labels="{'region': 'primary', 'zone': 'default'}" 69 | app-node1.user[X].lab.openshift.ch openshift_hostname=app-node1.user[X].lab.openshift.ch openshift_public_hostname=app-node1.user[X].lab.openshift.ch openshift_node_labels="{'region': 'primary', 'zone': 'default'}" 70 | ... 71 | 72 | [new_nodes] 73 | #master2.user... 74 | 75 | [glusterfs] 76 | ... 77 | ``` 78 | 79 | 80 | ### Lab 3.5.2: Add a New Master 81 | 82 | Uncomment the new master inside the Ansible inventory. It needs to be in both the `[new_nodes]` and the `[new_masters]` groups. 83 | ``` 84 | [ec2-user@master0 ~]$ cat /etc/ansible/hosts 85 | ... 86 | glusterfs 87 | bastion 88 | new_masters 89 | new_nodes 90 | ... 91 | [new_masters] 92 | master2.user[X].lab.openshift.ch openshift_hostname=master2.user[X].lab.openshift.ch openshift_public_hostname=master2.user[X].lab.openshift.ch openshift_node_labels="{'region': 'infra', 'zone': 'default'}" openshift_schedulable=false 93 | ... 94 | [new_nodes] 95 | master2.user[X].lab.openshift.ch openshift_hostname=master2.user[X].lab.openshift.ch openshift_public_hostname=master2.user[X].lab.openshift.ch openshift_node_labels="{'region': 'infra', 'zone': 'default'}" openshift_schedulable=false 96 | ... 
97 | ``` 98 | 99 | Check if the host is accessible and run the pre-install playbook: 100 | ``` 101 | [ec2-user@master0 ~]$ ansible master2.user[X].lab.openshift.ch -m ping 102 | [ec2-user@master0 ~]$ ansible-playbook resource/prepare_hosts_for_ose.yml --limit=master2.user[X].lab.openshift.ch 103 | [ec2-user@master0 ~]$ ansible-playbook resource/prepare_docker_storage.yml --limit=master2.user[X].lab.openshift.ch 104 | ``` 105 | 106 | Now we can add the new master: 107 | ``` 108 | [ec2-user@master0 ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-master/scaleup.yml 109 | ``` 110 | 111 | Let's check if the node daemon on the new master is ready: 112 | ``` 113 | [ec2-user@master0 ~]$ oc get nodes 114 | NAME STATUS ROLES AGE VERSION 115 | app-node0.user7.lab.openshift.ch Ready compute 1d v1.11.0+d4cacc0 116 | app-node1.user7.lab.openshift.ch Ready compute 1d v1.11.0+d4cacc0 117 | infra-node0.user7.lab.openshift.ch Ready infra 1d v1.11.0+d4cacc0 118 | infra-node1.user7.lab.openshift.ch Ready infra 1d v1.11.0+d4cacc0 119 | infra-node2.user7.lab.openshift.ch Ready infra 1d v1.11.0+d4cacc0 120 | master0.user7.lab.openshift.ch Ready master 1d v1.11.0+d4cacc0 121 | master1.user7.lab.openshift.ch Ready master 1d v1.11.0+d4cacc0 122 | master2.user7.lab.openshift.ch Ready master 6m v1.11.0+d4cacc0 123 | 124 | ``` 125 | 126 | Check if the old masters see the new one: 127 | ``` 128 | [ec2-user@master0 ~]$ curl https://master2.user[X].lab.openshift.ch 129 | { 130 | "paths": [ 131 | "/api", 132 | "/api/v1", 133 | "/apis", 134 | "/apis/apps", 135 | "/apis/apps.openshift.io", 136 | "/apis/apps.openshift.io/v1", 137 | "/apis/apps/v1beta1", 138 | "/apis/authentication.k8s.io", 139 | "/apis/authentication.k8s.io/v1", 140 | "/apis/authentication.k8s.io/v1beta1", 141 | "/apis/authorization.k8s.io", 142 | "/apis/authorization.k8s.io/v1", 143 | "/apis/authorization.k8s.io/v1beta1", 144 | "/apis/authorization.openshift.io", 145 | "/apis/authorization.openshift.io/v1", 146 | "/apis/autoscaling", 147 | "/apis/autoscaling/v1", 148 | "/apis/batch", 149 | "/apis/batch/v1", 150 | "/apis/batch/v2alpha1", 151 | "/apis/build.openshift.io", 152 | "/apis/build.openshift.io/v1", 153 | "/apis/certificates.k8s.io", 154 | "/apis/certificates.k8s.io/v1beta1", 155 | "/apis/extensions", 156 | "/apis/extensions/v1beta1", 157 | "/apis/image.openshift.io", 158 | "/apis/image.openshift.io/v1", 159 | "/apis/network.openshift.io", 160 | "/apis/network.openshift.io/v1", 161 | "/apis/oauth.openshift.io", 162 | "/apis/oauth.openshift.io/v1", 163 | "/apis/policy", 164 | "/apis/policy/v1beta1", 165 | "/apis/project.openshift.io", 166 | "/apis/project.openshift.io/v1", 167 | "/apis/quota.openshift.io", 168 | "/apis/quota.openshift.io/v1", 169 | "/apis/rbac.authorization.k8s.io", 170 | "/apis/rbac.authorization.k8s.io/v1beta1", 171 | "/apis/route.openshift.io", 172 | "/apis/route.openshift.io/v1", 173 | "/apis/security.openshift.io", 174 | "/apis/security.openshift.io/v1", 175 | "/apis/storage.k8s.io", 176 | "/apis/storage.k8s.io/v1", 177 | "/apis/storage.k8s.io/v1beta1", 178 | "/apis/template.openshift.io", 179 | "/apis/template.openshift.io/v1", 180 | "/apis/user.openshift.io", 181 | "/apis/user.openshift.io/v1", 182 | "/controllers", 183 | "/healthz", 184 | "/healthz/ping", 185 | "/healthz/poststarthook/bootstrap-controller", 186 | "/healthz/poststarthook/ca-registration", 187 | "/healthz/ready", 188 | "/metrics", 189 | "/oapi", 190 | "/oapi/v1", 191 | "/swaggerapi", 192 | "/version", 193 | "/version/openshift" 194 
| ] 195 | } 196 | ... 197 | ``` 198 | 199 | If everything worked as expected, we are going to move the new master from the `[new_masters]` to the `[masters]` group inside the Ansible inventory: 200 | ``` 201 | [ec2-user@master0 ~]$ sudo vim /etc/ansible/hosts 202 | [masters] 203 | master0.user[X].lab.openshift.ch openshift_hostname=master0.user[X].lab.openshift.ch openshift_public_hostname=master0.user[X].lab.openshift.ch openshift_node_labels="{'region': 'infra', 'zone': 'default'}" 204 | master1.user[X].lab.openshift.ch openshift_hostname=master1.user[X].lab.openshift.ch openshift_public_hostname=master1.user[X].lab.openshift.ch openshift_node_labels="{'region': 'infra', 'zone': 'default'}" 205 | master2.user[X].lab.openshift.ch openshift_hostname=master2.user[X].lab.openshift.ch openshift_public_hostname=master2.user[X].lab.openshift.ch openshift_node_labels="{'region': 'infra', 'zone': 'default'}" 206 | ... 207 | ``` 208 | 209 | This means we now have empty `[new_nodes]` and `[new_masters]` groups. 210 | ``` 211 | [ec2-user@master0 ~]$ cat /etc/ansible/hosts 212 | ... 213 | [new_masters] 214 | ... 215 | [new_nodes] 216 | ``` 217 | 218 | 219 | 220 | ### Lab 3.5.3: Fix Logging 221 | 222 | The default logging stack on OpenShift mainly consists of Elasticsearch, fluentd and Kibana, where fluentd is a DaemonSet. This means that a fluentd pod is automatically deployed on every node, even if scheduling is disabled for that node. The limiting factor for the deployment of DaemonSet pods is the node selector, which is set by default to the label `logging-infra-fluentd=true`. The logging playbook attaches this label to all nodes by default, so if you wanted to prevent the deployment of fluentd on certain hosts, you had to add the label `logging-infra-fluentd=false` in the inventory. As you may have seen, we do not specify the label in the inventory, which means: 223 | - Every node gets the `logging-infra-fluentd=true` label attached by the logging playbook 224 | - fluentd is deployed on every node 225 | 226 | This means the new nodes did not yet get the fluentd label, because the logging playbook was last executed before they were part of the cluster. We can confirm this by looking at what labels each node has: 227 | ``` 228 | oc get nodes --show-labels 229 | ``` 230 | 231 | Then we correct it either by executing the logging playbook or by manually labelling the nodes with `oc`. Executing the playbook takes quite some time, but we leave this choice to you: 232 | - So either execute the playbook: 233 | ``` 234 | [ec2-user@master0 ~]$ ansible-playbook /usr/share/ansible/openshift-ansible/playbooks/openshift-logging/config.yml 235 | ``` 236 | 237 | - Or label the nodes manually with `oc`: 238 | ``` 239 | [ec2-user@master0 ~]$ oc label node app-node1.user[X].lab.openshift.ch logging-infra-fluentd=true 240 | [ec2-user@master0 ~]$ oc label node master2.user[X].lab.openshift.ch logging-infra-fluentd=true 241 | ``` 242 | 243 | Confirm that the nodes now have the correct label: 244 | ``` 245 | oc get nodes --show-labels 246 | ``` 247 | 248 | 249 | --- 250 | 251 | **End of Lab 3.5** 252 | 253 |

[4. Configuration Best Practices →](40_configuration_best_practices.md)

254 | 255 | [← back to the Chapter Overview](30_daily_business.md) 256 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Attribution-ShareAlike 4.0 International 2 | 3 | ======================================================================= 4 | 5 | Creative Commons Corporation ("Creative Commons") is not a law firm and 6 | does not provide legal services or legal advice. Distribution of 7 | Creative Commons public licenses does not create a lawyer-client or 8 | other relationship. Creative Commons makes its licenses and related 9 | information available on an "as-is" basis. Creative Commons gives no 10 | warranties regarding its licenses, any material licensed under their 11 | terms and conditions, or any related information. Creative Commons 12 | disclaims all liability for damages resulting from their use to the 13 | fullest extent possible. 14 | 15 | Using Creative Commons Public Licenses 16 | 17 | Creative Commons public licenses provide a standard set of terms and 18 | conditions that creators and other rights holders may use to share 19 | original works of authorship and other material subject to copyright 20 | and certain other rights specified in the public license below. The 21 | following considerations are for informational purposes only, are not 22 | exhaustive, and do not form part of our licenses. 23 | 24 | Considerations for licensors: Our public licenses are 25 | intended for use by those authorized to give the public 26 | permission to use material in ways otherwise restricted by 27 | copyright and certain other rights. Our licenses are 28 | irrevocable. Licensors should read and understand the terms 29 | and conditions of the license they choose before applying it. 30 | Licensors should also secure all rights necessary before 31 | applying our licenses so that the public can reuse the 32 | material as expected. Licensors should clearly mark any 33 | material not subject to the license. This includes other CC- 34 | licensed material, or material used under an exception or 35 | limitation to copyright. More considerations for licensors: 36 | wiki.creativecommons.org/Considerations_for_licensors 37 | 38 | Considerations for the public: By using one of our public 39 | licenses, a licensor grants the public permission to use the 40 | licensed material under specified terms and conditions. If 41 | the licensor's permission is not necessary for any reason--for 42 | example, because of any applicable exception or limitation to 43 | copyright--then that use is not regulated by the license. Our 44 | licenses grant only permissions under copyright and certain 45 | other rights that a licensor has authority to grant. Use of 46 | the licensed material may still be restricted for other 47 | reasons, including because others have copyright or other 48 | rights in the material. A licensor may make special requests, 49 | such as asking that all changes be marked or described. 50 | Although not required by our licenses, you are encouraged to 51 | respect those requests where reasonable. 
More_considerations 52 | for the public: 53 | wiki.creativecommons.org/Considerations_for_licensees 54 | 55 | ======================================================================= 56 | 57 | Creative Commons Attribution-ShareAlike 4.0 International Public 58 | License 59 | 60 | By exercising the Licensed Rights (defined below), You accept and agree 61 | to be bound by the terms and conditions of this Creative Commons 62 | Attribution-ShareAlike 4.0 International Public License ("Public 63 | License"). To the extent this Public License may be interpreted as a 64 | contract, You are granted the Licensed Rights in consideration of Your 65 | acceptance of these terms and conditions, and the Licensor grants You 66 | such rights in consideration of benefits the Licensor receives from 67 | making the Licensed Material available under these terms and 68 | conditions. 69 | 70 | 71 | Section 1 -- Definitions. 72 | 73 | a. Adapted Material means material subject to Copyright and Similar 74 | Rights that is derived from or based upon the Licensed Material 75 | and in which the Licensed Material is translated, altered, 76 | arranged, transformed, or otherwise modified in a manner requiring 77 | permission under the Copyright and Similar Rights held by the 78 | Licensor. For purposes of this Public License, where the Licensed 79 | Material is a musical work, performance, or sound recording, 80 | Adapted Material is always produced where the Licensed Material is 81 | synched in timed relation with a moving image. 82 | 83 | b. Adapter's License means the license You apply to Your Copyright 84 | and Similar Rights in Your contributions to Adapted Material in 85 | accordance with the terms and conditions of this Public License. 86 | 87 | c. BY-SA Compatible License means a license listed at 88 | creativecommons.org/compatiblelicenses, approved by Creative 89 | Commons as essentially the equivalent of this Public License. 90 | 91 | d. Copyright and Similar Rights means copyright and/or similar rights 92 | closely related to copyright including, without limitation, 93 | performance, broadcast, sound recording, and Sui Generis Database 94 | Rights, without regard to how the rights are labeled or 95 | categorized. For purposes of this Public License, the rights 96 | specified in Section 2(b)(1)-(2) are not Copyright and Similar 97 | Rights. 98 | 99 | e. Effective Technological Measures means those measures that, in the 100 | absence of proper authority, may not be circumvented under laws 101 | fulfilling obligations under Article 11 of the WIPO Copyright 102 | Treaty adopted on December 20, 1996, and/or similar international 103 | agreements. 104 | 105 | f. Exceptions and Limitations means fair use, fair dealing, and/or 106 | any other exception or limitation to Copyright and Similar Rights 107 | that applies to Your use of the Licensed Material. 108 | 109 | g. License Elements means the license attributes listed in the name 110 | of a Creative Commons Public License. The License Elements of this 111 | Public License are Attribution and ShareAlike. 112 | 113 | h. Licensed Material means the artistic or literary work, database, 114 | or other material to which the Licensor applied this Public 115 | License. 116 | 117 | i. Licensed Rights means the rights granted to You subject to the 118 | terms and conditions of this Public License, which are limited to 119 | all Copyright and Similar Rights that apply to Your use of the 120 | Licensed Material and that the Licensor has authority to license. 121 | 122 | j. 
Licensor means the individual(s) or entity(ies) granting rights 123 | under this Public License. 124 | 125 | k. Share means to provide material to the public by any means or 126 | process that requires permission under the Licensed Rights, such 127 | as reproduction, public display, public performance, distribution, 128 | dissemination, communication, or importation, and to make material 129 | available to the public including in ways that members of the 130 | public may access the material from a place and at a time 131 | individually chosen by them. 132 | 133 | l. Sui Generis Database Rights means rights other than copyright 134 | resulting from Directive 96/9/EC of the European Parliament and of 135 | the Council of 11 March 1996 on the legal protection of databases, 136 | as amended and/or succeeded, as well as other essentially 137 | equivalent rights anywhere in the world. 138 | 139 | m. You means the individual or entity exercising the Licensed Rights 140 | under this Public License. Your has a corresponding meaning. 141 | 142 | 143 | Section 2 -- Scope. 144 | 145 | a. License grant. 146 | 147 | 1. Subject to the terms and conditions of this Public License, 148 | the Licensor hereby grants You a worldwide, royalty-free, 149 | non-sublicensable, non-exclusive, irrevocable license to 150 | exercise the Licensed Rights in the Licensed Material to: 151 | 152 | a. reproduce and Share the Licensed Material, in whole or 153 | in part; and 154 | 155 | b. produce, reproduce, and Share Adapted Material. 156 | 157 | 2. Exceptions and Limitations. For the avoidance of doubt, where 158 | Exceptions and Limitations apply to Your use, this Public 159 | License does not apply, and You do not need to comply with 160 | its terms and conditions. 161 | 162 | 3. Term. The term of this Public License is specified in Section 163 | 6(a). 164 | 165 | 4. Media and formats; technical modifications allowed. The 166 | Licensor authorizes You to exercise the Licensed Rights in 167 | all media and formats whether now known or hereafter created, 168 | and to make technical modifications necessary to do so. The 169 | Licensor waives and/or agrees not to assert any right or 170 | authority to forbid You from making technical modifications 171 | necessary to exercise the Licensed Rights, including 172 | technical modifications necessary to circumvent Effective 173 | Technological Measures. For purposes of this Public License, 174 | simply making modifications authorized by this Section 2(a) 175 | (4) never produces Adapted Material. 176 | 177 | 5. Downstream recipients. 178 | 179 | a. Offer from the Licensor -- Licensed Material. Every 180 | recipient of the Licensed Material automatically 181 | receives an offer from the Licensor to exercise the 182 | Licensed Rights under the terms and conditions of this 183 | Public License. 184 | 185 | b. Additional offer from the Licensor -- Adapted Material. 186 | Every recipient of Adapted Material from You 187 | automatically receives an offer from the Licensor to 188 | exercise the Licensed Rights in the Adapted Material 189 | under the conditions of the Adapter's License You apply. 190 | 191 | c. No downstream restrictions. You may not offer or impose 192 | any additional or different terms or conditions on, or 193 | apply any Effective Technological Measures to, the 194 | Licensed Material if doing so restricts exercise of the 195 | Licensed Rights by any recipient of the Licensed 196 | Material. 197 | 198 | 6. No endorsement. 
Nothing in this Public License constitutes or 199 | may be construed as permission to assert or imply that You 200 | are, or that Your use of the Licensed Material is, connected 201 | with, or sponsored, endorsed, or granted official status by, 202 | the Licensor or others designated to receive attribution as 203 | provided in Section 3(a)(1)(A)(i). 204 | 205 | b. Other rights. 206 | 207 | 1. Moral rights, such as the right of integrity, are not 208 | licensed under this Public License, nor are publicity, 209 | privacy, and/or other similar personality rights; however, to 210 | the extent possible, the Licensor waives and/or agrees not to 211 | assert any such rights held by the Licensor to the limited 212 | extent necessary to allow You to exercise the Licensed 213 | Rights, but not otherwise. 214 | 215 | 2. Patent and trademark rights are not licensed under this 216 | Public License. 217 | 218 | 3. To the extent possible, the Licensor waives any right to 219 | collect royalties from You for the exercise of the Licensed 220 | Rights, whether directly or through a collecting society 221 | under any voluntary or waivable statutory or compulsory 222 | licensing scheme. In all other cases the Licensor expressly 223 | reserves any right to collect such royalties. 224 | 225 | 226 | Section 3 -- License Conditions. 227 | 228 | Your exercise of the Licensed Rights is expressly made subject to the 229 | following conditions. 230 | 231 | a. Attribution. 232 | 233 | 1. If You Share the Licensed Material (including in modified 234 | form), You must: 235 | 236 | a. retain the following if it is supplied by the Licensor 237 | with the Licensed Material: 238 | 239 | i. identification of the creator(s) of the Licensed 240 | Material and any others designated to receive 241 | attribution, in any reasonable manner requested by 242 | the Licensor (including by pseudonym if 243 | designated); 244 | 245 | ii. a copyright notice; 246 | 247 | iii. a notice that refers to this Public License; 248 | 249 | iv. a notice that refers to the disclaimer of 250 | warranties; 251 | 252 | v. a URI or hyperlink to the Licensed Material to the 253 | extent reasonably practicable; 254 | 255 | b. indicate if You modified the Licensed Material and 256 | retain an indication of any previous modifications; and 257 | 258 | c. indicate the Licensed Material is licensed under this 259 | Public License, and include the text of, or the URI or 260 | hyperlink to, this Public License. 261 | 262 | 2. You may satisfy the conditions in Section 3(a)(1) in any 263 | reasonable manner based on the medium, means, and context in 264 | which You Share the Licensed Material. For example, it may be 265 | reasonable to satisfy the conditions by providing a URI or 266 | hyperlink to a resource that includes the required 267 | information. 268 | 269 | 3. If requested by the Licensor, You must remove any of the 270 | information required by Section 3(a)(1)(A) to the extent 271 | reasonably practicable. 272 | 273 | b. ShareAlike. 274 | 275 | In addition to the conditions in Section 3(a), if You Share 276 | Adapted Material You produce, the following conditions also apply. 277 | 278 | 1. The Adapter's License You apply must be a Creative Commons 279 | license with the same License Elements, this version or 280 | later, or a BY-SA Compatible License. 281 | 282 | 2. You must include the text of, or the URI or hyperlink to, the 283 | Adapter's License You apply. 
You may satisfy this condition 284 | in any reasonable manner based on the medium, means, and 285 | context in which You Share Adapted Material. 286 | 287 | 3. You may not offer or impose any additional or different terms 288 | or conditions on, or apply any Effective Technological 289 | Measures to, Adapted Material that restrict exercise of the 290 | rights granted under the Adapter's License You apply. 291 | 292 | 293 | Section 4 -- Sui Generis Database Rights. 294 | 295 | Where the Licensed Rights include Sui Generis Database Rights that 296 | apply to Your use of the Licensed Material: 297 | 298 | a. for the avoidance of doubt, Section 2(a)(1) grants You the right 299 | to extract, reuse, reproduce, and Share all or a substantial 300 | portion of the contents of the database; 301 | 302 | b. if You include all or a substantial portion of the database 303 | contents in a database in which You have Sui Generis Database 304 | Rights, then the database in which You have Sui Generis Database 305 | Rights (but not its individual contents) is Adapted Material, 306 | 307 | including for purposes of Section 3(b); and 308 | c. You must comply with the conditions in Section 3(a) if You Share 309 | all or a substantial portion of the contents of the database. 310 | 311 | For the avoidance of doubt, this Section 4 supplements and does not 312 | replace Your obligations under this Public License where the Licensed 313 | Rights include other Copyright and Similar Rights. 314 | 315 | 316 | Section 5 -- Disclaimer of Warranties and Limitation of Liability. 317 | 318 | a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE 319 | EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS 320 | AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF 321 | ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS, 322 | IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION, 323 | WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR 324 | PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS, 325 | ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT 326 | KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT 327 | ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU. 328 | 329 | b. TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE 330 | TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION, 331 | NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT, 332 | INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES, 333 | COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR 334 | USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN 335 | ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR 336 | DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR 337 | IN PART, THIS LIMITATION MAY NOT APPLY TO YOU. 338 | 339 | c. The disclaimer of warranties and limitation of liability provided 340 | above shall be interpreted in a manner that, to the extent 341 | possible, most closely approximates an absolute disclaimer and 342 | waiver of all liability. 343 | 344 | 345 | Section 6 -- Term and Termination. 346 | 347 | a. This Public License applies for the term of the Copyright and 348 | Similar Rights licensed here. However, if You fail to comply with 349 | this Public License, then Your rights under this Public License 350 | terminate automatically. 351 | 352 | b. 
Where Your right to use the Licensed Material has terminated under 353 | Section 6(a), it reinstates: 354 | 355 | 1. automatically as of the date the violation is cured, provided 356 | it is cured within 30 days of Your discovery of the 357 | violation; or 358 | 359 | 2. upon express reinstatement by the Licensor. 360 | 361 | For the avoidance of doubt, this Section 6(b) does not affect any 362 | right the Licensor may have to seek remedies for Your violations 363 | of this Public License. 364 | 365 | c. For the avoidance of doubt, the Licensor may also offer the 366 | Licensed Material under separate terms or conditions or stop 367 | distributing the Licensed Material at any time; however, doing so 368 | will not terminate this Public License. 369 | 370 | d. Sections 1, 5, 6, 7, and 8 survive termination of this Public 371 | License. 372 | 373 | 374 | Section 7 -- Other Terms and Conditions. 375 | 376 | a. The Licensor shall not be bound by any additional or different 377 | terms or conditions communicated by You unless expressly agreed. 378 | 379 | b. Any arrangements, understandings, or agreements regarding the 380 | Licensed Material not stated herein are separate from and 381 | independent of the terms and conditions of this Public License. 382 | 383 | 384 | Section 8 -- Interpretation. 385 | 386 | a. For the avoidance of doubt, this Public License does not, and 387 | shall not be interpreted to, reduce, limit, restrict, or impose 388 | conditions on any use of the Licensed Material that could lawfully 389 | be made without permission under this Public License. 390 | 391 | b. To the extent possible, if any provision of this Public License is 392 | deemed unenforceable, it shall be automatically reformed to the 393 | minimum extent necessary to make it enforceable. If the provision 394 | cannot be reformed, it shall be severed from this Public License 395 | without affecting the enforceability of the remaining terms and 396 | conditions. 397 | 398 | c. No term or condition of this Public License will be waived and no 399 | failure to comply consented to unless expressly agreed to by the 400 | Licensor. 401 | 402 | d. Nothing in this Public License constitutes or may be interpreted 403 | as a limitation upon, or waiver of, any privileges and immunities 404 | that apply to the Licensor or You, including from the legal 405 | processes of any jurisdiction or authority. 406 | 407 | 408 | ======================================================================= 409 | 410 | Creative Commons is not a party to its public 411 | licenses. Notwithstanding, Creative Commons may elect to apply one of 412 | its public licenses to material it publishes and in those instances 413 | will be considered the “Licensor.” The text of the Creative Commons 414 | public licenses is dedicated to the public domain under the CC0 Public 415 | Domain Dedication. Except for the limited purpose of indicating that 416 | material is shared under a Creative Commons public license or as 417 | otherwise permitted by the Creative Commons policies published at 418 | creativecommons.org/policies, Creative Commons does not authorize the 419 | use of the trademark "Creative Commons" or any other trademark or logo 420 | of Creative Commons without its prior written consent including, 421 | without limitation, in connection with any unauthorized modifications 422 | to any of its public licenses or any other arrangements, 423 | understandings, or agreements concerning use of licensed material. 
For 424 | the avoidance of doubt, this paragraph does not form part of the 425 | public licenses. 426 | 427 | Creative Commons may be contacted at creativecommons.org. 428 | 429 | -------------------------------------------------------------------------------- /theme/puzzle_tagline_bg_rgb.svg: -------------------------------------------------------------------------------- (SVG markup stripped during extraction; only the title "puzzle_tagline_bg_rgb" and the note "Created with Sketch." remain.) --------------------------------------------------------------------------------