├── images
│   ├── k8s-ha.jpg
│   ├── dashboard-index.png
│   └── dashboard-login.png
├── tools
│   └── v1.13
│       ├── net.iptables.k8s.conf
│       ├── flannel.install.sh
│       ├── kubeadm.reset.sh
│       ├── kubeadm.init.sh
│       ├── kubeadm.images.sh
│       ├── config.using.cluster.sh
│       ├── install.docker.sh
│       ├── kubernetes.repo
│       ├── metrics-server
│       │   ├── metrics-server-service.yaml
│       │   ├── auth-delegator.yaml
│       │   ├── metrics-apiservice.yaml
│       │   ├── auth-reader.yaml
│       │   ├── aggregated-metrics-reader.yaml
│       │   ├── resource-reader.yaml
│       │   └── metrics-server-deployment.yaml
│       ├── install.docker.composer.sh
│       ├── create.dashboard.token.sh
│       ├── flannel.image.sh
│       ├── sync.master.ca.sh
│       ├── install.k8s.repo.sh
│       ├── kubeadm-config.m01.yaml
│       ├── init.sys.config.sh
│       ├── kubeadm-config.m02.yaml
│       ├── kubeadm-config.m03.yaml
│       ├── kubeadm.other.master.init.sh
│       ├── init.kubeadm.config.sh
│       ├── coredns-ha.yaml
│       ├── kubernetes-dashboard.yaml
│       ├── kubernetes-dashboard-https.yaml
│       ├── nginx-ingress.yaml
│       └── kube-flannel.yml
├── applications
│   ├── Wayne
│   │   ├── screenshots
│   │   │   ├── home.png
│   │   │   ├── admin-node.png
│   │   │   ├── admin-dashboard.png
│   │   │   ├── config-cluster.png
│   │   │   ├── config-cluster-1.png
│   │   │   └── project-dashboard.png
│   │   ├── v1.3.1
│   │   │   ├── wayne
│   │   │   │   ├── service.yaml
│   │   │   │   ├── ingress.yaml
│   │   │   │   ├── configmap.yaml
│   │   │   │   └── deployment.yaml
│   │   │   └── dependency
│   │   │       ├── rabbitmq.yaml
│   │   │       └── mysql.yaml
│   │   └── README.md
│   ├── Weave Scope
│   │   ├── screenshots
│   │   │   ├── weave-pod.png
│   │   │   ├── weave-home.png
│   │   │   └── weave-term.png
│   │   └── README.md
│   ├── Monocular
│   │   ├── yaml
│   │   │   └── custom-repo.yaml
│   │   └── README.md
│   ├── NFS
│   │   └── README.md
│   └── helm
│       └── README.md
├── errors
│   └── 1-pod_status_error.md
└── README.md
/images/k8s-ha.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HikoQiu/kubeadm-install-k8s/HEAD/images/k8s-ha.jpg -------------------------------------------------------------------------------- /images/dashboard-index.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HikoQiu/kubeadm-install-k8s/HEAD/images/dashboard-index.png -------------------------------------------------------------------------------- /images/dashboard-login.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HikoQiu/kubeadm-install-k8s/HEAD/images/dashboard-login.png -------------------------------------------------------------------------------- /tools/v1.13/net.iptables.k8s.conf: -------------------------------------------------------------------------------- 1 | net.bridge.bridge-nf-call-ip6tables = 1 2 | net.bridge.bridge-nf-call-iptables = 1 3 | -------------------------------------------------------------------------------- /applications/Wayne/screenshots/home.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HikoQiu/kubeadm-install-k8s/HEAD/applications/Wayne/screenshots/home.png -------------------------------------------------------------------------------- /applications/Wayne/screenshots/admin-node.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HikoQiu/kubeadm-install-k8s/HEAD/applications/Wayne/screenshots/admin-node.png -------------------------------------------------------------------------------- /applications/Wayne/screenshots/admin-dashboard.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/HikoQiu/kubeadm-install-k8s/HEAD/applications/Wayne/screenshots/admin-dashboard.png -------------------------------------------------------------------------------- /applications/Wayne/screenshots/config-cluster.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HikoQiu/kubeadm-install-k8s/HEAD/applications/Wayne/screenshots/config-cluster.png -------------------------------------------------------------------------------- /applications/Weave Scope/screenshots/weave-pod.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HikoQiu/kubeadm-install-k8s/HEAD/applications/Weave Scope/screenshots/weave-pod.png -------------------------------------------------------------------------------- /applications/Wayne/screenshots/config-cluster-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HikoQiu/kubeadm-install-k8s/HEAD/applications/Wayne/screenshots/config-cluster-1.png -------------------------------------------------------------------------------- /applications/Wayne/screenshots/project-dashboard.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HikoQiu/kubeadm-install-k8s/HEAD/applications/Wayne/screenshots/project-dashboard.png -------------------------------------------------------------------------------- /applications/Weave Scope/screenshots/weave-home.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HikoQiu/kubeadm-install-k8s/HEAD/applications/Weave Scope/screenshots/weave-home.png -------------------------------------------------------------------------------- /applications/Weave Scope/screenshots/weave-term.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/HikoQiu/kubeadm-install-k8s/HEAD/applications/Weave Scope/screenshots/weave-term.png -------------------------------------------------------------------------------- /tools/v1.13/flannel.install.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | # 安装 flannel 4 | # wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml 5 | kubectl apply -f ./kube-flannel.yml 6 | -------------------------------------------------------------------------------- /tools/v1.13/kubeadm.reset.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | vhost="m01 m02 m03" 4 | 5 | for h in $vhost;do 6 | echo "Exec sudo kubeadm reset for $h" 7 | ssh kube@$h "sudo kubeadm reset --force" 8 | done 9 | 10 | -------------------------------------------------------------------------------- /tools/v1.13/kubeadm.init.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | vhost="m01" 4 | 5 | for h in $vhost;do 6 | echo "Exec sudo kubeadm init for $h" 7 | ssh kube@$h "sudo kubeadm init --config kubeadm-config.$h.yaml" 8 | done 9 | 10 | -------------------------------------------------------------------------------- /tools/v1.13/kubeadm.images.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | vhost="m01 m02 
m03" 4 | 5 | for h in $vhost;do 6 | echo "Pull image for $h -- begings" 7 | sudo kubeadm config images pull --config kubeadm-config.$h.yaml 8 | done 9 | 10 | -------------------------------------------------------------------------------- /tools/v1.13/config.using.cluster.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | # 1. 为 kube 用户配置 4 | mkdir -p $HOME/.kube 5 | sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config 6 | sudo chown $(id -u):$(id -g) $HOME/.kube/config 7 | 8 | echo "Finish user kube config ... ok" 9 | -------------------------------------------------------------------------------- /tools/v1.13/install.docker.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | vhosts="m01 m02 m03 n01 n02 ing01" 4 | 5 | for h in $vhosts 6 | do 7 | echo "Install Docker for $h" 8 | ssh kube@$h "sudo yum install -y docker && sudo systemctl enable docker && sudo systemctl start docker" 9 | done 10 | -------------------------------------------------------------------------------- /tools/v1.13/kubernetes.repo: -------------------------------------------------------------------------------- 1 | [kubernetes] 2 | name=Kubernetes 3 | baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/ 4 | enabled=1 5 | gpgcheck=1 6 | repo_gpgcheck=1 7 | gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg 8 | -------------------------------------------------------------------------------- /applications/Wayne/v1.3.1/wayne/service.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: v1 2 | kind: Service 3 | metadata: 4 | labels: 5 | app: infra-wayne 6 | name: infra-wayne 7 | namespace: default 8 | spec: 9 | type: NodePort 10 | ports: 11 | - port: 8080 12 | protocol: TCP 13 | targetPort: 8080 14 | selector: 15 | app: infra-wayne 16 | -------------------------------------------------------------------------------- /applications/Wayne/v1.3.1/wayne/ingress.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: extensions/v1beta1 2 | kind: Ingress 3 | metadata: 4 | name: wayne-ingress 5 | namespace: default 6 | spec: 7 | rules: 8 | - host: wayne.k8s.hiko.im 9 | http: 10 | paths: 11 | - path: / 12 | backend: 13 | serviceName: infra-wayne 14 | servicePort: 8080 15 | -------------------------------------------------------------------------------- /tools/v1.13/metrics-server/metrics-server-service.yaml: -------------------------------------------------------------------------------- 1 | --- 2 | apiVersion: v1 3 | kind: Service 4 | metadata: 5 | name: metrics-server 6 | namespace: kube-system 7 | labels: 8 | kubernetes.io/name: "Metrics-server" 9 | spec: 10 | selector: 11 | k8s-app: metrics-server 12 | ports: 13 | - port: 443 14 | protocol: TCP 15 | targetPort: 443 16 | -------------------------------------------------------------------------------- /tools/v1.13/install.docker.composer.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | vhost="m01 m02 m03" 4 | 5 | for h in $vhost 6 | do 7 | ssh kube@$h "sudo curl -L "https://github.com/docker/compose/releases/download/1.23.2/docker-compose-Linux-x86_64" -o /usr/local/bin/docker-compose" 8 | ssh kube@$h "sudo chmod +x /usr/local/bin/docker-compose" 9 | ssh kube@$h "docker-compose --version" 10 | done 11 | 
-------------------------------------------------------------------------------- /tools/v1.13/metrics-server/auth-delegator.yaml: -------------------------------------------------------------------------------- 1 | --- 2 | apiVersion: rbac.authorization.k8s.io/v1beta1 3 | kind: ClusterRoleBinding 4 | metadata: 5 | name: metrics-server:system:auth-delegator 6 | roleRef: 7 | apiGroup: rbac.authorization.k8s.io 8 | kind: ClusterRole 9 | name: system:auth-delegator 10 | subjects: 11 | - kind: ServiceAccount 12 | name: metrics-server 13 | namespace: kube-system 14 | -------------------------------------------------------------------------------- /tools/v1.13/metrics-server/metrics-apiservice.yaml: -------------------------------------------------------------------------------- 1 | --- 2 | apiVersion: apiregistration.k8s.io/v1beta1 3 | kind: APIService 4 | metadata: 5 | name: v1beta1.metrics.k8s.io 6 | spec: 7 | service: 8 | name: metrics-server 9 | namespace: kube-system 10 | group: metrics.k8s.io 11 | version: v1beta1 12 | insecureSkipTLSVerify: true 13 | groupPriorityMinimum: 100 14 | versionPriority: 100 15 | -------------------------------------------------------------------------------- /tools/v1.13/metrics-server/auth-reader.yaml: -------------------------------------------------------------------------------- 1 | --- 2 | apiVersion: rbac.authorization.k8s.io/v1beta1 3 | kind: RoleBinding 4 | metadata: 5 | name: metrics-server-auth-reader 6 | namespace: kube-system 7 | roleRef: 8 | apiGroup: rbac.authorization.k8s.io 9 | kind: Role 10 | name: extension-apiserver-authentication-reader 11 | subjects: 12 | - kind: ServiceAccount 13 | name: metrics-server 14 | namespace: kube-system 15 | -------------------------------------------------------------------------------- /tools/v1.13/create.dashboard.token.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | kubectl create sa dashboard-admin -n kube-system 4 | kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin 5 | ADMIN_SECRET=$(kubectl get secrets -n kube-system | grep dashboard-admin | awk '{print $1}') 6 | DASHBOARD_LOGIN_TOKEN=$(kubectl describe secret -n kube-system ${ADMIN_SECRET} | grep -E '^token' | awk '{print $2}') 7 | echo ${DASHBOARD_LOGIN_TOKEN} 8 | -------------------------------------------------------------------------------- /tools/v1.13/metrics-server/aggregated-metrics-reader.yaml: -------------------------------------------------------------------------------- 1 | kind: ClusterRole 2 | apiVersion: rbac.authorization.k8s.io/v1 3 | metadata: 4 | name: system:aggregated-metrics-reader 5 | labels: 6 | rbac.authorization.k8s.io/aggregate-to-view: "true" 7 | rbac.authorization.k8s.io/aggregate-to-edit: "true" 8 | rbac.authorization.k8s.io/aggregate-to-admin: "true" 9 | rules: 10 | - apiGroups: ["metrics.k8s.io"] 11 | resources: ["pods"] 12 | verbs: ["get", "list", "watch"] 13 | -------------------------------------------------------------------------------- /applications/Monocular/yaml/custom-repo.yaml: -------------------------------------------------------------------------------- 1 | api: 2 | config: 3 | repos: 4 | - name: stable 5 | url: https://aliacs-app-catalog.oss-cn-hangzhou.aliyuncs.com/charts 6 | source: https://github.com/kubernetes/charts/tree/master/stable 7 | - name: incubator 8 | url: https://aliacs-app-catalog.oss-cn-hangzhou.aliyuncs.com/charts-incubator 9 | source: 
https://github.com/kubernetes/charts/tree/master/incubator 10 | - name: monocular 11 | url: https://kubernetes-helm.github.io/monocular 12 | source: https://github.com/kubernetes-helm/monocular/tree/master/charts -------------------------------------------------------------------------------- /tools/v1.13/flannel.image.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | vhost="m01 m02 m03 n01 n02" 4 | 5 | for h in $vhost 6 | do 7 | 8 | echo "---> $h" 9 | 10 | # 安装 Pod 网络插件 11 | # 这里选择的是 flannel v0.10.0 版本 12 | # 如果想用其他版本,可以替换url 13 | 14 | # 备注:kube-flannel.yml(下面配置的 yaml)中指定的是 quay.io 的镜像。 15 | # 因为国内无法拉 quay.io 的镜像,所以这里从 docker hub 拉去相同镜像, 16 | # 然后打 tag 为 kube-flannel.yml 中指定的 quay.io/coreos/flannel:v0.10.0-amd64 17 | # 再备注:flannel 是所有节点(master 和 node)都需要的网络组件,所以后面其他节点也可以通过相同方式安装 18 | 19 | sudo docker pull jmgao1983/flannel:v0.10.0-amd64 20 | sudo docker tag jmgao1983/flannel:v0.10.0-amd64 quay.io/coreos/flannel:v0.10.0-amd64 21 | 22 | 23 | done 24 | -------------------------------------------------------------------------------- /tools/v1.13/metrics-server/resource-reader.yaml: -------------------------------------------------------------------------------- 1 | --- 2 | apiVersion: rbac.authorization.k8s.io/v1 3 | kind: ClusterRole 4 | metadata: 5 | name: system:metrics-server 6 | rules: 7 | - apiGroups: 8 | - "" 9 | resources: 10 | - pods 11 | - nodes 12 | - nodes/stats 13 | - namespaces 14 | verbs: 15 | - get 16 | - list 17 | - watch 18 | - apiGroups: 19 | - "extensions" 20 | resources: 21 | - deployments 22 | verbs: 23 | - get 24 | - list 25 | - watch 26 | --- 27 | apiVersion: rbac.authorization.k8s.io/v1 28 | kind: ClusterRoleBinding 29 | metadata: 30 | name: system:metrics-server 31 | roleRef: 32 | apiGroup: rbac.authorization.k8s.io 33 | kind: ClusterRole 34 | name: system:metrics-server 35 | subjects: 36 | - kind: ServiceAccount 37 | name: metrics-server 38 | namespace: kube-system 39 | -------------------------------------------------------------------------------- /applications/Wayne/v1.3.1/dependency/rabbitmq.yaml: -------------------------------------------------------------------------------- 1 | kind: Deployment 2 | apiVersion: extensions/v1beta1 3 | metadata: 4 | name: rabbitmq-wayne 5 | namespace: default 6 | labels: 7 | app: rabbitmq-wayne 8 | spec: 9 | replicas: 1 10 | selector: 11 | matchLabels: 12 | app: rabbitmq-wayne 13 | template: 14 | metadata: 15 | labels: 16 | app: rabbitmq-wayne 17 | spec: 18 | containers: 19 | - name: rabbitmq 20 | image: 'rabbitmq:3.7.8-management' 21 | resources: 22 | limits: 23 | cpu: '1' 24 | memory: 1Gi 25 | requests: 26 | cpu: '1' 27 | memory: 128M 28 | --- 29 | apiVersion: v1 30 | kind: Service 31 | metadata: 32 | labels: 33 | app: rabbitmq-wayne 34 | name: rabbitmq-wayne 35 | namespace: default 36 | spec: 37 | ports: 38 | - port: 5672 39 | protocol: TCP 40 | targetPort: 5672 41 | selector: 42 | app: rabbitmq-wayne 43 | -------------------------------------------------------------------------------- /tools/v1.13/sync.master.ca.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | vhost="m02 m03" 4 | usr=root 5 | 6 | who=`whoami` 7 | if [[ "$who" != "$usr" ]];then 8 | echo "请使用 root 用户执行, 或者 sudo ./sync.master.ca.sh" 9 | exit 1 10 | fi 11 | 12 | echo $who 13 | 14 | # 需要从 m01 拷贝的 ca 文件 15 | caFiles=( 16 | /etc/kubernetes/pki/ca.crt 17 | /etc/kubernetes/pki/ca.key 18 | /etc/kubernetes/pki/sa.key 19 | /etc/kubernetes/pki/sa.pub 20 | 
/etc/kubernetes/pki/front-proxy-ca.crt 21 | /etc/kubernetes/pki/front-proxy-ca.key 22 | /etc/kubernetes/pki/etcd/ca.crt 23 | /etc/kubernetes/pki/etcd/ca.key 24 | /etc/kubernetes/admin.conf 25 | ) 26 | 27 | pkiDir=/etc/kubernetes/pki/etcd 28 | for h in $vhost 29 | do 30 | 31 | ssh ${usr}@$h "mkdir -p $pkiDir" 32 | 33 | echo "Dirs for ca scp created, start to scp..." 34 | 35 | # scp 文件到目标机 36 | for f in ${caFiles[@]} 37 | do 38 | echo "scp $f ${usr}@$h:$f" 39 | scp $f ${usr}@$h:$f 40 | done 41 | 42 | echo "Ca files transfered for $h ... ok" 43 | done 44 | -------------------------------------------------------------------------------- /applications/Monocular/README.md: -------------------------------------------------------------------------------- 1 | Monocular: Helm charts 仓库管理 WEB UI 工具 2 | --- 3 | 4 | - [helm/monocular](https://github.com/helm/monocular) 5 | 6 | ## 1. 安装步骤 7 | 8 | ### 1.1 前置要求 9 | 10 | - 已安装 Helm 和 Tiller 11 | - 已安装 Nginx Ingress Controller 12 | 13 | ### 1.2 安装 Monocular 14 | 15 | ``` 16 | helm repo add monocular https://helm.github.io/monocular 17 | helm install monocular/monocular 18 | ``` 19 | 20 | ``` 21 | api: 22 | config: 23 | repos: 24 | - name: stable 25 | url: https://aliacs-app-catalog.oss-cn-hangzhou.aliyuncs.com/charts 26 | source: https://github.com/kubernetes/charts/tree/master/stable 27 | - name: incubator 28 | url: https://aliacs-app-catalog.oss-cn-hangzhou.aliyuncs.com/charts-incubator 29 | source: https://github.com/kubernetes/charts/tree/master/incubator 30 | - name: monocular 31 | url: https://kubernetes-helm.github.io/monocular 32 | source: https://github.com/kubernetes-helm/monocular/tree/master/charts 33 | ``` -------------------------------------------------------------------------------- /applications/Wayne/v1.3.1/dependency/mysql.yaml: -------------------------------------------------------------------------------- 1 | kind: Deployment 2 | apiVersion: extensions/v1beta1 3 | metadata: 4 | name: mysql-wayne 5 | namespace: default 6 | labels: 7 | app: mysql-wayne 8 | spec: 9 | replicas: 1 10 | selector: 11 | matchLabels: 12 | app: mysql-wayne 13 | template: 14 | metadata: 15 | labels: 16 | app: mysql-wayne 17 | spec: 18 | containers: 19 | - name: mysql 20 | image: 'mysql:5.6.41' 21 | env: 22 | - name: MYSQL_ROOT_PASSWORD 23 | value: root 24 | resources: 25 | limits: 26 | cpu: '1' 27 | memory: 512M 28 | requests: 29 | cpu: '1' 30 | memory: 128M 31 | --- 32 | apiVersion: v1 33 | kind: Service 34 | metadata: 35 | labels: 36 | app: mysql-wayne 37 | name: mysql-wayne 38 | namespace: default 39 | spec: 40 | ports: 41 | - port: 3306 42 | protocol: TCP 43 | targetPort: 3306 44 | selector: 45 | app: mysql-wayne 46 | -------------------------------------------------------------------------------- /tools/v1.13/install.k8s.repo.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | vhost="m01 m02 m03 n01 n02" 4 | 5 | master="m01 m02 m03" 6 | nodes="n01 n02" 7 | 8 | ## 1. 
阿里云 kubernetes 仓库 9 | cat <<EOF > kubernetes.repo 10 | [kubernetes] 11 | name=Kubernetes 12 | baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/ 13 | enabled=1 14 | gpgcheck=1 15 | repo_gpgcheck=1 16 | gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg 17 | EOF 18 | 19 | mvCmd="sudo cp ~/kubernetes.repo /etc/yum.repos.d/" 20 | for h in $vhost 21 | do 22 | echo "Setup kubernetes repository for $h" 23 | scp ./kubernetes.repo kube@$h:~ 24 | ssh kube@$h $mvCmd 25 | done 26 | 27 | ## 2. 安装 kubelet kubeadm kubectl 28 | installCmd="sudo yum install -y kubelet kubeadm kubectl && sudo systemctl enable kubelet" 29 | for h in $vhost 30 | do 31 | echo "Install kubelet kubeadm kubectl for : $h" 32 | ssh kube@$h $installCmd 33 | done 34 | 35 | ## 3. 启动 m01 的 kubelet 36 | sudo systemctl start kubelet 37 | -------------------------------------------------------------------------------- /tools/v1.13/kubeadm-config.m01.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: kubeadm.k8s.io/v1beta1 2 | kind: InitConfiguration 3 | localAPIEndpoint: 4 | advertiseAddress: 192.168.33.10 5 | bindPort: 6443 6 | --- 7 | apiVersion: kubeadm.k8s.io/v1beta1 8 | kind: ClusterConfiguration 9 | kubernetesVersion: v1.13.1 10 | 11 | # 指定阿里云镜像仓库 12 | imageRepository: registry.aliyuncs.com/google_containers 13 | 14 | # apiServerCertSANs 填所有的 masterip、lbip、其它可能需要通过它访问 apiserver 的地址、域名或主机名等, 15 | # 如阿里fip,证书中会允许这些ip 16 | # 这里填一个自定义的域名 17 | apiServer: 18 | certSANs: 19 | - "api.k8s.hiko.im" 20 | controlPlaneEndpoint: "api.k8s.hiko.im:6443" 21 | 22 | ## Etcd 配置 23 | etcd: 24 | local: 25 | extraArgs: 26 | listen-client-urls: "https://127.0.0.1:2379,https://192.168.33.10:2379" 27 | advertise-client-urls: "https://192.168.33.10:2379" 28 | listen-peer-urls: "https://192.168.33.10:2380" 29 | initial-advertise-peer-urls: "https://192.168.33.10:2380" 30 | initial-cluster: "m01=https://192.168.33.10:2380" 31 | initial-cluster-state: new 32 | serverCertSANs: 33 | - m01 34 | - 192.168.33.10 35 | peerCertSANs: 36 | - m01 37 | - 192.168.33.10 38 | networking: 39 | podSubnet: "10.244.0.0/16" 40 | 41 | -------------------------------------------------------------------------------- /tools/v1.13/init.sys.config.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | vhost="m01 m02 m03 n01 n02" 4 | 5 | 6 | # 新建 iptable 配置修改文件 7 | cat <<EOF > net.iptables.k8s.conf 8 | net.bridge.bridge-nf-call-ip6tables = 1 9 | net.bridge.bridge-nf-call-iptables = 1 10 | EOF 11 | 12 | for h in $vhost 13 | do 14 | 15 | echo "--> $h" 16 | 17 | # 1. 关闭 swap 分区 18 | # kubelet 不关闭,kubelet 无法启动 19 | # 也可以通过将参数 --fail-swap-on 设置为 false 来忽略 swap on 20 | ssh kube@$h "sudo swapoff -a" 21 | echo "sudo swapoff -a -- ok" 22 | 23 | # 防止开机自动挂载 swap 分区,注释掉配置 24 | ssh kube@$h "sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab" 25 | echo "Comment swap config file modified -- ok" 26 | 27 | 28 | # 2. 关闭 SELinux 29 | # 否则后续 k8s 挂载目录时可能报错:Permission Denied 30 | ssh kube@$h "sudo setenforce 0" 31 | echo "sudo setenforce 0 -- ok" 32 | 33 | # 防止开机启动开启,修改 SELINUX 配置 34 | ssh kube@$h "sudo sed -i s'/SELINUX=enforcing/SELINUX=disabled'/g /etc/selinux/config" 35 | echo "Disabled selinux -- ok" 36 | 37 | # 3. 
配置 iptables 38 | scp net.iptables.k8s.conf kube@$h:~ 39 | ssh kube@$h "sudo mv net.iptables.k8s.conf /etc/sysctl.d/ && sudo sysctl --system" 40 | 41 | 42 | # 安装 wget 43 | ssh kube@$h "sudo yum install -y wget" 44 | 45 | done 46 | -------------------------------------------------------------------------------- /tools/v1.13/kubeadm-config.m02.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: kubeadm.k8s.io/v1beta1 2 | kind: InitConfiguration 3 | localAPIEndpoint: 4 | advertiseAddress: 192.168.33.11 5 | bindPort: 6443 6 | --- 7 | apiVersion: kubeadm.k8s.io/v1beta1 8 | kind: ClusterConfiguration 9 | kubernetesVersion: v1.13.1 10 | 11 | # 指定阿里云镜像仓库 12 | imageRepository: registry.aliyuncs.com/google_containers 13 | 14 | # apiServerCertSANs 填所有的 masterip、lbip、其它可能需要通过它访问 apiserver 的地址、域名或主机名等, 15 | # 如阿里fip,证书中会允许这些ip 16 | # 这里填一个自定义的域名 17 | apiServer: 18 | certSANs: 19 | - "api.k8s.hiko.im" 20 | controlPlaneEndpoint: "api.k8s.hiko.im:6443" 21 | 22 | ## Etcd 配置 23 | etcd: 24 | local: 25 | extraArgs: 26 | listen-client-urls: "https://127.0.0.1:2379,https://192.168.33.11:2379" 27 | advertise-client-urls: "https://192.168.33.11:2379" 28 | listen-peer-urls: "https://192.168.33.11:2380" 29 | initial-advertise-peer-urls: "https://192.168.33.11:2380" 30 | initial-cluster: "m01=https://192.168.33.10:2380,m02=https://192.168.33.11:2380" 31 | initial-cluster-state: existing 32 | serverCertSANs: 33 | - m02 34 | - 192.168.33.11 35 | peerCertSANs: 36 | - m02 37 | - 192.168.33.11 38 | networking: 39 | podSubnet: "10.244.0.0/16" 40 | 41 | -------------------------------------------------------------------------------- /tools/v1.13/metrics-server/metrics-server-deployment.yaml: -------------------------------------------------------------------------------- 1 | --- 2 | apiVersion: v1 3 | kind: ServiceAccount 4 | metadata: 5 | name: metrics-server 6 | namespace: kube-system 7 | --- 8 | # 1. 修改 apiVersion 9 | apiVersion: apps/v1 10 | kind: Deployment 11 | metadata: 12 | name: metrics-server 13 | namespace: kube-system 14 | labels: 15 | k8s-app: metrics-server 16 | spec: 17 | selector: 18 | matchLabels: 19 | k8s-app: metrics-server 20 | template: 21 | metadata: 22 | name: metrics-server 23 | labels: 24 | k8s-app: metrics-server 25 | spec: 26 | serviceAccountName: metrics-server 27 | volumes: 28 | # mount in tmp so we can safely use from-scratch images and/or read-only containers 29 | - name: tmp-dir 30 | emptyDir: {} 31 | containers: 32 | - name: metrics-server 33 | # 2. 修改使用的镜像 34 | image: cloudnil/metrics-server-amd64:v0.3.1 35 | # 3. 
指定启动参数,指定使用 InternalIP 进行获取各节点的监控信息 36 | command: 37 | - /metrics-server 38 | - --kubelet-insecure-tls 39 | - --kubelet-preferred-address-types=InternalIP 40 | 41 | imagePullPolicy: Always 42 | volumeMounts: 43 | - name: tmp-dir 44 | mountPath: /tmp 45 | -------------------------------------------------------------------------------- /tools/v1.13/kubeadm-config.m03.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: kubeadm.k8s.io/v1beta1 2 | kind: InitConfiguration 3 | localAPIEndpoint: 4 | advertiseAddress: 192.168.33.12 5 | bindPort: 6443 6 | --- 7 | apiVersion: kubeadm.k8s.io/v1beta1 8 | kind: ClusterConfiguration 9 | kubernetesVersion: v1.13.1 10 | 11 | # 指定阿里云镜像仓库 12 | imageRepository: registry.aliyuncs.com/google_containers 13 | 14 | # apiServerCertSANs 填所有的 masterip、lbip、其它可能需要通过它访问 apiserver 的地址、域名或主机名等, 15 | # 如阿里fip,证书中会允许这些ip 16 | # 这里填一个自定义的域名 17 | apiServer: 18 | certSANs: 19 | - "api.k8s.hiko.im" 20 | controlPlaneEndpoint: "api.k8s.hiko.im:6443" 21 | 22 | ## Etcd 配置 23 | etcd: 24 | local: 25 | extraArgs: 26 | listen-client-urls: "https://127.0.0.1:2379,https://192.168.33.12:2379" 27 | advertise-client-urls: "https://192.168.33.12:2379" 28 | listen-peer-urls: "https://192.168.33.12:2380" 29 | initial-advertise-peer-urls: "https://192.168.33.12:2380" 30 | initial-cluster: "m01=https://192.168.33.10:2380,m02=https://192.168.33.11:2380,m03=https://192.168.33.12:2380" 31 | initial-cluster-state: existing 32 | serverCertSANs: 33 | - m03 34 | - 192.168.33.12 35 | peerCertSANs: 36 | - m03 37 | - 192.168.33.12 38 | networking: 39 | podSubnet: "10.244.0.0/16" 40 | 41 | -------------------------------------------------------------------------------- /tools/v1.13/kubeadm.other.master.init.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | # m01 的 IP 4 | masterIp=192.168.33.10 5 | 6 | vhost=(m02 m03) 7 | vhostIP=(192.168.33.11 192.168.33.12) 8 | 9 | ## 遍历其他 master 主机名和对应 IP 10 | ## 执行启动 kubelet、将 etcd 加入集群、启动kube-apiserver、kube-controller-manager、kube-scheduler 11 | for i in `seq 0 $((${#vhost[*]}-1))` 12 | do 13 | 14 | h=${vhost[${i}]} 15 | ip=${vhostIP[${i}]} 16 | 17 | 18 | # 1. 启动 kubelet 19 | ssh kube@$h "sudo kubeadm init phase certs all --config kubeadm-config.${h}.yaml" 20 | ssh kube@$h "sudo kubeadm init phase etcd local --config kubeadm-config.${h}.yaml" 21 | ssh kube@$h "sudo kubeadm init phase kubeconfig kubelet --config kubeadm-config.${h}.yaml" 22 | ssh kube@$h "sudo kubeadm init phase kubelet-start --config kubeadm-config.${h}.yaml" 23 | 24 | # 2. 将该节点的 etcd 加入集群 25 | ssh kube@$h "kubectl exec -n kube-system etcd-m01 -- etcdctl --ca-file /etc/kubernetes/pki/etcd/ca.crt --cert-file /etc/kubernetes/pki/etcd/peer.crt --key-file /etc/kubernetes/pki/etcd/peer.key --endpoints=https://${masterIp}:2379 member add $h https://${ip}:2380" 26 | 27 | # 3. 启动其他 kube-apiserver、kube-controller-manager、kube-scheduler 28 | ssh kube@$h "sudo kubeadm init phase kubeconfig all --config kubeadm-config.${h}.yaml" 29 | ssh kube@$h "sudo kubeadm init phase control-plane all --config kubeadm-config.${h}.yaml" 30 | 31 | # 4. 
将该节点标记为 master 节点 32 | ssh kube@$h "sudo kubeadm init phase mark-control-plane --config kubeadm-config.${h}.yaml" 33 | 34 | done 35 | 36 | -------------------------------------------------------------------------------- /applications/Wayne/v1.3.1/wayne/configmap.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: v1 2 | kind: ConfigMap 3 | metadata: 4 | labels: 5 | app: infra-wayne 6 | name: infra-wayne 7 | namespace: default 8 | data: 9 | app.conf: |- 10 | appname = wayne 11 | httpport = 8080 12 | runmode = prod 13 | autorender = false 14 | copyrequestbody = true 15 | EnableDocs = true 16 | EnableAdmin = true 17 | StaticDir = public:static 18 | # Custom config 19 | ShowSql = false 20 | ## if enable username and password login 21 | EnableDBLogin = true 22 | # token, generate jwt token 23 | RsaPrivateKey = "./apikey/rsa-private.pem" 24 | RsaPublicKey = "./apikey/rsa-public.pem" 25 | # token end time. second 26 | TokenLifeTime=86400 27 | 28 | # kubernetes labels config 29 | AppLabelKey= wayne-app 30 | NamespaceLabelKey = wayne-ns 31 | PodAnnotationControllerKindLabelKey = wayne.cloud/controller-kind 32 | 33 | # database configuration: 34 | ## mysql 35 | DBName = "wayne" 36 | DBTns = "tcp(mysql-wayne:3306)" 37 | DBUser = "root" 38 | DBPasswd = "root" 39 | DBLoc = "Asia%2FShanghai" 40 | DBConnTTL = 30 41 | 42 | # web shell auth 43 | appKey = "860af247a91a19b2368d6425797921c6" 44 | 45 | # Set demo namespace and group id 46 | DemoGroupId = "1" 47 | DemoNamespaceId = "1" 48 | 49 | # Sentry 50 | LogLevel = "4" 51 | SentryEnable = false 52 | SentryDSN = "" 53 | SentryLogLevel = "4" 54 | 55 | # Robin 56 | EnableRobin = false 57 | 58 | # api-keys 59 | EnableApiKeys = true 60 | 61 | # Bus 62 | BusEnable = true 63 | BusRabbitMQURL = "amqp://guest:guest@rabbitmq-wayne:5672" 64 | 65 | # Webhook 66 | EnableWebhook = true 67 | WebhookClientTimeout = 10 68 | WebhookClientWindowSize = 16 69 | -------------------------------------------------------------------------------- /applications/Weave Scope/README.md: -------------------------------------------------------------------------------- 1 | Weave Scope:实时监控 kubernetes 工具 2 | --- 3 | 4 | 5 | 6 | Weave Scope 自动实时监控进程、容器、主机节点等,并提供 Web 终端在线和 Pod 、主机交互。 7 | 8 | 9 | - [Installing Weave Scope](https://www.weave.works/docs/scope/latest/installing/#k8s) 10 | 11 | ![weave-pod](./screenshots/weave-pod.png) 12 | ![weave-term](./screenshots/weave-term.png) 13 | 14 | 15 | ## 1. 安装 Weave Scope 16 | 17 | 安装方式很简单,只需要下载对应 yaml 文件,通过 `kubectl apply -f ***.yaml` 进行安装即可。 18 | 19 | 官方介绍的安装方式: 20 | 21 | ``` 22 | kubectl apply -f "https://cloud.weave.works/k8s/scope.yaml?k8s-version=$(kubectl version | base64 | tr -d '\n')" 23 | ``` 24 | 25 | 执行完成后,查看 Weave Scope 的 Pod 和 svc,如下: 26 | 27 | ``` 28 | [kube@m01 ~]$ kubectl get pods -n weave 29 | NAME READY STATUS RESTARTS AGE 30 | weave-scope-agent-2qmcx 1/1 Running 0 38m 31 | weave-scope-agent-5knlz 1/1 Running 0 38m 32 | weave-scope-agent-6gbsw 1/1 Running 0 38m 33 | weave-scope-agent-89fhv 1/1 Running 0 38m 34 | weave-scope-agent-dnjwz 1/1 Running 0 38m 35 | weave-scope-agent-szw7g 1/1 Running 0 38m 36 | weave-scope-app-6979884cc6-t82td 1/1 Running 0 38m 37 | [kube@m01 ~]$ kubectl get svc -n weave 38 | NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE 39 | weave-scope-app ClusterIP 10.102.52.0 80/TCP 38m 40 | 41 | ``` 42 | 43 | ## 2. 
配置 ingress 暴露服务 44 | 45 | 添加 ingress 暴露服务,详细配置文件 `ingress.yaml` 如下: 46 | 47 | ``` 48 | apiVersion: extensions/v1beta1 49 | kind: Ingress 50 | metadata: 51 | name: weave-scope-ingress 52 | namespace: weave 53 | spec: 54 | rules: 55 | - host: weave.k8s.hiko.im 56 | http: 57 | paths: 58 | - path: / 59 | backend: 60 | serviceName: weave-scope-app 61 | servicePort: 80 62 | 63 | ``` 64 | 65 | 通过 `kubectl apply -f ingress.yaml` 暴露服务。 66 | 67 | ## 3. 访问 68 | 69 | 将 weave.k8s.hiko.im 解析到 k8s 集群的 ingress 机器,再通过浏览器访问:http://weave.k8s.hiko.im , 将看到: 70 | 71 | ![Home](./screenshots/weave-home.png) 72 | -------------------------------------------------------------------------------- /tools/v1.13/init.kubeadm.config.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | ## 1. 配置参数 4 | ## vhost 主机名和 vhostIP IP 一一对应 5 | vhost=(m01 m02 m03) 6 | vhostIP=(192.168.33.10 192.168.33.11 192.168.33.12) 7 | 8 | domain=api.k8s.hiko.im 9 | 10 | ## etcd 初始化 m01 m02 m03 集群配置 11 | etcdInitCluster=( 12 | m01=https://192.168.33.10:2380 13 | m01=https://192.168.33.10:2380,m02=https://192.168.33.11:2380 14 | m01=https://192.168.33.10:2380,m02=https://192.168.33.11:2380,m03=https://192.168.33.12:2380 15 | ) 16 | 17 | ## etcd 初始化时,m01 m02 m03 分别的初始化集群状态 18 | initClusterStatus=( 19 | new 20 | existing 21 | existing 22 | ) 23 | 24 | 25 | ## 2. 遍历 master 主机名和对应 IP 26 | ## 生成对应的 kubeadm 配置文件 27 | for i in `seq 0 $((${#vhost[*]}-1))` 28 | do 29 | 30 | h=${vhost[${i}]} 31 | ip=${vhostIP[${i}]} 32 | 33 | echo "--> $h - $ip" 34 | 35 | ## 生成 kubeadm 配置模板 36 | cat <<EOF > kubeadm-config.$h.yaml 37 | apiVersion: kubeadm.k8s.io/v1beta1 38 | kind: InitConfiguration 39 | localAPIEndpoint: 40 | advertiseAddress: $ip 41 | bindPort: 6443 42 | --- 43 | apiVersion: kubeadm.k8s.io/v1beta1 44 | kind: ClusterConfiguration 45 | kubernetesVersion: v1.13.1 46 | 47 | # 指定阿里云镜像仓库 48 | imageRepository: registry.aliyuncs.com/google_containers 49 | 50 | # apiServerCertSANs 填所有的 masterip、lbip、其它可能需要通过它访问 apiserver 的地址、域名或主机名等, 51 | # 如阿里fip,证书中会允许这些ip 52 | # 这里填一个自定义的域名 53 | apiServer: 54 | certSANs: 55 | - "$domain" 56 | controlPlaneEndpoint: "$domain:6443" 57 | 58 | ## Etcd 配置 59 | etcd: 60 | local: 61 | extraArgs: 62 | listen-client-urls: "https://127.0.0.1:2379,https://$ip:2379" 63 | advertise-client-urls: "https://$ip:2379" 64 | listen-peer-urls: "https://$ip:2380" 65 | initial-advertise-peer-urls: "https://$ip:2380" 66 | initial-cluster: "${etcdInitCluster[${i}]}" 67 | initial-cluster-state: ${initClusterStatus[${i}]} 68 | serverCertSANs: 69 | - $h 70 | - $ip 71 | peerCertSANs: 72 | - $h 73 | - $ip 74 | networking: 75 | podSubnet: "10.244.0.0/16" 76 | 77 | EOF 78 | 79 | echo "kubeadm-config.$h.yaml created ... ok" 80 | 81 | ## 3. 分发到其他 master 机器 82 | scp kubeadm-config.$h.yaml kube@$h:~ 83 | echo "scp kubeadm-config.$h.yaml ... 
ok" 84 | 85 | done 86 | -------------------------------------------------------------------------------- /tools/v1.13/coredns-ha.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: apps/v1 2 | kind: Deployment 3 | metadata: 4 | labels: 5 | k8s-app: kube-dns 6 | name: coredns 7 | namespace: kube-system 8 | spec: 9 | #集群规模可自行配置 10 | replicas: 2 11 | selector: 12 | matchLabels: 13 | k8s-app: kube-dns 14 | strategy: 15 | rollingUpdate: 16 | maxSurge: 25% 17 | maxUnavailable: 1 18 | type: RollingUpdate 19 | template: 20 | metadata: 21 | labels: 22 | k8s-app: kube-dns 23 | spec: 24 | affinity: 25 | podAntiAffinity: 26 | preferredDuringSchedulingIgnoredDuringExecution: 27 | - weight: 100 28 | podAffinityTerm: 29 | labelSelector: 30 | matchExpressions: 31 | - key: k8s-app 32 | operator: In 33 | values: 34 | - kube-dns 35 | topologyKey: kubernetes.io/hostname 36 | containers: 37 | - args: 38 | - -conf 39 | - /etc/coredns/Corefile 40 | image: registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.2.6 41 | imagePullPolicy: IfNotPresent 42 | livenessProbe: 43 | failureThreshold: 5 44 | httpGet: 45 | path: /health 46 | port: 8080 47 | scheme: HTTP 48 | initialDelaySeconds: 60 49 | periodSeconds: 10 50 | successThreshold: 1 51 | timeoutSeconds: 5 52 | name: coredns 53 | ports: 54 | - containerPort: 53 55 | name: dns 56 | protocol: UDP 57 | - containerPort: 53 58 | name: dns-tcp 59 | protocol: TCP 60 | - containerPort: 9153 61 | name: metrics 62 | protocol: TCP 63 | resources: 64 | limits: 65 | memory: 170Mi 66 | requests: 67 | cpu: 100m 68 | memory: 70Mi 69 | securityContext: 70 | allowPrivilegeEscalation: false 71 | capabilities: 72 | add: 73 | - NET_BIND_SERVICE 74 | drop: 75 | - all 76 | readOnlyRootFilesystem: true 77 | terminationMessagePath: /dev/termination-log 78 | terminationMessagePolicy: File 79 | volumeMounts: 80 | - mountPath: /etc/coredns 81 | name: config-volume 82 | readOnly: true 83 | dnsPolicy: Default 84 | restartPolicy: Always 85 | schedulerName: default-scheduler 86 | securityContext: {} 87 | serviceAccount: coredns 88 | serviceAccountName: coredns 89 | terminationGracePeriodSeconds: 30 90 | tolerations: 91 | - key: CriticalAddonsOnly 92 | operator: Exists 93 | - effect: NoSchedule 94 | key: node-role.kubernetes.io/master 95 | volumes: 96 | - configMap: 97 | defaultMode: 420 98 | items: 99 | - key: Corefile 100 | path: Corefile 101 | name: coredns 102 | name: config-volume 103 | -------------------------------------------------------------------------------- /applications/Wayne/v1.3.1/wayne/deployment.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: extensions/v1beta1 2 | kind: Deployment 3 | metadata: 4 | labels: 5 | app: infra-wayne 6 | name: infra-wayne 7 | namespace: default 8 | spec: 9 | replicas: 1 10 | selector: 11 | matchLabels: 12 | app: infra-wayne 13 | template: 14 | metadata: 15 | labels: 16 | app: infra-wayne 17 | spec: 18 | volumes: 19 | - name: config 20 | configMap: 21 | name: infra-wayne 22 | containers: 23 | - name: wayne 24 | image: '360cloud/wayne:latest' 25 | command: 26 | - /opt/wayne/backend 27 | - apiserver 28 | env: 29 | - name: GOPATH # app.conf runmode = dev must set GOPATH 30 | value: /go 31 | resources: 32 | limits: 33 | cpu: '0.5' 34 | memory: 1Gi 35 | requests: 36 | cpu: '0.5' 37 | memory: 1Gi 38 | volumeMounts: 39 | - name: config 40 | mountPath: /opt/wayne/conf/ 41 | readinessProbe: 42 | httpGet: 43 | path: healthz 44 | port: 8080 45 | 
timeoutSeconds: 1 46 | periodSeconds: 10 47 | failureThreshold: 3 48 | imagePullPolicy: Always 49 | --- 50 | kind: Deployment 51 | apiVersion: extensions/v1beta1 52 | metadata: 53 | name: infra-wayne-woker 54 | namespace: default 55 | labels: 56 | app: infra-wayne-woker 57 | spec: 58 | replicas: 1 59 | selector: 60 | matchLabels: 61 | app: infra-wayne-woker 62 | template: 63 | metadata: 64 | labels: 65 | app: infra-wayne-woker 66 | spec: 67 | volumes: 68 | - name: config 69 | configMap: 70 | name: infra-wayne 71 | containers: 72 | - name: wayne 73 | image: '360cloud/wayne:latest' 74 | command: 75 | - /opt/wayne/backend 76 | args: 77 | - worker 78 | - '-t' 79 | - AuditWorker 80 | - '-c' 81 | - '1' 82 | env: 83 | - name: GOPATH 84 | value: /go 85 | resources: 86 | limits: 87 | cpu: '0.5' 88 | memory: 0.5Gi 89 | requests: 90 | cpu: '0.5' 91 | memory: 0.5Gi 92 | volumeMounts: 93 | - name: config 94 | mountPath: /opt/wayne/conf/ 95 | imagePullPolicy: Always 96 | --- 97 | kind: Deployment 98 | apiVersion: extensions/v1beta1 99 | metadata: 100 | name: infra-wayne-webhook 101 | namespace: default 102 | labels: 103 | app: infra-wayne-webhook 104 | spec: 105 | replicas: 1 106 | selector: 107 | matchLabels: 108 | app: infra-wayne-webhook 109 | template: 110 | metadata: 111 | labels: 112 | app: infra-wayne-webhook 113 | spec: 114 | volumes: 115 | - name: config 116 | configMap: 117 | name: infra-wayne 118 | containers: 119 | - name: wayne 120 | image: '360cloud/wayne:latest' 121 | command: 122 | - /opt/wayne/backend 123 | args: 124 | - worker 125 | - '-t' 126 | - WebhookWorker 127 | - '-c' 128 | - '1' 129 | env: 130 | - value: /go 131 | name: GOPATH 132 | resources: 133 | limits: 134 | cpu: '0.5' 135 | memory: 0.5Gi 136 | requests: 137 | cpu: '0.5' 138 | memory: 128M 139 | volumeMounts: 140 | - name: config 141 | mountPath: /opt/wayne/conf/ 142 | imagePullPolicy: Always 143 | 144 | -------------------------------------------------------------------------------- /tools/v1.13/kubernetes-dashboard.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: v1 2 | kind: Secret 3 | metadata: 4 | labels: 5 | k8s-app: kubernetes-dashboard 6 | name: kubernetes-dashboard-certs 7 | namespace: kube-system 8 | type: Opaque 9 | --- 10 | apiVersion: v1 11 | kind: ServiceAccount 12 | metadata: 13 | labels: 14 | k8s-app: kubernetes-dashboard 15 | name: kubernetes-dashboard 16 | namespace: kube-system 17 | --- 18 | kind: Role 19 | apiVersion: rbac.authorization.k8s.io/v1 20 | metadata: 21 | name: kubernetes-dashboard-minimal 22 | namespace: kube-system 23 | rules: 24 | # Allow Dashboard to create 'kubernetes-dashboard-key-holder' secret. 25 | - apiGroups: [""] 26 | resources: ["secrets"] 27 | verbs: ["create"] 28 | # Allow Dashboard to create 'kubernetes-dashboard-settings' config map. 29 | - apiGroups: [""] 30 | resources: ["configmaps"] 31 | verbs: ["create"] 32 | # Allow Dashboard to get, update and delete Dashboard exclusive secrets. 33 | - apiGroups: [""] 34 | resources: ["secrets"] 35 | resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs"] 36 | verbs: ["get", "update", "delete"] 37 | # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map. 38 | - apiGroups: [""] 39 | resources: ["configmaps"] 40 | resourceNames: ["kubernetes-dashboard-settings"] 41 | verbs: ["get", "update"] 42 | # Allow Dashboard to get metrics from heapster. 
43 | - apiGroups: [""] 44 | resources: ["services"] 45 | resourceNames: ["heapster"] 46 | verbs: ["proxy"] 47 | - apiGroups: [""] 48 | resources: ["services/proxy"] 49 | resourceNames: ["heapster", "http:heapster:", "https:heapster:"] 50 | verbs: ["get"] 51 | --- 52 | apiVersion: rbac.authorization.k8s.io/v1 53 | kind: RoleBinding 54 | metadata: 55 | name: kubernetes-dashboard-minimal 56 | namespace: kube-system 57 | roleRef: 58 | apiGroup: rbac.authorization.k8s.io 59 | kind: Role 60 | name: kubernetes-dashboard-minimal 61 | subjects: 62 | - kind: ServiceAccount 63 | name: kubernetes-dashboard 64 | namespace: kube-system 65 | --- 66 | kind: Deployment 67 | apiVersion: apps/v1 68 | metadata: 69 | labels: 70 | k8s-app: kubernetes-dashboard 71 | name: kubernetes-dashboard 72 | namespace: kube-system 73 | spec: 74 | replicas: 1 75 | revisionHistoryLimit: 10 76 | selector: 77 | matchLabels: 78 | k8s-app: kubernetes-dashboard 79 | template: 80 | metadata: 81 | labels: 82 | k8s-app: kubernetes-dashboard 83 | spec: 84 | containers: 85 | - name: kubernetes-dashboard 86 | # 使用阿里云的镜像 87 | image: registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.0 88 | ports: 89 | - containerPort: 8443 90 | protocol: TCP 91 | args: 92 | - --auto-generate-certificates 93 | volumeMounts: 94 | - name: kubernetes-dashboard-certs 95 | mountPath: /certs 96 | # Create on-disk volume to store exec logs 97 | - mountPath: /tmp 98 | name: tmp-volume 99 | livenessProbe: 100 | httpGet: 101 | scheme: HTTPS 102 | path: / 103 | port: 8443 104 | initialDelaySeconds: 30 105 | timeoutSeconds: 30 106 | volumes: 107 | - name: kubernetes-dashboard-certs 108 | secret: 109 | secretName: kubernetes-dashboard-certs 110 | - name: tmp-volume 111 | emptyDir: {} 112 | serviceAccountName: kubernetes-dashboard 113 | tolerations: 114 | - key: node-role.kubernetes.io/master 115 | effect: NoSchedule 116 | --- 117 | kind: Service 118 | apiVersion: v1 119 | metadata: 120 | labels: 121 | k8s-app: kubernetes-dashboard 122 | name: kubernetes-dashboard 123 | namespace: kube-system 124 | spec: 125 | ports: 126 | - port: 443 127 | targetPort: 8443 128 | selector: 129 | k8s-app: kubernetes-dashboard 130 | --- 131 | # 配置 ingress 配置,待会部署完 ingress 之后,就可以通过以下配置的域名访问 132 | apiVersion: extensions/v1beta1 133 | kind: Ingress 134 | metadata: 135 | name: dashboard-ingress 136 | namespace: kube-system 137 | annotations: 138 | # 指定转发协议为 HTTPS,因为 ingress 默认转发协议是 HTTP,而 kubernetes-dashboard 默认是 HTTPS 139 | nginx.ingress.kubernetes.io/backend-protocol: "HTTPS" 140 | spec: 141 | rules: 142 | # 指定访问 dashboard 的域名 143 | - host: dashboard.k8s.hiko.im 144 | http: 145 | paths: 146 | - path: / 147 | backend: 148 | serviceName: kubernetes-dashboard 149 | servicePort: 443 150 | -------------------------------------------------------------------------------- /tools/v1.13/kubernetes-dashboard-https.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: v1 2 | kind: Secret 3 | metadata: 4 | labels: 5 | k8s-app: kubernetes-dashboard 6 | name: kubernetes-dashboard-certs 7 | namespace: kube-system 8 | type: Opaque 9 | --- 10 | apiVersion: v1 11 | kind: ServiceAccount 12 | metadata: 13 | labels: 14 | k8s-app: kubernetes-dashboard 15 | name: kubernetes-dashboard 16 | namespace: kube-system 17 | --- 18 | kind: Role 19 | apiVersion: rbac.authorization.k8s.io/v1 20 | metadata: 21 | name: kubernetes-dashboard-minimal 22 | namespace: kube-system 23 | rules: 24 | # Allow Dashboard to create 
'kubernetes-dashboard-key-holder' secret. 25 | - apiGroups: [""] 26 | resources: ["secrets"] 27 | verbs: ["create"] 28 | # Allow Dashboard to create 'kubernetes-dashboard-settings' config map. 29 | - apiGroups: [""] 30 | resources: ["configmaps"] 31 | verbs: ["create"] 32 | # Allow Dashboard to get, update and delete Dashboard exclusive secrets. 33 | - apiGroups: [""] 34 | resources: ["secrets"] 35 | resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs"] 36 | verbs: ["get", "update", "delete"] 37 | # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map. 38 | - apiGroups: [""] 39 | resources: ["configmaps"] 40 | resourceNames: ["kubernetes-dashboard-settings"] 41 | verbs: ["get", "update"] 42 | # Allow Dashboard to get metrics from heapster. 43 | - apiGroups: [""] 44 | resources: ["services"] 45 | resourceNames: ["heapster"] 46 | verbs: ["proxy"] 47 | - apiGroups: [""] 48 | resources: ["services/proxy"] 49 | resourceNames: ["heapster", "http:heapster:", "https:heapster:"] 50 | verbs: ["get"] 51 | --- 52 | apiVersion: rbac.authorization.k8s.io/v1 53 | kind: RoleBinding 54 | metadata: 55 | name: kubernetes-dashboard-minimal 56 | namespace: kube-system 57 | roleRef: 58 | apiGroup: rbac.authorization.k8s.io 59 | kind: Role 60 | name: kubernetes-dashboard-minimal 61 | subjects: 62 | - kind: ServiceAccount 63 | name: kubernetes-dashboard 64 | namespace: kube-system 65 | --- 66 | kind: Deployment 67 | apiVersion: apps/v1 68 | metadata: 69 | labels: 70 | k8s-app: kubernetes-dashboard 71 | name: kubernetes-dashboard 72 | namespace: kube-system 73 | spec: 74 | replicas: 1 75 | revisionHistoryLimit: 10 76 | selector: 77 | matchLabels: 78 | k8s-app: kubernetes-dashboard 79 | template: 80 | metadata: 81 | labels: 82 | k8s-app: kubernetes-dashboard 83 | spec: 84 | containers: 85 | - name: kubernetes-dashboard 86 | # 使用阿里云的镜像 87 | image: registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.0 88 | ports: 89 | - containerPort: 8443 90 | protocol: TCP 91 | args: 92 | - --auto-generate-certificates 93 | volumeMounts: 94 | - name: kubernetes-dashboard-certs 95 | mountPath: /certs 96 | # Create on-disk volume to store exec logs 97 | - mountPath: /tmp 98 | name: tmp-volume 99 | livenessProbe: 100 | httpGet: 101 | scheme: HTTPS 102 | path: / 103 | port: 8443 104 | initialDelaySeconds: 30 105 | timeoutSeconds: 30 106 | volumes: 107 | - name: kubernetes-dashboard-certs 108 | secret: 109 | secretName: kubernetes-dashboard-certs 110 | - name: tmp-volume 111 | emptyDir: {} 112 | serviceAccountName: kubernetes-dashboard 113 | tolerations: 114 | - key: node-role.kubernetes.io/master 115 | effect: NoSchedule 116 | --- 117 | kind: Service 118 | apiVersion: v1 119 | metadata: 120 | labels: 121 | k8s-app: kubernetes-dashboard 122 | name: kubernetes-dashboard 123 | namespace: kube-system 124 | spec: 125 | ports: 126 | - port: 443 127 | targetPort: 8443 128 | selector: 129 | k8s-app: kubernetes-dashboard 130 | --- 131 | # 配置 ingress 配置,待会部署完 ingress 之后,就可以通过以下配置的域名访问 132 | apiVersion: extensions/v1beta1 133 | kind: Ingress 134 | metadata: 135 | name: dashboard-ingress 136 | namespace: kube-system 137 | annotations: 138 | nginx.ingress.kubernetes.io/ssl-redirect: "true" 139 | nginx.ingress.kubernetes.io/rewrite-target: / 140 | # 指定转发协议为 HTTPS,因为 ingress 默认转发协议是 HTTP,而 kubernetes-dashboard 默认是 HTTPS 141 | nginx.ingress.kubernetes.io/backend-protocol: "HTTPS" 142 | spec: 143 | # 指定使用的 secret 144 | tls: 145 | - secretName: secret-ca-k8s-hiko-im 146 
| rules: 147 | # 指定访问 dashboard 的域名 148 | - host: dashboard.k8s.hiko.im 149 | http: 150 | paths: 151 | - path: / 152 | backend: 153 | serviceName: kubernetes-dashboard 154 | servicePort: 443 155 | -------------------------------------------------------------------------------- /applications/NFS/README.md: -------------------------------------------------------------------------------- 1 | NFS: Network File System 2 | --- 3 | 4 | NFS(Network File System)即网络文件系统,是FreeBSD支持的文件系统中的一种,它允许网络中的计算机之间通过TCP/IP网络共享资源。 5 | 6 | 为了后面操作 k8s 配置 PV 和 PVC,这里搭建一个供测试的 NFS 服务器。 7 | 8 | ## 1. 环境 9 | 10 | 操作系统:Centos 7 11 | 安装: nfs-utils 和 rpcbind 12 | 服务器:nfs01(NFS 服务器)和 ing01(挂载 NFS 目录的客户端所在服务器) 13 | 14 | 15 | 查看本地是否已安装 16 | 17 | ``` 18 | [kube@nfs01 ~]$ rpm -qa nfs-utils 19 | nfs-utils-1.3.0-0.61.el7.x86_64 20 | 21 | [kube@nfs01 ~]$ rpm -qa rpcbind 22 | rpcbind-0.2.0-47.el7.x86_64 23 | 24 | ``` 25 | 26 | 如果未安装,执行以下命令安装: 27 | 28 | ``` 29 | sudo yum install -y nfs-utils 30 | 31 | sudo yum install -y rpcbind 32 | ``` 33 | 34 | ## 2. 启动服务 35 | 36 | ### 2.1 rpcbind 37 | 38 | 默认情况 rpcbind 服务是已经启动的(端口:111),如下: 39 | 40 | ``` 41 | [kube@nfs01 ~]$ ss -an| grep 111 42 | udp UNCONN 0 0 *:111 *:* 43 | udp UNCONN 0 0 :::111 :::* 44 | tcp LISTEN 0 128 *:111 *:* 45 | tcp LISTEN 0 128 :::111 :::* 46 | ``` 47 | 48 | 添加开机启动 rpcbind:`sudo systemctl enable rpcbind` 49 | 50 | ### 2.2 nfs 服务 51 | 52 | 通过 `sudo systemctl start nfs` 启动 nfs 服务,如下: 53 | ``` 54 | [kube@nfs01 ~]$ sudo systemctl start nfs 55 | [kube@nfs01 ~]$ sudo systemctl status nfs 56 | ● nfs-server.service - NFS server and services 57 | Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; disabled; vendor preset: disabled) 58 | Active: active (exited) since Mon 2019-01-21 03:32:14 UTC; 2min 48s ago 59 | Process: 5318 ExecStartPost=/bin/sh -c if systemctl -q is-active gssproxy; then systemctl restart gssproxy ; fi (code=exited, status=0/SUCCESS) 60 | Process: 5302 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited, status=0/SUCCESS) 61 | Process: 5301 ExecStartPre=/usr/sbin/exportfs -r (code=exited, status=0/SUCCESS) 62 | Main PID: 5302 (code=exited, status=0/SUCCESS) 63 | CGroup: /system.slice/nfs-server.service 64 | 65 | Jan 21 03:32:14 nfs01 systemd[1]: Starting NFS server and services... 66 | Jan 21 03:32:14 nfs01 systemd[1]: Started NFS server and services. 67 | 68 | ``` 69 | 70 | 添加开机启动 nfs:`sudo systemctl enable nfs` 71 | 72 | ### 2.3 验证 73 | 74 | 通过 `rpcinfo -p {IP}` 查看 75 | 76 | ``` 77 | [kube@nfs01 ~]$ rpcinfo -p 192.168.33.50 | grep nfs 78 | 100003 3 tcp 2049 nfs 79 | 100003 4 tcp 2049 nfs 80 | 100227 3 tcp 2049 nfs_acl 81 | 100003 3 udp 2049 nfs 82 | 100003 4 udp 2049 nfs 83 | 100227 3 udp 2049 nfs_acl 84 | ``` 85 | 86 | 服务正常运行。 87 | 88 | 89 | ## 3. 
使用 NFS 挂载 90 | 91 | ### 3.1 服务端添加共享目录 92 | 93 | 编辑 /etc/exports,添加以下配置,如下: 94 | 95 | ``` 96 | [kube@nfs01 ~]$ cat /etc/exports 97 | /data 192.168.33.0/24(rw,async) 98 | ``` 99 | 100 | 备注,/etc/exports 格式,如下: 101 | 102 | ``` 103 | 104 | 格式:[共享的目录] [主机名或IP(参数,参数)] 105 | 106 | 当将同一目录共享给多个客户机,但对每个客户机提供的权限不同时,可以如下配置:  107 | 108 | [共享的目录] [主机名1或IP1(参数1,参数2)] [主机名2或IP2(参数3,参数4)] 109 | 110 | 第一列:共享的目录,也就是想共享到网络中的文件系统; 111 | 112 | 第二列:可访问网络/主机 113 | 114 | 可以是 IP、主机名(域名)、网段、通配符等,如下: 115 | 192.168.152.13 指定 IP 地址的主机  116 | nfsclient.test.com 指定域名的主机  117 | 192.168.1.0/24 指定网段中的所有主机  118 | *.test.com        指定域下的所有主机  119 | *                       所有主机  120 | 121 | 第三列:共享参数 122 | 123 | 下面是一些NFS共享的常用参数:  124 |  ro                只读访问  125 |  rw                读写访问  126 |  sync              所有数据在请求时写入共享  127 |  async             NFS在写入数据前可以相应请求  128 |  secure            NFS通过1024以下的安全TCP/IP端口发送  129 |  insecure          NFS通过1024以上的端口发送  130 |  wdelay            如果多个用户要写入NFS目录,则归组写入(默认)  131 |  no_wdelay      如果多个用户要写入NFS目录,则立即写入,当使用async时,无需此设置。  132 |  Hide              在NFS共享目录中不共享其子目录  133 |  no_hide           共享NFS目录的子目录  134 |  subtree_check   如果共享/usr/bin之类的子目录时,强制NFS检查父目录的权限(默认)  135 |  no_subtree_check  和上面相对,不检查父目录权限  136 |  all_squash        共享文件的UID和GID映射匿名用户anonymous,适合公用目录。  137 |  no_all_squash     保留共享文件的UID和GID(默认)  138 |  root_squash       root 用户的所有请求映射成如 anonymous 用户一样的权限(默认)  139 |  no_root_squas     root 用户具有根目录的完全管理访问权限  140 |  anonuid=xxx       指定NFS服务器/etc/passwd文件中匿名用户的UID  141 | 142 | 例如可以编辑/etc/exports为:  143 | /tmp     *(rw,no_root_squash)  144 | /home/public 192.168.0.*(rw)   *(ro)  145 | /home/test  192.168.0.100(rw)  146 | /home/linux  *.the9.com(rw,all_squash,anonuid=40,anongid=40) 147 | 148 | ``` 149 | 150 | 151 | 创建 /data 目录,如下 152 | 153 | ``` 154 | [kube@nfs01 ~]$ sudo mkdir -p /data 155 | [kube@nfs01 ~]$ sudo chown -R nfsnobody.nfsnobody /data 156 | 157 | ``` 158 | 159 | 重启 nfs 服务,如下: 160 | 161 | ``` 162 | [kube@nfs01 ~]$ sudo systemctl restart nfs 163 | [kube@nfs01 ~]$ showmount -e localhost 164 | Export list for localhost: 165 | /data 192.168.33.0/24 166 | ``` 167 | 168 | 通过 showmount -e localhost 查看生效的配置。 169 | 170 | 171 | ### 3.2 客户端挂载 NFS 目录 172 | 173 | 客户端选择的虚拟机是: ing01 174 | 175 | ``` 176 | [kube@ing01 ~]$ showmount -e 192.168.33.50 177 | Export list for 192.168.33.50: 178 | /data 192.168.33.0/24 179 | 180 | [kube@ing01 nfs01]$ sudo mkdir -p /mnt/nfs01 181 | 182 | [kube@ing01 nfs01]$ sudo mount -t nfs 192.168.33.50:/data /mnt/nfs01 183 | 184 | ``` 185 | 186 | ### 3.3 验证 187 | 188 | 在 nfs 服务器(nfs01)的 /data 目录下 新建 hello.txt 文件,然后到客户端所在服务器(ing01)验证挂载的 NFS 目录是否有 hello.txt 189 | 190 | 191 | nfs 服务器上的 /data 目录: 192 | ``` 193 | [kube@nfs01 data]$ hostname 194 | nfs01 195 | [kube@nfs01 data]$ pwd 196 | /data 197 | [kube@nfs01 data]$ ls 198 | hello.txt 199 | ``` 200 | 201 | 客户端上的 /mnt/nfs-01 目录: 202 | ``` 203 | [kube@ing01 mnt]$ hostname 204 | ing01 205 | [kube@ing01 mnt]$ pwd 206 | /mnt 207 | [kube@ing01 mnt]$ ls 208 | nfs01 209 | ``` 210 | 211 | 到这里,配置已经完成。 -------------------------------------------------------------------------------- /applications/helm/README.md: -------------------------------------------------------------------------------- 1 | Helm 2 | --- 3 | 4 | - [helm 官网](https://helm.sh/) 5 | - [官方 Helm 下载和安装](https://docs.helm.sh/using_helm/#installing-helm) 6 | 7 | 可以从官方文档中查看 helm 安装方式,下面演示具体操作: 8 | 9 | ## 1. 
下载和安装 helm 10 | 11 | 这里使用的是 [helm-v2.12.2-linux-amd64.tar.gz](https://storage.googleapis.com/kubernetes-helm/helm-v2.12.2-linux-amd64.tar.gz) 这个版本的压缩包。 12 | 13 | 将压缩包下载到 m01 机器,解压并将 helm 可执行文件复制到 `/usr/local/bin` 目录,详细操作如下: 14 | 15 | ``` 16 | 17 | # 下载 helm 压缩包 18 | [kube@m01 helm]$ wget https://storage.googleapis.com/kubernetes-helm/helm-v2.12.2-linux-amd64.tar.gz 19 | --2019-01-10 11:06:16-- https://storage.googleapis.com/kubernetes-helm/helm-v2.12.2-linux-amd64.tar.gz 20 | Resolving storage.googleapis.com (storage.googleapis.com)... 172.217.163.240, 2404:6800:4005:803::2010 21 | Connecting to storage.googleapis.com (storage.googleapis.com)|172.217.163.240|:443... connected. 22 | HTTP request sent, awaiting response... 200 OK 23 | Length: 22724805 (22M) [application/x-tar] 24 | Saving to: ‘helm-v2.12.2-linux-amd64.tar.gz’ 25 | 26 | 100%[===============================================================================================================>] 22,724,805 470KB/s in 67s 27 | 28 | 2019-01-10 11:07:28 (332 KB/s) - ‘helm-v2.12.2-linux-amd64.tar.gz’ saved [22724805/22724805] 29 | 30 | 31 | # 解压 helm 压缩包 32 | [kube@m01 helm]$ tar zxvf helm-v2.12.2-linux-amd64.tar.gz 33 | linux-amd64/ 34 | linux-amd64/tiller 35 | linux-amd64/README.md 36 | linux-amd64/helm 37 | linux-amd64/LICENSE 38 | 39 | 40 | # 将 helm 可执行文件复制到 /usr/local/bin 41 | [kube@m01 helm]$ sudo cp linux-amd64/helm /usr/local/bin/ 42 | 43 | 44 | # 验证 45 | [kube@m01 helm]$ helm help 46 | The Kubernetes package manager 47 | 48 | To begin working with Helm, run the 'helm init' command: 49 | 50 | $ helm init 51 | 52 | ... 53 | ... 54 | 55 | ``` 56 | 57 | 到这里 helm 就安装完成,为了让 helm 能正常工作,需要安装 tiller 和初始化。 58 | 59 | ## 2. 安装 tiller 60 | 61 | 62 | 在 kubernetes 集群里安装 tiller 很简单,helm 官方提供 `helm init` 进行 helm 初始化。`helm init` 主要做以下几个事情: 63 | 64 | - i. 验证 helm 的本地环境是否配置正确 65 | - ii. 像 kubectl 连接集群的方式连接到 kubernetes 集群 66 | - iii. 当连接成功,安装 tiller 到 kubernetes 集群的 kube-system 命名空间下。 67 | 68 | 69 | 备注:由于国内无法访问默认的 tiller 镜像,因此这里使用阿里云提供的国内镜像。 70 | 71 | ``` 72 | helm init -i registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.12.2 --stable-repo-url https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts 73 | ``` 74 | 75 | 76 | 使用 `helm init` 进行初始化,如下: 77 | 78 | ``` 79 | [kube@m01 helm]$ helm init -i registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.12.2 --stable-repo-url https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts 80 | Creating /home/kube/.helm 81 | Creating /home/kube/.helm/repository 82 | Creating /home/kube/.helm/repository/cache 83 | Creating /home/kube/.helm/repository/local 84 | Creating /home/kube/.helm/plugins 85 | Creating /home/kube/.helm/starters 86 | Creating /home/kube/.helm/cache/archive 87 | Creating /home/kube/.helm/repository/repositories.yaml 88 | Adding stable repo with URL: https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts 89 | Adding local repo with URL: http://127.0.0.1:8879/charts 90 | $HELM_HOME has been configured at /home/kube/.helm. 91 | 92 | Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster. 93 | 94 | Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy. 95 | To prevent this, run `helm init` with the --tiller-tls-verify flag. 
96 | For more information on securing your installation see: https://docs.helm.sh/using_helm/#securing-your-helm-installation 97 | ``` 98 | 99 | 100 | 备注:升级可以使用:`helm init --upgrade -i registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.12.2 --stable-repo-url https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts` 101 | 102 | ## 3. 验证 103 | 104 | `helm version` 将看到 helm 客户端和服务端版本。 105 | 106 | ``` 107 | [kube@m01 ~]$ helm version 108 | Client: &version.Version{SemVer:"v2.12.2", GitCommit:"7d2b0c73d734f6586ed222a567c5d103fed435be", GitTreeState:"clean"} 109 | Server: &version.Version{SemVer:"v2.12.2", GitCommit:"7d2b0c73d734f6586ed222a567c5d103fed435be", GitTreeState:"clean"} 110 | ``` 111 | 112 | `helm search` 查看相关可用 chart,如: 113 | 114 | ``` 115 | [kube@m01 ~]$ helm search mysql 116 | NAME CHART VERSION APP VERSION DESCRIPTION 117 | stable/mysql 0.13.0 5.7.14 Fast, reliable, scalable, and easy to use open-source rel... 118 | stable/mysqldump 2.0.2 2.0.0 A Helm chart to help backup MySQL databases using mysqldump 119 | stable/prometheus-mysql-exporter 0.2.1 v0.11.0 A Helm chart for prometheus mysql exporter with cloudsqlp... 120 | stable/percona 0.3.4 5.7.17 free, fully compatible, enhanced, open source drop-in rep... 121 | stable/percona-xtradb-cluster 0.6.1 5.7.19 free, fully compatible, enhanced, open source drop-in rep... 122 | stable/phpmyadmin 2.0.3 4.8.4 phpMyAdmin is an mysql administration frontend 123 | stable/gcloud-sqlproxy 0.6.1 1.11 DEPRECATED Google Cloud SQL Proxy 124 | stable/mariadb 5.4.3 10.1.37 Fast, reliable, scalable, and easy to use open-source rel... 125 | ``` 126 | 127 | ### 3.1 修改 stable charts 源 128 | 129 | 由于网络原因,这里讲默认的 stable charts 修改为阿里云提供的 charts,如下: 130 | 131 | ``` 132 | [kube@m01 volumns]$ helm repo list 133 | NAME URL 134 | stable https://kubernetes-charts.storage.googleapis.com 135 | local http://127.0.0.1:8879/charts 136 | [kube@m01 volumns]$ helm repo list 137 | NAME URL 138 | stable https://kubernetes-charts.storage.googleapis.com 139 | local http://127.0.0.1:8879/charts 140 | [kube@m01 volumns]$ helm repo remove stable 141 | "stable" has been removed from your repositories 142 | [kube@m01 volumns]$ helm repo add stable https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts 143 | "stable" has been added to your repositories 144 | [kube@m01 volumns]$ helm repo update 145 | Hang tight while we grab the latest from your chart repositories... 146 | ...Skip local chart repository 147 | ...Successfully got an update from the "stable" chart repository 148 | Update Complete. ⎈ Happy Helming!⎈ 149 | [kube@m01 volumns]$ helm repo list 150 | NAME URL 151 | local http://127.0.0.1:8879/charts 152 | stable https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts 153 | ``` 154 | 155 | ## 4. 
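切换 stable 源之后,可以先拉一个 chart 验证新源是否可用,下面以前面 `helm search` 结果里的 stable/mysql 为例(仅作示意,chart 可任意替换):

```
# 从新的 stable 源下载 chart 到本地,验证源可用
helm fetch stable/mysql

# 或者只渲染模板、不真正部署,顺便确认 tiller 工作正常
helm install stable/mysql --name mysql-test --dry-run --debug
```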
问题排查 156 | 157 | ### 4.1 *** is forbidden 158 | 159 | 通过 `helm list` 查看集群中安装的 charts,报错: 160 | ``` 161 | [kube@m01 ~]$ helm list 162 | Error: configmaps is forbidden: User "system:serviceaccount:kube-system:default" cannot list resource "configmaps" in API group "" in the namespace "kube-system" 163 | ``` 164 | 165 | 解决方式,参考:https://github.com/helm/helm/issues/3130 166 | 167 | 自Kubernetes 1.6版本开始,API Server启用了RBAC授权。而目前的Tiller部署没有定义授权的ServiceAccount,这会导致访问API Server时被拒绝。我们可以采用如下方法,明确为Tiller部署添加授权。 168 | 169 | ``` 170 | kubectl create serviceaccount --namespace kube-system tiller 171 | kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller 172 | kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}' 173 | ``` 174 | 175 | ## 5. 常用命令 176 | 177 | ``` 178 | # 卸载helm服务端 179 | helm reset 或 helm reset --force 180 | 181 | # 查看仓库中所有可用 Helm charts 182 | helm search 183 | 184 | # 更新 charts 列表 185 | helm repo update 186 | 187 | 188 | 189 | ``` -------------------------------------------------------------------------------- /tools/v1.13/nginx-ingress.yaml: -------------------------------------------------------------------------------- 1 | apiVersion: v1 2 | kind: Namespace 3 | metadata: 4 | name: ingress-nginx 5 | --- 6 | kind: ConfigMap 7 | apiVersion: v1 8 | metadata: 9 | name: nginx-configuration 10 | namespace: ingress-nginx 11 | labels: 12 | app.kubernetes.io/name: ingress-nginx 13 | app.kubernetes.io/part-of: ingress-nginx 14 | --- 15 | kind: ConfigMap 16 | apiVersion: v1 17 | metadata: 18 | name: tcp-services 19 | namespace: ingress-nginx 20 | labels: 21 | app.kubernetes.io/name: ingress-nginx 22 | app.kubernetes.io/part-of: ingress-nginx 23 | --- 24 | kind: ConfigMap 25 | apiVersion: v1 26 | metadata: 27 | name: udp-services 28 | namespace: ingress-nginx 29 | labels: 30 | app.kubernetes.io/name: ingress-nginx 31 | app.kubernetes.io/part-of: ingress-nginx 32 | --- 33 | apiVersion: v1 34 | kind: ServiceAccount 35 | metadata: 36 | name: nginx-ingress-serviceaccount 37 | namespace: ingress-nginx 38 | labels: 39 | app.kubernetes.io/name: ingress-nginx 40 | app.kubernetes.io/part-of: ingress-nginx 41 | --- 42 | apiVersion: rbac.authorization.k8s.io/v1beta1 43 | kind: ClusterRole 44 | metadata: 45 | name: nginx-ingress-clusterrole 46 | labels: 47 | app.kubernetes.io/name: ingress-nginx 48 | app.kubernetes.io/part-of: ingress-nginx 49 | rules: 50 | - apiGroups: 51 | - "" 52 | resources: 53 | - configmaps 54 | - endpoints 55 | - nodes 56 | - pods 57 | - secrets 58 | verbs: 59 | - list 60 | - watch 61 | - apiGroups: 62 | - "" 63 | resources: 64 | - nodes 65 | verbs: 66 | - get 67 | - apiGroups: 68 | - "" 69 | resources: 70 | - services 71 | verbs: 72 | - get 73 | - list 74 | - watch 75 | - apiGroups: 76 | - "extensions" 77 | resources: 78 | - ingresses 79 | verbs: 80 | - get 81 | - list 82 | - watch 83 | - apiGroups: 84 | - "" 85 | resources: 86 | - events 87 | verbs: 88 | - create 89 | - patch 90 | - apiGroups: 91 | - "extensions" 92 | resources: 93 | - ingresses/status 94 | verbs: 95 | - update 96 | --- 97 | apiVersion: rbac.authorization.k8s.io/v1beta1 98 | kind: Role 99 | metadata: 100 | name: nginx-ingress-role 101 | namespace: ingress-nginx 102 | labels: 103 | app.kubernetes.io/name: ingress-nginx 104 | app.kubernetes.io/part-of: ingress-nginx 105 | rules: 106 | - apiGroups: 107 | - "" 108 | resources: 109 | - configmaps 110 | - pods 111 | - secrets 112 | - namespaces 
113 | verbs: 114 | - get 115 | - apiGroups: 116 | - "" 117 | resources: 118 | - configmaps 119 | resourceNames: 120 | - "ingress-controller-leader-nginx" 121 | verbs: 122 | - get 123 | - update 124 | - apiGroups: 125 | - "" 126 | resources: 127 | - configmaps 128 | verbs: 129 | - create 130 | - apiGroups: 131 | - "" 132 | resources: 133 | - endpoints 134 | verbs: 135 | - get 136 | --- 137 | apiVersion: rbac.authorization.k8s.io/v1beta1 138 | kind: RoleBinding 139 | metadata: 140 | name: nginx-ingress-role-nisa-binding 141 | namespace: ingress-nginx 142 | labels: 143 | app.kubernetes.io/name: ingress-nginx 144 | app.kubernetes.io/part-of: ingress-nginx 145 | roleRef: 146 | apiGroup: rbac.authorization.k8s.io 147 | kind: Role 148 | name: nginx-ingress-role 149 | subjects: 150 | - kind: ServiceAccount 151 | name: nginx-ingress-serviceaccount 152 | namespace: ingress-nginx 153 | --- 154 | apiVersion: rbac.authorization.k8s.io/v1beta1 155 | kind: ClusterRoleBinding 156 | metadata: 157 | name: nginx-ingress-clusterrole-nisa-binding 158 | labels: 159 | app.kubernetes.io/name: ingress-nginx 160 | app.kubernetes.io/part-of: ingress-nginx 161 | roleRef: 162 | apiGroup: rbac.authorization.k8s.io 163 | kind: ClusterRole 164 | name: nginx-ingress-clusterrole 165 | subjects: 166 | - kind: ServiceAccount 167 | name: nginx-ingress-serviceaccount 168 | namespace: ingress-nginx 169 | --- 170 | apiVersion: apps/v1 171 | kind: Deployment 172 | metadata: 173 | name: nginx-ingress-controller 174 | namespace: ingress-nginx 175 | labels: 176 | app.kubernetes.io/name: ingress-nginx 177 | app.kubernetes.io/part-of: ingress-nginx 178 | spec: 179 | replicas: 3 180 | selector: 181 | matchLabels: 182 | app.kubernetes.io/name: ingress-nginx 183 | app.kubernetes.io/part-of: ingress-nginx 184 | template: 185 | metadata: 186 | labels: 187 | app.kubernetes.io/name: ingress-nginx 188 | app.kubernetes.io/part-of: ingress-nginx 189 | annotations: 190 | prometheus.io/port: "10254" 191 | prometheus.io/scrape: "true" 192 | spec: 193 | hostNetwork: true 194 | affinity: 195 | nodeAffinity: 196 | requiredDuringSchedulingIgnoredDuringExecution: 197 | nodeSelectorTerms: 198 | - matchExpressions: 199 | - key: kubernetes.io/hostname 200 | operator: In 201 | # 指定部署到三台 master 上 202 | values: 203 | - m01 204 | - m02 205 | - m03 206 | podAntiAffinity: 207 | requiredDuringSchedulingIgnoredDuringExecution: 208 | - labelSelector: 209 | matchExpressions: 210 | - key: app.kubernetes.io/name 211 | operator: In 212 | values: 213 | - ingress-nginx 214 | topologyKey: "kubernetes.io/hostname" 215 | tolerations: 216 | - key: node-role.kubernetes.io/master 217 | effect: NoSchedule 218 | serviceAccountName: nginx-ingress-serviceaccount 219 | containers: 220 | - name: nginx-ingress-controller 221 | image: registry.cn-hangzhou.aliyuncs.com/google_containers/nginx-ingress-controller:0.21.0 222 | args: 223 | - /nginx-ingress-controller 224 | - --configmap=$(POD_NAMESPACE)/nginx-configuration 225 | - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services 226 | - --udp-services-configmap=$(POD_NAMESPACE)/udp-services 227 | # - --publish-service=$(POD_NAMESPACE)/ingress-nginx 228 | - --annotations-prefix=nginx.ingress.kubernetes.io 229 | securityContext: 230 | capabilities: 231 | drop: 232 | - ALL 233 | add: 234 | - NET_BIND_SERVICE 235 | # www-data -> 33 236 | runAsUser: 33 237 | env: 238 | - name: POD_NAME 239 | valueFrom: 240 | fieldRef: 241 | fieldPath: metadata.name 242 | - name: POD_NAMESPACE 243 | valueFrom: 244 | fieldRef: 245 | fieldPath: 
metadata.namespace 246 | ports: 247 | - name: http 248 | containerPort: 80 249 | - name: https 250 | containerPort: 443 251 | livenessProbe: 252 | failureThreshold: 3 253 | httpGet: 254 | path: /healthz 255 | port: 10254 256 | scheme: HTTP 257 | initialDelaySeconds: 10 258 | periodSeconds: 10 259 | successThreshold: 1 260 | timeoutSeconds: 1 261 | readinessProbe: 262 | failureThreshold: 3 263 | httpGet: 264 | path: /healthz 265 | port: 10254 266 | scheme: HTTP 267 | periodSeconds: 10 268 | successThreshold: 1 269 | timeoutSeconds: 1 270 | resources: 271 | limits: 272 | cpu: 1 273 | memory: 1024Mi 274 | requests: 275 | cpu: 0.25 276 | memory: 512Mi 277 | -------------------------------------------------------------------------------- /applications/Wayne/README.md: -------------------------------------------------------------------------------- 1 | Wayne: 360 开源 kubernetes 多集群管理平台 2 | --- 3 | 4 | - [Qihoo360/wayne](https://github.com/Qihoo360/wayne) 5 | - [Wayne wiki](https://github.com/Qihoo360/wayne/wiki) 6 | 7 | 8 | ![](./screenshots/admin-dashboard.png) 9 | 10 | ## 1. 基础配置 11 | 12 | 安装 Wayne 跟安装普通的应用没太大的区别,通过 `kubectl apply -f *.yaml` 进行相关依赖和程序的安装。 13 | 14 | Wayne 提供了指导文档,具体参考:[wiki](https://github.com/Qihoo360/wayne/wiki) 15 | 16 | 17 | Wayne 的仓库中提供了部署所需的 yaml 配置文件,见:[hack/kubernetes](https://github.com/Qihoo360/wayne/tree/master/hack/kubernetes) 18 | 19 | 其中有两个目录:[dependency](https://github.com/Qihoo360/wayne/tree/master/hack/kubernetes/dependency)、[wayne](https://github.com/Qihoo360/wayne/tree/master/hack/kubernetes/wayne) 分别下载并通过 `kubectl apply -f .` 进行安装。 20 | 21 | 先安装 denpendency 中的 MySQL 和 RabbitMQ,再安装 wayne、wayne-webhook 和 wayne-woker。 22 | 23 | 因为官方提供的配置比较高,我自己调整了配置进行安装,具体配置见:[wayne/v1.3.1](./v1.3.1) 24 | 25 | 另外,为了通过域名访问服务,我增加了 `ingress` 配置,具体配置见:[wayne/v1.3.1/wayne/ingress.yaml](./v1.3.1/wayne/ingress.yaml) 26 | 27 | ## 2. 安装 28 | 29 | ### 2.1 安装依赖(MySQL 和 RabbitMQ) 30 | 31 | 进入 dependency 目录,执行 `kubectl apply -f .`,查看所有的 pod: 32 | 33 | ``` 34 | [kube@m01 ~]$ kubectl get pod 35 | NAME READY STATUS RESTARTS AGE 36 | mysql-wayne-df7c8c595-nmss2 1/1 Running 0 6h20m 37 | rabbitmq-wayne-6cc64bbd99-8fj5d 1/1 Running 0 6h20m 38 | 39 | ``` 40 | 41 | 看到 mysql-wayne 和 rabbitmq-wayne 的 pod 已经启动完成。 42 | 43 | ### 2.2 安装 wayne 后端程序 44 | 45 | 进入 [wayne 目录](./v1.3.1/wayne),执行 `kubectl apply -f .`,查看所有的 pod: 46 | 47 | ``` 48 | [kube@m01 ~]$ kubectl get pod 49 | NAME READY STATUS RESTARTS AGE 50 | infra-wayne-7ddd7f4b9c-spcjq 1/1 Running 0 105m 51 | infra-wayne-webhook-58995d89c5-kf9dr 1/1 Running 0 105m 52 | infra-wayne-woker-57685f749d-x8nmn 1/1 Running 0 105m 53 | mysql-wayne-df7c8c595-nmss2 1/1 Running 0 6h20m 54 | rabbitmq-wayne-6cc64bbd99-8fj5d 1/1 Running 0 6h20m 55 | 56 | ``` 57 | 58 | 所有的 pod 正常运行,查看 ingress 配置: 59 | 60 | ``` 61 | [kube@m01 ~]$ kubectl get ing 62 | NAME HOSTS ADDRESS PORTS AGE 63 | wayne-ingress wayne.k8s.hiko.im 80 114m 64 | 65 | ``` 66 | 67 | ### 2.3 访问 68 | 69 | 将 wayne.k8s.hiko.im 解析到 k8s 集群的 ingress 机器,再通过浏览器访问:http://wayne.k8s.hiko.im , 将看到: 70 | 71 | ![Home](./screenshots/home.png) 72 | 73 | ## 3. 
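如果访问机器还没有配置 wayne.k8s.hiko.im 的解析,可以参考下面的示意把域名指向承担 ingress 流量的节点(这里以本教程的 ing01、IP 192.168.33.40 为例,实际以 nginx-ingress-controller 所在节点为准):

```
# 方式一:在访问机器的 /etc/hosts 里把域名指向 ingress 节点(示意)
echo "192.168.33.40 wayne.k8s.hiko.im" | sudo tee -a /etc/hosts

# 方式二:不改 hosts,直接带 Host 头验证 ingress 是否正常转发
curl -I -H "Host: wayne.k8s.hiko.im" http://192.168.33.40/
```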
配置集群 74 | 75 | 默认管理员账号:admin,密码:admin 76 | 77 | 通过右上角 [管理员] -> [进入后台] 进入管理端,如下: 78 | 79 | ![config-cluster](./screenshots/config-cluster.png) 80 | 81 | ### 3.1 关联集群 82 | 83 | 为了让 wayne 能管理我们的 kubernetes 集群,我们需要在管理端关联集群。 84 | 85 | ![Home](./screenshots/config-cluster-1.png) 86 | 87 | 88 | 其中,KubeConfig 可以从 m01 (api-server 的机器上),对应用户(比如:本教程中使用的 kube 用户)的 `~/.kube/config` 文件查看。 89 | 90 | 本实例配置如下: 91 | 92 | ``` 93 | apiVersion: v1 94 | clusters: 95 | - cluster: 96 | certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRFNE1USXlNVEF6TVRBd00xb1hEVEk0TVRJeE9EQXpNVEF3TTFvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTFVEClM0ejhidmM3c05Xc3l3YmVnUDFPZGNlRmlWclhRbHZGOXl1UWpKeFVKYTFUWHVHVGFwQkNtVGIwZjlNUTBIc0MKYk9XY3hZUkkwVDFBSVlMNmZsbWFobXdiV0Q5TDVXak5GNGdOdmNFazdpNnd1a090dVpzK2tZcSszMXNYYjNDZQp6TTNqWUcrV3FnaTZ5N0FPTEZLMmNsNFVwbTJJQVBpbUdDaHI1MnIxTHBwalpLODRMQmo5MUNkN0NyYVI2OEx1CkQ2RDdIRnkxT1lVbHZ1N01VenA5T3hZVFBwZE1DMndXc0hsOFZJWDNpYVZtZVNhTDYwS0x4bDFjQ1l2d0dpanAKdzlCbVBtN2xobzJFMERmV2tMc2tEVkdsUUEyT2t0MFkrMkl5ZXZ2Yy9uZ0N6MEFhN0NZZ1hwQ3JhNHJyWUhieAp3bzRaWXNST0RJditubVVhVzMwQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFFTWpDeHh3a0NWNXJhM2NHMW1SOGJQbzIxV20KNXlUNS9rbE81cjRiMGlQSll0ZEM1VGc4ck9DU2lyN1JtSmJLYzYybVNhcVNvandIWW9YQk1rb1lROGo0Z1dnNgpPTWtJZlpaYTkzMFBlUG5ZNUVrekxSVXg2cnREMFVoSFJwakpaOWF0QmtqU1ZsNmptZEVQWEROMUFGSmtUVjdhCjdGZVhFdTZ6RDJhVHZoVjAyL2lwOGx2MFdhd3VJTXh6K1hyQkNQRlR1MWwydEUvd0VRVzVNb1F3bGdYOW1ZeTcKSVV1dENRWkREV0REQ0hCY1dMa0ZMN0F0UlVBa1JoTlk0K2dRZ29USnRGSmMzR0pER2Vkb1gwcHNRT3ExL1Z2egpicmw4ZHpPbWJ5L056MmZuZHFBNlpzemFHaUd5czJCTS9kUDh3MGsrbFhVSS9YTTFsdHZHZlZwdEpNWT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo= 97 | server: https://api.k8s.hiko.im:6443 98 | name: kubernetes 99 | contexts: 100 | - context: 101 | cluster: kubernetes 102 | user: kubernetes-admin 103 | name: kubernetes-admin@kubernetes 104 | current-context: kubernetes-admin@kubernetes 105 | kind: Config 106 | preferences: {} 107 | users: 108 | - name: kubernetes-admin 109 | user: 110 | client-certificate-data: 
LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM4akNDQWRxZ0F3SUJBZ0lJV2FYTUtqMmRnTWt3RFFZSktvWklodmNOQVFFTEJRQXdGVEVUTUJFR0ExVUUKQXhNS2EzVmlaWEp1WlhSbGN6QWVGdzB4T0RFeU1qRXdNekV3TUROYUZ3MHhPVEV5TWpFd016RXdNRFphTURReApGekFWQmdOVkJBb1REbk41YzNSbGJUcHRZWE4wWlhKek1Sa3dGd1lEVlFRREV4QnJkV0psY201bGRHVnpMV0ZrCmJXbHVNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQTFlaWNXUnp5MnJPOVFuWmEKU1ZVTHIyL2hvemwyTWFhTjkzemRmYkIvNTA0YUtGQXFjWUNCWkpwL0V0NFlwTitOaGFKWDRtS1YvdGdtanpLawpRSzFjdStHSlhiWllQd0pzUE5QcXh3TURJRitmTHpZMk1MSk9ScG1VOTVabFBCanNTa0J6YlpoQ2JZRzhuYXZXCkxqNlBLNXpLVldWRXVucXRFWDMwS0FPRHorOXZvS3BKck1Bd3RIb2lyZjk3SThIdnV5dVhLanBYOW94d2gyRVoKdDRUNVFVU0M5bnRpWWUwQnV0MlNBQytYdDBWaElYd3I0QWIzQzJhSTE2MFR3SWV0Ums1bS9KcGYrd2ZoOXc3MgpiejM3eDJJNUJwQmZsY1R2S3pjMUZ1cjZ6eXJqaWVGV05uQ3RiSFg2R0R6akJ0ODlOZTN5NDFpaUovYWFITUhTClcyWmhNUUlEQVFBQm95Y3dKVEFPQmdOVkhROEJBZjhFQkFNQ0JhQXdFd1lEVlIwbEJBd3dDZ1lJS3dZQkJRVUgKQXdJd0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFBbmxjRTVwL1QrQnhtRkVJZG9idENWWlJ4UC8vVk9KbDg4YQp3VHd0MmpsaFhLNitMSlYrZUZqTzV5WStPZU01bXo5TjRjRGFoSnhpbVlaTzRFT1kwUWJoaHp4OXhvcUt0QTlNCk9HRFRHbitwNUhwOS83emxtYWhxZERQbHpWNXFUTW1uZG1VeXBEdnRwU0d0amlUR1JFRG5KcWlkVlB3ZlhqNU0KMVI3MTd5ZjI3bjNSR05xRHhyeUloay9oZ0lvSXJNOEZRekFDRkhMMnlQUVFzQUc5ZGdxTVR3amJ0WGxacW16UAp2ZnNCQjIxVHV2YzdWd1A2b05sbVpRcnpJb2FRbmMrVm9tSTRVSmFDYWQ0MkxVL0Y4V3FIbjBMYUJqbHFaOFhpClUxcm03TG1HR1NOczVoM3p3OWg5Y0dNVGNQQ0NEUktIN0gza1FPVFc5dGpIbjd4YS9qTT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo= 111 | client-key-data: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBMWVpY1dSenkyck85UW5aYVNWVUxyMi9ob3psMk1hYU45M3pkZmJCLzUwNGFLRkFxCmNZQ0JaSnAvRXQ0WXBOK05oYUpYNG1LVi90Z21qektrUUsxY3UrR0pYYlpZUHdKc1BOUHF4d01ESUYrZkx6WTIKTUxKT1JwbVU5NVpsUEJqc1NrQnpiWmhDYllHOG5hdldMajZQSzV6S1ZXVkV1bnF0RVgzMEtBT0R6Kzl2b0twSgpyTUF3dEhvaXJmOTdJOEh2dXl1WEtqcFg5b3h3aDJFWnQ0VDVRVVNDOW50aVllMEJ1dDJTQUMrWHQwVmhJWHdyCjRBYjNDMmFJMTYwVHdJZXRSazVtL0pwZit3Zmg5dzcyYnozN3gySTVCcEJmbGNUdkt6YzFGdXI2enlyamllRlcKTm5DdGJIWDZHRHpqQnQ4OU5lM3k0MWlpSi9hYUhNSFNXMlpoTVFJREFRQUJBb0lCQUZjcXJOdWJjbE13djBUZwpHYmFjVTJDd1JOQlEwQnMzZGM2T01XdlFpcTVsSXorZU8wMTVRa0VPdkEyaU40U29ISEdDVURIT1hyVTB1N0hLCnZ5Z2ovUkFLdmdGVUZ1M0dQUGtrbWgxeTJzbE1iZis5SmFQK1pPdGNGbG8yRFJiS1NTK1F4L2kyL1FyR3ZXZTYKNkZKNzU3ZXI4citOdnM0R3c1UEhNY0ZFZldnenVSOVVRSFM0Tis4N3Yvc0k5NUd4dFZpcXQxTzkyMFJiV3FrZgpqVWVFZkNHeFdrVDhjNXgydnkwK05MdjgrQWJCVVJ6cXhtL0NnWEp3b0F2RVhzR1NYOFJybjNWR1JpZnNsNUwwCjl6c2pvVG1FNkdtV1prYUE3b1FpeHdxWVpoZG05RW9zTjdSNFFJMXVjMjN0UmRKOUhjdXhacktZZXhvY1NFUjEKTFNxNlVTa0NnWUVBNmRpV0tmdDhwTmlodVdKVzJIYjB2TlJzaWhZaHZjQmtUYnVlc1o2UHAvRGpNYkZ2ZmhtcQpRV21qUGk3Q3V3dU5qZzhrNEZZNmJrQS9wM1A4eUVvUkxGRDRqaXZycTMyWTNhYm1IUUJqd2NQZmxlellkWmp3CkpQYm0vdjVoRUNXRzk0WG5wR1ZUOUlNTzJIdFBaRXkxV0xycFoyU3FraFpuQjlZaWd0bzVybHNDZ1lFQTZpeDgKcG13SFByZU9hbjd2R3RzdlZETDBDckppdXRJVERuTTkxSStUY3drTUQ2a3NacmZIZXdMNWk5L1RlQXFDWk9pZgoyTmtRbzlFT1d1cEtFTzZaNjVWMm9ibm90SGdLQzFDc0NpVzVKdU4wTTVMbEpEUVhlbjlNK3F0bVJSUVNXcUFrCnNxeElvb3FXelpsUkUwcUpjUThibXRBbUFRTmNTSTJ5ZkdMeUhHTUNnWUJ1bnJCYVo4Y016QldrOXFvU2VDTkoKK0Vybi81UXlpUUpwNnlrazZOY1lJTkc0dmpENXUvWllQenFqdmNjTWFHaXNIT25hM2ErQ1hBNUFqcE96dzZYZwpDdVdwaTRsT2RIbU4wTmZtUERyMGZFNFdSQllaZXlHT3V1V0hGcHFmNHNDMzhyWWpoSE4wcFZLdWdaYUs0ZWFmCmRMdlkxendCSTJ2VnZ5eFFMaDgvSlFLQmdRQzRVVXZMc2p3Qm9YajNXZkhac2F3UEdndjhYMnhXb0FOZjNGVk8KZWJRVlY0bW15Z0dvMS82clZDd1hiSldHWnI4N3JkNGpVTGRJT2NTU3l0YUJmVXlwb1hzKzBKWFpkcUp4Ulk0awpib3pOanpwblhiZitSd0l6NlA4dVRycXdwSnZOdVQ4cFkzSElmazAwaHZqSnRtRjRHK3dlYnJkN0ZLb09jWG1MCmJsWWpBUUtCZ1FDMTQ0bkorSDVPdWpGREozenRWditoNGMrQk04b09od1JCUVVJclNuWG1MN0lXVU91ZXI1TzYKYnhIamlkSnVPMlNYYWtmWXdyMDBoZEhRVDFubEZTZ3lEaGdkY01
Jd1AwcmYySUNPMHJpekFyVkZCTlFMcUdmNApBcjZoVEFEcUd4bElScGEwYmhmeGpDd2VxZ3cwVUUyVnNNd1lNVVZyMzFmeDRIZktwaXBPZkE9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo= 112 | 113 | ``` 114 | 115 | 配置完保存即可。 116 | 117 | 点击 [kubernetes] -> [Node] 将看到节点实况: 118 | 119 | ![Home](./screenshots/admin-node.png) 120 | 121 | ## 4. 问题排查 122 | 123 | 可以通过 kubectl get pods 查看 pod 的状态,以及 kubectl logs -f {POD 实例名称} 查看 POD 输出的日志进行问题排查,如下: 124 | 125 | ``` 126 | [kube@m01 ~]$ kubectl get pods 127 | NAME READY STATUS RESTARTS AGE 128 | infra-wayne-7ddd7f4b9c-spcjq 1/1 Running 0 130m 129 | infra-wayne-webhook-58995d89c5-kf9dr 1/1 Running 0 130m 130 | infra-wayne-woker-57685f749d-x8nmn 1/1 Running 0 130m 131 | mysql-wayne-df7c8c595-nmss2 1/1 Running 0 6h46m 132 | rabbitmq-wayne-6cc64bbd99-8fj5d 1/1 Running 0 6h46m 133 | 134 | 135 | [kube@m01 ~]$ kubectl logs -f infra-wayne-7ddd7f4b9c-spcjq 136 | 2019/01/10 12:21:41.942 [I] [asm_amd64.s:2361] http server Running on http://:8080 137 | 2019/01/10 12:21:41.942 [I] [asm_amd64.s:2361] Admin server Running on :8088 138 | 139 | ``` -------------------------------------------------------------------------------- /tools/v1.13/kube-flannel.yml: -------------------------------------------------------------------------------- 1 | --- 2 | kind: ClusterRole 3 | apiVersion: rbac.authorization.k8s.io/v1beta1 4 | metadata: 5 | name: flannel 6 | rules: 7 | - apiGroups: 8 | - "" 9 | resources: 10 | - pods 11 | verbs: 12 | - get 13 | - apiGroups: 14 | - "" 15 | resources: 16 | - nodes 17 | verbs: 18 | - list 19 | - watch 20 | - apiGroups: 21 | - "" 22 | resources: 23 | - nodes/status 24 | verbs: 25 | - patch 26 | --- 27 | kind: ClusterRoleBinding 28 | apiVersion: rbac.authorization.k8s.io/v1beta1 29 | metadata: 30 | name: flannel 31 | roleRef: 32 | apiGroup: rbac.authorization.k8s.io 33 | kind: ClusterRole 34 | name: flannel 35 | subjects: 36 | - kind: ServiceAccount 37 | name: flannel 38 | namespace: kube-system 39 | --- 40 | apiVersion: v1 41 | kind: ServiceAccount 42 | metadata: 43 | name: flannel 44 | namespace: kube-system 45 | --- 46 | kind: ConfigMap 47 | apiVersion: v1 48 | metadata: 49 | name: kube-flannel-cfg 50 | namespace: kube-system 51 | labels: 52 | tier: node 53 | app: flannel 54 | data: 55 | cni-conf.json: | 56 | { 57 | "name": "cbr0", 58 | "plugins": [ 59 | { 60 | "type": "flannel", 61 | "delegate": { 62 | "hairpinMode": true, 63 | "isDefaultGateway": true 64 | } 65 | }, 66 | { 67 | "type": "portmap", 68 | "capabilities": { 69 | "portMappings": true 70 | } 71 | } 72 | ] 73 | } 74 | net-conf.json: | 75 | { 76 | "Network": "10.244.0.0/16", 77 | "Backend": { 78 | "Type": "vxlan" 79 | } 80 | } 81 | --- 82 | apiVersion: extensions/v1beta1 83 | kind: DaemonSet 84 | metadata: 85 | name: kube-flannel-ds-amd64 86 | namespace: kube-system 87 | labels: 88 | tier: node 89 | app: flannel 90 | spec: 91 | template: 92 | metadata: 93 | labels: 94 | tier: node 95 | app: flannel 96 | spec: 97 | hostNetwork: true 98 | nodeSelector: 99 | beta.kubernetes.io/arch: amd64 100 | tolerations: 101 | - operator: Exists 102 | effect: NoSchedule 103 | serviceAccountName: flannel 104 | initContainers: 105 | - name: install-cni 106 | image: quay.io/coreos/flannel:v0.10.0-amd64 107 | command: 108 | - cp 109 | args: 110 | - -f 111 | - /etc/kube-flannel/cni-conf.json 112 | - /etc/cni/net.d/10-flannel.conflist 113 | volumeMounts: 114 | - name: cni 115 | mountPath: /etc/cni/net.d 116 | - name: flannel-cfg 117 | mountPath: /etc/kube-flannel/ 118 | containers: 119 | - name: kube-flannel 120 | image: 
quay.io/coreos/flannel:v0.10.0-amd64 121 | command: 122 | - /opt/bin/flanneld 123 | args: 124 | - --ip-masq 125 | - --kube-subnet-mgr 126 | # 指定eth1的网卡 127 | # vagrant + virturlbox的环境,一般需要选择 eth1 的网卡 128 | - --iface=eth1 129 | resources: 130 | requests: 131 | cpu: "100m" 132 | memory: "50Mi" 133 | limits: 134 | cpu: "100m" 135 | memory: "50Mi" 136 | securityContext: 137 | privileged: true 138 | env: 139 | - name: POD_NAME 140 | valueFrom: 141 | fieldRef: 142 | fieldPath: metadata.name 143 | - name: POD_NAMESPACE 144 | valueFrom: 145 | fieldRef: 146 | fieldPath: metadata.namespace 147 | volumeMounts: 148 | - name: run 149 | mountPath: /run 150 | - name: flannel-cfg 151 | mountPath: /etc/kube-flannel/ 152 | volumes: 153 | - name: run 154 | hostPath: 155 | path: /run 156 | - name: cni 157 | hostPath: 158 | path: /etc/cni/net.d 159 | - name: flannel-cfg 160 | configMap: 161 | name: kube-flannel-cfg 162 | --- 163 | apiVersion: extensions/v1beta1 164 | kind: DaemonSet 165 | metadata: 166 | name: kube-flannel-ds-arm64 167 | namespace: kube-system 168 | labels: 169 | tier: node 170 | app: flannel 171 | spec: 172 | template: 173 | metadata: 174 | labels: 175 | tier: node 176 | app: flannel 177 | spec: 178 | hostNetwork: true 179 | nodeSelector: 180 | beta.kubernetes.io/arch: arm64 181 | tolerations: 182 | - operator: Exists 183 | effect: NoSchedule 184 | serviceAccountName: flannel 185 | initContainers: 186 | - name: install-cni 187 | image: quay.io/coreos/flannel:v0.10.0-arm64 188 | command: 189 | - cp 190 | args: 191 | - -f 192 | - /etc/kube-flannel/cni-conf.json 193 | - /etc/cni/net.d/10-flannel.conflist 194 | volumeMounts: 195 | - name: cni 196 | mountPath: /etc/cni/net.d 197 | - name: flannel-cfg 198 | mountPath: /etc/kube-flannel/ 199 | containers: 200 | - name: kube-flannel 201 | image: quay.io/coreos/flannel:v0.10.0-arm64 202 | command: 203 | - /opt/bin/flanneld 204 | args: 205 | - --ip-masq 206 | - --kube-subnet-mgr 207 | resources: 208 | requests: 209 | cpu: "100m" 210 | memory: "50Mi" 211 | limits: 212 | cpu: "100m" 213 | memory: "50Mi" 214 | securityContext: 215 | privileged: true 216 | env: 217 | - name: POD_NAME 218 | valueFrom: 219 | fieldRef: 220 | fieldPath: metadata.name 221 | - name: POD_NAMESPACE 222 | valueFrom: 223 | fieldRef: 224 | fieldPath: metadata.namespace 225 | volumeMounts: 226 | - name: run 227 | mountPath: /run 228 | - name: flannel-cfg 229 | mountPath: /etc/kube-flannel/ 230 | volumes: 231 | - name: run 232 | hostPath: 233 | path: /run 234 | - name: cni 235 | hostPath: 236 | path: /etc/cni/net.d 237 | - name: flannel-cfg 238 | configMap: 239 | name: kube-flannel-cfg 240 | --- 241 | apiVersion: extensions/v1beta1 242 | kind: DaemonSet 243 | metadata: 244 | name: kube-flannel-ds-arm 245 | namespace: kube-system 246 | labels: 247 | tier: node 248 | app: flannel 249 | spec: 250 | template: 251 | metadata: 252 | labels: 253 | tier: node 254 | app: flannel 255 | spec: 256 | hostNetwork: true 257 | nodeSelector: 258 | beta.kubernetes.io/arch: arm 259 | tolerations: 260 | - operator: Exists 261 | effect: NoSchedule 262 | serviceAccountName: flannel 263 | initContainers: 264 | - name: install-cni 265 | image: quay.io/coreos/flannel:v0.10.0-arm 266 | command: 267 | - cp 268 | args: 269 | - -f 270 | - /etc/kube-flannel/cni-conf.json 271 | - /etc/cni/net.d/10-flannel.conflist 272 | volumeMounts: 273 | - name: cni 274 | mountPath: /etc/cni/net.d 275 | - name: flannel-cfg 276 | mountPath: /etc/kube-flannel/ 277 | containers: 278 | - name: kube-flannel 279 | image: 
quay.io/coreos/flannel:v0.10.0-arm 280 | command: 281 | - /opt/bin/flanneld 282 | args: 283 | - --ip-masq 284 | - --kube-subnet-mgr 285 | resources: 286 | requests: 287 | cpu: "100m" 288 | memory: "50Mi" 289 | limits: 290 | cpu: "100m" 291 | memory: "50Mi" 292 | securityContext: 293 | privileged: true 294 | env: 295 | - name: POD_NAME 296 | valueFrom: 297 | fieldRef: 298 | fieldPath: metadata.name 299 | - name: POD_NAMESPACE 300 | valueFrom: 301 | fieldRef: 302 | fieldPath: metadata.namespace 303 | volumeMounts: 304 | - name: run 305 | mountPath: /run 306 | - name: flannel-cfg 307 | mountPath: /etc/kube-flannel/ 308 | volumes: 309 | - name: run 310 | hostPath: 311 | path: /run 312 | - name: cni 313 | hostPath: 314 | path: /etc/cni/net.d 315 | - name: flannel-cfg 316 | configMap: 317 | name: kube-flannel-cfg 318 | --- 319 | apiVersion: extensions/v1beta1 320 | kind: DaemonSet 321 | metadata: 322 | name: kube-flannel-ds-ppc64le 323 | namespace: kube-system 324 | labels: 325 | tier: node 326 | app: flannel 327 | spec: 328 | template: 329 | metadata: 330 | labels: 331 | tier: node 332 | app: flannel 333 | spec: 334 | hostNetwork: true 335 | nodeSelector: 336 | beta.kubernetes.io/arch: ppc64le 337 | tolerations: 338 | - operator: Exists 339 | effect: NoSchedule 340 | serviceAccountName: flannel 341 | initContainers: 342 | - name: install-cni 343 | image: quay.io/coreos/flannel:v0.10.0-ppc64le 344 | command: 345 | - cp 346 | args: 347 | - -f 348 | - /etc/kube-flannel/cni-conf.json 349 | - /etc/cni/net.d/10-flannel.conflist 350 | volumeMounts: 351 | - name: cni 352 | mountPath: /etc/cni/net.d 353 | - name: flannel-cfg 354 | mountPath: /etc/kube-flannel/ 355 | containers: 356 | - name: kube-flannel 357 | image: quay.io/coreos/flannel:v0.10.0-ppc64le 358 | command: 359 | - /opt/bin/flanneld 360 | args: 361 | - --ip-masq 362 | - --kube-subnet-mgr 363 | resources: 364 | requests: 365 | cpu: "100m" 366 | memory: "50Mi" 367 | limits: 368 | cpu: "100m" 369 | memory: "50Mi" 370 | securityContext: 371 | privileged: true 372 | env: 373 | - name: POD_NAME 374 | valueFrom: 375 | fieldRef: 376 | fieldPath: metadata.name 377 | - name: POD_NAMESPACE 378 | valueFrom: 379 | fieldRef: 380 | fieldPath: metadata.namespace 381 | volumeMounts: 382 | - name: run 383 | mountPath: /run 384 | - name: flannel-cfg 385 | mountPath: /etc/kube-flannel/ 386 | volumes: 387 | - name: run 388 | hostPath: 389 | path: /run 390 | - name: cni 391 | hostPath: 392 | path: /etc/cni/net.d 393 | - name: flannel-cfg 394 | configMap: 395 | name: kube-flannel-cfg 396 | --- 397 | apiVersion: extensions/v1beta1 398 | kind: DaemonSet 399 | metadata: 400 | name: kube-flannel-ds-s390x 401 | namespace: kube-system 402 | labels: 403 | tier: node 404 | app: flannel 405 | spec: 406 | template: 407 | metadata: 408 | labels: 409 | tier: node 410 | app: flannel 411 | spec: 412 | hostNetwork: true 413 | nodeSelector: 414 | beta.kubernetes.io/arch: s390x 415 | tolerations: 416 | - operator: Exists 417 | effect: NoSchedule 418 | serviceAccountName: flannel 419 | initContainers: 420 | - name: install-cni 421 | image: quay.io/coreos/flannel:v0.10.0-s390x 422 | command: 423 | - cp 424 | args: 425 | - -f 426 | - /etc/kube-flannel/cni-conf.json 427 | - /etc/cni/net.d/10-flannel.conflist 428 | volumeMounts: 429 | - name: cni 430 | mountPath: /etc/cni/net.d 431 | - name: flannel-cfg 432 | mountPath: /etc/kube-flannel/ 433 | containers: 434 | - name: kube-flannel 435 | image: quay.io/coreos/flannel:v0.10.0-s390x 436 | command: 437 | - /opt/bin/flanneld 438 | args: 
439 | - --ip-masq 440 | - --kube-subnet-mgr 441 | resources: 442 | requests: 443 | cpu: "100m" 444 | memory: "50Mi" 445 | limits: 446 | cpu: "100m" 447 | memory: "50Mi" 448 | securityContext: 449 | privileged: true 450 | env: 451 | - name: POD_NAME 452 | valueFrom: 453 | fieldRef: 454 | fieldPath: metadata.name 455 | - name: POD_NAMESPACE 456 | valueFrom: 457 | fieldRef: 458 | fieldPath: metadata.namespace 459 | volumeMounts: 460 | - name: run 461 | mountPath: /run 462 | - name: flannel-cfg 463 | mountPath: /etc/kube-flannel/ 464 | volumes: 465 | - name: run 466 | hostPath: 467 | path: /run 468 | - name: cni 469 | hostPath: 470 | path: /etc/cni/net.d 471 | - name: flannel-cfg 472 | configMap: 473 | name: kube-flannel-cfg 474 | -------------------------------------------------------------------------------- /errors/1-pod_status_error.md: -------------------------------------------------------------------------------- 1 | Pod 状态为 Error/CrashLoopBackOff 2 | --- 3 | 4 | ##1. 背景 5 | 6 | 本地 k8s 集群所在的宿主机关机重启后,个别 Pod 状态异常,具体如下: 7 | 8 | ``` 9 | [kube@m01 ~]$ kubectl get pods -n kube-system -owide 10 | NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES 11 | coredns-6c67f849c7-dkzqf 1/1 Running 1 7d22h 10.244.3.85 n01 12 | coredns-6c67f849c7-zgf9h 1/1 Running 1 7d22h 10.244.0.15 m01 13 | etcd-m01 1/1 Running 23 27d 192.168.33.10 m01 14 | etcd-m02 1/1 Running 14 27d 192.168.33.11 m02 15 | etcd-m03 1/1 Running 10 27d 192.168.33.12 m03 16 | kube-apiserver-m01 1/1 Running 26 27d 192.168.33.10 m01 17 | kube-apiserver-m02 1/1 Running 4 27d 192.168.33.11 m02 18 | kube-apiserver-m03 1/1 Running 12 27d 192.168.33.12 m03 19 | kube-controller-manager-m01 1/1 Running 8 27d 192.168.33.10 m01 20 | kube-controller-manager-m02 1/1 Running 1 27d 192.168.33.11 m02 21 | kube-controller-manager-m03 1/1 Running 4 27d 192.168.33.12 m03 22 | kube-flannel-ds-amd64-7b86z 1/1 Running 3 27d 192.168.33.10 m01 23 | kube-flannel-ds-amd64-98qks 1/1 Running 3 27d 192.168.33.12 m03 24 | kube-flannel-ds-amd64-dvgdn 0/1 Error 5 3m22s 192.168.33.21 n02 25 | kube-flannel-ds-amd64-ljcdp 1/1 Running 2 27d 192.168.33.11 m02 26 | kube-flannel-ds-amd64-s8vzs 1/1 Running 4 26d 192.168.33.20 n01 27 | kube-flannel-ds-amd64-v5lkv 0/1 CrashLoopBackOff 9 23d 192.168.33.40 ing01 28 | kube-proxy-485hs 1/1 Running 2 23d 192.168.33.40 ing01 29 | kube-proxy-c4j4r 1/1 Running 2 26d 192.168.33.20 n01 30 | kube-proxy-krnjq 1/1 Running 2 27d 192.168.33.10 m01 31 | kube-proxy-n9s8c 1/1 Running 3 26d 192.168.33.21 n02 32 | kube-proxy-scb25 1/1 Running 2 27d 192.168.33.12 m03 33 | kube-proxy-xp4rj 1/1 Running 1 27d 192.168.33.11 m02 34 | kube-scheduler-m01 1/1 Running 8 27d 192.168.33.10 m01 35 | kube-scheduler-m02 1/1 Running 1 27d 192.168.33.11 m02 36 | kube-scheduler-m03 1/1 Running 2 27d 192.168.33.12 m03 37 | kubernetes-dashboard-847f8cb7b8-qdgkk 0/1 CrashLoopBackOff 1 26d n02 38 | metrics-server-8658466f94-sr479 1/1 Running 2 26d 10.244.3.83 n01 39 | tiller-deploy-7d6b75487c-46x8x 1/1 Running 1 7d1h 10.244.3.88 n01 40 | 41 | ``` 42 | 43 | ##2. 
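Pod 比较多时,可以先把非 Running 状态的 Pod 过滤出来,再逐个排查(示意命令,Pod 名称以上面的 kube-flannel 为例):

```
# 过滤出状态异常的 Pod(Running/Completed 之外的都值得关注)
kubectl get pods --all-namespaces -owide | grep -vE 'Running|Completed'

# 查看异常 Pod 上一次崩溃前的日志
kubectl logs --previous kube-flannel-ds-amd64-dvgdn -n kube-system
```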
问题排查 44 | 45 | 查看所有节点状态: 46 | 47 | ``` 48 | [kube@m01 ~]$ kubectl get nodes 49 | NAME STATUS ROLES AGE VERSION 50 | ing01 Ready 23d v1.13.1 51 | m01 Ready master 28d v1.13.1 52 | m02 NotReady master 27d v1.13.1 53 | m03 Ready master 27d v1.13.1 54 | n01 Ready 26d v1.13.1 55 | n02 Ready 26d v1.13.1 56 | ``` 57 | 58 | 59 | 通过 `kubectl describe pod` 查看具体 Pod 信息: 60 | 61 | 查看 kube-flannel: 62 | 63 | ``` 64 | [kube@m01 ~]$ kubectl describe pod kube-flannel-ds-amd64-dvgdn -n kube-system 65 | Name: kube-flannel-ds-amd64-dvgdn 66 | Namespace: kube-system 67 | Priority: 0 68 | PriorityClassName: 69 | Node: n02/192.168.33.21 70 | Start Time: Fri, 18 Jan 2019 02:55:54 +0000 71 | Labels: app=flannel 72 | controller-revision-hash=6688cccc54 73 | pod-template-generation=1 74 | tier=node 75 | Annotations: 76 | 77 | ... 78 | 79 | Events: 80 | Type Reason Age From Message 81 | ---- ------ ---- ---- ------- 82 | Normal Scheduled 7m22s default-scheduler Successfully assigned kube-system/kube-flannel-ds-amd64-dvgdn to n02 83 | Normal Pulled 7m21s kubelet, n02 Container image "quay.io/coreos/flannel:v0.10.0-amd64" already present on machine 84 | Normal Created 7m21s kubelet, n02 Created container 85 | Normal Started 7m21s kubelet, n02 Started container 86 | Normal Created 6m34s (x4 over 7m20s) kubelet, n02 Created container 87 | Normal Started 6m34s (x4 over 7m20s) kubelet, n02 Started container 88 | Normal Pulled 5m45s (x5 over 7m20s) kubelet, n02 Container image "quay.io/coreos/flannel:v0.10.0-amd64" already present on machine 89 | Warning BackOff 2m8s (x26 over 7m17s) kubelet, n02 Back-off restarting failed container 90 | ``` 91 | 92 | 93 | 启动日志 94 | ``` 95 | [kube@m01 ~]$ kubectl logs kube-flannel-ds-amd64-dvgdn -n kube-system 96 | I0118 03:32:44.520004 1 main.go:488] Using interface with name eth1 and address 192.168.33.21 97 | I0118 03:32:44.520061 1 main.go:505] Defaulting external address to interface address (192.168.33.21) 98 | E0118 03:32:44.521067 1 main.go:232] Failed to create SubnetManager: error retrieving pod spec for 'kube-system/kube-flannel-ds-amd64-dvgdn': Get https://10.96.0.1:443/api/v1/namespaces/kube-system/pods/kube-flannel-ds-amd64-dvgdn: dial tcp 10.96.0.1:443: connect: network is unreachable 99 | ``` 100 | 101 | 查看 dashboard: 102 | 103 | ``` 104 | 105 | [kube@m01 ~]$ kubectl describe pod kubernetes-dashboard-847f8cb7b8-qdgkk -n kube-system 106 | Name: kubernetes-dashboard-847f8cb7b8-qdgkk 107 | Namespace: kube-system 108 | Priority: 0 109 | PriorityClassName: 110 | Node: n02/192.168.33.21 111 | Start Time: Sun, 23 Dec 2018 17:35:49 +0000 112 | Labels: k8s-app=kubernetes-dashboard 113 | pod-template-hash=847f8cb7b8 114 | 115 | ... 
116 | 117 | Events: 118 | Type Reason Age From Message 119 | ---- ------ ---- ---- ------- 120 | Warning FailedCreatePodSandBox 21m kubelet, n02 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "08da675dd1383279956350886ba2187344f7d081cf16985b82659952a8ac8015" network for pod "kubernetes-dashboard-847f8cb7b8-qdgkk": NetworkPlugin cni failed to set up pod "kubernetes-dashboard-847f8cb7b8-qdgkk_kube-system" network: open /run/flannel/subnet.env: no such file or directory 121 | Warning FailedCreatePodSandBox 21m kubelet, n02 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "0e2b632f026c842d10f7d30cfaaac239c179de43c2d1ef5bc5dbc90693d31c09" network for pod "kubernetes-dashboard-847f8cb7b8-qdgkk": NetworkPlugin cni failed to set up pod "kubernetes-dashboard-847f8cb7b8-qdgkk_kube-system" network: open /run/flannel/subnet.env: no such file or directory 122 | Warning FailedCreatePodSandBox 21m kubelet, n02 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "4ebd76003f1a27514c8c7b0cde24f737efef808f56c28f3df323f3447d331266" network for pod "kubernetes-dashboard-847f8cb7b8-qdgkk": NetworkPlugin cni failed to set up pod "kubernetes-dashboard-847f8cb7b8-qdgkk_kube-system" network: open /run/flannel/subnet.env: no such file or directory 123 | Warning FailedCreatePodSandBox 21m kubelet, n02 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "7ee7082a899b32148fa937865e1ab20d18dfde8cef9394505526c2bec6c2c961" network for pod "kubernetes-dashboard-847f8cb7b8-qdgkk": NetworkPlugin cni failed to set up pod "kubernetes-dashboard-847f8cb7b8-qdgkk_kube-system" network: open /run/flannel/subnet.env: no such file or directory 124 | Warning FailedCreatePodSandBox 21m kubelet, n02 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "58feb9af9d51f450ec789ed9f2aabb441333377bb83dd854137165efb6bc424d" network for pod "kubernetes-dashboard-847f8cb7b8-qdgkk": NetworkPlugin cni failed to set up pod "kubernetes-dashboard-847f8cb7b8-qdgkk_kube-system" network: open /run/flannel/subnet.env: no such file or directory 125 | Warning FailedCreatePodSandBox 21m kubelet, n02 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "10a1eead86999d3339d36381bbd36d3ecab1ba1399874201761430560fde937d" network for pod "kubernetes-dashboard-847f8cb7b8-qdgkk": NetworkPlugin cni failed to set up pod "kubernetes-dashboard-847f8cb7b8-qdgkk_kube-system" network: open /run/flannel/subnet.env: no such file or directory 126 | Warning FailedCreatePodSandBox 21m kubelet, n02 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "2e99d5e688656a5e2cbf93e4f5bedd06c342e1cc9d4e30363f2e56b881ab137c" network for pod "kubernetes-dashboard-847f8cb7b8-qdgkk": NetworkPlugin cni failed to set up pod "kubernetes-dashboard-847f8cb7b8-qdgkk_kube-system" network: open /run/flannel/subnet.env: no such file or directory 127 | Warning FailedCreatePodSandBox 21m kubelet, n02 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "b6b50ec3aae2722e895ed9759e7c0d6f99c35c48683ce441efb52d6029042030" network for pod "kubernetes-dashboard-847f8cb7b8-qdgkk": NetworkPlugin cni failed to set up pod "kubernetes-dashboard-847f8cb7b8-qdgkk_kube-system" network: open /run/flannel/subnet.env: no such file or directory 128 | Warning 
FailedCreatePodSandBox 21m kubelet, n02 Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "f21ff1077d158f2d67e4a503534e926d60a76e0dea8563502c088d549b43f222" network for pod "kubernetes-dashboard-847f8cb7b8-qdgkk": NetworkPlugin cni failed to set up pod "kubernetes-dashboard-847f8cb7b8-qdgkk_kube-system" network: open /run/flannel/subnet.env: no such file or directory 129 | Warning FailedCreatePodSandBox 16m (x253 over 21m) kubelet, n02 (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "bef7144ab016d761d3cd7ed60793967a8e379e1e3b5a097e2e3d775acd71d21e" network for pod "kubernetes-dashboard-847f8cb7b8-qdgkk": NetworkPlugin cni failed to set up pod "kubernetes-dashboard-847f8cb7b8-qdgkk_kube-system" network: open /run/flannel/subnet.env: no such file or directory 130 | Normal SandboxChanged 75s (x1005 over 22m) kubelet, n02 Pod sandbox changed, it will be killed and re-created. 131 | ``` 132 | 133 | ##3. 思考 134 | 135 | 136 | ###3.1 猜测问题出在 docker 137 | 138 | 从 kube-flannel 的错误信息(如下)中猜测,问题可能跟 docker 服务有关系(因为提示容器启动失败,容器启动是跟 docker 有关的)。 139 | 140 | ``` 141 | Warning BackOff 2m8s (x26 over 7m17s) kubelet, n02 Back-off restarting failed container 142 | 143 | ``` 144 | 145 | 接着去到对应的虚拟机,查看 docker 的状态和日志: 146 | 147 | docker 状态:正常 148 | 149 | ``` 150 | [kube@n02 ~]$ sudo systemctl status docker 151 | ● docker.service - Docker Application Container Engine 152 | Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled) 153 | Active: active (running) since Fri 2019-01-18 03:15:18 UTC; 6min ago 154 | Docs: http://docs.docker.com 155 | Main PID: 17057 (dockerd-current) 156 | CGroup: /system.slice/docker.service 157 | ├─10743 /usr/bin/docker-containerd-shim-current 18ae9a0097061645324d0677bc25d028de1ce55d61e1fe0cc9b5319770a2f257 /var/run/docker/libcontain... 158 | ├─10798 /usr/bin/docker-containerd-shim-current 3b0cda36376bd06f4e10a948e99a321ef70546d5738ea84f46a0e3bf6c57e669 /var/run/docker/libcontain... 159 | ├─10919 /usr/libexec/docker/docker-runc-current --systemd-cgroup=true kill --all 18ae9a0097061645324d0677bc25d028de1ce55d61e1fe0cc9b5319770... 160 | ├─10925 /usr/bin/docker-containerd-current -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --metrics-interval=0 --start-time... 161 | ├─17057 /usr/bin/dockerd-current --add-runtime docker-runc=/usr/libexec/docker/docker-runc-current --default-runtime=docker-runc --exec-opt... 162 | ├─17062 /usr/bin/docker-containerd-current -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --metrics-interval=0 --start-time... 163 | ├─17157 /usr/bin/docker-containerd-shim-current 9f8be01f0a1e540ba53c5cdd7864f6579513fabe719c034034075cefa499ab1b /var/run/docker/libcontain... 164 | ├─17179 /usr/bin/docker-containerd-shim-current d9db4238d634b77ee22a98f2a88a873b87017617e8bbe926e63f7707dec82135 /var/run/docker/libcontain... 165 | ├─17180 /usr/bin/docker-containerd-shim-current 1b63ed04989c441a9d97b9096bc81aee78ea1ecf06917afd4eef417ab69d12d7 /var/run/docker/libcontain... 166 | ├─17272 /usr/bin/docker-containerd-shim-current c22f257e887c537c4d2ea12cf63a3f2b7ee9839f6a61d84953082dd7c324ed23 /var/run/docker/libcontain... 167 | └─17320 /usr/bin/docker-containerd-shim-current 862764360d3d2f1eecda1cf41380ad45f3467aa4401dd94ca553749c2b1c7a07 /var/run/docker/libcontain... 
168 | 169 | ``` 170 | 171 | 172 | docker 日志:有一些错误日志 173 | 174 | ``` 175 | 176 | [kube@n02 ~]$ sudo journalctl -exu docker | tail 177 | Jan 18 03:29:43 n02 dockerd-current[17057]: time="2019-01-18T03:29:43.901008403Z" level=error msg="Handler for POST /v1.26/containers/55ef25f7e156ee4c0ca2c49a5a842411c9f39583097054437918d10931e9ac32/stop returned error: Container 55ef25f7e156ee4c0ca2c49a5a842411c9f39583097054437918d10931e9ac32 is already stopped" 178 | Jan 18 03:29:43 n02 dockerd-current[17057]: time="2019-01-18T03:29:43.901256502Z" level=error msg="Handler for POST /v1.26/containers/2250b8dab37e347c545371d8b468da38e5c44231fc69ac3b0577d7baad79583c/stop returned error: Container 2250b8dab37e347c545371d8b468da38e5c44231fc69ac3b0577d7baad79583c is already stopped" 179 | Jan 18 03:29:43 n02 dockerd-current[17057]: time="2019-01-18T03:29:43.902458062Z" level=error msg="Handler for POST /v1.26/containers/e2df7730326b9373e35ec533f755d615d4c6a654c266a43dc748a52b3df13a36/stop returned error: Container e2df7730326b9373e35ec533f755d615d4c6a654c266a43dc748a52b3df13a36 is already stopped" 180 | Jan 18 03:29:43 n02 dockerd-current[17057]: time="2019-01-18T03:29:43.902711399Z" level=error msg="Handler for POST /v1.26/containers/2a58240e9542de339ee6cd2a81bfec298d362214088c012517cc20a75906c069/stop returned error: Container 2a58240e9542de339ee6cd2a81bfec298d362214088c012517cc20a75906c069 is already stopped" 181 | Jan 18 03:29:43 n02 dockerd-current[17057]: time="2019-01-18T03:29:43.905909305Z" level=error msg="Handler for POST /v1.26/containers/c280a53c9a86133b3ae6ed5f6f4f35082ec4a3c2064cd716750c589ca233a22a/stop returned error: Container c280a53c9a86133b3ae6ed5f6f4f35082ec4a3c2064cd716750c589ca233a22a is already stopped" 182 | Jan 18 03:29:43 n02 dockerd-current[17057]: time="2019-01-18T03:29:43.911192335Z" level=error msg="Handler for POST /v1.26/containers/3d8919a8bdde3a6036624081f2b288664d8feed63d80f364c394dc3ea2721f02/stop returned error: Container 3d8919a8bdde3a6036624081f2b288664d8feed63d80f364c394dc3ea2721f02 is already stopped" 183 | Jan 18 03:29:43 n02 dockerd-current[17057]: time="2019-01-18T03:29:43.912749444Z" level=error msg="Handler for POST /v1.26/containers/1b7cd279a61dca5aeb9f2bc4e3970c199f8dcd8d8fc7301c91d35dfec4170b3a/stop returned error: Container 1b7cd279a61dca5aeb9f2bc4e3970c199f8dcd8d8fc7301c91d35dfec4170b3a is already stopped" 184 | Jan 18 03:29:43 n02 dockerd-current[17057]: time="2019-01-18T03:29:43.914244025Z" level=error msg="Handler for POST /v1.26/containers/0fca1a5ae1dd4e2a7868a2fd01c14a3d8cdff7c7886235597cc44c085d108f20/stop returned error: Container 0fca1a5ae1dd4e2a7868a2fd01c14a3d8cdff7c7886235597cc44c085d108f20 is already stopped" 185 | Jan 18 03:29:43 n02 dockerd-current[17057]: time="2019-01-18T03:29:43.915657895Z" level=error msg="Handler for POST /v1.26/containers/2c81b5992374f6117a9aa1d416fbda534135fdcee83eea1d1a82b0fe101d0660/stop returned error: Container 2c81b5992374f6117a9aa1d416fbda534135fdcee83eea1d1a82b0fe101d0660 is already stopped" 186 | Jan 18 03:29:43 n02 dockerd-current[17057]: time="2019-01-18T03:29:43.918133242Z" level=error msg="Handler for POST /v1.26/containers/8b99252253f16d0a1ef393ea8fa4d4307c5239cfc3e38f6a8ac1fb0d8d7e4816/stop returned error: Container 8b99252253f16d0a1ef393ea8fa4d4307c5239cfc3e38f6a8ac1fb0d8d7e4816 is already stopped" 187 | Jan 18 03:19:20 n02 oci-systemd-hook[26622]: systemdhook : be2596bf7571: Skipping as container command is /pause, not init or systemd 188 | ``` 189 | 190 | 上面的错误日志主要是由于: 191 | 192 | 1. 
kebelet 请求 dockerd 对某个容器进行操作(/stop)。 193 | 2. dockerd 无法/操作容器失败,返回错误 194 | 3. dockerd 过小段时间重试 195 | 4. kebelet 重试操作,重复步骤 1 196 | 197 | 198 | 查看 n02 (kube-flannel 启动失败的虚拟机)上的 kubelet 的日志,过滤出 kube-flannel 的日志: 199 | ``` 200 | [kube@n02 ~]$ sudo journalctl -exu kubelet | tail -n1000 | grep -5 kube-flannel-ds-amd64-dvgdn 201 | Jan 18 03:50:45 n02 kubelet[27151]: E0118 03:50:45.175182 27151 pod_workers.go:190] Error syncing pod 94bbe7f2-1acc-11e9-87c4-5254008481d5 ("kube-flannel-ds-amd64-dvgdn_kube-system(94bbe7f2-1acc-11e9-87c4-5254008481d5)"), skipping: failed to "StartContainer" for "kube-flannel" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-flannel pod=kube-flannel-ds-amd64-dvgdn_kube-system(94bbe7f2-1acc-11e9-87c4-5254008481d5)" 202 | Jan 18 03:50:54 n02 kubelet[27151]: E0118 03:50:54.306705 27151 summary_sys_containers.go:45] Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service" 203 | Jan 18 03:50:54 n02 kubelet[27151]: E0118 03:50:54.306726 27151 summary_sys_containers.go:45] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service" 204 | Jan 18 03:50:59 n02 kubelet[27151]: E0118 03:50:59.176183 27151 pod_workers.go:190] Error syncing pod 94bbe7f2-1acc-11e9-87c4-5254008481d5 ("kube-flannel-ds-amd64-dvgdn_kube-system(94bbe7f2-1acc-11e9-87c4-5254008481d5)"), skipping: failed to "StartContainer" for "kube-flannel" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-flannel pod=kube-flannel-ds-amd64-dvgdn_kube-system(94bbe7f2-1acc-11e9-87c4-5254008481d5)" 205 | Jan 18 03:51:02 n02 kubelet[27151]: E0118 03:51:02.278816 27151 summary_sys_containers.go:45] Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service" 206 | Jan 18 03:51:02 n02 kubelet[27151]: E0118 03:51:02.278836 27151 summary_sys_containers.go:45] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service" 207 | Jan 18 03:51:04 n02 kubelet[27151]: E0118 03:51:04.310172 27151 summary_sys_containers.go:45] Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service" 208 | Jan 18 03:51:04 n02 kubelet[27151]: E0118 03:51:04.310188 27151 summary_sys_containers.go:45] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service" 209 | Jan 18 03:51:11 n02 kubelet[27151]: E0118 03:51:11.181258 27151 pod_workers.go:190] Error syncing pod 94bbe7f2-1acc-11e9-87c4-5254008481d5 ("kube-flannel-ds-amd64-dvgdn_kube-system(94bbe7f2-1acc-11e9-87c4-5254008481d5)"), 
skipping: failed to "StartContainer" for "kube-flannel" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-flannel pod=kube-flannel-ds-amd64-dvgdn_kube-system(94bbe7f2-1acc-11e9-87c4-5254008481d5)" 210 | Jan 18 03:51:14 n02 kubelet[27151]: E0118 03:51:14.319367 27151 summary_sys_containers.go:45] Failed to get system container stats for "/system.slice/kubelet.service": failed to get cgroup stats for "/system.slice/kubelet.service": failed to get container info for "/system.slice/kubelet.service": unknown container "/system.slice/kubelet.service" 211 | Jan 18 03:51:14 n02 kubelet[27151]: E0118 03:51:14.319387 27151 summary_sys_containers.go:45] Failed to get system container stats for "/system.slice/docker.service": failed to get cgroup stats for "/system.slice/docker.service": failed to get container info for "/system.slice/docker.service": unknown container "/system.slice/docker.service" 212 | 213 | ``` 214 | 215 | 后来发现,有异常的虚拟机的网口信息和正常的其他节点存在较大差异,主要就是缺少 eth0 、 flannel、 cni 网络信息等: 216 | 217 | 有问题的节点: 218 | ``` 219 | [kube@n02 ~]$ ip addr 220 | 1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 221 | link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 222 | inet 127.0.0.1/8 scope host lo 223 | valid_lft forever preferred_lft forever 224 | inet6 ::1/128 scope host 225 | valid_lft forever preferred_lft forever 226 | 2: eth0: mtu 1500 qdisc noop state DOWN group default qlen 1000 227 | link/ether 52:54:00:84:81:d5 brd ff:ff:ff:ff:ff:ff 228 | 3: eth1: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 229 | link/ether 08:00:27:03:93:5b brd ff:ff:ff:ff:ff:ff 230 | inet 192.168.33.21/24 brd 192.168.33.255 scope global eth1 231 | valid_lft forever preferred_lft forever 232 | inet6 fe80::a00:27ff:fe03:935b/64 scope link 233 | valid_lft forever preferred_lft forever 234 | 4: docker0: mtu 1500 qdisc noqueue state DOWN group default 235 | link/ether 02:42:e4:77:bd:6c brd ff:ff:ff:ff:ff:ff 236 | inet 172.17.0.1/16 scope global docker0 237 | valid_lft forever preferred_lft forever 238 | 239 | ``` 240 | 241 | 正常的节点: 242 | ``` 243 | [kube@n01 ~]$ ip addr 244 | 1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 245 | link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 246 | inet 127.0.0.1/8 scope host lo 247 | valid_lft forever preferred_lft forever 248 | inet6 ::1/128 scope host 249 | valid_lft forever preferred_lft forever 250 | 2: eth0: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 251 | link/ether 52:54:00:84:81:d5 brd ff:ff:ff:ff:ff:ff 252 | inet 10.0.2.15/24 brd 10.0.2.255 scope global noprefixroute dynamic eth0 253 | valid_lft 71601sec preferred_lft 71601sec 254 | inet6 fe80::5054:ff:fe84:81d5/64 scope link 255 | valid_lft forever preferred_lft forever 256 | 3: eth1: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 257 | link/ether 08:00:27:62:9b:79 brd ff:ff:ff:ff:ff:ff 258 | inet 192.168.33.20/24 brd 192.168.33.255 scope global noprefixroute eth1 259 | valid_lft forever preferred_lft forever 260 | inet6 fe80::a00:27ff:fe62:9b79/64 scope link 261 | valid_lft forever preferred_lft forever 262 | 4: docker0: mtu 1500 qdisc noqueue state DOWN group default 263 | link/ether 02:42:61:22:a2:e1 brd ff:ff:ff:ff:ff:ff 264 | inet 172.17.0.1/16 scope global docker0 265 | valid_lft forever preferred_lft forever 266 | 5: flannel.1: mtu 1450 qdisc noqueue state UNKNOWN group default 267 | link/ether c6:c8:6b:c9:ce:60 brd ff:ff:ff:ff:ff:ff 268 | inet 10.244.3.0/32 scope global flannel.1 269 | valid_lft forever preferred_lft 
forever 270 | inet6 fe80::c4c8:6bff:fec9:ce60/64 scope link 271 | valid_lft forever preferred_lft forever 272 | 6: cni0: mtu 1450 qdisc noqueue state UP group default qlen 1000 273 | link/ether 0a:58:0a:f4:03:01 brd ff:ff:ff:ff:ff:ff 274 | inet 10.244.3.1/24 scope global cni0 275 | valid_lft forever preferred_lft forever 276 | inet6 fe80::2c40:1eff:fe63:b02b/64 scope link 277 | valid_lft forever preferred_lft forever 278 | 7: veth780a2dff@if3: mtu 1450 qdisc noqueue master cni0 state UP group default 279 | link/ether 26:a6:38:18:ba:87 brd ff:ff:ff:ff:ff:ff link-netnsid 0 280 | inet6 fe80::24a6:38ff:fe18:ba87/64 scope link 281 | valid_lft forever preferred_lft forever 282 | 8: veth63f08a2e@if3: mtu 1450 qdisc noqueue master cni0 state UP group default 283 | link/ether 32:df:db:fe:1c:bf brd ff:ff:ff:ff:ff:ff link-netnsid 1 284 | inet6 fe80::30df:dbff:fefe:1cbf/64 scope link 285 | valid_lft forever preferred_lft forever 286 | 12: veth4aab872e@if3: mtu 1450 qdisc noqueue master cni0 state UP group default 287 | link/ether 86:75:90:31:d8:82 brd ff:ff:ff:ff:ff:ff link-netnsid 5 288 | inet6 fe80::8475:90ff:fe31:d882/64 scope link 289 | valid_lft forever preferred_lft forever 290 | 13: veth01c2c6f8@if3: mtu 1450 qdisc noqueue master cni0 state UP group default 291 | link/ether 1a:12:82:18:8a:b2 brd ff:ff:ff:ff:ff:ff link-netnsid 2 292 | inet6 fe80::1812:82ff:fe18:8ab2/64 scope link 293 | valid_lft forever preferred_lft forever 294 | ``` 295 | 296 | 297 | 根据以往的经验,问题出在 vagrant 启动虚拟机时,个别网络启动失败导致。 298 | 299 | 300 | 重启虚拟机解决。 -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | kubeadm 1.13 安装高可用 kubernetes v1.13.1 集群 2 | --- 3 | 4 | 5 | ### [本仓库完整教程列表] 6 | 7 | 8 | * 一、[Kubeadmin 安装高可用 Kubernetes 集群](https://github.com/HikoQiu/kubeadm-install-k8s)(当前文章) 9 | * 二、基于 Kubernetes 集群部署推荐应用 10 | * [1. Wayne: 360 开源 kubernetes 多集群管理平台](/applications/Wayne/README.md) 11 | * [2. Weave Scope:实时监控 kubernetes 工具](/applications/Weave%20Scope/README.md) 12 | * [3. Helm:Kubernetes 包管理工具](/applications/helm/README.md) 13 | * [3.1 Monocular: Helm charts 仓库管理 WEB UI 工具](/applications/Monocular/README.md) 14 | * [4. NFS:Network File Storage](/applications/NFS/README.md) 15 | * 三、错误排查 16 | * [1. Pod CrashLoopBackOff/Error](/errors/1-pod_status_error.md) 17 | 18 | --- 19 | 20 | # kubeadm 1.13 安装高可用 kubernetes v1.13.1 集群 21 | 22 | ![kubernetes dashboard](./images/dashboard-index.png) 23 | 24 | 在开始前,先看 kubernetes dashboard 的图,提起来一点信心(备注:dashboard 组件在 k8s 正常运转中是可以不用的)。 25 | 26 | --- 27 | 28 | **本教程计划搭建的集群架构图如下:** 29 | 30 | ![k8s-ha](./images/k8s-ha.jpg) 31 | 32 | ## 目录 33 | 34 | * [一、环境准备](#%E4%B8%80%E7%8E%AF%E5%A2%83%E5%87%86%E5%A4%87) 35 | * [1\. 准备本地虚拟机](#1-%E5%87%86%E5%A4%87%E6%9C%AC%E5%9C%B0%E8%99%9A%E6%8B%9F%E6%9C%BA) 36 | * [2\. 虚拟机账号和 sudo 免密](#2-%E8%99%9A%E6%8B%9F%E6%9C%BA%E8%B4%A6%E5%8F%B7%E5%92%8C-sudo-%E5%85%8D%E5%AF%86) 37 | * [3\. 免密登录](#3-%E5%85%8D%E5%AF%86%E7%99%BB%E5%BD%95) 38 | * [4\. 关闭 SELinux 、关闭 swap 分区和配置 iptables](#4-%E5%85%B3%E9%97%AD-selinux-%E5%85%B3%E9%97%AD-swap-%E5%88%86%E5%8C%BA%E5%92%8C%E9%85%8D%E7%BD%AE-iptables) 39 | * [二、安装架构概览](#%E4%BA%8C%E5%AE%89%E8%A3%85%E6%9E%B6%E6%9E%84%E6%A6%82%E8%A7%88) 40 | * [三、安装步骤](#%E4%B8%89%E5%AE%89%E8%A3%85%E6%AD%A5%E9%AA%A4) 41 | * [1\. Docker 环境](#1-docker-%E7%8E%AF%E5%A2%83) 42 | * [2\. 安装 kubernetes yum 源和 kubelet、kubeadm、kubectl](#2-%E5%AE%89%E8%A3%85-kubernetes-yum-%E6%BA%90%E5%92%8C-kubeletkubeadmkubectl) 43 | * [3\. 
初始化 kubeadm 配置文件](#3-%E5%88%9D%E5%A7%8B%E5%8C%96-kubeadm-%E9%85%8D%E7%BD%AE%E6%96%87%E4%BB%B6) 44 | * [4\. 安装 master 镜像和执行 kubeadm 初始化](#4-%E5%AE%89%E8%A3%85-master-%E9%95%9C%E5%83%8F%E5%92%8C%E6%89%A7%E8%A1%8C-kubeadm-%E5%88%9D%E5%A7%8B%E5%8C%96) 45 | * [4\.1 拉所需镜像到本地](#41-%E6%8B%89%E6%89%80%E9%9C%80%E9%95%9C%E5%83%8F%E5%88%B0%E6%9C%AC%E5%9C%B0) 46 | * [4\.2 安装 master m01](#42-%E5%AE%89%E8%A3%85-master-m01) 47 | * [i\. kube 用户配置](#i-kube-%E7%94%A8%E6%88%B7%E9%85%8D%E7%BD%AE) 48 | * [ii\. 验证结果](#ii-%E9%AA%8C%E8%AF%81%E7%BB%93%E6%9E%9C) 49 | * [ii\. 安装 CNI 插件 flannel](#ii-%E5%AE%89%E8%A3%85-cni-%E6%8F%92%E4%BB%B6-flannel) 50 | * [4\.3 安装剩余 master](#43-%E5%AE%89%E8%A3%85%E5%89%A9%E4%BD%99-master) 51 | * [4\.3\.1 同步 m01 的 ca 证书](#431-%E5%90%8C%E6%AD%A5-m01-%E7%9A%84-ca-%E8%AF%81%E4%B9%A6) 52 | * [4\.3\.2 安装 master m02](#432-%E5%AE%89%E8%A3%85-master-m02) 53 | * [1\. 配置证书、初始化 kubelet 配置和启动 kubelet](#1-%E9%85%8D%E7%BD%AE%E8%AF%81%E4%B9%A6%E5%88%9D%E5%A7%8B%E5%8C%96-kubelet-%E9%85%8D%E7%BD%AE%E5%92%8C%E5%90%AF%E5%8A%A8-kubelet) 54 | * [2\. 将 etcd 加入集群](#2-%E5%B0%86-etcd-%E5%8A%A0%E5%85%A5%E9%9B%86%E7%BE%A4) 55 | * [3\. 启动 kube\-apiserver、kube\-controller\-manager、kube\-scheduler](#3-%E5%90%AF%E5%8A%A8-kube-apiserverkube-controller-managerkube-scheduler) 56 | * [4\. 将节点标记为 master 节点](#4--%E5%B0%86%E8%8A%82%E7%82%B9%E6%A0%87%E8%AE%B0%E4%B8%BA-master-%E8%8A%82%E7%82%B9) 57 | * [4\.3\.3 安装 master m03](#433-%E5%AE%89%E8%A3%85-master-m03) 58 | * [4\.4 验证三个 master 节点](#44-%E9%AA%8C%E8%AF%81%E4%B8%89%E4%B8%AA-master-%E8%8A%82%E7%82%B9) 59 | * [5\. 加入工作节点](#5-%E5%8A%A0%E5%85%A5%E5%B7%A5%E4%BD%9C%E8%8A%82%E7%82%B9) 60 | * [6\. 部署高可用 CoreDNS](#6-%E9%83%A8%E7%BD%B2%E9%AB%98%E5%8F%AF%E7%94%A8-coredns) 61 | * [7\. 部署监控组件 metrics\-server](#7-%E9%83%A8%E7%BD%B2%E7%9B%91%E6%8E%A7%E7%BB%84%E4%BB%B6-metrics-server) 62 | * [7\.1 部署 metrics\-server](#71-%E9%83%A8%E7%BD%B2-metrics-server) 63 | * [7\.2 遇到的问题](#72-%E9%81%87%E5%88%B0%E7%9A%84%E9%97%AE%E9%A2%98) 64 | * [7\.2\.1 指定 \-\-kubelet\-preferred\-address\-types](#721-%E6%8C%87%E5%AE%9A---kubelet-preferred-address-types) 65 | * [7\.2\.2 指定 \-\-kubelet\-insecure\-tls](#722-%E6%8C%87%E5%AE%9A---kubelet-insecure-tls) 66 | * [8\. 部署 Ingress,服务暴露](#8-%E9%83%A8%E7%BD%B2-ingress%E6%9C%8D%E5%8A%A1%E6%9A%B4%E9%9C%B2) 67 | * [8\.1 必知知识点](#81-%E5%BF%85%E7%9F%A5%E7%9F%A5%E8%AF%86%E7%82%B9) 68 | * [8\.2 部署 Nginx\-ingress\-controller](#82-%E9%83%A8%E7%BD%B2-nginx-ingress-controller) 69 | * [9\. 部署 kubernetes\-dashboard](#9-%E9%83%A8%E7%BD%B2-kubernetes-dashboard) 70 | * [9\.1 Dashboard 配置](#91-dashboard-%E9%85%8D%E7%BD%AE) 71 | * [9\.2 HTTPS 访问 Dashboard](#92-https-%E8%AE%BF%E9%97%AE-dashboard) 72 | * [9\.3 登录 Dashboard](#93-%E7%99%BB%E5%BD%95-dashboard) 73 | * [9\.4 404 问题](#94-404-%E9%97%AE%E9%A2%98) 74 | 75 | # 一、环境准备 76 | 77 | | 环境| 简介 | 78 | | -- |--| 79 | | 环境 | Vagrant + virtural box | 80 | |系统| Centos 7 | 81 | | kubeadm | v1.3 | 82 | | kubernetes | v1.13.1 | 83 | | docker | v1.13.1,官方推荐使用 18.06,不过1.11, 1.12, 1.13 and 17.03 也会很好地运行, 见:https://kubernetes.io/docs/setup/cri/ | 84 | 85 | ### 1. 
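虚拟机由 Vagrant + VirtualBox 管理,开始之前可以先确认 5 台虚拟机都已启动(下面的命令仅作示意,虚拟机名称以实际 Vagrantfile 中的定义为准):

```
# 宿主机上启动并查看虚拟机状态(示意)
vagrant up
vagrant status

# 登录其中一台,确认系统为 CentOS 7(假设 Vagrantfile 中定义的名称是 m01)
vagrant ssh m01 -c "cat /etc/redhat-release"
```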
准备本地虚拟机 86 | 87 | |主机名| IP | 配置 | 备注| 88 | | -- |--| -- | --| 89 | | m01 | 192.168.33.10 | 2核2G | master、同时作为 etcd 节点 | 90 | | m02 | 192.168.33.11 | 2核2G| master、同时作为 etcd 节点 | 91 | | m03 | 192.168.33.12 | 2核2G| master、同时作为 etcd 节点 | 92 | |n01| 192.168.33.20 | 2核2G| 工作节点 node,容器编排最终 pod 工作节点 | 93 | |n02| 192.168.33.21 | 2核2G| 工作节点 node,容器编排最终 pod 工作节点 | 94 | 95 | 96 | 为了方便后面操作,配置 m01 m02 m03 n01 n02 的 `/etc/hosts`,如下: 97 | 98 | 99 | ``` 100 | # sudo vi /etc/hosts 101 | 102 | 192.168.33.10 m01 api.k8s.hiko.im 103 | 192.168.33.11 m02 104 | 192.168.33.12 m03 105 | 192.168.33.20 n01 106 | 192.168.33.21 n02 107 | ``` 108 | 109 | 110 | ### 2. 虚拟机账号和 sudo 免密 111 | 112 | 在每台虚拟机上,创建 kubernetes 集群统一用户:`kube` 113 | 114 | ``` 115 | # useradd kube 116 | # visudo 117 | ``` 118 | 119 | 备注:通过 visudo 把用户 kube 加到 sudo 免密。
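
例如,可以在 visudo 打开的 sudoers 文件末尾追加下面一行(仅为示意写法,具体策略请按自己环境的安全要求调整),让 kube 用户免密执行 sudo:

```
## visudo 打开的 /etc/sudoers 中追加(示意)
kube    ALL=(ALL)       NOPASSWD: ALL
```

配置完成后,可以切换到 kube 用户执行任意一条 `sudo` 命令(或 `sudo -l`)确认不再提示输入密码。
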
120 | 121 | 122 | ### 3. 免密登录 123 | 124 | 为了方便后续操作,给 m01 配置免密登录到 m01、m02、m03、n01、n02 125 | 126 | 具体操作: 127 | 128 | i. 先登录进 `m01` 虚拟机,然后执行以下命令,生成配置免密登录所需的 ssh 公钥: 129 | 130 | ```` 131 | ## 为 root 生成 ssh 公钥和私钥 132 | sudo su - 133 | ssh-keygen 134 | # 备注:直接一路回车,在 ~/.ssh 目录下生成公钥和私钥。 135 | 136 | ## 为 kube 生成 ssh 公钥和私钥 137 | sudo su - kube 138 | ssh-keygen 139 | # 备注:直接一路回车,在 ~/.ssh 目录下生成公钥和私钥。 140 | ```` 141 | 142 | ii. 配置免密登录 143 | 144 | 为 m01 的 kube 和 root 账号配置免密登录,让 m01 上的 kube 可以免密登录到其他虚拟机的 kube 账号、m01 上的 root 可以免密登录到其他虚拟机的 root 账号。 145 | 146 | 在 m01 上,以 `kube` 账号,依次执行 ssh-copy-id,如下: 147 | ```` 148 | sudo su - kube 149 | ssh-copy-id kube@m01 150 | ssh-copy-id kube@m02 151 | ssh-copy-id kube@m03 152 | ssh-copy-id kube@n01 153 | ssh-copy-id kube@n02 154 | ```` 155 | 156 | 验证免密登录是否配置成功,在 m01 上依次测试免密登录是否成功,如: 157 | 158 | ```` 159 | ## m01 虚拟机 160 | ssh kube@m01 161 | ssh kube@m02 162 | ssh kube@m03 163 | ssh kube@n01 164 | ssh kube@n02 165 | ```` 166 | 167 | 如果能正常免密登录到对应的虚拟机,表示配置通过。如果测试不通过,请先检查配置或重新配置,直到正常。 168 | 169 | 同理,配置 m01 的 root 免密登录到其他机器的 root 账号。如果不知道其他机器 root 账号的密码,也可以手动拷贝 m01 root 账号的公钥文件的内容( `/root/.ssh/id_rsa.pub` ),复制到其他机器的 /root/.ssh/authorized_keys 文件中。 170 | 171 | 提示:如果其他机器上的 root 下的 /root/.ssh/authorized_keys 不存在,可以手动创建。要注意的是:authorized_keys 的权限需要是 600。 172 | 173 | ``` 174 | ## 如果 authorized_keys 的权限不是 600,执行修改权限的命令。 175 | chmod 600 authorized_keys 176 | ``` 177 | 178 | ### 4. 关闭 SELinux 、关闭 swap 分区和配置 iptables 179 | 180 | 需要关闭 SELinux,避免安装过程中出现 Permission Denied;需要关闭 swap 分区,否则 kubelet 无法启动。 181 | 通过脚本统一配置所有机器。 182 | 183 | ``` 184 | ## 创建脚本:init.sys.config.sh 185 | 186 | #!/bin/sh 187 | 188 | vhost="m01 m02 m03 n01 n02" 189 | 190 | 191 | # 新建 iptables 配置修改文件 192 | cat <<EOF > net.iptables.k8s.conf 193 | net.bridge.bridge-nf-call-ip6tables = 1 194 | net.bridge.bridge-nf-call-iptables = 1 195 | EOF 196 | 197 | for h in $vhost 198 | do 199 | 200 | echo "--> $h" 201 | 202 | # 1. 关闭 swap 分区 203 | # swap 不关闭,kubelet 无法启动 204 | # 也可以通过将参数 --fail-swap-on 设置为 false 来忽略 swap on 205 | ssh kube@$h "sudo swapoff -a" 206 | echo "sudo swapoff -a -- ok" 207 | 208 | # 防止开机自动挂载 swap 分区,注释掉配置 209 | ssh kube@$h "sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab" 210 | echo "Comment swap config file modified -- ok" 211 | 212 | 213 | # 2. 关闭 SELinux 214 | # 否则后续 k8s 挂载目录时可能报错:Permission Denied 215 | ssh kube@$h "sudo setenforce 0" 216 | echo "sudo setenforce 0 -- ok" 217 | 218 | # 防止开机后重新开启,修改 SELINUX 配置 219 | ssh kube@$h "sudo sed -i s'/SELINUX=enforcing/SELINUX=disabled'/g /etc/selinux/config" 220 | echo "Disabled selinux -- ok" 221 | 222 | # 3. 配置 iptables 223 | scp net.iptables.k8s.conf kube@$h:~ 224 | ssh kube@$h "sudo mv net.iptables.k8s.conf /etc/sysctl.d/ && sudo sysctl --system" 225 | 226 | # 安装 wget 227 | ssh kube@$h "sudo yum install -y wget" 228 | 229 | done 230 | ``` 231 | 232 | 执行脚本: 233 | 234 | 235 | ``` 236 | chmod +x ./init.sys.config.sh 237 | ./init.sys.config.sh 238 | ```
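
脚本执行完之后,建议抽查一两台机器,确认配置确实已经生效。下面是一个验证示意(命令输出以实际环境为准;如果提示找不到 net.bridge 相关内核参数,可以先执行 `sudo modprobe br_netfilter` 再重试):

```
## 登录任意一台虚拟机验证
getenforce                                   # 预期输出 Permissive(重启后为 Disabled)
swapon -s                                    # 预期没有任何输出,表示 swap 已关闭
sysctl net.bridge.bridge-nf-call-iptables    # 预期输出 net.bridge.bridge-nf-call-iptables = 1
```
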
239 | 240 | # 二、安装架构概览 241 | 242 | 243 | ![k8s-ha](./images/k8s-ha.jpg) 244 | 245 | 246 | # 三、安装步骤 247 | 248 | 249 | ## 1. Docker 环境 250 | 251 | CentOS 默认 yum 源的 docker 版本是 1.13,能支持 kubernetes。如果需要更新 Docker,请参考 Docker 官方文档指导,安装最新版的 Docker。参考:https://docs.docker.com/install/linux/docker-ce/centos/ 252 | 253 | ``` 254 | sudo yum install -y docker 255 | ``` 256 | 257 | 为了方便操作,使用脚本从 m01 上免密登录,进行遍历安装所有其他节点的 Docker 环境。(下面各步骤中,无特殊声明,都是在 m01 上创建脚本和执行脚本) 258 | 259 | 260 | ``` 261 | ## 创建脚本: install.docker.sh 262 | 263 | #!/bin/sh 264 | 265 | vhosts="m01 m02 m03 n01 n02" 266 | 267 | for h in $vhosts 268 | do 269 | echo "Install Docker for $h" 270 | ssh kube@$h "sudo yum install -y docker && sudo systemctl enable docker && sudo systemctl start docker" 271 | done 272 | 273 | ``` 274 | 275 | 执行 Docker 安装和启动: 276 | 277 | ``` 278 | chmod +x install.docker.sh 279 | 280 | ./install.docker.sh 281 | ``` 282 | 283 | 284 | 登录各机器确认 Docker 是否已经安装并且已启动。如果存在失败情况,请调试至正常。 285 | 286 | 287 | ## 2. 安装 kubernetes yum 源和 kubelet、kubeadm、kubectl 288 | 289 | 所有机器上配置 kubernetes.repo yum 源,并在 m01、m02、m03、n01、n02 上安装 kubelet、kubeadm、kubectl,安装完之后,启动 m01 的 kubelet,详细安装脚本如下: 290 | 291 | ```` 292 | ## 创建脚本: install.k8s.repo.sh 293 | 294 | #!/bin/sh 295 | 296 | vhost="m01 m02 m03 n01 n02" 297 | 298 | ## 1. 阿里云 kubernetes 仓库 299 | cat <<EOF > kubernetes.repo 300 | [kubernetes] 301 | name=Kubernetes 302 | baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/ 303 | enabled=1 304 | gpgcheck=1 305 | repo_gpgcheck=1 306 | gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg 307 | EOF 308 | 309 | mvCmd="sudo cp ~/kubernetes.repo /etc/yum.repos.d/" 310 | for h in $vhost 311 | do 312 | echo "Setup kubernetes repository for $h" 313 | scp ./kubernetes.repo kube@$h:~ 314 | ssh kube@$h $mvCmd 315 | done 316 | 317 | ## 2. 安装 kubelet kubeadm kubectl 318 | installCmd="sudo yum install -y kubelet kubeadm kubectl && sudo systemctl enable kubelet" 319 | for h in $vhost 320 | do 321 | echo "Install kubelet kubeadm kubectl for : $h" 322 | ssh kube@$h $installCmd 323 | done 324 | 325 | ```` 326 | 327 | 执行 `install.k8s.repo.sh` ,完成仓库配置和 kubelet、kubeadm、kubectl 的安装。 328 | 329 | 330 | ``` 331 | chmod +x install.k8s.repo.sh 332 | 333 | ./install.k8s.repo.sh 334 | ``` 335 | 336 | 其中,master m01、m02、m03 和 node n01、n02 都会安装 kubelet、kubeadm、kubectl(master 三个组件都会用到;工作节点主要用到 kubelet 和 kubeadm,kubectl 可按需保留)。 337 | 338 | 安装完 k8s 的各系统组件之后,启动 m01 的 kubelet: 339 | 340 | ``` 341 | # 启动 m01 的 kubelet 342 | sudo systemctl start kubelet 343 | ``` 344 | 345 | // @TODO 打印启动结果 346 | 347 | ## 3. 初始化 kubeadm 配置文件 348 | 349 | 创建三台 master 机器 m01 m02 m03 的 kubeadm 配置文件,其中主要是配置生成证书的域配置、etcd 集群配置。 350 | 351 | 提示:可以通过以下命令查看一份完整的 kubeadm 配置文件的示例: 352 | 353 | ```` 354 | kubeadm config print init-defaults --component-configs KubeProxyConfiguration 355 | ```` 356 | 357 | 以下脚本主要是生成各自的配置文件并分发到 m01、m02、m03 上。 358 | 359 | 配置文件中指定配置高可用的 apiServer、证书和高可用 Etcd,参考 v1.13 的配置文档: 360 | 361 | 1. 高可用 apiServer :[https://kubernetes.io/docs/setup/independent/high-availability/](https://kubernetes.io/docs/setup/independent/high-availability) 362 | 363 | 2. 高可用 etcd:[https://kubernetes.io/docs/setup/independent/setup-ha-etcd-with-kubeadm/](https://kubernetes.io/docs/setup/independent/setup-ha-etcd-with-kubeadm/) 364 | 365 | ```` 366 | ## 创建脚本: init.kubeadm.config.sh 367 | 368 | #!/bin/sh 369 | 370 | ## 1. 配置参数 371 | ## vhost 主机名和 vhostIP IP 一一对应 372 | vhost=(m01 m02 m03) 373 | vhostIP=(192.168.33.10 192.168.33.11 192.168.33.12) 374 | 375 | domain=api.k8s.hiko.im 376 | 377 | ## etcd 初始化 m01 m02 m03 集群配置 378 | etcdInitCluster=( 379 | m01=https://192.168.33.10:2380 380 | m01=https://192.168.33.10:2380,m02=https://192.168.33.11:2380 381 | m01=https://192.168.33.10:2380,m02=https://192.168.33.11:2380,m03=https://192.168.33.12:2380 382 | ) 383 | 384 | ## etcd 初始化时,m01 m02 m03 分别的初始化集群状态 385 | initClusterStatus=( 386 | new 387 | existing 388 | existing 389 | ) 390 | 391 | 392 | ## 2. 遍历 master 主机名和对应 IP 393 | ## 生成对应的 kubeadm 配置文件 394 | for i in `seq 0 $((${#vhost[*]}-1))` 395 | do 396 | 397 | h=${vhost[${i}]} 398 | ip=${vhostIP[${i}]} 399 | 400 | echo "--> $h - $ip" 401 | 402 | ## 生成 kubeadm 配置模板 403 | cat <<EOF > kubeadm-config.$h.yaml 404 | apiVersion: kubeadm.k8s.io/v1beta1 405 | kind: InitConfiguration 406 | localAPIEndpoint: 407 | advertiseAddress: $ip 408 | bindPort: 6443 409 | --- 410 | apiVersion: kubeadm.k8s.io/v1beta1 411 | kind: ClusterConfiguration 412 | kubernetesVersion: v1.13.1 413 | 414 | # 指定阿里云镜像仓库 415 | imageRepository: registry.aliyuncs.com/google_containers 416 | 417 | # apiServerCertSANs 填所有的 masterip、lbip、其它可能需要通过它访问 apiserver 的地址、域名或主机名等, 418 | # 如阿里云的浮动 IP(FIP)等,证书中会允许这些 IP 419 | # 这里填一个自定义的域名 420 | apiServer: 421 | certSANs: 422 | - "$domain" 423 | controlPlaneEndpoint: "$domain:6443" 424 | 425 | ## Etcd 配置 426 | etcd: 427 | local: 428 | extraArgs: 429 | listen-client-urls: "https://127.0.0.1:2379,https://$ip:2379" 430 | advertise-client-urls: "https://$ip:2379" 431 | listen-peer-urls: "https://$ip:2380" 432 | initial-advertise-peer-urls: "https://$ip:2380" 433 | initial-cluster: "${etcdInitCluster[${i}]}" 434 | initial-cluster-state: ${initClusterStatus[${i}]} 435 | serverCertSANs: 436 | - $h 437 | - $ip 438 | peerCertSANs: 439 | - $h 440 | - $ip 441 | networking: 442 | podSubnet: "10.244.0.0/16" 443 | 444 | EOF 445 | 446 | echo "kubeadm-config.$h.yaml created ... ok" 447 | 448 | ## 3. 分发到其他 master 机器 449 | scp kubeadm-config.$h.yaml kube@$h:~ 450 | echo "scp kubeadm-config.$h.yaml ... ok" 451 | 452 | done 453 | 454 | ```` 455 | 456 | 执行成功之后,可以在 m01 m02 m03 的 kube 用户的 home 目录(/home/kube)下看到对应的 kubeadm-config.m0*.yaml 配置文件。 457 | 这个配置文件主要是用于后续初始化集群其他 master 的证书、etcd 配置、kubelet 配置、kube-apiserver 配置、kube-controller-manager 配置等。 458 | 459 | 各 master 机器对应的 kubeadm 配置文件: 460 | ``` 461 | 虚拟机 m01:kubeadm-config.m01.yaml 462 | 虚拟机 m02:kubeadm-config.m02.yaml 463 | 虚拟机 m03:kubeadm-config.m03.yaml 464 | ``` 465 | 466 | ## 4.
安装 master 镜像和执行 kubeadm 初始化 467 | 468 | ### 4.1 拉所需镜像到本地 469 | 470 | 因为 k8s.gcr.io 国内无法访问,我们可以选择通过阿里云的镜像仓库(kubeadm-config.m0*.yaml 配置文件中已经指定使用阿里云镜像仓库 ` registry.aliyuncs.com/google_containers`),将所需的镜像 pull 到本地。 471 | 472 | 在 m01 上,通过命令 `kubeadm config images list ` 查看所需的镜像,可以看到如下结果: 473 | 474 | ``` 475 | kubeadm config images list --config kubeadm-config.m01.yaml 476 | 477 | # 控制台打印结果: 478 | registry.aliyuncs.com/google_containers/kube-apiserver:v1.13.1 479 | registry.aliyuncs.com/google_containers/kube-controller-manager:v1.13.1 480 | registry.aliyuncs.com/google_containers/kube-scheduler:v1.13.1 481 | registry.aliyuncs.com/google_containers/kube-proxy:v1.13.1 482 | registry.aliyuncs.com/google_containers/pause:3.1 483 | registry.aliyuncs.com/google_containers/etcd:3.2.24 484 | registry.aliyuncs.com/google_containers/coredns:1.2.6 485 | ``` 486 | 487 | 接着,分别在 m01 m02 m03 上将镜像拉到本地,具体操作如下脚本: 488 | 489 | ``` 490 | ## 创建脚本:kubeadm.images.sh 491 | 492 | #!/bin/sh 493 | 494 | vhost="m01 m02 m03" 495 | 496 | for h in $vhost;do 497 | echo "Pull image for $h -- begings" 498 | sudo kubeadm config images pull --config kubeadm-config.$h.yaml 499 | done 500 | ``` 501 | 502 | 执行脚本 `kubeadm.images.sh` 拉镜像。 503 | 504 | ```` 505 | chmod +x kubeadm.images.sh 506 | 507 | ./kubeadm.images.sh 508 | ```` 509 | 510 | 执行完之后,可以登录 m01 m02 m03 查看各自本地 docker 镜像,将看到所需要的镜像已经拉到本地: 511 | 512 | ```` 513 | [kube@m01 shells]$ sudo docker images 514 | REPOSITORY TAG IMAGE ID CREATED SIZE 515 | k8s.gcr.io/kube-proxy v1.13.1 fdb321fd30a0 7 days ago 80.2 MB 516 | registry.aliyuncs.com/google_containers/kube-proxy v1.13.1 fdb321fd30a0 7 days ago 80.2 MB 517 | registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy v1.13.1 fdb321fd30a0 7 days ago 80.2 MB 518 | k8s.gcr.io/kube-apiserver v1.13.1 40a63db91ef8 7 days ago 181 MB 519 | registry.aliyuncs.com/google_containers/kube-apiserver v1.13.1 40a63db91ef8 7 days ago 181 MB 520 | registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver v1.13.1 40a63db91ef8 7 days ago 181 MB 521 | k8s.gcr.io/kube-controller-manager v1.13.1 26e6f1db2a52 7 days ago 146 MB 522 | registry.aliyuncs.com/google_containers/kube-controller-manager v1.13.1 26e6f1db2a52 7 days ago 146 MB 523 | registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager v1.13.1 26e6f1db2a52 7 days ago 146 MB 524 | k8s.gcr.io/kube-scheduler v1.13.1 ab81d7360408 7 days ago 79.6 MB 525 | registry.aliyuncs.com/google_containers/kube-scheduler v1.13.1 ab81d7360408 7 days ago 79.6 MB 526 | registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler v1.13.1 ab81d7360408 7 days ago 79.6 MB 527 | k8s.gcr.io/coredns 1.2.6 f59dcacceff4 6 weeks ago 40 MB 528 | registry.aliyuncs.com/google_containers/coredns 1.2.6 f59dcacceff4 6 weeks ago 40 MB 529 | registry.cn-hangzhou.aliyuncs.com/google_containers/coredns 1.2.6 f59dcacceff4 6 weeks ago 40 MB 530 | k8s.gcr.io/etcd 3.2.24 3cab8e1b9802 3 months ago 220 MB 531 | registry.aliyuncs.com/google_containers/etcd 3.2.24 3cab8e1b9802 3 months ago 220 MB 532 | registry.cn-hangzhou.aliyuncs.com/google_containers/etcd 3.2.24 3cab8e1b9802 3 months ago 220 MB 533 | k8s.gcr.io/pause 3.1 da86e6ba6ca1 12 months ago 742 kB 534 | registry.aliyuncs.com/google_containers/pause 3.1 da86e6ba6ca1 12 months ago 742 kB 535 | registry.cn-hangzhou.aliyuncs.com/google_containers/pause 3.1 da86e6ba6ca1 12 months ago 742 kB 536 | 537 | ```` 538 | 539 | ### 4.2 安装 master m01 540 | 541 | 我们目标是要搭建一个高可用的 master 集群,所以需要在三台 master m01 m02 m03机器上分别通过 kubeadm 进行初始化。 542 | 543 | 由于 m02 和 
m03 的初始化需要依赖 m01 初始化成功后所生成的证书文件,所以这里需要先在 m01 初始化。 544 | 545 | 546 | ``` 547 | ## 登到 m01 虚拟机 548 | 549 | ## 执行初始化命令,其中 kubeadm-config.m01.yaml 是上一步创建和分发的 kubeadm 初始化所需的配置文件 550 | 551 | sudo kubeadm init --config kubeadm-config.m01.yaml 552 | 553 | ``` 554 | 555 | 初始化成功之后,你会看到打出类似下面的日志: 556 | 557 | 备注:如果初始化失败,需要重试,可以通过 `sudo kubeadm reset --force` 重置之前 kubeadm init 命令的执行结果,恢复一个干净的环境。 558 | 559 | 560 | ``` 561 | [init] Using Kubernetes version: v1.13.1 562 | [preflight] Running pre-flight checks 563 | [preflight] Pulling images required for setting up a Kubernetes cluster 564 | [preflight] This might take a minute or two, depending on the speed of your internet connection 565 | [preflight] You can also perform this action in beforehand using 'kubeadm config images pull' 566 | [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env" 567 | [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml" 568 | [kubelet-start] Activating the kubelet service 569 | [certs] Using certificateDir folder "/etc/kubernetes/pki" 570 | [certs] Generating "ca" certificate and key 571 | [certs] Generating "apiserver" certificate and key 572 | [certs] apiserver serving cert is signed for DNS names [m01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local api.k8s.hiko.im api.k8s.hiko.im] and IPs [10.96.0.1 10.0.2.15] 573 | [certs] Generating "apiserver-kubelet-client" certificate and key 574 | [certs] Generating "front-proxy-ca" certificate and key 575 | [certs] Generating "front-proxy-client" certificate and key 576 | [certs] Generating "etcd/ca" certificate and key 577 | [certs] Generating "etcd/server" certificate and key 578 | [certs] etcd/server serving cert is signed for DNS names [m01 localhost m01] and IPs [10.0.2.15 127.0.0.1 ::1 192.168.33.10] 579 | [certs] Generating "etcd/peer" certificate and key 580 | [certs] etcd/peer serving cert is signed for DNS names [m01 localhost m01] and IPs [10.0.2.15 127.0.0.1 ::1 192.168.33.10] 581 | [certs] Generating "etcd/healthcheck-client" certificate and key 582 | [certs] Generating "apiserver-etcd-client" certificate and key 583 | [certs] Generating "sa" key and public key 584 | [kubeconfig] Using kubeconfig folder "/etc/kubernetes" 585 | [kubeconfig] Writing "admin.conf" kubeconfig file 586 | [kubeconfig] Writing "kubelet.conf" kubeconfig file 587 | [kubeconfig] Writing "controller-manager.conf" kubeconfig file 588 | [kubeconfig] Writing "scheduler.conf" kubeconfig file 589 | [control-plane] Using manifest folder "/etc/kubernetes/manifests" 590 | [control-plane] Creating static Pod manifest for "kube-apiserver" 591 | [control-plane] Creating static Pod manifest for "kube-controller-manager" 592 | [control-plane] Creating static Pod manifest for "kube-scheduler" 593 | [etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests" 594 | [wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". 
This can take up to 4m0s 595 | [apiclient] All control plane components are healthy after 19.009523 seconds 596 | [uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace 597 | [kubelet] Creating a ConfigMap "kubelet-config-1.13" in namespace kube-system with the configuration for the kubelets in the cluster 598 | [patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "m01" as an annotation 599 | [mark-control-plane] Marking the node m01 as control-plane by adding the label "node-role.kubernetes.io/master=''" 600 | [mark-control-plane] Marking the node m01 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule] 601 | [bootstrap-token] Using token: a1t7c1.mzltpc72dc3wzj9y 602 | [bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles 603 | [bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials 604 | [bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token 605 | [bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster 606 | [bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace 607 | [addons] Applied essential addon: CoreDNS 608 | [addons] Applied essential addon: kube-proxy 609 | 610 | Your Kubernetes master has initialized successfully! 611 | 612 | To start using your cluster, you need to run the following as a regular user: 613 | 614 | mkdir -p $HOME/.kube 615 | sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config 616 | sudo chown $(id -u):$(id -g) $HOME/.kube/config 617 | 618 | You should now deploy a pod network to the cluster. 619 | Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at: 620 | https://kubernetes.io/docs/concepts/cluster-administration/addons/ 621 | 622 | You can now join any number of machines by running the following on each node 623 | as root: 624 | 625 | kubeadm join api.k8s.hiko.im:6443 --token a1t7c1.mzltpc72dc3wzj9y --discovery-token-ca-cert-hash sha256:05f44b111174613055975f012fc11fe09bdcd746bd7b3c8d99060c52619f8738 626 | 627 | 628 | ``` 629 | 630 | 至此就完成了第一台 master m01 的初始化。 631 | 632 | 633 | > 初始化成功的信息解释: 634 | > token 是使用指令 kubeadm token generate 生成的,执行过程如有异常,用命令kubeadm reset 初始化后重试,生成的 token 有效时间为 24 小时,超过 24 小时后需要重新使用命令 kubeadm token create 创建新的 token。 635 | > discovery-token-ca-cert-hash 的值可以使用命令查看,命令:openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //' 636 | > 637 | > 参考:http://cloudnil.com/2018/12/14/Deploy-kubernetes(1.13.1)-HA-with-kubeadm/ 638 | 639 | 为了让 m01 的 kube 用户能通过 kubectl 管理集群,接着我们需要给 m01 的 kube 用户配置管理集群的配置。 640 | 641 | > 上面初始化成功中提到的: 642 | > To start using your cluster, you need to run the following as a regular user: 643 | > 644 | > mkdir -p $HOME/.kube 645 | > sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config 646 | > sudo chown $(id -u):$(id -g) $HOME/.kube/config 647 | 648 | 649 | #### i. 
kube 用户配置 650 | 651 | 这里同样通过一个脚本来执行 kube 用户的配置工作(当然,你一行行命令输入也行)。 652 | 653 | 在 m01 虚拟机的 kube 用户上创建 config.using.cluster.sh 的脚本,如下: 654 | 655 | ``` 656 | ## 创建脚本:config.using.cluster.sh 657 | 658 | #!/bin/sh 659 | 660 | # 为 kube 用户配置 661 | mkdir -p $HOME/.kube 662 | sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config 663 | sudo chown $(id -u):$(id -g) $HOME/.kube/config 664 | 665 | ``` 666 | 667 | 执行脚本: 668 | 669 | ``` 670 | chmod +x config.using.cluster.sh 671 | ./config.using.cluster.sh 672 | ``` 673 | 674 | 675 | 执行成功将看到: 676 | ``` 677 | [kube@m01 shells]$ ./config.using.cluster.sh 678 | Finish user kube config ... ok 679 | ``` 680 | #### ii. 验证结果 681 | 682 | 通过 kubectl 查看集群状态,将看到: 683 | 684 | ``` 685 | [kube@m01 ~]$ kubectl cluster-info 686 | 687 | Kubernetes master is running at https://api.k8s.hiko.im:6443 688 | KubeDNS is running at https://api.k8s.hiko.im:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy 689 | ``` 690 | 691 | 通过 kubectl 查看集群所有的 Pod: 692 | 693 | ```` 694 | [kube@m01 ~]$ kubectl get pods --all-namespaces 695 | 696 | NAMESPACE NAME READY STATUS RESTARTS AGE 697 | kube-system coredns-78d4cf999f-cw79l 0/1 Pending 0 47m 698 | kube-system coredns-78d4cf999f-w8j47 0/1 Pending 0 47m 699 | kube-system etcd-m01 1/1 Running 0 47m 700 | kube-system kube-apiserver-m01 1/1 Running 0 46m 701 | kube-system kube-controller-manager-m01 1/1 Running 0 46m 702 | kube-system kube-proxy-5954k 1/1 Running 0 47m 703 | kube-system kube-scheduler-m01 1/1 Running 0 47m 704 | ```` 705 | 备注:因为还未装 flannel 网络组件,所以暂时 coredns 的 Pod 显示为 `Pending`,暂时不影响 。 706 | 707 | 708 | #### ii. 安装 CNI 插件 flannel 709 | 710 | 我们同样把整个安装过程放在脚本里面来执行。 711 | 712 | 提醒:如果你的虚拟机环境跟我一样,vagrant + virtualbox 开的虚拟机,默认每个虚拟机会有两个网口(可以通过命令:`ip addr` 查看,其中 10.0.2.15/24 是宿主机和虚拟机 NAT 使用,不能作为 flannel 的网口使用,所以在配置 flannel 的 yaml 配置文件时,需要通过 `-iface` 指定使用虚拟机的局域网的网口,也就是 192.168.33.10/24 这个网口 eth1)。 713 | 714 | 我的环境的网口信息如下: 715 | 716 | ``` 717 | [kube@m01 ~]$ ip addr 718 | 1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 719 | link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 720 | inet 127.0.0.1/8 scope host lo 721 | valid_lft forever preferred_lft forever 722 | inet6 ::1/128 scope host 723 | valid_lft forever preferred_lft forever 724 | 2: eth0: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 725 | link/ether 52:54:00:84:81:d5 brd ff:ff:ff:ff:ff:ff 726 | inet 10.0.2.15/24 brd 10.0.2.255 scope global noprefixroute dynamic eth0 727 | valid_lft 80510sec preferred_lft 80510sec 728 | inet6 fe80::5054:ff:fe84:81d5/64 scope link 729 | valid_lft forever preferred_lft forever 730 | 3: eth1: mtu 1500 qdisc pfifo_fast state UP group default qlen 1000 731 | link/ether 08:00:27:e2:15:5d brd ff:ff:ff:ff:ff:ff 732 | inet 192.168.33.10/24 brd 192.168.33.255 scope global noprefixroute eth1 733 | valid_lft forever preferred_lft forever 734 | inet6 fe80::a00:27ff:fee2:155d/64 scope link 735 | valid_lft forever preferred_lft forever 736 | ``` 737 | 738 | 在 m01 上安装 flannel(只需要在 m01 安装一次就行,主要是分配 Pod 网段,后续其他 master 加入集群后,会自动获取到集群的 Pod 网段)。 739 | 740 | 741 | 这里选择的 flannel yaml 配置文件是: https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml。 742 | 743 | 其中指定的镜像 `quay.io/coreos/flannel:v0.10.0-amd64` 国内无法拉取,所以这里使用 docker hub 上其他用户上传的镜像。 744 | 745 | 先 docker pull 拉下来后,使用 docker tag 打出 `quay.io/coreos/flannel:v0.10.0-amd64`,具体操作见脚本。 746 | 747 | 另外,这个 kube-flannel 配置文件没有指定使用 eth1 网口,所以我这里是先把配置文件保存下来,然后添加 -iface 配置。 748 | 749 | // @TODO 添加配置文件具体 url 750 | 751 | 同样,我们通过脚本来执行,如下: 752 | 753 | 
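
在执行安装脚本之前,先补充说明上面提到的 -iface 修改。下面是一个修改示意(以 flannel v0.10.0 的 kube-flannel.yml 为例节选,容器名、command/args 的具体写法以实际下载到的文件为准),核心是在 flanneld 容器的启动参数中追加 `--iface=eth1`:

```
# kube-flannel.yml 中 DaemonSet 的 flannel 容器部分(节选、示意)
containers:
- name: kube-flannel
  image: quay.io/coreos/flannel:v0.10.0-amd64
  command:
  - /opt/bin/flanneld
  args:
  - --ip-masq
  - --kube-subnet-mgr
  - --iface=eth1        # 新增:指定使用虚拟机局域网网口 eth1
```

修改并保存 kube-flannel.yml 之后,再执行下面的安装脚本:
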
```` 754 | ## 创建脚本:install.flannel.sh 755 | 756 | #!/bin/sh 757 | 758 | # 安装 Pod 网络插件 759 | # 这里选择的是 flannel v0.10.0 版本 760 | # 如果想用其他版本,可以替换url 761 | 762 | # 备注:kube-flannel.yml(下面配置的 yaml)中指定的是 quay.io 的镜像。 763 | # 因为国内无法拉 quay.io 的镜像,所以这里从 docker hub 拉去相同镜像, 764 | # 然后打 tag 为 kube-flannel.yml 中指定的 quay.io/coreos/flannel:v0.10.0-amd64 765 | # 再备注:flannel 是所有节点(master 和 node)都需要的网络组件,所以后面其他节点也可以通过相同方式安装 766 | 767 | sudo docker pull jmgao1983/flannel:v0.10.0-amd64 768 | sudo docker tag jmgao1983/flannel:v0.10.0-amd64 quay.io/coreos/flannel:v0.10.0-amd64 769 | 770 | # 安装 flannel 771 | # 如果无法下载,可以使用我提供的 kube-flannel.yml,是一样的 772 | # wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml 773 | 774 | 775 | 776 | # kubelet apply 进行安装 flannel 777 | kubectl apply -f ./kube-flannel.yml 778 | 779 | ```` 780 | 781 | 执行 install.flannel.sh 进行安装,如下: 782 | 783 | ``` 784 | chmod +x install.flannel.sh 785 | ./install.flannel.sh 786 | ``` 787 | 788 | 安装成功之后,通过 `kubectl get pods --all-namespaces`,看到所有 Pod 都正常了。 789 | 790 | ``` 791 | [kube@m01 shells]$ kubectl get pods --all-namespaces 792 | NAMESPACE NAME READY STATUS RESTARTS AGE 793 | kube-system coredns-78d4cf999f-j8j29 1/1 Running 0 59m 794 | kube-system coredns-78d4cf999f-t7bmm 1/1 Running 0 59m 795 | kube-system etcd-m01 1/1 Running 0 58m 796 | kube-system kube-apiserver-m01 1/1 Running 0 58m 797 | kube-system kube-controller-manager-m01 1/1 Running 0 59m 798 | kube-system kube-flannel-ds-amd64-v48p5 1/1 Running 0 18m 799 | kube-system kube-flannel-ds-glvzm 1/1 Running 0 18m 800 | kube-system kube-proxy-tk9v2 1/1 Running 0 59m 801 | kube-system kube-scheduler-m01 1/1 Running 0 59m 802 | ``` 803 | 804 | ### 4.3 安装剩余 master 805 | 806 | 807 | #### 4.3.1 同步 m01 的 ca 证书 808 | 809 | 首先,将 m01 中的 ca 证书,scp 到其他 master 机器(m02 m03)。 810 | 811 | 为了方便,这里也是通过脚本来执行,具体如下: 812 | 813 | 注意:需要确认 m01 上的 root 账号可以免密登录到 m02 和 m03 的 root 账号。 814 | 815 | ``` 816 | ## 创建脚本:sync.master.ca.sh 817 | 818 | #!/bin/sh 819 | 820 | vhost="m02 m03" 821 | usr=root 822 | 823 | who=`whoami` 824 | if [[ "$who" != "$usr" ]];then 825 | echo "请使用 root 用户执行或者 sudo ./sync.master.ca.sh" 826 | exit 1 827 | fi 828 | 829 | echo $who 830 | 831 | # 需要从 m01 拷贝的 ca 文件 832 | caFiles=( 833 | /etc/kubernetes/pki/ca.crt 834 | /etc/kubernetes/pki/ca.key 835 | /etc/kubernetes/pki/sa.key 836 | /etc/kubernetes/pki/sa.pub 837 | /etc/kubernetes/pki/front-proxy-ca.crt 838 | /etc/kubernetes/pki/front-proxy-ca.key 839 | /etc/kubernetes/pki/etcd/ca.crt 840 | /etc/kubernetes/pki/etcd/ca.key 841 | /etc/kubernetes/admin.conf 842 | ) 843 | 844 | pkiDir=/etc/kubernetes/pki/etcd 845 | for h in $vhost 846 | do 847 | 848 | ssh ${usr}@$h "mkdir -p $pkiDir" 849 | 850 | echo "Dirs for ca scp created, start to scp..." 851 | 852 | # scp 文件到目标机 853 | for f in ${caFiles[@]} 854 | do 855 | echo "scp $f ${usr}@$h:$f" 856 | scp $f ${usr}@$h:$f 857 | done 858 | 859 | echo "Ca files transfered for $h ... ok" 860 | done 861 | 862 | ``` 863 | 执行脚本,将 m01 相关的 ca 文件传到 m02 和 m03: 864 | 865 | ``` 866 | chmod +x sync.master.ca.sh 867 | 868 | sudo ./sync.master.ca.sh 869 | ``` 870 | 871 | 执行 ca 拷贝成功,将在控制台看到类似的打印日志: 872 | 873 | ``` 874 | [kube@m01 shells]$ sudo ./sync.master.ca.sh 875 | root 876 | Dirs for ca scp created, start to scp... 
877 | scp /etc/kubernetes/pki/ca.crt root@m02:/etc/kubernetes/pki/ca.crt 878 | ca.crt 100% 1025 1.5MB/s 00:00 879 | scp /etc/kubernetes/pki/ca.key root@m02:/etc/kubernetes/pki/ca.key 880 | ca.key 100% 1679 2.4MB/s 00:00 881 | scp /etc/kubernetes/pki/sa.key root@m02:/etc/kubernetes/pki/sa.key 882 | sa.key 100% 1679 2.3MB/s 00:00 883 | scp /etc/kubernetes/pki/sa.pub root@m02:/etc/kubernetes/pki/sa.pub 884 | sa.pub 100% 451 601.6KB/s 00:00 885 | scp /etc/kubernetes/pki/front-proxy-ca.crt root@m02:/etc/kubernetes/pki/front-proxy-ca.crt 886 | front-proxy-ca.crt 100% 1038 1.5MB/s 00:00 887 | scp /etc/kubernetes/pki/front-proxy-ca.key root@m02:/etc/kubernetes/pki/front-proxy-ca.key 888 | front-proxy-ca.key 100% 1679 2.3MB/s 00:00 889 | scp /etc/kubernetes/pki/etcd/ca.crt root@m02:/etc/kubernetes/pki/etcd/ca.crt 890 | ca.crt 100% 1017 1.8MB/s 00:00 891 | scp /etc/kubernetes/pki/etcd/ca.key root@m02:/etc/kubernetes/pki/etcd/ca.key 892 | ca.key 100% 1675 2.7MB/s 00:00 893 | scp /etc/kubernetes/admin.conf root@m02:/etc/kubernetes/admin.conf 894 | admin.conf 100% 5455 7.3MB/s 00:00 895 | Ca files transfered for m02 ... ok 896 | Dirs for ca scp created, start to scp... 897 | scp /etc/kubernetes/pki/ca.crt root@m03:/etc/kubernetes/pki/ca.crt 898 | ca.crt 100% 1025 2.0MB/s 00:00 899 | scp /etc/kubernetes/pki/ca.key root@m03:/etc/kubernetes/pki/ca.key 900 | ca.key 100% 1679 2.5MB/s 00:00 901 | scp /etc/kubernetes/pki/sa.key root@m03:/etc/kubernetes/pki/sa.key 902 | sa.key 100% 1679 3.9MB/s 00:00 903 | scp /etc/kubernetes/pki/sa.pub root@m03:/etc/kubernetes/pki/sa.pub 904 | sa.pub 100% 451 673.7KB/s 00:00 905 | scp /etc/kubernetes/pki/front-proxy-ca.crt root@m03:/etc/kubernetes/pki/front-proxy-ca.crt 906 | front-proxy-ca.crt 100% 1038 2.0MB/s 00:00 907 | scp /etc/kubernetes/pki/front-proxy-ca.key root@m03:/etc/kubernetes/pki/front-proxy-ca.key 908 | front-proxy-ca.key 100% 1679 2.6MB/s 00:00 909 | scp /etc/kubernetes/pki/etcd/ca.crt root@m03:/etc/kubernetes/pki/etcd/ca.crt 910 | ca.crt 100% 1017 1.5MB/s 00:00 911 | scp /etc/kubernetes/pki/etcd/ca.key root@m03:/etc/kubernetes/pki/etcd/ca.key 912 | ca.key 100% 1675 2.3MB/s 00:00 913 | scp /etc/kubernetes/admin.conf root@m03:/etc/kubernetes/admin.conf 914 | admin.conf 100% 5455 6.3MB/s 00:00 915 | Ca files transfered for m03 ... ok 916 | 917 | ``` 918 | 919 | 到 m02 和 m03 上查看 /etc/kubernetes/ 目录,相关的 ca 文件已经同步过去了。 920 | ``` 921 | ├── admin.conf 922 | └── pki 923 | ├── ca.crt 924 | ├── ca.key 925 | ├── etcd 926 | │   ├── ca.crt 927 | │   └── ca.key 928 | ├── front-proxy-ca.crt 929 | ├── front-proxy-ca.key 930 | ├── sa.key 931 | └── sa.pub 932 | 933 | ``` 934 | 935 | #### 4.3.2 安装 master m02 936 | 937 | 基于 kubeadm-config.m02.yaml 和 m01 同步的 ca 证书,初始化相关的证书、kubelet、kube-apiserver、kube-controller-manager、etcd 配置和启动 kubelet,具体操作如下: 938 | 939 | > 备注:因为在安装过程中需要等 kubelet 启动成功,同时中间还有给 etcd 加节点(会有短暂的集群连接不上),所以这里不通过脚本执行,而是登录到对应机器上一步步配置。 940 | 941 | 总共四个步骤,分别是: 942 | 943 | ① 配置证书、初始化 kubelet 配置和启动 kubelet 944 | ② 将 etcd 加入集群 945 | ③ 启动 kube-apiserver、kube-controller-manager、kube-scheduler 946 | ④ 将节点标记为 master 节点 947 | 948 | 949 | ##### 1. 
配置证书、初始化 kubelet 配置和启动 kubelet 950 | ``` 951 | sudo kubeadm init phase certs all --config kubeadm-config.m02.yaml 952 | sudo kubeadm init phase etcd local --config kubeadm-config.m02.yaml 953 | sudo kubeadm init phase kubeconfig kubelet --config kubeadm-config.m02.yaml 954 | sudo kubeadm init phase kubelet-start --config kubeadm-config.m02.yaml 955 | ``` 956 | 957 | 逐步执行以上四条命令,执行结果如下: 958 | 959 | ```` 960 | [kube@m02 ~]$ sudo kubeadm init phase certs all --config kubeadm-config.m02.yaml 961 | [certs] Using certificateDir folder "/etc/kubernetes/pki" 962 | [certs] Using existing etcd/ca certificate authority 963 | [certs] Generating "etcd/healthcheck-client" certificate and key 964 | [certs] Generating "apiserver-etcd-client" certificate and key 965 | [certs] Generating "etcd/server" certificate and key 966 | [certs] etcd/server serving cert is signed for DNS names [m02 localhost m02] and IPs [10.0.2.15 127.0.0.1 ::1 192.168.33.11] 967 | [certs] Generating "etcd/peer" certificate and key 968 | [certs] etcd/peer serving cert is signed for DNS names [m02 localhost m02] and IPs [10.0.2.15 127.0.0.1 ::1 192.168.33.11] 969 | [certs] Using existing front-proxy-ca certificate authority 970 | [certs] Generating "front-proxy-client" certificate and key 971 | [certs] Using existing ca certificate authority 972 | [certs] Generating "apiserver" certificate and key 973 | [certs] apiserver serving cert is signed for DNS names [m02 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local api.k8s.hiko.im api.k8s.hiko.im] and IPs [10.96.0.1 10.0.2.15] 974 | [certs] Generating "apiserver-kubelet-client" certificate and key 975 | [certs] Using the existing "sa" key 976 | 977 | [kube@m02 ~]$ sudo kubeadm init phase etcd local --config kubeadm-config.m02.yaml 978 | [etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests" 979 | 980 | [kube@m02 ~]$ sudo kubeadm init phase kubeconfig kubelet --config kubeadm-config.m02.yaml 981 | [kubeconfig] Writing "kubelet.conf" kubeconfig file 982 | 983 | [kube@m02 ~]$ sudo kubeadm init phase kubelet-start --config kubeadm-config.m02.yaml 984 | [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env" 985 | [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml" 986 | [kubelet-start] Activating the kubelet service 987 | 988 | ```` 989 | 990 | 执行完之后,通过 `systemctl status kubelet` 可以看到 kubelet 的状态是 `active (running)`。 991 | 992 | ``` 993 | [kube@m01 ~]$ systemctl status kubelet 994 | ● kubelet.service - kubelet: The Kubernetes Node Agent 995 | Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled) 996 | Drop-In: /etc/systemd/system/kubelet.service.d 997 | └─10-kubeadm.conf 998 | Active: active (running) since Fri 2018-12-21 03:10:14 UTC; 4h 39min ago 999 | Docs: https://kubernetes.io/docs/ 1000 | Main PID: 565 (kubelet) 1001 | CGroup: /system.slice/kubelet.service 1002 | └─565 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/va... 
1003 | ``` 1004 | 1005 | 接着配置 m02 的 kube 用户管理集群,跟 m01 配置 kube 用户管理集群一样,这里不累述: 1006 | 1007 | ``` 1008 | ## 创建脚本:config.using.cluster.sh 1009 | #!/bin/sh 1010 | # 为 kube 用户配置 1011 | mkdir -p $HOME/.kube 1012 | sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config 1013 | sudo chown $(id -u):$(id -g) $HOME/.kube/config 1014 | ``` 1015 | 1016 | 配置完通过: `kubectl cluster-info`和`kubectl get pods --all-namespaces` 进行验证。 1017 | 1018 | ##### 2. 将 etcd 加入集群 1019 | 1020 | ``` 1021 | kubectl exec -n kube-system etcd-m01 -- etcdctl --ca-file /etc/kubernetes/pki/etcd/ca.crt --cert-file /etc/kubernetes/pki/etcd/peer.crt --key-file /etc/kubernetes/pki/etcd/peer.key --endpoints=https://192.168.33.10:2379 member add m02 https://192.168.33.11:2380 1022 | ``` 1023 | 1024 | 这个命令其实就是登入 pod etcd-m01,使用 Pod 中的 etcdctl 给 etcd 的集群添加 m02 这个etcd 实例。 1025 | 1026 | 执行这个命令之后,集群会有短暂的不可用(etcd 集群需要选举和同步数据),不用慌,稍等一会就行。 1027 | 1028 | 其他 etcd 的管理命令,如下: 1029 | 1030 | ``` 1031 | # 查看 etcd 集群已有的节点 1032 | kubectl exec -n kube-system etcd-m01 -- etcdctl --ca-file /etc/kubernetes/pki/etcd/ca.crt --cert-file /etc/kubernetes/pki/etcd/peer.crt --key-file /etc/kubernetes/pki/etcd/peer.key --endpoints=https://192.168.33.10:2379 member list 1033 | 1034 | ``` 1035 | 1036 | 执行成功之后,可以通过 `kubectl get pods --all-namespaces -owide` 看到 etcd-m02、kube-proxy m02 节点上的 Pod 也已经正常。 1037 | 1038 | ##### 3. 启动 kube-apiserver、kube-controller-manager、kube-scheduler 1039 | 1040 | ``` 1041 | sudo kubeadm init phase kubeconfig all --config kubeadm-config.m02.yaml 1042 | sudo kubeadm init phase control-plane all --config kubeadm-config.m02.yaml 1043 | ``` 1044 | 1045 | 上面命令执行完之后,通过 `kubectl get pods --all-namespaces` 查看, 各节点已经都正常。 1046 | 1047 | ``` 1048 | [kube@m01 ~]$ kubectl get pods --all-namespaces 1049 | NAMESPACE NAME READY STATUS RESTARTS AGE 1050 | kube-system coredns-78d4cf999f-j8zsr 1/1 Running 0 162m 1051 | kube-system coredns-78d4cf999f-lw5qx 1/1 Running 0 162m 1052 | kube-system etcd-m01 1/1 Running 8 5h2m 1053 | kube-system etcd-m02 1/1 Running 12 88m 1054 | kube-system kube-apiserver-m01 1/1 Running 9 5h2m 1055 | kube-system kube-apiserver-m02 1/1 Running 0 87m 1056 | kube-system kube-controller-manager-m01 1/1 Running 4 5h2m 1057 | kube-system kube-controller-manager-m02 1/1 Running 0 87m 1058 | kube-system kube-flannel-ds-amd64-7b86z 1/1 Running 0 3h22m 1059 | kube-system kube-flannel-ds-amd64-98qks 1/1 Running 0 83m 1060 | kube-system kube-proxy-krnjq 1/1 Running 0 5h3m 1061 | kube-system kube-proxy-scb25 1/1 Running 0 83m 1062 | kube-system kube-scheduler-m01 1/1 Running 4 5h2m 1063 | kube-system kube-scheduler-m02 1/1 Running 0 87m 1064 | 1065 | ``` 1066 | 1067 | 1068 | ##### 4. 
将节点标记为 master 节点 1069 | 1070 | 在未将 m02 标记为 master 节点前,通过 `kubectl get nodes` 查看当前集群节点时,看到的是这样: 1071 | 1072 | ``` 1073 | [kube@m01 ~]$ kubectl get nodes 1074 | NAME STATUS ROLES AGE VERSION 1075 | m01 Ready master 5h4m v1.13.1 1076 | m02 Ready 89m v1.13.1 1077 | ``` 1078 | 1079 | 执行命令,将 m02 标记为 master,命令: 1080 | 1081 | ``` 1082 | sudo kubeadm init phase mark-control-plane --config kubeadm-config.m02.yaml 1083 | ``` 1084 | 1085 | 重新查看集群节点: 1086 | 1087 | ``` 1088 | [kube@m02 ~]$ kubectl get nodes 1089 | NAME STATUS ROLES AGE VERSION 1090 | m01 Ready master 5h5m v1.13.1 1091 | m02 Ready master 90m v1.13.1 1092 | ``` 1093 | 1094 | #### 4.3.3 安装 master m03 1095 | 1096 | 安装过程和 m02 一样,唯一不同的只是指定的配置文件是:kubeadm-config.m03.yaml 以及 etcd 加入成员时指定的实例地址不一样。 1097 | 1098 | 总共四个步骤,分别是: 1099 | 1100 | ① 配置证书、初始化 kubelet 配置和启动 kubelet 1101 | ② 将 etcd 加入集群 1102 | ③ 启动 kube-apiserver、kube-controller-manager、kube-scheduler 1103 | ④ 将节点标记为 master 节点 1104 | 1105 | 这里不做详细步骤拆解(具体参考 4.3.2 安装 master 02),具体操作命令: 1106 | 1107 | ```` 1108 | 1109 | # 1. 配置证书、初始化 kubelet 配置和启动 kubelet 1110 | sudo kubeadm init phase certs all --config kubeadm-config.m03.yaml 1111 | sudo kubeadm init phase etcd local --config kubeadm-config.m03.yaml 1112 | sudo kubeadm init phase kubeconfig kubelet --config kubeadm-config.m03.yaml 1113 | sudo kubeadm init phase kubelet-start --config kubeadm-config.m03.yaml 1114 | 1115 | # 2. 将 etcd 加入集群 1116 | kubectl exec -n kube-system etcd-m01 -- etcdctl --ca-file /etc/kubernetes/pki/etcd/ca.crt --cert-file /etc/kubernetes/pki/etcd/peer.crt --key-file /etc/kubernetes/pki/etcd/peer.key --endpoints=https://192.168.33.10:2379 member add m03 https://192.168.33.12:2380 1117 | 1118 | # 3. 启动 kube-apiserver、kube-controller-manager、kube-scheduler 1119 | sudo kubeadm init phase kubeconfig all --config kubeadm-config.m03.yaml 1120 | sudo kubeadm init phase control-plane all --config kubeadm-config.m03.yaml 1121 | 1122 | # 4. 将节点标记为 master 节点 1123 | sudo kubeadm init phase mark-control-plane --config kubeadm-config.m03.yaml 1124 | ```` 1125 | 1126 | ### 4.4 验证三个 master 节点 1127 | 1128 | 至此,三个 master 节点安装完成,通过 `kubectl get pods --all-namespaces` 查看当前集群所有 Pod。 1129 | 1130 | ```` 1131 | [kube@m02 ~]$ kubectl get pods --all-namespaces 1132 | NAMESPACE NAME READY STATUS RESTARTS AGE 1133 | kube-system coredns-78d4cf999f-j8zsr 1/1 Running 0 170m 1134 | kube-system coredns-78d4cf999f-lw5qx 1/1 Running 0 171m 1135 | kube-system etcd-m01 1/1 Running 8 5h11m 1136 | kube-system etcd-m02 1/1 Running 12 97m 1137 | kube-system etcd-m03 1/1 Running 0 91m 1138 | kube-system kube-apiserver-m01 1/1 Running 9 5h11m 1139 | kube-system kube-apiserver-m02 1/1 Running 0 95m 1140 | kube-system kube-apiserver-m03 1/1 Running 0 91m 1141 | kube-system kube-controller-manager-m01 1/1 Running 4 5h11m 1142 | kube-system kube-controller-manager-m02 1/1 Running 0 95m 1143 | kube-system kube-controller-manager-m03 1/1 Running 0 91m 1144 | kube-system kube-flannel-ds-amd64-7b86z 1/1 Running 0 3h31m 1145 | kube-system kube-flannel-ds-amd64-98qks 1/1 Running 0 91m 1146 | kube-system kube-flannel-ds-amd64-ljcdp 1/1 Running 0 97m 1147 | kube-system kube-proxy-krnjq 1/1 Running 0 5h12m 1148 | kube-system kube-proxy-scb25 1/1 Running 0 91m 1149 | kube-system kube-proxy-xp4rj 1/1 Running 0 97m 1150 | kube-system kube-scheduler-m01 1/1 Running 4 5h11m 1151 | kube-system kube-scheduler-m02 1/1 Running 0 95m 1152 | kube-system kube-scheduler-m03 1/1 Running 0 91m 1153 | 1154 | ```` 1155 | 1156 | ## 5. 
加入工作节点 1157 | 1158 | 这步很简单,只需要在工作节点 n01 和 n02 上执行加入集群的命令即可。 1159 | 1160 | 可以使用上面安装 master m01 成功后打印的命令 `kubeadm join api.k8s.hiko.im:6443 --token a1t7c1.mzltpc72dc3wzj9y --discovery-token-ca-cert-hash sha256:05f44b111174613055975f012fc11fe09bdcd746bd7b3c8d99060c52619f8738`,也可以重新生成 Token。 1161 | 1162 | 这里演示如何重新生成 Token 和 证书 hash,在 m01 上执行以下操作: 1163 | 1164 | ``` 1165 | # 1. 创建 token 1166 | [kube@m01 shells]$ kubeadm token create 1167 | 1168 | # 控制台打印如: 1169 | gz1v4w.sulpuxkqtnyci92f 1170 | 1171 | # 2. 查看我们创建的 k8s 集群的证书 hash 1172 | [kube@m01 shells]$ openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //' 1173 | 1174 | # 控制台打印如: 1175 | b125cd0c80462353d8fa3e4f5034f1e1a1e3cc9bade32acfb235daa867c60f61 1176 | 1177 | ``` 1178 | 1179 | 使用 kubeadm join,分别在工作节点 n01 和 n02 上执行,将节点加入集群,如下: 1180 | 1181 | ``` 1182 | [kube@n01 ~]$ sudo kubeadm join api.k8s.hiko.im:6443 --token gz1v4w.sulpuxkqtnyci92f --discovery-token-ca-cert-hash sha256:b125cd0c80462353d8fa3e4f5034f1e1a1e3cc9bade32acfb235daa867c60f61 1183 | 1184 | [preflight] Running pre-flight checks 1185 | [discovery] Trying to connect to API Server "api.k8s.hiko.im:6443" 1186 | [discovery] Created cluster-info discovery client, requesting info from "https://api.k8s.hiko.im:6443" 1187 | [discovery] Requesting info from "https://api.k8s.hiko.im:6443" again to validate TLS against the pinned public key 1188 | [discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "api.k8s.hiko.im:6443" 1189 | [discovery] Successfully established connection with API Server "api.k8s.hiko.im:6443" 1190 | [join] Reading configuration from the cluster... 1191 | [join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml' 1192 | [kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.13" ConfigMap in the kube-system namespace 1193 | [kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml" 1194 | [kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env" 1195 | [kubelet-start] Activating the kubelet service 1196 | [tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap... 1197 | [patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "n01" as an annotation 1198 | 1199 | This node has joined the cluster: 1200 | * Certificate signing request was sent to apiserver and a response was received. 1201 | * The Kubelet was informed of the new secure connection details. 1202 | 1203 | Run 'kubectl get nodes' on the master to see this node join the cluster. 1204 | 1205 | ``` 1206 | 1207 | 在 m01 上通过 `kubectl get nodes` 查看,将看到节点已被加进来(节点刚加进来时,状态可能会是 NotReady,稍等一会就回变成 Ready)。 1208 | 1209 | ``` 1210 | [kube@m01 ~]$ kubectl get nodes 1211 | NAME STATUS ROLES AGE VERSION 1212 | m01 Ready master 33h v1.13.1 1213 | m02 Ready master 30h v1.13.1 1214 | m03 Ready master 30h v1.13.1 1215 | n01 Ready 5m30s v1.13.1 1216 | ``` 1217 | 1218 | 同样操作将 n02 加进来,这里就不多于阐述,参考上面的步骤(将 n01 加入集群时使用的 token 可以继续使用,可以不用重新创建 token)。 1219 | 1220 | 执行成功之后,查看目前集群所有节点: 1221 | 1222 | ``` 1223 | [kube@m01 ~]$ kubectl get nodes 1224 | NAME STATUS ROLES AGE VERSION 1225 | m01 Ready master 34h v1.13.1 1226 | m02 Ready master 30h v1.13.1 1227 | m03 Ready master 30h v1.13.1 1228 | n01 Ready 25m v1.13.1 1229 | n02 Ready 117s v1.13.1 1230 | ``` 1231 | 1232 | 1233 | ## 6. 
部署高可用 CoreDNS 1234 | 1235 | 默认安装的 CoreDNS 存在单点问题。在 m01 上通过 `kubectl get pods -n kube-system -owide` 查看当前集群 CoreDNS Pod 分布(如下)。 1236 | 1237 | 从列表中,可以看到 CoreDNS 的两个 Pod 都在 m01 上,存在单点问题。 1238 | 1239 | ``` 1240 | [kube@m01 ~]$ kubectl get pods -n kube-system -owide 1241 | NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES 1242 | coredns-78d4cf999f-j8zsr 1/1 Running 0 31h 10.244.0.13 m01 1243 | coredns-78d4cf999f-lw5qx 1/1 Running 0 31h 10.244.0.12 m01 1244 | etcd-m01 1/1 Running 8 34h 192.168.33.10 m01 1245 | etcd-m02 1/1 Running 12 30h 192.168.33.11 m02 1246 | etcd-m03 1/1 Running 0 30h 192.168.33.12 m03 1247 | kube-apiserver-m01 1/1 Running 9 34h 192.168.33.10 m01 1248 | kube-apiserver-m02 1/1 Running 0 30h 192.168.33.11 m02 1249 | kube-apiserver-m03 1/1 Running 0 30h 192.168.33.12 m03 1250 | kube-controller-manager-m01 1/1 Running 4 34h 192.168.33.10 m01 1251 | kube-controller-manager-m02 1/1 Running 0 30h 192.168.33.11 m02 1252 | kube-controller-manager-m03 1/1 Running 0 30h 192.168.33.12 m03 1253 | kube-flannel-ds-amd64-7b86z 1/1 Running 0 32h 192.168.33.10 m01 1254 | kube-flannel-ds-amd64-98qks 1/1 Running 0 30h 192.168.33.12 m03 1255 | kube-flannel-ds-amd64-jkz27 1/1 Running 0 7m21s 192.168.33.21 n02 1256 | kube-flannel-ds-amd64-ljcdp 1/1 Running 0 30h 192.168.33.11 m02 1257 | kube-flannel-ds-amd64-s8vzs 1/1 Running 0 30m 192.168.33.20 n01 1258 | kube-proxy-c4j4r 1/1 Running 0 30m 192.168.33.20 n01 1259 | kube-proxy-krnjq 1/1 Running 0 34h 192.168.33.10 m01 1260 | kube-proxy-n9s8c 1/1 Running 0 7m21s 192.168.33.21 n02 1261 | kube-proxy-scb25 1/1 Running 0 30h 192.168.33.12 m03 1262 | kube-proxy-xp4rj 1/1 Running 0 30h 192.168.33.11 m02 1263 | kube-scheduler-m01 1/1 Running 4 34h 192.168.33.10 m01 1264 | kube-scheduler-m02 1/1 Running 0 30h 192.168.33.11 m02 1265 | kube-scheduler-m03 1/1 Running 0 30h 192.168.33.12 m03 1266 | 1267 | ``` 1268 | 1269 | 参考:http://cloudnil.com/2018/12/14/Deploy-kubernetes(1.13.1)-HA-with-kubeadm/#10-dns集群部署 1270 | 1271 | 两步操作: 1272 | 1273 | ① 删除原来单点的 CoreDNS 1274 | 1275 | ``` 1276 | kubectl delete deploy coredns -n kube-system 1277 | ``` 1278 | 1279 | ② 部署多实例的coredns集群,创建 coredns deployment 配置 coredns-ha.yml (也可以直接使用我提供的 coredns-ha.yaml // @TODO 添加地址),内容如下: 1280 | 1281 | ``` 1282 | apiVersion: apps/v1 1283 | kind: Deployment 1284 | metadata: 1285 | labels: 1286 | k8s-app: kube-dns 1287 | name: coredns 1288 | namespace: kube-system 1289 | spec: 1290 | #集群规模可自行配置 1291 | replicas: 2 1292 | selector: 1293 | matchLabels: 1294 | k8s-app: kube-dns 1295 | strategy: 1296 | rollingUpdate: 1297 | maxSurge: 25% 1298 | maxUnavailable: 1 1299 | type: RollingUpdate 1300 | template: 1301 | metadata: 1302 | labels: 1303 | k8s-app: kube-dns 1304 | spec: 1305 | affinity: 1306 | podAntiAffinity: 1307 | preferredDuringSchedulingIgnoredDuringExecution: 1308 | - weight: 100 1309 | podAffinityTerm: 1310 | labelSelector: 1311 | matchExpressions: 1312 | - key: k8s-app 1313 | operator: In 1314 | values: 1315 | - kube-dns 1316 | topologyKey: kubernetes.io/hostname 1317 | containers: 1318 | - args: 1319 | - -conf 1320 | - /etc/coredns/Corefile 1321 | image: registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.2.6 1322 | imagePullPolicy: IfNotPresent 1323 | livenessProbe: 1324 | failureThreshold: 5 1325 | httpGet: 1326 | path: /health 1327 | port: 8080 1328 | scheme: HTTP 1329 | initialDelaySeconds: 60 1330 | periodSeconds: 10 1331 | successThreshold: 1 1332 | timeoutSeconds: 5 1333 | name: coredns 1334 | ports: 1335 | - containerPort: 53 1336 | name: dns 1337 
| protocol: UDP 1338 | - containerPort: 53 1339 | name: dns-tcp 1340 | protocol: TCP 1341 | - containerPort: 9153 1342 | name: metrics 1343 | protocol: TCP 1344 | resources: 1345 | limits: 1346 | memory: 170Mi 1347 | requests: 1348 | cpu: 100m 1349 | memory: 70Mi 1350 | securityContext: 1351 | allowPrivilegeEscalation: false 1352 | capabilities: 1353 | add: 1354 | - NET_BIND_SERVICE 1355 | drop: 1356 | - all 1357 | readOnlyRootFilesystem: true 1358 | terminationMessagePath: /dev/termination-log 1359 | terminationMessagePolicy: File 1360 | volumeMounts: 1361 | - mountPath: /etc/coredns 1362 | name: config-volume 1363 | readOnly: true 1364 | dnsPolicy: Default 1365 | restartPolicy: Always 1366 | schedulerName: default-scheduler 1367 | securityContext: {} 1368 | serviceAccount: coredns 1369 | serviceAccountName: coredns 1370 | terminationGracePeriodSeconds: 30 1371 | tolerations: 1372 | - key: CriticalAddonsOnly 1373 | operator: Exists 1374 | - effect: NoSchedule 1375 | key: node-role.kubernetes.io/master 1376 | volumes: 1377 | - configMap: 1378 | defaultMode: 420 1379 | items: 1380 | - key: Corefile 1381 | path: Corefile 1382 | name: coredns 1383 | name: config-volume 1384 | ``` 1385 | 1386 | 执行 `kubectl apply -f coredns-ha.yaml` 进行部署,如: 1387 | 1388 | ``` 1389 | [kube@m01 shells]$ kubectl apply -f coredns-ha.yaml 1390 | deployment.apps/coredns created 1391 | ``` 1392 | 1393 | 查看 coredns 的实例分布,如下: 1394 | 1395 | ``` 1396 | [kube@m01 ~]$ kubectl get pods --all-namespaces -owide 1397 | NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES 1398 | kube-system coredns-6c67f849c7-2qc68 1/1 Running 0 3m39s 10.244.3.3 n01 1399 | kube-system coredns-6c67f849c7-dps8h 1/1 Running 0 3m39s 10.244.5.3 n02 1400 | kube-system etcd-m01 1/1 Running 8 34h 192.168.33.10 m01 1401 | kube-system etcd-m02 1/1 Running 12 30h 192.168.33.11 m02 1402 | kube-system etcd-m03 1/1 Running 0 30h 192.168.33.12 m03 1403 | kube-system kube-apiserver-m01 1/1 Running 9 34h 192.168.33.10 m01 1404 | kube-system kube-apiserver-m02 1/1 Running 0 30h 192.168.33.11 m02 1405 | kube-system kube-apiserver-m03 1/1 Running 0 30h 192.168.33.12 m03 1406 | kube-system kube-controller-manager-m01 1/1 Running 4 34h 192.168.33.10 m01 1407 | kube-system kube-controller-manager-m02 1/1 Running 0 30h 192.168.33.11 m02 1408 | kube-system kube-controller-manager-m03 1/1 Running 0 30h 192.168.33.12 m03 1409 | kube-system kube-flannel-ds-amd64-7b86z 1/1 Running 0 32h 192.168.33.10 m01 1410 | kube-system kube-flannel-ds-amd64-98qks 1/1 Running 0 30h 192.168.33.12 m03 1411 | kube-system kube-flannel-ds-amd64-jkz27 1/1 Running 0 26m 192.168.33.21 n02 1412 | kube-system kube-flannel-ds-amd64-ljcdp 1/1 Running 0 30h 192.168.33.11 m02 1413 | kube-system kube-flannel-ds-amd64-s8vzs 1/1 Running 0 49m 192.168.33.20 n01 1414 | kube-system kube-proxy-c4j4r 1/1 Running 0 49m 192.168.33.20 n01 1415 | kube-system kube-proxy-krnjq 1/1 Running 0 34h 192.168.33.10 m01 1416 | kube-system kube-proxy-n9s8c 1/1 Running 0 26m 192.168.33.21 n02 1417 | kube-system kube-proxy-scb25 1/1 Running 0 30h 192.168.33.12 m03 1418 | kube-system kube-proxy-xp4rj 1/1 Running 0 30h 192.168.33.11 m02 1419 | kube-system kube-scheduler-m01 1/1 Running 4 34h 192.168.33.10 m01 1420 | kube-system kube-scheduler-m02 1/1 Running 0 30h 192.168.33.11 m02 1421 | kube-system kube-scheduler-m03 1/1 Running 0 30h 192.168.33.12 m03 1422 | 1423 | ``` 1424 | 1425 | 可以看到 coredns 的 Pod 已经分布在 n01 和 n02 上。 1426 | 1427 | ## 7. 
部署监控组件 metrics-server 1428 | 1429 | 1430 | ### 7.1 部署 metrics-server 1431 | 1432 | kubernetes v1.11 以后不再推荐使用 Heapster 采集监控数据(已逐步废弃),改用新的监控数据采集组件 metrics-server。metrics-server 比 Heapster 轻量很多,也不做数据的持久化存储,只提供实时的监控数据查询。 1433 | 1434 | metrics-server 的部署相关的 yaml 配置文件在这里:[metrics-server resources](https://github.com/kubernetes-incubator/metrics-server/tree/master/deploy/1.8%2B) 1435 | 1436 | 先将所有文件下载,保存在一个文件夹 metrics-server 里。 1437 | 1438 | 修改 metrics-server-deployment.yaml 三处地方,分别是:apiVersion、image 和启动参数(对应下面注释中的 1、2、3),最终修改后的 metrics-server-deployment.yaml 如下: 1439 | 1440 | ``` 1441 | --- 1442 | apiVersion: v1 1443 | kind: ServiceAccount 1444 | metadata: 1445 | name: metrics-server 1446 | namespace: kube-system 1447 | --- 1448 | # 1. 修改 apiVersion 1449 | apiVersion: apps/v1 1450 | kind: Deployment 1451 | metadata: 1452 | name: metrics-server 1453 | namespace: kube-system 1454 | labels: 1455 | k8s-app: metrics-server 1456 | spec: 1457 | selector: 1458 | matchLabels: 1459 | k8s-app: metrics-server 1460 | template: 1461 | metadata: 1462 | name: metrics-server 1463 | labels: 1464 | k8s-app: metrics-server 1465 | spec: 1466 | serviceAccountName: metrics-server 1467 | volumes: 1468 | # mount in tmp so we can safely use from-scratch images and/or read-only containers 1469 | - name: tmp-dir 1470 | emptyDir: {} 1471 | containers: 1472 | - name: metrics-server 1473 | # 2. 修改使用的镜像 1474 | image: cloudnil/metrics-server-amd64:v0.3.1 1475 | # 3. 指定启动参数,指定使用 InternalIP 获取各节点的监控信息 1476 | command: 1477 | - /metrics-server 1478 | - --kubelet-insecure-tls 1479 | - --kubelet-preferred-address-types=InternalIP 1480 | 1481 | imagePullPolicy: Always 1482 | volumeMounts: 1483 | - name: tmp-dir 1484 | mountPath: /tmp 1485 | ``` 1486 | 1487 | 进入刚创建的 metrics-server 目录,执行 `kubectl apply -f . ` 进行部署(注意 -f 后面有个点),如下: 1488 | 1489 | ``` 1490 | [kube@m01 metrics-server]$ kubectl apply -f .
1491 | 1492 | clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created 1493 | clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created 1494 | rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created 1495 | apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created 1496 | serviceaccount/metrics-server created 1497 | deployment.apps/metrics-server created 1498 | service/metrics-server created 1499 | clusterrole.rbac.authorization.k8s.io/system:metrics-server created 1500 | clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created 1501 | ``` 1502 | 1503 | 查看集群 pods,可以看到 metrics-server 的 Pod 已经起来了。 1504 | 1505 | ``` 1506 | [kube@m01 metrics-server]$ kubectl get pods --all-namespaces 1507 | NAMESPACE NAME READY STATUS RESTARTS AGE 1508 | kube-system coredns-6c67f849c7-2qc68 1/1 Running 0 40m 1509 | kube-system coredns-6c67f849c7-dps8h 1/1 Running 0 40m 1510 | kube-system etcd-m01 1/1 Running 8 35h 1511 | kube-system etcd-m02 1/1 Running 12 31h 1512 | kube-system etcd-m03 1/1 Running 0 31h 1513 | kube-system kube-apiserver-m01 1/1 Running 9 35h 1514 | kube-system kube-apiserver-m02 1/1 Running 0 31h 1515 | kube-system kube-apiserver-m03 1/1 Running 0 31h 1516 | kube-system kube-controller-manager-m01 1/1 Running 4 35h 1517 | kube-system kube-controller-manager-m02 1/1 Running 0 31h 1518 | kube-system kube-controller-manager-m03 1/1 Running 0 31h 1519 | kube-system kube-flannel-ds-amd64-7b86z 1/1 Running 0 33h 1520 | kube-system kube-flannel-ds-amd64-98qks 1/1 Running 0 31h 1521 | kube-system kube-flannel-ds-amd64-jkz27 1/1 Running 0 63m 1522 | kube-system kube-flannel-ds-amd64-ljcdp 1/1 Running 0 31h 1523 | kube-system kube-flannel-ds-amd64-s8vzs 1/1 Running 0 86m 1524 | kube-system kube-proxy-c4j4r 1/1 Running 0 86m 1525 | kube-system kube-proxy-krnjq 1/1 Running 0 35h 1526 | kube-system kube-proxy-n9s8c 1/1 Running 0 63m 1527 | kube-system kube-proxy-scb25 1/1 Running 0 31h 1528 | kube-system kube-proxy-xp4rj 1/1 Running 0 31h 1529 | kube-system kube-scheduler-m01 1/1 Running 4 35h 1530 | kube-system kube-scheduler-m02 1/1 Running 0 31h 1531 | kube-system kube-scheduler-m03 1/1 Running 0 31h 1532 | kube-system metrics-server-644c449b-6tctn 1/1 Running 0 116s 1533 | ``` 1534 | 1535 | 验证 metrices-server 的部署结果,如下: 1536 | 1537 | 刚部署完,通过`kubectl top nodes` 查询节点的监控信息时,提示:`error: metrics not available yet`,稍等一段时间后重新查看就正常了。 1538 | 1539 | ``` 1540 | [kube@m01 metrics-server]$ kubectl top nodes 1541 | error: metrics not available yet 1542 | 1543 | # ... 
等一段时间之后,重新执行查看,可以了,如下: 1544 | [kube@m01 metrics-server]$ kubectl top nodes 1545 | NAME CPU(cores) CPU% MEMORY(bytes) MEMORY% 1546 | m01 131m 6% 1286Mi 73% 1547 | m02 77m 3% 1217Mi 70% 1548 | m03 80m 4% 1232Mi 70% 1549 | n01 23m 1% 497Mi 28% 1550 | n02 24m 1% 431Mi 24% 1551 | ``` 1552 | 1553 | ### 7.2 遇到的问题 1554 | 1555 | 如果你使用上面的 metrics-server-deployment.yaml ,应该是不会遇到这两个问提,因为配置中已经指定了。 1556 | 1557 | #### 7.2.1 指定 --kubelet-preferred-address-types 1558 | 1559 | 一开始没有指定 `--kubelet-preferred-address-types=InternalIP`,metrics-server 通过主机名进行请求获取监控信息(通过 kubectl logs {POD名称} -n kube-system查看日志得知)。 1560 | 1561 | 然后通过排查找到 Github 上的 issue。通过指定`--kubelet-preferred-address-types=InternalIP`,问题解决。 1562 | 1563 | issue:https://github.com/kubernetes-incubator/metrics-server/issues/143 1564 | 1565 | #### 7.2.2 指定 --kubelet-insecure-tls 1566 | 因为部署集群的时候,kube-apiserver 签的证书只有域名:api.k8s.hiko.im,没有把各个节点的 IP 签上去,所以这里 metrics-server 通过 IP 去请求时,提示签的证书没有对应的 IP(错误:x509: cannot validate certificate for 192.168.33.11 because it doesn't contain any IP SANs) 1567 | 1568 | 具体日志如: 1569 | ``` 1570 | E1223 14:52:10.537796 1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:m01: unable to fetch metrics from Kubelet m01 (192.168.33.10): Get https://192.168.33.10:10250/stats/summary/: x509: cannot validate certificate for 192.168.33.10 because it doesn't contain any IP SANs, unable to fully scrape metrics from source kubelet_summary:n01: unable to fetch metrics from Kubelet n01 (192.168.33.20): Get https://192.168.33.20:10250/stats/summary/: x509: cannot validate certificate for 192.168.33.20 because it doesn't contain any IP SANs, unable to fully scrape metrics from source kubelet_summary:n02: unable to fetch metrics from Kubelet n02 (192.168.33.21): Get https://192.168.33.21:10250/stats/summary/: x509: cannot validate certificate for 192.168.33.21 because it doesn't contain any IP SANs, unable to fully scrape metrics from source kubelet_summary:m03: unable to fetch metrics from Kubelet m03 (192.168.33.12): Get https://192.168.33.12:10250/stats/summary/: x509: cannot validate certificate for 192.168.33.12 because it doesn't contain any IP SANs, unable to fully scrape metrics from source kubelet_summary:m02: unable to fetch metrics from Kubelet m02 (192.168.33.11): Get https://192.168.33.11:10250/stats/summary/: x509: cannot validate certificate for 192.168.33.11 because it doesn't contain any IP SANs] 1571 | E1223 14:53:10.529198 1 manager.go:102] unable to fully collect metrics: [unable to fully scrape metrics from source kubelet_summary:m02: unable to fetch metrics from Kubelet m02 (192.168.33.11): Get https://192.168.33.11:10250/stats/summary/: x509: cannot validate certificate for 192.168.33.11 because it doesn't contain any IP SANs, unable to fully scrape metrics from source kubelet_summary:m01: unable to fetch metrics from Kubelet m01 (192.168.33.10): Get https://192.168.33.10:10250/stats/summary/: x509: cannot validate certificate for 192.168.33.10 because it doesn't contain any IP SANs, unable to fully scrape metrics from source kubelet_summary:n01: unable to fetch metrics from Kubelet n01 (192.168.33.20): Get https://192.168.33.20:10250/stats/summary/: x509: cannot validate certificate for 192.168.33.20 because it doesn't contain any IP SANs, unable to fully scrape metrics from source kubelet_summary:m03: unable to fetch metrics from Kubelet m03 (192.168.33.12): Get https://192.168.33.12:10250/stats/summary/: x509: cannot validate certificate for 
192.168.33.12 because it doesn't contain any IP SANs, unable to fully scrape metrics from source kubelet_summary:n02: unable to fetch metrics from Kubelet n02 (192.168.33.21): Get https://192.168.33.21:10250/stats/summary/: x509: cannot validate certificate for 192.168.33.21 because it doesn't contain any IP SANs] 1572 | 1573 | ``` 1574 | 1575 | 解决方式就是启动 metrics-server 时,指定 `--kubelet-insecure-tls` 参数。 1576 | 1577 | 1578 | 1579 | ## 8. 部署 Ingress,服务暴露 1580 | 1581 | ### 8.1 必知知识点 1582 | 1583 | 参考:http://cloudnil.com/2018/12/14/Deploy-kubernetes(1.13.1)-HA-with-kubeadm//#12-服务暴露到公网 1584 | 1585 | kubernetes 中的 Service 暴露到外部有三种方式,分别是: 1586 | 1587 | - LoadBlancer Service 1588 | - NodePort Service 1589 | - Ingress 1590 | 1591 | LoadBlancer Service 是kubernetes深度结合云平台的一个组件;当使用LoadBlancer Service暴露服务时,实际上是通过向底层云平台申请创建一个负载均衡器来向外暴露服务;目前LoadBlancer Service支持的云平台已经相对完善,比如国外的GCE、DigitalOcean,国内的 阿里云,私有云 Openstack 等等,由于LoadBlancer Service深度结合了云平台,所以只能在一些云平台上来使用。 1592 | 1593 | NodePort Service 顾名思义,实质上就是通过在集群的每个node上暴露一个端口,然后将这个端口映射到某个具体的service来实现的,虽然每个node的端口有很多(0~65535),但是由于安全性和易用性(服务多了就乱了,还有端口冲突问题)实际使用可能并不多。 1594 | 1595 | Ingress 可以实现使用nginx等开源的反向代理负载均衡器实现对外暴露服务,可以理解Ingress就是用于配置域名转发的一个东西,在nginx中就类似upstream,它与ingress-controller结合使用,通过ingress-controller监控到pod及service的变化,动态地将ingress中的转发信息写到诸如nginx、apache、haproxy等组件中实现方向代理和负载均衡。 1596 | 1597 | ### 8.2 部署 Nginx-ingress-controller 1598 | 1599 | Nginx-ingress-controller 是 kubernetes 官方提供的集成了 Ingress-controller 和 Nginx 的一个 docker 镜像。 1600 | 1601 | 本次部署中,将 Nginx-ingress 部署到 m01、m02、m03上,监听宿主机的 80 端口。 1602 | 1603 | 创建 nginx-ingress.yaml 文件,内容如下: 1604 | 1605 | ``` 1606 | apiVersion: v1 1607 | kind: Namespace 1608 | metadata: 1609 | name: ingress-nginx 1610 | --- 1611 | kind: ConfigMap 1612 | apiVersion: v1 1613 | metadata: 1614 | name: nginx-configuration 1615 | namespace: ingress-nginx 1616 | labels: 1617 | app.kubernetes.io/name: ingress-nginx 1618 | app.kubernetes.io/part-of: ingress-nginx 1619 | --- 1620 | kind: ConfigMap 1621 | apiVersion: v1 1622 | metadata: 1623 | name: tcp-services 1624 | namespace: ingress-nginx 1625 | labels: 1626 | app.kubernetes.io/name: ingress-nginx 1627 | app.kubernetes.io/part-of: ingress-nginx 1628 | --- 1629 | kind: ConfigMap 1630 | apiVersion: v1 1631 | metadata: 1632 | name: udp-services 1633 | namespace: ingress-nginx 1634 | labels: 1635 | app.kubernetes.io/name: ingress-nginx 1636 | app.kubernetes.io/part-of: ingress-nginx 1637 | --- 1638 | apiVersion: v1 1639 | kind: ServiceAccount 1640 | metadata: 1641 | name: nginx-ingress-serviceaccount 1642 | namespace: ingress-nginx 1643 | labels: 1644 | app.kubernetes.io/name: ingress-nginx 1645 | app.kubernetes.io/part-of: ingress-nginx 1646 | --- 1647 | apiVersion: rbac.authorization.k8s.io/v1beta1 1648 | kind: ClusterRole 1649 | metadata: 1650 | name: nginx-ingress-clusterrole 1651 | labels: 1652 | app.kubernetes.io/name: ingress-nginx 1653 | app.kubernetes.io/part-of: ingress-nginx 1654 | rules: 1655 | - apiGroups: 1656 | - "" 1657 | resources: 1658 | - configmaps 1659 | - endpoints 1660 | - nodes 1661 | - pods 1662 | - secrets 1663 | verbs: 1664 | - list 1665 | - watch 1666 | - apiGroups: 1667 | - "" 1668 | resources: 1669 | - nodes 1670 | verbs: 1671 | - get 1672 | - apiGroups: 1673 | - "" 1674 | resources: 1675 | - services 1676 | verbs: 1677 | - get 1678 | - list 1679 | - watch 1680 | - apiGroups: 1681 | - "extensions" 1682 | resources: 1683 | - ingresses 1684 | verbs: 1685 | - get 1686 | - list 1687 | - watch 1688 | - apiGroups: 1689 | - "" 1690 | resources: 1691 | - events 
1692 | verbs: 1693 | - create 1694 | - patch 1695 | - apiGroups: 1696 | - "extensions" 1697 | resources: 1698 | - ingresses/status 1699 | verbs: 1700 | - update 1701 | --- 1702 | apiVersion: rbac.authorization.k8s.io/v1beta1 1703 | kind: Role 1704 | metadata: 1705 | name: nginx-ingress-role 1706 | namespace: ingress-nginx 1707 | labels: 1708 | app.kubernetes.io/name: ingress-nginx 1709 | app.kubernetes.io/part-of: ingress-nginx 1710 | rules: 1711 | - apiGroups: 1712 | - "" 1713 | resources: 1714 | - configmaps 1715 | - pods 1716 | - secrets 1717 | - namespaces 1718 | verbs: 1719 | - get 1720 | - apiGroups: 1721 | - "" 1722 | resources: 1723 | - configmaps 1724 | resourceNames: 1725 | - "ingress-controller-leader-nginx" 1726 | verbs: 1727 | - get 1728 | - update 1729 | - apiGroups: 1730 | - "" 1731 | resources: 1732 | - configmaps 1733 | verbs: 1734 | - create 1735 | - apiGroups: 1736 | - "" 1737 | resources: 1738 | - endpoints 1739 | verbs: 1740 | - get 1741 | --- 1742 | apiVersion: rbac.authorization.k8s.io/v1beta1 1743 | kind: RoleBinding 1744 | metadata: 1745 | name: nginx-ingress-role-nisa-binding 1746 | namespace: ingress-nginx 1747 | labels: 1748 | app.kubernetes.io/name: ingress-nginx 1749 | app.kubernetes.io/part-of: ingress-nginx 1750 | roleRef: 1751 | apiGroup: rbac.authorization.k8s.io 1752 | kind: Role 1753 | name: nginx-ingress-role 1754 | subjects: 1755 | - kind: ServiceAccount 1756 | name: nginx-ingress-serviceaccount 1757 | namespace: ingress-nginx 1758 | --- 1759 | apiVersion: rbac.authorization.k8s.io/v1beta1 1760 | kind: ClusterRoleBinding 1761 | metadata: 1762 | name: nginx-ingress-clusterrole-nisa-binding 1763 | labels: 1764 | app.kubernetes.io/name: ingress-nginx 1765 | app.kubernetes.io/part-of: ingress-nginx 1766 | roleRef: 1767 | apiGroup: rbac.authorization.k8s.io 1768 | kind: ClusterRole 1769 | name: nginx-ingress-clusterrole 1770 | subjects: 1771 | - kind: ServiceAccount 1772 | name: nginx-ingress-serviceaccount 1773 | namespace: ingress-nginx 1774 | --- 1775 | apiVersion: apps/v1 1776 | kind: Deployment 1777 | metadata: 1778 | name: nginx-ingress-controller 1779 | namespace: ingress-nginx 1780 | labels: 1781 | app.kubernetes.io/name: ingress-nginx 1782 | app.kubernetes.io/part-of: ingress-nginx 1783 | spec: 1784 | replicas: 3 1785 | selector: 1786 | matchLabels: 1787 | app.kubernetes.io/name: ingress-nginx 1788 | app.kubernetes.io/part-of: ingress-nginx 1789 | template: 1790 | metadata: 1791 | labels: 1792 | app.kubernetes.io/name: ingress-nginx 1793 | app.kubernetes.io/part-of: ingress-nginx 1794 | annotations: 1795 | prometheus.io/port: "10254" 1796 | prometheus.io/scrape: "true" 1797 | spec: 1798 | hostNetwork: true 1799 | affinity: 1800 | nodeAffinity: 1801 | requiredDuringSchedulingIgnoredDuringExecution: 1802 | nodeSelectorTerms: 1803 | - matchExpressions: 1804 | - key: kubernetes.io/hostname 1805 | operator: In 1806 | # 指定部署到三台 master 上 1807 | values: 1808 | - m01 1809 | - m02 1810 | - m03 1811 | podAntiAffinity: 1812 | requiredDuringSchedulingIgnoredDuringExecution: 1813 | - labelSelector: 1814 | matchExpressions: 1815 | - key: app.kubernetes.io/name 1816 | operator: In 1817 | values: 1818 | - ingress-nginx 1819 | topologyKey: "kubernetes.io/hostname" 1820 | tolerations: 1821 | - key: node-role.kubernetes.io/master 1822 | effect: NoSchedule 1823 | serviceAccountName: nginx-ingress-serviceaccount 1824 | containers: 1825 | - name: nginx-ingress-controller 1826 | image: registry.cn-hangzhou.aliyuncs.com/google_containers/nginx-ingress-controller:0.21.0 
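          # Notes on the container spec that follows (comments added for clarity):
          # the three --*configmap flags point the controller at the ConfigMaps
          # created at the top of this file, and --annotations-prefix must match
          # the prefix used later in Ingress annotations (nginx.ingress.kubernetes.io/...).
          # Because hostNetwork is true, the controller binds ports 80/443 directly
          # on m01/m02/m03, which is why NET_BIND_SERVICE is granted to the
          # non-root user (uid 33) in the securityContext below.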
1827 | args: 1828 | - /nginx-ingress-controller 1829 | - --configmap=$(POD_NAMESPACE)/nginx-configuration 1830 | - --tcp-services-configmap=$(POD_NAMESPACE)/tcp-services 1831 | - --udp-services-configmap=$(POD_NAMESPACE)/udp-services 1832 | # - --publish-service=$(POD_NAMESPACE)/ingress-nginx 1833 | - --annotations-prefix=nginx.ingress.kubernetes.io 1834 | securityContext: 1835 | capabilities: 1836 | drop: 1837 | - ALL 1838 | add: 1839 | - NET_BIND_SERVICE 1840 | # www-data -> 33 1841 | runAsUser: 33 1842 | env: 1843 | - name: POD_NAME 1844 | valueFrom: 1845 | fieldRef: 1846 | fieldPath: metadata.name 1847 | - name: POD_NAMESPACE 1848 | valueFrom: 1849 | fieldRef: 1850 | fieldPath: metadata.namespace 1851 | ports: 1852 | - name: http 1853 | containerPort: 80 1854 | - name: https 1855 | containerPort: 443 1856 | livenessProbe: 1857 | failureThreshold: 3 1858 | httpGet: 1859 | path: /healthz 1860 | port: 10254 1861 | scheme: HTTP 1862 | initialDelaySeconds: 10 1863 | periodSeconds: 10 1864 | successThreshold: 1 1865 | timeoutSeconds: 1 1866 | readinessProbe: 1867 | failureThreshold: 3 1868 | httpGet: 1869 | path: /healthz 1870 | port: 10254 1871 | scheme: HTTP 1872 | periodSeconds: 10 1873 | successThreshold: 1 1874 | timeoutSeconds: 1 1875 | resources: 1876 | limits: 1877 | cpu: 1 1878 | memory: 1024Mi 1879 | requests: 1880 | cpu: 0.25 1881 | memory: 512Mi 1882 | ``` 1883 | 1884 | 部署 nginx ingress,执行命令 `kubectl apply -f nginx-ingress.yaml `,控制台打印如下: 1885 | 1886 | ``` 1887 | [kube@m01 shells]$ kubectl apply -f nginx-ingress.yaml 1888 | namespace/ingress-nginx created 1889 | configmap/nginx-configuration created 1890 | configmap/tcp-services created 1891 | configmap/udp-services created 1892 | serviceaccount/nginx-ingress-serviceaccount created 1893 | clusterrole.rbac.authorization.k8s.io/nginx-ingress-clusterrole created 1894 | role.rbac.authorization.k8s.io/nginx-ingress-role created 1895 | rolebinding.rbac.authorization.k8s.io/nginx-ingress-role-nisa-binding created 1896 | clusterrolebinding.rbac.authorization.k8s.io/nginx-ingress-clusterrole-nisa-binding created 1897 | deployment.apps/nginx-ingress-controller created 1898 | ``` 1899 | 1900 | 执行结果: 1901 | 1902 | ``` 1903 | [kube@m01 shells]$ kubectl get pods --all-namespaces 1904 | NAMESPACE NAME READY STATUS RESTARTS AGE 1905 | ingress-nginx nginx-ingress-controller-54dddf57ff-72mnn 1/1 Running 0 6m47s 1906 | ingress-nginx nginx-ingress-controller-54dddf57ff-7mj8f 0/1 Running 0 6m47s 1907 | ingress-nginx nginx-ingress-controller-54dddf57ff-fg2tm 1/1 Running 0 6m47s 1908 | kube-system coredns-6c67f849c7-2qc68 1/1 Running 0 3h14m 1909 | kube-system coredns-6c67f849c7-dps8h 1/1 Running 0 3h14m 1910 | kube-system etcd-m01 1/1 Running 8 37h 1911 | kube-system etcd-m02 1/1 Running 12 34h 1912 | kube-system etcd-m03 1/1 Running 0 33h 1913 | kube-system kube-apiserver-m01 1/1 Running 9 37h 1914 | kube-system kube-apiserver-m02 1/1 Running 0 33h 1915 | kube-system kube-apiserver-m03 1/1 Running 0 33h 1916 | kube-system kube-controller-manager-m01 1/1 Running 5 37h 1917 | kube-system kube-controller-manager-m02 1/1 Running 0 33h 1918 | kube-system kube-controller-manager-m03 1/1 Running 0 33h 1919 | kube-system kube-flannel-ds-amd64-7b86z 1/1 Running 0 35h 1920 | kube-system kube-flannel-ds-amd64-98qks 1/1 Running 0 33h 1921 | kube-system kube-flannel-ds-amd64-jkz27 1/1 Running 0 3h37m 1922 | kube-system kube-flannel-ds-amd64-ljcdp 1/1 Running 0 34h 1923 | kube-system kube-flannel-ds-amd64-s8vzs 1/1 Running 0 4h 1924 | kube-system kube-proxy-c4j4r 
1/1 Running 0 4h 1925 | kube-system kube-proxy-krnjq 1/1 Running 0 37h 1926 | kube-system kube-proxy-n9s8c 1/1 Running 0 3h37m 1927 | kube-system kube-proxy-scb25 1/1 Running 0 33h 1928 | kube-system kube-proxy-xp4rj 1/1 Running 0 34h 1929 | kube-system kube-scheduler-m01 1/1 Running 5 37h 1930 | kube-system kube-scheduler-m02 1/1 Running 0 33h 1931 | kube-system kube-scheduler-m03 1/1 Running 0 33h 1932 | kube-system kubernetes-dashboard-847f8cb7b8-p8rjn 1/1 Running 0 23m 1933 | kube-system metrics-server-8658466f94-sr479 1/1 Running 0 39m 1934 | ``` 1935 | 1936 | 1937 | ## 9. 部署 kubernetes-dashboard 1938 | 1939 | ### 9.1 Dashboard 配置 1940 | 1941 | 新建部署 dashboard 的资源配置文件:kubernetes-dashboard.yaml,内容如下: 1942 | 1943 | 1944 | ``` 1945 | apiVersion: v1 1946 | kind: Secret 1947 | metadata: 1948 | labels: 1949 | k8s-app: kubernetes-dashboard 1950 | name: kubernetes-dashboard-certs 1951 | namespace: kube-system 1952 | type: Opaque 1953 | --- 1954 | apiVersion: v1 1955 | kind: ServiceAccount 1956 | metadata: 1957 | labels: 1958 | k8s-app: kubernetes-dashboard 1959 | name: kubernetes-dashboard 1960 | namespace: kube-system 1961 | --- 1962 | kind: Role 1963 | apiVersion: rbac.authorization.k8s.io/v1 1964 | metadata: 1965 | name: kubernetes-dashboard-minimal 1966 | namespace: kube-system 1967 | rules: 1968 | # Allow Dashboard to create 'kubernetes-dashboard-key-holder' secret. 1969 | - apiGroups: [""] 1970 | resources: ["secrets"] 1971 | verbs: ["create"] 1972 | # Allow Dashboard to create 'kubernetes-dashboard-settings' config map. 1973 | - apiGroups: [""] 1974 | resources: ["configmaps"] 1975 | verbs: ["create"] 1976 | # Allow Dashboard to get, update and delete Dashboard exclusive secrets. 1977 | - apiGroups: [""] 1978 | resources: ["secrets"] 1979 | resourceNames: ["kubernetes-dashboard-key-holder", "kubernetes-dashboard-certs"] 1980 | verbs: ["get", "update", "delete"] 1981 | # Allow Dashboard to get and update 'kubernetes-dashboard-settings' config map. 1982 | - apiGroups: [""] 1983 | resources: ["configmaps"] 1984 | resourceNames: ["kubernetes-dashboard-settings"] 1985 | verbs: ["get", "update"] 1986 | # Allow Dashboard to get metrics from heapster. 
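# (heapster is the dashboard's legacy metrics backend; these rules come from the
#  upstream manifest and are harmless in this cluster, where metrics-server from
#  section 7 is used instead and no heapster Service exists.)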
1987 | - apiGroups: [""] 1988 | resources: ["services"] 1989 | resourceNames: ["heapster"] 1990 | verbs: ["proxy"] 1991 | - apiGroups: [""] 1992 | resources: ["services/proxy"] 1993 | resourceNames: ["heapster", "http:heapster:", "https:heapster:"] 1994 | verbs: ["get"] 1995 | --- 1996 | apiVersion: rbac.authorization.k8s.io/v1 1997 | kind: RoleBinding 1998 | metadata: 1999 | name: kubernetes-dashboard-minimal 2000 | namespace: kube-system 2001 | roleRef: 2002 | apiGroup: rbac.authorization.k8s.io 2003 | kind: Role 2004 | name: kubernetes-dashboard-minimal 2005 | subjects: 2006 | - kind: ServiceAccount 2007 | name: kubernetes-dashboard 2008 | namespace: kube-system 2009 | --- 2010 | kind: Deployment 2011 | apiVersion: apps/v1 2012 | metadata: 2013 | labels: 2014 | k8s-app: kubernetes-dashboard 2015 | name: kubernetes-dashboard 2016 | namespace: kube-system 2017 | spec: 2018 | replicas: 1 2019 | revisionHistoryLimit: 10 2020 | selector: 2021 | matchLabels: 2022 | k8s-app: kubernetes-dashboard 2023 | template: 2024 | metadata: 2025 | labels: 2026 | k8s-app: kubernetes-dashboard 2027 | spec: 2028 | containers: 2029 | - name: kubernetes-dashboard 2030 | # 使用阿里云的镜像 2031 | image: registry.cn-hangzhou.aliyuncs.com/google_containers/kubernetes-dashboard-amd64:v1.10.0 2032 | ports: 2033 | - containerPort: 8443 2034 | protocol: TCP 2035 | args: 2036 | - --auto-generate-certificates 2037 | volumeMounts: 2038 | - name: kubernetes-dashboard-certs 2039 | mountPath: /certs 2040 | # Create on-disk volume to store exec logs 2041 | - mountPath: /tmp 2042 | name: tmp-volume 2043 | livenessProbe: 2044 | httpGet: 2045 | scheme: HTTPS 2046 | path: / 2047 | port: 8443 2048 | initialDelaySeconds: 30 2049 | timeoutSeconds: 30 2050 | volumes: 2051 | - name: kubernetes-dashboard-certs 2052 | secret: 2053 | secretName: kubernetes-dashboard-certs 2054 | - name: tmp-volume 2055 | emptyDir: {} 2056 | serviceAccountName: kubernetes-dashboard 2057 | tolerations: 2058 | - key: node-role.kubernetes.io/master 2059 | effect: NoSchedule 2060 | --- 2061 | kind: Service 2062 | apiVersion: v1 2063 | metadata: 2064 | labels: 2065 | k8s-app: kubernetes-dashboard 2066 | name: kubernetes-dashboard 2067 | namespace: kube-system 2068 | spec: 2069 | ports: 2070 | - port: 443 2071 | targetPort: 8443 2072 | selector: 2073 | k8s-app: kubernetes-dashboard 2074 | --- 2075 | # 配置 ingress 配置,待会部署完 ingress 之后,就可以通过以下配置的域名访问 2076 | apiVersion: extensions/v1beta1 2077 | kind: Ingress 2078 | metadata: 2079 | name: dashboard-ingress 2080 | namespace: kube-system 2081 | annotations: 2082 | # 指定转发协议为 HTTPS,因为 ingress 默认转发协议是 HTTP,而 kubernetes-dashboard 默认是 HTTPS 2083 | nginx.ingress.kubernetes.io/backend-protocol: "HTTPS" 2084 | spec: 2085 | rules: 2086 | # 指定访问 dashboard 的域名 2087 | - host: dashboard.k8s.hiko.im 2088 | http: 2089 | paths: 2090 | - path: / 2091 | backend: 2092 | serviceName: kubernetes-dashboard 2093 | servicePort: 443 2094 | ``` 2095 | 2096 | 执行部署 kubernetes-dashboard,命令 ` kubectl apply -f kubernetes-dashboard.yaml `,控制台打印如下: 2097 | 2098 | ``` 2099 | [kube@m01 shells]$ kubectl apply -f kubernetes-dashboard.yaml 2100 | secret/kubernetes-dashboard-certs created 2101 | serviceaccount/kubernetes-dashboard created 2102 | role.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created 2103 | rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard-minimal created 2104 | deployment.apps/kubernetes-dashboard created 2105 | service/kubernetes-dashboard created 2106 | ingress.extensions/dashboard-ingress created 2107 | ``` 2108 | 2109 | 
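Before touching DNS or the local hosts file, you can sanity-check the resources that were just created straight from a master node. The sketch below is only a rough check, and it assumes the nginx-ingress-controller from section 8 is listening on port 80 of m01 (192.168.33.10) via hostNetwork and that the HTTP-only Ingress above is still in effect (the HTTPS variant comes in 9.2):

```
#!/bin/sh
# Rough post-deploy checks for the dashboard (run on any node with kubectl configured).

# Deployment, Service and Ingress created by kubernetes-dashboard.yaml
kubectl -n kube-system get deploy,svc -l k8s-app=kubernetes-dashboard
kubectl -n kube-system get ingress dashboard-ingress

# Wait until the dashboard Pod reports ready
kubectl -n kube-system rollout status deployment/kubernetes-dashboard

# Exercise the ingress rule without editing /etc/hosts by sending the Host
# header by hand; a 2xx/3xx status means the controller matched the
# dashboard-ingress rule and reached the backend.
curl -s -o /dev/null -w '%{http_code}\n' \
    -H 'Host: dashboard.k8s.hiko.im' http://192.168.33.10/
```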
查看部署结果: 2110 | 2111 | ``` 2112 | [kube@m01 ~]$ kubectl get pods --all-namespaces 2113 | NAMESPACE NAME READY STATUS RESTARTS AGE 2114 | kube-system coredns-6c67f849c7-2qc68 1/1 Running 0 172m 2115 | kube-system coredns-6c67f849c7-dps8h 1/1 Running 0 172m 2116 | kube-system etcd-m01 1/1 Running 8 37h 2117 | kube-system etcd-m02 1/1 Running 12 33h 2118 | kube-system etcd-m03 1/1 Running 0 33h 2119 | kube-system kube-apiserver-m01 1/1 Running 9 37h 2120 | kube-system kube-apiserver-m02 1/1 Running 0 33h 2121 | kube-system kube-apiserver-m03 1/1 Running 0 33h 2122 | kube-system kube-controller-manager-m01 1/1 Running 4 37h 2123 | kube-system kube-controller-manager-m02 1/1 Running 0 33h 2124 | kube-system kube-controller-manager-m03 1/1 Running 0 33h 2125 | kube-system kube-flannel-ds-amd64-7b86z 1/1 Running 0 35h 2126 | kube-system kube-flannel-ds-amd64-98qks 1/1 Running 0 33h 2127 | kube-system kube-flannel-ds-amd64-jkz27 1/1 Running 0 3h14m 2128 | kube-system kube-flannel-ds-amd64-ljcdp 1/1 Running 0 33h 2129 | kube-system kube-flannel-ds-amd64-s8vzs 1/1 Running 0 3h38m 2130 | kube-system kube-proxy-c4j4r 1/1 Running 0 3h38m 2131 | kube-system kube-proxy-krnjq 1/1 Running 0 37h 2132 | kube-system kube-proxy-n9s8c 1/1 Running 0 3h14m 2133 | kube-system kube-proxy-scb25 1/1 Running 0 33h 2134 | kube-system kube-proxy-xp4rj 1/1 Running 0 33h 2135 | kube-system kube-scheduler-m01 1/1 Running 4 37h 2136 | kube-system kube-scheduler-m02 1/1 Running 0 33h 2137 | kube-system kube-scheduler-m03 1/1 Running 0 33h 2138 | kube-system kubernetes-dashboard-847f8cb7b8-p8rjn 1/1 Running 0 62s 2139 | kube-system metrics-server-8658466f94-sr479 1/1 Running 0 17m 2140 | ``` 2141 | 2142 | 可以看到 kubernetes-dashboard 的 Pod 已经跑起来了。 2143 | 2144 | 查看 dashboard 的 ingress 配置`kubectl get ing --all-namespaces`: 2145 | 2146 | ``` 2147 | [kube@m01 ~]$ kubectl get ing --all-namespaces 2148 | NAMESPACE NAME HOSTS ADDRESS PORTS AGE 2149 | kube-system dashboard-ingress dashboard.k8s.hiko.im 80 36m 2150 | ``` 2151 | 2152 | 我们要从笔记本访问到这个 dashboard 服务,需要解析域名 dashboard.k8s.hiko.im 到 m01 m02 m03 的 IP,就可以使用 dashboard.k8s.hiko.im 访问 dashboard。 2153 | 2154 | 因为我们是从笔记本开虚拟机,所以要从宿主机(笔记本)访问,需要修改笔记本的本地 hosts, 添加一条记录: 2155 | 2156 | ``` 2157 | 192.168.33.10 dashboard.k8s.hiko.im 2158 | ``` 2159 | 2160 | 从浏览器访问:http://dashboard.k8s.hiko.im/ 2161 | 2162 | ![image.png](./images/dashboard-login.png) 2163 | 2164 | 到这里,服务都正常跑起来了。 2165 | 2166 | 但是,其实虽然这里能访问到登录页面,但是登录不进去 dashboard,这个问题我在 Github 上问了官方的开发,解决方式就是将 dashboard 的访问配置成 HTTPS(后面介绍,Github issue 地址:https://github.com/kubernetes/dashboard/issues/3464 )。 2167 | 2168 | ### 9.2 HTTPS 访问 Dashboard 2169 | 2170 | 由于通过 HTTP 访问 dashboard 会无法登录进去 dashboard 的问题,所以这里我们将 dashboard 的服务配置成 HTTPS 进行访问。 2171 | 2172 | 总共三步: 2173 | 2174 | **1. 签证书(或者使用权威的证书机构颁发的证书)** 2175 | 2176 | 这里演示的是通过自签证书。 2177 | 2178 | command to sign certifications: 2179 | ``` 2180 | openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /tmp/k8s.hiko.im.key -out /tmp/k8s.hiko.im.crt -subj "/CN=*.hiko.im" 2181 | ``` 2182 | 2183 | 可以看到 `/tmp/` 目录下已经生成了crt 和 key 文件。 2184 | 2185 | ``` 2186 | [kube@m01 ~]$ ll /tmp/| grep k8s 2187 | -rw-rw-r--. 1 kube kube 1094 Dec 23 03:01 k8s.hiko.im.crt 2188 | -rw-rw-r--. 1 kube kube 1704 Dec 23 03:01 k8s.hiko.im.key 2189 | ``` 2190 | 2191 | **2. 
Create a Kubernetes TLS Secret**

```
kubectl -n kube-system create secret tls secret-ca-k8s-hiko-im --key /tmp/k8s.hiko.im.key --cert /tmp/k8s.hiko.im.crt
```

The command prints:

```
[kube@m01 v1.13]$ kubectl -n kube-system create secret tls secret-ca-k8s-hiko-im --key /tmp/k8s.hiko.im.key --cert /tmp/k8s.hiko.im.crt
secret/secret-ca-k8s-hiko-im created
```

**3. Configure the dashboard Ingress for HTTPS access**

Edit `kubernetes-dashboard.yaml` and change its Ingress section to support HTTPS, as follows:

```
... omitted ...

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: dashboard-ingress
  namespace: kube-system
  annotations:
    # Redirect plain HTTP requests to HTTPS
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/rewrite-target: /
    # Forward to the backend over HTTPS: the ingress default is HTTP,
    # while kubernetes-dashboard only serves HTTPS
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  # Use the TLS secret created in the previous step
  tls:
  - secretName: secret-ca-k8s-hiko-im
  rules:
  # The domain used to reach the dashboard
  - host: dashboard.k8s.hiko.im
    http:
      paths:
      - path: /
        backend:
          serviceName: kubernetes-dashboard
          servicePort: 443
```

Apply the change with `kubectl apply -f kubernetes-dashboard.yaml`.

Note: the complete file is available as [kubernetes-dashboard-https.yaml](https://github.com/HikoQiu/kubeadm-install-k8s/blob/master/tools/v1.13/kubernetes-dashboard-https.yaml).


### 9.3 Log in to the Dashboard

Logging in to the dashboard takes three steps (don't worry, a single script covers all of them):

1. Create a service account (a ServiceAccount, `sa` for short).
2. Create a ClusterRoleBinding that binds the account from step 1 to the cluster-admin role.
3. Read the secret that belongs to that account; the `token` field inside the secret is what you log in with.

Run the following script to obtain the login token:

```
## Script: create.dashboard.token.sh

#!/bin/sh

kubectl create sa dashboard-admin -n kube-system
kubectl create clusterrolebinding dashboard-admin --clusterrole=cluster-admin --serviceaccount=kube-system:dashboard-admin
ADMIN_SECRET=$(kubectl get secrets -n kube-system | grep dashboard-admin | awk '{print $1}')
DASHBOARD_LOGIN_TOKEN=$(kubectl describe secret -n kube-system ${ADMIN_SECRET} | grep -E '^token' | awk '{print $2}')
echo ${DASHBOARD_LOGIN_TOKEN}
```

Copy the token and use it to log in. A sample token looks like this:

```
eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlLXN5c3RlbSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJkYXNoYm9hcmQtYWRtaW4tdG9rZW4tNWtnZHoiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoiZGFzaGJvYXJkLWFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiYWQxNDAyMjQtMDYxNC0xMWU5LTkxMDgtNTI1NDAwODQ4MWQ1Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50Omt1YmUtc3lzdGVtOmRhc2hib2FyZC1hZG1pbiJ9.ry4xYI6TFF6J8xXsilu0qhuBeRjSNqVPq3OUzl62Ad3e2wM-biC5pPlKNmJLfJzurxnQrqp59VjmVeTA8BZiF7S6hqlrk8XE9_LFlItUvq3rp5wFuhJuVol8Yoi4UJFzUYQF6baH0O3R10aK33g2WmWLIg79OFAkeMMHrLthbL2pc_p_kG13_qDXlEuVgnIAFsKzxnrCCUfZ2GwGsHEFEqTGBCb0u6x3AZqfQgbN3DALkjjNTyTLP5Ok-LJ3Ug8SZZQBksvTeXCGXZDfk2LDDIvp_DyM7nTL3CTT5cQ3g4aBTFAae47NAkQkmjZg0mxvJH0xVnxrvXLND8FLLkzMxg
```

After a successful login you will see:

![kubernetes dashboard](./images/dashboard-index.png)


### 9.4 The 404 problem

If you use the configuration above, you should not run into this problem.

After finishing the previous step, my test requests were answered with 404.

Checking the kubernetes-dashboard Pod logs (`kubectl logs kubernetes-dashboard-847f8cb7b8-p8rjn -n kube-system`) showed errors like:

```
2018/12/23 15:48:04 http: TLS handshake error from 10.244.0.0:57984: tls: first record does not look like a TLS handshake
2018/12/23 15:48:08 http: TLS handshake error from 10.244.0.0:58000: tls: first record does not look like a TLS handshake
```

The fix, once more, came from a GitHub issue: https://github.com/helm/charts/issues/4204

Root cause: the ingress was forwarding plain HTTP to the backend while the dashboard expects HTTPS, so the ingress needs the annotation `nginx.ingress.kubernetes.io/secure-backends: "true"`.

In newer versions of ingress-nginx that annotation has been replaced by `nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"`.

The resulting configuration (the kubernetes-dashboard.yaml shown earlier already contains this fix):

```
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: dashboard-ingress
  namespace: kube-system
  annotations:
    # Forward to the backend over HTTPS: the ingress default is HTTP,
    # while kubernetes-dashboard only serves HTTPS
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
spec:
  rules:
  # The domain used to reach the dashboard
  - host: dashboard.k8s.hiko.im
    http:
      paths:
      - path: /
        backend:
          serviceName: kubernetes-dashboard
          servicePort: 443
```
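With the backend-protocol fix in place, a quick way to confirm both the TLS termination and the HTTPS backend is to inspect the Ingress and probe it with curl. This is a minimal sketch under the same assumptions as before: the ingress controller is bound to ports 80/443 on m01 (192.168.33.10), and the certificate is the self-signed one from 9.2, hence `-k`.

```
#!/bin/sh
# Verify the HTTPS dashboard ingress (self-signed certificate, so use -k).
# Adjust the IP if you test against m02/m03 instead of m01.

# The Ingress should now show the tls section and the HTTPS annotations
kubectl -n kube-system describe ingress dashboard-ingress

# Plain HTTP should be redirected to HTTPS (ssl-redirect: "true")
curl -s -o /dev/null -w '%{http_code}\n' \
    -H 'Host: dashboard.k8s.hiko.im' http://192.168.33.10/

# HTTPS should return the login page (200), not a 404, and the dashboard
# Pod log should stay free of "TLS handshake error" messages
curl -k -s -o /dev/null -w '%{http_code}\n' \
    --resolve dashboard.k8s.hiko.im:443:192.168.33.10 \
    https://dashboard.k8s.hiko.im/
```

--------------------------------------------------------------------------------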