Environment:
Hostname | OS version | IP | Docker version | kubelet version | Spec (CPU/RAM) | Notes |
---|---|---|---|---|---|---|
master | CentOS 7.6.1810 | 172.27.9.131 | Docker 18.09.6 | v1.14.2 | 2C2G | master node |
node01 | CentOS 7.6.1810 | 172.27.9.135 | Docker 18.09.6 | v1.14.2 | 2C2G | worker node |
node02 | CentOS 7.6.1810 | 172.27.9.136 | Docker 18.09.6 | v1.14.2 | 2C2G | worker node |
For the k8s cluster deployment, see: Deploying a k8s (v1.14.2) cluster on CentOS 7.6
For k8s learning material, see: Basic concepts, kubectl commands, and shared resources
For emptyDir, see: Volumes and Persistent Storage
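I. Background
When a node needs work such as patching or an operating system upgrade, it has to be taken offline for maintenance, which means the pods on it must be evicted and rescheduled elsewhere. This article walks through the whole node maintenance process.
II. Introduction to PDB
PDB is short for PodDisruptionBudget (resource name poddisruptionbudgets); it protects an application against voluntary disruptions such as evictions.
Without a PDB: if several pods of a service run on the node under maintenance, taking the node down can cause an outage or degrade the service. For example, suppose a service has 5 pods and needs at least 3 to maintain service quality (otherwise responses slow down), and 4 of its pods happen to be on node01. If node01 is stopped for maintenance, only 1 pod is left serving traffic, and the service is affected for as long as the other 4 pods are being rescheduled.
With a PDB: the application keeps at least a specified number of pods running during node maintenance, so service quality is preserved.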
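The arithmetic behind the budget can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the cluster; the function name allowed_disruptions is made up for this example, and it assumes the budget is expressed as an absolute minAvailable count.

```python
def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Return how many pods may be voluntarily evicted right now.

    Simplified model: the budget is the number of healthy pods above
    the minAvailable floor, and it can never be negative.
    """
    return max(0, healthy_pods - min_available)


# The example from the text: 5 pods, at least 3 must stay up.
print(allowed_disruptions(5, 3))        # 2

# Stopping node01 outright would take 4 pods down at once,
# which is more than the budget allows.
print(4 <= allowed_disruptions(5, 3))   # False
```

With a budget of 2, the 4 pods on node01 would have to leave in at least two rounds, with replacements becoming ready in between.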
III. Preparation
1. Create the pods
- [root@master ~]# more nginx-master.YAML
- apiVersion: extensions/v1beta1
- kind: Deployment
- metadata:
-   name: nginx-master
- spec:
-   replicas: 10
-   template:
-     metadata:
-       labels:
-         App: nginx
-     spec:
-       restartPolicy: Always
-       containers:
-       - name: nginx
-         image: nginx:latest
- [root@master ~]# kubectl apply -f nginx-master.YAML
- deployment.extensions/nginx-master created
- [root@master ~]# kubectl get po -o wide
- NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
- nginx-master-9d4cf4f77-47vfj 1/1 Running 0 28s 10.244.0.129 master <none> <none>
- nginx-master-9d4cf4f77-69jn6 1/1 Running 0 28s 10.244.2.206 node02 <none> <none>
- nginx-master-9d4cf4f77-6drhg 1/1 Running 0 28s 10.244.1.218 node01 <none> <none>
- nginx-master-9d4cf4f77-b7zfd 1/1 Running 0 28s 10.244.1.219 node01 <none> <none>
- nginx-master-9d4cf4f77-fxsjd 1/1 Running 0 28s 10.244.2.204 node02 <none> <none>
- nginx-master-9d4cf4f77-ktnvk 1/1 Running 0 28s 10.244.0.128 master <none> <none>
- nginx-master-9d4cf4f77-mzrx7 1/1 Running 0 28s 10.244.1.217 node01 <none> <none>
- nginx-master-9d4cf4f77-pcznk 1/1 Running 0 28s 10.244.2.203 node02 <none> <none>
- nginx-master-9d4cf4f77-px98b 1/1 Running 0 28s 10.244.2.205 node02 <none> <none>
- nginx-master-9d4cf4f77-wtcwt 1/1 Running 0 28s 10.244.1.220 node01 <none> <none>
This creates a deployment named nginx-master with 10 replicas, using the latest nginx image. The 10 pods are spread across three hosts: node01, node02 and master.
2. Create the PDB
- [root@master ~]# more pdb-nginx.YAML
- apiVersion: policy/v1beta1
- kind: PodDisruptionBudget
- metadata:
-   name: pdb-nginx
- spec:
-   minAvailable: 9
-   selector:
-     matchLabels:
-       App: nginx
- [root@master ~]# kubectl apply -f pdb-nginx.YAML
- poddisruptionbudget.policy/pdb-nginx created
- [root@master ~]# kubectl get pdb
- NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
- pdb-nginx 9 N/A 1 8s
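This creates a PDB named pdb-nginx. Its label selector is App: nginx, the same label the deployment puts on its pods, and minAvailable: 9 means at least 9 nginx pods must remain available. With 10 pods running, that leaves 1 allowed disruption, as the ALLOWED DISRUPTIONS column shows.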
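The PDB only protects pods whose labels match its selector, so the two manifests have to agree exactly. A minimal Python sketch of matchLabels semantics follows; the helper name matches and the extra tier label are hypothetical and exist only for this illustration.

```python
def matches(selector: dict, pod_labels: dict) -> bool:
    """matchLabels semantics: every selector key/value must be present on the pod."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


pdb_selector = {"App": "nginx"}

# A pod carrying the deployment's label (plus a hypothetical extra label) is covered.
print(matches(pdb_selector, {"App": "nginx", "tier": "web"}))  # True

# Label keys are case-sensitive, so "app" is not the same key as "App".
print(matches(pdb_selector, {"app": "nginx"}))                 # False
```

If the selector matched no pods, the budget would simply not apply to the deployment, which is why the label is spelled identically in both files.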
IV. Node maintenance
This article uses node02 as the example.
1. Mark the node unschedulable
- [root@master ~]# kubectl cordon node02
- node/node02 cordoned
- [root@master ~]# kubectl get node
- NAME STATUS ROLES AGE VERSION
- master Ready master 184d v1.14.2
- node01 Ready <none> 183d v1.14.2
- node02 Ready,SchedulingDisabled <none> 182d v1.14.2
- [root@master ~]# kubectl get po -o wide
- NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
- nginx-master-9d4cf4f77-47vfj 1/1 Running 0 30m 10.244.0.129 master <none> <none>
- nginx-master-9d4cf4f77-69jn6 1/1 Running 0 30m 10.244.2.206 node02 <none> <none>
- nginx-master-9d4cf4f77-6drhg 1/1 Running 0 30m 10.244.1.218 node01 <none> <none>
- nginx-master-9d4cf4f77-b7zfd 1/1 Running 0 30m 10.244.1.219 node01 <none> <none>
- nginx-master-9d4cf4f77-fxsjd 1/1 Running 0 30m 10.244.2.204 node02 <none> <none>
- nginx-master-9d4cf4f77-ktnvk 1/1 Running 0 30m 10.244.0.128 master <none> <none>
- nginx-master-9d4cf4f77-mzrx7 1/1 Running 0 30m 10.244.1.217 node01 <none> <none>
- nginx-master-9d4cf4f77-pcznk 1/1 Running 0 30m 10.244.2.203 node02 <none> <none>
- nginx-master-9d4cf4f77-px98b 1/1 Running 0 30m 10.244.2.205 node02 <none> <none>
- nginx-master-9d4cf4f77-wtcwt 1/1 Running 0 30m 10.244.1.220 node01 <none> <none>
After node02 is cordoned, its status shows SchedulingDisabled. The scheduler no longer places new pods on the node, but the pods already on node02 keep running normally.
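The effect of cordon can be pictured as a filter applied before any new pod is placed. The sketch below is a toy model in Python, not scheduler code; the Node class and the schedulable helper are assumptions made for this example, with the unschedulable field standing in for the flag that cordon sets on the node.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    unschedulable: bool = False  # stands in for the flag that cordon sets


def schedulable(nodes):
    """Return the names of nodes that may still receive new pods."""
    return [node.name for node in nodes if not node.unschedulable]


nodes = [Node("master"), Node("node01"), Node("node02")]
nodes[2].unschedulable = True  # the equivalent of cordoning node02

print(schedulable(nodes))  # ['master', 'node01']
```

Nothing in this filter touches pods that are already running, which matches the output above: node02 still hosts its 4 pods after the cordon.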
2. Evict the pods on the node
- [root@master ~]# kubectl drain node02 --delete-local-data --ignore-daemonsets --force
- node/node02 already cordoned
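Flag descriptions:
--delete-local-data: evict pods even if they use emptyDir volumes; the emptyDir data is deleted along with the pod. (Later kubectl releases rename this flag to --delete-emptydir-data.)
--ignore-daemonsets: skip pods managed by a DaemonSet. The DaemonSet controller ignores the unschedulable mark, so a deleted DaemonSet pod would be started again on the same node right away and the drain would loop forever; without this flag, drain refuses to proceed when such pods are present.
--force: without it, drain only handles pods created by a ReplicationController, ReplicaSet, DaemonSet, StatefulSet or Job, and refuses to proceed if the node also has bare pods (pods not bound to any controller). With --force those bare pods are deleted as well, and nothing recreates them.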
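How the three flags interact can be summarized as a small decision procedure. The Python below is a simplified model of the behavior described above, not kubectl's implementation; the Pod class, the drain_plan function and all pod names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pod:
    name: str
    owner: Optional[str] = None   # e.g. "ReplicaSet" or "DaemonSet"; None means a bare pod
    uses_emptydir: bool = False


def drain_plan(pods: List[Pod], delete_local_data: bool = False,
               ignore_daemonsets: bool = False, force: bool = False) -> List[str]:
    """Return the pods a drain would evict, or raise if it would refuse to start."""
    to_evict = []
    for pod in pods:
        if pod.owner == "DaemonSet":
            if not ignore_daemonsets:
                raise RuntimeError(f"{pod.name}: DaemonSet-managed, use --ignore-daemonsets")
            continue  # DaemonSet pods are skipped, never evicted
        if pod.owner is None and not force:
            raise RuntimeError(f"{pod.name}: bare pod, use --force")
        if pod.uses_emptydir and not delete_local_data:
            raise RuntimeError(f"{pod.name}: uses emptyDir, use --delete-local-data")
        to_evict.append(pod.name)
    return to_evict


pods = [
    Pod("nginx-a", owner="ReplicaSet"),
    Pod("net-agent", owner="DaemonSet"),
    Pod("debug-shell"),                                   # bare pod
    Pod("cache", owner="ReplicaSet", uses_emptydir=True),
]
print(drain_plan(pods, delete_local_data=True, ignore_daemonsets=True, force=True))
# ['nginx-a', 'debug-shell', 'cache']
```

Calling drain_plan(pods) with no flags raises on the first blocking pod, which mirrors why the command in this article passes all three flags.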
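During the drain, only one pod is being moved at any given moment, so 9 pods are always serving traffic. A pod is not literally moved: it is evicted from node02 and the ReplicaSet creates a replacement on another node.
Move pod nginx-master-9d4cf4f77-pcznk to node01.
Move pod nginx-master-9d4cf4f77-px98b to master; by this point the previous pod, nginx-master-9d4cf4f77-pcznk, has already finished moving.
Move pod nginx-master-9d4cf4f77-69jn6 to master.
Move pod nginx-master-9d4cf4f77-fxsjd to master.
This confirms again that only one pod is moved at a time and that the nginx service always has 9 pods serving traffic.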
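The one-at-a-time behavior follows directly from the numbers: 10 pods, minAvailable 9, 4 pods on node02. A rough Python simulation is given below. It assumes each evicted pod's replacement becomes ready on another node before the next round of evictions starts, which is a simplification; simulate_drain is a name invented for this sketch.

```python
def simulate_drain(pods_on_node: int, total: int, min_available: int):
    """Simulate a PDB-respecting drain.

    Returns (rounds, lowest_healthy): how many eviction rounds were needed
    and the lowest number of healthy pods seen at any point.
    """
    healthy = total
    lowest = healthy
    remaining = pods_on_node
    rounds = 0
    while remaining > 0:
        allowed = healthy - min_available
        if allowed <= 0:
            raise RuntimeError("the budget leaves no room for any eviction")
        batch = min(allowed, remaining)
        healthy -= batch              # evicted pods stop serving
        lowest = min(lowest, healthy)
        remaining -= batch
        healthy += batch              # replacements become ready elsewhere (assumed)
        rounds += 1
    return rounds, lowest


print(simulate_drain(pods_on_node=4, total=10, min_available=9))  # (4, 9)
```

Four rounds of one pod each, never dropping below 9 healthy pods, is the same pattern seen in the four moves listed above.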
3. Finish maintenance
- [root@master ~]# kubectl uncordon node02
- node/node02 uncordoned
- [root@master ~]# kubectl get nodes
- NAME STATUS ROLES AGE VERSION
- master Ready master 184d v1.14.2
- node01 Ready <none> 183d v1.14.2
- node02 Ready <none> 183d v1.14.2
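Maintenance is complete, so node02 is made schedulable again.
V. Moving pods back
There does not seem to be a good way to move pods back yet; here pods are deleted and left to be recreated.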
- [root@master ~]# kubectl get po -o wide
- NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
- nginx-master-9d4cf4f77-2vnvk 1/1 Running 0 33m 10.244.1.222 node01 <none> <none>
- nginx-master-9d4cf4f77-47vfj 1/1 Running 0 73m 10.244.0.129 master <none> <none>
- nginx-master-9d4cf4f77-6drhg 1/1 Running 0 73m 10.244.1.218 node01 <none> <none>
- nginx-master-9d4cf4f77-7n7pt 1/1 Running 0 32m 10.244.0.131 master <none> <none>
- nginx-master-9d4cf4f77-b7zfd 1/1 Running 0 73m 10.244.1.219 node01 <none> <none>
- nginx-master-9d4cf4f77-ktnvk 1/1 Running 0 73m 10.244.0.128 master <none> <none>
- nginx-master-9d4cf4f77-mzrx7 1/1 Running 0 73m 10.244.1.217 node01 <none> <none>
- nginx-master-9d4cf4f77-pdkst 1/1 Running 0 32m 10.244.0.130 master <none> <none>
- nginx-master-9d4cf4f77-pskmp 1/1 Running 0 32m 10.244.0.132 master <none> <none>
- nginx-master-9d4cf4f77-wtcwt 1/1 Running 0 73m 10.244.1.220 node01 <none> <none>
- [root@master ~]# kubectl delete po nginx-master-9d4cf4f77-47vfj
- pod "nginx-master-9d4cf4f77-47vfj" deleted
- [root@master ~]# kubectl delete po nginx-master-9d4cf4f77-2vnvk
- pod "nginx-master-9d4cf4f77-2vnvk" deleted
- [root@master ~]# kubectl get po -o wide
- NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
- nginx-master-9d4cf4f77-6drhg 1/1 Running 0 76m 10.244.1.218 node01 <none> <none>
- nginx-master-9d4cf4f77-7n7pt 1/1 Running 0 35m 10.244.0.131 master <none> <none>
- nginx-master-9d4cf4f77-b7zfd 1/1 Running 0 76m 10.244.1.219 node01 <none> <none>
- nginx-master-9d4cf4f77-f92hp 1/1 Running 0 44s 10.244.2.207 node02 <none> <none>
- nginx-master-9d4cf4f77-ktnvk 1/1 Running 0 76m 10.244.0.128 master <none> <none>
- nginx-master-9d4cf4f77-mzrx7 1/1 Running 0 76m 10.244.1.217 node01 <none> <none>
- nginx-master-9d4cf4f77-pdkst 1/1 Running 0 35m 10.244.0.130 master <none> <none>
- nginx-master-9d4cf4f77-pskmp 1/1 Running 0 35m 10.244.0.132 master <none> <none>
- nginx-master-9d4cf4f77-tdghn 1/1 Running 0 15s 10.244.2.208 node02 <none> <none>
- nginx-master-9d4cf4f77-wtcwt 1/1 Running 0 76m 10.244.1.220 node01 <none> <none>
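During off-peak hours, delete the pods nginx-master-9d4cf4f77-47vfj and nginx-master-9d4cf4f77-2vnvk. Because every pod on node02 was evicted earlier, it has the lowest resource usage at this point, so the recreated pods are scheduled onto it, which completes the move back.
VI. Removing a node
1. Remove the node
In day-to-day operations a node sometimes has to be removed from the cluster. Using node02 as the example again, the steps are as follows.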
- [root@master ~]# kubectl cordon node02
- [root@master ~]# kubectl drain node02 --delete-local-data --ignore-daemonsets --force
- [root@master ~]# kubectl delete node node02
- [root@node02 ~]# kubeadm reset
2. Rejoin the node
Run on the master node:
- [root@master ~]# kubeadm token create --print-join-command
- kubeadm join 172.27.9.131:6443 --token kpz40z.tuxb4t4m1q37vwl1 --discovery-token-ca-cert-hash sha256:5f656ae26b5e7d4641a979cbfdffeb7845cc5962bbfcd1d5435f00a25c02ea50
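Rejoin node02 to the cluster by running the printed join command on node02 (the token in the capture below differs from the one printed above; use the token from your own output):
- [root@node02 ~]# kubeadm join 172.27.9.131:6443 --token svrip0.lajrfl4jgal0ul6i --discovery-token-ca-cert-hash sha256:5f656ae26b5e7d4641a979cbfdffeb7845cc5962bbfcd1d5435f00a25c02ea50
Check the nodes
All scripts and configuration files from this article have been uploaded: Pode Eviction and Node Manage
Source: http://blog.51cto.com/3241766/2456338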