Late-night SOS: k8s certificates expired and the cluster can't be managed! Don't panic, one command rescues your k8s!

Author: 运维有术
Published: 2025-06-15 12:13:55
Column: 运维有术

Cloud-native operations in practice, 2025: article 13 of a planned X originals | KubeSphere Best Practices "2025" series, article 3

Hi everyone, I'm 术哥, an advocate focused on cloud-native and AI technology. As a KubeSphere Ambassador and Milvus 北辰使者, I'm glad to share my experience with you here at 运维有术.

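Today I ran into an interesting problem. I was about to run an experiment on a Kubernetes cluster, deployed with KubeKey, that had sat idle for a month, and found that the cluster management commands no longer worked at all. Here is what happened: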

[root@ksp-control-1 ~]# kubectl get nodes
E0613 08:22:45.268267    2335 memcache.go:265] couldn't get current server API group list: Get "https://lb.opsxlab.cn:6443/api?timeout=32s": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2025-06-13T08:22:45+08:00 is after 2025-05-22T06:15:43Z
E0613 08:22:45.271798    2335 memcache.go:265] couldn't get current server API group list: Get "https://lb.opsxlab.cn:6443/api?timeout=32s": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2025-06-13T08:22:45+08:00 is after 2025-05-22T06:15:43Z
E0613 08:22:45.274977    2335 memcache.go:265] couldn't get current server API group list: Get "https://lb.opsxlab.cn:6443/api?timeout=32s": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2025-06-13T08:22:45+08:00 is after 2025-05-22T06:15:43Z
E0613 08:22:45.278363    2335 memcache.go:265] couldn't get current server API group list: Get "https://lb.opsxlab.cn:6443/api?timeout=32s": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2025-06-13T08:22:45+08:00 is after 2025-05-22T06:15:43Z
E0613 08:22:45.281713    2335 memcache.go:265] couldn't get current server API group list: Get "https://lb.opsxlab.cn:6443/api?timeout=32s": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2025-06-13T08:22:45+08:00 is after 2025-05-22T06:15:43Z
Unable to connect to the server: tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2025-06-13T08:22:45+08:00 is after 2025-05-22T06:15:43Z

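This is a good opportunity to look closely at how to handle expired certificates on a Kubernetes cluster deployed with KubeKey. Treating it as a typical case, let's walk through renewing the cluster's certificates quickly and safely.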

1. Fixing the expired certificates

1.1 Check certificate expiration

  • Check with kubeadm
$ kubeadm certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration

CERTIFICATE                         EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                          May 22, 2025 06:15 UTC   <invalid>       ca                      no
apiserver                           May 22, 2025 06:15 UTC   <invalid>       ca                      no
!MISSING! apiserver-etcd-client
apiserver-kubelet-client            May 22, 2025 06:15 UTC   <invalid>       ca                      no
controller-manager.conf             May 22, 2025 06:15 UTC   <invalid>       ca                      no
!MISSING! etcd-healthcheck-client
!MISSING! etcd-peer
!MISSING! etcd-server
front-proxy-client                  May 22, 2025 06:15 UTC   <invalid>       front-proxy-ca          no
scheduler.conf                      May 22, 2025 06:15 UTC   <invalid>       ca                      no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      May 20, 2034 06:15 UTC   8y              no
!MISSING! etcd-ca
front-proxy-ca          May 20, 2034 06:15 UTC   8y              no
  • Check with kk
$ ./kk certs check-expiration -f ksp-v341-v1288.yaml


 _   __      _          _   __
| | / /     | |        | | / /
| |/ / _   _| |__   ___| |/ /  ___ _   _
|    \| | | | '_ \ / _ \    \ / _ \ | | |
| |\  \ |_| | |_) |  __/ |\  \  __/ |_| |
\_| \_/\__,_|_.__/ \___\_| \_/\___|\__, |
                                    __/ |
                                   |___/

08:27:12 CST [GreetingsModule] Greetings
08:27:13 CST message: [ksp-control-3]
Greetings, KubeKey!
08:27:13 CST message: [ksp-gpu-worker-2]
Greetings, KubeKey!
08:27:13 CST message: [ksp-worker-2]
Greetings, KubeKey!
08:27:13 CST message: [ksp-storage-3]
Greetings, KubeKey!
08:27:14 CST message: [ksp-worker-1]
Greetings, KubeKey!
08:27:14 CST message: [ksp-storage-1]
Greetings, KubeKey!
08:27:14 CST message: [ksp-control-2]
Greetings, KubeKey!
08:27:14 CST message: [ksp-control-1]
Greetings, KubeKey!
08:27:14 CST message: [ksp-worker-3]
Greetings, KubeKey!
08:27:14 CST message: [ksp-storage-2]
Greetings, KubeKey!
08:27:15 CST message: [ksp-gpu-worker-1]
Greetings, KubeKey!
08:27:15 CST success: [ksp-control-3]
08:27:15 CST success: [ksp-gpu-worker-2]
08:27:15 CST success: [ksp-worker-2]
08:27:15 CST success: [ksp-storage-3]
08:27:15 CST success: [ksp-worker-1]
08:27:15 CST success: [ksp-storage-1]
08:27:15 CST success: [ksp-control-2]
08:27:15 CST success: [ksp-control-1]
08:27:15 CST success: [ksp-worker-3]
08:27:15 CST success: [ksp-storage-2]
08:27:15 CST success: [ksp-gpu-worker-1]
08:27:15 CST [CheckCertsModule] Check cluster certs
08:27:16 CST success: [ksp-control-1]
08:27:16 CST success: [ksp-control-3]
08:27:16 CST success: [ksp-control-2]
08:27:16 CST [PrintClusterCertsModule] Display cluster certs form
CERTIFICATE                    EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   NODE
apiserver.crt                  May 22, 2025 06:15 UTC   <invalid>       ca                      ksp-control-1
apiserver-kubelet-client.crt   May 22, 2025 06:15 UTC   <invalid>       ca                      ksp-control-1
front-proxy-client.crt         May 22, 2025 06:15 UTC   <invalid>       front-proxy-ca          ksp-control-1
admin.conf                     May 22, 2025 06:15 UTC   <invalid>                               ksp-control-1
controller-manager.conf        May 22, 2025 06:15 UTC   <invalid>                               ksp-control-1
scheduler.conf                 May 22, 2025 06:15 UTC   <invalid>                               ksp-control-1
apiserver.crt                  May 22, 2025 06:17 UTC   <invalid>       ca                      ksp-control-2
apiserver-kubelet-client.crt   May 22, 2025 06:17 UTC   <invalid>       ca                      ksp-control-2
front-proxy-client.crt         May 22, 2025 06:17 UTC   <invalid>       front-proxy-ca          ksp-control-2
admin.conf                     May 22, 2025 06:17 UTC   <invalid>                               ksp-control-2
controller-manager.conf        May 22, 2025 06:17 UTC   <invalid>                               ksp-control-2
scheduler.conf                 May 22, 2025 06:17 UTC   <invalid>                               ksp-control-2
apiserver.crt                  May 22, 2025 06:17 UTC   <invalid>       ca                      ksp-control-3
apiserver-kubelet-client.crt   May 22, 2025 06:17 UTC   <invalid>       ca                      ksp-control-3
front-proxy-client.crt         May 22, 2025 06:17 UTC   <invalid>       front-proxy-ca          ksp-control-3
admin.conf                     May 22, 2025 06:17 UTC   <invalid>                               ksp-control-3
controller-manager.conf        May 22, 2025 06:17 UTC   <invalid>                               ksp-control-3
scheduler.conf                 May 22, 2025 06:17 UTC   <invalid>                               ksp-control-3

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   NODE
ca.crt                  May 20, 2034 06:15 UTC   8y              ksp-control-1
front-proxy-ca.crt      May 20, 2034 06:15 UTC   8y              ksp-control-1
ca.crt                  May 20, 2034 06:15 UTC   8y              ksp-control-2
front-proxy-ca.crt      May 20, 2034 06:15 UTC   8y              ksp-control-2
ca.crt                  May 20, 2034 06:15 UTC   8y              ksp-control-3
front-proxy-ca.crt      May 20, 2034 06:15 UTC   8y              ksp-control-3
08:27:16 CST success: [LocalHost]
08:27:16 CST Pipeline[CheckCertsPipeline] execute successfully

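Note: ksp-v341-v1288.yaml is the configuration file that was used to deploy the cluster. The certificates expired on May 22, 2025, at 06:15 UTC on ksp-control-1 and at 06:17 UTC on ksp-control-2 and ksp-control-3.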
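
In the output above, kubeadm reports the etcd certificates as !MISSING!, and the kk table does not list them either. The backup listing in section 1.2 shows PEM files under /etc/ssl/etcd/ssl. Assuming those are the etcd certificates, a minimal sketch for printing their expiry dates with openssl could look like this. The function name list_cert_expiry is made up for illustration, and the default path is an assumption taken from that listing.

```shell
#!/usr/bin/env bash
# Sketch: print the notAfter date of every certificate in a directory.
# The default directory comes from the backup listing in section 1.2;
# that it holds the etcd certificates is an assumption.
list_cert_expiry() {
  local cert_dir="${1:-/etc/ssl/etcd/ssl}" pem
  for pem in "$cert_dir"/*.pem; do
    [ -e "$pem" ] || continue                     # empty or missing directory
    case "$pem" in *-key.pem) continue ;; esac    # skip private keys
    printf '%-40s %s\n' "$(basename "$pem")" \
      "$(openssl x509 -noout -enddate -in "$pem")"
  done
}
```

Calling list_cert_expiry with no argument on a control-plane node reads the default directory; pass another directory to check a backup copy instead.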

1.2 Back up the cluster's certificates and key data

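These operations are risky, so back up first. Before changing anything, take a complete backup of the existing configuration files and certificates.

Run the following on every control-plane node: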

  • Create the backup directory
mkdir -p /root/ksp-backup
  • Back up the existing Kubernetes configuration
cp -a /etc/kubernetes /root/ksp-backup/
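  • Back up the etcd SSL certificates
# The suffix is date and time (%Y%m%d-%H%M). The original format %Y-%H-%M is year-hour-minute, which is why the listing below shows 2025-08-29.
cp -a /etc/ssl/etcd/ /root/ksp-backup/etcd-ssl-bak-$(date +%Y%m%d-%H%M)
  • Back up the etcd data
cp -a /var/lib/etcd /root/ksp-backup/etcd-bak-$(date +%Y%m%d-%H%M)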
  • Review the backup
$ ls -R /root/ksp-backup/
/root/ksp-backup/:
etcd-bak-2025-08-29  etcd-ssl-bak-2025-08-29  kubernetes

/root/ksp-backup/etcd-bak-2025-08-29:
member

/root/ksp-backup/etcd-bak-2025-08-29/member:
snap  wal

/root/ksp-backup/etcd-bak-2025-08-29/member/snap:
000000000000009f-0000000000f0d6c0.snap
000000000000009f-0000000000f0fdd1.snap
000000000000009f-0000000000f124e2.snap
000000000000009f-0000000000f14bf3.snap
000000000000009f-0000000000f17304.snap
db

/root/ksp-backup/etcd-bak-2025-08-29/member/wal:
0000000000000136-0000000000ef0215.wal
0000000000000137-0000000000efa254.wal
0000000000000138-0000000000f041b0.wal
0000000000000139-0000000000f0e0e0.wal
000000000000013a-0000000000f17ab5.wal
1.tmp

/root/ksp-backup/etcd-ssl-bak-2025-08-29:
ssl

/root/ksp-backup/etcd-ssl-bak-2025-08-29/ssl:
admin-ksp-control-1-key.pem
admin-ksp-control-1.pem
admin-ksp-control-2-key.pem
admin-ksp-control-2.pem
admin-ksp-control-3-key.pem
admin-ksp-control-3.pem
ca-key.pem
ca.pem
member-ksp-control-1-key.pem
member-ksp-control-1.pem
member-ksp-control-2-key.pem
member-ksp-control-2.pem
member-ksp-control-3-key.pem
member-ksp-control-3.pem
node-ksp-control-1-key.pem
node-ksp-control-1.pem
node-ksp-control-2-key.pem
node-ksp-control-2.pem
node-ksp-control-3-key.pem
node-ksp-control-3.pem

/root/ksp-backup/kubernetes:
addons                   manifests
admin.conf               network-plugin.yaml
controller-manager.conf  node-feature-discovery
coredns-configmap.yaml   nodelocaldns-configmap.yaml
coredns.yaml             nodelocaldns.yaml
kubeadm-config.yaml      pki
kubelet.conf             scheduler.conf

/root/ksp-backup/kubernetes/addons:
kubesphere.yaml  local-volume.yaml

/root/ksp-backup/kubernetes/manifests:
kube-apiserver.yaml           kube-scheduler.yaml
kube-controller-manager.yaml

/root/ksp-backup/kubernetes/node-feature-discovery:
features.d  source.d

/root/ksp-backup/kubernetes/node-feature-discovery/features.d:

/root/ksp-backup/kubernetes/node-feature-discovery/source.d:

/root/ksp-backup/kubernetes/pki:
apiserver.crt                 front-proxy-ca.crt
apiserver.key                 front-proxy-ca.key
apiserver-kubelet-client.crt  front-proxy-client.crt
apiserver-kubelet-client.key  front-proxy-client.key
ca.crt                        sa.key
ca.key                        sa.pub
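
Since these steps have to be repeated on every control-plane node, they can be wrapped in a small helper. The following is only a sketch: backup_paths is a hypothetical name, not part of kk or kubeadm, and it puts everything under one directory named after the full date and time.

```shell
#!/usr/bin/env bash
# Sketch: copy each given path into a timestamped directory under dest_root.
# backup_paths is a hypothetical helper, not part of kk or kubeadm.
backup_paths() {
  local dest_root="$1"; shift
  local stamp dest src
  stamp="$(date +%Y%m%d-%H%M%S)"        # e.g. 20250613-082900
  dest="$dest_root/$stamp"
  mkdir -p "$dest" || return 1
  for src in "$@"; do
    if [ -e "$src" ]; then
      cp -a "$src" "$dest/" || return 1   # -a keeps owners, modes, timestamps
    else
      echo "skip: $src does not exist" >&2
    fi
  done
  echo "$dest"                          # print where the backup landed
}
```

With the paths from this section, the call would be: backup_paths /root/ksp-backup /etc/kubernetes /etc/ssl/etcd /var/lib/etcd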

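1.3 Renew the certificates

kk can renew the certificates with a single command. Pass it the configuration file that was used to deploy the cluster: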

./kk certs renew -f xxxxxx.yaml

The actual run looked like this:

$ ./kk certs renew -f ksp-v341-v1288.yaml


 _   __      _          _   __
| | / /     | |        | | / /
| |/ / _   _| |__   ___| |/ /  ___ _   _
|    \| | | | '_ \ / _ \    \ / _ \ | | |
| |\  \ |_| | |_) |  __/ |\  \  __/ |_| |
\_| \_/\__,_|_.__/ \___\_| \_/\___|\__, |
                                    __/ |
                                   |___/

08:44:15 CST [GreetingsModule] Greetings
08:44:15 CST message: [ksp-control-3]
Greetings, KubeKey!
08:44:15 CST message: [ksp-gpu-worker-2]
Greetings, KubeKey!
08:44:15 CST message: [ksp-worker-3]
Greetings, KubeKey!
08:44:15 CST message: [ksp-worker-2]
Greetings, KubeKey!
08:44:15 CST message: [ksp-gpu-worker-1]
Greetings, KubeKey!
08:44:16 CST message: [ksp-control-1]
Greetings, KubeKey!
08:44:16 CST message: [ksp-storage-3]
Greetings, KubeKey!
08:44:16 CST message: [ksp-storage-1]
Greetings, KubeKey!
08:44:16 CST message: [ksp-worker-1]
Greetings, KubeKey!
08:44:16 CST message: [ksp-control-2]
Greetings, KubeKey!
08:44:16 CST message: [ksp-storage-2]
Greetings, KubeKey!
08:44:16 CST success: [ksp-control-3]
08:44:16 CST success: [ksp-gpu-worker-2]
08:44:16 CST success: [ksp-worker-3]
08:44:16 CST success: [ksp-worker-2]
08:44:16 CST success: [ksp-gpu-worker-1]
08:44:16 CST success: [ksp-control-1]
08:44:16 CST success: [ksp-storage-3]
08:44:16 CST success: [ksp-storage-1]
08:44:16 CST success: [ksp-worker-1]
08:44:16 CST success: [ksp-control-2]
08:44:16 CST success: [ksp-storage-2]
08:44:16 CST [RenewCertsModule] Renew control-plane certs
08:44:17 CST stdout: [ksp-control-1]
v1.28.8
08:44:20 CST stdout: [ksp-control-2]
v1.28.8
08:44:25 CST stdout: [ksp-control-3]
v1.28.8
08:44:28 CST success: [ksp-control-1]
08:44:28 CST success: [ksp-control-2]
08:44:28 CST success: [ksp-control-3]
08:44:28 CST [RenewCertsModule] Copy admin.conf to ~/.kube/config
08:44:29 CST success: [ksp-control-2]
08:44:29 CST success: [ksp-control-1]
08:44:29 CST success: [ksp-control-3]
08:44:29 CST [CheckCertsModule] Check cluster certs
08:44:30 CST success: [ksp-control-2]
08:44:30 CST success: [ksp-control-1]
08:44:30 CST success: [ksp-control-3]
08:44:30 CST [PrintClusterCertsModule] Display cluster certs form
CERTIFICATE                    EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   NODE
apiserver.crt                  Jun 13, 2026 00:44 UTC   364d            ca                      ksp-control-1
apiserver-kubelet-client.crt   Jun 13, 2026 00:44 UTC   364d            ca                      ksp-control-1
front-proxy-client.crt         Jun 13, 2026 00:44 UTC   364d            front-proxy-ca          ksp-control-1
admin.conf                     Jun 13, 2026 00:44 UTC   364d                                    ksp-control-1
controller-manager.conf        Jun 13, 2026 00:44 UTC   364d                                    ksp-control-1
scheduler.conf                 Jun 13, 2026 00:44 UTC   364d                                    ksp-control-1
apiserver.crt                  Jun 13, 2026 00:44 UTC   364d            ca                      ksp-control-2
apiserver-kubelet-client.crt   Jun 13, 2026 00:44 UTC   364d            ca                      ksp-control-2
front-proxy-client.crt         Jun 13, 2026 00:44 UTC   364d            front-proxy-ca          ksp-control-2
admin.conf                     Jun 13, 2026 00:44 UTC   364d                                    ksp-control-2
controller-manager.conf        Jun 13, 2026 00:44 UTC   364d                                    ksp-control-2
scheduler.conf                 Jun 13, 2026 00:44 UTC   364d                                    ksp-control-2
apiserver.crt                  Jun 13, 2026 00:44 UTC   364d            ca                      ksp-control-3
apiserver-kubelet-client.crt   Jun 13, 2026 00:44 UTC   364d            ca                      ksp-control-3
front-proxy-client.crt         Jun 13, 2026 00:44 UTC   364d            front-proxy-ca          ksp-control-3
admin.conf                     Jun 13, 2026 00:44 UTC   364d                                    ksp-control-3
controller-manager.conf        Jun 13, 2026 00:44 UTC   364d                                    ksp-control-3
scheduler.conf                 Jun 13, 2026 00:44 UTC   364d                                    ksp-control-3

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   NODE
ca.crt                  May 20, 2034 06:15 UTC   8y              ksp-control-1
front-proxy-ca.crt      May 20, 2034 06:15 UTC   8y              ksp-control-1
ca.crt                  May 20, 2034 06:15 UTC   8y              ksp-control-2
front-proxy-ca.crt      May 20, 2034 06:15 UTC   8y              ksp-control-2
ca.crt                  May 20, 2034 06:15 UTC   8y              ksp-control-3
front-proxy-ca.crt      May 20, 2034 06:15 UTC   8y              ksp-control-3
08:44:30 CST success: [LocalHost]
08:44:30 CST Pipeline[RenewCertsPipeline] execute successfully

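The output shows that every certificate now expires on Jun 13, 2026 00:44 UTC. The new expiry is counted from the moment of renewal (08:44 CST on Jun 13, 2025, which is 00:44 UTC), so the certificates are valid for one year. RESIDUAL TIME reads 364d because slightly less than 365 days remain by the time the table is printed.

  • Check the certificates with kk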
[root@ksp-control-1 kubekey]# ./kk certs check-expiration -f ksp-v341-v1288.yaml


 _   __      _          _   __
| | / /     | |        | | / /
| |/ / _   _| |__   ___| |/ /  ___ _   _
|    \| | | | '_ \ / _ \    \ / _ \ | | |
| |\  \ |_| | |_) |  __/ |\  \  __/ |_| |
\_| \_/\__,_|_.__/ \___\_| \_/\___|\__, |
                                    __/ |
                                   |___/

08:45:38 CST [GreetingsModule] Greetings
08:45:38 CST message: [ksp-worker-2]
Greetings, KubeKey!
08:45:39 CST message: [ksp-control-3]
Greetings, KubeKey!
08:45:43 CST message: [ksp-worker-3]
Greetings, KubeKey!
08:45:43 CST message: [ksp-gpu-worker-1]
Greetings, KubeKey!
08:45:43 CST message: [ksp-gpu-worker-2]
Greetings, KubeKey!
08:45:43 CST message: [ksp-storage-1]
Greetings, KubeKey!
08:45:43 CST message: [ksp-worker-1]
Greetings, KubeKey!
08:45:44 CST message: [ksp-storage-2]
Greetings, KubeKey!
08:45:44 CST message: [ksp-control-1]
Greetings, KubeKey!
08:45:44 CST message: [ksp-control-2]
Greetings, KubeKey!
08:45:44 CST message: [ksp-storage-3]
Greetings, KubeKey!
08:45:44 CST success: [ksp-worker-2]
08:45:44 CST success: [ksp-control-3]
08:45:44 CST success: [ksp-worker-3]
08:45:44 CST success: [ksp-gpu-worker-1]
08:45:44 CST success: [ksp-gpu-worker-2]
08:45:44 CST success: [ksp-storage-1]
08:45:44 CST success: [ksp-worker-1]
08:45:44 CST success: [ksp-storage-2]
08:45:44 CST success: [ksp-control-1]
08:45:44 CST success: [ksp-control-2]
08:45:44 CST success: [ksp-storage-3]
08:45:44 CST [CheckCertsModule] Check cluster certs
08:45:45 CST success: [ksp-control-3]
08:45:45 CST success: [ksp-control-1]
08:45:45 CST success: [ksp-control-2]
08:45:45 CST [PrintClusterCertsModule] Display cluster certs form
CERTIFICATE                    EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   NODE
apiserver.crt                  Jun 13, 2026 00:44 UTC   364d            ca                      ksp-control-1
apiserver-kubelet-client.crt   Jun 13, 2026 00:44 UTC   364d            ca                      ksp-control-1
front-proxy-client.crt         Jun 13, 2026 00:44 UTC   364d            front-proxy-ca          ksp-control-1
admin.conf                     Jun 13, 2026 00:44 UTC   364d                                    ksp-control-1
controller-manager.conf        Jun 13, 2026 00:44 UTC   364d                                    ksp-control-1
scheduler.conf                 Jun 13, 2026 00:44 UTC   364d                                    ksp-control-1
apiserver.crt                  Jun 13, 2026 00:44 UTC   364d            ca                      ksp-control-2
apiserver-kubelet-client.crt   Jun 13, 2026 00:44 UTC   364d            ca                      ksp-control-2
front-proxy-client.crt         Jun 13, 2026 00:44 UTC   364d            front-proxy-ca          ksp-control-2
admin.conf                     Jun 13, 2026 00:44 UTC   364d                                    ksp-control-2
controller-manager.conf        Jun 13, 2026 00:44 UTC   364d                                    ksp-control-2
scheduler.conf                 Jun 13, 2026 00:44 UTC   364d                                    ksp-control-2
apiserver.crt                  Jun 13, 2026 00:44 UTC   364d            ca                      ksp-control-3
apiserver-kubelet-client.crt   Jun 13, 2026 00:44 UTC   364d            ca                      ksp-control-3
front-proxy-client.crt         Jun 13, 2026 00:44 UTC   364d            front-proxy-ca          ksp-control-3
admin.conf                     Jun 13, 2026 00:44 UTC   364d                                    ksp-control-3
controller-manager.conf        Jun 13, 2026 00:44 UTC   364d                                    ksp-control-3
scheduler.conf                 Jun 13, 2026 00:44 UTC   364d                                    ksp-control-3

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   NODE
ca.crt                  May 20, 2034 06:15 UTC   8y              ksp-control-1
front-proxy-ca.crt      May 20, 2034 06:15 UTC   8y              ksp-control-1
ca.crt                  May 20, 2034 06:15 UTC   8y              ksp-control-2
front-proxy-ca.crt      May 20, 2034 06:15 UTC   8y              ksp-control-2
ca.crt                  May 20, 2034 06:15 UTC   8y              ksp-control-3
front-proxy-ca.crt      May 20, 2034 06:15 UTC   8y              ksp-control-3
08:45:45 CST success: [LocalHost]
08:45:45 CST Pipeline[CheckCertsPipeline] execute successfully
  • Check the certificates with kubeadm
$ kubeadm certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
W0613 08:46:43.867113   11643 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.233.0.10]; the provided value is: [169.254.25.10]

CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Jun 13, 2026 00:44 UTC   364d            ca                      no
apiserver                  Jun 13, 2026 00:44 UTC   364d            ca                      no
apiserver-kubelet-client   Jun 13, 2026 00:44 UTC   364d            ca                      no
controller-manager.conf    Jun 13, 2026 00:44 UTC   364d            ca                      no
front-proxy-client         Jun 13, 2026 00:44 UTC   364d            front-proxy-ca          no
scheduler.conf             Jun 13, 2026 00:44 UTC   364d            ca                      no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      May 20, 2034 06:15 UTC   8y              no
front-proxy-ca          May 20, 2034 06:15 UTC   8y              no
  • Verify the cluster status (the cluster is manageable again)
$ kubectl get nodes
NAME               STATUS   ROLES           AGE    VERSION
ksp-control-1      Ready    control-plane   386d   v1.28.8
ksp-control-2      Ready    control-plane   386d   v1.28.8
ksp-control-3      Ready    control-plane   386d   v1.28.8
ksp-gpu-worker-1   Ready    worker          339d   v1.28.8
ksp-gpu-worker-2   Ready    worker          339d   v1.28.8
ksp-storage-1      Ready    worker          332d   v1.28.8
ksp-storage-2      Ready    worker          311d   v1.28.8
ksp-storage-3      Ready    worker          332d   v1.28.8
ksp-worker-1       Ready    worker          386d   v1.28.8
ksp-worker-2       Ready    worker          386d   v1.28.8
ksp-worker-3       Ready    worker          386d   v1.28.8

2. How it works

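Normally, a Kubernetes cluster deployed with KubeKey does not run into expired certificates. Mine did because it is a test environment that is frequently powered off, so the automatic renewal mechanism did not get to run.

So how does KubeKey keep the cluster's certificates from expiring? The principle is simple.

KubeKey's automatic certificate renewal:

  • At deployment time, KubeKey sets up a scheduled task that periodically checks the validity of the certificates managed by kubeadm
  • When any of those certificates has fewer than 30 days of validity left, the renewal process is triggered
  • The whole process is automated and needs no manual intervention

To implement this, KubeKey installs the following three components on the system: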

File | Path | Purpose
k8s-certs-renew.service | /etc/systemd/system/k8s-certs-renew.service | systemd service unit that runs the renewal script
k8s-certs-renew.timer | /etc/systemd/system/k8s-certs-renew.timer | Timer unit, set to trigger the renewal every Monday at 03:00
k8s-certs-renew.sh | /usr/local/bin/kube-scripts/k8s-certs-renew.sh | Main renewal script; runs kubeadm certs renew all to renew all certificates
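
To see whether the timer is actually firing on a node, the units listed above can be inspected with standard systemd tools. The sketch below assumes the unit names from the table; inspect_renew_timer is a hypothetical wrapper name used only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: inspect the renewal timer on a control-plane node.
# Unit names come from the table above; the function name is hypothetical.
inspect_renew_timer() {
  local timer="k8s-certs-renew.timer" service="k8s-certs-renew.service"
  if ! command -v systemctl >/dev/null 2>&1; then
    echo "systemctl not found; is this a systemd host?" >&2
    return 1
  fi
  systemctl is-enabled "$timer"               # enabled or disabled
  systemctl list-timers --all "$timer"        # NEXT and LAST trigger times
  systemctl cat "$timer"                      # the schedule as installed
  journalctl -u "$service" -n 20 --no-pager   # output of the most recent runs
}
```

On a machine that is often powered off, the LAST column of list-timers is worth a look. systemd only catches up a missed OnCalendar run after boot when the timer sets Persistent=true. Whether the unit installed by KubeKey does so was not verified here and should be confirmed in the systemctl cat output.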

Original content of the k8s-certs-renew.sh script, recorded here for reference:

#!/bin/bash
kubeadmCerts='/usr/local/bin/kubeadm certs'
getCertValidDays() {
  local earliestExpireDate; earliestExpireDate=$(${kubeadmCerts} check-expiration | grep -o "[A-Za-z]\{3,4\}\s\w\w,\s[0-9]\{4,\}\s\w*:\w*\s\w*\s*" | xargs -I {} date -d {} +%s | sort | head -n 1)
  local today; today="$(date +%s)"
  echo -n $(( ($earliestExpireDate - $today) / (24 * 60 * 60) ))
}
echo "## Expiration before renewal ##"
${kubeadmCerts} check-expiration
if [ $(getCertValidDays) -lt 30 ]; then
  echo "## Renewing certificates managed by kubeadm ##"
  ${kubeadmCerts} renew all
  echo "## Restarting control plane pods managed by kubeadm ##"
  $(which crictl | grep crictl) pods --namespace kube-system --name 'kube-scheduler-*|kube-controller-manager-*|kube-apiserver-*|etcd-*' -q | /usr/bin/xargs $(which crictl | grep crictl) rmp -f
  echo "## Updating /root/.kube/config ##"
  cp /etc/kubernetes/admin.conf /root/.kube/config
fi
echo "## Waiting for apiserver to be up again ##"
until printf "" 2>>/dev/null >>/dev/tcp/127.0.0.1/6443; do sleep 1; done
echo "## Expiration after renewal ##"
${kubeadmCerts} check-expiration
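
The core of the script is getCertValidDays: take the earliest expiry date in the kubeadm certs check-expiration output and compare the remaining days with 30. The sketch below restates that logic so it can be tried without a cluster. It reads the report from stdin, uses a more explicit date pattern than the original regex, and takes an optional "now" argument in epoch seconds. The names earliest_expiry_days and needs_renewal, and the "now" parameter, are assumptions made for illustration; GNU date is required.

```shell
#!/usr/bin/env bash
# Sketch: days until the earliest expiry found in a check-expiration report.
# Reads the report on stdin; $1 is an optional "now" in epoch seconds
# (hypothetical parameter, added so the function can be tested offline).
earliest_expiry_days() {
  local now="${1:-$(date +%s)}" earliest
  earliest=$(grep -o '[A-Z][a-z]\{2\} [0-9]\{2\}, [0-9]\{4\} [0-9]\{2\}:[0-9]\{2\} UTC' \
    | while read -r d; do date -u -d "$d" +%s; done \
    | sort -n | head -n 1)
  [ -n "$earliest" ] || { echo "no expiry dates found" >&2; return 1; }
  echo $(( (earliest - now) / 86400 ))
}

# Succeeds when fewer than threshold days remain (30 in the original script).
needs_renewal() {
  local days="$1" threshold="${2:-30}"
  [ "$days" -lt "$threshold" ]
}
```

On a control-plane node this could be used as: days=$(kubeadm certs check-expiration | earliest_expiry_days) followed by needs_renewal "$days" && echo "renewal due". An already expired certificate yields a negative number, which is also below the threshold.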

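I'm glad to share this hands-on experience with expired certificates on a Kubernetes cluster deployed for KubeSphere. As you have seen, a seemingly thorny certificate expiry becomes simple to fix with KubeKey.

This walkthrough showed how to diagnose and fix expired certificates with kubeadm and kk and, more importantly, how KubeKey's automatic renewal works. Its three components, k8s-certs-renew.service, k8s-certs-renew.timer, and k8s-certs-renew.sh, check and renew certificates automatically, which greatly reduces operational complexity.

Automatic renewal did not run in my test environment because the machines are often powered off, but that gave us a hands-on case for going through the whole renewal process. I hope this helps you handle similar problems calmly and rescue your K8s cluster with one command!

Thanks for reading and for your support. See you at the next "blind box" share!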


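Disclaimer:

  • My knowledge is limited. The content has been checked several times for accuracy, but omissions may remain, and corrections from experts are welcome.
  • The content has only been verified in a lab environment. Readers may study and borrow from it, but must not apply it directly to production. The author accepts no responsibility for any problems that result.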
Originally published 2025-06-14 on the 运维有术 WeChat official account.
