Container Service: Switching from kube-dns to CoreDNS

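Older versions of kube-dns have some latent problems, for example:
1. A bug in the miekg/dns dependency causes kube-dns to panic when it processes certain options in /etc/resolv.conf.
2. The client-go dependency is too old to support periodic token refresh, so kube-dns fails to authenticate when accessing kube-apiserver.
We therefore recommend switching the cluster from kube-dns to CoreDNS.
This document describes a way to perform the switch from kube-dns to CoreDNS as smoothly as possible, without the workloads noticing.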
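Prerequisites
The cluster's Kubernetes version is 1.12 or later.
Before switching to CoreDNS, choose the CoreDNS version best suited to the cluster's Kubernetes version. For details, see "Choosing the Best CoreDNS Version".
If kube-proxy in the cluster runs in IPVS mode, DNS resolution may fail intermittently while kube-dns is being scaled in, because of IPVS UDP session timeouts. To shorten the period of failed resolution and keep the switch as smooth as possible, configure the IPVS UDP session persistence timeout. Because kube-dns, unlike CoreDNS, cannot shut down gracefully, a timeout of 5 seconds is recommended for the kube-dns to CoreDNS switch; for the configuration method, see "Configuring Session Persistence". After configuring the timeout, wait 5 minutes before continuing with the remaining steps.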
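Note:
Regarding the IPVS UDP session timeout: TencentOS 3.1 kernels 0009.23 and later include the upstream expire_nodest_conn feature, which quickly removes existing connections and shortens the period of resolution timeouts, so you no longer need to configure the IPVS UDP session persistence timeout. For details on the feature, see "ipvs: queue delayed work to expire no destination connections if expire_nodest_conn=1".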
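As a rough illustration of the IPVS prerequisite, the sketch below builds an `ipvsadm --set` command that changes only the UDP timeout to 5 seconds and keeps the node's current TCP values. It assumes `ipvsadm` is installed on the node and that `ipvsadm -L --timeout` prints a line of the form `Timeout (tcp tcpfin udp): 900 120 300`. The helper names `is_uint` and `build_udp_timeout_cmd` are hypothetical, and the session persistence guide referenced above remains the authoritative method.

```shell
# Sketch: derive an `ipvsadm --set` command that only changes the UDP timeout.
# Assumption: the input line looks like "Timeout (tcp tcpfin udp): 900 120 300".

# is_uint VALUE: succeed when VALUE is a non-empty string of digits.
is_uint() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# build_udp_timeout_cmd TIMEOUT_LINE [UDP_SECONDS]: print the command to run.
build_udp_timeout_cmd() {
  local timeout_line udp_seconds values tcp tcpfin
  timeout_line="$1"
  udp_seconds="${2:-5}"
  values="${timeout_line#*: }"   # keep the part after ": "
  # Split the remaining text into the three timeout values.
  # shellcheck disable=SC2086
  set -- $values
  tcp="$1"
  tcpfin="$2"
  if ! is_uint "$tcp" || ! is_uint "$tcpfin" || ! is_uint "$udp_seconds"; then
    echo "unexpected input: $timeout_line" >&2
    return 1
  fi
  echo "ipvsadm --set $tcp $tcpfin $udp_seconds"
}

# Example (as root on a node; prints the command instead of executing it):
#   build_udp_timeout_cmd "$(ipvsadm -L --timeout)"
```

The function only prints the command so that it can be reviewed before being run on each node.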
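Procedure
Preparing the CoreDNS resource file
Starting from the CoreDNS resource template below, adjust the parts marked with # comments according to the cluster's Kubernetes version and the chosen CoreDNS version, then save the result as switch2coredns.yaml: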
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
rules:
  - apiGroups:
      - '*'
    resources:
      - endpoints
      - services
      - pods
      - namespaces
    verbs:
      - list
      - watch
  # Optional rule begins: grant the endpointslices permission only if the cluster's Kubernetes version is 1.20 or later; otherwise omit it
  - apiGroups:
      - discovery.k8s.io
    resources:
      - endpointslices
    verbs:
      - list
      - watch
  # Optional rule ends
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/cluster-service: "true"
  name: coredns
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:coredns
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:coredns
subjects:
  - kind: ServiceAccount
    name: coredns
    namespace: kube-system
---
apiVersion: v1
data:
  Corefile: |2-
        .:53 {
            template ANY HINFO . {
                rcode NXDOMAIN
            }
            errors
            health {
                lameduck 30s
            }
            ready
            kubernetes cluster.local. in-addr.arpa ip6.arpa {
                pods insecure
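                # Keep the upstream option only if the CoreDNS version is earlier than v1.7.0; otherwise remove it. See the prerequisites in this document for choosing the CoreDNS version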
                upstream
                fallthrough in-addr.arpa ip6.arpa
            }
            prometheus :9153
            forward . /etc/resolv.conf {
                prefer_udp
            }
            cache 30
            reload
            loadbalance
        }
kind: ConfigMap
metadata:
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
  name: coredns
  namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: CoreDNS
  name: coredns
  namespace: kube-system
spec:
  # Set the replica count to 0 for now; later steps adjust it
  replicas: 0
  selector:
    matchLabels:
      k8s-app: kube-dns
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      annotations:
        seccomp.security.alpha.kubernetes.io/pod: docker/default
      labels:
        k8s-app: kube-dns
      name: coredns
      namespace: kube-system
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: k8s-app
                    operator: In
                    values:
                      - kube-dns
              topologyKey: kubernetes.io/hostname
      containers:
        - args:
            - -conf
            - /etc/coredns/Corefile
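          # {HOST} depends on the cluster's region; you can replace it with the registry domain used by the current kube-dns image. For example, the Hong Kong region uses hkccr.ccs.tencentyun.com
          # Replace {VERSION} with the CoreDNS version you are switching to; see the prerequisites in this document for choosing the CoreDNS version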
          image: {HOST}/tkeimages/coredns:{VERSION}
          name: coredns
          livenessProbe:
            failureThreshold: 5
            httpGet:
              path: /health
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 60
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
          ports:
            - containerPort: 53
              name: dns
              protocol: UDP
            - containerPort: 53
              name: dns-tcp
              protocol: TCP
            - containerPort: 9153
              name: metrics
              protocol: TCP
          readinessProbe:
            failureThreshold: 5
            httpGet:
              path: /ready
              port: 8181
              scheme: HTTP
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 5
          resources:
            limits:
              memory: 170M
            requests:
              cpu: 100m
              memory: 30M
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              add:
                - NET_BIND_SERVICE
              drop:
                - all
            readOnlyRootFilesystem: true
          volumeMounts:
            - mountPath: /etc/coredns
              name: config-volume
              readOnly: true
      dnsPolicy: Default
      serviceAccountName: coredns
      tolerations:
        - effect: NoSchedule
          key: node-role.kubernetes.io/master
        - key: CriticalAddonsOnly
          operator: Exists
      volumes:
        - configMap:
            items:
              - key: Corefile
                path: Corefile
            name: coredns
            optional: true
          name: config-volume
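The two version-dependent choices flagged by # comments in the template (the endpointslices rule for Kubernetes 1.20 or later, and the upstream option for CoreDNS earlier than v1.7.0) can be checked mechanically. The sketch below is one way to do that in shell. The function names are hypothetical helpers, and the parsing assumes dotted numeric versions with an optional leading `v` and an optional `-suffix`.

```shell
# Sketch: decide which optional parts of switch2coredns.yaml to keep.
# Assumption: versions look like "1.20.6", "v1.7.0" or "v1.20.6-tke.3".

# normalize_version V: strip a leading "v" and any "-suffix", pad to three fields.
normalize_version() {
  local v
  v="${1#v}"
  v="${v%%-*}"
  echo "${v}.0.0"
}

# version_lt A B: succeed when version A is strictly lower than version B.
version_lt() {
  local a b i x y
  a="$(normalize_version "$1")"
  b="$(normalize_version "$2")"
  for i in 1 2 3; do
    x="$(echo "$a" | cut -d. -f"$i")"
    y="$(echo "$b" | cut -d. -f"$i")"
    if [ "$x" -lt "$y" ]; then return 0; fi
    if [ "$x" -gt "$y" ]; then return 1; fi
  done
  return 1   # versions are equal
}

# Kubernetes >= 1.20: keep the endpointslices rule in the ClusterRole.
needs_endpointslice_rule() {
  ! version_lt "$1" "1.20"
}

# CoreDNS < v1.7.0: keep the upstream line in the kubernetes plugin block.
needs_upstream_option() {
  version_lt "$1" "1.7.0"
}

# Example:
#   needs_endpointslice_rule "v1.22.5" && echo "keep the endpointslices rule"
#   needs_upstream_option "1.8.4" || echo "remove the upstream line"
```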
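Migrating the configuration
If you have customized kube-dns in the current cluster, for example with custom upstream servers, you need to migrate an equivalent configuration to CoreDNS. Use the following example as a guide:
kube-dns custom configuration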
apiVersion: v1
data:
  federations: |
    {"foo" : "foo.feddomain.com"}
  stubDomains: |
    {"abc.com" : ["1.2.3.4"], "my.cluster.local" : ["2.3.4.5"]}
  upstreamNameservers: |
    ["8.8.8.8", "8.8.4.4"]
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
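Before migrating, it can help to see which of these keys are actually set in your cluster. The sketch below lists the keys of the kube-dns ConfigMap. It assumes `kubectl` is configured for the target cluster, and the function name `list_kube_dns_custom_keys` is hypothetical.

```shell
# Sketch: print the keys present in the kube-dns ConfigMap, one per line.
# An empty result suggests there is no custom configuration to migrate.
list_kube_dns_custom_keys() {
  kubectl get configmap kube-dns -n kube-system \
    -o go-template='{{range $key, $value := .data}}{{$key}}{{"\n"}}{{end}}'
}

# Example:
#   list_kube_dns_custom_keys
```

Keys such as federations, stubDomains, and upstreamNameservers correspond to the entries shown in the example above.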
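Corresponding CoreDNS configuration
After adjusting the parts marked with # comments according to the CoreDNS version, write your custom configuration into the switch2coredns.yaml file prepared in the previous step: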
apiVersion: v1
data:
  Corefile: |2-
        .:53 {
            kubernetes cluster.local. in-addr.arpa ip6.arpa {
                pods insecure
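                # Keep the upstream option only if the CoreDNS version is earlier than v1.7.0; otherwise remove it. See the prerequisites in this document for choosing the CoreDNS version
                upstream 8.8.8.8 8.8.4.4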
                fallthrough in-addr.arpa ip6.arpa
            }
            federation cluster.local {
                foo foo.feddomain.com
            }
            forward . 8.8.8.8 8.8.4.4
        }
        abc.com:53 {
            forward . 1.2.3.4
        }
        my.cluster.local:53 {
            forward . 2.3.4.5
        }
kind: ConfigMap
metadata:
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
  name: coredns
  namespace: kube-system
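Each stubDomains entry in the kube-dns ConfigMap maps to one server block with a forward directive, as in the abc.com and my.cluster.local blocks above. As a minimal sketch of that mapping, the hypothetical helper below prints such a block for a domain and its upstream addresses.

```shell
# Sketch: render a Corefile server block for one kube-dns stub domain.
# render_stub_domain DOMAIN UPSTREAM...: forward DOMAIN to the given upstreams.
render_stub_domain() {
  local domain
  domain="$1"
  if [ -z "$domain" ] || [ "$#" -lt 2 ]; then
    echo "usage: render_stub_domain DOMAIN UPSTREAM..." >&2
    return 1
  fi
  shift
  # "$*" joins the remaining arguments with single spaces.
  printf '%s:53 {\n    forward . %s\n}\n' "$domain" "$*"
}

# Example:
#   render_stub_domain abc.com 1.2.3.4
#   render_stub_domain my.cluster.local 2.3.4.5
```

The output is not indented for the ConfigMap; when pasting it under the Corefile key, indent it to match the other server blocks.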
Deploying CoreDNS
Once switch2coredns.yaml is complete, run the following command to deploy CoreDNS:
Note:
The CoreDNS Deployment has 0 replicas at this point, so no Pods are actually created.
kubectl apply -f switch2coredns.yaml
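To confirm that the apply took effect before starting the switch, you can check that the coredns Deployment exists and is still at 0 replicas. The following is a minimal sketch, assuming `kubectl` is configured for the target cluster; the function name `check_coredns_staged` is hypothetical.

```shell
# Sketch: verify that the coredns Deployment exists and has 0 desired replicas.
check_coredns_staged() {
  local replicas
  replicas="$(kubectl get deployment coredns -n kube-system \
    -o jsonpath='{.spec.replicas}')" || return 1
  if [ "$replicas" = "0" ]; then
    echo "coredns Deployment exists with 0 replicas"
  else
    echo "coredns Deployment has $replicas replicas, expected 0" >&2
    return 1
  fi
}

# Example:
#   check_coredns_staged && echo "ready to start the switch"
```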
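Performing the switch
The overall approach is to scale out CoreDNS and scale in kube-dns step by step until every kube-dns replica has been replaced by a CoreDNS replica. The steps are as follows:
Note:
Pod anti-affinity is configured between kube-dns and CoreDNS replicas. If cluster resources are limited, for example when the cluster has no more than 2 nodes, consider scaling in kube-dns first and then scaling out CoreDNS, swapping the replicas in place. Be aware that this approach reduces DNS serving capacity while kube-dns is scaled in.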
1. Scale out CoreDNS by one replica:
kubectl scale deployment coredns -n kube-system --replicas=1
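2. Wait for the CoreDNS Pod to run normally and become Ready. If it does not become Ready after a long time, run the following command to view the CoreDNS Pod's logs and diagnose the problem (replace ${COREDNS_POD_NAME} with the Pod's name).
kubectl logs ${COREDNS_POD_NAME} -n kube-system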
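3. Once the CoreDNS Pod is Ready, log in to a workload Pod or create a test Pod (one that includes the nslookup tool), point the nameserver at the CoreDNS Pod's IP address, and verify that system domains, workload domains (mind the namespace), and external domains resolve correctly. Replace ${SERVICE_DOMAIN} with one of your workload domain names and ${COREDNS_POD_IP} with the Pod's IP address.
nslookup kubernetes.default ${COREDNS_POD_IP}
nslookup ${SERVICE_DOMAIN} ${COREDNS_POD_IP}
nslookup www.baidu.com ${COREDNS_POD_IP}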
4. Check that the CoreDNS Pod has been added to the backend list of the kube-dns Service.
kubectl get endpoints kube-dns -n kube-system -o jsonpath='{.subsets[*].addresses[*].ip}{"\n"}' | grep -wF "${COREDNS_POD_IP}"
5. Scale in kube-dns by one replica. Assuming kube-dns originally had N replicas, replace N-1 in the command with the resulting number:
kubectl scale deployment kube-dns -n kube-system --replicas=N-1
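6. Confirm that the workloads are not reporting frequent DNS resolution errors and, if a monitoring system is available, that the overall QPS of the DNS service is stable. Keep observing for 5 minutes.
7. Repeat steps 1-6, scaling in one kube-dns Pod for every CoreDNS Pod you scale out, until the CoreDNS replica count reaches the original kube-dns replica count and the kube-dns replica count is 0. The switch is then complete.
8. After the switch is complete, observe the workloads for 72 hours. If no problems appear, clean up the kube-dns resources: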
kubectl delete deployment kube-dns -n kube-system
kubectl delete cm kube-dns -n kube-system
kubectl delete serviceaccount kube-dns -n kube-system
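The scale-out and scale-in rhythm of steps 1-7 can be written down ahead of time. The sketch below only prints the kubectl commands for a given original kube-dns replica count and executes nothing. The function name `print_switch_plan` is hypothetical, and the `kubectl rollout status` line is an assumed way to wait for the new CoreDNS Pod to become Ready (step 2). The resolution test, the endpoints check, and the 5-minute observation (steps 3, 4, and 6) still have to be done between the printed commands.

```shell
# Sketch: print the scale commands for replacing N kube-dns replicas
# with CoreDNS replicas one at a time. Nothing is executed.
print_switch_plan() {
  local total step
  total="$1"
  case "$total" in
    ''|*[!0-9]*|0)
      echo "usage: print_switch_plan <original kube-dns replica count>" >&2
      return 1
      ;;
  esac
  step=1
  while [ "$step" -le "$total" ]; do
    # Step 1: one more CoreDNS replica.
    echo "kubectl scale deployment coredns -n kube-system --replicas=$step"
    # Step 2: wait until the new replica is available.
    echo "kubectl rollout status deployment/coredns -n kube-system --timeout=300s"
    # Step 5: one fewer kube-dns replica (after the manual checks in steps 3-4).
    echo "kubectl scale deployment kube-dns -n kube-system --replicas=$((total - step))"
    step=$((step + 1))
  done
}

# Example:
#   print_switch_plan 2
```

Printing the plan rather than running it keeps the manual verification steps in the operator's hands.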
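Rollback
If you encounter unexpected behavior during the switch, you can roll back by scaling the kube-dns Deployment out to its original replica count and scaling the CoreDNS Deployment in to 0 replicas.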
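The rollback described above amounts to two scale commands. A minimal sketch, assuming `kubectl` is configured for the target cluster and that you recorded the original kube-dns replica count before the switch; the function name `rollback_to_kube_dns` is hypothetical.

```shell
# Sketch: restore kube-dns to its original replica count, then scale CoreDNS to 0.
rollback_to_kube_dns() {
  local original_replicas
  original_replicas="$1"
  case "$original_replicas" in
    ''|*[!0-9]*)
      echo "usage: rollback_to_kube_dns <original kube-dns replica count>" >&2
      return 1
      ;;
  esac
  kubectl scale deployment kube-dns -n kube-system \
    --replicas="$original_replicas" || return 1
  kubectl scale deployment coredns -n kube-system --replicas=0
}

# Example:
#   rollback_to_kube_dns 2
```

On clusters with few nodes, the Pod anti-affinity between the two Deployments may keep kube-dns Pods pending until the CoreDNS Pods are gone, so check that the kube-dns Pods become Ready afterwards.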