For the Ubuntu walkthrough, see the Ubuntu K8S installation guide; everything below is configured on Rocky Linux 9.5.
Make sure SELinux is disabled or in Permissive mode:
sestatus
Swap is usually already off, but disable it explicitly anyway:
swapoff -a
After running swapoff -a, check whether /etc/fstab still contains a swap line and, if so, comment it out with #; otherwise kubelet will not come back up after a node reboot. Cloud images usually have no swap entry, but a typical personal install does.
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
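A quick check that swap is really gone and nothing in fstab will re-enable it on reboot:

```shell
# Swap total should read 0B after swapoff -a
free -h | grep -i swap

# Any non-commented swap line left here would bring swap back on reboot
grep -v '^#' /etc/fstab | grep swap || echo "no active swap entry in fstab"
```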
# First create containerd.conf and write these two lines into it
sudo tee /etc/modules-load.d/containerd.conf <<EOF
overlay
br_netfilter
EOF
# Load the modules
modprobe overlay
modprobe br_netfilter
The file name /etc/sysctl.d/kubernetes.conf does not matter, as long as the file lives in that directory; some people call it k8s.conf.
cat << EOF | tee /etc/sysctl.d/kubernetes.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF
sysctl --system
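To confirm the modules loaded and the sysctl keys took effect:

```shell
# Both modules should appear in the loaded-module list
lsmod | grep -E 'overlay|br_netfilter'

# All three values should print 1
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
```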
This step is not strictly required; it adds Docker's repository so that containerd can be installed from it (without installing Docker itself).
sudo dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
# For servers in China, swap in a domestic mirror:
sed -i 's+https://download.docker.com+https://mirrors.tuna.tsinghua.edu.cn/docker-ce+' /etc/yum.repos.d/docker-ce.repo
dnf install containerd.io -y
containerd config default | sudo tee /etc/containerd/config.toml >/dev/null 2>&1
# After the commands above, config.toml exists; servers in China need the sandbox image source changed
# Edit /etc/containerd/config.toml with vim so that it reads:
sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9"
# The original line is sandbox_image = "registry.k8s.io/pause:3.8"
# Either way we switch to 3.9, otherwise the install warns that 3.9 is recommended
# Or do it with sed:
sed -i 's#registry.k8s.io/pause:3.8#registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9#g' /etc/containerd/config.toml
# Switch to the systemd cgroup driver
sed -e 's/SystemdCgroup = false/SystemdCgroup = true/g' -i /etc/containerd/config.toml
root@cp:~# systemctl restart containerd
# Check containerd's current status
systemctl status containerd
Pull the image manually as follows (start the containerd service first as above, then use ctr):
ctr -n k8s.io i pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9
# Or use crictl, a command-line tool that speaks the Kubernetes CRI (Container Runtime Interface)
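crictl needs to be pointed at containerd's CRI socket first, or it may probe the wrong endpoint. A sketch, using containerd's default socket path:

```shell
# Tell crictl which runtime socket to use (containerd's default path)
cat <<EOF | sudo tee /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
EOF

# Pull and list images through the CRI instead of ctr
sudo crictl pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9
sudo crictl images
```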
containerd can also be installed by fetching the release tarball from GitHub with wget.
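A sketch of that GitHub binary install, following containerd's getting-started steps; the version number 1.7.27 is an assumption here, substitute whichever release you want:

```shell
# Download and unpack the static binaries into /usr/local (version is an example)
wget https://github.com/containerd/containerd/releases/download/v1.7.27/containerd-1.7.27-linux-amd64.tar.gz
sudo tar Cxzvf /usr/local containerd-1.7.27-linux-amd64.tar.gz

# The systemd unit ships separately in containerd's repo
sudo wget -O /etc/systemd/system/containerd.service \
  https://raw.githubusercontent.com/containerd/containerd/main/containerd.service
sudo systemctl daemon-reload
sudo systemctl enable --now containerd
```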
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.32/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.32/rpm/repodata/repomd.xml.key
EOF
For servers in China, switch to a mirror:
sed -i 's|https://pkgs.k8s.io/core:/stable:/v1.32/rpm/|https://mirrors.tuna.tsinghua.edu.cn/kubernetes/core:/stable:/v1.32/rpm/|g' /etc/yum.repos.d/kubernetes.repo
dnf install -y kubeadm kubelet kubectl
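After installing, enable kubelet so it starts on boot; it will restart every few seconds until kubeadm init (or join) gives it a config, which is expected:

```shell
# kubelet crash-loops until the node is initialized or joined -- that is normal
sudo systemctl enable --now kubelet
```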
Look up the machine's IP and add it to hosts:
vim /etc/hosts
10.128.0.3 k8scp #<-- add this line
127.0.0.1 localhost
Create and fill in the kubeadm config (kubeadm-config.yaml):
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
# Recommended for China: point imageRepository at a domestic mirror for init
imageRepository: "registry.cn-hangzhou.aliyuncs.com/google_containers" # <-- Chinese mirror
kubernetesVersion: 1.30.9 #<-- or use the word "stable" for the newest version
controlPlaneEndpoint: "k8scp:6443" #<-- the name we put in /etc/hosts, not the IP
networking:
  podSubnet: 192.168.0.0/16 #<-- must match the IP range in the CNI config file
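Save the block above as kubeadm-config.yaml. Before initializing, you can preview and pre-pull the images kubeadm will fetch from the mirror:

```shell
# Lists the image names (apiserver, etcd, coredns, pause, ...) resolved
# against the imageRepository in the config
kubeadm config images list --config kubeadm-config.yaml

# Optionally pre-pull them so init itself is faster
sudo kubeadm config images pull --config kubeadm-config.yaml
```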
Initialize:
kubeadm init --config=kubeadm-config.yaml --upload-certs \
| tee kubeadm-init.out
#<-- Save the output for future reference
# From the output we get:
kubeadm join k8scp:6443 --token vapzqi.et2p9zbkzk29wwth \
    --discovery-token-ca-cert-hash sha256:f62bf97d4fba6876e4c3ff645df3fca969c06169dee3865aab9d0bca8ec9f8cd
The following commands must be run as a non-root user.
# -m creates /home/student, -s sets the login shell to /bin/bash
useradd -m -s /bin/bash student
# Set a password
passwd student
# On Ubuntu the sudo group is "sudo", not "wheel"
usermod -aG sudo student
# On Rocky Linux use the wheel group instead:
# usermod -aG wheel student
# Log out of root and log in as student
root@cp:~# exit
logout
student@cp:~$ mkdir -p $HOME/.kube
student@cp:~$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
student@cp:~$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Check that the config looks sane
student@cp:~$ less .kube/config
#==============================
apiVersion: v1
clusters:
- cluster:
#<output_omitted>
Previously these notes installed calico at an earlier step, but now we use cilium. Before that, install helm. Helm can be fetched from GitHub with wget, or installed as follows:
sudo dnf install helm -y
## The helm package is only available via dnf on Fedora, so we use the install script instead
# The URL below goes through a GitHub acceleration mirror
curl -fsSL -o get_helm.sh https://ghfast.top/https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
Then:
helm repo add cilium https://helm.cilium.io/
helm repo update
helm template cilium cilium/cilium --version 1.14.19 \
--namespace kube-system > cilium.yaml
Make sure cilium.yaml sits in the non-root user's home directory rather than root's, so the command below can read it.
kubectl apply -f /home/student/cilium.yaml
# output
serviceaccount/cilium created
serviceaccount/cilium-operator created
secret/cilium-ca created
secret/hubble-server-certs created
configmap/cilium-config created
clusterrole.rbac.authorization.k8s.io/cilium created
clusterrole.rbac.authorization.k8s.io/cilium-operator created
clusterrolebinding.rbac.authorization.k8s.io/cilium created
clusterrolebinding.rbac.authorization.k8s.io/cilium-operator created
role.rbac.authorization.k8s.io/cilium-config-agent created
rolebinding.rbac.authorization.k8s.io/cilium-config-agent created
service/hubble-peer created
daemonset.apps/cilium created
deployment.apps/cilium-operator created
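Instead of rendering a manifest with helm template and applying it, you can also let helm manage the release directly; same chart and version as above:

```shell
helm install cilium cilium/cilium --version 1.14.19 --namespace kube-system

# Watch the cilium agent DaemonSet come up
kubectl -n kube-system get pods -l k8s-app=cilium
```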
student@cp:~$ sudo dnf install bash-completion -y
# If it was not installed before, log out and back in
student@cp:~$ source <(kubectl completion bash)
student@cp:~$ echo "source <(kubectl completion bash)" >> $HOME/.bashrc
Now type kubectl des and press Tab and it auto-completes.
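A common companion step from the kubectl completion docs: alias kubectl to k and keep completion working for the alias:

```shell
# Persist the alias and wire bash completion up to it
echo 'alias k=kubectl' >> $HOME/.bashrc
echo 'complete -o default -F __start_kubectl k' >> $HOME/.bashrc
```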
kubectl get nodes
# Output
NAME STATUS ROLES AGE VERSION
cn-node1-cp1 Ready control-plane 133m v1.27.1
kubectl get pods -A
# Output
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system cilium-h49dp 1/1 Running 0 15m
kube-system cilium-operator-788c7d7585-c2shl 0/1 Pending 0 15m
kube-system cilium-operator-788c7d7585-rn26s 1/1 Running 0 15m
kube-system coredns-5d78c9869d-2rw6j 1/1 Running 0 132m
kube-system coredns-5d78c9869d-b8shj 1/1 Running 0 132m
kube-system etcd-cn-node1-cp1 1/1 Running 4 (97m ago) 132m
kube-system kube-apiserver-cn-node1-cp1 1/1 Running 4 (97m ago) 132m
kube-system kube-controller-manager-cn-node1-cp1 1/1 Running 4 (97m ago) 132m
kube-system kube-proxy-5c758 1/1 Running 4 (97m ago) 132m
kube-system kube-scheduler-cn-node1-cp1 1/1 Running 4 (97m ago) 132m
By default a token expires after 24 hours. List existing tokens with:
kubeadm token list
# Then, still on the cp node as the student user, create a token
sudo kubeadm token create
>>27eee4.6e66ff60318da929
# Generate the sha256 hash of the CA public key
openssl x509 -pubkey \
-in /etc/kubernetes/pki/ca.crt | openssl rsa \
-pubin -outform der 2>/dev/null | openssl dgst \
-sha256 -hex | sed 's/^.* //'
>>6d541678b05652e1fa5d43908e75e67376e994c3483d6683f2a18673e5d2a1b0
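The token and hash steps can also be collapsed into one command, which prints a ready-to-use join line:

```shell
# Generates a new token and prints the full join command, hash included
sudo kubeadm token create --print-join-command
```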
First, on the worker node, add the hosts entry:
root@worker:~# vim /etc/hosts
10.128.0.3 k8scp #<-- Add this line
127.0.0.1 localhost
Then join with the command below; if your token has expired, substitute the newly generated token and sha256 hash.
kubeadm join \
--token 27eee4.6e66ff60318da929 \
k8scp:6443 \
--discovery-token-ca-cert-hash \
sha256:6d541678b05652e1fa5d43908e75e67376e994c3483d6683f2a18673e5d2a1b0
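Back on the control plane, confirm the worker registered; it shows NotReady until the cilium agent pod is running on it:

```shell
kubectl get nodes
# The new worker appears with role <none>; give the CNI pods a minute
# before expecting its STATUS to flip to Ready
```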
After the CP is installed, the quickest way to set it up again is simply to reset it; k8s provides kubeadm reset for this. You can also just re-run the init: it will fail and report which files already exist and which ports are in use.
# Example output
root@cp:~# kubeadm init --config=kubeadm-config.yaml --upload-certs \
> | tee kubeadm-init.out
[init] Using Kubernetes version: v1.27.1
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
[ERROR Port-10259]: Port 10259 is in use
[ERROR Port-10257]: Port 10257 is in use
[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
[ERROR Port-10250]: Port 10250 is in use
[ERROR FileContent--proc-sys-net-bridge-bridge-nf-call-iptables]: /proc/sys/net/bridge/bridge-nf-call-iptables does not exist
[ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
# Resolution
rm -f /etc/kubernetes/manifests/*
rm -rf /var/lib/etcd
root@cp:~# modprobe br_netfilter
root@cp:~# modprobe overlay
# Find what is using each port and kill the process
root@cp:~# lsof -i :10257
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
kube-cont 1538 root 3u IPv4 28030 0t0 TCP localhost:10257 (LISTEN)
root@cp:~# lsof -i :10250
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
kubelet 3130 root 12u IPv6 54564 0t0 TCP *:10250 (LISTEN)
root@cp:~# lsof -i :10259
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
kube-sche 1545 root 3u IPv4 28771 0t0 TCP localhost:10259 (LISTEN)
kill -9 1545
kill -9 1538
kill -9 3130
Then initialize again. I also turned the port check-and-kill into a shell script (written with ChatGPT) for reference:
#!/bin/bash
# For each kubeadm-related port, find the listening process and kill it
for port in 10257 10259 10250; do
    echo "Checking for port $port"
    pid=$(sudo lsof -t -i:"$port")
    if [ -n "$pid" ]; then
        echo "Killing process $pid using port $port"
        sudo kill $pid
    else
        echo "No process found using port $port"
    fi
done
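Most of the manual cleanup above can be done in one shot with kubeadm reset, which stops the static pods and empties /etc/kubernetes/manifests and /var/lib/etcd; CNI config and iptables rules still need manual cleanup, as reset's own output reminds you:

```shell
sudo kubeadm reset -f

# kubeadm reset does not touch CNI config or iptables state
sudo rm -rf /etc/cni/net.d
sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X
```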
The configuration steps in this one-click install script are also worth a look: https://github.com/lework/kainstall
If installing via KubeSphere, make sure the host has unrestricted internet access (global proxy + global remote DNS). After downloading and running chmod +x:
./kk create config [--with-kubernetes version] [--with-kubesphere version]
This creates a template file; adjust the IP addresses, usernames, and passwords inside it. You can leave out --with-kubesphere version to skip installing KubeSphere.
Other k8s management tools: Rancher, https://kuboard.cn/
k8s YAML authoring tool: https://k8syaml.com/