Everything is a mess right now and I don't know how to fix it.
kubelet is reporting this error:
Sep 15 15:51:54 master kubelet[94723]: E0915 15:51:54.473811 94723 kubelet.go:2183] node "master" not found
I upgraded a single-node master to high availability. The HA setup is now broken, and I just want to go back to a single-node master.
1
cyaki 2021-09-15 16:15:56 +08:00
How was the cluster deployed? kubeadm? hyperkube? Or were the components deployed by hand?
Is the container runtime docker, containerd, or cri-o?
It looks like kubelet can't connect to the apiserver. Could you post the apiserver logs?
A similar issue: https://github.com/kubernetes/kubeadm/issues/1153
3
dunhanson OP [root@master ~]# kubectl get nodes
The connection to the server 192.168.2.53:6443 was refused - did you specify the right host or port?
[root@master ~]# systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Thu 2021-09-16 00:04:24 CST; 7h left
     Docs: https://kubernetes.io/docs/
 Main PID: 977 (kubelet)
    Tasks: 18 (limit: 49767)
   Memory: 130.8M
   CGroup: /system.slice/kubelet.service
           └─977 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --network-plugin=cni --pod-infra-container-image=k8s.gcr.io/pause:3.2

Sep 15 16:13:29 master kubelet[977]: E0915 16:13:29.846153 977 kubelet.go:2183] node "master" not found
Sep 15 16:13:29 master kubelet[977]: E0915 16:13:29.946387 977 kubelet.go:2183] node "master" not found
Sep 15 16:13:30 master kubelet[977]: E0915 16:13:30.046577 977 kubelet.go:2183] node "master" not found
Sep 15 16:13:30 master kubelet[977]: E0915 16:13:30.146836 977 kubelet.go:2183] node "master" not found
Sep 15 16:13:30 master kubelet[977]: E0915 16:13:30.246931 977 kubelet.go:2183] node "master" not found
Sep 15 16:13:30 master kubelet[977]: E0915 16:13:30.347092 977 kubelet.go:2183] node "master" not found
Sep 15 16:13:30 master kubelet[977]: E0915 16:13:30.447178 977 kubelet.go:2183] node "master" not found
Sep 15 16:13:30 master kubelet[977]: I0915 16:13:30.462614 977 kubelet_node_status.go:70] Attempting to register node master
Sep 15 16:13:30 master kubelet[977]: E0915 16:13:30.547409 977 kubelet.go:2183] node "master" not found
Sep 15 16:13:30 master kubelet[977]: E0915 16:13:30.647751 977 kubelet.go:2183] node "master" not found
[root@master ~]#
5
dunhanson OP apiserver logs:
https://imgtu.com/i/4ZZUeI
6
cyaki 2021-09-15 16:29:19 +08:00
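Is kubectl also failing with "connection refused"?
If the apiserver is running and the certificates are fine, kubectl should be able to connect. Test the certificates, or check whether the apiserver logs show any certificate errors:
http --verify pki/ca.pem --cert pki/cert.pem --cert-key pki/cert-key.pem https://192.168.2.53:6443/version
The address in https://192.168.2.53:6443/version must be one that is included in the apiserver's TLS certificate.
By the way, judging from the logs above, the apiserver has already died, so you need to find out why. Even if kubelet can't reach the apiserver, kubelet should still be able to bring it up (with kubeadm, kubelet runs the apiserver in docker).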
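To check the point about the address needing to be in the apiserver's TLS certificate, something like the following might help. It is only a sketch: the function name `cert_has_san` is made up for illustration, and the certificate path in the usage line assumes a default kubeadm layout (`/etc/kubernetes/pki/apiserver.crt`), which may not match your setup.

```shell
# Sketch: succeed if the given IP or DNS name is listed in the
# certificate's Subject Alternative Name extension.
# cert_has_san is a hypothetical helper name, not part of any tool.
cert_has_san() {
  # $1 = path to a PEM certificate, $2 = IP or DNS name to look for
  openssl x509 -in "$1" -noout -text \
    | grep -A1 'Subject Alternative Name' \
    | tail -n 1 \
    | tr ',' '\n' \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | grep -qxF -e "DNS:$2" -e "IP Address:$2"
}

# Usage (path assumes a default kubeadm install):
#   cert_has_san /etc/kubernetes/pki/apiserver.crt 192.168.2.53 && echo "address is in the cert"
```

Entries are compared whole, so `192.168.2.5` would not be mistaken for `192.168.2.53`.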
8
dunhanson OP I'll dig into why exactly the apiserver died.
9
zanelee 2021-09-15 17:17:16 +08:00
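From the screenshot, the apiserver has died and its container has exited. There must have been an error when the apiserver started, so you need to look at that error message.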
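A rough way to pull that startup error out of the exited container, assuming the runtime is docker (as suggested earlier in the thread) and that the container name begins with `k8s_kube-apiserver`, which is the usual dockershim naming. `apiserver_logs` is a made-up helper name.

```shell
# Sketch: print the last lines of the most recently created
# kube-apiserver container, including exited ones.
# Assumes docker as the runtime; apiserver_logs is a hypothetical name.
apiserver_logs() {
  tail_lines="${1:-50}"   # how many log lines to show
  # docker ps lists newest containers first, so take the first ID
  cid="$(docker ps -a --filter name=k8s_kube-apiserver --format '{{.ID}}' | head -n 1)"
  if [ -z "$cid" ]; then
    echo "no kube-apiserver container found" >&2
    return 1
  fi
  docker logs --tail "$tail_lines" "$cid" 2>&1
}

# Usage:
#   apiserver_logs 100
```

With containerd or cri-o the same idea should work through `crictl`, but that is not covered here.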
10
zen9073 2021-09-15 17:56:31 +08:00
etcd can run as a single node,
but once it has been scaled out, it can't be scaled back down to a single node.
11
flybluewolf 2021-09-15 18:12:10 +08:00
After deploying k8s HA with kubeadm, you need to manually edit kube-controller-manager.yaml and kube-scheduler.yaml under /etc/kubernetes/manifests on every master:
# Remove --port=0 (it turns off listening on the insecure HTTP port)
# Change to --bind-address=0.0.0.0
In etcd.yaml, change to --listen-metrics-urls=http://0.0.0.0:2381
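For what it's worth, the two edits to kube-controller-manager.yaml and kube-scheduler.yaml could be scripted roughly like this. It is a sketch, not a tested procedure: `patch_manifest` is a made-up name, and the backup goes to a separate directory on the assumption that stray copies should not be left inside the manifests directory that kubelet watches. The etcd.yaml change is not handled.

```shell
# Sketch: drop the "--port=0" argument and set --bind-address=0.0.0.0
# in one static pod manifest. patch_manifest is a hypothetical helper.
patch_manifest() {
  manifest="$1"                 # e.g. /etc/kubernetes/manifests/kube-scheduler.yaml
  backup_dir="${2:-$HOME}"      # where to keep the untouched copy
  backup="$backup_dir/$(basename "$manifest").bak"
  cp "$manifest" "$backup" || return 1
  # Rewrite the manifest from the backup copy
  sed -e '/^[[:space:]]*-[[:space:]]*--port=0[[:space:]]*$/d' \
      -e 's/\(--bind-address=\).*/\10.0.0.0/' \
      "$backup" > "$manifest"
}

# Usage:
#   patch_manifest /etc/kubernetes/manifests/kube-scheduler.yaml /root
#   patch_manifest /etc/kubernetes/manifests/kube-controller-manager.yaml /root
```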
12
dunhanson OP @zen9073 This is the error:
opBackOff: "back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-master_kube-system(2521a1e32c7f366d38f88fe227ff6710)"
Sep 15 18:13:43 master kubelet[940]: E0915 18:13:43.018221 940 pod_workers.go:191] Error syncing pod 2521a1e32c7f366d38f88fe227ff6710 ("kube-apiserver-master_kube-system(2521a1e32c7f366d38f88fe227ff6710)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-master_kube-system(2521a1e32c7f366d38f88fe227ff6710)"
Sep 15 18:13:57 master kubelet[940]: E0915 18:13:57.019782 940 pod_workers.go:191] Error syncing pod 2521a1e32c7f366d38f88fe227ff6710 ("kube-apiserver-master_kube-system(2521a1e32c7f366d38f88fe227ff6710)"), skipping: failed to "StartContainer" for "kube-apiserver" with CrashLoopBackOff: "back-off 5m0s restarting failed container=kube-apiserver pod=kube-apiserver-master_kube-system(2521a1e32c7f366d38f88fe227ff6710)"
13
dolphintwo 2021-09-15 18:28:48 +08:00
We need more detailed apiserver logs.
15
zen9073 2021-09-16 11:31:59 +08:00
@dunhanson
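I ran into the same problem before. kube-apiserver wouldn't start because etcd wasn't up, and etcd wasn't up for the reason I mentioned earlier. For now, restore the original multi-master configuration, back up etcd, and then redeploy a single-node k8s from that etcd backup.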
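The "back up etcd" step might look roughly like this once the multi-master configuration is restored and etcd is healthy again. It is a sketch under assumptions: `etcdctl` is available on the host (or you run the same commands inside the etcd container), and the certificates live in the default kubeadm location `/etc/kubernetes/pki/etcd`. The names `etcdctl3` and `etcd_backup` are made up for illustration.

```shell
# Assumed defaults for a kubeadm-managed etcd; override via environment.
ETCD_PKI="${ETCD_PKI:-/etc/kubernetes/pki/etcd}"
ETCD_ENDPOINT="${ETCD_ENDPOINT:-https://127.0.0.1:2379}"

# Hypothetical wrapper: etcdctl (v3 API) with the TLS flags filled in.
etcdctl3() {
  ETCDCTL_API=3 etcdctl \
    --endpoints="$ETCD_ENDPOINT" \
    --cacert="$ETCD_PKI/ca.crt" \
    --cert="$ETCD_PKI/server.crt" \
    --key="$ETCD_PKI/server.key" \
    "$@"
}

# Sketch: confirm the cluster answers, then save a snapshot to $1.
etcd_backup() {
  etcdctl3 member list || return 1
  etcdctl3 snapshot save "$1"
}

# Usage:
#   etcd_backup /root/etcd-snapshot.db
```

The redeploy from the snapshot is not shown here; the steps depend on the etcd and kubeadm versions in use.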
16
defunct9 2021-09-16 11:37:35 +08:00
I thought I heard someone calling me.
17
dunhanson OP
18
dunhanson OP Sigh, I'll be more careful next time.