k8s: kubelet cannot reach the apiserver (connection reset by peer); nginx reports no live upstreams while connecting to upstream

I0526 11:22:02.558478    3806 kubelet_node_status.go:286] Setting node annotation to enable volume controller attach/detach
I0526 11:22:02.605102    3806 kubelet_node_status.go:72] Attempting to register node hdss7-22.host.com
E0526 11:22:02.624074    3806 kubelet.go:2252] node "hdss7-22.host.com" not found
E0526 11:22:02.729241    3806 kubelet.go:2252] node "hdss7-22.host.com" not found
E0526 11:22:02.768835    3806 kubelet_node_status.go:94] Unable to register node "hdss7-22.host.com" with API server: Post https://10.4.7.10:7443/api/v1/nodes: read tcp 10.4.7.22:37410->10.4.7.10:7443: read: connection reset by peer
E0526 11:22:02.829862    3806 kubelet.go:2252] node "hdss7-22.host.com" not found
E0526 11:22:02.900886    3806 event.go:249] Unable to write event: 'Patch https://10.4.7.10:7443/api/v1/namespaces/default/events/hdss7-22.host.com.16124f2f494f590b: read tcp 10.4.7.22:37414->10.4.7.10:7443: read: connection reset by peer' (may retry after sleeping)
E0526 11:22:02.931642    3806 kubelet.go:2252] node "hdss7-22.host.com" not found
E0526 11:22:03.032027    3806 kubelet.go:2252] node "hdss7-22.host.com" not found
E0526 11:22:03.132818    3806 kubelet.go:2252] node "hdss7-22.host.com" not found

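The kubelet log above shows that requests to 10.4.7.10:7443 fail with connection reset by peer. 10.4.7.10 is a virtual IP set up with keepalived, and port 7443 on it is proxied by nginx to port 6443 of the apiservers. I then checked the nginx error log, which was being flooded with entries like this: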

no live upstreams while connecting to upstream, client: 10.4.7.21, server: 0.0.0.0:7443, upstream: "kube-apiserver", bytes from/to client:0/0, bytes from/to upstream:0/0

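I searched online for this error for a long time, and every answer said the backend service was not running. But telnet to port 6443 of the apiserver connected fine, and I could not work out where the problem was. In the end I simply restarted the apiserver, and after the restart nginx finally logged a different error: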

2020/05/26 11:16:31 [crit] 43875#0: *15442 connect() to 10.4.7.21:6443 failed (13: Permission denied) while connecting to upstream, client: 10.4.7.22, server: 0.0.0.0:7443, upstream: "10.4.7.21:6443", bytes from/to client:0/0, bytes from/to upstream:0/0
2020/05/26 11:16:31 [crit] 43875#0: *15442 connect() to 10.4.7.22:6443 failed (13: Permission denied) while connecting to upstream, client: 10.4.7.22, server: 0.0.0.0:7443, upstream: "10.4.7.22:6443", bytes from/to client:0/0, bytes from/to upstream:0/0
2020/05/26 11:16:31 [crit] 43874#0: *15445 connect() to 10.4.7.21:6443 failed (13: Permission denied) while connecting to upstream, client: 10.4.7.21, server: 0.0.0.0:7443, upstream: "10.4.7.21:6443", bytes from/to client:0/0, bytes from/to upstream:0/0
2020/05/26 11:16:31 [crit] 43874#0: *15445 connect() to 10.4.7.22:6443 failed (13: Permission denied) while connecting to upstream, client: 10.4.7.21, server: 0.0.0.0:7443, upstream: "10.4.7.22:6443", bytes from/to client:0/0, bytes from/to upstream:0/0
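
The two nginx messages are probably related: once every server in an upstream group has failed enough connection attempts, nginx stops trying them for a while and logs no live upstreams instead, so the underlying connect() error can be buried. As a rough sketch (not part of the original troubleshooting), the hypothetical helper summarize_upstream_errors below counts both kinds of entry in an error log read from stdin, which makes a rare Permission denied line easier to spot.

```shell
# Count nginx upstream failures by cause; reads an nginx error log on stdin.
# The function name is made up for this example.
summarize_upstream_errors() {
  awk '
    /\(13: Permission denied\) while connecting to upstream/ { denied++ }
    /no live upstreams while connecting to upstream/         { nolive++ }
    END { printf "permission_denied=%d no_live_upstreams=%d\n", denied + 0, nolive + 0 }
  '
}
```

Usage, assuming the log lives at /var/log/nginx/error.log (adjust the path to your setup): summarize_upstream_errors < /var/log/nginx/error.log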

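Seeing the Permission denied error was a relief, although no live upstreams while connecting to upstream was still being logged nonstop. I had already chased that message without finding anything, so the only lead left was failed (13: Permission denied) while connecting to upstream. That turned out to be the actual problem: after running the following command, everything returned to normal.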

setsebool -P httpd_can_network_connect 1
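
A likely explanation, assuming SELinux is in enforcing mode and nginx runs in the usual httpd_t domain: the policy only lets that domain connect to a limited set of ports, and 6443 is not one of them, so connect() is refused with Permission denied. A telnet from an interactive shell is not confined the same way, which would explain why the port looked reachable. The httpd_can_network_connect boolean lifts that restriction, and -P makes the change persist across reboots. The sketch below checks the boolean before and after the change; sebool_is_on is a hypothetical helper that only parses the line printed by getsebool.

```shell
# Return 0 if a line of `getsebool` output reports the boolean as on,
# e.g. "httpd_can_network_connect --> on". Helper name is made up here.
sebool_is_on() {
  case "$1" in
    *"--> on") return 0 ;;
    *)         return 1 ;;
  esac
}

# On the nginx host (requires the SELinux utilities):
#   getenforce    # prints Enforcing, Permissive or Disabled
#   sebool_is_on "$(getsebool httpd_can_network_connect)" && echo allowed || echo blocked
```

If the audit tools are installed, ausearch -m avc -ts recent -c nginx should list the corresponding denials, which is a quicker way to confirm SELinux is involved than guessing from the nginx log alone.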

Then use kubectl to check whether the node registered successfully:

kubectl get node

The node now shows up as registered, and that is the end of it.
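
For clusters with more than a couple of nodes, reading the table by eye gets tedious. A minimal sketch, assuming the default column layout of kubectl get node (NAME, then STATUS): the hypothetical helper not_ready_nodes prints every node whose status does not start with Ready.

```shell
# Print the names of nodes whose STATUS column does not start with "Ready".
# Expects the output of `kubectl get node --no-headers` on stdin.
# The function name is made up for this example.
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}
```

Usage: kubectl get node --no-headers | not_ready_nodes. No output means every node reports Ready.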


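Copyright notice: this is an original article by ty0415, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.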