RKE2: "proxy error from 127.0.0.1:9345" during rolling update, on apiserver calls triggering webhooks
During mgmt cluster rolling update, I've observed this:
Machines: failed to update Machine "sylva-system/mgmt-1244888013-rke2-capm3-virt-md0-vbn6k-mhnmq":
failed to apply Machine sylva-system/mgmt-1244888013-rke2-capm3-virt-md0-vbn6k-mhnmq:
Internal error occurred:
failed calling webhook "default.machine.cluster.x-k8s.io":
failed to call webhook:
Post "https://capi-webhook-service.capi-system.svc:443/mutate-cluster-x-k8s-io-v1beta1-machine?timeout=10s":
proxy error from 127.0.0.1:9345 while dialing 100.72.159.143:9443, code 502: 502 Bad Gateway
(https://gitlab.com/sylva-projects/sylva-core/-/jobs/6576299223#L1413)
This seems to be an occurrence of https://github.com/rancher/rke2/issues/5614 (more explicitly a variant: issue not triggered by stopping an rke2-server, but because an rke2-server is registered to the RKE2 cluster before it is fully ready).
Edited by Thomas Morin