RKE2: "proxy error from 127.0.0.1:9345" during rolling update, on apiserver calls triggering webhooks

During a rolling update of the management cluster, I observed the following error:

Machines: failed to update Machine "sylva-system/mgmt-1244888013-rke2-capm3-virt-md0-vbn6k-mhnmq": 
failed to apply Machine sylva-system/mgmt-1244888013-rke2-capm3-virt-md0-vbn6k-mhnmq: 
Internal error occurred: 
failed calling webhook "default.machine.cluster.x-k8s.io": 
failed to call webhook: 
Post "https://capi-webhook-service.capi-system.svc:443/mutate-cluster-x-k8s-io-v1beta1-machine?timeout=10s": 
proxy error from 127.0.0.1:9345 while dialing 100.72.159.143:9443, code 502: 502 Bad Gateway

(https://gitlab.com/sylva-projects/sylva-core/-/jobs/6576299223#L1413)

This appears to be a variant of https://github.com/rancher/rke2/issues/5614. Here the issue is not triggered by stopping an rke2-server, but by an rke2-server being registered to the RKE2 cluster before it is fully ready.
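To check whether other CI job logs contain the same failure, something like the following could be used. This is only a rough sketch: it assumes the message keeps the exact wording seen above ("proxy error from ... while dialing ..., code ..."), and the names `ProxyError` and `parse_proxy_error` are hypothetical helpers, not part of RKE2 or sylva-core.

```python
import re
from typing import NamedTuple, Optional


class ProxyError(NamedTuple):
    """Fields extracted from a 'proxy error from ...' log message (hypothetical helper type)."""
    proxy: str   # address of the local proxy that reported the error, e.g. 127.0.0.1:9345
    target: str  # address the proxy was trying to dial, e.g. the webhook pod IP:port
    code: int    # HTTP status code returned by the proxy


# Assumption: the message format is the one observed in this issue.
_PATTERN = re.compile(
    r"proxy error from (?P<proxy>\S+) while dialing (?P<target>[^,\s]+), code (?P<code>\d+)"
)


def parse_proxy_error(line: str) -> Optional[ProxyError]:
    """Return the parsed fields if the line contains the proxy error, else None."""
    m = _PATTERN.search(line)
    if m is None:
        return None
    return ProxyError(m["proxy"], m["target"], int(m["code"]))


if __name__ == "__main__":
    sample = (
        "proxy error from 127.0.0.1:9345 while dialing "
        "100.72.159.143:9443, code 502: 502 Bad Gateway"
    )
    print(parse_proxy_error(sample))
```

Running this over each line of a job log and collecting the non-`None` results would show which webhook endpoints were being dialed when the proxy returned an error.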

Edited Apr 09, 2024 by Thomas Morin