Ensure webhooks availability

What does this MR do and why?

Several webhooks are each served by a single pod, which may be drained during a rolling update. While that pod is down, calls to the webhook time out, resulting in errors like this one:

         ├┄╴mgmt-1185922826-rke2-capo-oci-control-plane.17b614ed12900a31                      FailedScaleUp                                                   Failed to create additional control plane Machine for cluster sylva-system/mgmt-1185922826-rke2-capo-oci control plane: failed to generate bootstrap config: Failed to create bootstrap configuration: Internal error occurred: failed calling webhook "mrke2config.kb.io": failed to call webhook: Post "https://rke2-bootstrap-webhook-service.rke2-bootstrap-system.svc:443/mutate-bootstrap-cluster-x-k8s-io-v1alpha1-rke2config?timeout=10s": context deadline exceeded

This occurred in https://gitlab.com/sylva-projects/sylva-core/-/jobs/6229760990

This MR increases deployment replicas to 2 on HA clusters and adds a PodDisruptionBudget with minAvailable: 1 to the following deployments, so that at least one webhook pod stays available while a node is drained:

  • capi
  • cabpr
  • cabpk
  • capo
  • capd
  • capv
  • capm3
  • baremetal-operator
  • rancher-webhook
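
For illustration, the change applied to each deployment above can be sketched as follows. This is a minimal sketch, not the actual manifests from this MR: the helper names, the PodDisruptionBudget name and the `control-plane: controller-manager` selector label are assumptions made for the example. Only `minAvailable: 1`, the replica count of 2 on HA clusters and the `rke2-bootstrap-system` namespace come from the text and log above.

```python
import json


def pdb_manifest(name, namespace, match_labels, min_available=1):
    """Build a policy/v1 PodDisruptionBudget as a plain dict.

    With min_available=1, a voluntary eviction (e.g. a node drain) is
    refused if it would leave no ready pod matching the selector.
    """
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "minAvailable": min_available,
            "selector": {"matchLabels": dict(match_labels)},
        },
    }


def replicas_patch(ha):
    """Return a Deployment patch: 2 replicas on HA clusters.

    For non-HA clusters this sketch returns an empty patch, i.e. it
    leaves the replica count untouched (assumption, not from the MR).
    """
    return {"spec": {"replicas": 2}} if ha else {}


if __name__ == "__main__":
    # Hypothetical name and labels; check the real deployment's pod labels.
    pdb = pdb_manifest(
        name="rke2-bootstrap-webhook-pdb",
        namespace="rke2-bootstrap-system",
        match_labels={"control-plane": "controller-manager"},
    )
    print(json.dumps(pdb, indent=2))
    print(json.dumps(replicas_patch(ha=True)))
```

Two replicas combined with minAvailable: 1 let a drain evict one pod at a time while the other keeps answering webhook calls. The selector must match the labels actually set on the deployment's pods, otherwise the budget protects nothing.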

Related reference(s)

Relates to: #997 Closes: #928 (closed)

Edited by Francois Eleouet
