Upgrade to Sylva >= 1.6: race condition on deletion of the avoid-delete-mgmt-resources-flux Kyverno policy

We have observed the following problem on upgrades from Sylva 1.4 to 1.6:

📜 Update sylva-units Helm release and associated resources
00:00
sylva-units HelmRelease not yet managed by SylvaUnitsRelease operator, soft-deleting it
  (no cascade deletion, HelmRelease will be recreated via SylvaUnitsRelease)
► suspending helmrelease sylva-units in sylva-system namespace
✔ helmrelease suspended
policy.kyverno.io "avoid-delete-mgmt-resources-flux" deleted from sylva-system namespace
Error from server: admission webhook "validate.kyverno.svc-fail-finegrained-sylva-system-avoid-delete-mgmt-resources-flux" denied the request: failed to fetch policy with key: key sylva-system/avoid-delete-mgmt-resources-flux: policy.kyverno.io "avoid-delete-mgmt-resources-flux" not found

This failure occurred at 2026-02-05T12:46:17Z.

In Kyverno logs, we see:

  • that Kyverno notices the deletion of the "avoid-delete-mgmt-resources-flux" Policy
  • that it receives a webhook call for the webhook definition corresponding to this policy

This policy acts on HelmReleases, and the step in commons.sh that follows the policy deletion is the deletion of the sylva-units HelmRelease. What happens therefore looks like a subtle race between the deletion of the avoid-delete-mgmt-resources-flux policy and the deletion of the sylva-units HelmRelease:

  • the policy is deleted (the Kyverno logs show that the resource is deleted)
  • but the webhook corresponding to the policy isn't deleted early enough...
  • ...so the HelmRelease deletion happens while the webhook still exists, and the API server calls a webhook for a policy that no longer exists
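
One possible way to tolerate this race, sketched here under assumptions rather than taken from the Sylva scripts, is to retry the failing call only when the error is the "failed to fetch policy with key" denial shown in the log above. This relies on the assumption that Kyverno removes the stale webhook shortly after the policy is gone, which has not been confirmed. The function name retry_on_stale_webhook and the RETRY_MAX_ATTEMPTS and RETRY_DELAY_SECONDS variables are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of commons.sh).
# Runs the given command. If it fails with the "failed to fetch policy"
# denial seen in the log above, waits and retries, on the ASSUMPTION that
# Kyverno will remove the stale webhook shortly. Any other failure is
# returned immediately with the command's own exit status.
retry_on_stale_webhook() {
  local max_attempts="${RETRY_MAX_ATTEMPTS:-10}"
  local delay="${RETRY_DELAY_SECONDS:-3}"
  local attempt=1 output status

  while true; do
    # Capture stdout and stderr together so the error text can be inspected.
    output="$("$@" 2>&1)" && status=0 || status=$?

    if [ "$status" -eq 0 ]; then
      [ -n "$output" ] && printf '%s\n' "$output"
      return 0
    fi

    # Give up when out of attempts or when the error is not the stale-webhook one.
    if [ "$attempt" -ge "$max_attempts" ] ||
       ! printf '%s' "$output" | grep -q 'failed to fetch policy with key'; then
      printf '%s\n' "$output" >&2
      return "$status"
    fi

    echo "stale Kyverno webhook, retrying ($attempt/$max_attempts)" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}
```

The HelmRelease deletion could then be wrapped as `retry_on_stale_webhook kubectl -n sylva-system delete helmrelease sylva-units`, followed by whatever flags the script already passes. Because the retry is limited to this one error message, a genuine denial by another policy would still fail immediately.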