Add minio-cleanup units to fix failed tenant status after upgrade

What does this MR do and why?

This merge request introduces two new units, minio-monitoring-cleanup and minio-logging-cleanup, which automatically address a known issue with the MinIO tenant during upgrades from specific versions. When upgrading from Sylva 1.3, the MinIO tenant enters a failed state with permission errors, showing "open /usr/bin/.minio.check-perm: permission denied" in its status. This issue was fixed upstream starting with the RELEASE.2024-09-22T00-33-43Z image.
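
For reference, the failure can be observed directly on the Tenant resource (the namespace and tenant name below match the monitoring tenant shown in the test output further down; the status field path is an assumption about the MinIO operator Tenant CRD):

kubectl get tenants.minio.min.io -n minio-monitoring monitoring -o jsonpath='{.status.currentState}'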

The units implement a workaround that deletes the StatefulSet (but not its pods) so that the operator can recreate it with the proper permissions. Each unit runs a simple script that reads the image from the MinIO tenant spec, compares it with the image used by the MinIO StatefulSet, and deletes the StatefulSet with the --cascade=orphan flag when the two differ. This preserves the running pods while allowing the operator to recreate the StatefulSet with the correct configuration.
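
As a rough illustration, the check performed by the cleanup scripts boils down to the following (a minimal sketch, assuming the monitoring tenant and its monitoring-pool-0 StatefulSet shown in the test output below; the logging unit targets the minio-logging namespace instead, and the actual unit scripts may differ in their details):

#!/bin/sh
NAMESPACE=minio-monitoring
TENANT=monitoring
STS="${TENANT}-pool-0"

# Image the operator expects, taken from the Tenant spec.
tenant_image=$(kubectl get tenants.minio.min.io -n "$NAMESPACE" "$TENANT" -o jsonpath='{.spec.image}')

# Image the existing StatefulSet is still running.
sts_image=$(kubectl get sts -n "$NAMESPACE" "$STS" -o jsonpath='{.spec.template.spec.containers[0].image}')

if [ -n "$sts_image" ] && [ "$tenant_image" != "$sts_image" ]; then
  # Delete only the StatefulSet object; --cascade=orphan keeps the pods running
  # so the operator can recreate the StatefulSet with the fixed permissions
  # and the expected image.
  kubectl delete sts -n "$NAMESPACE" "$STS" --cascade=orphan
fi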

The units are marked as temporary and will be removed after Sylva 1.4.

Related reference(s)

Closes #2193.

Test coverage

This was tested in CI run https://gitlab.com/sylva-projects/sylva-core/-/pipelines/1755374854; the output below, taken from the update-workload-cluster job, shows that the tenant and StatefulSet images match and that the tenant is healthy:

kubectl get tenants.minio.min.io -n minio-monitoring monitoring -o jsonpath='{.spec.image}'
quay.io/minio/minio:RELEASE.2024-11-07T00-52-20Z

crustgather-job-9641035383 ~> kubectl get sts -n minio-monitoring monitoring-pool-0 -o jsonpath='{.spec.template.spec.containers[0].image}'
quay.io/minio/minio:RELEASE.2024-11-07T00-52-20Z

crustgather-job-9641035383 ~> kubectl get tenant -n minio-monitoring
NAME         STATE         HEALTH   AGE
monitoring   Initialized   green    2025-04-07T09:14:26Z

CI configuration

Below you can choose test deployment variants to run in this MR's CI.

Click to open the CI configuration

Legend:

  • ☁️ Infra Provider: capd, capo, capm3
  • 🚀 Bootstrap Provider: kubeadm (alias kadm), rke2
  • 🐧 Node OS: ubuntu, suse
  • 🛠️ Deployment Options: light-deploy, dev-sources, ha, misc, maxsurge-0
  • 🎬 Pipeline Scenarios: see the available scenario list and description
  • 🎬 preview ☁️ capd 🚀 kadm 🐧 ubuntu

  • 🎬 preview ☁️ capo 🚀 rke2 🐧 suse

  • 🎬 preview ☁️ capm3 🚀 rke2 🐧 ubuntu

  • ☁️ capd 🚀 kadm 🛠️ light-deploy 🐧 ubuntu

  • ☁️ capd 🚀 rke2 🛠️ light-deploy 🐧 suse

  • ☁️ capo 🚀 rke2 🐧 suse

  • ☁️ capo 🚀 kadm 🐧 ubuntu

  • ☁️ capo 🚀 rke2 🎬 rolling-update 🛠️ ha 🐧 ubuntu

  • ☁️ capo 🚀 kadm 🎬 wkld-k8s-upgrade 🐧 ubuntu

  • ☁️ capo 🚀 rke2 🎬 rolling-update-no-wkld 🛠️ ha,misc 🐧 suse

  • ☁️ capo 🚀 rke2 🎬 sylva-upgrade-from-1.3.x 🛠️ ha,misc 🐧 ubuntu

  • ☁️ capo 🚀 rke2 🎬 sylva-upgrade-from-1.3.x 🛠️ ha,logging 🐧 ubuntu

  • ☁️ capm3 🚀 rke2 🐧 suse

  • ☁️ capm3 🚀 kadm 🐧 ubuntu

  • ☁️ capm3 🚀 kadm 🎬 rolling-update-no-wkld 🛠️ ha,misc 🐧 ubuntu

  • ☁️ capm3 🚀 rke2 🎬 wkld-k8s-upgrade 🛠️ ha 🐧 suse

  • ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 ubuntu

  • ☁️ capm3 🚀 rke2 🎬 sylva-upgrade-from-1.3.x 🛠️ ha,logging 🐧 suse

  • ☁️ capm3 🚀 rke2 🎬 sylva-upgrade-from-1.3.x 🛠️ misc,ha 🐧 suse

  • ☁️ capm3 🚀 kadm 🎬 sylva-upgrade-from-1.3.x 🛠️ ha,logging 🐧 ubuntu

  • ☁️ capm3 🚀 kadm 🎬 sylva-upgrade-from-1.3.x 🛠️ ha,misc 🐧 ubuntu

  • ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 suse

Global config for deployment pipelines

  • autorun pipelines
  • allow failure on pipelines
  • record sylvactl events

Notes:

  • Enabling autorun will make deployment pipelines run automatically without human interaction.
  • Disabling allow failure will make deployment pipelines mandatory for pipeline success.
  • If both autorun and allow failure are disabled, deployment pipelines will need manual triggering but will block the pipeline.

Be aware: after a configuration change, the pipeline is not triggered automatically. Please run it manually (by clicking the Run pipeline button in the Pipelines tab) or push new code.
