Add a healthCheckExprs entry to ensure that loki-secrets contains at least one tenant definition.
What does this MR do and why?
1 - Add a healthCheckExprs entry to ensure that loki-secrets contains at least one tenant definition.
Since the secret content is a base64-encoded JSON string, it does not seem possible to inspect it directly in CEL, so we check its length instead, assuming that the length will increase as soon as a tenant definition is present:
```shell
❯ echo '{"loki":{"tenants":[]}}' | base64 | wc -c
33
❯ echo '{"loki":{"tenants":[{}]}}' | base64 | wc -c
37
❯ echo '{"loki":{"tenants":[{"a": "b"}]}}' | base64 | wc -c
49
```
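As a rough illustration of what such a check can look like, here is a hedged sketch of a Flux Kustomization fragment. This is not the exact manifest from this MR: the secret key `loki.yaml`, the namespace, and the length threshold are all assumptions for illustration.

```yaml
# Hedged sketch: key name, namespace and threshold are hypothetical.
# CEL cannot base64-decode the value here, so the expression compares the
# length of the encoded payload against that of an empty tenants list.
spec:
  healthChecks:
    - apiVersion: v1
      kind: Secret
      name: loki-secrets
      namespace: loki                       # hypothetical namespace
  healthCheckExprs:
    - apiVersion: v1
      kind: Secret
      current: size(data['loki.yaml']) > 32 # longer than '{"loki":{"tenants":[]}}' encoded
```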
2 - Additionally, in a second commit, add secret keys and their data lengths to debug-on-exit, as this could help debug specific cases like this one.
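As a rough illustration of the idea behind this second commit (the actual debug-on-exit tooling is not shown here, and the sample data below is hypothetical), reporting keys together with the length of their base64 payload gives enough signal to diagnose an empty secret without leaking its content:

```python
import base64

# Hypothetical Secret-like .data mapping: keys map to base64-encoded values.
secret_data = {
    "loki.yaml": base64.b64encode(b'{"loki":{"tenants":[]}}').decode(),
}

def summarize(data: dict) -> list[str]:
    # Report each key with the length of its (base64) payload, never the value.
    return [f"{key}: {len(value)} bytes (base64)" for key, value in sorted(data.items())]

for line in summarize(secret_data):
    print(line)  # → loki.yaml: 32 bytes (base64)
```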
3 - Finally, in a third commit, add a CleanupPolicy that deletes any empty aggregated secret:
Since we have observed that loki-aggregated-secrets was sometimes empty even though the source secrets were present, we need to work around this corner case. It cannot be handled by adding a precondition (checking whether tenants is empty) to the policy, since that would prevent the policy from creating loki-secrets in the first place, and nothing would then re-trigger the policy. Conversely, deleting the generated resource does re-trigger the policy, which is why we introduce this cleanup policy (only on first cluster installation) to force re-generation of the empty secret.
This has been tested by inverting the policy condition (using GreaterThanOrEquals instead of LessThanOrEquals); the secret is then periodically deleted and re-created:
```shell
$ k get secrets loki-secrets -w
NAME           TYPE     DATA   AGE
loki-secrets   Opaque   1      97m
loki-secrets   Opaque   1      0s
loki-secrets   Opaque   1      59s
loki-secrets   Opaque   1      0s
loki-secrets   Opaque   1      59s
loki-secrets   Opaque   1      0s
```
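The cleanup policy could be sketched roughly as follows. This is a hedged illustration, not the exact manifest from this MR: the policy name, namespace, schedule, secret key, and length threshold are assumptions; `LessThanOrEquals` is the operator mentioned above.

```yaml
# Hedged sketch of a Kyverno CleanupPolicy deleting the aggregated secret
# when its payload is no larger than an empty tenants list, so that the
# generating policy is re-triggered.
apiVersion: kyverno.io/v2
kind: CleanupPolicy
metadata:
  name: cleanup-empty-loki-secrets     # hypothetical name
  namespace: loki                      # hypothetical namespace
spec:
  schedule: "*/5 * * * *"              # hypothetical schedule
  match:
    any:
      - resources:
          kinds:
            - Secret
          names:
            - loki-secrets
  conditions:
    all:
      - key: "{{ length(target.data.\"loki.yaml\" || '') }}"  # hypothetical key
        operator: LessThanOrEquals
        value: 32
```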
Related reference(s)
Closes: #2895
Test coverage
CI configuration
Below you can choose test deployment variants to run in this MR's CI.
Click to open the CI configuration
Legend:

| Icon | Meaning | Available values |
|---|---|---|
| ☁️ | Infra Provider | capd, capo, capm3 |
| 🚀 | Bootstrap Provider | kubeadm (alias kadm), rke2, okd, ck8s |
| 🐧 | Node OS | ubuntu, suse, na, leapmicro |
| 🛠️ | Deployment Options | light-deploy, dev-sources, ha, misc, maxsurge-0, logging, no-logging, openbao |
| 🎬 | Pipeline Scenarios | Available scenario list and description |
- 🎬 preview ☁️ capd 🚀 kadm 🐧 ubuntu
- 🎬 preview ☁️ capo 🚀 rke2 🐧 suse
- 🎬 preview ☁️ capm3 🚀 rke2 🐧 ubuntu
- ☁️ capd 🚀 kadm 🛠️ light-deploy 🐧 ubuntu
- ☁️ capd 🚀 rke2 🛠️ light-deploy 🐧 suse
- ☁️ capo 🚀 rke2 🐧 suse
- ☁️ capo 🚀 rke2 🐧 leapmicro
- ☁️ capo 🚀 kadm 🐧 ubuntu
- ☁️ capo 🚀 rke2 🎬 rolling-update-no-wkld 🛠️ ha,logging 🐧 suse
- ☁️ capo 🚀 rke2 🎬 rolling-update-no-wkld 🛠️ ha,logging 🐧 ubuntu
- ☁️ capo 🚀 kadm 🎬 rolling-update-no-wkld 🛠️ ha,logging 🐧 suse
- ☁️ capo 🚀 kadm 🎬 rolling-update-no-wkld 🛠️ ha,logging 🐧 ubuntu
- ☁️ capo 🚀 rke2 🎬 rolling-update 🛠️ ha 🐧 ubuntu
- ☁️ capo 🚀 kadm 🎬 wkld-k8s-upgrade 🐧 ubuntu
- ☁️ capo 🚀 rke2 🎬 rolling-update-no-wkld 🛠️ ha 🐧 suse
- ☁️ capo 🚀 rke2 🎬 sylva-upgrade-from-1.5.x 🛠️ ha 🐧 ubuntu
- ☁️ capo 🚀 rke2 🎬 sylva-upgrade-from-1.5.x 🛠️ ha,misc 🐧 ubuntu
- ☁️ capo 🚀 rke2 🛠️ ha,misc 🐧 ubuntu
- ☁️ capo 🚀 rke2 🛠️ ha,misc,openbao 🐧 suse
- ☁️ capm3 🚀 rke2 🐧 suse
- ☁️ capm3 🚀 kadm 🐧 ubuntu
- ☁️ capm3 🚀 ck8s 🐧 ubuntu
- ☁️ capm3 🚀 kadm 🎬 rolling-update-no-wkld 🛠️ ha,misc 🐧 ubuntu
- ☁️ capm3 🚀 rke2 🎬 wkld-k8s-upgrade 🛠️ ha 🐧 suse
- ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 ubuntu
- ☁️ capm3 🚀 rke2 🎬 sylva-upgrade-from-1.5.x 🛠️ ha 🐧 suse
- ☁️ capm3 🚀 rke2 🛠️ misc,ha 🐧 suse
- ☁️ capm3 🚀 rke2 🎬 sylva-upgrade-from-1.5.x 🛠️ ha,misc 🐧 suse
- ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 suse
- ☁️ capm3 🚀 ck8s 🎬 rolling-update 🛠️ ha 🐧 ubuntu
- ☁️ capm3 🚀 rke2|okd 🎬 no-update 🐧 ubuntu|na
Global config for deployment pipelines:

- autorun pipelines
- allow failure on pipelines
- record sylvactl events
Notes:

- Enabling `autorun` will make deployment pipelines run automatically without human interaction.
- Disabling `allow failure` will make deployment pipelines mandatory for pipeline success.
- If both `autorun` and `allow failure` are disabled, deployment pipelines will need manual triggering but will block the pipeline.
Be aware: after a configuration change, the pipeline is not triggered automatically.
Please run it manually (by clicking the "run pipeline" button in the Pipelines tab) or push new code.