Add unit to back up CAPI resources of all clusters
What does this MR do and why?
This Merge Request adds:
- a unit_templates `backup-s3` to configure the target s3 bucket used to store backups (it will be used by other backup mechanisms)
- a unit `backup-capi-resources` which backs up CAPI resources of all clusters
It's the first MR of a batch of two MRs:
- !3813 (merged) which adds the `backup-capi-resources` unit
- !4288 (merged) which adds the `backup-etcd` unit
All clusters are backed up in one go on the management cluster, stored in one file per namespace.
`clusterctl move` cannot move a single cluster when several share the same namespace, so all the CAPI resources are backed up per namespace.
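As an illustration of the per-namespace granularity, here is a minimal Python sketch (not the unit's actual script) that derives the list of namespaces to back up from a set of clusters. The `Cluster` type, the `namespaces_to_backup` helper and the cluster names are hypothetical; only the namespace names come from the logs in this MR.

```python
from typing import NamedTuple


class Cluster(NamedTuple):
    """Hypothetical stand-in for a CAPI Cluster object (namespace + name)."""
    namespace: str
    name: str


def namespaces_to_backup(clusters: list[Cluster]) -> list[str]:
    """Return each namespace once, sorted, however many clusters it holds."""
    return sorted({c.namespace for c in clusters})


# Hypothetical inventory: two clusters share one namespace.
clusters = [
    Cluster("sylva-system", "management-cluster"),
    Cluster("my-rke2-capo-workload", "workload-a"),
    Cluster("my-rke2-capo-workload", "workload-b"),
]

# Both workload clusters end up in the same per-namespace backup file.
print(namespaces_to_backup(clusters))
```

With this inventory, three clusters yield only two backups, one per namespace.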
A kube-cronjob performs this backup periodically. Backups are not versioned for now, but a timestamp can be added to the file names to avoid overwriting the previous backup if the target s3 bucket does not support versioning.
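The naming rule can be sketched in Python as below. This is an assumption-laden illustration, not the unit's code: the base name follows the pattern visible in the logs of this MR, while the `backup_object_name` helper, the timestamp format (`YYYYMMDDHHMM`, UTC) and its position in the file name are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def backup_object_name(namespace: str, timestamped: bool,
                       now: Optional[datetime] = None) -> str:
    """Build the archive name for one namespace.

    Without a timestamp, every run produces the same name, so the previous
    backup is overwritten unless the bucket keeps object versions.
    """
    base = f"{namespace}_capi_resources_backup"
    if timestamped:
        # Assumed format and placement; the real unit may differ.
        now = now or datetime.now(timezone.utc)
        base = f"{base}_{now.strftime('%Y%m%d%H%M')}"
    return f"{base}.tar.gz"


fixed = datetime(2024, 1, 2, 3, 4, tzinfo=timezone.utc)
print(backup_object_name("sylva-system", False))
print(backup_object_name("sylva-system", True, fixed))
```

The second call yields a distinct object name per run, which is what protects earlier backups on a bucket without versioning.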
The backup operation consists of:
- building a `tar.gz` from `clusterctl move` to save the cluster configuration (plus a list of resources provided as parameter... `ConfigMap/capo-cluster-resources` for now)
- saving this `tar.gz` to an s3 bucket
  - the s3 bucket configuration is provided using specific configuration at the `root` of values.yaml

    ```yaml
    backup:
      store:
        timestamped: false
        s3:
          host: <s3-host>
          accessKey: <ak>
          secretKey: <sk>
          bucket: <bucket name>
          cert: <s3-host certificate if relevant>
    ```
- if the pushgateway unit is enabled, backup results are sent as metrics to prometheus using the pushgateway.
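
The compression and metrics steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the script shipped by the unit: `compress_backup_dir` and `pushgateway_request` are hypothetical helpers, the metric names `backup_success` and `backup_duration_seconds` are made up for the example, and the directory is assumed to already contain the files written by `clusterctl move`. The URL layout (`/metrics/job/<job>/<label>/<value>`) and the text payload follow the Pushgateway conventions.

```python
import tarfile
from pathlib import Path
from urllib.parse import quote


def compress_backup_dir(src_dir: Path, archive: Path) -> Path:
    """Pack a directory of exported resources into a gzip-compressed tarball."""
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps archive paths relative to the directory name.
        tar.add(src_dir, arcname=src_dir.name)
    return archive


def pushgateway_request(base_url: str, job: str, namespace: str,
                        success: bool, duration_s: int) -> tuple[str, str]:
    """Return (url, body) for a Pushgateway push, without sending it.

    Metric names are hypothetical; the namespace is used as a grouping label.
    """
    url = (
        f"{base_url.rstrip('/')}/metrics/job/{quote(job, safe='')}"
        f"/namespace/{quote(namespace, safe='')}"
    )
    body = (
        "# TYPE backup_success gauge\n"
        f"backup_success {1 if success else 0}\n"
        "# TYPE backup_duration_seconds gauge\n"
        f"backup_duration_seconds {duration_s}\n"
    )
    return url, body
```

The returned URL and body could then be sent with any HTTP client (a `PUT` or `POST` to the Pushgateway); the upload of the archive to the s3 bucket is left out because it depends on the client used.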
Related reference(s)
fix #1784 (closed)
Test coverage
Manual for now, on the management as well as the workload cluster, giving the following logs:
-- Set kubectl configuration.
Cluster "internal" set.
User "user" set.
Context "internal" created.
Switched to context "internal".
List of namespaces to backup : my-rke2-capo-workload sylva-system
-- Start backing up clusters from namespace 'my-rke2-capo-workload'.
Moving to directory...
Discovering Cluster API objects
Starting move of Cluster API objects Clusters=1
Moving Cluster API objects ClusterClasses=0
Saving files to /tmp/tmp.LkkniL/my-rke2-capo-workload_capi_resources_backup
-- Clusters backed up.
-- Backup compressed.
Added `backup` successfully.
`/tmp/tmp.BHCgAN` -> `backup/sylva-backup/my-rke2-capo-workload_capi_resources_backup.tar.gz`
Total: 62.97 KiB, Transferred: 62.97 KiB, Speed: 3.30 MiB/s
-- Backup uploaded
Backup succeeded in 123 seconds
-- Push result to the pushgateway
-- Start backing up clusters from namespace 'sylva-system'.
Moving to directory...
Discovering Cluster API objects
Starting move of Cluster API objects Clusters=1
Moving Cluster API objects ClusterClasses=0
Saving files to /tmp/tmp.OMaIbg/sylva-system_capi_resources_backup
-- Clusters backed up.
-- Backup compressed.
Added `backup` successfully.
`/tmp/tmp.dcMpgL` -> `backup/sylva-backup/sylva-system_capi_resources_backup.tar.gz`
Total: 72.93 KiB, Transferred: 72.93 KiB, Speed: 3.97 MiB/s
-- Backup uploaded
Backup succeeded in 91 seconds
-- Push result to the pushgateway
Backup summary:
2 Succeeded: my-rke2-capo-workload sylva-system
0 Failed :
If prometheus is deployed, the following metrics are available:
CI configuration
Below you can choose test deployment variants to run in this MR's CI.
Legend:
| Icon | Meaning | Available values |
|---|---|---|
| ☁️ | Infra Provider | capd, capo, capm3 |
| 🚀 | Bootstrap Provider | kubeadm (alias kadm), rke2 |
| 🐧 | Node OS | ubuntu, suse |
| 🛠️ | Deployment Options | light-deploy, oci, ha, misc |
| 🎬 | Pipeline Scenarios | no-wkld, simple-update, simple-update-no-wkld, rolling-update, rolling-update-no-wkld, wkld-k8s-upgrade, nightly, sylva-upgrade, sylva-upgrade-no-wkld, sylva-upgrade-from-x.x.x, preview |
- 🎬 preview ☁️ capd 🚀 kadm 🐧 ubuntu 🛠️ oci
- 🎬 preview ☁️ capo 🚀 rke2 🐧 suse
- 🎬 preview ☁️ capm3 🚀 rke2 🐧 ubuntu
- ☁️ capd 🚀 kadm 🛠️ light-deploy 🐧 ubuntu
- ☁️ capd 🚀 rke2 🛠️ oci,light-deploy 🐧 suse
- ☁️ capo 🚀 rke2 🛠️ oci 🐧 suse
- ☁️ capo 🚀 kadm 🛠️ oci 🐧 ubuntu
- ☁️ capo 🚀 rke2 🎬 rolling-update 🛠️ ha 🐧 ubuntu
- ☁️ capo 🚀 kadm 🎬 wkld-k8s-upgrade 🐧 ubuntu
- ☁️ capo 🚀 rke2 🎬 rolling-update-no-wkld 🛠️ ha,misc 🐧 suse
- ☁️ capo 🚀 rke2 🎬 sylva-upgrade-from-1.3.x 🛠️ ha,misc 🐧 ubuntu
- ☁️ capm3 🚀 rke2 🐧 suse
- ☁️ capm3 🚀 kadm 🛠️ oci 🐧 ubuntu
- ☁️ capm3 🚀 kadm 🎬 rolling-update-no-wkld 🛠️ ha,misc 🐧 ubuntu
- ☁️ capm3 🚀 rke2 🎬 wkld-k8s-upgrade 🛠️ ha 🐧 suse
- ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 ubuntu
- ☁️ capm3 🚀 rke2 🎬 sylva-upgrade-from-1.3.x 🛠️ misc,ha 🐧 suse
- ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 suse
Global config for deployment pipelines
- autorun pipelines
- allow failure on pipelines
Notes:
- Enabling `autorun` will make deployment pipelines run automatically without human interaction.
- Disabling `allow failure` will make deployment pipelines mandatory for pipeline success.
- If both `autorun` and `allow failure` are disabled, deployment pipelines will need manual triggering but will block the pipeline.
Be aware: after configuration change, pipeline is not triggered automatically.
Please run it manually (by clicking the run pipeline button in Pipelines tab) or push new code.
