Use OpenStack images from `sylva_diskimagebuilder_images`/`os_images`
This MR will close #528 (closed):
- it propagates the information on OpenStack image UUIDs built by the Job added by !1231 (merged) into s-c-c
- in s-c-c, the new `image_key` setting lets the user specify which image to use, by giving a key of `os_images` or `sylva_diskimagebuilder_images`
- it ensures that the get-openstack-images unit can be enabled in a workload cluster and that it gets the proper information on images from shared cluster settings
Result (in user values for the mgmt cluster or a workload cluster):
```yaml
sylva_diskimagebuilder_images:
  ubuntu-jammy-plain-rke2-1.25.14:
    enabled: true

cluster:
  capo:
    image_key: ubuntu-jammy-plain-rke2-1-25-14

  # could also be done:
  control_plane:
    capo:
      #image_key: ubuntu-jammy-plain-rke2-1-25-14
  machine_deployments:
    foo:
      capo:
        #image_key: ubuntu-jammy-plain-rke2-1-25-14
```
Behavior: get-openstack-images will ensure the presence in Glance of the image defined by `sylva_diskimagebuilder_images.ubuntu-jammy-plain-rke2-1.25.14` (built with the diskimage-builder project version configured under `sylva_diskimagebuilder_images`), and s-c-c will just make use of it in the `OpenStackMachineTemplate`
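To illustrate the idempotent behavior described above, here is a minimal sketch of the kind of check such a job performs before pushing anything (the function name and data shapes are illustrative, not the actual `push-images-to-glance.py` code): given the images already present in Glance, only the enabled images that are missing get uploaded.

```python
def images_to_push(enabled_images, glance_images):
    """Return the keys of enabled images not yet present in Glance.

    enabled_images: dict mapping an image key to its spec,
                    e.g. {"ubuntu-jammy-plain-rke2-1-25-14": {"name": "..."}}
    glance_images:  list of dicts as returned by an image listing,
                    each with at least a "name" field
    """
    existing_names = {img["name"] for img in glance_images}
    return [key for key, spec in enabled_images.items()
            if spec["name"] not in existing_names]


enabled = {
    "ubuntu-jammy-plain-rke2-1-25-14": {"name": "ubuntu-jammy-plain-rke2-1.25.14"},
}

# Image absent from Glance: it must be pushed.
print(images_to_push(enabled, []))
# Image already present: nothing to do, the unit is a no-op.
print(images_to_push(enabled, [{"name": "ubuntu-jammy-plain-rke2-1.25.14"}]))
```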
- make the name of the configmap produced by os-images-info dynamic, to work around a race condition that I observed (on an `apply.sh` where I had added an image in `sylva_diskimagebuilder_images`, the get-openstack-images Kustomization was reconciled before the os-images-info Kustomization was processed by the Flux Kustomization controller, and re-run but with the old os-images-info configmap)
- I changed the temporary storage from an `emptyDir` into an `ephemeral` volume of 8Gi (size aligned with what we have for os-image-server by default): the emptyDir/Memory was too tight (I observed OOM crashes: a few GB of disk image is a lot for our CAPO VMs, which only have 8GB by default with the `m1.large` flavor in our CAPO CI env), and using an emptyDir without `medium: Memory` gave "no space left on device" errors
- to get this MR to work, I fixed three issues in `push-images-to-glance.py` (closes get-openstack-images fails to push an image (#909 - closed), closes push-images-to-glance.py creates wrong datastru... (#910 - closed) and closes push-images-to-glance.py uses HTTP when insecur... (#918 - closed)), and I added some useful logging to understand these issues more easily
- set the job `restartPolicy` to `Never`: we had decided on making the `get-openstack-images` Job `restartPolicy` `OnFailure`, to allow for retries, but this policy has the unfortunate side-effect that on a failure the pod is deleted, along with its logs (see https://kubernetes.io/docs/concepts/workloads/controllers/job/#pod-backoff-failure-policy: "If your job has restartPolicy = "OnFailure", keep in mind that your Pod running the Job will be terminated once the job backoff limit has been reached [...]"). We have `backoffLimit: 5`, so even with `restartPolicy` set to `Never`, we'll have 5 retries
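The combination of `restartPolicy: Never` with `backoffLimit: 5` and a generic ephemeral volume could look like the following Job excerpt (an illustrative sketch using standard Kubernetes fields, not the chart's actual manifest; names and image are placeholders):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: get-openstack-images   # illustrative name
spec:
  backoffLimit: 5              # the Job controller recreates the pod up to 5 times
  template:
    spec:
      restartPolicy: Never     # failed pods are kept, so their logs remain available
      containers:
        - name: push-images
          image: example/push-images-to-glance   # placeholder image
          volumeMounts:
            - name: tmp-images
              mountPath: /tmp/images
      volumes:
        - name: tmp-images
          ephemeral:           # generic ephemeral volume instead of emptyDir
            volumeClaimTemplate:
              spec:
                accessModes: ["ReadWriteOnce"]
                resources:
                  requests:
                    storage: 8Gi   # aligned with the os-image-server default
```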
This is a follow-up to !1231 (merged) and sylva-projects/sylva-elements/helm-charts/sylva-capi-cluster!240 (merged).
Edited by Thomas Morin