use OpenStack images from sylva_diskimagebuilder_images/os_images

This MR will close #528 (closed):

  • it propagates the information about the OpenStack image UUIDs built by the Job added by !1231 (merged) into s-c-c
  • in s-c-c, the new image_key setting lets the user specify which image to use, by giving a key of os_images or sylva_diskimagebuilder_images
  • it ensures that the get-openstack-images unit can be enabled in a workload cluster and that it gets the proper image information from shared cluster settings

Result (in user values for the mgmt cluster or a workload cluster):

sylva_diskimagebuilder_images:
  ubuntu-jammy-plain-rke2-1.25.14:
    enabled: true

cluster:
  capo:
    image_key: ubuntu-jammy-plain-rke2-1-25-14

  #could also be done
  control_plane:
    capo:
      #image_key: ubuntu-jammy-plain-rke2-1-25-14

  machine_deployments:
    foo:
      capo:
        #image_key: ubuntu-jammy-plain-rke2-1-25-14

Behavior: get-openstack-images will ensure the presence in Glance of the image defined by sylva_diskimagebuilder_images.ubuntu-jammy-plain-rke2-1.25.14 (built with the diskimage-builder project), and s-c-c will simply use it in the OpenStackMachineTemplate
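Note the difference between the image name (ubuntu-jammy-plain-rke2-1.25.14, with dots) and the image_key value (ubuntu-jammy-plain-rke2-1-25-14, with dashes). A minimal sketch of the presumed normalization, assuming the key is simply derived from the entry name with dots replaced by dashes (the function name is illustrative, not the actual chart code):

```python
def image_name_to_key(name: str) -> str:
    """Derive an image_key value from a sylva_diskimagebuilder_images
    entry name: replace dots with dashes so the key is safe to use
    in Helm values and Kubernetes object names (assumed convention)."""
    return name.replace(".", "-")

print(image_name_to_key("ubuntu-jammy-plain-rke2-1.25.14"))
# prints ubuntu-jammy-plain-rke2-1-25-14
```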

💡 this MR is best reviewed one commit at a time; a few commits fix and adjust things related to !1231 (merged) and the os-images-info job:

  • make the name of the configmap produced by os-images-info dynamic, to work around a race condition that I observed (on an apply.sh run where I had added an image to sylva_diskimagebuilder_images, the get-openstack-images Kustomization was reconciled before the os-images-info Kustomization had been processed by the Flux Kustomization controller, so it was re-run with the old os-images-info configmap)
  • I changed the temporary storage from an emptyDir to an ephemeral volume of 8Gi (size aligned with what we have for os-image-server by default): the emptyDir with medium: Memory was too tight (I observed OOM crashes: a few GB of disk image is a lot for our CAPO VMs, which only have 8GB by default with the m1.large flavor in our CAPO CI env), and using an emptyDir without medium: Memory gave "no space left on device" errors
  • to get this MR to work, I fixed three issues in push-images-to-glance.py (closes get-openstack-images fails to push an image (#909 - closed), closes push-images-to-glance.py creates wrong datastru... (#910 - closed) and closes push-images-to-glance.py uses HTTP when insecur... (#918 - closed)), and I added some useful logging to understand these issues more easily
  • set the Job restartPolicy to Never: we had initially decided on making the get-openstack-images Job restartPolicy OnFailure to allow for retries, but this policy has the unfortunate side effect that on a failure the pod is deleted, along with its logs (see https://kubernetes.io/docs/concepts/workloads/controllers/job/#pod-backoff-failure-policy: "If your job has restartPolicy = "OnFailure", keep in mind that your Pod running the Job will be terminated once the job backoff limit has been reached [...]"). Since we have backoffLimit: 5, we still get 5 retries even with restartPolicy set to Never
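The storage and retry changes from the last two bullets would look roughly like the following Job fragment; only restartPolicy, backoffLimit and the 8Gi size come from the MR, all other names and fields are placeholders:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: get-openstack-images   # placeholder name
spec:
  backoffLimit: 5              # up to 5 retries, even with restartPolicy: Never
  template:
    spec:
      restartPolicy: Never     # failed pods (and their logs) are kept
      containers:
        - name: push-images-to-glance   # placeholder
          image: registry.example/push-images-to-glance   # placeholder
          volumeMounts:
            - name: tmp-images
              mountPath: /tmp/images    # placeholder path
      volumes:
        - name: tmp-images
          ephemeral:           # replaces the previous emptyDir
            volumeClaimTemplate:
              spec:
                accessModes: ["ReadWriteOnce"]
                resources:
                  requests:
                    storage: 8Gi   # aligned with os-image-server default
```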
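The dynamic configmap name from the first bullet can be sketched as a content-derived suffix, similar to what kustomize's configMapGenerator does: as long as the name changes whenever the image data changes, consumers are forced to pick up the new configmap rather than a stale one. This is an illustrative sketch, not the actual job code:

```python
import hashlib
import json


def configmap_name(base: str, data: dict) -> str:
    """Build a configmap name suffixed with a short hash of its data,
    so that any change to the data yields a new name (assumed scheme,
    analogous to kustomize configMapGenerator name suffixing)."""
    canonical = json.dumps(data, sort_keys=True).encode()
    digest = hashlib.sha256(canonical).hexdigest()[:8]
    return f"{base}-{digest}"


old = configmap_name("os-images-info", {"ubuntu-jammy": "uuid-1"})
new = configmap_name("os-images-info", {"ubuntu-jammy": "uuid-2"})
print(old != new)  # prints True: changed data gives a different name
```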

This is a follow-up to !1231 (merged) and sylva-projects/sylva-elements/helm-charts/sylva-capi-cluster!240 (merged).

/cc @mihai.zaharia @cristian.manda

Edited by Thomas Morin
