simplify baremetal OS images settings
What we have today
In the sylva-units default values.yaml, `default_os_images` and `os_images` contain this kind of entry to configure the os-image-server unit:
ubuntu-jammy-plain-rke2-1-26-9:
  uri: "{{ .Values.sylva_base_oci_registry }}/sylva-elements/diskimage-builder/ubuntu-jammy-plain-rke2-1.26.9:0.0.12"
  filename: ubuntu-jammy-plain-rke2-1.26.9.qcow2
  checksum: ffdc81fcdc0104151aa792a508eefe0d47660b18683949edcd734b3a4f938f20
  persistence:
    enabled: true
    size: 3Gi
We can't list all images produced by diskimage-builder here, because os-image-server would then download all of them, which is wasteful and too heavy for a given deployment. So today we list a single image here.
Downstream of Sylva, we end up having to recreate the same data structure for the image we want to use.
Then we have sylva-capi-cluster, which needs this kind of config:
capm3:
  machine_image_url: http://ubuntu-jammy-plain-rke2-1-26-9.os-images.svc.cluster.local:8080/ubuntu-jammy-plain-rke2-1.26.9.qcow2
  machine_image_format: qcow2
  machine_image_checksum: http://ubuntu-jammy-plain-rke2-1-26-9.os-images.svc.cluster.local:8080/ubuntu-jammy-plain-rke2-1.26.9.qcow2.sha256sum
  machine_image_checksum_type: sha256
So today, we have lots of grunt work and repetitive things to write when we want to use a given image for a given deployment:
- have sylva-core use images from diskimage-builder (see #636 (closed)), at least if we keep wanting to have this kind of content for os_images (which I think we in fact don't need, see below)
- get the right settings in `cluster.capm3.machine
- have a way of ensuring that we don't deploy a `k8s_version: X` with an image built for Kubernetes version Y (we may not always want this check, but we would probably like to be able to enforce it sometimes)
Note that we still don't leverage one interesting Metal3 feature: we can give the image checksum directly to the IPA, instead of giving it a URL from which to fetch it. Giving the checksum directly would be one step towards protecting against MITM attacks, where an attacker would intercept the HTTP session in which the OS image is downloaded in order to insert malicious content.
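To make this concrete, here is a minimal sketch of the same capm3 settings with a literal checksum instead of a checksum URL. It assumes that sylva-capi-cluster forwards `machine_image_checksum` unchanged to the Metal3 image checksum field, and it reuses the checksum value from the os_images example above purely for illustration.

```yaml
capm3:
  machine_image_url: http://ubuntu-jammy-plain-rke2-1-26-9.os-images.svc.cluster.local:8080/ubuntu-jammy-plain-rke2-1.26.9.qcow2
  machine_image_format: qcow2
  # literal SHA256 value rather than a URL to a .sha256sum file
  # (assumption: the value is passed as-is to the Metal3 image checksum field)
  machine_image_checksum: ffdc81fcdc0104151aa792a508eefe0d47660b18683949edcd734b3a4f938f20
  machine_image_checksum_type: sha256
```

With this form, the checksum no longer travels over the same unauthenticated HTTP session as the image itself.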
Here is what I propose:
A) Stop our habit of giving the image checksum in os-image-server values (os_images) when we use an OCI artifact:
- registries have their own way of avoiding dataplane corruption
- if we care about provenance validation, we need to complete the implementation of OCI artifact signing and signature verification
- if we want the os-image-server downloader to not have to spend a few tens of seconds computing the SHA256, we can let it fetch it from the OCI artifact annotations
B) Simplify file names: we don't care what the actual filename is, we only need the one exposed by the os-image-server Ingress and the one used in the sylva-capi-cluster URL (machine_image_url) to be the same. We can standardize on `image.qcow2` everywhere by default and things will just work.
C) How to let Renovate bot update references in sylva-units to diskimage-builder artifacts:
- have sylva-units refer once to a given diskimage-builder release:
sylva_diskimagebuilder_version: 0.1.1
- have a data structure in sylva-units, `diskimage_builder_os_images`, that lists all artifact basenames:
diskimage_builder_os_images:
  ubuntu-jammy-plain-rke2-1.26.9: {}
  ubuntu-jammy-plain-rke2-1.25.15: {}
  opensuse-15.5-plain-rke2-1.26.9: {}
  ubuntu-jammy-hardened-rke2-1.26.9: {}
  ubuntu-jammy-plain-kubeadm-1.26.9: {}
- from this dict, generate `os_images` in sylva-units with templating, deriving the key from what precedes the `:` and building the value from the rest (using sylva_diskimagebuilder_version to build the full OCI URL), without defining any filename or checksum
- let's go further and assume that an entry in `os_images` is added only for keys for which there is an `enabled: true` field defined: this will allow users to select which images are prepared by os-image-server
- now let's see how to simplify the sylva-capi-cluster (s-c-c) capm3 config ...
- have the os-image-server downloader tool build a configmap containing a dict like this:
os_images_info:
  ubuntu-jammy-plain-rke2-1.26.9:
    url: http://ubuntu-jammy-plain-rke2-1-26-9.os-images.svc.cluster.local:8080/ubuntu-jammy-plain-rke2-1-26-9/image.qcow2
    checksum: <checksum retrieved from the OCI artifact>
    checksumType: ...
    format: qcow2
  ubuntu-jammy-plain-rke2-1.25.15: {}
  opensuse-15.5-plain-rke2-1.26.9: {}
  ubuntu-jammy-hardened-rke2-1.26.9: {}
  ubuntu-jammy-plain-kubeadm-1.26.9: {}
- this dict would be passed to sylva-capi-cluster HelmRelease with valuesFrom
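As an illustration of this wiring, here is a hedged sketch using the Flux HelmRelease `valuesFrom` mechanism. The ConfigMap name, namespace, HelmRelease name and apiVersion are hypothetical placeholders, and the other HelmRelease fields (chart, interval, ...) are left out.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: os-images-info          # hypothetical name
  namespace: example-namespace  # hypothetical; must match the HelmRelease namespace
data:
  # built by the os-image-server downloader tool
  values.yaml: |
    os_images_info:
      ubuntu-jammy-plain-rke2-1.26.9:
        url: http://ubuntu-jammy-plain-rke2-1-26-9.os-images.svc.cluster.local:8080/ubuntu-jammy-plain-rke2-1-26-9/image.qcow2
        checksum: "<checksum retrieved from the OCI artifact>"
        checksumType: sha256    # assumption, the proposal leaves this open
        format: qcow2
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1   # adjust to the Flux version in use
kind: HelmRelease
metadata:
  name: cluster                 # hypothetical name
  namespace: example-namespace
spec:
  # chart, interval and inline values omitted for brevity
  valuesFrom:
    - kind: ConfigMap
      name: os-images-info
      valuesKey: values.yaml    # merged into the chart values
```

Since `valuesFrom` merges the referenced YAML into the chart values, `os_images_info` would become available to the sylva-capi-cluster templates as a regular top-level value.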
- sylva-capi-cluster would accept a new `os_image_key` key under `capm3`:
cluster:
  capm3:
    os_image_key: ubuntu-jammy-plain-rke2-1.26.9
When this syntax is used, sylva-capi-cluster would build the `Metal3MachineTemplate.spec.template.spec.image` fields from `os_images_info.$os_image_key`.
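For the `os_image_key` above, the rendered resource could look roughly like the following sketch. The resource name is hypothetical, `checksumType: sha256` is an assumption, and all unrelated fields are omitted.

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: Metal3MachineTemplate
metadata:
  name: example-machine-template   # hypothetical name
spec:
  template:
    spec:
      # every field below is copied from
      # os_images_info["ubuntu-jammy-plain-rke2-1.26.9"]
      image:
        url: http://ubuntu-jammy-plain-rke2-1-26-9.os-images.svc.cluster.local:8080/ubuntu-jammy-plain-rke2-1-26-9/image.qcow2
        checksum: "<checksum retrieved from the OCI artifact>"
        checksumType: sha256       # assumption
        format: qcow2
```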
End result
In sylva-units values.yaml we would only have this:
diskimage_builder_os_images:
  ubuntu-jammy-plain-rke2-1.26.9: {}
  ubuntu-jammy-plain-rke2-1.25.15: {}
  opensuse-15.5-plain-rke2-1.26.9: {}
  ubuntu-jammy-hardened-rke2-1.26.9: {}
  ubuntu-jammy-plain-kubeadm-1.26.9: {}
Renovate bot would update it when a new diskimage-builder release is tagged.
For a given deployment people would have to:
- specify which image they want os-image-server to serve:
diskimage_builder_os_images:
  opensuse-15.5-plain-rke2-1.26.9:
    enabled: true
They could even parametrize this:
diskimage_builder_os_images:
  opensuse-15.5-plain-rke2-{{ .Values.cluster.k8s_version }}:
    enabled: true
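As a rough sketch of what the templating proposed earlier might then hand to os-image-server for the non-parametrized example: this assumes `sylva_diskimagebuilder_version: 0.1.1`, that dots in the key are turned into dashes as in today's `os_images` key, and that the OCI URL keeps its current layout. These details are assumptions, not settled design.

```yaml
# generated by templating, not written by hand (illustrative only)
os_images:
  opensuse-15-5-plain-rke2-1-26-9:
    # no filename and no checksum: the file would be served as image.qcow2
    # and the checksum would come from the OCI artifact
    uri: "{{ .Values.sylva_base_oci_registry }}/sylva-elements/diskimage-builder/opensuse-15.5-plain-rke2-1.26.9:0.1.1"
```

Entries of `diskimage_builder_os_images` without `enabled: true` would produce no `os_images` entry, so os-image-server would not download them.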
- for a given cluster (management or workload cluster), we would only need this kind of setting:
cluster:
  capm3:
    os_image_key: ubuntu-jammy-plain-rke2-1.26.9 ## again, {{ k8s_version }} could be used here
  control_plane:
    capm3:
      os_image_key: ubuntu-jammy-hardened-rke2-1.26.9 ## example if a different image is wanted for the CP
Related
This issue shares similarities with #528 (closed)