sylva-units: optimization, reduce CPU/memory cost around the automatic cluster-machines-ready depends_on

This MR refactors how the automatic cluster-machines-ready depends_on is handled, which substantially reduces the time and memory needed to render sylva-units: rendering time drops by about 40% and peak memory is divided by more than 10, which should resolve the n GiB memory consumption peaks we see today.

Context

The automatic cluster-machines-ready depends_on ensures that every unit either is a direct or indirect dependency of the cluster unit, or is automatically set to depend on cluster-machines-ready. The underlying goal is determinism during Sylva upgrades: without it, the upgrade of a major component would not always be tested with the same k8s version (it could occur with either the new or the old version of Kubernetes, without any control). This determinism is ensured because a unit will reconcile either before the cluster unit or after cluster-machines-ready.
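
To illustrate the rule above, here is a minimal Python sketch (not the Helm template code of this MR). The function names and the unit names other than cluster and cluster-machines-ready are hypothetical, and the sketch assumes that cluster-machines-ready itself depends on cluster.

```python
def transitive_deps(graph, root):
    """Return the direct and indirect dependencies of `root` (root excluded)."""
    seen = set()
    stack = list(graph.get(root, ()))
    while stack:
        unit = stack.pop()
        if unit not in seen:
            seen.add(unit)
            stack.extend(graph.get(unit, ()))
    return seen


def inject_cluster_machines_ready(graph):
    """Return a copy of `graph` where every unit that is not a dependency of
    `cluster` (and is neither `cluster` nor `cluster-machines-ready`) also
    depends on `cluster-machines-ready`."""
    before_cluster = transitive_deps(graph, "cluster")
    skip = before_cluster | {"cluster", "cluster-machines-ready"}
    result = {}
    for unit, deps in graph.items():
        deps = list(deps)
        if unit not in skip and "cluster-machines-ready" not in deps:
            deps.append("cluster-machines-ready")
        result[unit] = deps
    return result


# Hypothetical unit names, for illustration only.
units = {
    "capi": [],
    "capo": ["capi"],
    "cluster": ["capo"],
    "cluster-machines-ready": ["cluster"],
    "monitoring": [],
}
resolved = inject_cluster_machines_ready(units)
```

In this toy graph, capi and capo reconcile before cluster and are left untouched, while monitoring gains a dependency on cluster-machines-ready.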

Initial implementation

In the initial implementation, the dynamic dependency is defined in values.yaml under unit_templates.base-deps.depends_on, and refers to a pre-computed list of the direct and indirect dependencies of the cluster unit (_internal.cluster_machines_ready_unit_deps). This makes the declaration of dependencies "self-referencing", since the dependency graph is defined in a way that refers to the sub-graph made of the dependencies of the cluster-machines-ready unit.

Then, in templates/units.yaml, we had specific code to compute _internal.cluster_machines_ready_unit_deps by calling include "all-unit-dependencies" in a particular way that ignores the cluster-machines-ready dynamic declaration; this was necessary to make the computation work despite its partly "self-referencing" nature.

What led to this change

When working on reducing the CPU/memory consumption during sylva-units rendering, I realized that we could get a 2x speed improvement and divide memory consumption by 15 by linearizing "all-unit-dependencies" (i.e. making it a loop instead of a recursive function).
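
As a rough illustration of what "linearizing" means here, the Python sketch below contrasts a recursive traversal with an equivalent loop driven by an explicit worklist. It is not the Helm template code, the function names and the toy graph are hypothetical, and it says nothing about the gains measured with Helm.

```python
def all_deps_recursive(graph, unit, seen=None):
    """Recursive version: one nested call per newly discovered dependency."""
    seen = set() if seen is None else seen
    for dep in graph.get(unit, ()):
        if dep not in seen:
            seen.add(dep)
            all_deps_recursive(graph, dep, seen)
    return seen


def all_deps_loop(graph, unit):
    """Loop version: same result, with an explicit worklist instead of the call stack."""
    seen = set()
    todo = list(graph.get(unit, ()))
    while todo:
        dep = todo.pop()
        if dep not in seen:
            seen.add(dep)
            todo.extend(graph.get(dep, ()))
    return seen


# Hypothetical graph: each key depends on the units listed as its value.
graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
```

Both functions return the same set for any unit of the graph; only the control flow differs.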

However, the relevance of such a refactoring wasn't initially obvious, so I first tried the idea by having Claude (Sonnet 4.5) do the rewrite for me.

Because the resulting code was quite complex (see !7459), I decided to take a step back and see how we could simplify the whole thing by:

  • (1) first removing the "self-referencing" part
  • (2) then implementing a "schoolbook" algorithm for detecting cycles in the dependency graph, with a non-recursive implementation

This MR was intended to be step (1).
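
For reference, one possible shape for step (2) is sketched below in Python, using Kahn's algorithm as the "schoolbook" non-recursive approach. This is my choice for the illustration, not something implemented in this MR, and the function name is hypothetical.

```python
from collections import deque


def find_cycle_members(graph):
    """Kahn's algorithm: repeatedly remove units with no remaining dependencies.
    Units that can never be removed are part of a cycle or depend on one.
    Returns the sorted list of such units (empty when the graph is acyclic)."""
    # Make sure every unit mentioned as a dependency is also known as a node.
    nodes = set(graph)
    for deps in graph.values():
        nodes.update(deps)

    # Number of not-yet-removed dependencies per unit, and reverse edges.
    remaining = {n: len(set(graph.get(n, ()))) for n in nodes}
    dependents = {n: [] for n in nodes}
    for unit in nodes:
        for dep in set(graph.get(unit, ())):
            dependents[dep].append(unit)

    ready = deque(n for n, count in remaining.items() if count == 0)
    while ready:
        unit = ready.popleft()
        for dependent in dependents[unit]:
            remaining[dependent] -= 1
            if remaining[dependent] == 0:
                ready.append(dependent)

    return sorted(n for n, count in remaining.items() if count > 0)
```

An empty result means the dependency graph is acyclic; otherwise the returned units are the ones to inspect.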

Performance improvement 🎉

The fortunate surprise is that this "step 1", alone, is sufficient to get a huge perf improvement:

  • time for rendering: 10s instead of 16s (-40%)
  • memory consumption: 88Mi instead of 978Mi (divided by 11, approximately)

We discussed this with @feleouet and we honestly aren't too sure why this is sufficient to get such a huge improvement. We tend to conclude that the way this part was implemented was too complex to make it easy to analyse where the performance bottleneck was, and that analysing this in depth is possibly not worth it. The simplification and the performance improvement are worth taking anyway (and maybe step (2) won't be needed).

Before/After time/memory measurements

Before (run of /usr/bin/time -v helm template with the values of my CAPO dev environment):

        User time (seconds): 33.74
        System time (seconds): 2.08
        Percent of CPU this job got: 219%
>>>>    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.32    <<<<<<<<<<
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
>>>>    Maximum resident set size (kbytes): 1002152   <<<<<<<<<<<<<<<<<<<<
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 314
        Minor (reclaiming a frame) page faults: 885337
        Voluntary context switches: 18932
        Involuntary context switches: 4769
        Swaps: 0
        File system inputs: 44584
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

After this change:

        User time (seconds): 16.02
        System time (seconds): 0.47
        Percent of CPU this job got: 168%
>>>>>   Elapsed (wall clock) time (h:mm:ss or m:ss): 0:09.78  <<<<<<<<<<<<<<<<<<<<
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
>>>>>   Maximum resident set size (kbytes): 89868  <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 5
        Minor (reclaiming a frame) page faults: 98049
        Voluntary context switches: 18270
        Involuntary context switches: 3033
        Swaps: 0
        File system inputs: 992
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
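
The percentages quoted above can be re-derived from the two highlighted lines of each run. The Python snippet below is only a convenience for that arithmetic; the values are copied from the outputs above, the helper name is hypothetical, and the "kbytes" reported by /usr/bin/time are treated as KiB.

```python
def parse_wall_clock(text):
    """Convert a wall-clock value such as "0:16.32" (m:ss) or "1:02:03" (h:mm:ss) to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


# Values copied from the "Elapsed" and "Maximum resident set size" lines above.
before = {"wall": parse_wall_clock("0:16.32"), "max_rss_kb": 1002152}
after = {"wall": parse_wall_clock("0:09.78"), "max_rss_kb": 89868}

# Relative change of the rendering time, and ratio of the peak memory.
time_change = (after["wall"] - before["wall"]) / before["wall"]
memory_ratio = before["max_rss_kb"] / after["max_rss_kb"]

print(f"wall clock: {before['wall']:.2f}s -> {after['wall']:.2f}s ({time_change:+.0%})")
print(f"max RSS: {before['max_rss_kb'] / 1024:.1f} MiB -> "
      f"{after['max_rss_kb'] / 1024:.1f} MiB (divided by {memory_ratio:.1f})")
```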

Implementation details

Not all units had the automatic conditional injection of the cluster-machines-ready dependency:

  • only those inheriting the base-deps unit template
  • some units are explicitly overriding the dependency

This MR was implemented so that this is unchanged.

Additional changes (made in two separate commits):

  • initialize $all_dependencies_cache earlier, since we can now use it for the first call to include "all-unit-dependencies"
  • remove the $ignore_units parameter from "all-unit-dependencies", since it is no longer needed
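
The role of a shared cache can be sketched as follows in Python. This is only an analogy for $all_dependencies_cache: the function signature is hypothetical, the unit names other than cluster and cluster-machines-ready are made up, and the graph is assumed to be acyclic. My assumption is that a result computed while ignoring some units could not safely be stored in a shared cache, which is why dropping that parameter lets every call, including the first one, use it.

```python
def all_unit_dependencies(graph, unit, cache):
    """Return the direct and indirect dependencies of `unit`.
    `cache` maps already-resolved units to their result and is shared across calls.
    Assumes the graph has no cycle."""
    if unit in cache:
        return cache[unit]
    result = set()
    for dep in graph.get(unit, ()):
        result.add(dep)
        result |= all_unit_dependencies(graph, dep, cache)
    cache[unit] = result
    return result


# Hypothetical graph, for illustration only.
graph = {
    "cluster-machines-ready": ["cluster"],
    "cluster": ["capo"],
    "capo": ["capi"],
    "capi": [],
}
cache = {}  # initialized once, before the first call
deps = all_unit_dependencies(graph, "cluster-machines-ready", cache)
```

After this first call, the cache already holds the results for cluster, capo and capi, so later calls for those units return immediately.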

Review

💡 This MR is nicer to review one commit at a time.

Testing

This MR was tested by diffing the manifests produced by helm template before and after the change.
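
A comparison of this kind can be scripted along the following lines. This is a sketch only: it assumes the output of helm template has been captured as text for both revisions, and the function name and the before.yaml/after.yaml labels are hypothetical.

```python
import difflib


def manifest_diff(before_text, after_text):
    """Return unified-diff lines between two rendered manifests (empty list if identical)."""
    return list(difflib.unified_diff(
        before_text.splitlines(),
        after_text.splitlines(),
        fromfile="before.yaml",
        tofile="after.yaml",
        lineterm="",
    ))
```

An empty list means the refactoring did not change the rendered manifests.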

CI configuration

Below you can choose test deployment variants to run in this MR's CI.

Legend:

Icon | Meaning | Available values
☁️ | Infra Provider | capd, capo, capm3
🚀 | Bootstrap Provider | kubeadm (alias kadm), rke2, okd, ck8s
🐧 | Node OS | ubuntu, suse, na, leapmicro
🛠️ | Deployment Options | Deployment option list and description
🎬 | Pipeline Scenarios | Available scenario list and description
🟢 | Enabled units | Any available unit name; by default applies to both management and workload clusters. Can be prefixed by mgmt: or wkld: to apply only to a specific cluster type
🔴 | Disabled units | Any available unit name; by default applies to both management and workload clusters. Can be prefixed by mgmt: or wkld: to apply only to a specific cluster type
🏗️ | Target platform | Can be used to select a specific deployment environment. Available platform list and description

  • 🎬 preview ☁️ capd 🚀 kadm 🐧 ubuntu

  • 🎬 preview ☁️ capo 🚀 rke2 🐧 suse

  • 🎬 preview ☁️ capm3 🚀 rke2 🐧 ubuntu

  • ☁️ capd 🚀 kadm 🛠️ light-deploy 🐧 ubuntu

  • ☁️ capd 🚀 rke2 🛠️ light-deploy 🐧 suse

  • ☁️ capo 🚀 rke2 🐧 suse

  • ☁️ capo 🚀 rke2 🐧 leapmicro

  • ☁️ capo 🚀 kadm 🐧 ubuntu

  • ☁️ capo 🚀 kadm 🐧 ubuntu 🟢 neuvector,mgmt:harbor

  • ☁️ capo 🚀 rke2 🎬 rolling-update 🛠️ ha 🐧 ubuntu

  • ☁️ capo 🚀 kadm 🎬 wkld-k8s-upgrade 🐧 ubuntu

  • ☁️ capo 🚀 rke2 🎬 rolling-update-no-wkld 🛠️ ha 🐧 suse

  • ☁️ capo 🚀 rke2 🎬 sylva-upgrade 🛠️ ha 🐧 ubuntu

  • ☁️ capo 🚀 rke2 🎬 sylva-upgrade-from-1.6.x 🛠️ ha,misc 🐧 ubuntu

  • ☁️ capo 🚀 rke2 🛠️ ha,misc 🐧 ubuntu

  • ☁️ capo 🚀 rke2 🛠️ misc 🐧 ubuntu 🟢 mgmt:harbor 🔴 neuvector

  • ☁️ capo 🚀 rke2 🛠️ ha,misc,openbao 🐧 suse

  • ☁️ capo 🚀 rke2 🐧 suse 🎬 upgrade-from-prev-tag

  • ☁️ capm3 🚀 rke2 🐧 suse

  • ☁️ capm3 🚀 kadm 🐧 ubuntu

  • ☁️ capm3 🚀 ck8s 🐧 ubuntu

  • ☁️ capm3 🚀 kadm 🎬 rolling-update-no-wkld 🛠️ ha,misc 🐧 ubuntu

  • ☁️ capm3 🚀 rke2 🎬 wkld-k8s-upgrade 🛠️ ha 🐧 suse

  • ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 ubuntu

  • ☁️ capm3 🚀 rke2 🎬 upgrade-from-prev-release-branch 🛠️ ha 🐧 suse

  • ☁️ capm3 🚀 rke2 🛠️ misc,ha 🐧 suse

  • ☁️ capm3 🚀 rke2 🎬 sylva-upgrade 🛠️ ha,misc 🐧 suse

  • ☁️ capm3 🚀 kadm 🎬 rolling-update 🛠️ ha 🐧 suse

  • ☁️ capm3 🚀 ck8s 🎬 rolling-update 🛠️ ha 🐧 ubuntu

  • ☁️ capm3 🚀 rke2|okd 🎬 no-update 🐧 ubuntu|na

  • ☁️ capm3 🚀 rke2 🐧 suse 🎬 upgrade-from-release-1.5

  • ☁️ capm3 🚀 rke2 🐧 suse 🎬 upgrade-to-main

Global config for deployment pipelines

  • autorun pipelines
  • allow failure on pipelines
  • record sylvactl events

Notes:

  • Enabling autorun makes deployment pipelines run automatically, without human interaction
  • Disabling allow failure makes deployment pipelines mandatory for pipeline success
  • If both autorun and allow failure are disabled, deployment pipelines need manual triggering but still block the pipeline

Be aware: after a configuration change, the pipeline is not triggered automatically. Please run it manually (by clicking the run pipeline button in the Pipelines tab) or push new code.

Edited by Thomas Morin
