Capture k3d CI debug artifacts (helm values, pod events)

What does this MR do?

Adds a k3d_collect_debug helper that captures the effective helm values, rendered manifest, pod events, pod image strings (spec vs resolved imageID), and logs from non-Running pods into k3d-debug/ before the per-job cluster is destroyed. Wired into the after_script of both .k3d_qa_template and .k3d_review_specs_template, and published as a when: always CI artifact so passing and failing runs can be compared side by side.

This helps investigate the recurring k3d deploy timeouts tracked in the related work item, in particular the open question of whether the ci.digests.yaml image pins are applied to the k3d release. The pod-images.txt entry in the artifact shows the actual image strings and their resolved digests, which answers that directly.
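One way to read pod-images.txt is to compare each container's spec image with its resolved imageID. The sketch below is illustrative only: check_image_pin is a hypothetical helper that is not part of this MR, and it assumes a pin from ci.digests.yaml shows up as an @sha256: suffix on the spec image.

```shell
# Hypothetical helper (not part of this MR): classify one container entry.
# $1 = image string from the pod spec, $2 = resolved imageID from the pod status.
check_image_pin() {
  spec_image="$1"
  image_id="$2"
  case "$spec_image" in
    # Spec image carries a digest: extract it for comparison.
    *@sha256:*) spec_digest="sha256:${spec_image##*@sha256:}" ;;
    # Tag-only reference: no digest pin was applied to this container.
    *) echo "unpinned"; return 0 ;;
  esac
  case "$image_id" in
    *"$spec_digest") echo "pinned-match" ;;
    *) echo "pinned-mismatch" ;;
  esac
}
```

An "unpinned" result for a container that ci.digests.yaml is expected to cover would suggest the pins are not reaching the k3d release.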

What's in k3d-debug/

  • helm-values.yaml — helm get values <release> --all
  • helm-manifest.yaml — rendered Kubernetes manifests
  • helm-status.txt, helm-history.txt
  • pods.txt, pods-describe.txt (per-pod Events)
  • events.txt — cluster-wide events, sorted by lastTimestamp
  • nodes.txt — get nodes -o wide + describe nodes
  • pod-images.txt — spec vs resolved imageID per container
  • failed-pod-logs/ — current and --previous logs for non-Running pods
  • values-inputs/ — post-envsubst copies of .values/*.yaml and ci.digests.yaml
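
As a rough illustration of how the failed-pod-logs/ entries could be selected, the sketch below filters pod listings by status. The non_running_pods name and the log file names in the comments are assumptions for illustration, not the helper's actual implementation.

```shell
# Hypothetical helper (not part of this MR): read `kubectl get pods --no-headers`
# output on stdin and print the names of pods whose STATUS column is not "Running".
non_running_pods() {
  awk '$3 != "Running" { print $1 }'
}

# Possible use in a job (assumes kubectl points at the k3d cluster):
#   kubectl get pods --no-headers | non_running_pods | while read -r pod; do
#     kubectl logs "$pod" --all-containers \
#       > "k3d-debug/failed-pod-logs/$pod.log" 2>&1 || true
#     kubectl logs "$pod" --all-containers --previous \
#       > "k3d-debug/failed-pod-logs/$pod.previous.log" 2>&1 || true
#   done
```

Note that a plain status filter like this also matches Completed pods, which may or may not be wanted.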

All capture commands are best-effort (|| true) so the after_script cannot fail the job because of debug collection.
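
The best-effort pattern can be sketched as a small wrapper. This is a minimal illustration under assumptions: capture and DEBUG_DIR are hypothetical names, and the real k3d_collect_debug helper may be structured differently.

```shell
DEBUG_DIR="${DEBUG_DIR:-k3d-debug}"

# Hypothetical wrapper (not part of this MR): run a command, write its stdout
# and stderr to a file under $DEBUG_DIR, and never propagate a failure.
capture() {
  out="$1"
  shift
  mkdir -p "$DEBUG_DIR" || return 0
  "$@" > "$DEBUG_DIR/$out" 2>&1 || true
}

# Possible calls (RELEASE is an assumed variable holding the helm release name):
#   capture helm-values.yaml helm get values "$RELEASE" --all
#   capture events.txt kubectl get events --sort-by=.lastTimestamp
```

Because every call returns success, a missing release or an unreachable cluster leaves a file containing the error output instead of failing the after_script.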

Related to #6478 (closed)

Author checklist

For general guidance, please follow our Contributing guide.

Required

For anything in this list which will not be completed, please provide a reason in the MR discussion.

  • Merge Request Title and Description are up to date, accurate, and descriptive.
  • MR targeting the appropriate branch.
  • MR has a green pipeline.
  • Documentation created/updated.
  • Tests added/updated, and test plan for scenarios not covered by automated tests.
  • Equivalent MR/issue for omnibus-gitlab opened.

N/A: this MR only adds diagnostic instrumentation to CI job templates; no chart-rendered code changes, no user-facing docs, no omnibus counterpart needed.

Reviewers checklist

Edited by Nailia Iskhakova
