crust-gather CI issues

I've seen a few issues in the last nightly runs, related to the introduction of crust-gather in debug-on-exit:

  • it's very verbose, to the point of being a problem: in the GitLab UI we now sometimes can't see the last lines of the apply.sh/bootstrap.sh output, which carry the most useful information (the information is not lost, it's still in the raw logs, but it's a pain to find)
  • crust-gather reports errors like the following (I think because it attempts something like kubectl debug node, which requires pod security settings that aren't in place):
2024-10-04T02:53:27.495761Z ERROR collect:collect:representations{node="wc-1481097742-rke2-capo-oci-md0-4pzqz-g7q7r"}:
get_or_create{pod_name="node-debug-wc-1481097742-rke2-capo-oci-md0-4pzqz-g7q7r"}:
 kubectl_crust_gather::scanners::nodes:
 error=Failed to create pod:
 Api(ErrorResponse { status: "Failure", message:
 "pods \"node-debug-wc-1481097742-rke2-capo-oci-md0-4pzqz-g7q7r\" is forbidden: violates PodSecurity \"restricted:latest\":
 host namespaces (hostNetwork=true, hostPID=true, hostIPC=true), allowPrivilegeEscalation != false (container \"debug\" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container \"debug\" must set securityContext.capabilities.drop=[\"ALL\"]), restricted volume types (volume \"host-root\" uses restricted volume type \"hostPath\"), runAsNonRoot != true (pod or container \"debug\" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container \"debug\" must set securityContext.seccompProfile.type to \"RuntimeDefault\" or \"Localhost\")", reason:
 "Forbidden", code: 403 })

example here: https://gitlab.com/sylva-projects/sylva-core/-/jobs/7990956533

At the very least, it would be important to change this so that the crust-gather logs are redirected to a dedicated file rather than sent to stdout/stderr.
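
As a rough sketch of what that redirection could look like in a shell script: the helper below runs any command with stdout and stderr sent to a file and prints only a one-line summary to the job log. The name run_quietly, the log file name, and the way crust-gather is actually invoked in debug-on-exit are assumptions made for illustration, not what the scripts currently do.

```shell
# Hypothetical helper (name and behaviour are illustrative):
# run a command with stdout and stderr redirected to a log file,
# and print a single summary line so the CI job log stays short.
run_quietly() {
  local log_file=$1
  shift
  if "$@" >"$log_file" 2>&1; then
    echo "$1 finished, full output in $log_file"
  else
    # $? still holds the exit code of the command that just failed
    local rc=$?
    echo "$1 failed with exit code $rc, full output in $log_file" >&2
    return "$rc"
  fi
}

# Possible usage, assuming crust-gather is called roughly like this:
#   run_quietly crust-gather.log kubectl crust-gather collect
```

The log file could then be exposed as a job artifact, so the details remain available without pushing the apply.sh/bootstrap.sh output out of view.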

/cc @stoub @loic.nicolle

Edited Oct 04, 2024 by Thomas Morin