Dump node logs via HTTP
This MR introduces several CI adaptations to be able to fetch logs directly from the nodes.
The idea is to launch an HTTP server on each node at startup so that the content of the /var/log directory can be downloaded. I chose miniserve because it is lightweight and has a built-in feature to generate a tar.gz archive of a whole directory for download.
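As a sketch, the per-node startup command could look like the following (assuming miniserve's `--enable-tar-gz` option; the exact unit/cloud-init wiring in the MR may differ):

```sh
# Serve /var/log over HTTP; --enable-tar-gz adds a "download whole
# directory as .tar.gz" link, which is what the CI dump script fetches.
# Port 25888 matches the capo setup described below.
miniserve --port 25888 --enable-tar-gz /var/log
```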
The archive can then be downloaded for each node at the end of the pipeline using the new script .gitlab/ci/scripts/dump_machine_logs.sh. On purpose, this script is made to be used in our CI and is not designed to be usable in any on-field deployment. The script scans the Machine objects created during the deployment to get the nodes' IPs and tries to download the archives. The archives are then saved in the job artifacts, under the node_logs directory inside the cluster dump directory.
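A minimal sketch of the script's core loop, assuming kubectl and jq are available and that miniserve's directory archive is fetched via its `?download=tar_gz` link (the jq filter, variable names and error handling below are illustrative, not the MR's exact script):

```sh
#!/usr/bin/env bash
# Illustrative sketch of dump_machine_logs.sh, not the exact script.
set -euo pipefail

DUMP_DIR="${CLUSTER_DUMP_DIR:-cluster_dump}/node_logs"
mkdir -p "${DUMP_DIR}"

# Scan the Machine objects created during the deployment to collect
# "<machine-name> <internal-ip>" pairs.
kubectl get machines -A -o json \
  | jq -r '.items[]
           | .metadata.name as $n
           | .status.addresses[]?
           | select(.type == "InternalIP")
           | "\($n) \(.address)"' \
  | while read -r name ip; do
      # Try to download the tar.gz of /var/log served by miniserve;
      # an unreachable node must not fail the whole dump.
      curl -fsS --max-time 60 "http://${ip}:25888/?download=tar_gz" \
        -o "${DUMP_DIR}/${name}.tar.gz" \
        || echo "WARN: could not dump logs from ${name} (${ip})" >&2
    done
```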
Depending on the infra provider, the download process differs:
- on capo: the HTTP server listens on port 25888 (arbitrarily chosen by myself). I had to manually create a security group in our CI tenant to allow this port to be reachable from our runner (a sketch of the rule is shown after this list).
- on capm3-virt: in baremetal emulation, the node IPs are not directly reachable from the runner. We have to go through the bootstrap server, using a dedicated Service/EndpointSlice to reach those IPs (see the Service/EndpointSlice sketch after this list).
- on capd: the script is currently not working. Surprisingly, adding some `additional_commands` prevents the machines from getting ready. I skipped capd support here as it seemed less important; this could be improved in the future if needed.
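For capo, the manual one-off setup in the CI tenant looks roughly like the following OpenStack CLI calls (the group name and remote CIDR are illustrative; the MR only states that the security group was created by hand):

```sh
# One-off, done by hand in the CI tenant (not part of the pipeline):
# allow TCP/25888 so the runner can reach the per-node log servers.
openstack security group create node-logs-http
openstack security group rule create node-logs-http \
  --protocol tcp --dst-port 25888 \
  --remote-ip 0.0.0.0/0   # narrow to the runner's CIDR if possible
```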
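For capm3-virt, one plausible wiring is a selector-less Service on the bootstrap cluster backed by a manually managed EndpointSlice that points at the node IP, so that traffic entering through the bootstrap server is forwarded into the emulated network. All names, IPs and the NodePort choice below are illustrative, not the MR's exact manifests:

```sh
# Applied on the bootstrap cluster. A Service without a selector is
# backed only by the EndpointSlice we create ourselves, so requests
# hitting the bootstrap server's NodePort get forwarded to the node
# IP inside the emulated baremetal network.
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: node-logs
spec:
  type: NodePort          # reachable via the bootstrap server's IP
  ports:
    - name: http
      port: 25888
      targetPort: 25888
---
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: node-logs-0
  labels:
    kubernetes.io/service-name: node-logs   # ties the slice to the Service
addressType: IPv4
ports:
  - name: http            # must match the Service port name
    port: 25888
endpoints:
  - addresses:
      - 192.168.111.20    # a node IP in the emulated baremetal network
EOF
```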
Related reference(s)
Closes #1102
Closes #1013
