
Capture helper service logs into job/tasks main trace log

What does this MR do?

This MR takes everything from !3551 (closed) and !3564 (closed), except for the final 2 commits of each MR, and consolidates it into a single MR. The goal is to get as much of this code merged as possible, without actually enabling the feature. Enabling the feature is blocked on gitlab!100349 (merged).

This MR adds (nearly) all the code necessary to capture and stream logs from helper service containers to the CI task/job's main trace logs for the docker and kubernetes executors, but does not enable the feature. The single line to enable the feature for each of the docker and kubernetes executors, plus integration tests for each, will follow in a subsequent MR.

This functionality will be enabled when the variable CI_DEBUG_SERVICES = true is set in gitlab-ci.yaml or config.toml. While the logs are currently written to the job's trace logs, they could easily be written elsewhere (file, log aggregator service, syslog...) in the future.
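As a rough illustration of how the variable could be interpreted, here is a minimal sketch. The helper name parseDebugServices is hypothetical, and the use of strconv.ParseBool is an assumption rather than what the runner necessarily does; only the error message text is taken from the testing steps in this description.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseDebugServices is a hypothetical helper (not taken from the runner
// code base). It interprets the raw value of CI_DEBUG_SERVICES.
// An unset or empty value leaves the feature disabled.
func parseDebugServices(raw string) (bool, error) {
	raw = strings.TrimSpace(raw)
	if raw == "" {
		return false, nil
	}

	// Assumption: any value accepted by strconv.ParseBool is valid.
	enabled, err := strconv.ParseBool(raw)
	if err != nil {
		return false, fmt.Errorf("invalid value '%s' for CI_DEBUG_SERVICES variable", raw)
	}
	return enabled, nil
}

func main() {
	for _, v := range []string{"", "true", "false", "bogus"} {
		enabled, err := parseDebugServices(v)
		fmt.Printf("%q -> enabled=%v err=%v\n", v, enabled, err)
	}
}
```

With this shape, a bogus value does not silently enable or disable the feature; the caller gets an error it can surface in the trace log.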

docker

The approach in this MR relies on the docker.Client.CaptureLogs() API; I also considered ContainerAttach(), which would have been suitable too. For this use case the two APIs are very similar: both return an FD in the form of an io.Reader into which the container's stdout and stderr are multiplexed, and on which stdcopy.StdCopy() can be used to read the contents. CaptureLogs() was a tad simpler, so I chose it.

kubernetes

The approach in this MR relies on the kubernetes.Clientset().CoreV1().Pods().GetLogs() API. Despite the known problems with that API, we're using it here as a first iteration, which may turn out to be good enough (see discussion #2119 (comment 1072311180)).

Why was this MR needed?

!2119 (merged) asks for service logs to be captured somehow. This is to help debug failing jobs when the failure is (at least in part) caused by the behaviour of one of the service containers (though not necessarily by a failure to start said service container). A few possible approaches are mentioned in the issue; this MR takes the most "iteration friendly" (i.e. simplest) approach of copying the service logs inline into the main trace logs, but leaves room for the logs to be written elsewhere (e.g. a file) in the future.

What's the best way to test this MR?

  1. BASELINE: Don't specify CI_DEBUG_SERVICES and run a CI job with a service container. The output of the main log trace should be unchanged from main.

  2. Set CI_DEBUG_SERVICES to a bogus value. The error message invalid value '<xxx>' for CI_DEBUG_SERVICES variable should appear in the main trace logs.

Example https://gitlab.com/avonbertoldi/test-project/-/jobs/2786325292

  3. Set CI_DEBUG_SERVICES = true in the CI configuration
    1. register a runner with a docker or kubernetes executor (one at a time, to test each executor in turn)
    2. create a job that includes a service which writes logs (example below)
    3. run the job (using a runner built from this branch)
    4. the service container's logs should appear in the job's main log trace in grey colour, with the container name prefixed to the log lines.

gitlab-ci.yaml

stages:
  - test

variables:
  POSTGRES_PASSWORD: password
  CI_DEBUG_SERVICES: "true"

format:
  stage: test
  image:
    name: alpine
  services:
    - postgres:latest
    - redis:latest
  script:
    - sleep 30

Example https://gitlab.com/avonbertoldi/test-project/-/jobs/2841278059

What are the relevant issue numbers?

Notes:

  • Best reviewed commit-at-a-time.
  • @ratchade @ajwalker there is no new code here compared to !3551 (closed) and !3564 (closed). You have both approved the former, and @ratchade has approved the latter. The only difference between this MR and those two MRs is that the k8s MR moved some content around, and in this MR that happens earlier; the final content is exactly the same. All the changes required to resolve issues you raised in the other MRs were carried over to here. I did fix a couple of typos I found while doing a final self-review.
Edited by Axel von Bertoldi
