TestDockerCommandRunAttempts flaky test

Job #510666409 failed for 8a27dcae:

=== RUN   TestDockerCommandRunAttempts
 Running with gitlab-runner development version (HEAD)
   on  misscont
 Preparing the "docker" executor
 Using Docker executor with image alpine:3.7 ...
 Using locally found image version due to if-not-present pull policy
 Using docker image sha256:6d1ef012b5674ad8a127ecfa9b5e6f5178d171b90ee462846974177fd9bdd39f for alpine:3.7 ...
 Preparing environment
 Running on runner-misscont-project-0-concurrent-0 via runner-0277ea0f-project-250833-concurrent-0...
 Getting source from Git repository
 Fetching changes...
 Initialized empty Git repository in /builds/gitlab-org/ci-cd/tests/gitlab-test/.git/
 Created fresh repository.
 From https://gitlab.com/gitlab-org/ci-cd/tests/gitlab-test
  * [new branch]      add-lfs-object    -> refs/origin/heads/add-lfs-object
  * [new branch]      add-lfs-submodule -> refs/origin/heads/add-lfs-submodule
  * [new branch]      master            -> refs/origin/heads/master
  * [new branch]      update-readme     -> refs/origin/heads/update-readme
  * [new branch]      add-lfs-object    -> origin/add-lfs-object
  * [new branch]      add-lfs-submodule -> origin/add-lfs-submodule
  * [new branch]      master            -> origin/master
  * [new branch]      update-readme     -> origin/update-readme
 Checking out 91956efe as master...
 Skipping Git submodules setup
 Restoring cache
 Downloading artifacts
 Running before_script and script
 $ sleep 60
 ERROR: Container "34295ab5c20a1da4936a9ea0078570473d089d72a2563f69eebb0d95c3312ae4" not found or removed. Will retry...
 Retrying build_script
 $ sleep 60
 Running after_script
 Uploading artifacts for failed job
 ERROR: Job failed: exit code 137
 coverage: 46.7% of statements
 panic: test timed out after 10m0s
 goroutine 947 [running]:
 testing.(*M).startAlarm.func1()
 	/usr/local/go/src/testing/testing.go:1377 +0xdf
 created by time.goFunc
 	/usr/local/go/src/time/sleep.go:168 +0x44
 goroutine 1 [chan receive, 9 minutes]:
 testing.(*T).Run(0xc0000d7900, 0xf7c76d, 0x1c, 0xfd5790, 0x4bfd01)
 	/usr/local/go/src/testing/testing.go:961 +0x377
 testing.runTests.func1(0xc0000d7b00)
 	/usr/local/go/src/testing/testing.go:1202 +0x78
 testing.tRunner(0xc0000d7b00, 0xc000221d48)
 	/usr/local/go/src/testing/testing.go:909 +0xc9
 testing.runTests(0xc000368d00, 0x177ff60, 0x50, 0x50, 0x7ffefaf7aa18)
 	/usr/local/go/src/testing/testing.go:1200 +0x2a7
 testing.(*M).Run(0xc00030ec00, 0x0)
 	/usr/local/go/src/testing/testing.go:1117 +0x176
 gitlab.com/gitlab-org/gitlab-runner/executors/docker.TestMain(0xc00030ec00)
 	/builds/gitlab-org/gitlab-runner/executors/docker/docker_test.go:42 +0xe8
 main.main()
 	_testmain.go:264 +0x1c1
 goroutine 6 [chan receive]:
 k8s.io/klog.(*loggingT).flushDaemon(0x1787ec0)
 	/go/pkg/mod/k8s.io/klog@v1.0.0/klog.go:1010 +0x8b
 created by k8s.io/klog.init.0
 	/go/pkg/mod/k8s.io/klog@v1.0.0/klog.go:411 +0xd6
 goroutine 9 [syscall, 9 minutes]:
 os/signal.signal_recv(0x45f166)
 	/usr/local/go/src/runtime/sigqueue.go:147 +0x9c
 os/signal.loop()
 	/usr/local/go/src/os/signal/signal_unix.go:23 +0x22
 created by os/signal.init.0
 	/usr/local/go/src/os/signal/signal_unix.go:29 +0x41
 goroutine 784 [sleep]:
 runtime.goparkunlock(...)
 	/usr/local/go/src/runtime/proc.go:310
 time.Sleep(0x3b9aca00)
 	/usr/local/go/src/runtime/time.go:105 +0x157
 gitlab.com/gitlab-org/gitlab-runner/executors/docker_test.assertFailedToInspectContainer(0xc0000d7900, 0xc000182400, 0xc0004dbb98)
 	/builds/gitlab-org/gitlab-runner/executors/docker/docker_command_test.go:1258 +0xc6
 gitlab.com/gitlab-org/gitlab-runner/executors/docker_test.TestDockerCommandRunAttempts(0xc0000d7900)
 	/builds/gitlab-org/gitlab-runner/executors/docker/docker_command_test.go:1241 +0x42a
 testing.tRunner(0xc0000d7900, 0xfd5790)
 	/usr/local/go/src/testing/testing.go:909 +0xc9
 created by testing.(*T).Run
 	/usr/local/go/src/testing/testing.go:960 +0x350
 FAIL	gitlab.com/gitlab-org/gitlab-runner/executors/docker	600.046s
 FAIL

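The test waits for the string `Container "34295ab5c20a1da4936a9ea0078570473d089d72a2563f69eebb0d95c3312ae4" not found or removed` as the job's failure, which never happens when the Runner returns `exit code 137` instead. Exit code 137 (128 + 9, i.e. SIGKILL) is the usual exit code Docker reports when a container is forcibly killed, which is exactly what the test does. Because the expected string never arrives, `assertFailedToInspectContainer` keeps sleeping until the 10-minute test timeout fires.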

TODO

  • We need to remove the unused linter exception from .golangci-lint once #25385 is resolved.
Edited by Elliot Rushton