The container name is already in use [Kubernetes Executor]

Summary

After moving to the group-wide K8s cluster, we started experiencing failed builds. This happens several times a day and has become a bit annoying.

In GitLab, the job log shows the following message repeated until the job times out: Waiting for pod gitlab-managed-apps/runner-jqsuxeuc-project-35-concurrent-02sgpn to be running, status is Pending

Examining the pod's status, we first see this message:

Failed create pod sandbox: rpc error: code = Unknown desc = failed to create a sandbox for pod "runner-jqsuxeuc-project-35-concurrent-02sgpn": operation timeout: context deadline exceeded

It is followed by:

Failed create pod sandbox: rpc error: code = Unknown desc = failed to create a sandbox for pod "runner-jqsuxeuc-project-35-concurrent-02sgpn": Error response from daemon: Conflict. The container name "/k8s_POD_runner-jqsuxeuc-project-35-concurrent-02sgpn_gitlab-managed-apps_9158aaa1-ad9f-11e9-9b78-42010a84007f_0" is already in use by container "06d70c6ff225fd4e8798f58652c124ac832da0e5f7d2687b96eb9fd9014f5d91". You have to remove (or rename) that container to be able to reuse that name.
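For anyone triaging many of these events, the conflicting container name and ID can be pulled out of the message programmatically. The following is a minimal sketch, not part of the runner or Kubernetes tooling: the function name `parse_conflict` and the regular expression are hypothetical, and the pattern assumes the Docker daemon wording stays exactly as quoted above.

```python
import re

# Hypothetical pattern matching the Docker "name already in use" conflict
# text as it appears in the pod event quoted above.
CONFLICT_RE = re.compile(
    r'The container name "(?P<name>[^"]+)" '
    r'is already in use by container "(?P<id>[0-9a-f]+)"'
)


def parse_conflict(message):
    """Return (container_name, container_id), or None if the message
    is not a container name conflict."""
    match = CONFLICT_RE.search(message)
    if match is None:
        return None
    return match.group("name"), match.group("id")


# The event text from this report, split across lines for readability.
event = (
    'Failed create pod sandbox: rpc error: code = Unknown desc = '
    'failed to create a sandbox for pod '
    '"runner-jqsuxeuc-project-35-concurrent-02sgpn": '
    'Error response from daemon: Conflict. The container name '
    '"/k8s_POD_runner-jqsuxeuc-project-35-concurrent-02sgpn_'
    'gitlab-managed-apps_9158aaa1-ad9f-11e9-9b78-42010a84007f_0" '
    'is already in use by container '
    '"06d70c6ff225fd4e8798f58652c124ac832da0e5f7d2687b96eb9fd9014f5d91". '
    'You have to remove (or rename) that container to be able to '
    'reuse that name.'
)

conflict = parse_conflict(event)
if conflict is not None:
    name, container_id = conflict
    print("stale container:", container_id)
    print("held name:", name)
```

The extracted ID identifies the leftover sandbox container on the affected node. Whether it is safe to remove that container by hand is not something this report establishes.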

Steps to reproduce

It's quite difficult to reproduce since it happens sporadically for different jobs spread across different projects. It happens with different runners (we've got several instances of runners deployed, each tagged for use in particular jobs).

Actual behavior

The job times out and the build fails because the pod sandbox container cannot be created.

Expected behavior

The runner pod is up and running.

Relevant logs and/or screenshots

See the logs above.

Environment description

GKE

A dedicated node pool for runners with Container-Optimized OS

K8s version: 1.13.6-gke.13

Container runtime: docker://18.9.3

Used GitLab Runner version

Running with gitlab-runner 11.11.2 (ac2a293c)

Possible fixes

Edited by Darren Eastman