Docker conflict, "already in use by container"
Overview
When a runner runs the same job multiple times, it sometimes hits container name conflicts, which result in errors like the one below:
ERROR: Preparation failed: Error response from daemon: Conflict. The container name "/runner-465639a0-project-44-concurrent-3-build" is already in use by container "109a38a9e6ef215f4518992c652e04572e5977b5bfbef72d038a0cf1bb662946". You have to remove (or rename) that container to be able to reuse that name.
The same conflict is sometimes reported for service containers as well, as in #4327 (comment 178140765).
This seems to be the case for Runners that have concurrent set to values from 1 to 4.
Things we need to investigate
- We need more debug info: when this fails, what state is the container with that name in? Is it running, or in a failed state?
- Is it possible that there are multiple GitLab Runner processes talking to the same Docker daemon?
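To answer the first question, the state of the conflicting container could be read from the JSON that `docker inspect <name>` prints. The sketch below parses that output and summarizes the state. The type and function names are hypothetical, and the sample JSON is illustrative rather than captured from a real failure.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// containerState holds the subset of the "State" object from
// `docker inspect` that matters for this investigation.
type containerState struct {
	Status   string `json:"Status"`
	Running  bool   `json:"Running"`
	ExitCode int    `json:"ExitCode"`
}

// inspectEntry is one element of the JSON array printed by `docker inspect`.
type inspectEntry struct {
	ID    string         `json:"Id"`
	Name  string         `json:"Name"`
	State containerState `json:"State"`
}

// describeConflict (hypothetical helper) summarizes the state of the first
// container found in the inspect output.
func describeConflict(inspectJSON []byte) (string, error) {
	var entries []inspectEntry
	if err := json.Unmarshal(inspectJSON, &entries); err != nil {
		return "", fmt.Errorf("parsing inspect output: %w", err)
	}
	if len(entries) == 0 {
		return "", fmt.Errorf("no container in inspect output")
	}
	e := entries[0]
	return fmt.Sprintf("%s: status=%s running=%t exit_code=%d",
		e.Name, e.State.Status, e.State.Running, e.State.ExitCode), nil
}

func main() {
	// Illustrative sample only; a real run would feed in the output of
	// `docker inspect <container name>`.
	sample := []byte(`[{"Id":"109a38a9e6ef","Name":"/runner-465639a0-project-44-concurrent-3-build","State":{"Status":"exited","Running":false,"ExitCode":1}}]`)
	summary, err := describeConflict(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(summary)
}
```

Logging a summary like this whenever the conflict error occurs would show whether the leftover container is still running or has already exited.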
Proposal
First Iteration
Rename containers
All service, build, and predefined containers should change their naming structure to include the CI_JOB_ID. For example, assuming $CI_JOB_ID is 1234:
- Build Container:
  Before: runner-x-SdEBvt-project-20-concurrent-0-build-4
  After: runner-x-SdEBvt-project-20-concurrent-0-build-4-1234
- Service Container:
  Before: runner-x-SdEBvt-project-20-concurrent-0-postgres-0
  After: runner-x-SdEBvt-project-20-concurrent-0-postgres-0-1234
- Helper image container:
  Before: runner-x-SdEBvt-project-20-concurrent-0-predefined
  After: runner-x-SdEBvt-project-20-concurrent-0-predefined-1234
Note that the volume containers are intentionally not renamed, so that we keep a persistent volume between runs.
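The naming rule above could be sketched as follows. This is a minimal illustration, assuming the job ID is simply appended to the existing name. The function name and its parameters are hypothetical and are not taken from the Runner code base.

```go
package main

import "fmt"

// containerName (hypothetical) builds a per-job container name by appending
// the job ID to the existing "runner-...-concurrent-N-<type>" layout shown
// in the examples above.
func containerName(runnerToken string, projectID, concurrentID int, containerType string, jobID int) string {
	base := fmt.Sprintf("runner-%s-project-%d-concurrent-%d-%s",
		runnerToken, projectID, concurrentID, containerType)
	return fmt.Sprintf("%s-%d", base, jobID)
}

func main() {
	// Assuming $CI_JOB_ID is 1234, as in the examples above.
	fmt.Println(containerName("x-SdEBvt", 20, 0, "build-4", 1234))
	fmt.Println(containerName("x-SdEBvt", 20, 0, "predefined", 1234))
}
```

Because the job ID differs between runs, two jobs that land on the same concurrent slot no longer compete for the same container name. Volume containers would keep using the base name without the suffix.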
Remove helper images on context cancellation
Investigate whether we call removeContainer on the helper image container when the build is canceled or terminated, that is, when the context is cancelled.
Future Iterations
When terminating containers, we sometimes just call removeContainer and killContainer, which does not allow for a graceful shutdown. We should therefore call ContainerStop first instead of killContainer, and also call ContainerStop before we remove the container. If we still fail to remove or kill the container, we will simply ignore the failure and log it as an error in the Runner logs.
More detail in #6359 (closed)
Original Report
Hello !
Since I upgraded to the latest GitLab CE, my pipelines have been failing with this kind of error:
ERROR: Preparation failed: Error response from daemon: Conflict. The container name "/runner-465639a0-project-44-concurrent-3-build" is already in use by container "109a38a9e6ef215f4518992c652e04572e5977b5bfbef72d038a0cf1bb662946". You have to remove (or rename) that container to be able to reuse that name.
It happens when I push to a branch with an already running pipeline. To resolve that, I have to cancel my already running build and relaunch my latest commit pipeline that failed.
I am using Gitlab CE 9.4.3 and Gitlab-multi-runner 9.4.
Has anyone else seen this?
Thank you very much !