GitLab Runner job failed after all commands have run
Summary
I recently set up a job that has generally been working fine. One run of it failed unexpectedly, even though all of its commands appear to have completed successfully.
Steps to reproduce
Unsure how to reproduce. I've only noticed it once.
Actual behavior
All the commands finished (the job was deploying an image to AWS ECS), but the job then failed with the following message:
ERROR: Job failed (system failure): error during connect: Get https://67.205.160.217:2376/v1.18/containers/e21b12b0e46700422c8fdb72ac7581509159262301c36928e1f36619b0d46e70/json: dial tcp 67.205.160.217:2376: getsockopt: no route to host
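For triage, it may help to separate this kind of failure from ordinary script failures when scanning job traces. The message is tagged "system failure" and the target port, 2376, is the conventional TLS port of a Docker daemon, which suggests the runner lost its connection to the build machine rather than one of my commands failing. Below is a minimal sketch in Python that extracts the host, port, and error detail from such a line. The function name `parse_system_failure` and the regular expression are my own assumptions for illustration, not part of GitLab Runner, and the pattern is based only on the single message shown above.

```python
import re

# Hypothetical helper, not part of GitLab Runner. The pattern is derived from
# the one error line in this report and may not cover other message variants.
SYSTEM_FAILURE = re.compile(
    r"ERROR: Job failed \(system failure\): .*?"
    r"dial tcp (?P<host>[\d.]+):(?P<port>\d+): (?P<detail>.+)$"
)


def parse_system_failure(line):
    """Return (host, port, detail) for a runner-side connect failure, else None."""
    match = SYSTEM_FAILURE.search(line.strip())
    if match is None:
        return None
    return match.group("host"), int(match.group("port")), match.group("detail")


# The error line from this report, copied verbatim.
line = (
    "ERROR: Job failed (system failure): error during connect: Get "
    "https://67.205.160.217:2376/v1.18/containers/"
    "e21b12b0e46700422c8fdb72ac7581509159262301c36928e1f36619b0d46e70/json: "
    "dial tcp 67.205.160.217:2376: getsockopt: no route to host"
)

result = parse_system_failure(line)
print(result)
```

A line that does not match the pattern, such as a plain script failure, yields `None`, so the helper can be used as a simple filter over the last line of each trace.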
Expected behavior
The job should have succeeded.
Relevant logs and/or screenshots
(Some names have been changed for security reasons.)
Running with gitlab-runner 10.1.0 (c1ecf97f)
on docker-auto-scale (4e4528ca)
Using Docker executor with image python:3-alpine ...
Starting service docker:dind ...
Pulling docker image docker:dind ...
Using docker image docker:dind ID=sha256:013f358b88761c0905838f3c2cfb1d922cff45ef854f2a8d741494f09f21ed79 for docker service...
Waiting for services to be up and running...
Using docker image sha256:d00fde078d8ef704733b1663134794a0e708d06c023f2714d71b5296d949a57b for predefined container...
Pulling docker image python:3-alpine ...
Using docker image python:3-alpine ID=sha256:83da413805809a25edd2b050166d335b609b6c502ec433e0b42309e675d72ce9 for build container...
Running on runner-4e4528ca-project-4455384-concurrent-0 via runner-4e4528ca-srm-1509614585-858779d8...
Cloning repository...
Cloning into '/builds/sometest/sometestio'...
Checking out 9fe587a8 as master...
Skipping Git submodules setup
$ apk add --update curl
fetch http://dl-cdn.alpinelinux.org/alpine/v3.4/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.4/community/x86_64/APKINDEX.tar.gz
(1/3) Installing libssh2 (1.7.0-r0)
(2/3) Installing libcurl (7.55.0-r2)
(3/3) Installing curl (7.55.0-r2)
Executing busybox-1.24.2-r13.trigger
OK: 32 MiB in 37 packages
$ curl -o /usr/local/bin/ecs-cli https://s3.amazonaws.com/amazon-ecs-cli/ecs-cli-linux-amd64-latest
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
63 15.2M 63 9995k 0 0 2383 0 1:51:57 1:11:35 0:40:22 20.9M
100 15.2M 100 15.2M 0 0 3727 0 1:11:35 1:11:35 --:--:-- 24.1M
time="2017-11-02T09:36:26Z" level=info msg="Saved ECS CLI configuration for cluster (default)"
time="2017-11-02T09:36:26Z" level=warning msg="Skipping unsupported YAML option..." option name=networks
time="2017-11-02T09:36:26Z" level=warning msg="Skipping unsupported YAML option for service..." option name=networks service name=sometestservice
$ chmod +x /usr/local/bin/ecs-cli
$ export VCS_REF=${CI_COMMIT_SHA:0:8}
$ ecs-cli configure --region $AWS_REGION --access-key $AWS_ACCESS_KEY_ID --secret-key $AWS_SECRET_ACCESS_KEY --cluster $AWS_ECS_CLUSTER_NAME
$ ecs-cli compose --project-name testtask service stop
time="2017-11-02T09:36:28Z" level=info msg="Updated ECS service successfully" desiredCount=0 serviceName=ecscompose-service-testtask
time="2017-11-02T09:36:28Z" level=info msg="Service status" desiredCount=0 runningCount=1 serviceName=ecscompose-service-testtask
time="2017-11-02T09:36:59Z" level=info msg="Service status" desiredCount=0 runningCount=0 serviceName=ecscompose-service-testtask
time="2017-11-02T09:36:59Z" level=info msg="ECS Service has reached a stable state" desiredCount=0 runningCount=0 serviceName=ecscompose-service-testtask
$ ecs-cli compose --project-name testtask service up
time="2017-11-02T09:36:59Z" level=warning msg="Skipping unsupported YAML option..." option name=networks
time="2017-11-02T09:36:59Z" level=warning msg="Skipping unsupported YAML option for service..." option name=networks service name=sometestservice
time="2017-11-02T09:37:01Z" level=info msg="Using ECS task definition" TaskDefinition="ecscompose-testtask:12"
time="2017-11-02T09:37:02Z" level=info msg="Updated the ECS service with a new task definition. Old containers will be stopped automatically, and replaced with new ones" desiredCount=1 serviceName=ecscompose-service-testtask taskDefinition="ecscompose-testtask:12"
ERROR: Job failed (system failure): error during connect: Get https://67.205.160.217:2376/v1.18/containers/e21b12b0e46700422c8fdb72ac7581509159262301c36928e1f36619b0d46e70/json: dial tcp 67.205.160.217:2376: getsockopt: no route to host
Environment description
Shared runners on GitLab.com
Used GitLab Runner version
Running with gitlab-runner 10.1.0 (c1ecf97f)
on docker-auto-scale (4e4528ca)