gitlab-runner-service failing to check service that is alive
Runner fails to start service with:
Waiting for services to be up and running...
*** WARNING: Service runner-b3864ed1-project-1-concurrent-0-postgres-0 probably didn't start properly.
Health check error:
exit code 1
Health check container logs:
2018-08-02T21:14:25.382508495Z No HOST or PORT
Service container logs:
2018-08-02T21:14:24.333910322Z The files belonging to this database system will be owned by user "postgres".
2018-08-02T21:14:24.334004783Z This user must also own the server process.
2018-08-02T21:14:24.334012074Z
2018-08-02T21:14:24.334136318Z The database cluster will be initialized with locale "en_US.utf8".
2018-08-02T21:14:24.334149142Z The default database encoding has accordingly been set to "UTF8".
2018-08-02T21:14:24.334154897Z The default text search configuration will be set to "english".
2018-08-02T21:14:24.334163941Z
2018-08-02T21:14:24.334202578Z Data page checksums are disabled.
2018-08-02T21:14:24.334241746Z
2018-08-02T21:14:24.334347596Z fixing permissions on existing directory /var/lib/postgresql/data ... ok
2018-08-02T21:14:24.335072437Z creating subdirectories ... ok
2018-08-02T21:14:24.364645568Z selecting default max_connections ... 100
2018-08-02T21:14:24.396789864Z selecting default shared_buffers ... 128MB
2018-08-02T21:14:24.827422650Z creating configuration files ... ok
2018-08-02T21:14:25.468874178Z creating template1 database in /var/lib/postgresql/data/base/1 ... ok
2018-08-02T21:14:25.501152818Z initializing pg_authid ... ok
2018-08-02T21:14:25.569349040Z initializing dependencies ... ok
2018-08-02T21:14:25.675218066Z creating system views ... ok
2018-08-02T21:14:25.750057204Z loading system objects' descriptions ... ok
2018-08-02T21:14:25.782935188Z creating collations ... ok
*********
I was on the Docker host itself when this occurred. By inserting a pause script into the job, I was able to see that:
- The container was running fine
- I could connect to it using its IP, and I could connect to the Postgres instance running on it
Here's the relevant YML:
test:
  stage: test
  services:
    - postgres:9.3
  variables:
    POSTGRES_DB: test
    POSTGRES_USER: test
    POSTGRES_PASSWORD: test
  dependencies:
    - build
  script:
    - gradle test
  artifacts:
    paths:
      - build/reports/tests/
    when: always
Here is my Docker version:
docker --version
Docker version 18.06.0-ce, build 0ffa825
Here is my runner version:
root@gitlab-runner:/# gitlab-runner --version
Version: 11.1.0
Git revision: 081978aa
Git branch:
GO version: go1.8.7
Built: 2018-07-22T07:24:46+00:00
OS/Arch: linux/amd64
Here is my gitlab version information:
GitLab 11.1.2 (35936b0)
GitLab Shell 7.1.4
GitLab Workhorse v5.0.0
GitLab API v4
Ruby 2.4.4p296
Rails 4.2.10
postgresql 9.6.8
This error:
No HOST or PORT
seems to come from a script on the runner itself that looks for two variables:
host=$(env | grep -m1 _TCP_ADDR | cut -d = -f 2)
port=$(env | grep -m1 _TCP_PORT | cut -d = -f 2)
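To make the failure mode easier to reproduce outside the runner, here is a minimal sketch of that lookup wrapped in a function that reads an `env`-style listing from stdin. The function name `parse_endpoint` is hypothetical and not part of gitlab-runner, and the variable names in the usage comment assume the `<ALIAS>_PORT_<port>_TCP_ADDR` / `_TCP_PORT` form that Docker's legacy container links inject; the address and port values are made up for illustration.

```shell
#!/bin/sh
# Sketch only: mirrors the two grep/cut lines quoted above.
# parse_endpoint is a hypothetical helper, not the real health check script.
parse_endpoint() {
    # Read the whole NAME=value listing from stdin (e.g. the output of `env`).
    listing=$(cat)

    # First variable whose line contains _TCP_ADDR / _TCP_PORT; keep the value.
    host=$(printf '%s\n' "$listing" | grep -m1 _TCP_ADDR | cut -d = -f 2)
    port=$(printf '%s\n' "$listing" | grep -m1 _TCP_PORT | cut -d = -f 2)

    # With neither variable present, both values are empty and the check fails.
    if [ -z "$host" ] || [ -z "$port" ]; then
        echo "No HOST or PORT"
        return 1
    fi

    echo "$host:$port"
}

# Usage (illustrative values):
#   env | parse_endpoint
#   printf 'POSTGRES_PORT_5432_TCP_ADDR=172.17.0.2\nPOSTGRES_PORT_5432_TCP_PORT=5432\n' | parse_endpoint
```

Feeding it a listing that contains only variables such as HOSTNAME, HOME, and PATH takes the failure branch, which matches the message in the health check container logs.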
I modified gitlab-runner-service in the helper image to dump the whole container environment when this error is encountered, and indeed no variables containing _TCP_ADDR or _TCP_PORT are set when it runs:
Health check container logs:
2018-08-02T23:56:01.111600624Z HOSTNAME=6cfd90e650b9 SHLVL=1 HOME=/root PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin PWD=/
2018-08-02T23:56:01.111671306Z No HOST or PORT
Edited by Jim Carreer