
CI/CD jobs on the `main` branch don't start "services" (e.g. the postgres service); jobs from MRs are fine

Summary

When CI/CD runs on the main branch, it doesn't create "services" (e.g. the postgres service). From MRs everything is fine.

Alternative summary:

When the pipeline runs on the main branch, the PostgreSQL service fails to start with the error `initdb: error: superuser name "[MASKED]" is disallowed; role names cannot begin with "pg_"`. Everything works from MRs and non-main branches.

Steps to reproduce

.gitlab-ci.yml
default:
  image: python:3.10

tests:
  services:
    - postgres:15.2

  variables:
    POSTGRES_HOST: postgres # connected to a "service"
    POSTGRES_DB: $POSTGRES_DB
    POSTGRES_USER: $POSTGRES_USER
    POSTGRES_PASSWORD: $POSTGRES_PASSWORD
    POSTGRES_PORT: 5432
    POSTGRES_HOST_AUTH_METHOD: trust

  stage: test
  script:
    - cat /etc/hosts
    - rounds=42;
      while [ $rounds -gt 0 ]; do
        set +e;
        printf "" 2>>/dev/null >>/dev/tcp/$POSTGRES_HOST/$POSTGRES_PORT && break;
        set -e;
        rounds=$(($rounds - 1));
        echo sleeping, remaining rounds=$rounds;
        sleep 5;
      done;
    - cat /etc/hosts
    - ...
  only:
    - main
    - merge_requests
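
The wait loop in the script above can be hard to read once YAML folds it into a single line. A minimal sketch of the same idea as a shell function follows; the name `wait_for_port` and its argument order are made up for illustration, and it assumes the job shell is bash, since `/dev/tcp` is a bash feature. With the defaults from the original loop (42 rounds, 5 seconds each) it gives up after about 210 seconds.

```shell
# wait_for_port HOST PORT [ROUNDS] [DELAY_SECONDS]
# Hypothetical helper: returns 0 as soon as a TCP connection to HOST:PORT
# succeeds, or 1 after ROUNDS failed attempts.
wait_for_port() {
  local host=$1 port=$2 rounds=${3:-42} delay=${4:-5}
  while [ "$rounds" -gt 0 ]; do
    # Open the connection in a subshell so the descriptor is closed again
    # right away and a failed attempt never aborts the job under `set -e`.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    rounds=$((rounds - 1))
    echo "sleeping, remaining rounds=$rounds"
    sleep "$delay"
  done
  return 1
}
```

In the job this could be called as `wait_for_port "$POSTGRES_HOST" "$POSTGRES_PORT"`. Unlike the original loop, a non-zero return makes the job fail at the wait step instead of in a later step.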

Actual behavior

When running from MRs, this is what I see:

$ cat /etc/hosts
127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
ff00::0	ip6-mcastprefix
ff02::1	ip6-allnodes
ff02::2	ip6-allrouters
172.17.0.3	postgres 3eeae23c47d7 runner-j1aldqxs-project-45508215-concurrent-0-68f134eedf16233b-postgres-0
172.17.0.4	runner-j1aldqxs-project-45508215-concurrent-0
$ rounds=42; while [ $rounds -gt 0 ]; do set +e; printf "" 2>>/dev/null >>/dev/tcp/$POSTGRES_HOST/$POSTGRES_PORT && break; set -e; rounds=$(($rounds - 1)); echo sleeping, remaining rounds=$rounds; sleep 5; done;
$ cat /etc/hosts
127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
ff00::0	ip6-mcastprefix
ff02::1	ip6-allnodes
ff02::2	ip6-allrouters
172.17.0.3	postgres 3eeae23c47d7 runner-j1aldqxs-project-45508215-concurrent-0-68f134eedf16233b-postgres-0
172.17.0.4	runner-j1aldqxs-project-45508215-concurrent-0

The hosts file lists postgres, and a single round of retries (sometimes two) is enough to see postgres come up.

But when running from the main branch, I see this:

$ cat /etc/hosts
127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
ff00::0	ip6-mcastprefix
ff02::1	ip6-allnodes
ff02::2	ip6-allrouters
172.17.0.3	runner--azerasq-project-45508215-concurrent-0
$ rounds=42; while [ $rounds -gt 0 ]; do set +e; printf "" 2>>/dev/null >>/dev/tcp/$POSTGRES_HOST/$POSTGRES_PORT && break; set -e; rounds=$(($rounds - 1)); echo sleeping, remaining rounds=$rounds; sleep 5; done;
sleeping, remaining rounds=41
sleeping, remaining rounds=40
...
sleeping, remaining rounds=2
sleeping, remaining rounds=1
sleeping, remaining rounds=0
$ cat /etc/hosts
127.0.0.1	localhost
::1	localhost ip6-localhost ip6-loopback
fe00::0	ip6-localnet
ff00::0	ip6-mcastprefix
ff02::1	ip6-allnodes
ff02::2	ip6-allrouters
172.17.0.3	runner--azerasq-project-45508215-concurrent-0

There is no postgres entry in /etc/hosts, and later steps in the script fail because they can't connect to the DB.
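
Since the missing hosts entry is visible before the wait loop even starts, the job could fail fast instead of sleeping through every round. The sketch below checks a hosts file for a given name; `hosts_has_name` is a hypothetical helper, and it assumes the runner writes the service alias into `/etc/hosts` as in the logs above.

```shell
# hosts_has_name NAME [FILE]
# Hypothetical helper: succeeds if NAME appears as a hostname or alias
# (any column after the IP address) in FILE, which defaults to /etc/hosts.
hosts_has_name() {
  local name=$1 file=${2:-/etc/hosts}
  awk -v n="$name" '
    /^[[:space:]]*#/ { next }                       # skip comment lines
    { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
    END { exit !found }                             # 0 if found, 1 otherwise
  ' "$file"
}
```

A script step such as `hosts_has_name "$POSTGRES_HOST" || { echo "service container missing"; exit 1; }` would then stop the job right after the first `cat /etc/hosts`.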

Expected behavior

postgres should always appear in /etc/hosts and be reachable after a few tries.

Relevant logs and/or screenshots

(three screenshots)

Environment description

I use shared Runners on GitLab.com.

Used GitLab Runner version

From MRs (all succeeded):

Running with gitlab-runner 15.9.0~beta.115.g598a7c91 (598a7c91)
  on blue-1.shared.runners-manager.gitlab.com/default j1aLDqxS, system ID: s_b437a71a38f9
  feature flags: FF_USE_IMPROVED_URL_MASKING:true

OR

Running with gitlab-runner 15.9.0~beta.115.g598a7c91 (598a7c91)
  on blue-4.shared.runners-manager.gitlab.com/default J2nyww-s, system ID: s_5425356d8adf
  feature flags: FF_USE_IMPROVED_URL_MASKING:true

OR

Running with gitlab-runner 15.9.0~beta.115.g598a7c91 (598a7c91)
  on blue-3.shared.runners-manager.gitlab.com/default zxwgkjAP, system ID: s_284de3abf026
  feature flags: FF_USE_IMPROVED_URL_MASKING:true

From main (all failed):

Running with gitlab-runner 15.9.0~beta.115.g598a7c91 (598a7c91)
  on blue-5.shared.runners-manager.gitlab.com/default -AzERasQ, system ID: s_8a38c517a741
  feature flags: FF_USE_IMPROVED_URL_MASKING:true

OR

Running with gitlab-runner 15.9.0~beta.115.g598a7c91 (598a7c91)
  on blue-3.shared.runners-manager.gitlab.com/default zxwgkjAP, system ID: s_284de3abf026
  feature flags: FF_USE_IMPROVED_URL_MASKING:true

OR

Running with gitlab-runner 15.9.0~beta.115.g598a7c91 (598a7c91)
  on blue-4.shared.runners-manager.gitlab.com/default J2nyww-s, system ID: s_5425356d8adf
  feature flags: FF_USE_IMPROVED_URL_MASKING:true

Notice that blue-3.shared.runners-manager.gitlab.com/default and blue-4.shared.runners-manager.gitlab.com/default each have both successful and unsuccessful runs.

Maybe related tickets

#29499 (closed)

Possible fixes

Don't set POSTGRES_HOST_AUTH_METHOD to trust, although that is what https://docs.gitlab.com/ee/ci/services/postgres.html recommends.

UPDATE: after removing POSTGRES_HOST_AUTH_METHOD (a one-line change to the YAML file), the job kept working on MRs and kept failing on the main branch, now with a more visible error:

initdb: error: superuser name "[MASKED]" is disallowed; role names cannot begin with "pg_"


Either way, it's not clear why it always fails on the main branch and always succeeds in MRs and on non-main branches.
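
One way to narrow this down, offered as a sketch and not a confirmed diagnosis: compare what `POSTGRES_USER` looks like in each pipeline type without printing its value. The initdb error says the name begins with `pg_` on main. If the variables are defined as protected CI/CD variables (an assumption, since the issue doesn't say), they would be passed only to pipelines on protected branches, which could explain why MRs see a different value. `describe_pg_user` is a hypothetical helper name.

```shell
# describe_pg_user VALUE
# Hypothetical helper: prints a category for the given user name instead of
# the name itself, so a masked value never ends up in the job log.
describe_pg_user() {
  local user=${1-}
  case "$user" in
    "")   echo "unset-or-empty" ;;    # variable not passed to this pipeline
    pg_*) echo "starts-with-pg_" ;;   # initdb rejects names with this prefix
    *)    echo "ok" ;;
  esac
}
```

Adding `describe_pg_user "${POSTGRES_USER-}"` as a script step and running it once from an MR and once from main would show whether the two pipelines receive different values.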

Edited by Evergreen Hacker