
Jobs are randomly hanging

Summary

From time to time our test jobs hang and eventually fail with the message: ERROR: Job failed: execution took longer than 1h0m0s seconds

Steps to reproduce

Our .gitlab-ci.yml:

test_app:
    stage: test
    image: registry.gitlab.com/drink-it/shop/docker:latest
    services:
        - docker:dind
    variables:
        DOCKER_DRIVER: overlay2
    tags:
        - docker
    script:
        - docker --version
        - docker-compose --version
        - docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN $CI_REGISTRY
        - docker-compose up -d webserver
        - cp app/etc/local.xml.docker app/etc/local.xml
        - docker-compose exec -T webserver bash -c 'cd /var/www/htdocs/dev && composer install --no-interaction'
        - docker-compose exec -T webserver bash -c './shell/docker/check-mysql.sh'
        - docker-compose exec -T webserver bash -c 'set-base-url -c http://drink.loc/'
        - docker-compose exec -T webserver bash -c 'phantomjs --webdriver=4444 --ignore-ssl-errors=yes >/dev/null 2>&1 &'
        - docker-compose exec -T webserver bash -c './shell/docker/check-phantomjs.sh'
        - docker-compose exec -T webserver bash -c 'cd /var/www/htdocs/dev && ./vendor/bin/codecept run --steps'
    artifacts:
        when: on_failure
        paths:
            - $CI_PROJECT_DIR/dev/tests/_output/*.png

Our docker-compose.yml:

version: "3.1"
services:

    memcached:
        image: memcached:alpine
        container_name: drink-memcached

    redis:
        image: redis:alpine
        container_name: drink-redis

    mysql:
        image: mysql:5.6
        container_name: drink-mysql
        volumes:
            - ./dev/docker/.data/db:/var/lib/mysql
            - ./dev/build/db/dump.sql.gz:/docker-entrypoint-initdb.d/dump.sql.gz

    webserver:
        image: registry.gitlab.com/drink-it/shop:latest
        container_name: drink-webserver
        depends_on:
            - "mysql"
            - "redis"
            - "memcached"
        extra_hosts:
            - "drink.loc:127.0.0.1"
            - "business.drink.loc:127.0.0.1"
        volumes:
            - .:/var/www/htdocs

Actual behaviour

Sometimes the build hangs and then fails when the job timeout is reached.

At first we thought the problem was in the tests, but running the same steps on a local machine always succeeds. Moreover, when the job hangs on the runner, it does not always hang at the same test step, which makes me think the cause could be insufficient resources on the runner instance.
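One way to check the resource hypothesis would be to print a periodic heartbeat with the system load into the job log while the tests run. The following is only a sketch: the `heartbeat` function name and the 30-second interval are made up for illustration, and it assumes a Linux environment where /proc/loadavg is readable.

```shell
# Hypothetical helper (not part of our current job): print a timestamped
# line with the 1/5/15-minute load averages every INTERVAL seconds.
# Assumes Linux; if /proc/loadavg is missing, the load field stays empty.
heartbeat() {
    interval="$1"
    while :; do
        load="$(cut -d' ' -f1-3 /proc/loadavg 2>/dev/null)"
        echo "[heartbeat] $(date -u +%H:%M:%S) load: ${load}"
        sleep "$interval"
    done
}

# Possible usage around the test step (illustrative only):
#   heartbeat 30 &
#   HEARTBEAT_PID=$!
#   ... run the codecept step ...
#   kill "$HEARTBEAT_PID"
```

If the heartbeat lines stop appearing or the load climbs sharply just before the hang, that would point towards the runner instance rather than the tests.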

Expected behaviour

I expect the job not to hang, or at least to report a more informative message about why it hung.
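As a workaround on our side, a step could be run under its own shorter deadline so that the log names the command that stalled instead of only reporting the one-hour job timeout. This is a minimal sketch, not a fix for the hang itself: `run_with_deadline` is a hypothetical helper, and it assumes a `timeout` command that behaves like the GNU coreutils one (exit status 124 when the limit is hit), which may not hold for every image.

```shell
# Hypothetical helper: run a command under a deadline and say so on timeout.
# Assumes GNU coreutils `timeout`, which exits with status 124 when the
# command is stopped because the time limit was reached.
run_with_deadline() {
    limit="$1"
    shift
    status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "TIMEOUT: '$*' did not finish within ${limit}" >&2
    fi
    return "$status"
}
```

For example, the last step of the script could be wrapped as `run_with_deadline 20m docker-compose exec -T webserver bash -c '...'`, where the 20-minute limit is an arbitrary value chosen for illustration.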

Relevant logs and/or screenshots

Here is the log of failed job:

Running with gitlab-runner 11.1.0-rc2 (83bc9589)
 on docker-auto-scale fa6cab46
Using Docker executor with image registry.gitlab.com/drink-it/shop/docker:latest ...
Starting service docker:dind ...
Pulling docker image docker:dind ...
Using docker image sha256:a340e9c87a9562c063519ce3c6a80e315932107e4a56a5cb17adb5dd5658ecd4 for docker:dind ...
Waiting for services to be up and running...
Pulling docker image registry.gitlab.com/drink-it/shop/docker:latest ...
Using docker image sha256:0c7e5b70bd1d97d058dc2961a11ce67738afc511ad888cc8525496e5c320885a for registry.gitlab.com/drink-it/shop/docker:latest ...
Running on runner-fa6cab46-project-4675100-concurrent-0 via runner-fa6cab46-srm-1531857937-e044967f...
Cloning repository...
Cloning into '/builds/drink-it/shop'...
Checking out 0394ce19 as 173-product-front-template-collision...
Skipping Git submodules setup
Checking cache for default-1...
Downloading cache.zip from http://runners-cache-3-internal.gitlab.com:444/runner/project/4675100/default-1 
Successfully extracted cache
$ docker --version
Docker version 18.05.0-ce, build f150324
$ docker-compose --version
docker-compose version 1.21.2, build a133471
$ docker login -u gitlab-ci-token -p $CI_BUILD_TOKEN $CI_REGISTRY
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
$ docker-compose up -d webserver
Creating network "shop_default" with the default driver
Pulling redis (redis:alpine)...
alpine: Pulling from library/redis
Digest: sha256:e57274dac037e5b0c7680717fcaaa0efeffb23430e54e839c50819c9d842a38c
Status: Downloaded newer image for redis:alpine
Pulling mysql (mysql:5.6)...
5.6: Pulling from library/mysql
Digest: sha256:29e32fba52c3e6708fdc8a7678287debe3554febced25ade8686a63d4409ceda
Status: Downloaded newer image for mysql:5.6
Pulling memcached (memcached:alpine)...
alpine: Pulling from library/memcached
Digest: sha256:9206241d87e1ff7101f836bc88ed07b40637a99bbf59788d3a44366dc306d0e8
Status: Downloaded newer image for memcached:alpine
Pulling webserver (registry.gitlab.com/drink-it/shop:latest)...
latest: Pulling from drink-it/shop
Digest: sha256:59b954bcc62f620128d1f5f3c25e454cd7bd292726c1f088722c3f6976ccc16c
Status: Downloaded newer image for registry.gitlab.com/drink-it/shop:latest
Creating drink-mysql ... 
Creating drink-redis ... 
Creating drink-memcached ... 
Creating drink-mysql     ... done
Creating drink-memcached ... done
Creating drink-redis     ... done
Creating drink-webserver ... 
Creating drink-webserver ... done
$ cp app/etc/local.xml.docker app/etc/local.xml
$ docker-compose exec -T webserver bash -c 'cd /var/www/htdocs/dev && composer install --no-interaction'
Do not run Composer as root/super user! See https://getcomposer.org/root for details
Loading composer repositories with package information
Installing dependencies (including require-dev) from lock file
Nothing to install or update
Generating autoload files
$ docker-compose exec -T webserver bash -c './shell/docker/check-mysql.sh'
ERROR 2003 (HY000): Can't connect to MySQL server on 'mysql' (111)
database is accessible now.
$ docker-compose exec -T webserver bash -c 'set-base-url -c http://drink.loc/'
Base URL set to http://drink.loc/
$ docker-compose exec -T webserver bash -c 'phantomjs --webdriver=4444 --ignore-ssl-errors=yes >/dev/null 2>&1 &'
$ docker-compose exec -T webserver bash -c './shell/docker/check-phantomjs.sh'
PhantomJS is not yet on port :4444
PhantomJS is on port :4444
$ docker-compose exec -T webserver bash -c 'cd /var/www/htdocs/dev && ./vendor/bin/codecept run --steps'
Codeception PHP Testing Framework v2.3.6
Powered by PHPUnit 6.4.4 by Sebastian Bergmann and contributors.

Acceptance Tests (4) ---------------------------------------
001_HomePageCept: See Home Page
Signature: 001_HomePageCept
Test: tests/acceptance/001_HomePageCept.php
Scenario --
 I am on page "/"
 I see element ".cms-index-index"
 PASSED 

002_CustomerRegistrationCept: Register New Customer
Signature: 002_CustomerRegistrationCept
Test: tests/acceptance/002_CustomerRegistrationCept.php
Scenario --
 I have fixtures ["b2c_customer"]
 I am on page "/"
 I see "Login / Register"
 I click "Login / Register"
 I see "Neues Kundenkonto anlegen"
 I see "Registrieren"
 I click "Registrieren"
 I see "Als Privatperson registrieren"
 I fill field "Geburtstag (Kein Verkauf an ...","17.07.2001"
 I click "Konto eröffnen"
 I see "Der Verkauf an Minderjährige ist verboten."
 I fill field "Geburtstag (Kein Verkauf an ...","17.07.2000"
 I click "Konto eröffnen"
 I wait 1
 I don't see "Der Verkauf an Minderjährige ist verboten."
 I fill field "Vorname","John"
 I fill field "Nachname","Doe"
 I fill field "E-Mail Adresse","john.doe+1531858152@exam..."
 I fill field "E-Mail bestätigen","john.doe+1531858152@e..."
 I fill field "Passwort","123123q"
 I fill field "Passwort bestätigen","123123q"
 I check option "//*[@id="form-validate"]/div[10]/div/di..."
 I wait for element "//input[contains(@class, 'valida...",30
 I see "Konto eröffnen"
 I click "Konto eröffnen"
Pulling docker image gitlab/gitlab-runner-helper:x86_64-83bc9589 ...
ERROR: Job failed: execution took longer than 1h0m0s seconds

Environment description

We are using GitLab shared runners.

Used GitLab Runner version

Running with gitlab-runner 11.1.0-rc2 (83bc9589)
 on docker-auto-scale fa6cab46