CI Jobs Fail prematurely if there isn't new output for 70+ minutes

Summary

We have some CI jobs that run a very long mvn command that produces no output. The mvn command runs for well over 70 minutes, and sometimes when it does, the job fails for no apparent reason. Jobs that fail this way end with no artifacts. When this happens, the job on the runner continues to run normally. The time at which the failure occurs varies a lot, but it is always after more than 70 minutes. Our CI timeout is set to 6 hours, so we are not hitting that.

Workaround

Have the script output a progress report. Example:

(while true; do sleep 5m; date; done)&
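A slightly more careful variant of the one-liner above is sketched here. It assumes a POSIX-style shell; the function name `with_heartbeat` and the variable `HEARTBEAT_INTERVAL` are hypothetical names introduced only for illustration. The idea is the same (print the date periodically so the job log keeps receiving output), but the background loop is stopped once the wrapped command finishes, and the command's exit status is preserved.

```shell
# Hypothetical helper: run a command while printing a timestamp every
# HEARTBEAT_INTERVAL seconds (default 300) so the job log keeps growing.
with_heartbeat() {
  hb_interval="${HEARTBEAT_INTERVAL:-300}"

  (
    # On TERM, stop the pending sleep as well so nothing is left behind.
    trap 'kill "$hb_sleep_pid" 2>/dev/null; exit 0' TERM
    while true; do
      sleep "$hb_interval" &
      hb_sleep_pid=$!
      wait "$hb_sleep_pid"
      date
    done
  ) &
  hb_pid=$!

  # Run the real command and remember its exit status.
  hb_status=0
  "$@" || hb_status=$?

  # Stop the heartbeat loop and reap it.
  kill "$hb_pid" 2>/dev/null
  wait "$hb_pid" 2>/dev/null

  return "$hb_status"
}
```

In a job script this could be called as, for example, `with_heartbeat sleep 5h`. Whether a given interval is short enough to avoid the failure is not established by this report; 5 minutes is simply the value used in the workaround above.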

Steps to reproduce

This issue can be reproduced with a very simple CI job like so:

stages:
  - test

test:
  stage: test
  script:
    - sleep 5h
    - echo "finished"
    - exit 0

Running this, you will see the job output up to the "sleep 5h", but then the job fails at ~96 minutes. When it fails, there is no additional output for the job.

This is a similar job that does not fail, because it prints new output every 30 minutes:

stages:
  - test

test:
  stage: test
  script:
    - sleep 30m 
    - sleep 30m 
    - sleep 30m 
    - sleep 30m 
    - sleep 30m 
    - sleep 30m 
    - sleep 30m 
    - sleep 30m 
    - sleep 30m 
    - sleep 30m 
    - echo "finished"
    - exit 0

What is the current bug behavior?

The CI job fails prematurely if it does not produce output at somewhat regular intervals.

What is the expected correct behavior?

CI jobs should continue to run as long as the script is still running, regardless of output.

Results of GitLab environment info

System information
System:
Current User: git
Using RVM: no
Ruby Version: 2.3.6p384
Gem Version: 2.6.13
Bundler Version: 1.13.7
Rake Version: 12.3.0
Redis Version: 3.2.11
Git Version: 2.14.3
Sidekiq Version: 5.0.5
Go Version: unknown

GitLab information
Version: 10.6.3
Revision: 753d851
Directory: /opt/gitlab/embedded/service/gitlab-rails
DB Adapter: postgresql
Using LDAP: no
Using Omniauth: no

GitLab Shell
Version: 6.0.4
Repository storage paths:
  • default: /var/opt/gitlab/git-data/repositories
Hooks: /opt/gitlab/embedded/service/gitlab-shell/hooks
Git: /opt/gitlab/embedded/bin/git

Results of GitLab application Check

Checking GitLab Shell ...

GitLab Shell version >= 6.0.4 ? ... OK (6.0.4)
Repo base directory exists?
default... yes
Repo storage directories are symlinks?
default... no
Repo paths owned by git:root, or git:git?
default... yes
Repo paths access is drwxrws---?
default... yes
hooks directories in repos are links: ...
1/2 ... ok
1/20 ... ok
1/21 ... ok
1/22 ... ok
1/23 ... ok
1/27 ... ok
38/29 ... ok
42/30 ... ok
1/31 ... ok
68/32 ... ok
69/33 ... ok
69/34 ... ok
68/35 ... ok
68/36 ... ok
24/37 ... ok
68/38 ... repository is empty
68/39 ... ok
69/40 ... ok
24/43 ... ok
72/44 ... ok
68/46 ... repository is empty
71/47 ... ok
69/48 ... repository is empty
69/49 ... ok
69/52 ... ok
72/53 ... ok
2/54 ... ok
Running /opt/gitlab/embedded/service/gitlab-shell/bin/check
Check GitLab API access: OK
Redis available via internal API: OK
Access to /var/opt/gitlab/.ssh/authorized_keys: OK
gitlab-shell self-check successful

Checking GitLab Shell ... Finished

Checking Sidekiq ...

Running? ... yes
Number of Sidekiq processes ... 1

Checking Sidekiq ... Finished

Reply by email is disabled in config/gitlab.yml

Checking LDAP ...

LDAP is disabled in config/gitlab.yml

Checking LDAP ... Finished

Checking GitLab ...

Git configured correctly? ... yes
Database config exists? ... yes
All migrations up? ... yes
Database contains orphaned GroupMembers? ... no
GitLab config exists? ... yes
GitLab config up to date? ... yes
Log directory writable? ... yes
Tmp directory writable? ... yes
Uploads directory exists? ... yes
Uploads directory has correct permissions? ... yes
Uploads directory tmp has correct permissions? ... no
  Try fixing it:
  sudo chown -R git /var/opt/gitlab/gitlab-rails/uploads
  sudo find /var/opt/gitlab/gitlab-rails/uploads -type f -exec chmod 0644 {} \;
  sudo find /var/opt/gitlab/gitlab-rails/uploads -type d -not -path /var/opt/gitlab/gitlab-rails/uploads -exec chmod 0700 {} \;
  For more information see:
  doc/install/installation.md in section "GitLab"
  Please fix the error above and rerun the checks.
Init script exists? ... skipped (omnibus-gitlab has no init script)
Init script up-to-date? ... skipped (omnibus-gitlab has no init script)
Projects have namespace: ...
1/2 ... yes
1/20 ... yes
1/21 ... yes
1/22 ... yes
1/23 ... yes
1/27 ... yes
38/29 ... yes
42/30 ... yes
1/31 ... yes
68/32 ... yes
69/33 ... yes
69/34 ... yes
68/35 ... yes
68/36 ... yes
24/37 ... yes
68/38 ... yes
68/39 ... yes
69/40 ... yes
24/43 ... yes
72/44 ... yes
68/46 ... yes
71/47 ... yes
69/48 ... yes
69/49 ... yes
69/52 ... yes
72/53 ... yes
2/54 ... yes
Redis version >= 2.8.0? ... yes
Ruby version >= 2.3.5 ? ... yes (2.3.6)
Git version >= 2.9.5 ? ... yes (2.14.3)
Git user has default SSH configuration? ... yes
Active users: ... 57

Checking GitLab ... Finished

Edited by Allison Browne