ERROR: Job failed: command terminated with exit code 137 (auto retry not working)
Summary
From time to time (roughly 1 in 30 jobs), a job fails suddenly with exit code 137.
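For context, exit codes above 128 conventionally mean the process was terminated by a signal (128 plus the signal number), so 137 corresponds to signal 9, SIGKILL. A minimal sketch of that arithmetic, assuming a POSIX system; `describe_exit_code` is a hypothetical helper for illustration and is not part of GitLab or the runner:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a shell-style exit status (hypothetical helper, POSIX assumed)."""
    # By shell convention, a status above 128 encodes 128 + signal number.
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with status {code}"


print(describe_exit_code(137))
```

On Kubernetes, a SIGKILL is often the result of a container exceeding its memory limit, but nothing in this report confirms that this is the cause here.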
Steps to reproduce
It happens randomly. All runners use the Kubernetes executor.
What is the current bug behavior?
All jobs are configured with:
retry:
  max: 2
  when:
    - runner_system_failure
    - stuck_or_timeout_failure
    - unknown_failure
The job is marked as failed and is not retried automatically.
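The retry rule configured above can be modelled roughly as: retry only while retries remain and the job's failure reason matches one of the `when` entries. The sketch below is a simplified model, not GitLab's actual implementation; `should_retry` and `RETRY_POLICY` are hypothetical names. It also assumes, without confirmation from this report, that the exit code 137 failure may be classified as `script_failure`, a reason that is not in the configured list.

```python
# Mirrors the retry configuration from this report (hypothetical representation).
RETRY_POLICY = {
    "max": 2,
    "when": [
        "runner_system_failure",
        "stuck_or_timeout_failure",
        "unknown_failure",
    ],
}


def should_retry(failure_reason: str, retries_done: int, policy: dict) -> bool:
    """Return True if another retry is allowed under this simplified model."""
    # A retry needs both remaining budget and a matching failure reason.
    has_budget = retries_done < policy["max"]
    reason_matches = failure_reason in policy["when"]
    return has_budget and reason_matches


# Assumption: the failure is reported as script_failure, so no rule matches.
print(should_retry("script_failure", 0, RETRY_POLICY))
# A reason that is in the list is retried while budget remains.
print(should_retry("runner_system_failure", 0, RETRY_POLICY))
```

If that assumption holds, adding `script_failure` to `when` would be one way to test whether the retry then triggers.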
What is the expected correct behavior?
I would expect auto retry to kick in, but it doesn't.
Relevant logs and/or screenshots
$ skt -vv --state --junit junit --rc ${RC_FILE} --workdir ${WORKDIR} build --makeopts=-j$(nproc) --cfgtype rh-configs --rh-configs-glob 'redhat/configs/kernel-*-'"${ARCH_CONFIG}"'.config'
2018-12-05 21:23:53,644 INFO basecfg: None
2018-12-05 21:23:53,644 INFO cfgtype: rh-configs
2018-12-05 21:23:53,644 INFO building kernel: ['make', '-C', '/cki-project/cki-pipeline/workdir', 'INSTALL_MOD_STRIP=1', '-j64', 'targz-pkg', '-j64']
2018-12-05 21:23:53,645 INFO building Red Hat configs: ['make', '-C', '/cki-project/cki-pipeline/workdir', 'rh-configs']
ERROR: Job failed: command terminated with exit code 137
Results of GitLab environment info
gitlab-rake gitlab:env:info
System information
System:
Current User: git
Using RVM: no
Ruby Version: 2.4.5p335
Gem Version: 2.7.6
Bundler Version: 1.16.6
Rake Version: 12.3.1
Redis Version: 3.2.12
Git Version: 2.18.1
Sidekiq Version: 5.2.1
Go Version: unknown
GitLab information
Version: 11.5.1
Revision: c90ae59
Directory: /opt/gitlab/embedded/service/gitlab-rails
DB Adapter: postgresql
URL: https://xci32.lab.eng.rdu2.redhat.com
HTTP Clone URL: https://xci32.lab.eng.rdu2.redhat.com/some-group/some-project.git
SSH Clone URL: git@xci32.lab.eng.rdu2.redhat.com:some-group/some-project.git
Using LDAP: no
Using Omniauth: yes
Omniauth Providers:
GitLab Shell
Version: 8.4.1
Repository storage paths:
- default: /var/opt/gitlab/git-data/repositories
Hooks: /opt/gitlab/embedded/service/gitlab-shell/hooks
Git: /opt/gitlab/embedded/bin/git
Results of GitLab application Check
gitlab-rake gitlab:check SANITIZE=true
Checking GitLab Shell ...
GitLab Shell version >= 8.4.1 ? ... OK (8.4.1)
hooks directories in repos are links: ...
36/1 ... ok
36/2 ... ok
38/3 ... ok
38/4 ... ok
Running /opt/gitlab/embedded/service/gitlab-shell/bin/check
Check GitLab API access: OK
Redis available via internal API: OK
Access to /var/opt/gitlab/.ssh/authorized_keys: OK
gitlab-shell self-check successful
Checking GitLab Shell ... Finished
Checking Gitaly ...
default ... OK
Checking Gitaly ... Finished
Checking Sidekiq ...
Running? ... yes
Number of Sidekiq processes ... 1
Checking Sidekiq ... Finished
Reply by email is disabled in config/gitlab.yml
Checking LDAP ...
LDAP is disabled in config/gitlab.yml
Checking LDAP ... Finished
Checking GitLab ...
Git configured correctly? ... yes
Database config exists? ... yes
All migrations up? ... yes
Database contains orphaned GroupMembers? ... no
GitLab config exists? ... yes
GitLab config up to date? ... yes
Log directory writable? ... yes
Tmp directory writable? ... yes
Uploads directory exists? ... yes
Uploads directory has correct permissions? ... yes
Uploads directory tmp has correct permissions? ... skipped (no tmp uploads folder yet)
Init script exists? ... skipped (omnibus-gitlab has no init script)
Init script up-to-date? ... skipped (omnibus-gitlab has no init script)
Projects have namespace: ...
36/1 ... yes
36/2 ... yes
38/3 ... yes
38/4 ... yes
Redis version >= 2.8.0? ... yes
Ruby version >= 2.3.5 ? ... yes (2.4.5)
Git version >= 2.9.5 ? ... yes (2.18.1)
Git user has default SSH configuration? ... yes
Active users: ... 6
Checking GitLab ... Finished
Edited by Agustin Henze