VirtualBox guests sometimes do not power off after job completion
Summary
We run Windows build jobs by starting the build machines as VirtualBox guests. Each job powers its machine up and powers it off again when the job completes. At the moment we can handle a few concurrent jobs. Unfortunately, the VirtualBox guests, which run as VBoxHeadless processes on the build host, sometimes do not power off after job completion. Each stuck guest costs us one job slot until we power it off by hand: we check htop and VBoxManage list runningvms, then run VBoxManage controlvm {guid} poweroff for the machines that show no disk I/O in htop. Once enough slots are lost, jobs effectively begin to starve. The bug always requires human intervention. Finding the stuck VirtualBox machines is error-prone, and powering off the wrong machine causes builds to fail. Note that the jobs that ran on the stuck machines went green and are no longer visible, even in the admin area.
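For reference, the manual cleanup described above could be partly scripted. The following is only a minimal sketch, assuming Python 3.7+ on the build host and VBoxManage on the PATH. The helper names (parse_running_vms, list_running_vms, power_off) are made up for illustration and are not part of gitlab-runner or VirtualBox. It only lists the running guests and does not decide which ones are stuck, since that still needs the disk I/O check in htop.

```python
import re
import subprocess

# Each line of `VBoxManage list runningvms` looks like: "vm name" {uuid}
_VM_LINE = re.compile(r'^"(?P<name>.*)"\s+\{(?P<uuid>[0-9a-fA-F-]+)\}\s*$')


def parse_running_vms(output):
    """Return a list of (name, uuid) tuples parsed from the command output."""
    vms = []
    for line in output.splitlines():
        match = _VM_LINE.match(line.strip())
        if match:
            vms.append((match.group("name"), match.group("uuid")))
    return vms


def list_running_vms():
    """Run `VBoxManage list runningvms` and parse its output."""
    result = subprocess.run(
        ["VBoxManage", "list", "runningvms"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_running_vms(result.stdout)


def power_off(uuid):
    """Hard power-off of one guest, same as the manual controlvm command."""
    subprocess.run(["VBoxManage", "controlvm", uuid, "poweroff"], check=True)


if __name__ == "__main__":
    # Only print the candidates. Calling power_off() stays a deliberate
    # manual step, because powering off a busy guest fails its build.
    for name, uuid in list_running_vms():
        print(f"{uuid}  {name}")
```

Once a guest has been confirmed idle, power_off("<guid>") does the same as the manual VBoxManage controlvm {guid} poweroff call.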
Steps to reproduce
We cannot reproduce the symptom on demand. We only notice that build jobs are starving and then begin to investigate.
What is the current bug behavior?
We observe that VirtualBox guests, running as VBoxHeadless processes, sometimes do not power off after job completion.
What is the expected correct behavior?
We expect VirtualBox guests, running as VBoxHeadless processes, to always power off after job completion.
Relevant logs and/or screenshots
root@gitlab-runner-host~# gitlab-runner --version
Version: 9.5.1
Git revision: 96b34cc
Git branch: 9-5-stable
GO version: go1.8.3
Built: Wed, 04 Oct 2017 16:26:27 +0000
OS/Arch: linux/amd64
Output of checks
The bug happens in our corporate gitlab-ce environment.
Results of GitLab environment info
(For installations with omnibus-gitlab package run and paste the output of: sudo gitlab-rake gitlab:env:info)
(For installations from source run and paste the output of: sudo -u git -H bundle exec rake gitlab:env:info RAILS_ENV=production)
Results of GitLab application Check
System information
System: Ubuntu 16.04
Current User: git
Using RVM: no
Ruby Version: 2.3.6p384
Gem Version: 2.6.13
Bundler Version: 1.13.7
Rake Version: 12.3.0
Redis Version: 3.2.11
Git Version: 2.14.3
Sidekiq Version: 5.0.5
Go Version: unknown

GitLab information
Version: 10.6.2
Revision: 3e3c05b
Directory: /opt/gitlab/embedded/service/gitlab-rails
DB Adapter: postgresql
URL: https://gitlab-ce-host
HTTP Clone URL: https://gitlab-ce-host/some-group/some-project.git
SSH Clone URL: git@gitlab-ce-host:some-group/some-project.git
Using LDAP: yes
Using Omniauth: no

GitLab Shell
Version: 6.0.4
Repository storage paths:
- default: /var/opt/gitlab/git-data/repositories
Hooks: /opt/gitlab/embedded/service/gitlab-shell/hooks
Git: /opt/gitlab/embedded/bin/git