Every GitLab build hangs for 5 minutes after finishing Azure uploads
Summary
We run a current (13.5.1-ee) GitLab instance deployed on Kubernetes. It has recently developed a behaviour that in practice blocks CI for everyone: after every job finishes (either successfully or with a failure) and the runner has reported this (as confirmed by the runner's local log), the coordinator takes 5 minutes to show the final status in the UI. Until then, no dependent jobs or stages can start.
Steps to reproduce
- Access our GitLab instance.
- Install the example project on our instance.
- Observe the build times.
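To put a number on the delay rather than judging it by eye, the finish time in the runner log can be compared with the finish time the coordinator records for the same job. The following is only a minimal sketch: the helper names and the timestamps are hypothetical, and it assumes both sides expose ISO 8601 timestamps.

```python
from datetime import datetime, timedelta


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    if ts.endswith("Z"):
        ts = ts[:-1] + "+00:00"
    return datetime.fromisoformat(ts)


def reporting_lag(runner_finished: str, coordinator_finished: str) -> timedelta:
    """Time between the runner finishing a job and the coordinator closing it."""
    return parse_ts(coordinator_finished) - parse_ts(runner_finished)


if __name__ == "__main__":
    # Hypothetical values: copy the real ones from the runner log and the job page.
    lag = reporting_lag("2020-11-05T01:00:10Z", "2020-11-05T01:05:10Z")
    print(f"coordinator closed the job {lag} after the runner finished")
```

A lag that is consistently close to 5 minutes across unrelated jobs would support the impression that a fixed timeout is involved.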
Example Project
Attached please find an example project and a video that shows the behaviour.
What is the current bug behavior?
The job starts and finishes as expected, but the system then waits about 5 minutes before closing the job. During this time the three dots remain at the bottom of the screen.
What is the expected correct behavior?
The job starts and finishes as expected, and the system then closes the job without delay.
Relevant logs and/or screenshots
- We have turned the log Debug flags on, but they do not show anything out of the ordinary.
- We have reviewed the runner logs, but they do not show anything out of the ordinary.
Output of checks
Private instance hosted on AKS.
Results of GitLab environment info
Private instance hosted on AKS, created via Helm (see attached values). The periphery uses hosted services where possible.
Results of GitLab application Check
Application check (rake) as per Kubernetes instructions.
git@gitlab-task-runner-5c55b46cff-bxh7x:/$ /usr/local/bin/gitlab-rake gitlab:check
Checking GitLab subtasks ...
Checking GitLab Shell ...
GitLab Shell: ... GitLab Shell version >= 13.11.0 ? ... OK (13.11.0)
Running /home/git/gitlab-shell/bin/check
gitlab-shell self-check failed
Try fixing it:
Make sure GitLab is running;
Check the gitlab-shell configuration file:
sudo -u git -H editor /home/git/gitlab-shell/config.yml
Please fix the error above and rerun the checks.
Checking GitLab Shell ... Finished
Checking Gitaly ...
Gitaly: ... default ... OK
Checking Gitaly ... Finished
Checking Sidekiq ...
Sidekiq: ... Running? ... no
Try fixing it:
sudo -u git -H RAILS_ENV=production bin/background_jobs start
For more information see:
doc/install/installation.md in section "Install Init Script"
see log/sidekiq.log for possible errors
Please fix the error above and rerun the checks.
Checking Sidekiq ... Finished
Checking Incoming Email ...
Incoming Email: ... Reply by email is disabled in config/gitlab.yml
Checking Incoming Email ... Finished
Checking LDAP ...
LDAP: ... LDAP is disabled in config/gitlab.yml
Checking LDAP ... Finished
Checking GitLab App ...
Git configured correctly? ... no
Trying to fix error automatically. ...Failed
Try fixing it:
sudo -u git -H "/usr/bin/git" config --global core.autocrlf "input"
For more information see:
doc/install/installation.md in section "GitLab"
Database config exists? ... yes
All migrations up? ... yes
Database contains orphaned GroupMembers? ... no
GitLab config exists? ... yes
GitLab config up to date? ... yes
Log directory writable? ... yes
Tmp directory writable? ... yes
Uploads directory exists? ... yes
Uploads directory has correct permissions? ... yes
Uploads directory tmp has correct permissions? ... skipped (no tmp uploads folder yet)
Init script exists? ... no
Try fixing it:
Install the init script
For more information see:
doc/install/installation.md in section "Install Init Script"
Please fix the error above and rerun the checks.
Init script up-to-date? ... can't check because of previous errors
Projects have namespace: ...
GitLab Instance / Monitoring ... yes
Research and Development / Infrastructure / Gitlab Installation ... yes
Codebots / Marketing Site ... yes
Research and Development / Fourth Generation Bots / Bot Talk ... yes
Research and Development / Fourth Generation Bots / Bot Learn ... yes
Research and Development / Fourth Generation Bots / Bot Vision ... yes
Codebots / Marketing Blog ... yes
Brodie O'Carroll / Marketing Blog ... yes
Codebots / Site Builder ... yes
Jörn Guy Süß / Gitlab Cleaner ... yes
Research and Development / Collaboration / University of Queensland / Micro Credential ... yes
Research and Development / Collaboration / University of Queensland / Agility ... yes
Research and Development / Collaboration / CRCP ... yes
Research and Development / gitlab-5minute-lag ... yes
Redis version >= 4.0.0? ... yes
Ruby version >= 2.5.3 ? ... yes (2.6.6)
Git version >= 2.24.0 ? ... no
Your git bin path is "/usr/bin/git"
Try fixing it:
Update your git to a version >= 2.24.0 from Unknown
Please fix the error above and rerun the checks.
Git user has default SSH configuration? ... yes
Active users: ... 7
Is authorized keys file accessible? ... skipped (authorized keys not enabled)
GitLab configured to store new projects in hashed storage? ... yes
All projects are in hashed storage? ... yes
Elasticsearch version 6.x - 7.x? ... skipped (elasticsearch is disabled)
Checking GitLab App ... Finished
Checking GitLab subtasks ... Finished
Possible fixes
This behaviour does not depend on the:
- type of executor. It occurs with both the docker and the kubernetes executors.
- size of artifacts (for test cases there are none)
- size of logs (for test cases they are 5 lines long)
- image being used (for test cases it is busybox)
- script (for tests it is one line)
- network quality (for tests I have activated feature flags that would address this)
We have no resolution and no logs that show any issues. It looks as if the coordinator attempts a call to another system and waits for that call to time out.
gitlab-org/gitlab-runner~bug