Self-hosted GitLab instance often enters the state "HTTP 502: Waiting for GitLab to boot"

Summary

Seemingly at random, the website of my self-managed GitLab instance starts showing "HTTP 502: Waiting for GitLab to boot", and the container registries become inaccessible as well. GitLab does not appear to be booting at that point; in fact, gitlab-ctl status shows all services running. The issue persists until I manually restart GitLab with gitlab-ctl restart.

The only relevant log I could find is puma_stderr.log, which does show a fault; it is included further down.
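
To pin down when the instance enters the 502 state, a small probe could be run periodically (for example from cron) and its output appended to a file. The following is only a sketch: it assumes the probe runs on the GitLab host itself and that the built-in /-/readiness health endpoint is reachable from there; the URL and the helper names are illustrative, not part of GitLab.

```python
import urllib.error
import urllib.request
from datetime import datetime, timezone

# Assumption: probing from the GitLab host; adjust the scheme and host as needed.
READINESS_URL = "http://localhost/-/readiness"


def probe(url, opener=urllib.request.urlopen, timeout=5):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Non-2xx answers (such as 502) arrive as HTTPError and carry the code.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout and similar.
        return None


def describe(status):
    """Map a status code to a coarse health label."""
    if status == 200:
        return "ready"
    if status in (502, 503):
        return "backend-unavailable"
    if status is None:
        return "unreachable"
    return "unexpected-%d" % status


def log_line(url, status, now=None):
    """Build one timestamped line suitable for appending to a log file."""
    now = now or datetime.now(timezone.utc)
    return "%s %s %s" % (now.isoformat(timespec="seconds"), url, describe(status))


def main():
    print(log_line(READINESS_URL, probe(READINESS_URL)))
```

Calling main() once a minute and keeping the output would show the first moment the label changes from "ready" to "backend-unavailable", which can then be compared against the other logs under /var/log/gitlab.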

Steps to reproduce

Unfortunately, I don't know what causes the issue. I do know that it is not caused by any user action. For example, I upgraded the GitLab instance from 17.6.1 to 17.7.1 at 01:35 at night, and the error in puma_stderr.log appeared at 06:48 on Saturday morning, when nobody was using GitLab.

I manage several self-managed GitLab instances, and none of the others has this issue, so I suspect the OS environment: this instance is the only one running Ubuntu 22.04.

Example Project

What is the current bug behavior?

As described in the summary.

What is the expected correct behavior?

GitLab should not enter the "HTTP 502: Waiting for GitLab to boot" state when it is not actually booting.

Relevant logs and/or screenshots

cat /var/log/gitlab/puma/puma_stderr.log
#<Thread:0x00007f5f63c4e900 /opt/gitlab/embedded/lib/ruby/gems/3.2.0/gems/rack-timeout-0.7.0/lib/rack/timeout/support/scheduler.rb:72 run> terminated with exception (report_on_exception is true):
/opt/gitlab/embedded/lib/ruby/gems/3.2.0/gems/rack-timeout-0.7.0/lib/rack/timeout/core.rb:203:in `_set_state!': undefined method `state=' for nil:NilClass (NoMethodError)

      env[ENV_INFO_KEY].state = state
                       ^^^^^^^^
        from /opt/gitlab/embedded/lib/ruby/gems/3.2.0/gems/rack-timeout-0.7.0/lib/rack/timeout/core.rb:128:in `block in call'
        from /opt/gitlab/embedded/lib/ruby/gems/3.2.0/gems/rack-timeout-0.7.0/lib/rack/timeout/core.rb:130:in `block in call'
        from /opt/gitlab/embedded/lib/ruby/gems/3.2.0/gems/rack-timeout-0.7.0/lib/rack/timeout/support/scheduler.rb:38:in `run!'
        from /opt/gitlab/embedded/lib/ruby/gems/3.2.0/gems/rack-timeout-0.7.0/lib/rack/timeout/support/scheduler.rb:51:in `run!'
        from /opt/gitlab/embedded/lib/ruby/gems/3.2.0/gems/rack-timeout-0.7.0/lib/rack/timeout/support/scheduler.rb:99:in `block (2 levels) in run_loop!'
        from /opt/gitlab/embedded/lib/ruby/gems/3.2.0/gems/rack-timeout-0.7.0/lib/rack/timeout/support/scheduler.rb:99:in `each'
        from /opt/gitlab/embedded/lib/ruby/gems/3.2.0/gems/rack-timeout-0.7.0/lib/rack/timeout/support/scheduler.rb:99:in `block in run_loop!'
        from /opt/gitlab/embedded/lib/ruby/gems/3.2.0/gems/rack-timeout-0.7.0/lib/rack/timeout/support/scheduler.rb:80:in `loop'
        from /opt/gitlab/embedded/lib/ruby/gems/3.2.0/gems/rack-timeout-0.7.0/lib/rack/timeout/support/scheduler.rb:80:in `run_loop!'
        from /opt/gitlab/embedded/lib/ruby/gems/3.2.0/gems/rack-timeout-0.7.0/lib/rack/timeout/support/scheduler.rb:72:in `block (2 levels) in runner'
bundler: failed to load command: puma (/opt/gitlab/embedded/bin/puma)
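
To check how often this fault occurs, the log can be scanned for the thread-termination banner and the exception line that follows it. This is a rough sketch that assumes every occurrence has the same two-line shape as the excerpt above; the function name and the regular expressions are my own and are not part of GitLab or rack-timeout.

```python
import re

# Banner Ruby prints when a thread dies with report_on_exception enabled.
THREAD_DIED = re.compile(
    r"^#<Thread:\S+ (?P<origin>\S+) \w+> terminated with exception"
)
# First backtrace line: "<path>:<line>:in `<method>': <message> (<ErrorClass>)".
RAISED = re.compile(
    r"^(?P<path>\S+):(?P<line>\d+):in `(?P<method>[^']+)': "
    r"(?P<message>.*) \((?P<error>\w+(?:::\w+)*)\)$"
)


def find_thread_crashes(log_text):
    """Return one dict per 'terminated with exception' banner in log_text."""
    crashes = []
    lines = log_text.splitlines()
    for index, line in enumerate(lines):
        died = THREAD_DIED.match(line)
        if not died:
            continue
        # The raising frame is expected on the line right after the banner.
        following = lines[index + 1] if index + 1 < len(lines) else ""
        raised = RAISED.match(following)
        crashes.append({
            "thread_origin": died.group("origin"),
            "error": raised.group("error") if raised else None,
            "message": raised.group("message") if raised else None,
            "raised_at": (
                "%s:%s" % (raised.group("path"), raised.group("line"))
                if raised else None
            ),
        })
    return crashes
```

Feeding it the contents of /var/log/gitlab/puma/puma_stderr.log gives a list whose length is the number of such thread crashes recorded so far.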

Output of checks

Results of GitLab environment info

System information
System:         Ubuntu 22.04
Current User:   git
Using RVM:      no
Ruby Version:   3.2.5
Gem Version:    3.5.23
Bundler Version:2.5.11
Rake Version:   13.0.6
Redis Version:  7.0.15
Sidekiq Version:7.2.4
Go Version:     unknown

GitLab information
Version:        17.7.1
Revision:       ea03507eff8
Directory:      /opt/gitlab/embedded/service/gitlab-rails
DB Adapter:     PostgreSQL
DB Version:     14.11
URL:            https://gitlab.x.x
HTTP Clone URL: https://gitlab.x.x/some-group/some-project.git
SSH Clone URL:  git@gitlab.x.x:some-group/some-project.git
Using LDAP:     yes
Using Omniauth: yes
Omniauth Providers:

GitLab Shell
Version:        14.39.0
Repository storages:
- default:      unix:/var/opt/gitlab/gitaly/gitaly.socket
GitLab Shell path:              /opt/gitlab/embedded/service/gitlab-shell

Gitaly
- default Address:      unix:/var/opt/gitlab/gitaly/gitaly.socket
- default Version:      17.7.1
- default Git Version:  2.47.0

Results of GitLab application Check


gitlab-ctl status
run: gitaly: (pid 2106) 48546s; run: log: (pid 2100) 48546s
run: gitlab-kas: (pid 2107) 48546s; run: log: (pid 2104) 48546s
run: gitlab-workhorse: (pid 2109) 48546s; run: log: (pid 2102) 48546s
run: logrotate: (pid 35491) 1746s; run: log: (pid 2094) 48546s
run: nginx: (pid 2103) 48546s; run: log: (pid 2097) 48546s
run: postgresql: (pid 2096) 48546s; run: log: (pid 2093) 48546s
run: puma: (pid 2112) 48546s; run: log: (pid 2108) 48546s
run: redis: (pid 2113) 48546s; run: log: (pid 2110) 48546s
run: registry: (pid 2105) 48546s; run: log: (pid 2101) 48546s
run: sidekiq: (pid 2099) 48546s; run: log: (pid 2095) 48546s
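
The status output above only reflects whether the supervisor sees each process as alive, so it helps to record it in a structured form next to an HTTP-level check. Below is a minimal sketch of a parser for this output; it assumes one service per line in the "run: NAME: (pid N) Ns" form shown here and skips anything else (for example "down" lines, which have no pid). The function names are illustrative.

```python
import re

# Matches the leading "<state>: <service>: (pid <n>) <seconds>s" part of a line;
# the trailing "run: log: ..." part is ignored.
STATUS_LINE = re.compile(
    r"^(?P<state>\w+): (?P<service>[\w-]+): \(pid (?P<pid>\d+)\) (?P<uptime>\d+)s"
)


def parse_status(output):
    """Parse gitlab-ctl status text into {service: {state, pid, uptime_s}}."""
    services = {}
    for line in output.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match is None:
            continue  # not in the expected form; skipped by this sketch
        services[match.group("service")] = {
            "state": match.group("state"),
            "pid": int(match.group("pid")),
            "uptime_s": int(match.group("uptime")),
        }
    return services


def format_uptime(seconds):
    """Render a number of seconds as hours, minutes and seconds."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return "%dh%02dm%02ds" % (hours, minutes, secs)
```

For the output above, parse_status(...)["puma"]["uptime_s"] is 48546 and format_uptime(48546) gives "13h29m06s", so the supervisor had considered Puma continuously running for about thirteen and a half hours when the status was captured.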

gitlab-rake gitlab:check SANITIZE=true
Checking GitLab subtasks ...

Checking GitLab Shell ...

GitLab Shell: ... GitLab Shell version >= 14.39.0 ? ... OK (14.39.0)
Running /opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell-check
Internal API available: FAILED - Internal API unreachable
gitlab-shell self-check failed
Try fixing it:
Make sure GitLab is running;
Check the gitlab-shell configuration file:
sudo -u git -H editor /opt/gitlab/embedded/service/gitlab-shell/config.yml
Please fix the error above and rerun the checks.

Checking GitLab Shell ... Finished

Checking Gitaly ...

Gitaly: ... default ... OK

Checking Gitaly ... Finished

Checking Sidekiq ...

Sidekiq: ... Running? ... yes
Number of Sidekiq processes (cluster/worker) ... 1/1

Checking Sidekiq ... Finished

Checking Incoming Email ...

Incoming Email: ... Reply by email is disabled in config/gitlab.yml

Checking Incoming Email ... Finished

Checking LDAP ...

LDAP: ... Server: ldapmain
LDAP authentication... Success
LDAP users with access to your GitLab server (only showing the first 100 results)
User output sanitized. Found 83 users of 100 limit.

Checking LDAP ... Finished

Checking GitLab App ...

Database config exists? ... yes
Tables are truncated? ... skipped
All migrations up? ... yes
Database contains orphaned GroupMembers? ... no
GitLab config exists? ... yes
GitLab config up to date? ... yes
Cable config exists? ... yes
Resque config exists? ... yes
Log directory writable? ... yes
Tmp directory writable? ... yes
Uploads directory exists? ... yes
Uploads directory has correct permissions? ... yes
Uploads directory tmp has correct permissions? ... yes
Systemd unit files or init script exist? ... skipped (omnibus-gitlab has neither init script nor systemd units)
Systemd unit files or init script up-to-date? ... skipped (omnibus-gitlab has neither init script nor systemd units)
Projects have namespace: ...
2/1 ... yes
1/3 ... yes
9/4 ... yes
13/9 ... yes
13/11 ... yes
13/12 ... yes
13/16 ... yes
13/18 ... yes
15/19 ... yes
64/56 ... yes
102/58 ... yes
104/59 ... yes
13/63 ... yes
13/64 ... yes
13/65 ... yes
13/66 ... yes
13/67 ... yes
74/68 ... yes
87/69 ... yes
87/70 ... yes
93/71 ... yes
93/73 ... yes
96/75 ... yes
96/76 ... yes
99/77 ... yes
98/78 ... yes
99/80 ... yes
99/81 ... yes
100/83 ... yes
100/84 ... yes
101/85 ... yes
101/86 ... yes
101/87 ... yes
101/88 ... yes
104/89 ... yes
99/90 ... yes
106/92 ... yes
99/93 ... yes
99/94 ... yes
99/95 ... yes
113/97 ... yes
113/98 ... yes
116/99 ... yes
117/100 ... yes
99/101 ... yes
117/102 ... yes
130/103 ... yes
132/104 ... yes
133/105 ... yes
138/106 ... yes
10/107 ... yes
113/108 ... yes
91/109 ... yes
113/110 ... yes
141/111 ... yes
142/112 ... yes
99/113 ... yes
140/114 ... yes
116/119 ... yes
116/120 ... yes
116/121 ... yes
216/122 ... yes
136/123 ... yes
99/124 ... yes
99/125 ... yes
224/126 ... yes
99/127 ... yes
99/128 ... yes
229/129 ... yes
232/130 ... yes
234/131 ... yes
113/132 ... yes
246/133 ... yes
98/134 ... yes
251/135 ... yes
133/136 ... yes
133/137 ... yes
99/138 ... yes
98/139 ... yes
98/140 ... yes
260/141 ... yes
113/142 ... yes
261/144 ... yes
265/146 ... yes
265/148 ... yes
265/149 ... yes
265/150 ... yes
265/151 ... yes
224/153 ... yes
99/154 ... yes
98/155 ... yes
265/156 ... yes
285/157 ... yes
265/158 ... yes
133/159 ... yes
260/160 ... yes
136/161 ... yes
98/162 ... yes
100/163 ... yes
260/164 ... yes
260/165 ... yes
260/166 ... yes
260/167 ... yes
260/168 ... yes
98/169 ... yes
265/170 ... yes
260/171 ... yes
260/172 ... yes
260/173 ... yes
260/174 ... yes
260/175 ... yes
260/176 ... yes
325/177 ... yes
325/178 ... yes
Redis version >= 6.2.14? ... yes
Ruby version >= 3.0.6 ? ... yes (3.2.5)
Git user has default SSH configuration? ... yes
Active users: ... 50
Is authorized keys file accessible? ... yes
GitLab configured to store new projects in hashed storage? ... yes
All projects are in hashed storage? ... yes

Checking GitLab App ... Finished

Checking GitLab subtasks ... Finished

I am aware of the note that "we will only investigate if the tests are passing", but in this case the failing check is exactly the issue. After manually running "gitlab-ctl restart", all the tests pass, including the internal API check:

Internal API available: OK
Redis available via internal API: OK
gitlab-shell self-check successful

Possible fixes

Edited by naegleria