502 Bad Gateway errors after upgrade from 10.8.4 to 11.0
Summary
After upgrading from 10.8.4 to 11.0 with apt-get upgrade, I only got 502 Bad Gateway errors. I solved it, but it took hours and a lot of frustration to figure out the problem, so I am reporting it in the hope that others will not have to go through the same.
Steps to reproduce
- Relevant settings in the gitlab.rb file:
  unicorn['port'] = 8200
  gitlab_workhorse['auth_backend'] = "http://127.0.0.1:8200"
  Every other unicorn and gitlab_workhorse variable is left at its default. This is important.
- sudo apt-get update
- sudo apt-get upgrade
What is the current bug behavior?
Opening GitLab in the browser only returned 502 Bad Gateway errors.
What is the expected correct behavior?
GitLab should work normally, as it did before the upgrade.
Details of troubleshooting
Troubleshooting processes and ports
At first it was strange that everything seemed to be running fine. Only inspecting the listening processes gave me a clue about what was going on:
$ sudo ss -lptn 'sport = :8200'
State   Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN  0       1024    ::1:8200            :::*                users:(("bundle",pid=21537,fd=24),("bundle",pid=21534,fd=24),("bundle",pid=21531,fd=24),("bundle",pid=21528,fd=24),("bundle",pid=21525,fd=24),("bundle",pid=21479,fd=24))
The only listener on port 8200 was bound to ::1 (the IPv6 loopback address), not to 127.0.0.1. Meanwhile, gitlab-runner kept reporting the same 502:
$ systemctl status gitlab-runner.service
gitlab-runner[18080]: time="2018-06-23T10:25:44+02:00" level=warning msg="Checking for jobs... failed" runner=a97db987 status="502 Bad Gateway"
gitlab-runner[18080]: time="2018-06-23T10:25:47+02:00" level=warning msg="Checking for jobs... failed" runner=a97db987 status="502 Bad Gateway"
Relevant logs
Relevant parts of /var/log/gitlab/gitlab-workhorse/current:
2018-06-22_19:35:25.30399 gitlab-workhorse @ - - [2018/06/22:21:35:25 +0200] "POST /api/v4/jobs/request HTTP/1.0" 204 0 "" "gitlab-runner 11.0.0 (11-0-stable; go1.8.7; linux/amd64)" 0.000
2018-06-22_19:35:28.30586 gitlab-workhorse @ - - [2018/06/22:21:35:28 +0200] "POST /api/v4/jobs/request HTTP/1.0" 204 0 "" "gitlab-runner 11.0.0 (11-0-stable; go1.8.7; linux/amd64)" 0.000
2018-06-22_19:35:31.30627 time="2018-06-22T21:35:31+02:00" level=error msg=error error="badgateway: failed after 0s: dial tcp 127.0.0.1:8200: getsockopt: connection refused" method=POST uri=/api/v4/jobs/request
2018-06-22_19:35:31.30641 gitlab-workhorse @ - - [2018/06/22:21:35:31 +0200] "POST /api/v4/jobs/request HTTP/1.0" 502 24 "" "gitlab-runner 11.0.0 (11-0-stable; go1.8.7; linux/amd64)" 0.001
2018-06-22_19:35:34.30648 time="2018-06-22T21:35:34+02:00" level=error msg=error error="badgateway: failed after 0s: dial tcp 127.0.0.1:8200: getsockopt: connection refused" method=POST uri=/api/v4/jobs/request
Proof of behavioral change in /var/log/gitlab/unicorn/unicorn_stderr.log
BEFORE:
I, [2018-06-03T00:38:45.876424 #27021] INFO -- : Refreshing Gem list
I, [2018-06-03T00:39:31.755580 #27021] INFO -- : listening on addr=127.0.0.1:8200 fd=24
I, [2018-06-03T00:39:31.755761 #27021] INFO -- : unlinking existing socket=/var/opt/gitlab/gitlab-rails/sockets/gitlab.socket
I, [2018-06-03T00:39:31.755917 #27021] INFO -- : listening on addr=/var/opt/gitlab/gitlab-rails/sockets/gitlab.socket fd=25
I, [2018-06-03T00:39:31.861706 #27021] INFO -- : master process ready
I, [2018-06-03T00:39:31.916296 #27170] INFO -- : worker=1 ready
I, [2018-06-03T00:39:31.921632 #27167] INFO -- : worker=0 ready
I, [2018-06-03T00:39:31.955684 #27173] INFO -- : worker=2 ready
I, [2018-06-03T00:39:32.007350 #27176] INFO -- : worker=3 ready
I, [2018-06-03T00:39:32.023315 #27179] INFO -- : worker=4 ready
and AFTER the upgrade:
I, [2018-06-23T10:25:33.424428 #18058] INFO -- : Refreshing Gem list
I, [2018-06-23T10:26:00.212393 #18058] INFO -- : listening on addr=[::1]:8200 fd=24
I, [2018-06-23T10:26:00.212529 #18058] INFO -- : unlinking existing socket=/var/opt/gitlab/gitlab-rails/sockets/gitlab.socket
I, [2018-06-23T10:26:00.212681 #18058] INFO -- : listening on addr=/var/opt/gitlab/gitlab-rails/sockets/gitlab.socket fd=25
I, [2018-06-23T10:26:00.245306 #18058] INFO -- : master process ready
I, [2018-06-23T10:26:00.307602 #18170] INFO -- : worker=0 ready
I, [2018-06-23T10:26:00.317529 #18176] INFO -- : worker=2 ready
I, [2018-06-23T10:26:00.332280 #18179] INFO -- : worker=3 ready
I, [2018-06-23T10:26:00.335263 #18173] INFO -- : worker=1 ready
I, [2018-06-23T10:26:00.366180 #18182] INFO -- : worker=4 ready
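To compare the two logs quickly, the "listening on addr=" lines can be pulled out with a few lines of Ruby. This is only a sketch: listen_addrs is a hypothetical helper name, and the regular expression assumes the unicorn log format shown in this report.

```ruby
# Hypothetical helper: collect every address that unicorn reports in
# "listening on addr=<address> fd=<n>" lines of unicorn_stderr.log.
def listen_addrs(log_text)
  log_text.scan(/listening on addr=(\S+) fd=\d+/).flatten
end

# One line from each log in this report.
before = 'I, [2018-06-03T00:39:31.755580 #27021] INFO -- : listening on addr=127.0.0.1:8200 fd=24'
after  = 'I, [2018-06-23T10:26:00.212393 #18058] INFO -- : listening on addr=[::1]:8200 fd=24'

puts listen_addrs(before).inspect  # prints ["127.0.0.1:8200"]
puts listen_addrs(after).inspect   # prints ["[::1]:8200"]
```

Feeding it File.read('/var/log/gitlab/unicorn/unicorn_stderr.log') instead of a single line would list every address unicorn has bound across restarts, which makes a change like this one easy to spot.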
Details of package version
gitlab-ee 11.0.0-ee.0
Environment details
- Operating System: Ubuntu 16.04
- Installation Target: Bare Metal Machine
- Installation Type: Upgrade from version 10.8.4
- Is this a single or multiple node installation? single
The problem
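In 10.8.4, unicorn listened on the IPv4 loopback address (127.0.0.1:8200). In 11.0 it listens only on the IPv6 loopback address ([::1]:8200), so gitlab-workhorse, which is configured to connect to http://127.0.0.1:8200, gets "connection refused" and cannot reach the Rails application.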
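The failure mode can be reproduced outside GitLab. The Ruby sketch below binds a listener to ::1 only and then connects to the same port on 127.0.0.1, which is effectively what gitlab-workhorse was doing. The method name is made up for illustration, and on a host without an IPv6 loopback it simply reports that instead.

```ruby
require 'socket'

# Hypothetical demo: listen on the IPv6 loopback only, then try to reach
# that port through the IPv4 loopback address.
def probe_ipv4_against_ipv6_listener
  server = TCPServer.new('::1', 0)   # port 0 lets the OS pick a free port
  port = server.addr[1]
  begin
    TCPSocket.new('127.0.0.1', port).close
    :connected
  rescue Errno::ECONNREFUSED
    :refused                         # same error gitlab-workhorse logged
  ensure
    server.close
  end
rescue Errno::EADDRNOTAVAIL, Errno::EAFNOSUPPORT, SocketError
  :ipv6_unavailable                  # could not bind to ::1 on this host
end

puts probe_ipv4_against_ipv6_listener
```

On a Linux host with IPv6 enabled this should print refused: 127.0.0.1 and ::1 are different addresses, so a listener bound to one of them does not answer on the other.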
The solution
Explicitly specify the socket settings in gitlab.rb so that gitlab-workhorse and unicorn can talk to each other:
unicorn['socket'] = '/var/opt/gitlab/gitlab-rails/sockets/gitlab.socket'
gitlab_workhorse['auth_socket'] = "/var/opt/gitlab/gitlab-rails/sockets/gitlab.socket"
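After editing gitlab.rb and applying it with sudo gitlab-ctl reconfigure, one quick way to confirm that something is accepting connections on the socket is to try connecting to it. A sketch, with socket_accepting? being a hypothetical helper. It has to run as a user that is allowed to access the socket, and a successful connect only shows that some process is listening there, not that it is unicorn.

```ruby
require 'socket'

# Hypothetical helper: true if a process accepts connections on the
# Unix domain socket at +path+, false if the path is missing, stale,
# or not accessible to the current user.
def socket_accepting?(path)
  UNIXSocket.new(path).close
  true
rescue Errno::ENOENT, Errno::ECONNREFUSED, Errno::EACCES
  false
end

puts socket_accepting?('/var/opt/gitlab/gitlab-rails/sockets/gitlab.socket')
```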
Suggestion
There should be validators for configuration options that cannot work together. At the very least, the user should be notified that a default has changed.
Also, defaults should not change without notice. The first thing I did was search the update notes and changelog, but I found nothing about this.
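As an illustration of the kind of validation meant here, a check along the following lines could flag the combination from this report. It is a sketch, not Omnibus code: backend_reachable? and its parameters are hypothetical, it only understands IP literals (a hostname such as localhost raises IPAddr::InvalidAddressError), and it treats every cross-family combination as a mismatch, even though a dual-stack wildcard listener may in practice accept IPv4 connections.

```ruby
require 'ipaddr'
require 'uri'

# Hypothetical check: can a client using +auth_backend+ reach a server
# that listens on +listen_addr+:+listen_port+?
def backend_reachable?(listen_addr, listen_port, auth_backend)
  uri = URI.parse(auth_backend)
  return false unless uri.port == listen_port

  listen = IPAddr.new(listen_addr)
  target = IPAddr.new(uri.hostname)  # hostname drops the [] around IPv6 literals
  # An IPv4 target needs an IPv4 listener, and likewise for IPv6.
  return false unless listen.ipv4? == target.ipv4?

  # A wildcard listener covers every address of its own family.
  wildcard = IPAddr.new(listen.ipv4? ? '0.0.0.0' : '::')
  listen == wildcard || listen == target
end

puts backend_reachable?('127.0.0.1', 8200, 'http://127.0.0.1:8200')  # prints true
puts backend_reachable?('::1', 8200, 'http://127.0.0.1:8200')        # prints false
```

The second call is the situation after the upgrade: the listener is on ::1 while the backend URL still points at 127.0.0.1. A warning at reconfigure time for that case would have pointed straight at the problem.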