502 Bad Gateway errors after upgrade from 10.8.4 to 11.0

Summary

After upgrading from 10.8.4 to 11.0 with apt-get upgrade, I got only 502 Bad Gateway errors. I solved it, but it took hours and a lot of frustration to figure out the problem, so I am reporting it in the hope that others will not have to go through the same.

Steps to reproduce

  1. Relevant settings in the gitlab.rb file:

    unicorn['port'] = 8200
    gitlab_workhorse['auth_backend'] = "http://127.0.0.1:8200"

    All other unicorn and gitlab_workhorse variables were left at their defaults. This is important.

  2. sudo apt-get update

  3. sudo apt-get upgrade

What is the current bug behavior?

I got only 502 Bad Gateway errors when opening GitLab in the browser.

What is the expected correct behavior?

GitLab works normally after the upgrade.

Details of troubleshooting

Troubleshooting processes and ports

At first it was strange that everything seemed to be running fine. Only inspecting the listening processes gave me any clue about what was going on:

$ sudo ss -lptn 'sport = :8200'
State    Recv-Q  Send-Q  Local Address:Port  Peer Address:Port
LISTEN   0       1024    ::1:8200            :::*
users:(("bundle",pid=21537,fd=24),("bundle",pid=21534,fd=24),("bundle",pid=21531,fd=24),("bundle",pid=21528,fd=24),("bundle",pid=21525,fd=24),("bundle",pid=21479,fd=24))

$ systemctl status gitlab-runner.service
gitlab-runner[18080]: time="2018-06-23T10:25:44+02:00" level=warning msg="Checking for jobs... failed" runner=a97db987 status="502 Bad Gateway"
gitlab-runner[18080]: time="2018-06-23T10:25:47+02:00" level=warning msg="Checking for jobs... failed" runner=a97db987 status="502 Bad Gateway"

Relevant logs
Relevant parts of /var/log/gitlab/gitlab-workhorse/current:
2018-06-22_19:35:25.30399 gitlab-workhorse @ - - [2018/06/22:21:35:25 +0200] "POST /api/v4/jobs/request HTTP/1.0" 204 0 "" "gitlab-runner 11.0.0 (11-0-stable; go1.8.7; linux/amd64)" 0.000
2018-06-22_19:35:28.30586 gitlab-workhorse @ - - [2018/06/22:21:35:28 +0200] "POST /api/v4/jobs/request HTTP/1.0" 204 0 "" "gitlab-runner 11.0.0 (11-0-stable; go1.8.7; linux/amd64)" 0.000
2018-06-22_19:35:31.30627 time="2018-06-22T21:35:31+02:00" level=error msg=error error="badgateway: failed after 0s: dial tcp 127.0.0.1:8200: getsockopt: connection refused" method=POST uri=/api/v4/jobs/request
2018-06-22_19:35:31.30641 gitlab-workhorse @ - - [2018/06/22:21:35:31 +0200] "POST /api/v4/jobs/request HTTP/1.0" 502 24 "" "gitlab-runner 11.0.0 (11-0-stable; go1.8.7; linux/amd64)" 0.001
2018-06-22_19:35:34.30648 time="2018-06-22T21:35:34+02:00" level=error msg=error error="badgateway: failed after 0s: dial tcp 127.0.0.1:8200: getsockopt: connection refused" method=POST uri=/api/v4/jobs/request

Proof of behavioral change in /var/log/gitlab/unicorn/unicorn_stderr.log

BEFORE:
I, [2018-06-03T00:38:45.876424 #27021] INFO -- : Refreshing Gem list
I, [2018-06-03T00:39:31.755580 #27021] INFO -- : listening on addr=127.0.0.1:8200 fd=24
I, [2018-06-03T00:39:31.755761 #27021] INFO -- : unlinking existing socket=/var/opt/gitlab/gitlab-rails/sockets/gitlab.socket
I, [2018-06-03T00:39:31.755917 #27021] INFO -- : listening on addr=/var/opt/gitlab/gitlab-rails/sockets/gitlab.socket fd=25
I, [2018-06-03T00:39:31.861706 #27021] INFO -- : master process ready
I, [2018-06-03T00:39:31.916296 #27170] INFO -- : worker=1 ready
I, [2018-06-03T00:39:31.921632 #27167] INFO -- : worker=0 ready
I, [2018-06-03T00:39:31.955684 #27173] INFO -- : worker=2 ready
I, [2018-06-03T00:39:32.007350 #27176] INFO -- : worker=3 ready
I, [2018-06-03T00:39:32.023315 #27179] INFO -- : worker=4 ready

and AFTER the upgrade:
I, [2018-06-23T10:25:33.424428 #18058] INFO -- : Refreshing Gem list
I, [2018-06-23T10:26:00.212393 #18058] INFO -- : listening on addr=[::1]:8200 fd=24
I, [2018-06-23T10:26:00.212529 #18058] INFO -- : unlinking existing socket=/var/opt/gitlab/gitlab-rails/sockets/gitlab.socket
I, [2018-06-23T10:26:00.212681 #18058] INFO -- : listening on addr=/var/opt/gitlab/gitlab-rails/sockets/gitlab.socket fd=25
I, [2018-06-23T10:26:00.245306 #18058] INFO -- : master process ready
I, [2018-06-23T10:26:00.307602 #18170] INFO -- : worker=0 ready
I, [2018-06-23T10:26:00.317529 #18176] INFO -- : worker=2 ready
I, [2018-06-23T10:26:00.332280 #18179] INFO -- : worker=3 ready
I, [2018-06-23T10:26:00.335263 #18173] INFO -- : worker=1 ready
I, [2018-06-23T10:26:00.366180 #18182] INFO -- : worker=4 ready
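
For anyone who wants to check their own unicorn_stderr.log for the same change, here is a minimal Ruby sketch that pulls out the addresses unicorn reports listening on. The names LISTEN_RE, listening_addrs and tcp_addrs are made up for this illustration and are not part of GitLab; the pattern assumes the "listening on addr=... fd=..." format shown in the logs above.

```ruby
# Sketch: list the addresses unicorn reports binding to in unicorn_stderr.log.
# LISTEN_RE, listening_addrs and tcp_addrs are illustrative names, not GitLab tooling.
LISTEN_RE = /listening on addr=(\S+) fd=\d+/

# Every address unicorn logged (TCP and Unix socket alike), in log order.
def listening_addrs(log_text)
  log_text.scan(LISTEN_RE).flatten
end

# Only the TCP listeners; Unix socket paths start with "/".
def tcp_addrs(log_text)
  listening_addrs(log_text).reject { |addr| addr.start_with?('/') }
end
```

Calling tcp_addrs(File.read('/var/log/gitlab/unicorn/unicorn_stderr.log')).uniq should list each distinct TCP address the log has ever recorded, so a switch from 127.0.0.1:8200 to [::1]:8200 shows up as two entries.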

Details of package version

gitlab-ee 11.0.0-ee.0

Environment details

  • Operating System: Ubuntu 16.04
  • Installation Target: Bare Metal Machine
  • Installation Type: Upgrade from version 10.8.4
  • Is this a single or multiple node installation? single

The problem

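In 10.8.4, unicorn listened on the IPv4 loopback address (127.0.0.1:8200). In 11.0 it listens on the IPv6 loopback address ([::1]:8200) instead, so gitlab-workhorse, which is configured to connect to http://127.0.0.1:8200, gets "connection refused" and cannot reach the Rails application.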
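
A quick way to confirm this kind of mismatch without reading logs is to try connecting to the port on both loopback addresses. The following is a rough sketch using Ruby's standard socket library; accepts_tcp? and loopback_report are hypothetical helper names, and the one-second timeout is an arbitrary choice.

```ruby
require 'socket'

# Sketch: check which loopback addresses accept TCP connections on a port.
# accepts_tcp? and loopback_report are illustrative names, not GitLab tooling.

# True if a TCP connection to host:port succeeds within the timeout (seconds).
def accepts_tcp?(host, port, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Refused, unreachable, timed out, or address family unavailable.
  false
end

# Map each loopback address to whether something accepts connections there.
def loopback_report(port, hosts: ['127.0.0.1', '::1'])
  hosts.map { |host| [host, accepts_tcp?(host, port)] }.to_h
end
```

With the settings from this report, loopback_report(8200) would be expected to show false for 127.0.0.1 and true for ::1 on an affected 11.0 install, which matches the "connection refused" errors in the gitlab-workhorse log.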

The solution

Explicitly specify socket settings in gitlab.rb so that gitlab-workhorse and unicorn can talk to each other:

unicorn['socket'] = '/var/opt/gitlab/gitlab-rails/sockets/gitlab.socket'
gitlab_workhorse['auth_socket'] = "/var/opt/gitlab/gitlab-rails/sockets/gitlab.socket"

Suggestion

There should be validators for configuration options that do not work together. At the very least, the user should be notified that a default has changed.

Also, defaults should not change without notice. The first thing I did was search for relevant upgrade notes or a changelog entry, but I found nothing about this.

Edited by György Kiss