8.15.0 RC2 does not seem to work with HAProxy on staging
We tried deploying 8.15.0 RC2 to staging, but we kept getting a 500 Error. Upon more investigation, it appears that the HAProxy health check (defined via httpchk line) is failing to access /:
x.x.x.x - - [20/Dec/2016:00:53:22 +0000] "GET / HTTP/1.0" 400 45 "-" "-"
When we enabled nginx debug logs on the staging server, we see:
2016/12/20 00:36:56 [debug] 58459#0: *1 http proxy header:
"GET / HTTP/1.1^M
X-Real-IP: 192.168.x.x^M
X-Forwarded-For: 192.168.x.x^M
Connection: close^M
X-Forwarded-Proto: https^M
X-Forwarded-Ssl: on^M
^M
"
2016/12/20 00:36:56 [debug] 58459#0: *1 http cleanup add: 00000000013FDED8
2016/12/20 00:36:56 [debug] 58459#0: *1 get rr peer, try: 1
2016/12/20 00:36:56 [debug] 58459#0: *1 stream socket 30
2016/12/20 00:36:56 [debug] 58459#0: *1 epoll add connection: fd:30 ev:80002005
2016/12/20 00:36:56 [debug] 58459#0: *1 connect to unix:/var/opt/gitlab/gitlab-workhorse/socket, fd:30 #2
2016/12/20 00:36:56 [debug] 58459#0: *1 connected
2016/12/20 00:36:56 [debug] 58459#0: *1 http upstream connect: 0
2016/12/20 00:36:56 [debug] 58459#0: *1 posix_memalign: 00000000013EC580:128 @16
2016/12/20 00:36:56 [debug] 58459#0: *1 http upstream send request
2016/12/20 00:36:56 [debug] 58459#0: *1 http upstream send request body
2016/12/20 00:36:56 [debug] 58459#0: *1 chain writer buf fl:1 s:138
2016/12/20 00:36:56 [debug] 58459#0: *1 chain writer in: 00000000013FDF10
2016/12/20 00:36:56 [debug] 58459#0: *1 writev: 138 of 138
2016/12/20 00:36:56 [debug] 58459#0: *1 chain writer out: 0000000000000000
2016/12/20 00:36:56 [debug] 58459#0: *1 event timer add: 30: 3600000:1482197816482
2016/12/20 00:36:56 [debug] 58459#0: *1 http finalize request: -4, "/?" a:1, c:2
2016/12/20 00:36:56 [debug] 58459#0: *1 http request count:2 blk:0
2016/12/20 00:36:56 [debug] 58459#0: *1 post event 00007FCB87659190
2016/12/20 00:36:56 [debug] 58459#0: *1 post event 00007FCB876591F0
2016/12/20 00:36:56 [debug] 58459#0: *1 delete posted event 00007FCB87659190
2016/12/20 00:36:56 [debug] 58459#0: *1 http run request: "/?"
2016/12/20 00:36:56 [debug] 58459#0: *1 http upstream check client, write event:1, "/"
2016/12/20 00:36:56 [debug] 58459#0: *1 http upstream recv(): -1 (11: Resource temporarily unavailable)
2016/12/20 00:36:56 [debug] 58459#0: *1 delete posted event 00007FCB876591F0
2016/12/20 00:36:56 [debug] 58459#0: *1 http upstream request: "/?"
2016/12/20 00:36:56 [debug] 58459#0: *1 http upstream dummy handler
2016/12/20 00:36:56 [debug] 58459#0: *1 post event 00007FCB876591F0
2016/12/20 00:36:56 [debug] 58459#0: *1 delete posted event 00007FCB876591F0
2016/12/20 00:36:56 [debug] 58459#0: *1 http upstream request: "/?"
2016/12/20 00:36:56 [debug] 58459#0: *1 http upstream dummy handler
2016/12/20 00:36:56 [debug] 58459#0: *1 post event 00007FCB8774A1F0
2016/12/20 00:36:56 [debug] 58459#0: *1 post event 00007FCB876591F0
2016/12/20 00:36:56 [debug] 58459#0: *1 delete posted event 00007FCB8774A1F0
2016/12/20 00:36:56 [debug] 58459#0: *1 http upstream request: "/?"
2016/12/20 00:36:56 [debug] 58459#0: *1 http upstream process header
2016/12/20 00:36:56 [debug] 58459#0: *1 malloc: 00000000013FE450:4096
2016/12/20 00:36:56 [debug] 58459#0: *1 recv: fd:30 118 of 3921
2016/12/20 00:36:56 [debug] 58459#0: *1 http proxy status 400 "400 Bad Request"
2016/12/20 00:36:56 [debug] 58459#0: *1 http proxy header: "Content-Type: text/plain"
2016/12/20 00:36:56 [debug] 58459#0: *1 http proxy header: "Connection: close"
2016/12/20 00:36:56 [debug] 58459#0: *1 http proxy header done
2016/12/20 00:36:56 [debug] 58459#0: *1 HTTP/1.1 400 Bad Request^M
Notice how nginx seems to attempt to talk to Workhorse, but then it gets a bad recv() call.
When I issue curl to the same host with HTTP/1.0 or HTTP/1.1, I get a valid 302 value, not 400.
Disabling the httpchk line on HAProxy makes things work.
I'm concerned this would happen in production, so I'm not sure if we should deploy until we get to the root of this problem. The alternative is to deploy, comment out the health check, and do it some other way (e.g. http://serverfault.com/questions/664332/haproxy-returns-bad-request-invalid-host-for-seemingly-no-reason).
/cc: @jacobvosmaer-gitlab, @dbalexandre, @jameslopez, @ahanselka, @northrup, @pcarranza