# Considerations for bypassing NGINX for WebSockets and HTTPS Git
GitLab.com is considering removing the NGINX ingress controller from the request path for current HTTPS Git and WebSockets traffic. This issue is to discuss the merits of this approach and any considerations we need to address before we connect Workhorse directly to HAProxy.
The motivation for this change is as follows:
- Reduce complexity by eliminating a service that currently does little beyond acting as a reverse proxy
- Improve reliability by no longer having to deal with nginx-controller pod cycling
- Reduce cost by not dedicating CPU/memory resources to NGINX
To start, these diagrams illustrate the current and proposed configurations. The proposal removes the NGINX controller from the request path, connecting the internal GCP LB directly to Workhorse, which reduces complexity:
### Current

```mermaid
graph LR
  subgraph GitLab.com
    CloudFlare
  end
  subgraph GCPLoadBalancer
    external-lb
  end
  subgraph HAProxy
    fe
  end
  subgraph GCPInternalLB
    internal-lb
  end
  subgraph NGINXPod
    nginx-controller
  end
  subgraph WebServicePod
    workhorse
    puma
  end
  CloudFlare -->|https| external-lb
  external-lb -->|https| fe
  fe -->|https| internal-lb
  internal-lb -->|https| nginx-controller
  nginx-controller -->|http| workhorse
  workhorse -->|http| puma
```
### Proposed

```mermaid
graph LR
  subgraph GitLab.com
    CloudFlare
  end
  subgraph GCPExternalLB
    external-lb
  end
  subgraph HAProxy
    fe
  end
  subgraph GCPInternalLB
    internal-lb
  end
  subgraph WebServicePod
    workhorse
    puma
  end
  CloudFlare -->|https| external-lb
  external-lb -->|https| fe
  fe -->|http| internal-lb
  internal-lb -->|http| workhorse
  workhorse -->|http| puma
```
For this configuration change, we need to ensure that we won't run into any surprises when connecting Workhorse directly to HAProxy.
<details>
<summary>Current NGINX configuration (pulled from staging)</summary>

```
# Configuration checksum: 16258588997816761317
# setup custom paths that do not require root access
pid /tmp/nginx.pid;
load_module /etc/nginx/modules/ngx_http_modsecurity_module.so;
daemon off;
worker_processes 4;
worker_rlimit_nofile 261120;
worker_shutdown_timeout 10s ;
events {
multi_accept on;
worker_connections 16384;
use epoll;
}
http {
lua_package_cpath "/usr/local/lib/lua/?.so;/usr/lib/lua-platform-path/lua/5.1/?.so;;";
lua_package_path "/etc/nginx/lua/?.lua;/etc/nginx/lua/vendor/?.lua;/usr/local/lib/lua/?.lua;;";
lua_shared_dict configuration_data 5M;
lua_shared_dict certificate_data 16M;
lua_shared_dict locks 512k;
lua_shared_dict sticky_sessions 1M;
init_by_lua_block {
require("resty.core")
collectgarbage("collect")
local lua_resty_waf = require("resty.waf")
lua_resty_waf.init()
-- init modules
local ok, res
ok, res = pcall(require, "configuration")
if not ok then
error("require failed: " .. tostring(res))
else
configuration = res
configuration.nameservers = { "10.228.0.10" }
end
ok, res = pcall(require, "balancer")
if not ok then
error("require failed: " .. tostring(res))
else
balancer = res
end
ok, res = pcall(require, "monitor")
if not ok then
error("require failed: " .. tostring(res))
else
monitor = res
end
}
init_worker_by_lua_block {
balancer.init_worker()
monitor.init_worker()
}
real_ip_header X-Forwarded-For;
real_ip_recursive on;
set_real_ip_from 0.0.0.0/0;
geoip_country /etc/nginx/geoip/GeoIP.dat;
geoip_city /etc/nginx/geoip/GeoLiteCity.dat;
geoip_org /etc/nginx/geoip/GeoIPASNum.dat;
geoip_proxy_recursive on;
aio threads;
aio_write on;
tcp_nopush on;
tcp_nodelay on;
log_subrequest on;
reset_timedout_connection on;
keepalive_timeout 75s;
keepalive_requests 100;
client_body_temp_path /tmp/client-body;
fastcgi_temp_path /tmp/fastcgi-temp;
proxy_temp_path /tmp/proxy-temp;
ajp_temp_path /tmp/ajp-temp;
client_header_buffer_size 1k;
client_header_timeout 60s;
large_client_header_buffers 4 8k;
client_body_buffer_size 8k;
client_body_timeout 60s;
http2_max_field_size 4k;
http2_max_header_size 16k;
http2_max_requests 1000;
types_hash_max_size 2048;
server_names_hash_max_size 1024;
server_names_hash_bucket_size 256;
map_hash_bucket_size 64;
proxy_headers_hash_max_size 512;
proxy_headers_hash_bucket_size 64;
variables_hash_bucket_size 128;
variables_hash_max_size 2048;
underscores_in_headers off;
ignore_invalid_headers on;
limit_req_status 503;
include /etc/nginx/mime.types;
default_type text/html;
gzip on;
gzip_comp_level 5;
gzip_http_version 1.1;
gzip_min_length 256;
gzip_types application/atom+xml application/javascript application/x-javascript application/json application/rss+xml application/vnd.ms-fontobject application/x-font-ttf application/x-web-app-manifest+json application/xhtml+xml application/xml font/opentype image/svg+xml image/x-icon text/css text/plain text/x-component;
gzip_proxied any;
gzip_vary on;
# Custom headers for response
server_tokens off;
more_clear_headers Server;
# disable warnings
uninitialized_variable_warn off;
# Additional available variables:
# $namespace
# $ingress_name
# $service_name
# $service_port
log_format upstreaminfo '$the_real_ip - [$the_real_ip] - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" $request_length $request_time [$proxy_upstream_name] $upstream_addr $upstream_response_length $upstream_response_time $upstream_status $req_id';
map $request_uri $loggable {
default 1;
}
access_log /var/log/nginx/access.log upstreaminfo if=$loggable;
error_log /var/log/nginx/error.log notice;
resolver 10.228.0.10 valid=30s ipv6=off;
# See https://www.nginx.com/blog/websocket-nginx
map $http_upgrade $connection_upgrade {
default upgrade;
# See http://nginx.org/en/docs/http/ngx_http_upstream_module.html#keepalive
'' '';
}
# The following is a sneaky way to do "set $the_real_ip $remote_addr"
# Needed because using set is not allowed outside server blocks.
map '' $the_real_ip {
default $remote_addr;
}
# trust http_x_forwarded_proto headers correctly indicate ssl offloading
map $http_x_forwarded_proto $pass_access_scheme {
default $http_x_forwarded_proto;
'' $scheme;
}
map $http_x_forwarded_port $pass_server_port {
default $http_x_forwarded_port;
'' $server_port;
}
# Obtain best http host
map $http_host $this_host {
default $http_host;
'' $host;
}
map $http_x_forwarded_host $best_http_host {
default $http_x_forwarded_host;
'' $this_host;
}
# validate $pass_access_scheme and $scheme are http to force a redirect
map "$scheme:$pass_access_scheme" $redirect_to_https {
default 0;
"http:http" 1;
"https:http" 1;
}
map $pass_server_port $pass_port {
443 443;
default $pass_server_port;
}
# Reverse proxies can detect if a client provides a X-Request-ID header, and pass it on to the backend server.
# If no such header is provided, it can provide a random value.
map $http_x_request_id $req_id {
default $http_x_request_id;
"" $request_id;
}
# Create a variable that contains the literal $ character.
# This works because the geo module will not resolve variables.
geo $literal_dollar {
default "$";
}
server_name_in_redirect off;
port_in_redirect off;
ssl_protocols TLSv1.1 TLSv1.2;
# turn on session caching to drastically improve performance
ssl_session_cache builtin:1000 shared:SSL:10m;
ssl_session_timeout 10m;
# allow configuring ssl session tickets
ssl_session_tickets on;
# slightly reduce the time-to-first-byte
ssl_buffer_size 4k;
# allow configuring custom ssl ciphers
ssl_ciphers 'ECDHE-RSA-AES256-GCM-SHA384:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA256:ECDHE-RSA-AES256-SHA:ECDHE-RSA-AES128-SHA:AES256-GCM-SHA384:AES128-GCM-SHA256:AES256-SHA256:AES128-SHA256:AES256-SHA:AES128-SHA:!aNULL:!eNULL:!EXPORT:!DES:!MD5:!PSK:!RC4';
ssl_prefer_server_ciphers on;
ssl_ecdh_curve auto;
proxy_ssl_session_reuse on;
upstream upstream_balancer {
server 0.0.0.1; # placeholder
balancer_by_lua_block {
balancer.balance()
}
keepalive 32;
keepalive_timeout 30s;
keepalive_requests 0;
}
# Global filters
## start server _
server {
server_name _ ;
listen 80 default_server reuseport backlog=1024;
set $proxy_upstream_name "-";
listen 443 default_server reuseport backlog=1024 ssl ;
# PEM sha: 71e510cc2fb53e55867f5f2fff3413242db0c90e
ssl_certificate /etc/ingress-controller/ssl/default-fake-certificate.pem;
ssl_certificate_key /etc/ingress-controller/ssl/default-fake-certificate.pem;
location / {
set $namespace "";
set $ingress_name "";
set $service_name "";
set $service_port "0";
set $location_path "/";
rewrite_by_lua_block {
balancer.rewrite()
}
access_by_lua_block {
}
header_filter_by_lua_block {
}
body_filter_by_lua_block {
}
log_by_lua_block {
balancer.log()
monitor.call()
}
if ($scheme = https) {
more_set_headers "Strict-Transport-Security: max-age=15724800";
}
access_log off;
port_in_redirect off;
set $proxy_upstream_name "upstream-default-backend";
client_max_body_size 1m;
proxy_set_header Host $best_http_host;
# Pass the extracted client certificate to the backend
# Allow websocket connections
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $connection_upgrade;
proxy_set_header X-Request-ID $req_id;
proxy_set_header X-Real-IP $the_real_ip;
proxy_set_header X-Forwarded-For $the_real_ip;
proxy_set_header X-Forwarded-Host $best_http_host;
proxy_set_header X-Forwarded-Port $pass_port;
proxy_set_header X-Forwarded-Proto $pass_access_scheme;
proxy_set_header X-Original-URI $request_uri;
proxy_set_header X-Scheme $pass_access_scheme;
# Pass the original X-Forwarded-For
proxy_set_header X-Original-Forwarded-For $http_x_forwarded_for;
# mitigate HTTPoxy Vulnerability
# https://www.nginx.com/blog/mitigating-the-httpoxy-vulnerability-with-nginx/
proxy_set_header Proxy "";
# Custom headers to proxied server
proxy_set_header Referrer-Policy "strict-origin-when-cross-origin";
proxy_connect_timeout 5s;
proxy_send_timeout 60s;
proxy_read_timeout 60s;
proxy_buffering off;
proxy_buffer_size 4k;
proxy_buffers 4 4k;
proxy_request_buffering on;
proxy_http_version 1.1;
proxy_cookie_domain off;
proxy_cookie_path off;
# In case of errors try the next upstream server before returning an error
proxy_next_upstream error timeout;
proxy_next_upstream_tries 3;
proxy_pass http://upstream_balancer;
proxy_redirect off;
}
# health checks in cloud providers require the use of port 80
location /healthz {
access_log off;
return 200;
}
# this is required to avoid error if nginx is being monitored
# with an external software (like sysdig)
location /nginx_status {
allow 127.0.0.1;
deny all;
access_log off;
stub_status on;
}
}
## end server _
# backend for when default-backend-service is not configured or it does not have endpoints
server {
listen 8181 default_server reuseport backlog=1024;
set $proxy_upstream_name "-";
location / {
return 404;
}
}
# default server, used for NGINX healthcheck and access to nginx stats
server {
listen 18080 default_server reuseport backlog=1024;
set $proxy_upstream_name "-";
location /healthz {
access_log off;
return 200;
}
location /is-dynamic-lb-initialized {
access_log off;
content_by_lua_block {
local configuration = require("configuration")
local backend_data = configuration.get_backends_data()
if not backend_data then
ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
return
end
ngx.say("OK")
ngx.exit(ngx.HTTP_OK)
}
}
location /nginx_status {
set $proxy_upstream_name "internal";
access_log off;
stub_status on;
}
location /configuration {
access_log off;
allow 127.0.0.1;
deny all;
# this should be equals to configuration_data dict
client_max_body_size 10m;
client_body_buffer_size 10m;
proxy_buffering off;
content_by_lua_block {
configuration.call()
}
}
location / {
set $proxy_upstream_name "upstream-default-backend";
proxy_set_header Host $best_http_host;
proxy_pass http://upstream_balancer;
}
}
}
stream {
lua_package_cpath "/usr/local/lib/lua/?.so;/usr/lib/lua-platform-path/lua/5.1/?.so;;";
lua_package_path "/etc/nginx/lua/?.lua;/etc/nginx/lua/vendor/?.lua;/usr/local/lib/lua/?.lua;;";
lua_shared_dict tcp_udp_configuration_data 5M;
init_by_lua_block {
require("resty.core")
collectgarbage("collect")
-- init modules
local ok, res
ok, res = pcall(require, "tcp_udp_configuration")
if not ok then
error("require failed: " .. tostring(res))
else
tcp_udp_configuration = res
tcp_udp_configuration.nameservers = { "10.228.0.10" }
end
ok, res = pcall(require, "tcp_udp_balancer")
if not ok then
error("require failed: " .. tostring(res))
else
tcp_udp_balancer = res
end
}
init_worker_by_lua_block {
tcp_udp_balancer.init_worker()
}
lua_add_variable $proxy_upstream_name;
log_format log_stream [$time_local] $protocol $status $bytes_sent $bytes_received $session_time;
access_log /var/log/nginx/access.log log_stream;
error_log /var/log/nginx/error.log;
upstream upstream_balancer {
server 0.0.0.1:1234; # placeholder
balancer_by_lua_block {
tcp_udp_balancer.balance()
}
}
server {
listen unix:/tmp/ingress-stream.sock;
content_by_lua_block {
tcp_udp_configuration.call()
}
}
# TCP services
# UDP services
}
```
</details>
## Preserving IP
Workhorse supports the `X-Forwarded-For` header and will use it as the `RemoteAddr` for logging. We currently set `X-Forwarded-For` in HAProxy, so there should be no change here.
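For reference, a minimal sketch of how the header can be set at the HAProxy frontend. `option forwardfor` and `http-request set-header` are real HAProxy directives, but the section names, certificate path, and backend below are illustrative assumptions, not our actual config:

```
# Illustrative sketch only; names and paths are hypothetical.
frontend https_git
    mode http
    bind :443 ssl crt /etc/haproxy/ssl/example.pem
    # Append the client source address as an X-Forwarded-For header
    option forwardfor
    # Alternative explicit form that overwrites anything the client sent:
    # http-request set-header X-Forwarded-For %[src]
    default_backend web_service
```

Note that `option forwardfor` adds the client address to any `X-Forwarded-For` the client already sent rather than replacing it, which is consistent with the NGINX config above resolving the real client IP with `real_ip_recursive on`.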
## Keepalive configuration
Current NGINX config:

```
keepalive 32;
keepalive_timeout 30s;
keepalive_requests 0;
```
Keepalives improve performance by holding TCP connections open. We currently disable keepalives for all Kubernetes traffic because they resulted in connections remaining open to terminating pods (https://gitlab.com/gitlab-com/gl-infra/production/-/issues/3140).

If connection re-use becomes an issue later as we migrate more services to Kubernetes, we have options to enable it more aggressively at HAProxy (https://www.haproxy.com/blog/http-keep-alive-pipelining-multiplexing-and-connection-pooling/); a sketch of the relevant directive follows. For now, there should be no impact.
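As a rough illustration, the main HAProxy knob for server-side connection re-use is `http-reuse` (a real directive); the backend name and server address below are assumptions:

```
# Illustrative sketch only; backend name and addresses are hypothetical.
backend web_service
    mode http
    # Share idle server-side connections across client connections.
    # "safe" restricts re-use to requests that can be safely retried
    # if the pooled connection turns out to be dead.
    http-reuse safe
    server workhorse1 10.0.0.10:8181 check
```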
## GZIP
Current NGINX config:

```
gzip on;
gzip_comp_level 5;
gzip_http_version 1.1;
gzip_min_length 256;
gzip_types application/atom+xml application/javascript application/x-javascript application/json application/rss+xml application/vnd.ms-fontobject application/x-font-ttf application/x-web-app-manifest+json application/xhtml+xml application/xml font/opentype image/svg+xml image/x-icon text/css text/plain text/x-component;
gzip_proxied any;
gzip_vary on;
```
GZIP compression is currently enabled, so responses will no longer be compressed at this layer once we remove NGINX.
This can also be enabled at HAProxy with:

```
compression algo gzip
compression type text/html text/plain
```
I don't see this having a major impact for Git over HTTPS anyway. We might consider turning compression on at HAProxy for web/API traffic, though there is probably nothing we need to do here since it is covered by Cloudflare (https://support.cloudflare.com/hc/en-us/articles/200168086-Does-Cloudflare-compress-resources-).
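If we did enable it at HAProxy, a sketch of where those directives could live (the `compression` keywords are real HAProxy directives; the section choice and MIME-type list are illustrative assumptions):

```
# Illustrative sketch only; the type list is an assumption.
defaults
    mode http
    # Compress eligible responses with gzip
    compression algo gzip
    # Only compress text-like content; git packfiles are already
    # zlib-compressed and would gain little from another pass.
    compression type text/html text/plain text/css application/json application/javascript
```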
## Proxy buffering and proxy request buffering
This is already disabled, so there should be no impact there; we also already do buffering at Cloudflare (https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/10419#note_368918755).