Proxy GitLab Pages traffic through -listen-proxy instead of -listen-http
Production Change
Problem to solve
Currently, GitLab Pages logs internal IP addresses instead of clients' IPs. This happens because we use the listen_http flag instead of the listen_proxy one. With listen_http, the Pages daemon does not know that it is behind the load balancer and ignores proxy headers.
Logging isn't a big issue on its own, but we also want to implement IP-based rate-limiting, so we need accurate information about client IPs.
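To make the difference between the two listeners concrete, here is a minimal Python sketch of proxy-aware client IP resolution. It is not the Pages daemon's code. The function name `resolve_client_ip`, the `behind_proxy` switch, the sample addresses, and the choice to trust the leftmost `X-Forwarded-For` entry are all assumptions made for illustration.

```python
from typing import Mapping


def resolve_client_ip(peer_addr: str, headers: Mapping[str, str], behind_proxy: bool) -> str:
    """Return the address a server would log for a request.

    peer_addr is the TCP peer (the load balancer when one sits in front).
    behind_proxy is a hypothetical switch standing in for the choice between
    a plain HTTP listener and a proxy-aware listener.
    """
    if not behind_proxy:
        # Plain listener: proxy headers are ignored, so the TCP peer is logged.
        return peer_addr

    # Header names are case-insensitive; normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    forwarded = normalized.get("x-forwarded-for", "")
    # Assumption: the leftmost entry is the original client.
    first_hop = forwarded.split(",")[0].strip()
    # Fall back to the TCP peer when the header is missing or empty.
    return first_hop or peer_addr


# Hypothetical request: client 203.0.113.7 reaching the server via 10.0.0.5.
headers = {"X-Forwarded-For": "203.0.113.7, 10.0.0.5"}
print(resolve_client_ip("10.0.0.5", headers, behind_proxy=False))
print(resolve_client_ip("10.0.0.5", headers, behind_proxy=True))
```

Trusting the header is only reasonable when every request reaches the listener through the load balancer, because clients can set `X-Forwarded-For` themselves.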
Original thread: gitlab-org/gitlab-pages!594 (comment 701507574)
Change Summary
Change listen_http to listen_proxy in GitLab Pages config.
Change Details
- Services Impacted - Service::Pages
- Change Technician - @skarbek
- Change Reviewer - @hphilipps
- Time tracking - 30 minutes
- Downtime Component - 0
Detailed steps for the change
Pre-Change Steps - steps to be completed before execution of the change
Estimated Time to Complete (mins) - 5 minutes
- Set label change::in-progress on this issue
- curl http://vshushlin.gitlab.io/pagest-http-test/
- Open up the logging details below so that we can document the before and after
- Get approval on MR gitlab-com/gl-infra/k8s-workloads/gitlab-com!1399 (merged)
Change Steps - steps to take to execute the change
Estimated Time to Complete (mins) - 10 minutes
- Merge and watch application of MR: gitlab-com/gl-infra/k8s-workloads/gitlab-com!1399 (merged)
Post-Change Steps - steps to take to verify the change
Estimated Time to Complete (mins) - 1 minute
- curl http://vshushlin.gitlab.io/pagest-http-test/
- Proceed to Monitoring section
- Post before-and-after log samples proving that we are now gathering the client IPs
Rollback
Rollback steps - steps to be taken in the event of a need to rollback this change
Estimated Time to Complete (mins) - 10 minutes
- Revert and watch application of MR: gitlab-com/gl-infra/k8s-workloads/gitlab-com!1399 (merged)
Monitoring
Key metrics to observe
- Metric: Pages Apdex and Service Error Ratio
  - Location: https://dashboards.gitlab.net/d/web-pages-main/web-pages-overview?orgId=1
  - What changes to this metric should prompt a rollback: lower Apdex or increased errors
- Log Watching: https://log.gprd.gitlab.net/goto/b91ff9789f9342e7791196f0c03d631c
  - If we continue to see local IP addresses after the configuration change has been applied, something is wrong; proceed to rollback.
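The log check above can be sketched as a small script. This is an illustration only: the `remote_ip` field name, the sample entries, and the `internal_ip_entries` helper are hypothetical, and "local" is assumed here to mean RFC 1918 private ranges plus loopback.

```python
import ipaddress
from typing import Iterable, List, Mapping

# Assumption: "internal" means RFC 1918 private ranges plus loopback.
INTERNAL_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8")
]


def is_internal(ip: str) -> bool:
    """True if ip falls inside one of the assumed internal IPv4 ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in INTERNAL_NETWORKS)


def internal_ip_entries(
    entries: Iterable[Mapping[str, str]], field: str = "remote_ip"
) -> List[Mapping[str, str]]:
    """Return the log entries whose logged client address is still internal."""
    return [entry for entry in entries if is_internal(entry[field])]


# Hypothetical log samples taken after the change.
after_change = [
    {"remote_ip": "203.0.113.7", "uri": "/"},
    {"remote_ip": "10.220.4.12", "uri": "/index.html"},
]

suspicious = internal_ip_entries(after_change)
if suspicious:
    print(f"{len(suspicious)} sampled entries still log internal IPs; consider rollback")
else:
    print("all sampled entries log public client IPs")
```

Some internal addresses may remain legitimate after the change (for example, health checks originating inside the network), so a sketch like this is a prompt for a closer look at the logs rather than an automatic rollback trigger.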
 
Summary of infrastructure changes
- Does this change introduce new compute instances? No
- Does this change re-size any existing compute instances? No
- Does this change introduce any additional usage of tooling like Elastic Search, CDNs, Cloudflare, etc? No
Changes checklist
- This issue has a criticality label (e.g. C1, C2, C3, C4) and a change-type label (e.g. change::unscheduled, change::scheduled) based on the Change Management Criticalities.
- This issue has the change technician as the assignee.
- Pre-Change, Change, Post-Change, and Rollback steps have been filled out and reviewed.
- This Change Issue is linked to the appropriate Issue and/or Epic.
- Necessary approvals have been completed based on the Change Management Workflow.
- Change has been tested in staging and results noted in a comment on this issue.
- A dry-run has been conducted and results noted in a comment on this issue.
- SRE on-call has been informed prior to change being rolled out. (In #production channel, mention @sre-oncall and this issue and await their acknowledgement.)
- Release managers have been informed (If needed! Cases include DB change) prior to change being rolled out. (In #production channel, mention @release-managers and this issue and await their acknowledgment.)
- There are currently no active incidents.