Bypasses the NGINX ingress on canary https git

Production Change

Change Summary

After the investigation in delivery#1497 (closed), we have decided to allow HTTPS Git clients to connect directly to Workhorse from HAProxy instead of going through the NGINX ingress. This simplifies our configuration, removes a network hop, and saves money, since we will no longer need compute resources for NGINX.

This configuration change has been running in Staging since last week without any issues.

Change Details

Services Impacted - HTTPS Git
Change Technician - @jarv
Change Criticality - C3
Change Type - changeunscheduled, changescheduled
Change Reviewer - @skarbek
Due Date - 2021-02-02 - 2021-02-03
Time tracking - 2 days
Downtime Component - none

Detailed steps for the change

Note: The starting weight for the canary backend is 5.

Drain canary backends

 ./bin/set-weights gprd gke-cny-git 0

Merge and apply https://ops.gitlab.net/gitlab-cookbooks/chef-repo/-/merge_requests/4953
Ramp up the traffic slowly on canary while monitoring

 ./bin/set-weights gprd gke-cny-git 2 https_git
 ./bin/set-weights gprd gke-cny-git 5 https_git
 ./bin/set-weights gprd gke-cny-git 1 canary_https_git
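
The ramp-up above could be wrapped in a small script that pauses between steps so the dashboards can be checked before moving on. This is only a sketch: `SET_WEIGHTS`, `DRY_RUN`, `PAUSE_SECONDS`, `run_step`, and `ramp_up` are hypothetical names introduced here, and `./bin/set-weights` is called with exactly the arguments listed in the steps, with no extra flags assumed. The pause length is not specified in this issue, so it defaults to 0 and should be set by the technician.

```shell
#!/usr/bin/env bash
# Sketch: replay the ramp-up steps from this issue with a pause between them.
# All variable and function names below are hypothetical, not part of set-weights.
set -euo pipefail

SET_WEIGHTS="${SET_WEIGHTS:-./bin/set-weights}"  # path to the existing script
DRY_RUN="${DRY_RUN:-1}"                          # 1 = only print the commands
PAUSE_SECONDS="${PAUSE_SECONDS:-0}"              # monitoring pause; not given in the issue

# Print the command in dry-run mode, otherwise execute it.
run_step() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "DRY-RUN: $SET_WEIGHTS $*"
  else
    "$SET_WEIGHTS" "$@"
  fi
}

# The three ramp-up commands, copied verbatim from the steps above.
ramp_up() {
  run_step gprd gke-cny-git 2 https_git
  sleep "$PAUSE_SECONDS"
  run_step gprd gke-cny-git 5 https_git
  sleep "$PAUSE_SECONDS"
  run_step gprd gke-cny-git 1 canary_https_git
}

ramp_up
```

Running it as-is only prints the three commands. Something like `DRY_RUN=0 PAUSE_SECONDS=300 ./ramp-up.sh` (a hypothetical file name and interval) would execute them with a five-minute pause between steps. With `set -e`, a failing step stops the script before the next weight change.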

Rollback

Drain canary backends

 ./bin/set-weights gprd gke-cny-git 0

Revert and apply https://ops.gitlab.net/gitlab-cookbooks/chef-repo/-/merge_requests/4953

Monitoring

Summary of infrastructure changes

Does this change introduce new compute instances?
Does this change re-size any existing compute instances?
Does this change introduce any additional usage of tooling like Elasticsearch, CDNs, Cloudflare, etc.?

Summary of the above

Changes checklist

This issue has a criticality label (e.g. C1, C2, C3, C4) and a change-type label (e.g. changeunscheduled, changescheduled) based on the Change Management Criticalities.
This issue has the change technician as the assignee.
Pre-Change, Change, Post-Change, and Rollback steps have been filled out and reviewed.
Necessary approvals have been completed based on the Change Management Workflow.
Change has been tested in staging and results noted in a comment on this issue.
A dry-run has been conducted and results noted in a comment on this issue.
SRE on-call has been informed prior to change being rolled out. (In #production channel, mention @sre-oncall and this issue and await their acknowledgement.)
There are currently no active incidents.

Edited Feb 02, 2021 by John Jarvis