Distribution FCL for incident 5521
Why this issue
Based on the new FCL process (https://about.gitlab.com/handbook/engineering/#feature-change-locks), we are putting a Feature Change Lock in place from 2021-09-23 to 2021-09-30 due to incident gitlab-com/gl-infra/production#5521 (closed).
- Root Cause Analysis Issue - gitlab-com/gl-infra/production#5521 (comment 676483734)
- Corrective Actions Issue - gitlab-com/gl-infra/production#5533 (closed)
This issue tracks the progress of the Distribution FCL.
Root cause
As outlined in gitlab-com/gl-infra/production#5521 (closed):
The Go version used to compile the GitLab Workhorse cloud native images was updated from 1.16 to 1.17 in gitlab-org/build/CNG!736 (merged).
The upgrade to Go v1.17 introduced a significant performance regression when reading ZIP files. Instead of 2 HTTP Range Requests, it appears that the call to f.readDataDescriptor() in https://go-review.googlesource.com/c/go/+/312310/14/src/archive/zip/reader.go#120 caused additional HTTP Range Requests to be fired for every file in the archive, which significantly slowed down the generation of the artifact metadata.
Deliverables
Based on the classification of corrective actions, the deliverable of the Distribution FCL is documentation of the major dependency upgrade process.
Work plan
- Work on corrective action, process documentation improvements - gitlab-org/omnibus-gitlab#6416 (closed)
- Improve reliability/testing of related areas in Distribution
- Add a dangerbot warning for major version updates - #931 (closed)
- Trigger QA tests from the CNG pipelines rather than manually - gitlab-org/charts/gitlab#2824 (closed)
- Once the above actions are complete, additional Distribution reliability issues unrelated to the incident will be looked at during the FCL period and linked to this issue.
Followup
The following follow-up issues were created while working on the work plan:
- Automate Component Update Bot Epic and Issue creation
- Generate Go project dependencies list automatically
FCL retrospective
Done async here: #942 (closed)