Maven package size limit breaking pipelines: follow-up items
Background
The ~"Development Department" has had a long-running initiative to introduce application limits, which was fanned out to various teams. ~"group::package" investigated limits for Package Registries and determined that limits were required.
With net new registries, limits were easily introduced without concern for impact on existing customer data because there was none. This meant that it was relatively simple to set a limit and introduce those patterns into best practices for new Package Registries.
An issue was created and scheduled to add limits to the NuGet Package Registry. During the implementation of this issue, the engineer identified an opportunity to implement limits for the other registries with very little effort.
Upon deployment, the code was found to be operating as expected: files that exceeded the limit were rejected, which was deemed a correct implementation and accepted.
Unfortunately, because the impact on existing customers was not given due consideration during the product development flow, customers who had routinely been exceeding the newly introduced limit experienced failures, which they then escalated.
Infrastructure investigated and mitigated the issue. As is standard practice, an RCA was conducted, in which ~"group::package" participated to identify root causes and address concerns.
Root Cause Analysis: gitlab-com/gl-infra/production#2563 (closed)
Precipitating merge request: gitlab-org/gitlab!39633 (merged)
Action Items
Directly addressing Root Cause
- Add explicit checks on customer impact during Product Development Flow: !61124 (merged)
- Add a review process for application limits - due within %13.5 - DRI @sabrams
- Update the database column defaults from 50MB to reasonable defaults - due within %13.4 - DRI @sabrams
- Join product development flow working group - @dcroft - !61126 (merged)
Indirectly addressing Root Cause
- Add testing guidelines for package features - due within %13.5 - DRI @jhampton
- Add admin UI for these settings - missed %13.4, in %13.5 - DRI @sabrams
- Investigate ability to return custom error messages to the Maven client - due within %13.5 - DRI @sabrams
- Add checklist item to evaluate customer impact to MR template - TBD
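
To make the "evaluate customer impact" items above more concrete, here is a minimal sketch of the kind of check that could run before a new size limit is enforced: given the sizes of files already in a registry, report which ones would be rejected under the proposed limit. This is an illustration only, not the actual GitLab implementation; `PackageFile`, `files_over_limit`, and `impact_summary` are hypothetical names, and the in-memory list stands in for whatever query would supply real file sizes.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass
class PackageFile:
    """Hypothetical record for an existing package file."""
    project: str
    file_name: str
    size_bytes: int


def files_over_limit(files, limit_bytes):
    """Return existing files that would be rejected under the proposed limit."""
    return [f for f in files if f.size_bytes > limit_bytes]


def impact_summary(files, limit_bytes):
    """Summarize how many files and which projects exceed the proposed limit."""
    over = files_over_limit(files, limit_bytes)
    return {
        "affected_files": len(over),
        "affected_projects": sorted({f.project for f in over}),
        # default=0 covers the case where nothing exceeds the limit
        "largest_bytes": max((f.size_bytes for f in over), default=0),
    }


if __name__ == "__main__":
    # Made-up sample data; 50MB matches the column default mentioned above.
    existing = [
        PackageFile("group/app", "app-1.0.jar", 10 * MB),
        PackageFile("group/big", "big-2.3.jar", 80 * MB),
    ]
    print(impact_summary(existing, 50 * MB))
```

A non-empty `affected_projects` result would be a signal to pick a higher default, notify the affected customers, or phase the limit in before enforcing it.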
Feedback
I'd really like to have a quick review from each of the following people to make sure we're not missing an opportunity to improve. Please take a look, add anything I'm missing to the Background and Action Items above, and then feel free to unassign yourself.