Create new label for blocking deployments and feature-flags
Problem Statement
When a change request's steps are complete but it requires a lengthy verification period, I think we should consider applying a special label. Currently, an active C2 change blocks deployments. As the Delivery team ramps up on deploying as frequently as possible, this becomes cumbersome when the issue has a long tail, because Release Managers must ask the EOC for approval for each deploy.
During incidents that are high severity but do not impact deployments (for example, https://gitlab.com/gitlab-com/gl-infra/production/-/issues/6709), we work around deployment blockers either by asking the EOC (which has itself become laborious) or by intentionally downgrading the severity of the incident.
Current Process
release-tools contains a production check that does the following:
- checks for active S1/S2 incidents
- checks for active C1/C2 change requests
- checks for active deployments
If no incidents, change requests, or deployments are found, release-tools posts a message saying so, notifying the Release Manager that it is all clear to proceed. If any of the above are found, the Release Manager is notified of what was found. From here, Release Managers may simply hold off on deploying if an active deploy is found. For active change requests and incidents, unless the situation is obviously clear, we ask for permission to bypass the check, normally from the current EOC. If permission is granted, Release Managers use a special chatops command that bypasses the checks. The command writes a statement on the release issue for that month, and we must also ask the EOC to confirm the override on that same comment from our tooling.
Examples:
- When things are all OK: gitlab-org/release/tasks#3638 (comment 890900809)
- When things go wrong: gitlab-org/release/tasks#3638 (comment 891324625)
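To make the current flow concrete, here is a rough sketch of the check and the override decision. It is not the release-tools implementation (which I have not copied here); the names `CheckResult`, `production_check`, and `may_deploy` are hypothetical, and the rule that an active deployment is never overridden is my assumption based on "hold off on deploying" above.

```python
from dataclasses import dataclass, field

# Levels that block a deploy today, per the list above.
BLOCKING_SEVERITIES = {"S1", "S2"}
BLOCKING_CRITICALITIES = {"C1", "C2"}


@dataclass
class CheckResult:
    """What the check found (hypothetical structure, for illustration only)."""
    incidents: list = field(default_factory=list)
    change_requests: list = field(default_factory=list)
    deployments: list = field(default_factory=list)

    @property
    def all_clear(self) -> bool:
        return not (self.incidents or self.change_requests or self.deployments)


def production_check(incidents, change_requests, deployments) -> CheckResult:
    """Filter active items down to the ones that block a deploy.

    incidents and change_requests are lists of (title, level) pairs for
    active issues; deployments is a list of active deployment names.
    """
    return CheckResult(
        incidents=[t for t, sev in incidents if sev in BLOCKING_SEVERITIES],
        change_requests=[
            t for t, crit in change_requests if crit in BLOCKING_CRITICALITIES
        ],
        deployments=list(deployments),
    )


def may_deploy(result: CheckResult, eoc_override: bool = False) -> bool:
    """Decide whether the Release Manager can proceed."""
    if result.all_clear:
        return True
    if result.deployments:
        # Assumption: an active deploy means waiting, not asking for an override.
        return False
    # Active incidents or change requests need the EOC's confirmation.
    return eoc_override


result = production_check(
    incidents=[("database outage", "S1"), ("slow page", "S3")],
    change_requests=[("long-tail verification", "C2")],
    deployments=[],
)
print(result.all_clear, may_deploy(result), may_deploy(result, eoc_override=True))
```

The point of the sketch is that every long-tail C2 lands in `change_requests` and forces the `eoc_override` path on each deploy, even when the remaining work is only verification.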
Proposal
- Eliminate the S1/S2 and change issue check, replacing it with a single new label, something such as "blocks deployments"
- Use the new label "blocks deployments" as the default for severity1 and severity2 incidents as well as C1 and C2 change requests
- Create two labels, one for feature-flags and one for deployments
- Modify our policy such that anyone removing or adding said labels must provide the reasoning
Doing so should eliminate some of the overhead required of both EOCs and RMs and streamline this procedure, making it easier for all parties involved.
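As a rough illustration of how the proposal could behave, the sketch below applies the deployment blocker by default and requires a reason whenever a blocking label is added or removed. Everything here is an assumption made for discussion: the feature-flag label name "blocks feature-flags", the function names, and the choice to default only the deployment label are not settled anywhere in this issue.

```python
BLOCKS_DEPLOYMENTS = "blocks deployments"
BLOCKS_FEATURE_FLAGS = "blocks feature-flags"  # hypothetical label name
BLOCKING_LABELS = {BLOCKS_DEPLOYMENTS, BLOCKS_FEATURE_FLAGS}

# Issues carrying any of these get the deployment blocker by default.
DEFAULT_BLOCKING = {"severity1", "severity2", "C1", "C2"}


def with_default_labels(labels):
    """Return the label set with the default blocker applied where it belongs."""
    labels = set(labels)
    if labels & DEFAULT_BLOCKING:
        labels.add(BLOCKS_DEPLOYMENTS)
    return labels


def change_blocker(labels, label, present, reason):
    """Add (present=True) or remove (present=False) a blocking label.

    A non-empty reason is mandatory, mirroring the proposed policy change.
    """
    if label not in BLOCKING_LABELS:
        raise ValueError(f"not a blocking label: {label}")
    if not reason.strip():
        raise ValueError("a reason is required to add or remove a blocking label")
    labels = set(labels)
    if present:
        labels.add(label)
    else:
        labels.discard(label)
    return labels


def blockers(issues, label):
    """Titles of open issues carrying the given blocking label.

    issues maps an issue title to its set of labels.
    """
    return sorted(title for title, labels in issues.items() if label in labels)


issues = {
    "long-tail verification": with_default_labels({"C2"}),
    "unrelated S1": with_default_labels({"severity1"}),
}
# Steps are complete and only verification remains, so the label comes off.
issues["long-tail verification"] = change_blocker(
    issues["long-tail verification"],
    BLOCKS_DEPLOYMENTS,
    present=False,
    reason="All steps complete; only verification remains",
)
print(blockers(issues, BLOCKS_DEPLOYMENTS))
```

With this shape, the production check only needs to look for the label, and the decision about whether an issue really blocks deploys is made once, with a recorded reason, instead of on every deploy.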