Option to allow release managers to bypass baking_time during incident remediation
Details
During a recent severity2 incident, production#18806 (closed), we merged a fix for a failing foreign key constraint. Almost 6 hours passed between the merge and the deployment to production, even with the release manager picking the MR into auto-deploy.
The processes are in place to protect our deployments, and as this was not a severity1 incident, hot patching was not an option. While we waited for the deployment, the baking_time in the co-ordinator pipeline was identified as a candidate for bypassing. This baking_time waits for 30 minutes before proceeding with the deployment, which makes sense for standard deployments: we want to minimise potential disruption and catch issues early, before they reach production.
However, when we are trying to quickly deploy a fix to production during a high-severity incident, being able to bypass that 30-minute baking period looks like a quick way to expedite remediation.
Proposal
Support the ability to cancel the baking_time delay at the release manager's discretion, for example when they assess that the benefit outweighs the risk, as when shipping an urgent and narrowly targeted revert MR (as in the incident that prompted this idea).
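To make the idea concrete, here is a minimal sketch of a cancellable wait. It is not how the co-ordinator pipeline is actually implemented. The function name `wait_for_baking`, the `skip_event` signal, and the return values are all hypothetical; only the 30-minute duration comes from the description above.

```python
import threading

# The 30-minute delay described above, in seconds.
BAKING_TIME_SECONDS = 30 * 60


def wait_for_baking(skip_event: threading.Event,
                    timeout: float = BAKING_TIME_SECONDS) -> str:
    """Block until the baking period elapses or a release manager skips it.

    `skip_event` is a hypothetical signal that a release manager sets to
    cancel the remaining delay. Returns "skipped" if the event was set
    before the timeout expired, otherwise "completed".
    """
    # Event.wait returns True if the flag was set, False if it timed out.
    skipped = skip_event.wait(timeout=timeout)
    return "skipped" if skipped else "completed"
```

In this sketch the default behaviour is unchanged: if nobody sets the event, the full baking period still applies. Returning a distinct "skipped" outcome would let the pipeline record that the delay was bypassed deliberately.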