Separate destructive deploys containing non-reversible database changes from other releases
(Placeholder issue) - (Also, this may not be the correct repository for this issue)
This issue is a corrective action from https://docs.google.com/document/d/1E1_w5v7d3sUvFNip34AyPOpkMTTT23I_EhbOWmYzlWM/edit#bookmark=id.m5twlkx50amz
Column drop database migrations are fairly uncommon events. They are destructive in that once they are executed, it's difficult to reverse the change. We generally drop columns a long time after we have stopped using a field and started ignoring it.
As we move to serialisable deploys, in which we deploy a single unit through a pipeline of "Deploy-to-Staging" -> "QA Testing in Staging" -> "Deploy-to-Canary" -> "Deploy-to-Production", one of the risks is that the pipeline can be held up by the following sequence:
- A release contains both application changes AND destructive database migrations
- The release is deployed to staging, the destructive database migration runs, but QA fails
- We now need to roll staging back to a known good state. This takes time and blocks the release pipeline from further deploys.
As a way around this situation, we could separate deploys into two categories:
- Application changes (one or more)
- Destructive database changes (one or more)
Since we don't drop columns until long after they've stopped being used, destructive database changes have little risk of causing the build to break. Most QA problems will occur when application changes are deployed.
If we separate application changes from destructive database changes, we decouple the risk of QA failure from the risk of having to deal with irreversible database changes, which reduces overall deployment risk.
Additionally (although still not ideal), knowing that our application changes never include destructive changes makes rolling a production deployment back safer too.
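As a rough illustration of how this separation could be enforced, a pipeline check might classify each release and reject any that mixes the two categories. This is only a sketch: the `Release` structure, the `classify` function, and the regex for what counts as "destructive" are all hypothetical and not part of any existing tooling. It also assumes that non-destructive migrations travel with application changes.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Hypothetical heuristic: statements treated as destructive (hard to reverse).
DESTRUCTIVE_SQL = re.compile(r"\b(DROP\s+(COLUMN|TABLE)|TRUNCATE)\b", re.IGNORECASE)


@dataclass
class Release:
    """Hypothetical description of one deployable unit."""
    app_changes: List[str] = field(default_factory=list)
    migrations: List[str] = field(default_factory=list)  # raw SQL, one entry per migration


def destructive_migrations(release: Release) -> List[str]:
    """Return the migrations that match the destructive heuristic."""
    return [sql for sql in release.migrations if DESTRUCTIVE_SQL.search(sql)]


def classify(release: Release) -> str:
    """Return 'application', 'destructive', or 'mixed' for a release."""
    destructive = destructive_migrations(release)
    if not destructive:
        return "application"
    has_other_migrations = len(destructive) < len(release.migrations)
    if release.app_changes or has_other_migrations:
        # Destructive migrations bundled with anything else: should be split.
        return "mixed"
    return "destructive"


def check_release(release: Release) -> None:
    """Raise if a release bundles destructive migrations with other changes."""
    if classify(release) == "mixed":
        raise ValueError(
            "Release mixes destructive migrations with other changes; "
            "split it into two releases."
        )
```

A check like `check_release` could run before the "Deploy-to-Staging" step, so that a mixed release is stopped before any destructive migration has executed.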
cc @abrandl @msmiley @stanhu @meks @clefelhocz1 @jarv @marin @skarbek