Document migration failure scenarios and how to react to and mitigate them
Context
The Package team has been working on a new version of the Container Registry that relies on a metadata database (DB) to enable online Garbage Collection (GC) and unblock the implementation of several other features.
With the new version now implemented, we've been working on the rollout, starting with GitLab.com (&5523 (closed)).
Proposal
Now that we have settled on the high-level deployment/migration approach (#374 (closed)), we should:
- Identify and describe all possible failure scenarios;
- Document how to react to and mitigate each of them;
- Practice failure and mitigation in pre-prod/staging (separate follow-up issue).
This should be done in collaboration with Infrastructure.
Edited by João Pereira