Determine Risk for Post Deployment/Background Migrations in relation to Release Preparation
Problem Statement
We need clarification on where to draw the line between a migration that is still running and a migration that is safe for release.
Today
We have 3 types of database migrations:
- Regular migrations - usually schema changes or data modifications that are quick
- Post Deployment migrations - executed after a deployment due to application requirements
- Background migrations - long running, with duration determined by the data and the method used to modify it
Risk Assessment
Currently we only validate that the post deployment migrations have been executed; we do not validate that migrations are complete. This means that if a long-running migration is in progress when Release Managers begin release procedures, that migration may still be active. The primary question is whether this carries risk or, put another way, what level of risk we are subject to.
Questions
GitLab has a very large database, so large data migrations will commonly take a long time. By checking that a PDM has run, we learn whether it harms the database, because a harmful one would end up causing some sort of incident. Since we do not wait for lengthy background migrations to complete, I wonder whether we are missing potential failure scenarios that our self-managed users would be subject to.
Exit Criterion
- Assess risk - reach out to dev teams as necessary to help us determine an answer
- If we want to wait for the PDM to complete, we'll have a bit of work to do: create the necessary issues to ensure that we check the PDM up to the selected sha for release and validate that all migrations have completed
- If we deem this risk low, document, and close
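
If we do decide to wait for completion, the check could look roughly like the sketch below. It is only an illustration in Python, not existing release tooling: `Kind`, `Migration`, `release_blockers`, and `safe_to_release` are hypothetical names, and it assumes the migrations up to the selected sha have already been collected along with whether each one has executed and whether it has finished.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    REGULAR = "regular"
    POST_DEPLOYMENT = "post_deployment"
    BACKGROUND = "background"


@dataclass(frozen=True)
class Migration:
    version: str    # hypothetical identifier for the migration
    kind: Kind
    executed: bool  # the deployment has started (run) the migration
    finished: bool  # all of its work has completed


def release_blockers(migrations):
    """Return migrations that have not both executed and finished."""
    return [m for m in migrations if not (m.executed and m.finished)]


def safe_to_release(migrations):
    """True only when nothing is pending or still running."""
    return not release_blockers(migrations)


if __name__ == "__main__":
    # Made-up data: one finished PDM and one background migration still running.
    candidates = [
        Migration("0001", Kind.POST_DEPLOYMENT, executed=True, finished=True),
        Migration("0002", Kind.BACKGROUND, executed=True, finished=False),
    ]
    for blocker in release_blockers(candidates):
        print(f"blocking release: {blocker.version} ({blocker.kind.value})")
```

Today's validation corresponds to looking at `executed` alone; the gap described under Risk Assessment is the case where `executed` is true but `finished` is not.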