Workflow for post deployment migrations that need to be run during low traffic periods
🔥 Problem
As our database grows and gets more traffic, we see more post deployment migrations with DDL changes that fail due to being unable to obtain a lock on a high traffic table.
Examples:
Our current process requests that release managers are pinged on MRs by authors or maintainers when MRs introducing long index creations are merged, but this has a few problems:
- It does not include migrations that may run into locking problems during high traffic periods.
- Since the MR still needs to be deployed, it might be a few release manager shifts before the PDM pipeline is run, increasing the risk that the release manager may not remember this migration is pending when the run the PDM pipeline.
There are a few levels to this problem:
- First, we need a way for release managers to be more aware when such migrations exist.
- Second, it would be best if there was a way to run post deployment migrations, but skip any that are tagged for low traffic and run them separately, similar to how we have the ability to create certain indexes on the weekend.
🚒 Solution
Based on the few ways we can look at this, we can take an iterative approach.
- In this issue, update the processes:
- Update the database review process to open an issue in the release tracker and assigned to the release managers. This creates a more permanent visibility. The ping in
#f_upcoming_release
is then not needed, but we could still keep it as an additional guard. - Update the release manager process to include checking if they are assigned any such issues before running the PDM pipeline.
- Update the database review process to open an issue in the release tracker and assigned to the release managers. This creates a more permanent visibility. The ping in
- In a separate issue, explore ways to mark and defer certain migrations until low traffic periods or weekends.
This issue was opened to followup production#8588 (comment 1326059393)
Edited by Steve Abrams