Consider dead tuple ratio to control migration speed
Data migrations lead to an increase in vacuum activity. Depending on how a migration is set up, we may run into situations where vacuum is unable to keep up with the rate of updates. This can be caused directly by too high an update frequency, or indirectly by other vacuum activity delaying vacuum on the relevant table.
In any case, as we can see in gitlab-com/gl-infra/production#4455 (comment 565835397), this has the potential to cause noisy alerts.
The proposal here is to consider the dead tuple ratio (dead tuples vs total tuples) and implement a threshold for the migration.
When the threshold is reached, we can either decrease the migration rate or pause the migration completely (the latter is easier to implement).
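
To make the proposal concrete, here is a minimal sketch of the threshold check in Python. It assumes the counts come from PostgreSQL's `pg_stat_user_tables` view (`n_live_tup` and `n_dead_tup`, which are estimates) and that "total tuples" means live plus dead tuples. The function names and the 10% default threshold are hypothetical and only for illustration; they are not an existing GitLab API or an agreed value.

```python
# Query to fetch the estimated tuple counts for one table.
# Assumption: the statistics in pg_stat_user_tables are recent enough
# to base a throttling decision on.
DEAD_TUPLE_QUERY = """
SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = %s
"""


def dead_tuple_ratio(n_live_tup: int, n_dead_tup: int) -> float:
    """Return dead tuples as a fraction of all (live + dead) tuples."""
    total = n_live_tup + n_dead_tup
    if total == 0:
        # An empty table has nothing for vacuum to clean up.
        return 0.0
    return n_dead_tup / total


def should_pause_migration(
    n_live_tup: int,
    n_dead_tup: int,
    threshold: float = 0.1,  # hypothetical default, to be tuned
) -> bool:
    """Return True when the ratio has reached the threshold.

    This implements the simpler option (pause completely) rather than
    gradually decreasing the migration rate.
    """
    return dead_tuple_ratio(n_live_tup, n_dead_tup) >= threshold
```

A migration worker could run the query before scheduling each batch and skip or postpone the batch while `should_pause_migration` returns `True`, resuming once vacuum has brought the ratio back below the threshold.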
Edited by Andreas Brandl