Batched background migration worker doesn't handle statement timeout while finding batch boundaries


If a batched background migration hits a statement timeout in the batching strategy while finding a batch boundary, the worker does not reduce the batch size. Instead, the exception propagates and fails the Sidekiq job.

The Sidekiq job then fails in a loop, because the next worker picks up the same batch again.

Current workaround: find the job ID and pause it with ChatOps: /chatops run batched_background_migrations pause <id> --database <main/ci>

Instead, we should reduce the migration's batch size when it fails with a statement timeout, so that subsequent iterations have less work to do and are more likely to succeed.
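
To make the proposal concrete, here is a minimal, language-neutral sketch in Python of the intended behaviour. It is not the actual worker code. Every name in it (StatementTimeout, MIN_BATCH_SIZE, reduced_batch_size, run_iteration, find_boundaries) is hypothetical, and the halving factor and lower bound are assumptions chosen only for illustration.

```python
class StatementTimeout(Exception):
    """Hypothetical stand-in for the database's statement timeout error."""


MIN_BATCH_SIZE = 1  # assumed lower bound; the real value would be a design decision


def reduced_batch_size(current, factor=0.5, minimum=MIN_BATCH_SIZE):
    """Return a smaller batch size, never going below `minimum`."""
    return max(minimum, int(current * factor))


def run_iteration(migration, find_boundaries):
    """Try to find the next batch boundaries for one worker iteration.

    `migration` is a dict with a 'batch_size' key (a stand-in for the
    migration record). `find_boundaries` is a hypothetical callable that
    takes a batch size and either returns (min_id, max_id) or raises
    StatementTimeout.

    On a timeout the batch size is reduced and None is returned, so the
    job ends normally and the next iteration retries with less work.
    """
    try:
        return find_boundaries(migration["batch_size"])
    except StatementTimeout:
        current = migration["batch_size"]
        if current <= MIN_BATCH_SIZE:
            # Nothing left to shrink: surface the failure instead of looping.
            raise
        migration["batch_size"] = reduced_batch_size(current)
        return None
```

With this shape, a timeout during boundary lookup no longer fails the job outright. Each affected iteration shrinks the batch size, and the failure is only re-raised once the size cannot be reduced any further.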
