Guard worker: dynamic pre import threshold
🔥 Problem
The Guard worker uses a fixed value (from application settings) for the pre_import threshold. That threshold is used to detect long-running migrations and abort them.
The problem is that, now that we are bumping the limit on the number of tags a repository can have to be accepted for the migration, the tags count distribution is highly disparate, which in turn creates a disparate distribution of pre import execution times.
If we use a pre import threshold that is too low, heavy repositories will get canceled and retried 2 times. Worse, heavy repositories have long pre import steps, around 30 minutes. 3 executions (the initial attempt plus 2 retries) mean that we spend 1h30m of Container Registry resources for nothing.
In the long run, a fixed value hurts the migration throughput.
🚒 Solution
- Use a dynamic pre import threshold:
  - Apply the fixed threshold first. If the pre import has run longer than it (a long-running migration), then:
    - When considering an image repository, get the tags count (yes, that's a Container Registry).
    - Use an application setting that defines the pre_import tags rate.
    - The pre import timeout will be: <tags count> * <pre_import tags rate>.
- Use a feature flag
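
The decision flow above could be sketched as follows. This is only an illustration in Python, not the actual worker code: the names `GuardSettings`, `fixed_pre_import_threshold_s`, `pre_import_tags_rate_s`, `dynamic_threshold_enabled`, and `fetch_tags_count` are all hypothetical, and the sketch assumes the tags rate is expressed in seconds per tag and that migrations fall back to the fixed threshold when the feature flag is off.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class GuardSettings:
    # Hypothetical names: the issue only says these come from application settings.
    fixed_pre_import_threshold_s: float  # existing fixed threshold, in seconds
    pre_import_tags_rate_s: float        # assumed unit: seconds allowed per tag
    dynamic_threshold_enabled: bool      # stands in for the feature flag


def pre_import_timeout_s(tags_count: int, settings: GuardSettings) -> float:
    """Dynamic timeout: tags count * pre_import tags rate."""
    return tags_count * settings.pre_import_tags_rate_s


def should_abort_pre_import(
    elapsed_s: float,
    settings: GuardSettings,
    fetch_tags_count: Callable[[], int],
) -> bool:
    """Return True when the Guard should abort the pre import."""
    # Step 1: apply the fixed threshold. Below it, nothing else is checked,
    # so the tags count is never requested for short pre imports.
    if elapsed_s <= settings.fixed_pre_import_threshold_s:
        return False

    # Feature flag off: keep today's behavior (fixed threshold only).
    if not settings.dynamic_threshold_enabled:
        return True

    # Step 2: above the fixed threshold, fetch the tags count (a Container
    # Registry lookup) and compare against the dynamic timeout.
    tags_count = fetch_tags_count()
    return elapsed_s > pre_import_timeout_s(tags_count, settings)
```

Checking the fixed threshold first means the tags count lookup is only paid for migrations that already look long-running.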
Edited by David Fernandez