DestroyAllExpiredService calls each_batch in an unsafe manner
Summary
In MR !47496 (merged) an unsafe usage of EachBatch
was introduced into DestroyAllExpiredService
by dc669d91.
The JobArtifacts
to be deleted are batched using the non-unique columm expire_at
which can cause an infinite loop as described in #285097. The result is that artifacts are not deleted and no errors or failures are reported because the total number of iterations is limited by LOOP_LIMIT
.
Within the each_batch
code, it's possible for the records identified by the batch start
and stop
objects to have the same value for expire_at
, which generates a condition that results in an empty query: where expire_at < 5/1/2021 12:00:00.000 AND expire_at >= 5/1/2021 12:00:00.000
Steps to reproduce
- Create a project with >100
JobArtifact
records. - Set the
expire_at
column to a single date value using the database:update ci_job_artifacts set expire_at = now() where [...]
- Run the
expire_job_artifacts_worker
background task and note that expired artifacts are not deleted.
What is the current bug behavior?
The expire_build_artifacts_worker
task runs without error, but does not delete artifacts if >100 are expired at the same time.
What is the expected correct behavior?
Artifacts are deleted by the background task even if >100 are expired at the same time.
NOTE: The documentation for EachBatch should be updated to include this known-infinite loop problem when non-unique columns are used for batching and provides stronger warnings against using such columns. Or a fix for #285097 should be prioritized.