StuckCiJobsWorker does not drop all stuck pending builds
!62239 (merged) added an optimization to fix database timeouts in the queries that identify stuck pending jobs.
In short, we're only looking at the jobs created in the last 5 days:
- https://gitlab.com/gitlab-org/gitlab/-/blob/3e8ee5dd61c0232267cea59cdb574602dbd50c53/app/workers/stuck_ci_jobs_worker.rb#L28-31
- https://gitlab.com/gitlab-org/gitlab/-/blob/3e8ee5dd61c0232267cea59cdb574602dbd50c53/app/models/commit_status.rb#L62-64
This is problematic because pending jobs created before that window are never picked up, so some jobs that should have been dropped are still pending:
-- running a query to see how far behind the clone is from the current time.
gitlabhq_dblab=# select max(updated_at), now() from ci_builds;
            max             |              now
----------------------------+-------------------------------
 2021-08-24 12:00:56.082766 | 2021-08-24 14:18:48.715547+00
(1 row)
gitlabhq_dblab=# SELECT count(*) FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build' AND ("ci_builds"."status" IN ('pending')) AND (created_at < '2021-08-19 14:08:08.522426');
 count
-------
   495
(1 row)
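To make the gap concrete, here is a minimal sketch in plain Ruby of how a lookback-bounded search can miss older pending jobs. It is not the code from `StuckCiJobsWorker` or `CommitStatus`. The `Job` struct, the method names, and the one-day `PENDING_TIMEOUT` are assumptions made for illustration. Only the 5-day lookback comes from the description above.

```ruby
# Hypothetical stand-in for a ci_builds row; not GitLab's model.
Job = Struct.new(:id, :status, :created_at, :updated_at)

DAY = 24 * 60 * 60
LOOKBACK = 5 * DAY      # the 5-day window described in this issue
PENDING_TIMEOUT = DAY   # assumed timeout, for illustration only

# Pending jobs idle longer than the timeout, but only if they were
# created within the lookback window.
def stuck_within_lookback(jobs, now:, lookback: LOOKBACK, timeout: PENDING_TIMEOUT)
  jobs.select do |job|
    job.status == 'pending' &&
      job.created_at >= now - lookback &&
      job.updated_at < now - timeout
  end
end

# Pending jobs idle longer than the timeout, regardless of creation time.
def stuck_without_lookback(jobs, now:, timeout: PENDING_TIMEOUT)
  jobs.select { |job| job.status == 'pending' && job.updated_at < now - timeout }
end

now = Time.utc(2021, 8, 24, 14, 8, 8)
jobs = [
  Job.new(1, 'pending', now - 2 * DAY, now - 2 * DAY),   # inside the window
  Job.new(2, 'pending', now - 10 * DAY, now - 10 * DAY), # older than the lookback
  Job.new(3, 'pending', now - 3600, now - 3600),         # not timed out yet
  Job.new(4, 'running', now - 2 * DAY, now - 2 * DAY)    # not pending
]

# Jobs that are stuck but invisible to the lookback-bounded search.
missed = stuck_without_lookback(jobs, now: now) - stuck_within_lookback(jobs, now: now)
puts missed.map(&:id).inspect
```

In this sketch, job 2 is stuck but never selected, which matches the situation the count query above reports for rows created before the window.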
Edited by Marius Bobin