Skip to content

Update the query that looks for stuck pending jobs

What does this MR do?

Related to #339237 (closed)

We have around 500 outdated stuck pending jobs in the database and this is causing a deadlock with the effort of migrating them to the ci_pending_builds table. This MR changes the database query to start dropping of the older builds and catching up to the newer ones.

This change is behind a feature flag and can be disabled if the query proves to be problematic.

Query and execution plans:

I think the previous timeout errors were caused by the wrong index being used(we have some that overlap):

  • "ci_builds_gitlab_monitor_metrics" btree (status, created_at, project_id) WHERE type::text = 'Ci::Build'::text
  • "index_ci_builds_on_status_and_type_and_runner_id" btree (status, type, runner_id)
  • "index_ci_builds_runner_id_pending_covering" btree (runner_id, id) INCLUDE (project_id) WHERE status::text = 'pending'::text AND type::text = 'Ci::Build'::text

The data that is being searched is pretty small in size, at this moment there are under 10000 pending builds:

gitlabhq_production=> SELECT COUNT(*) FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build' AND "ci_builds"."id" IN (SELECT "ci_builds"."id" FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build' AND ("ci_builds"."status" IN ('pending'))) ;
 count
-------
  9134
(1 row)

As the worker processes these records, the query planner might decide to use a different index, so I've added a ORDER BY "ci_builds"."created_at" ASC, "ci_builds"."project_id" to the end of the query to force it to use ci_builds_gitlab_monitor_metrics all the time.

Screenshots or Screencasts (strongly suggested)

How to setup and validate locally (strongly suggested)

Does this MR meet the acceptance criteria?

Conformity

Availability and Testing

Security

Does this MR contain changes to processing or storing of credentials or tokens, authorization and authentication methods or other items described in the security review guidelines? If not, then delete this Security section.

  • Label as security and @ mention @gitlab-com/gl-security/appsec
  • The MR includes necessary changes to maintain consistency between UI, API, email, or other methods
  • Security reports checked/validated by a reviewer from the AppSec team
Edited by Marius Bobin

Merge request reports