
Fix to use FIFO as pending builds strategy for group runners

What does this MR do and why?

According to the CI documentation, "Group runners process jobs by using a first in, first out (FIFO) queue," so jobs that were created first should be executed first. In practice this is not the case (see #364368 (comment 1001088326)). This MR fixes the bug by ordering the pending builds for group runners by build_id.

Related issue: Pipelines start/finish in seemingly random order (#364368)

🛠 with ❤ at Siemens

/cc @bufferoverflow

Example

Let's start 3 pipelines within a few seconds and see in which order the jobs are executed:

Jobs: actual order
Jobs: target order

SQL details

Ci::Queue::BuildQueueService.new(Ci::Runner.find(11)).builds_for_group_runner
SQL (before)
SELECT "ci_pending_builds".* 
FROM "ci_pending_builds" 
WHERE (ci_pending_builds.namespace_traversal_ids && ARRAY[182]::int[])

Query plan (no data)

Query plan (20'000 pending builds)

Query plan (20'000 pending builds, SEQ SCAN OFF)

SQL (after)
SELECT "ci_pending_builds".* 
FROM "ci_pending_builds" 
WHERE (ci_pending_builds.namespace_traversal_ids && ARRAY[182]::int[]) 
ORDER BY build_id ASC
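
To illustrate why the added ORDER BY matters, here is a minimal plain-Ruby sketch of the same filter-then-sort logic. It is not GitLab's implementation: the `PendingBuild` struct and the `overlaps?` and `fifo_builds_for_group` helpers are hypothetical names, and the sketch assumes that a lower build_id means an earlier-created build.

```ruby
# Hypothetical stand-in for a row of ci_pending_builds; not GitLab's model.
PendingBuild = Struct.new(:build_id, :namespace_traversal_ids)

# Mirrors `namespace_traversal_ids && ARRAY[...]`:
# true if the two arrays share at least one element.
def overlaps?(traversal_ids, group_ids)
  !(traversal_ids & group_ids).empty?
end

# Without an ORDER BY the database may return matching rows in any order.
# Sorting by build_id restores creation order (assumption: ids are ascending).
def fifo_builds_for_group(pending_builds, group_ids)
  pending_builds
    .select { |build| overlaps?(build.namespace_traversal_ids, group_ids) }
    .sort_by(&:build_id)
end

# Rows deliberately listed out of order, as an unordered query might return them.
rows = [
  PendingBuild.new(103, [182, 200]),
  PendingBuild.new(101, [182]),
  PendingBuild.new(102, [999]),
  PendingBuild.new(100, [7, 182])
]

puts fifo_builds_for_group(rows, [182]).map(&:build_id).inspect
# prints [100, 101, 103]
```

Build 102 is dropped because its namespace does not overlap with group 182; the remaining builds come back oldest first.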

Query plan (no data)

Query plan (20'000 pending builds)

Query plan (20'000 pending builds, SEQ SCAN OFF)

How to set up and validate locally

  1. Set up the example above
  2. Trigger some pipelines within a few seconds of each other
  3. Check that the jobs are executed in the order in which they were created
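
The last step can also be checked mechanically. The following is a small sketch, assuming you have noted the job IDs in the order the runner picked them up and that job IDs increase with creation time; `fifo_order?` is a hypothetical helper, not part of GitLab.

```ruby
# Hypothetical helper: returns true when every job was started after all
# jobs with a lower ID, i.e. the runner picked jobs up in creation order.
def fifo_order?(started_job_ids)
  started_job_ids.each_cons(2).all? { |earlier, later| earlier < later }
end

puts fifo_order?([501, 502, 503]) # prints true
puts fifo_order?([502, 501, 503]) # prints false
```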

MR acceptance checklist

This checklist encourages us to confirm any changes have been analyzed to reduce risks in quality, performance, reliability, security, and maintainability.

Edited by Jonas Wälter
