ci_builds is expensive to access: it is a very wide table, and queries against it often time out
ci_builds cannot be partitioned, as otherwise we would not be able to fetch all jobs
for accessing tags we cross-join another table, taggings
for accessing quota we cross-join project/namespace
we check access level based on project/namespace (an illustrative query shape follows below)
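For illustration only, the rough shape of what picking a build forces today. This is a simplified approximation; the exact joins, column names, and filter values below are assumptions, not the actual RegisterJobService SQL:

```sql
-- Illustrative approximation of the current picking query (NOT the real
-- RegisterJobService SQL): the very wide ci_builds table has to be joined
-- with projects/namespaces (quota, access level) and taggings (tags)
-- just to filter the small set of pending builds.
SELECT ci_builds.*
FROM ci_builds
JOIN projects   ON projects.id   = ci_builds.project_id
JOIN namespaces ON namespaces.id = projects.namespace_id
LEFT JOIN taggings ON taggings.taggable_id = ci_builds.id
                  AND taggings.taggable_type = 'CommitStatus'  -- assumed value
WHERE ci_builds.status = 'pending'
  AND ci_builds.runner_id IS NULL
ORDER BY ci_builds.id
LIMIT 1;
```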
As a way to accelerate filtering:
Introduce ci_pending_builds table
Design the table so that we would not have to load ci_builds (a very wide table) as part of the RegisterJobService filtering query
We would still load ci_builds for the purpose of accepting a build, but the filtering should be significantly faster and provide more capacity
This would allow us to partition ci_builds without breaking queueing
The table would contain as much data as possible to perform build matching: at least tags, protected, project_id, and whatever else is needed (a sketch follows after this list)
Insert a build into the table on the status transition to pending, as part of the state machine
Delete the item from the table on the status transition from pending, as part of the state machine
Change RegisterJobService to filter using ci_pending_builds instead of ci_builds
We assume that the queries would have a significantly lower cost, as the data would be much easier and cheaper to access, and Postgres should be able to hold this pending queue in memory for quick filtering
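A minimal sketch of what this could look like, assuming the columns mentioned above (tags, protected, project_id). The names, types, and the foreign key are illustrative assumptions, not a final design:

```sql
-- Hedged sketch of the acceleration table; columns and types are assumptions.
CREATE TABLE ci_pending_builds (
  id         bigserial PRIMARY KEY,
  build_id   bigint    NOT NULL UNIQUE REFERENCES ci_builds (id) ON DELETE CASCADE,
  project_id bigint    NOT NULL,
  protected  boolean   NOT NULL DEFAULT FALSE,
  tag_ids    integer[] NOT NULL DEFAULT '{}'
);

-- State machine transition to pending: enqueue the build.
INSERT INTO ci_pending_builds (build_id, project_id, protected, tag_ids)
VALUES (101, 1, FALSE, ARRAY[1, 2]::integer[]);

-- State machine transition away from pending (picked, canceled, ...): dequeue it.
DELETE FROM ci_pending_builds WHERE build_id = 101;
```

RegisterJobService would then filter against this narrow table only, and go back to ci_builds just for the single build it is about to accept.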
This acceleration structure is proposed as a follow-up on gitlab-com/gl-infra/production#3712 (closed). If designed properly this could be used for all future work on queueing as well.
This can be an easy way to improve performance today without spending a lot of effort on it: a potentially throw-away solution with (hopefully) little impact on the codebase.
Kamil Trzciński changed title from "Optimise job picking" to "Introduce additional DB table (acceleration structure) to optimise job queueing"
Kamil Trzciński changed the description
Kamil Trzciński changed title from "Introduce additional DB table (acceleration structure) to optimise job queueing" to "Introduce additional DB table (acceleration structure) to optimise job queueing (as an intermediate solution for better queueing)"
The query cost was reduced from 51823820 to 6453633, roughly an 8x improvement!
The query is faster by around 100ms, from 469ms to 364ms. However, this was measured with no load, so the value will be significantly different on a live system.
More fields (these should be pretty constant):
This table should contain: tags (unsure in what form; is there some clever trick to do this matching?)
This table should contain: protected, a flag to indicate whether the build needs to use a protected runner
This table should contain: runs_on_shared (or is_shared), or anything else that allows us to indicate that
All of that can be even more optimised by (iteratively):
The new table allows us to remove more parts of the big query, step by step, and recalculate them as part of ci_pending_builds
We could precalculate the namespace quota: we could remove entries from ci_pending_builds, or have a flag is_shared (whether a build needs to be executed by shared runners)
We could also precalculate, as part of is_shared, the project visibility and eligibility for being picked by shared runners
If a project runs out of quota/visibility, we could update all of its builds to have is_shared set to false, and reset it if the project still has quota/visibility
We could store tags as an array, to be able to quickly match (bitmask) them against the runner's list of tags (see the sketch after this list)
The table ci_builds could be partitioned, as we would only depend on ci_pending/running_builds being in memory for quick build matching
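A sketch of how the array-based tag matching and the is_shared flag could look, assuming the ci_pending_builds shape sketched earlier plus the is_shared column proposed above, and that a runner is eligible when it owns all of a build's tags. This is only one possible approach:

```sql
-- Illustrative matching query; column names and ordering are assumptions.
-- tag_ids <@ ARRAY[...] means: every tag required by the build is present
-- in the runner's tag list.
SELECT build_id
FROM ci_pending_builds
WHERE is_shared = TRUE                       -- precalculated quota/visibility flag
  AND protected = FALSE                      -- this runner cannot run protected builds
  AND tag_ids <@ ARRAY[1, 2, 3]::integer[]   -- ids of the runner's tags
ORDER BY build_id
LIMIT 1;

-- A GIN index can serve the containment operator on the array column:
CREATE INDEX index_ci_pending_builds_on_tag_ids
  ON ci_pending_builds USING gin (tag_ids);

-- If a project runs out of quota or changes visibility, flip the flag in bulk:
UPDATE ci_pending_builds SET is_shared = FALSE WHERE project_id = 1;
```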
Especially if we could somehow do it as a drop-in replacement for the current system, and just evaluate its impact with minimal effort. I would expect this to make a massive difference for GitLab just because of the much more compact data structure.
@ayufan the proposed solution sounds like a nice approach to me, as long as we don't have any information in ci_builds_pending that may change (in which case we would have to maintain the consistency of the materialized table) and we only have to update it in the cases you mention:
Insert a build into the table on the status transition to pending, as part of the state machine
Delete the item from the table on the status transition from pending, as part of the state machine
The table and indexes would get pretty bloated, but the table would stay comparatively small, and I assume that autovacuum and our re-indexing processes should be able to handle that bloat (others can correct me on this statement).
Other than that, we would have to add a couple of additional indexes, but that's a minor implementation discussion.
Kamil Trzciński changed title from "Introduce additional DB table (acceleration structure) to optimise job queueing (as an intermediate solution for better queueing)" to "Idea to consider: Introduce additional DB table (acceleration structure) to optimise job queueing (as an intermediate solution for better queueing)"
Kamil Trzciński changed the description
Grzegorz Bizon changed title from "Idea to consider: Introduce additional DB table (acceleration structure) to optimise job queueing (as an intermediate solution for better queueing)" to "Introduce additional DB table (acceleration structure) to optimise job queueing (as an intermediate solution for better queueing)"
I removed "Idea to consider" from the title, because this is something we want to do. We want to prioritize that because of the recent decision to adjust our direction related to the ci_builds work -> !52203 (comment 543781716)
Can we please describe access patterns for the ci_builds_pending table? How are we going to access this table?
This would help us to reason about a partitioning strategy, which may be beneficial to have given that this is going to be a high-churn table (causing a lot of vacuum activity, which benefits greatly from partitioning).
@abrandl we do not intend to partition this table. This table is used for queuing instance-wide, and adding partitioning to the equation might significantly increase the complexity of this solution.
@grzesiek Hence the question - can we describe how we will access this table?
I believe #322766 (comment 517051198) describes one way of accessing the table. It would be great if we had an exhaustive set of queries sketched that demonstrate how we will access this table.
This table is used for queuing instance-wide, and adding partitioning to the equation might significantly increase the complexity of this solution.
Instance-wide or not, there may be opportunities for partitioning that are completely transparent (so not increasing the complexity) - but without knowing exactly how we access the table, we cannot reason about this.
Or to put this differently - if we intend to skip partitioning the table, we would benefit from documenting why a partitioning pattern is not suitable (which entails describing the access patterns in any case).
Going forward, reasoning about access patterns to support partitioning decisions is something we should be doing for all new tables that we expect to play a significant role, are going to be large, or are expected to have high churn. I would think ci_builds_pending qualifies.
Hence the question - can we describe how we will access this table?
The idea behind this table is to use only this one table to find builds that need to be processed. I agree that we should document better why we do not intend to partition this table, but no intention to partition does not mean we will never do it :) We simply do not have data right now that suggests that partitioning makes sense in this case.
@grzesiek I realize we need to describe and document better what the ask is and what we're looking for here. I plan to do this in #327204 (closed), but there's been no time yet.
We simply do not have data right now that suggests that partitioning in this case makes sense.
This is the exact reason for my question. We should know how we plan to access this table and document this. This is the data we need to be able to decide whether or not partitioning makes sense.
While it's technically possible to partition a table after it has been introduced, it is a lengthy process to do so. However, what's worse is that - on a higher level - we should aim to find partitioning strategies that work for more than just one table. This is thinking ahead about a potential sharding design, where it's crucial to avoid having many different dimensions to slice data by. Note however that this idea is also totally valid in a non-sharded, but partitioned database design.
Hence I think we should start doing this (roughly speaking - for all major tables we introduce):
Determine the common filters that we plan to access a table by
Implement limitations so that we actually don't end up accessing the table by an unexpected dimension
Based on (1) and (2), decide whether or not partitioning this table makes sense
Again I realize we lack documentation and agreement on how and what needs to be documented. Going forward, I would like to develop a common understanding here - but this takes time.
Meanwhile, if you'd like to discuss on a call - I'm happy to do that, too.
My ask here is that we simply call out what (1) is in this case. Do we know that? If not, why? It'll be hard to verify a database design for scalability without knowing how this is going to be accessed.
I realize we need to describe and document better what the ask is and what we're looking for here. I plan to do this in #327204 (closed), but there's been no time yet.
@abrandl I want to describe this work in !52203 (merged). I also plan to work on this, since I believe this might require a ton of domain knowledge from the CI/CD area to get done. I will happily collaborate with someone from the Database team, but getting this done will require a significant refactoring of features from the Verify area.
This is the exact reason for my question. We should know how we plan to access this table and document this. This is the data we need to be able to decide whether or not partitioning makes sense.
Yes, I plan to document this in the MR mentioned above.
I'm fine with scheduling a call and talking about what our plans are here too.
@grzesiek What's described here – the table ci_pending_builds and working with it – is a classic "queue in Postgres" approach that will suffer from the following difficulties unless it's properly designed:
Dead tuples and bloat will quickly lead to performance degradation – the provided query plan will quickly become inefficient because of this. This will be caused by DELETEs ("Delete item from the table ...") and/or UPDATEs. Solution: avoid DELETEs and, if possible, UPDATEs at all costs; use partitioning and partition pruning instead of DELETE.
Contention issues if we're going to process rows in this table in multiple threads. "SELECT .. FOR UPDATE SKIP LOCKED" is the solution (see the sketch below).
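A minimal sketch of what SKIP LOCKED picking could look like on ci_pending_builds; the filters and column names are assumptions carried over from the sketches above:

```sql
-- Multiple runners can poll concurrently: each one locks a different row
-- instead of queueing behind the same lock.
BEGIN;

SELECT build_id
FROM ci_pending_builds
WHERE protected = FALSE
ORDER BY build_id
LIMIT 1
FOR UPDATE SKIP LOCKED;   -- rows locked by other pickers are skipped, not waited on

-- ... accept the build in ci_builds and DELETE its row from ci_pending_builds ...

COMMIT;
```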
I have seen many attempts to implement the "queue in database" thing, only a few of which were really successful (one of the successful ones: good old PgQ).
So I support the questions from @abrandl, so that we can design it properly (also, there is always an alternative: move outside of Postgres – use the existing Sidekiq or introduce Kafka); and I agree with @ayufan that sharding decisions are going to affect the decisions in this task.
What's described here – the table ci_pending_builds and working with it – is a classic "queue in Postgres" approach that will suffer from the following difficulties unless it's properly designed
@nikolay I agree that it will not be that much different, but the new design using ci_pending_builds will be different enough to give us headroom to move to a much better solution. We have Redis queuing on our radar, but we need to iterate in order to get there.
It also seems that using the entire ci_builds table to find pending builds blocks the sharding and partitioning initiatives a bit (or at least makes them more difficult to get done).
Given all of this, it seems that moving queuing to a separate table is still something that will improve the current state. It will also make it easier to move towards Redis queuing, because we will need to redesign our system to remove the reliance on multiple JOINs; all the information about queuing will thus be more localized, which is needed if we want to proceed with Redis queuing. This will make sharding and partitioning easier, and we will be able to proceed with a partitioned ci_jobs table and a gradual migration from ci_builds to ci_jobs.
This is not a perfect solution, but something that is an intermediate step that will unblock progress towards a better solution. WDYT @NikolayS?
The "queuing" we have today is based on the gigantic ci_builds table and updating rows in this table. Plus, the relevant part of records for the "queuing" is typically a very small portion of the table (which stores the whole history of builds, the ones pending their execution are perhaps even negligible comparing to the overall number of records).
In my understanding, this is the core design problem that we would disentangle with this approach - which makes sense to me, especially if we were to iterate and find an even better queuing solution later (think: proper message queue).
Still, on the Postgres side, even if this is just to buy headroom, we will need to look into the specific implementation knowing that queuing in MVCC is hard to get right.
This will make sharding and partitioning easier, and we will be able to proceed with a partitioned ci_jobs table and a gradual migration from ci_builds to ci_jobs.
@grzesiek Has this (partitioned ci_jobs) been detailed somewhere already?
@grzesiek Has this (partitioned ci_jobs) been detailed somewhere already?
Not documented yet, but this is fairly simple. A few assumptions:
One row per build should be enough to get all information about queuing
In that row we would need to store everything: available CI minutes, whether a build can be picked by a shared runner, etc.
This would reduce the reliance on all the JOINs and CTEs, because all the data would be precalculated in Ruby
We hope this will simply be a SELECT with a few WHERE clauses, perhaps with a FOR UPDATE lock; we are not sure yet what kind of locking we will use here
There is also an opportunity to batch DELETEs and INSERTs into this table, depending on circumstances and implementation details (a sketch follows below).
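For illustration, batching could look roughly like this; the values and columns are placeholders, reusing the assumed schema from the sketches above:

```sql
-- Enqueue several builds that transitioned to pending in one statement.
INSERT INTO ci_pending_builds (build_id, project_id, protected, tag_ids)
VALUES
  (201, 1, FALSE, ARRAY[1]::integer[]),
  (202, 1, FALSE, ARRAY[]::integer[]),
  (203, 2, TRUE,  ARRAY[2, 3]::integer[]);

-- Dequeue a batch of builds that were picked up or canceled.
DELETE FROM ci_pending_builds
WHERE build_id = ANY (ARRAY[101, 102, 103]::bigint[]);
```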
It is a much more flexible solution than using ci_builds to queue pending builds. Implementation details might change, because ultimately how we do that depends on iteration and the feedback we get from it. Does it make sense @abrandl?
@grzesiek From a high-level the approach makes sense to me. I worry about the implementation details, but it's perhaps too early for that.
Not documented yet, but this is fairly simple. A few assumptions:
@grzesiek It sounds like we'd be denormalizing all the information into a single, presumably quite wide table, and using this table to implement the queuing mechanic? I would like to look at this in more detail once it's available - the combination of denormalization and a high-churn table (queuing) sounds like it can become a problem.
I anticipate this is too early, but I would suggest we lay out the desired design with:
What will the table look like exactly?
How will we insert, select, delete, update data (data access)?
It would be good to get some statistics on how much data we expect on .com, e.g. the number of records and the width of the table.
Please let me know if/how I can best help here. Would you mind having me on board with !52203 (diffs), representing the database group in development?
@abrandl I can provide some details from the top of my head:
Number of records: up to 10k at peak
Number of columns: 5-8
Data access: Ruby abstraction using a separate service
One concern I have is how to make this transactional with the status transition in ci_builds. Making an INSERT into ci_pending_builds in a transaction with an UPDATE ... SET in ci_builds might undermine the benefits of using this table. Perhaps we can do that asynchronously, but I would expect to find a solution for this through mindful iteration.
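To make the concern concrete, the synchronous variant would roughly look like the sketch below (the status value and columns are assumptions): both writes share one transaction, so the wide ci_builds row is still locked and updated on every transition to pending.

```sql
-- Synchronous coupling of the status transition and the queue row.
BEGIN;

UPDATE ci_builds
SET status = 'pending'
WHERE id = 101;

INSERT INTO ci_pending_builds (build_id, project_id, protected, tag_ids)
VALUES (101, 1, FALSE, ARRAY[1, 2]::integer[]);

COMMIT;
```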
We would need to know the schema - it makes a huge difference if there is a huge jsonb column, for example.
Thinking - once we have the schema defined - we could take a snapshot of production and transform this into however the table's contents would look at this particular point in time. This would allow us to reason more about the size/wideness of the table.
Data access: Ruby abstraction using a separate service
Sure - we will abstract it in Ruby. In this context, we're concerned with how that access translates into actual data access on the database side. It'll be insightful to compile a list of insert/select/delete/update queries and what their parameters look like.
One concern I have is how to make this transactional with the status transition in ci_builds.
Perhaps we can do that asynchronously, but I would expect to find a solution for this through mindful iteration.
Knowing the expected database queries and schema will allow us to reason about this better and also allow us to more confidently delay a decision if we can.